idnits 2.17.1 

draft-ietf-822ext-mime-imb-06.txt:
  ** The Abstract section seems to be numbered


  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-24) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 417: '... in accordance with this document MUST...'
     RFC 2119 keyword, line 946: '...       MAY be represented as the US-AS...'
     RFC 2119 keyword, line 951: '... Octets with values of 9 and 32 MAY be...'
     RFC 2119 keyword, line 953: '...espectively, but MUST NOT be so repres...'
     RFC 2119 keyword, line 955: '... an encoded line MUST thus be followed...'
     (4 more instances...)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 1453 has weird spacing: '...     no  inter...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (March 1996) is 10267 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? 'RFC821' on line 345 looks like a reference

  -- Missing reference section? 'ATK' on line 142 looks like a reference

  -- Missing reference section? 'X400' on line 147 looks like a reference

  -- Missing reference section? 'RFC-1741' on line 1122 looks like a reference


     Summary: 9 errors (**), 0 flaws (~~), 2 warnings (==), 6 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Network Working Group                     Nathaniel Borenstein
2	Internet Draft                                       Ned Freed
3	                           <draft-ietf-822ext-mime-imb-06.txt>

5	            Multipurpose Internet Mail Extensions
6	                       (MIME) Part One:

8	              Format of Internet Message Bodies

10	                          March 1996

12	                     Status of this Memo

14	This document is an Internet-Draft.  Internet-Drafts are
15	working documents of the Internet Engineering Task Force
16	(IETF), its areas, and its working groups. Note that other
17	groups may also distribute working documents as Internet-
18	Drafts.

20	Internet-Drafts are draft documents valid for a maximum of six
21	months. Internet-Drafts may be updated, replaced, or obsoleted
22	by other documents at any time.  It is not appropriate to use
23	Internet-Drafts as reference material or to cite them other
24	than as a "working draft" or "work in progress".

26	To learn the current status of any Internet-Draft, please
27	check the 1id-abstracts.txt listing contained in the
28	Internet-Drafts Shadow Directories on ds.internic.net (US East
29	Coast), nic.nordu.net (Europe), ftp.isi.edu (US West Coast),
30	or munnari.oz.au (Pacific Rim).

32	1.  Abstract

34	STD 11, RFC 822, defines a message representation protocol
35	specifying considerable detail about US-ASCII message headers,
36	and leaves the message content, or message body, as flat US-
37	ASCII text.  This set of documents, collectively called the
38	Multipurpose Internet Mail Extensions, or MIME, redefines the
39	format of messages to allow for
40	 (1)   textual message bodies in character sets other than
41	       US-ASCII,

43	 (2)   an extensible set of different formats for non-textual
44	       message bodies,

46	 (3)   multi-part message bodies, and

48	 (4)   textual header information in character sets other than
49	       US-ASCII.

51	These documents are based on earlier work documented in RFC
52	934, STD 11, and RFC 1049, but extend and revise them.
53	Because RFC 822 said so little about message bodies, these
54	documents are largely orthogonal to (rather than a revision
55	of) RFC 822.

57	This initial document specifies the various headers used to
58	describe the structure of MIME messages. The second document,
59	RFC MIME-IMT, defines the general structure of the MIME media
60	typing system and defines an initial set of media types. The
61	third document, RFC MIME-HEADERS, describes extensions to RFC
62	822 to allow non-US-ASCII text data in Internet mail header
63	fields. The fourth document, RFC MIME-REG, specifies various
64	IANA registration procedures for MIME-related facilities. The
65	fifth and final document, RFC MIME-CONF, describes MIME
66	conformance criteria as well as providing some illustrative
67	examples of MIME message formats, acknowledgements, and the
68	bibliography.

70	These documents are revisions of RFCs 1521, 1522, and 1590,
71	which themselves were revisions of RFCs 1341 and 1342.  An
72	appendix in RFC MIME-CONF describes differences and changes
73	from previous versions.

75	2.  Table of Contents

77	1 Abstract
78	2 Table of Contents
79	3 Introduction
80	4 Definitions, Conventions, and Generic BNF Grammar
81	4.1 CRLF
82	4.2 Character Set
83	4.3 Message
84	4.4 Entity
85	4.5 Body Part
86	4.6 Body
87	4.7 7bit Data
88	4.8 8bit Data
89	4.9 Binary Data
90	4.10 Lines
91	5 MIME Header Fields
92	6 MIME-Version Header Field
93	7 Content-Type Header Field
94	7.1 Syntax of the Content-Type Header Field
95	7.2 Content-Type Defaults
96	8 Content-Transfer-Encoding Header Field
97	8.1 Content-Transfer-Encoding Syntax
98	8.2 Content-Transfer-Encodings Semantics
99	8.3 New Content-Transfer-Encodings
100	8.4 Interpretation and Use
101	8.5 Translating Encodings
102	8.6 Canonical Encoding Model
103	8.7 Quoted-Printable Content-Transfer-Encoding
104	8.8 Base64 Content-Transfer-Encoding
105	9 Content-ID Header Field
106	10 Content-Description Header Field
107	11 Additional MIME Header Fields
108	12 Summary
109	13 Security Considerations
110	14 Authors' Addresses
111	A Collected Grammar
112	3.  Introduction

114	Since its publication in 1982, RFC 822 has defined the
115	standard format of textual mail messages on the Internet.  Its
116	success has been such that the RFC 822 format has been
117	adopted, wholly or partially, well beyond the confines of the
118	Internet and the Internet SMTP transport defined by RFC 821.
119	As the format has seen wider use, a number of limitations have
120	proven increasingly restrictive for the user community.

122	RFC 822 was intended to specify a format for text messages.
123	As such, non-text messages, such as multimedia messages that
124	might include audio or images, are simply not mentioned.  Even
125	in the case of text, however, RFC 822 is inadequate for the
126	needs of mail users whose languages require the use of
127	character sets richer than US-ASCII.  Since RFC 822 does not
128	specify mechanisms for mail containing audio, video, Asian
129	language text, or even text in most European languages,
130	additional specifications are needed.

132	One of the notable limitations of RFC 821/822 based mail
133	systems is the fact that they limit the contents of electronic
134	mail messages to relatively short lines (e.g. 1000 characters
135	or less [RFC821]) of 7bit US-ASCII.  This forces users to
136	convert any non-textual data that they may wish to send into
137	seven-bit bytes representable as printable US-ASCII characters
138	before invoking a local mail UA (User Agent, a program with
139	which human users send and receive mail). Examples of such
140	encodings currently used in the Internet include pure
141	hexadecimal, uuencode, the 3-in-4 base 64 scheme specified in
142	RFC 1421, the Andrew Toolkit Representation [ATK], and many
143	others.

145	The limitations of RFC 822 mail become even more apparent as
146	gateways are designed to allow for the exchange of mail
147	messages between RFC 822 hosts and X.400 hosts.  X.400 [X400]
148	specifies mechanisms for the inclusion of non-textual material
149	within electronic mail messages.  The current standards for
150	the mapping of X.400 messages to RFC 822 messages specify
151	either that X.400 non-textual material must be converted to
152	(not encoded in) IA5Text format, or that they must be
153	discarded, notifying the RFC 822 user that discarding has
154	occurred.  This is clearly undesirable, as information that a
155	user may wish to receive is lost.  Even though a user agent
156	may not have the capability of dealing with the non-textual
157	material, the user might have some mechanism external to the
158	UA that can extract useful information from the material.
159	Moreover, it does not allow for the fact that the message may
160	eventually be gatewayed back into an X.400 message handling
161	system (i.e., the X.400 message is "tunneled" through Internet
162	mail), where the non-textual information would definitely
163	become useful again.

165	This document describes several mechanisms that combine to
166	solve most of these problems without introducing any serious
167	incompatibilities with the existing world of RFC 822 mail.  In
168	particular, it describes:

170	 (1)   A MIME-Version header field, which uses a version
171	       number to declare a message to be conformant with this
172	       specification and allows mail processing agents to
173	       distinguish between such messages and those generated
174	       by older or non-conformant software, which are presumed
175	       to lack such a field.

177	 (2)   A Content-Type header field, generalized from RFC 1049,
178	       which can be used to specify the media type and subtype
179	       of data in the body of a message and to fully specify
180	       the native representation (canonical form) of such
181	       data.

183	 (3)   A Content-Transfer-Encoding header field, which can be
184	       used to specify both the encoding transformation that
185	       was applied to the body and the domain of the result.
186	       Encoding transformations other than the identity
187	       transformation are usually applied to data in order to
188	       allow it to pass through mail transport mechanisms
189	       which may have data or character set limitations.

191	 (4)   Two additional header fields that can be used to
192	       further describe the data in a body, the Content-ID and
193	       Content-Description header fields.

195	All of the header fields defined in this document are subject
196	to the general syntactic rules for header fields specified in
197	RFC 822.  In particular, all of these header fields except for
198	Content-Disposition can include RFC 822 comments, which have
199	no semantic content and should be ignored during MIME
200	processing.

202	Finally, to specify and promote interoperability, RFC MIME-
203	CONF provides a basic applicability statement for a subset of
204	the above mechanisms that defines a minimal level of
205	"conformance" with this document.

207	HISTORICAL NOTE:  Several of the mechanisms described in this
208	set of documents may seem somewhat strange or even baroque at
209	first reading.  It is important to note that compatibility
210	with existing standards AND robustness across existing
211	practice were two of the highest priorities of the working
212	group that developed this set of documents.  In particular,
213	compatibility was always favored over elegance.

215	Please refer to the current edition of the "IAB Official
216	Protocol Standards" for the standardization state and status
217	of this protocol.  RFC 822 and RFC 1123 also provide
218	essential background for MIME since no conforming
219	implementation of MIME can violate them.  In addition, several
220	other informational RFC documents will be of interest to the
221	MIME implementor, in particular RFC 1344, RFC 1345, and RFC
222	1524.

224	4.  Definitions, Conventions, and Generic BNF Grammar

226	Although the mechanisms specified in this set of documents are
227	all described in prose, most are also described formally in
228	the augmented BNF notation of RFC 822. Implementors will need
229	to be familiar with this notation in order to understand this
230	specification, and are referred to RFC 822 for a complete
231	explanation of the augmented BNF notation.

233	Some of the augmented BNF in this set of documents makes named
234	references to syntax rules defined in RFC 822.  A complete
235	formal grammar, then, is obtained by combining the collected
236	grammar appendices in each document in this set with the BNF
237	of RFC 822 plus the modifications to RFC 822 defined in RFC
238	1123 (which specifically changes the syntax for `return',
239	`date' and `mailbox').

241	All numeric and octet values are given in decimal notation in
242	this set of documents. All media type values, subtype values,
243	and parameter names as defined are case-insensitive.  However,
244	parameter values are case-sensitive unless otherwise specified
245	for the specific parameter.

247	FORMATTING NOTE:  Notes, such as this one, provide additional
248	nonessential information which may be skipped by the reader
249	without missing anything essential.  The primary purpose of
250	these non-essential notes is to convey information about the
251	rationale of this set of documents, or to place these
252	documents in the proper historical or evolutionary context.
253	Such information may in particular be skipped by those who are
254	focused entirely on building a conformant implementation, but
255	may be of use to those who wish to understand why certain
256	design choices were made.

258	4.1.  CRLF

260	The term CRLF, in this set of documents, refers to the
261	sequence of octets corresponding to the two US-ASCII
262	characters CR (decimal value 13) and LF (decimal value 10)
263	which, taken together, in this order, denote a line break in
264	RFC 822 mail.

266	4.2.  Character Set

268	The term "character set" is used in MIME to refer to a method
269	of converting a sequence of octets into a sequence of
270	characters.  Note that unconditional and unambiguous
271	conversion in the other direction is not required, in that not
272	all characters may be representable by a given character set
273	and a character set may provide more than one sequence of
274	octets to represent a particular sequence of characters.

276	This definition is intended to allow various kinds of
277	character encodings, from simple single-table mappings such as
278	US-ASCII to complex table switching methods such as those that
279	use ISO 2022's techniques, to be used as character sets.
280	However, the definition associated with a MIME character set
281	name must fully specify the mapping to be performed.  In
282	particular, use of external profiling information to determine
283	the exact mapping is not permitted.
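
NOTE TO IMPLEMENTORS:  The following non-normative Python
sketch illustrates the asymmetry described above.  It assumes
that Python's codec names ("us-ascii", "utf-7") can stand in
for MIME character set names; that pairing is a convenience of
the sketch and is not defined by this document.

```python
# Illustrative only: Python codecs stand in for MIME character sets.

# Octets -> characters is the direction a character set must define.
octets = b"Hello"
chars = octets.decode("us-ascii")

# Characters -> octets need not always be possible: U+00E9 has no
# representation in US-ASCII.
try:
    "caf\u00e9".encode("us-ascii")
    representable = True
except UnicodeEncodeError:
    representable = False

# Nor need it be unambiguous: in UTF-7 both of these octet sequences
# denote the single character "a".
direct = b"a".decode("utf-7")
shifted = b"+AGE-".decode("utf-7")
```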

285	NOTE: The term "character set" was originally used in MIME
286	with specifications such as US-ASCII and other 7bit and 8bit
287	schemes which have a simple mapping from single octets to
288	single characters. Multi-octet coded character sets and
289	switching techniques make the situation more complex. For
290	example, some communities use the term "character encoding"
291	for what MIME calls a "character set", while using the phrase
292	"coded character set" to denote an abstract mapping from
293	integers (not octets) to characters.

295	4.3.  Message

297	The term "message", when not further qualified, means either a
298	(complete or "top-level") RFC 822 message being transferred on
299	a network, or a message encapsulated in a body of type
300	"message/rfc822" or "message/partial".

302	4.4.  Entity

304	The term "entity" refers specifically to the MIME-defined
305	header fields and contents of either a message or one of the
306	parts in the body of a multipart entity.  The specification of
307	such entities is the essence of MIME.  Since the contents of
308	an entity are often called the "body", it makes sense to speak
309	about the body of an entity.  Any sort of field may be present
310	in the header of an entity, but only those fields whose names
311	begin with "content-" actually have any MIME-related meaning.
312	Note that this does NOT imply that they have no meaning at all
313	-- an entity that is also a message has non-MIME header fields
314	whose meanings are defined by RFC 822.

316	4.5.  Body Part

318	The term "body part" refers to an entity inside of a multipart
319	entity.

321	4.6.  Body

323	The term "body", when not further qualified, means the body of
324	an entity, that is, the body of either a message or of a body
325	part.

327	NOTE:  The previous four definitions are clearly circular.
328	This is unavoidable, since the overall structure of a MIME
329	message is indeed recursive.

331	4.7.  7bit Data

333	"7bit data" refers to data that is all represented as
334	relatively short lines with 998 octets or less between CRLF
335	line separation sequences [RFC821].  No octets with decimal
336	values greater than 127 are allowed and neither are NULs
337	(octets with decimal value 0).  CR (decimal value 13) and LF
338	(decimal value 10) octets only occur as part of CRLF line
339	separation sequences.

341	4.8.  8bit Data

343	"8bit data" refers to data that is all represented as
344	relatively short lines with 998 octets or less between CRLF
345	line separation sequences [RFC821], but octets with decimal
346	values greater than 127 may be used.  As with "7bit data" CR
347	and LF octets only occur as part of CRLF line separation
348	sequences and no NULs are allowed.

350	4.9.  Binary Data

352	"Binary data" refers to data where any sequence of octets
353	whatsoever is allowed.

355	4.10.  Lines

357	"Lines" are defined as sequences of octets separated by CRLF
358	sequences.  This is consistent with both RFC 821 and RFC 822.
359	"Lines" only refers to a unit of data in a message, which may
360	or may not correspond to something that is actually displayed
361	by a user agent.
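
NOTE TO IMPLEMENTORS:  The definitions of 7bit, 8bit, and
binary data above can be read as a simple classification
procedure.  The following non-normative Python sketch returns
the most restrictive class that a block of octets satisfies.
The function name classify_data and the constant
MAX_LINE_OCTETS are hypothetical names chosen for this
illustration.

```python
MAX_LINE_OCTETS = 998  # limit on octets between CRLF sequences


def classify_data(data: bytes) -> str:
    """Return "7bit", "8bit", or "binary" for a block of octets."""
    # NUL octets are excluded from both 7bit and 8bit data.
    if b"\x00" in data:
        return "binary"
    for line in data.split(b"\r\n"):
        # After splitting on CRLF, any CR or LF left over is a bare one.
        if b"\r" in line or b"\n" in line:
            return "binary"
        # Lines longer than the limit fit neither 7bit nor 8bit.
        if len(line) > MAX_LINE_OCTETS:
            return "binary"
    # Octets above 127 are the only difference between 7bit and 8bit.
    if any(octet > 127 for octet in data):
        return "8bit"
    return "7bit"
```

Since any sequence of octets whatsoever qualifies as binary
data, "binary" serves here as the fallback for everything that
fails the stricter tests.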

363	5.  MIME Header Fields

365	MIME defines a number of new RFC 822 header fields that are
366	used to describe the content of a MIME entity.  These header
367	fields occur in at least two contexts:

369	 (1)   As part of a regular RFC 822 message header.

371	 (2)   In a MIME body part header within a multipart
372	       construct.

374	The formal definition of these header fields is as follows:

376	  entity-headers := [ content CRLF ]
377	                    [ encoding CRLF ]
378	                    [ id CRLF ]
379	                    [ description CRLF ]
380	                    *( MIME-extension-field CRLF )

382	  MIME-message-headers := entity-headers
383	                          fields
384	                          version CRLF
385	                          ; The ordering of the header
386	                          ; fields implied by this BNF
387	                          ; definition should be ignored.

389	  MIME-part-headers := entity-headers
390	                       [ fields ]
391	                       ; Any field not beginning with
392	                       ; "content-" can have no defined
393	                       ; meaning and may be ignored.
394	                       ; The ordering of the header
395	                       ; fields implied by this BNF
396	                       ; definition should be ignored.
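
NOTE TO IMPLEMENTORS:  As a non-normative illustration of the
MIME-part-headers rule, the Python sketch below keeps only the
fields of a body part header whose names begin with
"content-".  The function name and the representation of a
header as a list of (name, value) pairs are assumptions of the
sketch.  Field names are compared without regard to case, as
in RFC 822.

```python
def part_fields_with_meaning(fields):
    """Keep the (name, value) pairs that MIME gives a meaning to
    in a body part header, i.e. those named "content-*"."""
    return [
        (name, value)
        for name, value in fields
        if name.lower().startswith("content-")
    ]


part_header = [
    ("Content-Type", "text/plain; charset=us-ascii"),
    ("X-Mailer", "example"),
    ("CONTENT-ID", "<part1@example.com>"),
]
kept = part_fields_with_meaning(part_header)
```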

398	The syntax of the various specific MIME header fields will be
399	described in the following sections.

401	6.  MIME-Version Header Field

403	Since RFC 822 was published in 1982, there has really been
404	only one format standard for Internet messages, and there has
405	been little perceived need to declare the format standard in
406	use.  This document is an independent document that
407	complements RFC 822.  Although the extensions in this document
408	have been defined in such a way as to be compatible with RFC
409	822, there are still circumstances in which it might be
410	desirable for a mail-processing agent to know whether a
411	message was composed with the new standard in mind.

413	Therefore, this document defines a new header field, "MIME-
414	Version", which is to be used to declare the version of the
415	Internet message body format standard in use.

417	Messages composed in accordance with this document MUST
418	include such a header field, with the following verbatim text:

420	  MIME-Version: 1.0

422	The presence of this header field is an assertion that the
423	message has been composed in compliance with this document.

425	Since it is possible that a future document might extend the
426	message format standard again, a formal BNF is given for the
427	content of the MIME-Version field:

429	  version := "MIME-Version" ":" 1*DIGIT "." 1*DIGIT

431	Thus, future format specifiers, which might replace or extend
432	"1.0", are constrained to be two integer fields, separated by
433	a period.  If a message is received with a MIME-Version value
434	other than "1.0", it cannot be assumed to conform with this
435	specification.

437	Note that the MIME-Version header field is required at the top
438	level of a message.  It is not required for each body part of
439	a multipart entity.  It is required for the embedded headers
440	of a body of type "message/rfc822" or "message/partial" if and
441	only if the embedded message is itself claimed to be MIME-
442	conformant.

444	It is not possible to fully specify how a mail reader that
445	conforms with MIME as defined in this document should treat a
446	message that might arrive in the future with some value of
447	MIME-Version other than "1.0".

449	It is also worth noting that version control for specific
450	media types is not accomplished using the MIME-Version
451	mechanism.  In particular, some formats (such as
452	application/postscript) have version numbering conventions
453	that are internal to the media format.  Where such conventions
454	exist, MIME does nothing to supersede them.  Where no such
455	conventions exist, a MIME media type might use a "version"
456	parameter in the content-type field if necessary.

458	NOTE TO IMPLEMENTORS:  When checking MIME-Version values any
459	RFC 822 comment strings that are present must be ignored.  In
460	particular, the following four MIME-Version fields are
461	equivalent:

463	  MIME-Version: 1.0

465	  MIME-Version: 1.0 (produced by MetaSend Vx.x)

467	  MIME-Version: (produced by MetaSend Vx.x) 1.0

469	  MIME-Version: 1.(produced by MetaSend Vx.x)0
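
One way to treat these four forms alike is to remove comments
before matching the version grammar.  The following Python
sketch is non-normative; strip_comments and parse_mime_version
are hypothetical helper names.  The sketch assumes that the
field value contains no quoted strings, and it is deliberately
lenient in that it discards all white space before matching.

```python
import re


def strip_comments(value: str) -> str:
    """Remove RFC 822 comments, honouring nesting and quoted pairs."""
    out = []
    depth = 0
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # A quoted pair: keep it outside comments, drop it inside.
            if depth == 0:
                out.append(value[i:i + 2])
            i += 2
            continue
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
        i += 1
    return "".join(out)


def parse_mime_version(value: str):
    """Return (major, minor), or None if the value is malformed."""
    bare = "".join(strip_comments(value).split())  # drop white space
    match = re.fullmatch(r"([0-9]+)\.([0-9]+)", bare)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


examples = [
    "1.0",
    "1.0 (produced by MetaSend Vx.x)",
    "(produced by MetaSend Vx.x) 1.0",
    "1.(produced by MetaSend Vx.x)0",
]
versions = [parse_mime_version(e) for e in examples]
```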

471	In the absence of a MIME-Version field, a receiving mail user
472	agent (whether conforming to MIME requirements or not) may
473	optionally choose to interpret the body of the message
474	according to local conventions.  Many such conventions are
475	currently in use and it should be noted that in practice non-
476	MIME messages can contain just about anything.

478	It is impossible to be certain that a non-MIME mail message is
479	actually plain text in the US-ASCII character set since it
480	might well be a message that, using some set of nonstandard
481	local conventions that predate this document, includes text in
482	another character set or non-textual data presented in a
483	manner that cannot be automatically recognized (e.g., a
484	uuencoded compressed UNIX tar file).

486	7.  Content-Type Header Field

488	The purpose of the Content-Type field is to describe the data
489	contained in the body fully enough that the receiving user
490	agent can pick an appropriate agent or mechanism to present
491	the data to the user, or otherwise deal with the data in an
492	appropriate manner. The value in this field is called a media
493	type.

495	HISTORICAL NOTE:  The Content-Type header field was first
496	defined in RFC 1049.  RFC 1049 used a simpler and less
497	powerful syntax, but one that is largely compatible with the
498	mechanism given here.

500	The Content-Type header field specifies the nature of the data
501	in the body of an entity by giving media type and subtype
502	identifiers, and by providing auxiliary information that may
503	be required for certain media types.  After the media type and
504	subtype names, the remainder of the header field is simply a
505	set of parameters, specified in an attribute=value notation.
506	The ordering of parameters is not significant.

508	In general, the top-level media type is used to declare the
509	general type of data, while the subtype specifies a specific
510	format for that type of data.  Thus, a media type of
511	"image/xyz" is enough to tell a user agent that the data is an
512	image, even if the user agent has no knowledge of the specific
513	image format "xyz".  Such information can be used, for
514	example, to decide whether or not to show a user the raw data
515	from an unrecognized subtype -- such an action might be
516	reasonable for unrecognized subtypes of text, but not for
517	unrecognized subtypes of image or audio.  For this reason,
518	registered subtypes of text, image, audio, and video should
519	not contain embedded information that is really of a different
520	type.  Such compound formats should be represented using the
521	"multipart" or "application" types.

523	Parameters are modifiers of the media subtype, and as such do
524	not fundamentally affect the nature of the content.  The set
525	of meaningful parameters depends on the media type and
526	subtype.  Most parameters are associated with a single
527	specific subtype.  However, a given top-level media type may
528	define parameters which are applicable to any subtype of that
529	type.  Parameters may be required by their defining content
530	type or subtype or they may be optional. MIME implementations
531	must ignore any parameters whose names they do not recognize.

533	For example, the "charset" parameter is applicable to any
534	subtype of "text", while the "boundary" parameter is required
535	for any subtype of the "multipart" media type.
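
NOTE TO IMPLEMENTORS:  As a non-normative illustration, the
Python standard library "email" package can be used to read a
media type and its parameters.  The message text below is
invented for the example, and "x-unknown" is a hypothetical
parameter standing for one the receiver does not recognize.
A receiver ignores such a parameter simply by never asking for
it.

```python
from email import message_from_string

raw = (
    "MIME-Version: 1.0\n"
    "Content-Type: TEXT/Plain; CHARSET=ISO-8859-1; x-unknown=1\n"
    "\n"
    "body text\n"
)
msg = message_from_string(raw)

# Type and subtype are reported in lower case, since matching of
# media types is case-insensitive.
media_type = msg.get_content_type()

# Parameter names are matched case-insensitively; the value is
# returned as written.
charset = msg.get_param("charset")

# A parameter that is absent yields None by default.
boundary = msg.get_param("boundary")
```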

537	There are NO globally-meaningful parameters that apply to all
538	media types.  Truly global mechanisms are best addressed, in
539	the MIME model, by the definition of additional Content-*
540	header fields.

542	An initial set of seven top-level media types is defined in
543	MIME-IMT.  Five of these are discrete types whose content is
544	essentially opaque as far as MIME processing is concerned.
545	The remaining two are composite types whose contents require
546	additional handling by MIME processors.

548	This set of top-level media types is intended to be
549	substantially complete.  It is expected that additions to the
550	larger set of supported types can generally be accomplished by
551	the creation of new subtypes of these initial types.  In the
552	future, more top-level types may be defined only by a
553	standards-track extension to this standard.  If another top-
554	level type is to be used for any reason, it must be given a
555	name starting with "X-" to indicate its non-standard status
556	and to avoid a potential conflict with a future official name.

558	7.1.  Syntax of the Content-Type Header Field

560	In the Augmented BNF notation of RFC 822, a Content-Type
561	header field value is defined as follows:

563	  content := "Content-Type" ":" type "/" subtype
564	             *(";" parameter)
565	             ; Matching of media type and subtype
566	             ; is ALWAYS case-insensitive.

568	  type := discrete-type / composite-type

570	  discrete-type := "text" / "image" / "audio" / "video" /
571	                   "application" / extension-token

573	  composite-type := "message" / "multipart" / extension-token

575	  extension-token := ietf-token / x-token

577	  ietf-token := <An extension token defined by a
578	                 standards-track RFC and registered
579	                 with IANA.>

581	  x-token := <The two characters "X-" or "x-" followed, with
582	              no intervening white space, by any token>

584	  subtype := extension-token / iana-token

586	  iana-token := <A publicly-defined extension token. Tokens
587	                 of this form must be registered with IANA
588	                 as specified in RFC MIME-REG.>

590	  parameter := attribute "=" value
591	  attribute := token
592	               ; Matching of attributes
593	               ; is ALWAYS case-insensitive.

595	  value := token / quoted-string

597	  token := 1*<any (US-ASCII) CHAR except SPACE, CTLs,
598	              or tspecials>

600	  tspecials :=  "(" / ")" / "<" / ">" / "@" /
601	                "," / ";" / ":" / "\" / <"> /
602	                "/" / "[" / "]" / "?" / "="
603	                ; Must be in quoted-string,
604	                ; to use within parameter values

606	Note that the definition of "tspecials" is the same as the RFC
607	822 definition of "specials" with the addition of the three
608	characters "/", "?", and "=", and the removal of ".".

610	Note also that a subtype specification is MANDATORY -- it may
611	not be omitted from a Content-Type header field.  As such,
612	there are no default subtypes.

614	The type, subtype, and parameter names are not case sensitive.
615	For example, TEXT, Text, and TeXt are all equivalent top-level
616	media types.  Parameter values are normally case sensitive,
617	but sometimes are interpreted in a case-insensitive fashion,
618	depending on the intended use.  (For example, multipart
619	boundaries are case-sensitive, but the "access-type" parameter
620	for message/External-body is not case-sensitive.)

622	Note that the value of a quoted string parameter does not
623	include the quotes.  That is, the quotation marks in a
624	quoted-string are not a part of the value of the parameter,
625	but are merely used to delimit that parameter value.  In
626	addition, comments are allowed in accordance with RFC 822
627	rules for structured header fields.  Thus the following two
628	forms

630	  Content-type: text/plain; charset=us-ascii (Plain text)

632	  Content-type: text/plain; charset="us-ascii"

634	are completely equivalent.
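The syntax rules above can be illustrated with a small parser.
The following Python sketch is not part of this specification;
the function names are hypothetical, and the regular expression
used for "token" is an approximation that does not reject
non-US-ASCII characters.  It takes the field value (the text
after "Content-Type:"), removes RFC 822 comments, lower-cases
the type, subtype, and attribute names, and strips the quotes
from quoted-string values.

```python
import re

# Approximation of "token": one or more characters that are not
# white space, control characters, or tspecials.
TOKEN = r'[^\s\x00-\x1f\x7f()<>@,;:\\"/\[\]?=]+'


def strip_comments(value):
    """Drop parenthesized comments that lie outside quoted strings."""
    out = []
    depth = 0
    in_quotes = False
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value) and (in_quotes or depth):
            if depth == 0:            # quoted pair inside a quoted-string
                out.append(value[i:i + 2])
            i += 2
            continue
        if depth == 0 and ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif not in_quotes and ch == "(":
            depth += 1
        elif not in_quotes and ch == ")" and depth:
            depth -= 1
        elif depth == 0:
            out.append(ch)
        i += 1
    return "".join(out)


def parse_content_type(value):
    """Return (type, subtype, params) for a Content-Type field value."""
    value = strip_comments(value)
    head = re.match(rf'\s*({TOKEN})\s*/\s*({TOKEN})\s*', value)
    if head is None:
        raise ValueError("type and subtype are both mandatory")
    param_re = re.compile(
        rf'\s*;\s*({TOKEN})\s*=\s*(?:({TOKEN})|"((?:[^"\\]|\\.)*)")\s*')
    params = {}
    pos = head.end()
    while pos < len(value):
        m = param_re.match(value, pos)
        if m is None:
            raise ValueError("malformed parameter")
        if m.group(2) is not None:
            param_value = m.group(2)
        else:                         # quoted-string: quotes are not part of the value
            param_value = re.sub(r'\\(.)', r'\1', m.group(3))
        params[m.group(1).lower()] = param_value
        pos = m.end()
    return head.group(1).lower(), head.group(2).lower(), params
```

Parameter values are returned unchanged, because whether a
value is case sensitive depends on the parameter in question.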

636	Beyond this syntax, the only syntactic constraint on the
637	definition of subtype names is the desire that their uses must
638	not conflict.  That is, it would be undesirable to have two
639	different communities using "Content-Type: application/foobar"
640	to mean two different things.  The process of defining new
641	media subtypes, then, is not intended to be a mechanism for
642	imposing restrictions, but simply a mechanism for publicizing
643	their definition and usage.  There are, therefore, two
644	acceptable mechanisms for defining new media subtypes:

646	 (1)   Private values (starting with "X-") may be defined
647	       bilaterally between two cooperating agents without
648	       outside registration or standardization. Such values
649	       cannot be registered or standardized.

651	 (2)   New standard values should be registered with IANA as
652	       described in RFC MIME-REG.

654	The second document in this set, RFC MIME-IMT, defines the
655	initial set of media types for MIME.

657	7.2.  Content-Type Defaults

659	Default RFC 822 messages without a MIME Content-Type header
660	are taken by this protocol to be plain text in the US-ASCII
661	character set, which can be explicitly specified as:

663	  Content-type: text/plain; charset=us-ascii

665	This default is assumed if no Content-Type header field is
666	specified.  It is also recommended that this default be assumed
667	when a syntactically invalid Content-Type header field is
668	encountered. In the presence of a MIME-Version header field
669	and the absence of any Content-Type header field, a receiving
670	User Agent can also assume that plain US-ASCII text was the
671	sender's intent.  Plain US-ASCII text may still be assumed in
672	the absence of a MIME-Version or the presence of a
673	syntactically invalid Content-Type header field, but the
674	sender's intent might have been otherwise.

676	8.  Content-Transfer-Encoding Header Field

678	Many media types which could be usefully transported via email
679	are represented, in their "natural" format, as 8bit character
680	or binary data.  Such data cannot be transmitted over some
681	transfer protocols.  For example, RFC 821 (SMTP) restricts
682	mail messages to 7bit US-ASCII data with lines no longer than
683	1000 characters including any trailing CRLF line separator.

685	It is necessary, therefore, to define a standard mechanism for
686	encoding such data into a 7bit short line format.  Proper
687	labelling of unencoded material in less restrictive formats
688	for direct use over less restrictive transports is also
689	desirable.  This document specifies that such encodings will
690	be indicated by a new "Content-Transfer-Encoding" header
691	field.  This field has not been defined by any previous
692	standard.

694	8.1.  Content-Transfer-Encoding Syntax

696	The Content-Transfer-Encoding field's value is a single token
697	specifying the type of encoding, as enumerated below.
698	Formally:

700	  encoding := "Content-Transfer-Encoding" ":" mechanism

702	  mechanism := "7bit" / "8bit" / "binary" /
703	               "quoted-printable" / "base64" /
704	               ietf-token / x-token

706	These values are not case sensitive -- Base64 and BASE64 and
707	bAsE64 are all equivalent.  An encoding type of 7BIT requires
708	that the body is already in a 7bit mail-ready representation.
709	This is the default value -- that is, "Content-Transfer-
710	Encoding: 7BIT" is assumed if the Content-Transfer-Encoding
711	header field is not present.
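As an illustration only, the case-insensitivity and default
rules can be sketched in Python as follows.  The function name
is hypothetical and is not defined by this document.

```python
def transfer_encoding(header_value=None):
    """Normalize a Content-Transfer-Encoding mechanism token.

    header_value is the text after "Content-Transfer-Encoding:",
    or None when the header field is absent.
    """
    if header_value is None:
        return "7bit"                 # the default when the field is missing
    return header_value.strip().lower()   # matching is case-insensitive


print(transfer_encoding())            # 7bit
print(transfer_encoding(" bAsE64 "))  # base64
```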

713	8.2.  Content-Transfer-Encodings Semantics

715	This single Content-Transfer-Encoding token actually provides
716	two pieces of information.  It specifies what sort of encoding
717	transformation the body was subjected to, and it specifies
718	what the domain of the result is.

720	Three transformations are currently defined: identity, the
721	"quoted-printable" encoding, and the "base64" encoding.  The
722	domains are "binary", "8bit" and "7bit".

724	The Content-Transfer-Encoding values "7bit", "8bit", and
725	"binary" all mean that the identity (i.e. NO) encoding
726	transformation has been performed.  As such, they serve simply
727	as indicators of the domain of the body data, and provide
728	useful information about the sort of encoding that might be
729	needed for transmission in a given transport system.  The
730	terms "7bit data", "8bit data", and "binary data" are all
731	defined in Section 4.

733	The quoted-printable and base64 encodings transform their
734	input from an arbitrary domain into material in the "7bit"
735	range, thus making it safe to carry over restricted
736	transports.  The specific definitions of the transformations
737	are given below.

739	The proper Content-Transfer-Encoding label must always be
740	used.  Labelling unencoded data containing 8bit characters as
741	"7bit" is not allowed, nor is labelling unencoded non-line-
742	oriented data as anything other than "binary" allowed.
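A composer can choose the correct identity label by inspecting
the unencoded body.  The Python sketch below is an illustration
under stated assumptions: it relies on the Section 4
definitions as the editor understands them (lines of at most
998 octets between CRLF pairs, no NUL octets, and CR and LF
appearing only as part of a CRLF pair), and the function name
is hypothetical.

```python
def narrowest_domain(body: bytes) -> str:
    """Return the most restrictive identity label that fits body."""
    for line in body.split(b"\r\n"):
        # Overlong lines, bare CR or LF, or NUL octets mean the data
        # is not line-oriented and can only be labelled "binary".
        if (len(line) > 998 or b"\r" in line
                or b"\n" in line or b"\x00" in line):
            return "binary"
    if any(octet > 127 for octet in body):
        return "8bit"
    return "7bit"
```

Data classified as "8bit" or "binary" would then either be
sent with that label over a suitable transport or be encoded
with quoted-printable or base64.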

744	Unlike media subtypes, a proliferation of Content-Transfer-
745	Encoding values is both undesirable and unnecessary.  However,
746	establishing only a single transformation into the "7bit"
747	domain does not seem possible.  There is a tradeoff between
748	the desire for a compact and efficient encoding of largely-
749	binary data and the desire for a readable encoding of data
750	that is mostly, but not entirely, 7bit.  For this reason, at
751	least two encoding mechanisms are necessary: a "readable"
752	encoding (quoted-printable) and a "dense" encoding (base64).

754	Mail transport for unencoded 8bit data is defined in RFC 1652.
755	As of the initial publication of this document, there are no
756	standardized Internet mail transports for which it is
757	legitimate to include unencoded binary data in mail bodies.
758	Thus there are no circumstances in which the "binary"
759	Content-Transfer-Encoding is actually valid in Internet mail.
760	However, in the event that binary mail transport becomes a
761	reality in Internet mail, or when this document is used in
762	conjunction with any other binary-capable transport mechanism,
763	binary bodies should be labelled as such using this mechanism.

765	NOTE:  The five values defined for the Content-Transfer-
766	Encoding field imply nothing about the media type other than
767	the algorithm by which it was encoded or the transport system
768	requirements if unencoded.

770	8.3.  New Content-Transfer-Encodings

772	Implementors may, if necessary, define private Content-
773	Transfer-Encoding values, but must use an x-token, which is a
774	name prefixed by "X-", to indicate its non-standard status,
775	e.g., "Content-Transfer-Encoding:  x-my-new-encoding".
776	Additional standardized Content-Transfer-Encoding values must
777	be specified by a standards-track RFC.  Additional
778	requirements such specifications must meet are given in RFC
779	MIME-REG.  As such, the entire content-transfer-encoding
780	namespace except that beginning with "X-" is explicitly
781	reserved to the IETF for future use.

783	Unlike media types and subtypes, the creation of new Content-
784	Transfer-Encoding values is STRONGLY discouraged, as it seems
785	likely to hinder interoperability with little potential
786	benefit.

788	8.4.  Interpretation and Use

790	If a Content-Transfer-Encoding header field appears as part of
791	a message header, it applies to the entire body of that
792	message.  If a Content-Transfer-Encoding header field appears
793	as part of an entity's headers, it applies only to the body of
794	that entity.  If an entity is of type "multipart" the
795	Content-Transfer-Encoding is not permitted to have any value
796	other than "7bit", "8bit" or "binary".  Even more severe
797	restrictions apply to some subtypes of the "message" type.

799	It should be noted that most media types are defined in terms
800	of octets rather than bits, so that the mechanisms described
801	here are mechanisms for encoding arbitrary octet streams, not
802	bit streams.  If a bit stream is to be encoded via one of
803	these mechanisms, it must first be converted to an 8bit byte
804	stream using the network standard bit order ("big-endian"), in
805	which the earlier bits in a stream become the higher-order
806	bits in an 8bit byte.  A bit stream not ending at an 8bit
807	boundary must be padded with zeroes. RFC MIME-IMT provides a
808	mechanism for noting the addition of such padding in the case
809	of the application/octet-stream media type, which has a
810	"padding" parameter.
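The bit-stream conversion described above can be sketched in
Python as follows.  This is an illustration only; the function
name and its return convention are assumptions.  The returned
padding count is the number that could be carried in the
"padding" parameter of application/octet-stream.

```python
def pack_bits(bits):
    """Pack a sequence of 0/1 values into octets, earliest bit highest.

    Returns (octets, number_of_zero_padding_bits).
    """
    bits = list(bits)
    padding = (-len(bits)) % 8        # zero bits needed to reach an octet boundary
    bits += [0] * padding
    out = bytearray()
    for i in range(0, len(bits), 8):
        octet = 0
        for bit in bits[i:i + 8]:
            octet = (octet << 1) | bit   # earlier bits end up higher-order
        out.append(octet)
    return bytes(out), padding
```

For example, the three-bit stream 1, 0, 1 becomes the single
octet 10100000 (hexadecimal A0) with five bits of padding.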

812	The encoding mechanisms defined here explicitly encode all
813	data in US-ASCII.  Thus, for example, suppose an entity has
814	header fields such as:

816	  Content-Type: text/plain; charset=ISO-8859-1
817	  Content-transfer-encoding: base64

819	This must be interpreted to mean that the body is a base64
820	US-ASCII encoding of data that was originally in ISO-8859-1,
821	and will be in that character set again after decoding.
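The relationship between the character set and the transfer
encoding can be seen in a short Python sketch using the
standard base64 module.  The sample word is an assumption
chosen for illustration.

```python
import base64

original = "caf\u00e9"                     # contains one non-US-ASCII character
octets = original.encode("iso-8859-1")     # the body in its declared charset
body = base64.b64encode(octets).decode("ascii")   # what travels in the message

# The receiver reverses the two steps in the opposite order.
decoded = base64.b64decode(body).decode("iso-8859-1")

print(body)      # Y2Fm6Q==
print(decoded == original)
```

The transmitted body consists solely of US-ASCII characters,
while the charset parameter describes the data only after the
base64 transformation has been removed.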

823	Certain Content-Transfer-Encoding values may only be used on
824	certain media types.  In particular, it is EXPRESSLY FORBIDDEN
825	to use any encodings other than "7bit", "8bit", or "binary"
826	with any composite media type, i.e. one that recursively
827	includes other Content-Type fields.  Currently the only
828	composite media types are "multipart" and "message".  All
829	encodings that are desired for bodies of type multipart or
830	message must be done at the innermost level, by encoding the
831	actual body that needs to be encoded.
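A composing or validating agent might check this restriction
as in the following Python sketch.  The names are hypothetical,
and the additional restrictions that apply to some subtypes of
"message" are not modelled here.

```python
IDENTITY_ENCODINGS = {"7bit", "8bit", "binary"}
COMPOSITE_TYPES = {"multipart", "message"}


def encoding_allowed(media_type: str, mechanism: str) -> bool:
    """Return False for a forbidden encoding on a composite type."""
    top_level = media_type.split("/", 1)[0].strip().lower()
    mechanism = mechanism.strip().lower()
    if top_level in COMPOSITE_TYPES:
        return mechanism in IDENTITY_ENCODINGS
    return True
```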

833	It should also be noted that, by definition, if a composite
834	entity has a transfer-encoding value such as "7bit", but one
835	of the enclosed entities has a less restrictive value such as
836	"8bit", then either the outer "7bit" labelling is in error,
837	because 8bit data are included, or the inner "8bit" labelling
838	placed an unnecessarily high demand on the transport system
839	because the included data were actually 7bit-safe.

841	NOTE ON ENCODING RESTRICTIONS:  Though the prohibition against
842	using content-transfer-encodings on composite body data may
843	seem overly restrictive, it is necessary to prevent nested
844	encodings, in which data are passed through an encoding
845	algorithm multiple times, and must be decoded multiple times
846	in order to be properly viewed.  Nested encodings add
847	considerable complexity to user agents:  Aside from the
848	obvious efficiency problems with such multiple encodings, they
849	can obscure the basic structure of a message.  In particular,
850	they can imply that several decoding operations are necessary
851	simply to find out what types of bodies a message contains.

853	Banning nested encodings may complicate the job of certain
854	mail gateways, but this seems less of a problem than the
855	effect of nested encodings on user agents.

857	Any entity with an unrecognized Content-Transfer-Encoding must
858	be treated as if it has a Content-Type of "application/octet-
859	stream", regardless of what the Content-Type header field
860	actually says.
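This fallback rule can be sketched in Python as follows.  The
function name is hypothetical, and the set of recognized
mechanisms is assumed to be the five values defined in this
document; an implementation that also understands some
x-token mechanism would add it to the set.

```python
RECOGNIZED = {"7bit", "8bit", "binary", "quoted-printable", "base64"}


def effective_content_type(content_type: str,
                           mechanism: str = "7bit") -> str:
    """Return the media type a receiver should act on."""
    if mechanism.strip().lower() in RECOGNIZED:
        return content_type
    # The body cannot be decoded, so its declared type is not trusted.
    return "application/octet-stream"
```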

862	NOTE ON THE RELATIONSHIP BETWEEN CONTENT-TYPE AND CONTENT-
863	TRANSFER-ENCODING: It may seem that the Content-Transfer-
864	Encoding could be inferred from the characteristics of the
865	media that is to be encoded, or, at the very least, that
866	certain Content-Transfer-Encodings could be mandated for use
867	with specific media types.  There are several reasons why this
868	is not the case. First, given the varying types of transports
869	used for mail, some encodings may be appropriate for some
870	combinations of media types and transports but not for others.
871	(For example, in an 8bit transport, no encoding would be
872	required for text in certain character sets, while such
873	encodings are clearly required for 7bit SMTP.)

875	Second, certain media types may require different types of
876	transfer encoding under different circumstances.  For example,
877	many PostScript bodies might consist entirely of short lines
878	of 7bit data and hence require no encoding at all.  Other
879	PostScript bodies (especially those using Level 2 PostScript's
880	binary encoding mechanism) may only be reasonably represented
881	using a binary transport encoding.  Finally, since the
882	Content-Type field is intended to be an open-ended
883	specification mechanism, strict specification of an
884	association between media types and encodings effectively
885	couples the specification of an application protocol with a
886	specific lower-level transport.  This is not desirable since
887	the developers of a media type should not have to be aware of
888	all the transports in use and what their limitations are.

890	8.5.  Translating Encodings

892	The quoted-printable and base64 encodings are designed so that
893	conversion between them is possible.  The only issue that
894	arises in such a conversion is the handling of hard line
895	breaks in quoted-printable encoding output. When converting
896	from quoted-printable to base64 a hard line break must be
897	converted into a CRLF sequence.  Similarly, a CRLF sequence in
898	base64 data must be converted to a quoted-printable hard line
899	break, but ONLY when converting text data.

901	8.6.  Canonical Encoding Model

903	There was some confusion, in the previous versions of this
904	RFC, regarding the model for when email data was to be
905	converted to canonical form and encoded, and in particular how
906	this process would affect the treatment of CRLFs, given that
907	the representation of newlines varies greatly from system to
908	system, and the relationship between content-transfer-
909	encodings and character sets.  A canonical model for encoding
910	is presented in RFC MIME-CONF for this reason.

912	8.7.  Quoted-Printable Content-Transfer-Encoding

914	The Quoted-Printable encoding is intended to represent data
915	that largely consists of octets that correspond to printable
916	characters in the US-ASCII character set.  It encodes the data
917	in such a way that the resulting octets are unlikely to be
918	modified by mail transport.  If the data being encoded are
919	mostly US-ASCII text, the encoded form of the data remains
920	largely recognizable by humans.  A body which is entirely US-
921	ASCII may also be encoded in Quoted-Printable to ensure the
922	integrity of the data should the message pass through a
923	character-translating, and/or line-wrapping gateway.

925	In this encoding, octets are to be represented as determined
926	by the following rules:

928	 (1)   (General 8bit representation) Any octet, except a CR or
929	       LF that is part of a CRLF line break of the canonical
930	       (standard) form of the data being encoded, may be
931	       represented by an "=" followed by a two digit
932	       hexadecimal representation of the octet's value.  The
933	       digits of the hexadecimal alphabet, for this purpose,
934	       are "0123456789ABCDEF".  Uppercase letters must be used
935	       when sending hexadecimal data, though a robust
936	       implementation may choose to recognize lowercase
937	       letters on receipt.  Thus, for example, the decimal
938	       value 12 (US-ASCII form feed) can be represented by
939	       "=0C", and the decimal value 61 (US-ASCII EQUAL SIGN)
940	       can be represented by "=3D".  This rule must be
941	       followed except when the following rules allow an
942	       alternative encoding.

944	 (2)   (Literal representation) Octets with decimal values of
945	       33 through 60 inclusive, and 62 through 126, inclusive,
946	       MAY be represented as the US-ASCII characters which
947	       correspond to those octets (EXCLAMATION POINT through
948	       LESS THAN, and GREATER THAN through TILDE,
949	       respectively).

951	 (3)   (White Space) Octets with values of 9 and 32 MAY be
952	       represented as US-ASCII TAB (HT) and SPACE characters,
953	       respectively, but MUST NOT be so represented at the end
954	       of an encoded line.  Any TAB (HT) or SPACE characters
955	       on an encoded line MUST thus be followed on that line
956	       by a printable character.  In particular, an "=" at the
957	       end of an encoded line, indicating a soft line break
958	       (see rule #5) may follow one or more TAB (HT) or SPACE
959	       characters.  It follows that an octet with decimal
960	       value 9 or 32 appearing at the end of an encoded line
961	       must be represented according to Rule #1.  This rule is
962	       necessary because some MTAs (Message Transport Agents,
963	       programs which transport messages from one user to
964	       another, or perform a portion of such transfers) are
965	       known to pad lines of text with SPACEs, and others are
966	       known to remove "white space" characters from the end
967	       of a line.  Therefore, when decoding a Quoted-Printable
968	       body, any trailing white space on a line must be
969	       deleted, as it will necessarily have been added by
970	       intermediate transport agents.

972	 (4)   (Line Breaks) A line break in a text body, represented
973	       as a CRLF sequence in the text canonical form, must be
974	       represented by a (RFC 822) line break, which is also a
975	       CRLF sequence, in the Quoted-Printable encoding.  Since
976	       the canonical representation of media types other than
977	       text does not generally include the representation of
978	       line breaks as CRLF sequences, no hard line breaks
979	       (i.e. line breaks that are intended to be meaningful
980	       and to be displayed to the user) should occur in the
981	       quoted-printable encoding of such types.  Sequences
982	       like "=0D", "=0A", "=0A=0D" and "=0D=0A" will routinely
983	       appear in non-text data represented in quoted-
984	       printable, of course.

986	       Note that many implementations may elect to encode the
987	       local representation of various content types directly
988	       rather than converting to canonical form first,
989	       encoding, and then converting back to local
990	       representation.  In particular, this may apply to plain
991	       text material on systems that use newline conventions
992	       other than a CRLF terminator sequence.  Such an
993	       implementation optimization is permissible, but only
994	       when the combined canonicalization-encoding step is
995	       equivalent to performing the three steps separately.

997	 (5)   (Soft Line Breaks) The Quoted-Printable encoding
998	       REQUIRES that encoded lines be no more than 76
999	       characters long.  If longer lines are to be encoded
1000	       with the Quoted-Printable encoding, "soft" line breaks
1001	       must be used.  An equal sign as the last character on an
1002	       encoded line indicates such a non-significant ("soft")
1003	       line break in the encoded text.

1005	Thus if the "raw" form of the line is a single unencoded line
1006	that says:

1008	  Now's the time for all folk to come to the aid of their country.

1010	This can be represented, in the Quoted-Printable encoding, as:

1012	  Now's the time =
1013	  for all folk to come=
1014	   to the aid of their country.

1016	This provides a mechanism with which long lines are encoded in
1017	such a way as to be restored by the user agent.  The 76
1018	character limit does not count the trailing CRLF, but counts
1019	all other characters, including any equal signs.
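The five rules can be combined into a compact encoder and
decoder.  The Python sketch below is an illustration, not a
reference implementation: the function names are hypothetical,
the encoder breaks lines slightly earlier than strictly
required, and the decoder does no error recovery on malformed
input.

```python
HEX = "0123456789ABCDEF"


def qp_encode(data: bytes, text: bool = True) -> bytes:
    """Encode data; with text=True, CRLF pairs are hard line breaks."""
    def encode_octet(octet, last):
        if 33 <= octet <= 126 and octet != 61:
            return chr(octet)                     # rule 2: literal
        if octet in (9, 32) and not last:
            return chr(octet)                     # rule 3: interior white space
        return "=" + HEX[octet >> 4] + HEX[octet & 15]   # rule 1

    segments = data.split(b"\r\n") if text else [data]
    lines = []
    for segment in segments:
        line = ""
        for i, octet in enumerate(segment):
            piece = encode_octet(octet, last=(i == len(segment) - 1))
            if len(line) + len(piece) > 75:       # leave room for "="
                lines.append(line + "=")          # rule 5: soft line break
                line = ""
            line += piece
        lines.append(line)
    return "\r\n".join(lines).encode("ascii")     # rule 4: hard breaks stay CRLF


def qp_decode(encoded: bytes) -> bytes:
    out = bytearray()
    lines = encoded.split(b"\r\n")
    for index, line in enumerate(lines):
        line = line.rstrip(b" \t")                # padding added in transport
        soft = line.endswith(b"=")
        if soft:
            line = line[:-1]
        i = 0
        while i < len(line):
            if line[i] == 61:                     # "=" introduces two hex digits
                out.append(int(line[i + 1:i + 3], 16))
                i += 3
            else:
                out.append(line[i])
                i += 1
        if not soft and index < len(lines) - 1:
            out += b"\r\n"
    return bytes(out)
```

Applied to the three-line encoded example above, qp_decode
restores the original single line.  Calling qp_encode with
text=False encodes every CR and LF as "=0D" and "=0A", which
is the treatment binary data requires.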

1021	Since the hyphen character ("-") may be represented as itself
1022	in the Quoted-Printable encoding, care must be taken, when
1023	encapsulating a quoted-printable encoded body inside one or
1024	more multipart entities, to ensure that the boundary delimiter
1025	does not appear anywhere in the encoded body.  (A good
1026	strategy is to choose a boundary that includes a character
1027	sequence such as "=_" which can never appear in a quoted-
1028	printable body.  See the definition of multipart messages in
1029	MIME-IMT.)

1030	NOTE:  The quoted-printable encoding represents something of a
1031	compromise between readability and reliability in transport.
1032	Bodies encoded with the quoted-printable encoding will work
1033	reliably over most mail gateways, but may not work perfectly
1034	over a few gateways, notably those involving translation into
1035	EBCDIC.  A higher level of confidence is offered by the base64
1036	Content-Transfer-Encoding.  A way to get reasonably reliable
1037	transport through EBCDIC gateways is to also quote the US-
1038	ASCII characters

1040	  !"#$@[\]^`{|}~

1042	according to rule #1.

1044	Because quoted-printable data is generally assumed to be
1045	line-oriented, it is to be expected that the representation of
1046	the breaks between the lines of quoted printable data may be
1047	altered in transport, in the same manner that plain text mail
1048	has always been altered in Internet mail when passing between
1049	systems with differing newline conventions.  If such
1050	alterations are likely to constitute a corruption of the data,
1051	it is probably more sensible to use the base64 encoding rather
1052	than the quoted-printable encoding.

1054	WARNING TO IMPLEMENTORS:  If binary data are encoded in
1055	quoted-printable, care must be taken to encode CR and LF
1056	characters as "=0D" and "=0A", respectively.  In particular, a
1057	CRLF sequence in binary data should be encoded as "=0D=0A".
1058	Otherwise, if CRLF were represented as a hard line break, it
1059	might be incorrectly decoded on platforms with different line
1060	break conventions.

1062	For formalists, the syntax of quoted-printable data is
1063	described by the following grammar:

1065	  quoted-printable := qp-line *(CRLF qp-line)

1067	  qp-line := *(qp-segment transport-padding CRLF)
1068	             qp-part transport-padding

1070	  qp-part := qp-section
1071	             ; Maximum length of 76 characters

1073	  qp-segment := qp-section *(SPACE / TAB) "="
1074	                ; Maximum length of 76 characters

1076	  qp-section := [*(ptext / SPACE / TAB) ptext]

1078	  ptext := hex-octet / safe-char

1080	  safe-char := <any octet with decimal value of 33 through
1081	               60 inclusive, and 62 through 126>
1082	               ; Characters not listed as "mail-safe" in
1083	               ; RFC MIME-CONF are also not recommended.

1085	  hex-octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F")
1086	               ; Octet must be used for characters > 127, =,
1087	               ; SPACEs or TABs at the ends of lines, and is
1088	               ; recommended for any character not listed in
1089	               ; RFC MIME-CONF as "mail-safe".

1091	  transport-padding := *LWSP-char
1092	                       ; Composers MUST NOT generate
1093	                       ; non-zero length transport
1094	                       ; padding, but receivers MUST
1095	                       ; be able to handle padding
1096	                       ; added by message transports.

1098	IMPORTANT:  The addition of LWSP between the elements shown in
1099	this BNF is NOT allowed since this BNF does not specify a
1100	structured header field.

1102	8.8.  Base64 Content-Transfer-Encoding

1104	The Base64 Content-Transfer-Encoding is designed to represent
1105	arbitrary sequences of octets in a form that need not be
1106	humanly readable.  The encoding and decoding algorithms are
1107	simple, but the encoded data are consistently only about 33
1108	percent larger than the unencoded data.  This encoding is
1109	virtually identical to the one used in Privacy Enhanced Mail
1110	(PEM) applications, as defined in RFC 1421.

1112	A 65-character subset of US-ASCII is used, enabling 6 bits to
1113	be represented per printable character. (The extra 65th
1114	character, "=", is used to signify a special processing
1115	function.)

1117	NOTE:  This subset has the important property that it is
1118	represented identically in all versions of ISO 646, including
1119	US-ASCII, and all characters in the subset are also
1120	represented identically in all versions of EBCDIC. Other
1121	popular encodings, such as the encoding used by the uuencode
1122	utility, Macintosh binhex 4.0 [RFC-1741], and the base85
1123	encoding specified as part of Level 2 PostScript, do not share
1124	these properties, and thus do not fulfill the portability
1125	requirements a binary transport encoding for mail must meet.

1127	The encoding process represents 24-bit groups of input bits as
1128	output strings of 4 encoded characters.  Proceeding from left
1129	to right, a 24-bit input group is formed by concatenating 3
1130	8bit input groups.  These 24 bits are then treated as 4
1131	concatenated 6-bit groups, each of which is translated into a
1132	single digit in the base64 alphabet.  When encoding a bit
1133	stream via the base64 encoding, the bit stream must be
1134	presumed to be ordered with the most-significant-bit first.
1135	That is, the first bit in the stream will be the high-order
1136	bit in the first 8bit byte, and the eighth bit will be the
1137	low-order bit in the first 8bit byte, and so on.

1139	Each 6-bit group is used as an index into an array of 64
1140	printable characters.  The character referenced by the index
1141	is placed in the output string.  These characters, identified
1142	in Table 1, below, are selected so as to be universally
1143	representable, and the set excludes characters with particular
1144	significance to SMTP (e.g., ".", CR, LF) and to the multipart
1145	boundary delimiters defined in MIME-IMT (e.g., "-").

1147	                 Table 1: The Base64 Alphabet

1149	  Value Encoding  Value Encoding  Value Encoding  Value Encoding
1150	      0 A            17 R            34 i            51 z
1151	      1 B            18 S            35 j            52 0
1152	      2 C            19 T            36 k            53 1
1153	      3 D            20 U            37 l            54 2
1154	      4 E            21 V            38 m            55 3
1155	      5 F            22 W            39 n            56 4
1156	      6 G            23 X            40 o            57 5
1157	      7 H            24 Y            41 p            58 6
1158	      8 I            25 Z            42 q            59 7
1159	      9 J            26 a            43 r            60 8
1160	     10 K            27 b            44 s            61 9
1161	     11 L            28 c            45 t            62 +
1162	     12 M            29 d            46 u            63 /
1163	     13 N            30 e            47 v
1164	     14 O            31 f            48 w         (pad) =
1165	     15 P            32 g            49 x
1166	     16 Q            33 h            50 y

1168	The encoded output stream must be represented in lines of no
1169	more than 76 characters each.  All line breaks or other
1170	characters not found in Table 1 must be ignored by decoding
1171	software.  In base64 data, characters other than those in
1172	Table 1, line breaks, and other white space probably indicate
1173	a transmission error, about which a warning message or even a
1174	message rejection might be appropriate under some
1175	circumstances.

1177	Special processing is performed if fewer than 24 bits are
1178	available at the end of the data being encoded.  A full
1179	encoding quantum is always completed at the end of a body.
1180	When fewer than 24 input bits are available in an input group,
1181	zero bits are added (on the right) to form an integral number
1182	of 6-bit groups.  Padding at the end of the data is performed
1183	using the "=" character.  Since all base64 input is an
1184	integral number of octets, only the following cases can arise:
1185	(1) the final quantum of encoding input is an integral
1186	multiple of 24 bits; here, the final unit of encoded output
1187	will be an integral multiple of 4 characters with no "="
1188	padding, (2) the final quantum of encoding input is exactly 8
1189	bits; here, the final unit of encoded output will be two
1190	characters followed by two "=" padding characters, or (3) the
1191	final quantum of encoding input is exactly 16 bits; here, the
1192	final unit of encoded output will be three characters followed
1193	by one "=" padding character.
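The grouping, table lookup, and padding steps can be sketched
in Python as follows.  The function name is hypothetical; the
alphabet string is Table 1 written out in index order.  The
standard base64 module is imported only so that the result
can be compared with an independent implementation.

```python
import base64

ALPHABET = ("ABCDEFGHIJKLMNOPQRSTUVWXYZ"
            "abcdefghijklmnopqrstuvwxyz"
            "0123456789+/")


def b64_encode(data: bytes) -> str:
    pieces = []
    for i in range(0, len(data), 3):
        group = data[i:i + 3]
        # Form a 24-bit quantity, adding zero bits on the right if short.
        value = int.from_bytes(group + b"\x00" * (3 - len(group)), "big")
        quad = [ALPHABET[(value >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        keep = len(group) + 1         # 1 octet -> 2 chars, 2 -> 3, 3 -> 4
        pieces.append("".join(quad[:keep]) + "=" * (4 - keep))
    text = "".join(pieces)
    # Output lines must be no longer than 76 characters.
    return "\r\n".join(text[i:i + 76] for i in range(0, len(text), 76))


print(b64_encode(b"Man"))   # TWFu
print(b64_encode(b"Ma"))    # TWE=
print(b64_encode(b"M"))     # TQ==
```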

1195	Because it is used only for padding at the end of the data,
1196	the occurrence of any "=" characters may be taken as evidence
1197	that the end of the data has been reached (without truncation
1198	in transit).  No such assurance is possible, however, when the
1199	number of octets transmitted was a multiple of three and no
1200	"=" characters are present.

1202	Any characters outside of the base64 alphabet are to be
1203	ignored in base64-encoded data.
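A tolerant decoder can be sketched in Python by filtering the
input before handing it to the standard base64 module.  The
function name is hypothetical.  This sketch silently drops
foreign characters; an implementation may prefer to warn, as
suggested above.

```python
import base64

B64_OCTETS = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=")


def b64_decode_lenient(encoded: bytes) -> bytes:
    """Decode base64, ignoring line breaks and other foreign characters."""
    cleaned = bytes(octet for octet in encoded if octet in B64_OCTETS)
    return base64.b64decode(cleaned, validate=True)


print(b64_decode_lenient(b"TW\r\nFu !IA==\r\n"))   # b'Man '
```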

1205	Care must be taken to use the proper octets for line breaks if
1206	base64 encoding is applied directly to text material that has
1207	not been converted to canonical form.  In particular, text
1208	line breaks must be converted into CRLF sequences prior to
1209	base64 encoding.  The important thing to note is that this may
1210	be done directly by the encoder rather than in a prior
1211	canonicalization step in some implementations.
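An encoder that performs the line break conversion itself
might look like the following Python sketch.  The function
name and the default charset argument are assumptions, and
wrapping of the output into lines of at most 76 characters is
omitted for brevity.

```python
import base64
import re


def b64_encode_text(text: str, charset: str = "us-ascii") -> str:
    """Canonicalize local line breaks to CRLF, then base64 encode."""
    canonical = re.sub(r"\r\n|\r|\n", "\r\n", text)
    return base64.b64encode(canonical.encode(charset)).decode("ascii")
```

A text using bare LF line breaks and the same text using CRLF
line breaks therefore produce identical encoded output.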

1213	NOTE: There is no need to worry about quoting potential
1214	boundary delimiters within base64-encoded bodies within
1215	multipart entities because no hyphen characters are used in
1216	the base64 encoding.

1218	9.  Content-ID Header Field

1220	In constructing a high-level user agent, it may be desirable
1221	to allow one body to make reference to another.  Accordingly,
1222	bodies may be labelled using the "Content-ID" header field,
1223	which is syntactically identical to the "Message-ID" header
1224	field:

1226	  id := "Content-ID" ":" msg-id

1228	Like the Message-ID values, Content-ID values must be
1229	generated to be world-unique.
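
As a sketch, a Content-ID field could be generated with the Python standard library function email.utils.make_msgid, which produces a value in msg-id form.  The helper name and the domain "example.com" are placeholders; uniqueness in practice depends on choosing a domain the generator actually controls.

```python
from email.utils import make_msgid

def content_id_header(domain: str = "example.com") -> str:
    """Return a Content-ID header line carrying a freshly generated msg-id."""
    # make_msgid combines the time, process id, and random bits with the
    # domain and returns the result enclosed in angle brackets.
    return "Content-ID: " + make_msgid(domain=domain)
```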

1231	The Content-ID value may be used for uniquely identifying MIME
1232	entities in several contexts, particularly for caching data
1233	referenced by the message/external-body mechanism.  Although
1234	the Content-ID header is generally optional, its use is
1235	MANDATORY in implementations which generate data of the
1236	optional MIME media type "message/external-body".  That is,
1237	each message/external-body entity must have a Content-ID field
1238	to permit caching of such data.

1240	It is also worth noting that the Content-ID value has special
1241	semantics in the case of the multipart/alternative media type.
1242	This is explained in the section of MIME-IMT dealing with
1243	multipart/alternative.

1245	10.  Content-Description Header Field

1247	The ability to associate some descriptive information with a
1248	given body is often desirable.  For example, it may be useful
1249	to mark an "image" body as "a picture of the Space Shuttle
1250	Endeavour."  Such text may be placed in the Content-Description
1251	header field.  This header field is always optional.

1253	  description := "Content-Description" ":" *text

1255	The description is presumed to be given in the US-ASCII
1256	character set, although the mechanism specified in RFC MIME-
1257	HEADERS may be used for non-US-ASCII Content-Description
1258	values.

1260	11.  Additional MIME Header Fields

1262	Future documents may elect to define additional MIME header
1263	fields for various purposes.  Any new header field that
1264	further describes the content of a message should begin with
1265	the string "Content-" to allow such fields which appear in a
1266	message header to be distinguished from ordinary RFC 822
1267	message header fields.

1269	  MIME-extension-field := <Any RFC 822 header field which
1270	                           begins with the string
1271	                           "Content-">
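
A receiving agent might recognize such fields with a test like the one sketched below.  The function name is hypothetical, and the comparison is made without regard to case on the assumption that RFC 822 field names are case-insensitive.  Note that the test also matches the fields defined in this document, such as Content-Type.

```python
def is_mime_extension_field(header_line: str) -> bool:
    """Report whether a header line's field name begins with "Content-"."""
    name, colon, _body = header_line.partition(":")
    # A line without a colon is not a header field at all.
    return bool(colon) and name.strip().lower().startswith("content-")
```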

1273	12.  Summary

1275	Using the MIME-Version, Content-Type, and Content-Transfer-
1276	Encoding header fields, it is possible to include, in a
1277	standardized way, arbitrary types of data with RFC 822
1278	conformant mail messages.  No restrictions imposed by either
1279	RFC 821 or RFC 822 are violated, and care has been taken to
1280	avoid problems caused by additional restrictions imposed by
1281	the characteristics of some Internet mail transport mechanisms
1282	(see RFC MIME-CONF).

1284	The next document in this set, RFC MIME-IMT, specifies the
1285	initial set of media types that can be labelled and
1286	transported using these headers.

1288	13.  Security Considerations

1290	Security issues are discussed in the second document in this
1291	set, RFC MIME-IMT.

1293	14.  Authors' Addresses

1295	For more information, the authors of this document are best
1296	contacted via Internet mail:

1298	Nathaniel S. Borenstein
1299	First Virtual Holdings
1300	25 Washington Avenue
1301	Morristown, NJ 07960
1302	USA

1304	Email: nsb@nsb.fv.com
1305	Phone: +1 201 540 8967
1306	Fax:   +1 201 993 3032

1308	Ned Freed
1309	Innosoft International, Inc.
1310	1050 East Garvey Avenue South
1311	West Covina, CA 91790
1312	USA

1314	Email: ned@innosoft.com
1315	Phone: +1 818 919 3600
1316	Fax:   +1 818 919 3614

1318	MIME is a result of the work of the Internet Engineering Task
1319	Force Working Group on Email Extensions.  The chairman of that
1320	group, Greg Vaudreuil, may be reached at:

1322	Gregory M. Vaudreuil
1323	Octel Network Services
1324	17080 Dallas Parkway
1325	Dallas, TX 75248-1905
1326	USA

1328	Email: Greg.Vaudreuil@Octel.Com
1329	               Appendix A -- Collected Grammar

1331	This appendix contains the complete BNF grammar for all the
1332	syntax specified by this document.

1334	By itself, however, this grammar is incomplete.  It refers by
1335	name to several syntax rules that are defined by RFC 822.
1336	Rather than reproduce those definitions here, and risk
1337	unintentional differences between the two, this document
1338	simply refers the reader to RFC 822 for the remaining
1339	definitions. Wherever a term is undefined, it refers to the
1340	RFC 822 definition.

1342	  attribute := token
1343	               ; Matching of attributes
1344	               ; is ALWAYS case-insensitive.

1346	  composite-type := "message" / "multipart" / extension-token

1348	  content := "Content-Type" ":" type "/" subtype
1349	             *(";" parameter)
1350	             ; Matching of media type and subtype
1351	             ; is ALWAYS case-insensitive.

1353	  description := "Content-Description" ":" *text

1355	  discrete-type := "text" / "image" / "audio" / "video" /
1356	                   "application" / extension-token

1358	  encoding := "Content-Transfer-Encoding" ":" mechanism

1360	  entity-headers := [ content CRLF ]
1361	                    [ encoding CRLF ]
1362	                    [ id CRLF ]
1363	                    [ description CRLF ]
1364	                    *( MIME-extension-field CRLF )

1366	  extension-token := ietf-token / x-token
1367	  hex-octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F")
1368	               ; Octet must be used for characters > 126, =,
1369	               ; SPACEs or TABs at the ends of lines, and is
1370	               ; recommended for any character not listed in
1371	               ; RFC MIME-CONF as "mail-safe".

1373	  iana-token := <A publicly-defined extension token. Tokens
1374	                 of this form must be registered with IANA
1375	                 as specified in RFC MIME-REG.>

1377	  ietf-token := <An extension token defined by a
1378	                 standards-track RFC and registered
1379	                 with IANA.>

1381	  id := "Content-ID" ":" msg-id

1383	  mechanism := "7bit" / "8bit" / "binary" /
1384	               "quoted-printable" / "base64" /
1385	               ietf-token / x-token

1387	  MIME-extension-field := <Any RFC 822 header field which
1388	                           begins with the string
1389	                           "Content-">

1391	  MIME-message-headers := entity-headers
1392	                          fields
1393	                          version CRLF
1394	                          ; The ordering of the header
1395	                          ; fields implied by this BNF
1396	                          ; definition should be ignored.

1398	  MIME-part-headers := entity-headers
1399	                       [fields]
1400	                       ; Any field not beginning with
1401	                       ; "content-" can have no defined
1402	                       ; meaning and may be ignored.
1403	                       ; The ordering of the header
1404	                       ; fields implied by this BNF
1405	                       ; definition should be ignored.

1407	  parameter := attribute "=" value

1409	  ptext := hex-octet / safe-char
1410	  qp-line := *(qp-segment transport-padding CRLF)
1411	             qp-part transport-padding

1413	  qp-part := qp-section
1414	             ; Maximum length of 76 characters

1416	  qp-section := [*(ptext / SPACE / TAB) ptext]

1418	  qp-segment := qp-section *(SPACE / TAB) "="
1419	                ; Maximum length of 76 characters

1421	  quoted-printable := qp-line *(CRLF qp-line)

1423	  safe-char := <any octet with decimal value of 33 through
1424	               60 inclusive, and 62 through 126>
1425	               ; Characters not listed as "mail-safe" in
1426	               ; RFC MIME-CONF are also not recommended.

1428	  subtype := extension-token / iana-token

1430	  token := 1*<any (US-ASCII) CHAR except SPACE, CTLs,
1431	              or tspecials>

1433	  transport-padding := *LWSP-char
1434	                       ; Composers MUST NOT generate
1435	                       ; non-zero length transport
1436	                       ; padding, but receivers MUST
1437	                       ; be able to handle padding
1438	                       ; added by message transports.

1440	  tspecials :=  "(" / ")" / "<" / ">" / "@" /
1441	                "," / ";" / ":" / "\" / <"> /
1442	                "/" / "[" / "]" / "?" / "="
1443	                ; Must be in quoted-string,
1444	                ; to use within parameter values

1446	  type := discrete-type / composite-type

1448	  value := token / quoted-string

1450	  version := "MIME-Version" ":" 1*DIGIT "." 1*DIGIT

1452	  x-token := <The two characters "X-" or "x-" followed, with
1453	              no intervening white space, by any token>
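
To illustrate how the "token", "tspecials", "parameter", and "value" rules interact, the following sketch decides whether a parameter value may be written as a bare token or must be placed in a quoted-string.  The function names are hypothetical, and the quoting step assumes the value contains no CR; it is a sketch under those assumptions, not a complete RFC 822 quoted-string generator.

```python
# The tspecials characters listed in the grammar above.
TSPECIALS = set('()<>@,;:\\"/[]?=')

def is_token(text: str) -> bool:
    """True if text is 1 or more US-ASCII CHARs other than SPACE, CTLs, tspecials."""
    if not text:
        return False
    # Octets 33 through 126 exclude SPACE (32), CTLs (0-31), and DEL (127).
    return all(32 < ord(ch) < 127 and ch not in TSPECIALS for ch in text)

def format_parameter(attribute: str, value: str) -> str:
    """Render attribute "=" value, quoting the value when it is not a token."""
    if not is_token(attribute):
        raise ValueError("attribute must be a token")
    if is_token(value):
        return f"{attribute}={value}"
    # Backslash-escape the two characters that are special inside quotes.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{attribute}="{escaped}"'
```

For example, format_parameter("charset", "us-ascii") gives charset=us-ascii, whereas a value containing a space or a tspecials character, such as "a b.txt", is emitted as name="a b.txt".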