< draft-ietf-822ext-mime-imb-03.txt   draft-ietf-822ext-mime-imb-04.txt >
Network Working Group Nathaniel Borenstein Network Working Group Nathaniel Borenstein
Internet Draft Ned Freed Internet Draft Ned Freed
<draft-ietf-822ext-mime-imb-03.txt> <draft-ietf-822ext-mime-imb-04.txt>
Multipurpose Internet Mail Extensions Multipurpose Internet Mail Extensions
(MIME) Part One: (MIME) Part One:
Format of Internet Message Bodies Format of Internet Message Bodies
May 5, 1995 December 1995
Status of this Memo Status of this Memo
This document is an Internet-Draft. Internet-Drafts are This document is an Internet-Draft. Internet-Drafts are
working documents of the Internet Engineering Task Force working documents of the Internet Engineering Task Force
(IETF), its areas, and its working groups. Note that other (IETF), its areas, and its working groups. Note that other
groups may also distribute working documents as Internet- groups may also distribute working documents as Internet-
Drafts. Drafts.
Internet-Drafts are draft documents valid for a maximum of six Internet-Drafts are draft documents valid for a maximum of six
skipping to change at page 2, line ? skipping to change at page 2, line ?
These documents are based on earlier work documented in RFC These documents are based on earlier work documented in RFC
934, STD 11, and RFC 1049, but extend and revise them. 934, STD 11, and RFC 1049, but extend and revise them.
Because RFC 822 said so little about message bodies, these Because RFC 822 said so little about message bodies, these
documents are largely orthogonal to (rather than a revision documents are largely orthogonal to (rather than a revision
of) RFC 822. of) RFC 822.
In particular, these documents are designed to provide In particular, these documents are designed to provide
facilities to include multiple parts in a single message, to facilities to include multiple parts in a single message, to
represent body and header text in character sets other than represent body and header text in character sets other than
US-ASCII, to represent formatted multi-font text messages, to US-ASCII, to represent formatted multi-font text messages, to
represent non-textual material such as images and audio represent non-textual material such as images and audio clips,
fragments, and generally to facilitate later extensions and generally to facilitate later extensions defining new
defining new types of Internet mail for use by cooperating types of Internet mail for use by cooperating mail agents.
mail agents.
This initial document specifies the various headers used to This initial document specifies the various headers used to
describe the structure of MIME messages. The second document, describe the structure of MIME messages. The second document,
RFC MIME-IMT, defines the general structure of the MIME media RFC MIME-IMT, defines the general structure of the MIME media
typing system and defines an initial set of media types. The typing system and defines an initial set of media types. The
third document, RFC MIME-HEADERS, describes extensions to RFC third document, RFC MIME-HEADERS, describes extensions to RFC
822 to allow non-US-ASCII text data in Internet mail header 822 to allow non-US-ASCII text data in Internet mail header
fields. The fourth document, RFC MIME-REG, specifies various fields. The fourth document, RFC MIME-REG, specifies various
IANA registration procedures for MIME-related entities. The IANA registration procedures for MIME-related facilities. The
fifth and final document, RFC MIME-CONF, describes MIME fifth and final document, RFC MIME-CONF, describes MIME
conformance conformance criteria as well as providing some conformance criteria as well as providing some illustrative
illustrative examples of MIME message formats, examples of MIME message formats, acknowledgements, and the
acknowledgements, and the bibliography. bibliography.
These documents are revisions of RFCs 1521, 1522, and 1590, These documents are revisions of RFCs 1521, 1522, and 1590,
which themselves were revisions of RFCs 1341 and 1342. An which themselves were revisions of RFCs 1341 and 1342. An
appendix in RFC MIME-CONF describes differences and changes appendix in RFC MIME-CONF describes differences and changes
from previous versions. from previous versions.
2. Table of Contents 2. Table of Contents
1 Abstract .............................................. 1 1 Abstract .............................................. 1
2 Table of Contents ..................................... 3 2 Table of Contents ..................................... 3
3 Introduction .......................................... 4 3 Introduction .......................................... 4
4 Definitions, Conventions, and Generic BNF Grammar ..... 6 4 Definitions, Conventions, and Generic BNF Grammar ..... 6
4.1 CRLF ................................................ 7 4.1 CRLF ................................................ 7
4.2 Character Set ....................................... 7 4.2 Character Set ....................................... 7
4.3 Message ............................................. 8 4.3 Message ............................................. 8
4.4 Body Part ........................................... 8 4.4 Entity .............................................. 8
4.5 Entity .............................................. 8 4.5 Body Part ........................................... 8
4.6 Body ................................................ 8 4.6 Body ................................................ 8
4.7 7bit Data ........................................... 8 4.7 7bit Data ........................................... 9
4.8 8bit Data ........................................... 9 4.8 8bit Data ........................................... 9
4.9 Binary Data ......................................... 9 4.9 Binary Data ......................................... 9
4.10 Lines .............................................. 9 4.10 Lines .............................................. 9
5 MIME Header Fields .................................... 9 5 MIME Header Fields .................................... 9
6 MIME-Version Header Field ............................. 10 6 MIME-Version Header Field ............................. 10
7 Content-Type Header Field ............................. 12 7 Content-Type Header Field ............................. 13
7.1 Syntax of the Content-Type Header Field ............. 14 7.1 Syntax of the Content-Type Header Field ............. 14
7.2 Content-Type Defaults ............................... 16 7.2 Content-Type Defaults ............................... 16
8 Content-Transfer-Encoding Header Field ................ 17 8 Content-Transfer-Encoding Header Field ................ 17
8.1 Content-Transfer-Encoding Syntax .................... 17 8.1 Content-Transfer-Encoding Syntax .................... 17
8.2 Content-Transfer-Encodings Semantics ................ 17 8.2 Content-Transfer-Encodings Semantics ................ 18
8.3 New Content-Transfer-Encodings ...................... 19 8.3 New Content-Transfer-Encodings ...................... 19
8.4 Interpretation and Use .............................. 19 8.4 Interpretation and Use .............................. 19
8.5 Translating Encodings ............................... 21 8.5 Translating Encodings ............................... 22
8.6 Canonical Encoding Model ............................ 22 8.6 Canonical Encoding Model ............................ 22
8.7 Quoted-Printable Content-Transfer-Encoding .......... 22 8.7 Quoted-Printable Content-Transfer-Encoding .......... 22
8.8 Base64 Content-Transfer-Encoding .................... 26 8.8 Base64 Content-Transfer-Encoding .................... 27
9 Content-ID Header Field ............................... 29 9 Content-ID Header Field ............................... 29
10 Content-Description Header Field ..................... 29 10 Content-Description Header Field ..................... 30
11 Additional MIME Header Fields ........................ 30 11 Additional MIME Header Fields ........................ 30
12 Summary .............................................. 30 12 Summary .............................................. 30
13 Security Considerations .............................. 30 13 Security Considerations .............................. 31
14 Authors' Addresses ................................... 31 14 Authors' Addresses ................................... 32
A Collected Grammar ..................................... 32 A Collected Grammar ..................................... 33
3. Introduction 3. Introduction
Since its publication in 1982, RFC 822 [RFC-822] has defined Since its publication in 1982, RFC 822 [RFC-822] has defined
the standard format of textual mail messages on the Internet. the standard format of textual mail messages on the Internet.
Its success has been such that the RFC 822 format has been Its success has been such that the RFC 822 format has been
adopted, wholly or partially, well beyond the confines of the adopted, wholly or partially, well beyond the confines of the
Internet and the Internet SMTP transport defined by RFC 821 Internet and the Internet SMTP transport defined by RFC 821
[RFC-821]. As the format has seen wider use, a number of [RFC-821]. As the format has seen wider use, a number of
limitations have proven increasingly restrictive for the user limitations have proven increasingly restrictive for the user
community. community.
skipping to change at page 4, line 28 skipping to change at page 4, line 28
in the case of text, however, RFC 822 is inadequate for the in the case of text, however, RFC 822 is inadequate for the
needs of mail users whose languages require the use of needs of mail users whose languages require the use of
character sets richer than US-ASCII. Since RFC 822 does not character sets richer than US-ASCII. Since RFC 822 does not
specify mechanisms for mail containing audio, video, Asian specify mechanisms for mail containing audio, video, Asian
language text, or even text in most European languages, language text, or even text in most European languages,
additional specifications are needed. additional specifications are needed.
One of the notable limitations of RFC 821/822 based mail One of the notable limitations of RFC 821/822 based mail
systems is the fact that they limit the contents of electronic systems is the fact that they limit the contents of electronic
mail messages to relatively short lines (e.g. 1000 characters mail messages to relatively short lines (e.g. 1000 characters
or less [RFC-821]) of 7-bit US-ASCII. This forces users to or less [RFC-821]) of 7bit US-ASCII. This forces users to
convert any non-textual data that they may wish to send into convert any non-textual data that they may wish to send into
seven-bit bytes representable as printable US-ASCII characters seven-bit bytes representable as printable US-ASCII characters
before invoking a local mail UA (User Agent, a program with before invoking a local mail UA (User Agent, a program with
which human users send and receive mail). Examples of such which human users send and receive mail). Examples of such
encodings currently used in the Internet include pure encodings currently used in the Internet include pure
hexadecimal, uuencode, the 3-in-4 base 64 scheme specified in hexadecimal, uuencode, the 3-in-4 base 64 scheme specified in
RFC 1421, the Andrew Toolkit Representation [ATK], and many RFC 1421, the Andrew Toolkit Representation [ATK], and many
others. others.
The limitations of RFC 822 mail become even more apparent as The limitations of RFC 822 mail become even more apparent as
gateways are designed to allow for the exchange of mail gateways are designed to allow for the exchange of mail
messages between RFC 822 hosts and X.400 hosts. X.400 [X400] messages between RFC 822 hosts and X.400 hosts. X.400 [X400]
specifies mechanisms for the inclusion of non-textual body specifies mechanisms for the inclusion of non-textual material
parts within electronic mail messages. The current standards within electronic mail messages. The current standards for
for the mapping of X.400 messages to RFC 822 messages specify the mapping of X.400 messages to RFC 822 messages specify
either that X.400 non-textual body parts must be converted to either that X.400 non-textual material must be converted to
(not encoded in) IA5Text format, or that they must be (not encoded in) IA5Text format, or that they must be
discarded, notifying the RFC 822 user that discarding has discarded, notifying the RFC 822 user that discarding has
occurred. This is clearly undesirable, as information that a occurred. This is clearly undesirable, as information that a
user may wish to receive is lost. Even though a user agent user may wish to receive is lost. Even though a user agent
may not have the capability of dealing with the non-textual may not have the capability of dealing with the non-textual
body part, the user might have some mechanism external to the material, the user might have some mechanism external to the
UA that can extract useful information from the body part. UA that can extract useful information from the material.
Moreover, it does not allow for the fact that the message may Moreover, it does not allow for the fact that the message may
eventually be gatewayed back into an X.400 message handling eventually be gatewayed back into an X.400 message handling
system (i.e., the X.400 message is "tunneled" through Internet system (i.e., the X.400 message is "tunneled" through Internet
mail), where the non-textual information would definitely mail), where the non-textual information would definitely
become useful again. become useful again.
This document describes several mechanisms that combine to This document describes several mechanisms that combine to
solve most of these problems without introducing any serious solve most of these problems without introducing any serious
incompatibilities with the existing world of RFC 822 mail. In incompatibilities with the existing world of RFC 822 mail. In
particular, it describes: particular, it describes:
skipping to change at page 5, line 32 skipping to change at page 5, line 32
by older or non-conformant software, which are presumed by older or non-conformant software, which are presumed
to lack such a field. to lack such a field.
(2) A Content-Type header field, generalized from RFC 1049 (2) A Content-Type header field, generalized from RFC 1049
[RFC-1049], which can be used to specify the media type [RFC-1049], which can be used to specify the media type
and subtype of data in the body of a message and to and subtype of data in the body of a message and to
fully specify the native representation (canonical fully specify the native representation (canonical
form) of such data. form) of such data.
(3) A Content-Transfer-Encoding header field, which can be (3) A Content-Transfer-Encoding header field, which can be
used to specify an auxiliary encoding that was applied used to specify both the encoding transformation that
to the data in order to allow it to pass through mail was applied to the body and the domain of the result.
transport mechanisms which may have data or character Encoding transformations other than the identity
set limitations. transformation are usually applied to data in order to
allow it to pass through mail transport mechanisms
which may have data or character set limitations.
(4) Two additional header fields that can be used to (4) Two additional header fields that can be used to
further describe the data in a body, the Content-ID and further describe the data in a body, the Content-ID and
Content-Description header fields. Content-Description header fields.
All of the header fields defined in this document are subject All of the header fields defined in this document are subject
to the general syntactic rules for header fields specified in to the general syntactic rules for header fields specified in
RFC 822. In particular, all of these header fields can RFC 822. In particular, all of these header fields except for
include RFC 822 comments, which have no semantic content and Content-Disposition can include RFC 822 comments, which have
should be ignored during MIME processing. no semantic content and should be ignored during MIME
processing.
Finally, to specify and promote interoperability, RFC MIME- Finally, to specify and promote interoperability, RFC MIME-
CONF provides a basic applicability statement for a subset of CONF provides a basic applicability statement for a subset of
the above mechanisms that defines a minimal level of the above mechanisms that defines a minimal level of
"conformance" with this document. "conformance" with this document.
HISTORICAL NOTE: Several of the mechanisms described in this HISTORICAL NOTE: Several of the mechanisms described in this
document may seem somewhat strange or even baroque at first set of documents may seem somewhat strange or even baroque at
reading. It is important to note that compatibility with first reading. It is important to note that compatibility
existing standards AND robustness across existing practice with existing standards AND robustness across existing
were two of the highest priorities of the working group that practice were two of the highest priorities of the working
developed this document. In particular, compatibility was group that developed this set of documents. In particular,
always favored over elegance. compatibility was always favored over elegance.
Please refer to the current edition of the "IAB Official Please refer to the current edition of the "IAB Official
Protocol Standards" for the standardization state and status Protocol Standards" for the standardization state and status
of this protocol. RFC 822 and RFC 1123 [RFC-1123] also of this protocol. RFC 822 and RFC 1123 [RFC-1123] also
provide essential background for MIME since no conforming provide essential background for MIME since no conforming
implementation of MIME can violate them. In addition, several implementation of MIME can violate them. In addition, several
other informational RFC documents will be of interest to the other informational RFC documents will be of interest to the
MIME implementor, in particular RFC 1344 [RFC-1344], RFC 1345 MIME implementor, in particular RFC 1344 [RFC-1344], RFC 1345
[RFC-1345], and RFC 1524 [RFC-1524]. [RFC-1345], and RFC 1524 [RFC-1524].
4. Definitions, Conventions, and Generic BNF Grammar 4. Definitions, Conventions, and Generic BNF Grammar
Although the mechanisms specified in this document are all Although the mechanisms specified in this set of documents are
described in prose, most are also described formally in the all described in prose, most are also described formally in
augmented BNF notation of RFC 822. Implementors will need to the augmented BNF notation of RFC 822. Implementors will need
be familiar with this notation in order to understand this to be familiar with this notation in order to understand this
specification, and are referred to RFC 822 for a complete specification, and are referred to RFC 822 for a complete
explanation of the augmented BNF notation. explanation of the augmented BNF notation.
Some of the augmented BNF in this document makes reference to Some of the augmented BNF in this set of documents makes named
syntactic entities that are defined in RFC 822 and not in this references to syntax rules defined in RFC 822. A complete
document. A complete formal grammar, then, is obtained by formal grammar, then, is obtained by combining the collected
combining Appendix A of this document, the collected grammar, grammar appendices in each document in this set with the BNF
with the BNF of RFC 822 plus the modifications to RFC 822 of RFC 822 plus the modifications to RFC 822 defined in RFC
defined in RFC 1123 (which specifically changes the syntax for 1123 (which specifically changes the syntax for `return',
`return', `date' and `mailbox'). `date' and `mailbox').
In this document, all numeric and octet values are given in All numeric and octet values are given in decimal notation in
decimal notation. All media type values, subtype values, and this set of documents. All media type values, subtype values,
parameter names as defined in this document are case- and parameter names as defined are case-insensitive. However,
insensitive. However, parameter values are case-sensitive parameter values are case-sensitive unless otherwise specified
unless otherwise specified for the specific parameter. for the specific parameter.
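The case rules above can be illustrated with a minimal Python sketch. The function names are hypothetical and appear in neither draft. The sketch assumes the field has already been split into a "type/subtype" string and a dictionary of parameters.

```python
from typing import Dict


def normalize_media_type(media_type: str) -> str:
    """Lower-case a "type/subtype" string so that comparisons ignore case."""
    return media_type.strip().lower()


def normalize_parameters(params: Dict[str, str]) -> Dict[str, str]:
    """Lower-case parameter NAMES only.

    Parameter values are left untouched, because they are case-sensitive
    unless the definition of the specific parameter says otherwise.
    """
    return {name.lower(): value for name, value in params.items()}


def same_media_type(a: str, b: str) -> bool:
    """Compare two "type/subtype" strings case-insensitively."""
    return normalize_media_type(a) == normalize_media_type(b)
```

With these helpers, "Text/Plain" and "text/plain" compare equal, and a parameter named "Charset" is looked up as "charset" while its value keeps its original case.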
FORMATTING NOTE: Notes, such as this one, provide additional FORMATTING NOTE: Notes, such as this one, provide additional
nonessential information which may be skipped by the reader nonessential information which may be skipped by the reader
without missing anything essential. The primary purpose of without missing anything essential. The primary purpose of
these non-essential notes is to convey information about the these non-essential notes is to convey information about the
rationale of this document, or to place this document in the rationale of this set of documents, or to place these
proper historical or evolutionary context. Such information documents in the proper historical or evolutionary context.
may in particular be skipped by those who are focused entirely Such information may in particular be skipped by those who are
on building a conformant implementation, but may be of use to focused entirely on building a conformant implementation, but
those who wish to understand why certain design choices were may be of use to those who wish to understand why certain
made. design choices were made.
4.1. CRLF 4.1. CRLF
The term CRLF, in this document, refers to the sequence of The term CRLF, in this set of documents, refers to the
octets corresponding to the two US-ASCII characters CR sequence of octets corresponding to the two US-ASCII
(decimal value 13) and LF (decimal value 10) which, taken characters CR (decimal value 13) and LF (decimal value 10)
together, in this order, denote a line break in RFC 822 mail. which, taken together, in this order, denote a line break in
RFC 822 mail.
4.2. Character Set 4.2. Character Set
The term "character set" is used in this document to refer to The term "character set" is used in MIME to refer to a method
a table-based method of converting a sequence of octets into a of converting a sequence of octets into a sequence of
sequence of characters. Note that unconditional and characters. Note that unconditional and unambiguous
unambiguous conversion in the other direction is not required, conversion in the other direction is not required, in that not
in that not all characters may be available in a given all characters may be representable by a given character set
character set and a character set may provide more than one and a character set may provide more than one sequence of
sequence of octets to represent a particular character. This octets to represent a particular sequence of characters.
definition is intended to allow various kinds of character
encodings, from simple single-table mappings such as US-ASCII
to complex table switching methods such as those that use ISO
2022's techniques. However, the definition associated with a
MIME character set name must fully specify the mapping to be
performed from octets to characters. In particular, use of
external profiling information to determine the exact mapping
is not permitted.
HISTORICAL NOTE: The term "character set" originated in the This definition is intended to allow various kinds of
definition of US-ASCII and similar 7-bit and 8-bit character encodings, from simple single-table mappings such as
specifications. These define true sets. However, the advent US-ASCII to complex table switching methods such as those that
of multi-octet character encodings and switching techniques use ISO 2022's techniques. However, the definition associated
have transformed character sets into entities that properly with a MIME character set name must fully specify the mapping
speaking are no longer strictly sets. Some other communities to be performed. In particular, use of external profiling
have adopted the term "character encoding" for what MIME calls information to determine the exact mapping is not permitted.
a "character set" as a result.
NOTE: The term "character set" was originally used in MIME
with specifications such as US-ASCII and other 7bit and 8bit
schemes which have a simple mapping from single octets to
single characters. Multi-octet coded character sets and
switching techniques make the situation more complex. For
example, some communities use the term "character encoding"
for what MIME calls a "character set", while using the phrase
"coded character set" to denote an abstract mapping from
integers (not octets) to characters.
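The two properties in this definition can be illustrated with a short Python sketch: octets always map to characters, but the reverse need not be possible or unique. The sketch rests on assumptions. It uses Python's own codec names, which are not the same registry as MIME character set names. It also uses UTF-7 purely as a convenient example of a scheme with more than one octet sequence per character; neither draft mentions it. The function names are hypothetical.

```python
def octets_to_characters(octets: bytes, charset: str) -> str:
    """Apply a character set in the MIME sense: octets in, characters out."""
    return octets.decode(charset)


def is_representable(text: str, charset: str) -> bool:
    """Report whether every character in text can be encoded in charset."""
    try:
        text.encode(charset)
    except UnicodeEncodeError:
        return False
    return True


# Two different octet sequences that decode to the same character.
direct = octets_to_characters(b"A", "utf-7")
shifted = octets_to_characters(b"+AEE-", "utf-7")

# A character with no representation in US-ASCII (e with acute accent).
e_acute_in_ascii = is_representable("\u00e9", "us-ascii")
```

Here direct and shifted hold the same one-character string, and e_acute_in_ascii is False, which is why conversion from characters back to octets cannot be assumed to be unconditional or unambiguous.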
4.3. Message 4.3. Message
The term "message", when not further qualified, means either The term "message", when not further qualified, means either a
the (complete or "top-level") message being transferred on a (complete or "top-level") RFC 822 message being transferred on
network, or a message encapsulated in a body part of type a network, or a message encapsulated in a body of type
"message". "message/rfc822" or "message/partial".
4.4. Body Part 4.4. Entity
The term "body part", in this document, refers to content The term "entity" refers specifically to the MIME-defined
headers and contents of either a message or one of the parts header fields and contents of either a message or one of the
in the body of a multipart entity. A body part has a header parts in the body of a multipart entity. The specification of
and a body, so it makes sense to speak about the body of a such entities is the essence of MIME. Since the contents of
body part. an entity are often called the "body", it makes sense to speak
about the body of an entity. Any sort of field may be present
in the header of an entity, but only those fields whose names
begin with "content-" actually have any MIME-related meaning.
Note that this does NOT imply that they have no meaning at all
-- an entity that is also a message has non-MIME header fields
whose meanings are defined by RFC 822.
4.5. Entity 4.5. Body Part
The term "entity", in this document, means either a message or The term "body part" refers to an entity inside of a multipart
a body part. All kinds of entities share the property that entity.
they have a header and a body.
4.6. Body 4.6. Body
The term "body", when not further qualified, means the body of The term "body", when not further qualified, means the body of
an entity, that is the body of either a message or of a body an entity, that is, the body of either a message or of a body
part. part.
NOTE: The previous four definitions are clearly circular. NOTE: The previous four definitions are clearly circular.
This is unavoidable, since the overall structure of a MIME This is unavoidable, since the overall structure of a MIME
message is indeed recursive. message is indeed recursive.
4.7. 7bit Data 4.7. 7bit Data
"7bit data" refers to data that is all represented as "7bit data" refers to data that is all represented as
relatively short lines with 998 octets or less between CRLF relatively short lines with 998 octets or less between CRLF
skipping to change at page 9, line 25 skipping to change at page 9, line 33
4.9. Binary Data 4.9. Binary Data
"Binary data" refers to data where any sequence of octets "Binary data" refers to data where any sequence of octets
whatsoever is allowed. whatsoever is allowed.
4.10. Lines 4.10. Lines
"Lines" are defined as sequences of octets separated by a CRLF "Lines" are defined as sequences of octets separated by a CRLF
sequence. This is consistent with both RFC 821 and RFC 822. sequence. This is consistent with both RFC 821 and RFC 822.
"Lines" only refers to a unit of text in a message, which may "Lines" only refers to a unit of data in a message, which may
or may not correspond to something that is actually displayed or may not correspond to something that is actually displayed
by a user agent. by a user agent.
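A minimal Python sketch of this definition of lines follows. It splits on the CRLF octet pair only, so a bare LF does not end a line. The 998-octet limit is the figure quoted in the 7bit data definition. The names are hypothetical and the sketch does not check any other constraint on 7bit or 8bit data.

```python
from typing import List

CRLF = bytes([13, 10])  # CR (decimal 13) followed by LF (decimal 10)
MAX_LINE_OCTETS = 998   # limit quoted in the definition of 7bit data


def split_lines(data: bytes) -> List[bytes]:
    """Split data into lines, treating only the CRLF pair as a separator."""
    return data.split(CRLF)


def lines_within_limit(data: bytes, limit: int = MAX_LINE_OCTETS) -> bool:
    """Check that no line between CRLF sequences exceeds limit octets."""
    return all(len(line) <= limit for line in split_lines(data))
```

Because the split is on octets, the result describes units of data in the message, not necessarily what a user agent would display.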
5. MIME Header Fields 5. MIME Header Fields
MIME defines a number of new RFC 822 header fields that are MIME defines a number of new RFC 822 header fields that are
used to describe the content of messages. These header fields used to describe the content of a MIME entity. These header
occur in two contexts: fields occur in at least two contexts:
(1) As part of a regular RFC 822 message header. (1) As part of a regular RFC 822 message header.
(2) In a MIME body part header within a multipart (2) In a MIME body part header within a multipart
construct. construct.
The formal definition of these header fields is as follows: The formal definition of these header fields is as follows:
MIME-message-headers := fields entity-headers := [ content CRLF ]
[ encoding CRLF ]
[ id CRLF ]
[ description CRLF ]
*( MIME-extension-field CRLF )
MIME-message-headers := entity-headers
fields
version CRLF version CRLF
[ content CRLF ]
[ encoding CRLF ]
[ id CRLF ]
[ description CRLF ]
*( mime-extension-field CRLF )
; The ordering of the header ; The ordering of the header
; fields implied by this BNF ; fields implied by this BNF
; definition should be ignored ; definition should be ignored.
MIME-part-headers := [ content CRLF ] MIME-part-headers := entity-headers
[ encoding CRLF ] [ fields ]
[ id CRLF ] ; Any field not beginning with
[ description CRLF ] ; "content-" can have no defined
*( mime-extension-field CRLF ) ; meaning and should be ignored.
; The ordering of the header ; The ordering of the header
; fields implied by this BNF ; fields implied by this BNF
; definition should be ignored ; definition should be ignored.
The syntax of the various specific MIME header fields will be The syntax of the various specific MIME header fields will be
described in the following sections. described in the following sections.
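The -04 comment on MIME-part-headers says that fields not beginning with "content-" should be ignored. The Python sketch below shows one way to apply that. It assumes the header has already been parsed into (name, value) pairs, and the function name is hypothetical. Field names are compared without regard to case, as RFC 822 requires.

```python
from typing import List, Tuple

Header = List[Tuple[str, str]]


def mime_part_fields(header: Header) -> Header:
    """Keep only the fields whose names begin with "content-".

    Order is preserved here, but the grammar states that the ordering
    of header fields carries no meaning.
    """
    return [
        (name, value)
        for name, value in header
        if name.lower().startswith("content-")
    ]


example = [
    ("Content-Type", "text/plain"),
    ("X-Example", "ignored in a body part header"),
    ("content-id", "<part1@example.invalid>"),
]
kept = mime_part_fields(example)
```

In this example kept holds the Content-Type and content-id fields and drops X-Example. The filter is only meant for body part headers: in a top-level message header, MIME-Version is significant and the remaining fields keep their RFC 822 meaning.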
6. MIME-Version Header Field 6. MIME-Version Header Field
Since RFC 822 was published in 1982, there has really been Since RFC 822 was published in 1982, there has really been
only one format standard for Internet messages, and there has only one format standard for Internet messages, and there has
been little perceived need to declare the format standard in been little perceived need to declare the format standard in
use. This document is an independent document that use. This document is an independent document that
skipping to change at page 11, line 23 skipping to change at page 11, line 29
Thus, future format specifiers, which might replace or extend Thus, future format specifiers, which might replace or extend
"1.0", are constrained to be two integer fields, separated by "1.0", are constrained to be two integer fields, separated by
a period. If a message is received with a MIME-version value a period. If a message is received with a MIME-version value
other than "1.0", it cannot be assumed to conform with this other than "1.0", it cannot be assumed to conform with this
specification. specification.
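A hedged Python sketch of checking a MIME-Version value is given below. It removes RFC 822 comments, which carry no semantic content, and then expects two integer fields separated by a period. The function names are hypothetical. The comment stripper handles nested parentheses and backslash quoted-pairs but not quoted strings, which is a simplifying assumption for this field.

```python
import re
from typing import Optional, Tuple

_VERSION = re.compile(r"\s*(\d+)\s*\.\s*(\d+)\s*")


def strip_comments(value: str) -> str:
    """Remove parenthesized RFC 822 comments, including nested ones."""
    out = []
    depth = 0
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # A quoted-pair: keep it outside comments, skip it inside.
            if depth == 0:
                out.append(value[i:i + 2])
            i += 2
            continue
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
        i += 1
    return "".join(out)


def parse_mime_version(value: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor), or None if the value is not two integers."""
    match = _VERSION.fullmatch(strip_comments(value))
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_version_1_0(value: str) -> bool:
    """True only for a value equivalent to "1.0"."""
    return parse_mime_version(value) == (1, 0)
```

A reader built this way treats "1.0" and "1.0 (produced by MetaSend Vx.x)" alike. For any other version it returns False, leaving the caller to decide how to handle a message that cannot be assumed to conform.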
Note that the MIME-Version header field is required at the top Note that the MIME-Version header field is required at the top
level of a message. It is not required for each body part of level of a message. It is not required for each body part of
a multipart entity. It is required for the embedded headers a multipart entity. It is required for the embedded headers
of a body of type "message" if and only if the embedded of a body of type "message/rfc822" or "message/partial" if and
message is itself claimed to be MIME-conformant. only if the embedded message is itself claimed to be MIME-
conformant.
It is not possible to fully specify how a mail reader that It is not possible to fully specify how a mail reader that
conforms with MIME as defined in this document should treat a conforms with MIME as defined in this document should treat a
message that might arrive in the future with some value of message that might arrive in the future with some value of
MIME-Version other than "1.0". MIME-Version other than "1.0".
It is also worth noting that version control for specific It is also worth noting that version control for specific
media types is not accomplished using the MIME-Version media types is not accomplished using the MIME-Version
mechanism. In particular, some formats (such as mechanism. In particular, some formats (such as
application/postscript) have version numbering conventions application/postscript) have version numbering conventions
that are internal to the document format. Where such that are internal to the media format. Where such conventions
conventions exist, MIME does nothing to supersede them. Where exist, MIME does nothing to supersede them. Where no such
no such conventions exist, a MIME media type might use a conventions exist, a MIME media type might use a "version"
"version" parameter in the content-type field if necessary. parameter in the content-type field if necessary.
NOTE TO IMPLEMENTORS: When checking MIME-Version values any NOTE TO IMPLEMENTORS: When checking MIME-Version values any
RFC 822 comment strings that are present must be ignored. In RFC 822 comment strings that are present must be ignored. In
particular, the following four MIME-Version fields are particular, the following four MIME-Version fields are
equivalent: equivalent:
MIME-Version: 1.0 MIME-Version: 1.0
MIME-Version: 1.0 (produced by MetaSend Vx.x) MIME-Version: 1.0 (produced by MetaSend Vx.x)
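The comment rule above can be sketched in a few lines. This is only an illustration, not part of either draft: the function names strip_comments and parse_mime_version are hypothetical, and the sketch assumes that all white space around the two integer fields may be discarded.

```python
import re


def strip_comments(value):
    """Remove RFC 822 comments (parenthesized, possibly nested) from a field value."""
    out = []
    depth = 0
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # A quoted pair: keep it only when it is outside any comment.
            if depth == 0:
                out.append(value[i:i + 2])
            i += 2
            continue
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
        i += 1
    return "".join(out)


def parse_mime_version(value):
    """Return (major, minor) as integers, or None if the value is malformed."""
    # Assumption: white space between the tokens carries no meaning.
    text = "".join(strip_comments(value).split())
    match = re.fullmatch(r"(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

With this sketch, both field values shown above parse to the same pair (1, 0), and a receiver would compare that pair against (1, 0) rather than comparing raw strings.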
skipping to change at page 14, line 34 skipping to change at page 14, line 39
and to avoid a potential conflict with a future official name. and to avoid a potential conflict with a future official name.
7.1. Syntax of the Content-Type Header Field 7.1. Syntax of the Content-Type Header Field
In the Augmented BNF notation of RFC 822, a Content-Type In the Augmented BNF notation of RFC 822, a Content-Type
header field value is defined as follows: header field value is defined as follows:
content := "Content-Type" ":" type "/" subtype content := "Content-Type" ":" type "/" subtype
*(";" parameter) *(";" parameter)
; Matching of media type and subtype ; Matching of media type and subtype
; is ALWAYS case-insensitive ; is ALWAYS case-insensitive.
type := discrete-type / composite-type type := discrete-type / composite-type
discrete-type := "text" / "image" / "audio" / "video" / discrete-type := "text" / "image" / "audio" / "video" /
"application" / extension-token "application" / extension-token
composite-type := "message" / "multipart" / extension-token composite-type := "message" / "multipart" / extension-token
extension-token := ietf-token / x-token
extension-token := iana-token / ietf-token / x-token
iana-token := <a publicly-defined extension token,
registered with IANA, as specified in
RFC MIME-REG [RFC-MIME-REG]>
ietf-token := <a publicly-defined extension token, ietf-token := <a publicly-defined extension token,
initially registered with IANA and initially registered with IANA and
subsequently standardized by the IETF> subsequently standardized by the IETF>
x-token := <The two characters "X-" or "x-" followed, with x-token := <The two characters "X-" or "x-" followed, with
no intervening white space, by any token> no intervening white space, by any token>
subtype := extension-token subtype := extension-token / iana-token
iana-token := <a publicly-defined extension token,
registered with IANA, as specified in
RFC MIME-REG [RFC-MIME-REG]>
parameter := attribute "=" value parameter := attribute "=" value
attribute := token attribute := token
; Matching of attributes
; is ALWAYS case-insensitive.
value := token / quoted-string value := token / quoted-string
token := 1*<any (US-ASCII) CHAR except SPACE, CTLs, token := 1*<any (US-ASCII) CHAR except SPACE, CTLs,
or tspecials> or tspecials>
tspecials := "(" / ")" / "<" / ">" / "@" / tspecials := "(" / ")" / "<" / ">" / "@" /
"," / ";" / ":" / "\" / <"> "," / ";" / ":" / "\" / <">
"/" / "[" / "]" / "?" / "=" "/" / "[" / "]" / "?" / "="
; Must be in quoted-string, ; Must be in quoted-string,
skipping to change at page 16, line 43 skipping to change at page 17, line 8
7.2. Content-Type Defaults 7.2. Content-Type Defaults
Default RFC 822 messages without a MIME Content-Type header Default RFC 822 messages without a MIME Content-Type header
are taken by this protocol to be plain text in the US-ASCII are taken by this protocol to be plain text in the US-ASCII
character set, which can be explicitly specified as: character set, which can be explicitly specified as:
Content-type: text/plain; charset=us-ascii Content-type: text/plain; charset=us-ascii
This default is assumed if no Content-Type header field is This default is assumed if no Content-Type header field is
specified. In the presence of a MIME-Version header field, a specified. It is also recommended that this default be assumed
receiving User Agent can also assume that plain US-ASCII text when a syntactically invalid Content-Type header field is
was the sender's intent. Plain US-ASCII text may still be encountered. In the presence of a MIME-Version header field
assumed in the absence of a MIME-Version specification, but and the absence of any Content-Type header field, a receiving
the sender's intent might have been otherwise. User Agent can also assume that plain US-ASCII text was the
sender's intent. Plain US-ASCII text may still be assumed in
the absence of a MIME-Version or the presence of a
syntactically invalid Content-Type header field, but the
sender's intent might have been otherwise.
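As an illustration of the defaulting rule, the following sketch falls back to text/plain with charset=us-ascii when the Content-Type value is absent or does not match the grammar of section 7.1. It is a simplified reading under stated assumptions: the name effective_content_type is hypothetical, RFC 822 comments inside the field are not handled, and a real parser would need to be more careful.

```python
import re

# A token is any US-ASCII character except SPACE, CTLs, and tspecials.
TOKEN = r"[!#$%&'*+\-.0-9A-Z^_`a-z{|}~]+"
QUOTED = r'"(?:[^"\\\r]|\\.)*"'
CONTENT_TYPE_RE = re.compile(
    rf"\s*({TOKEN})/({TOKEN})"
    rf"((?:\s*;\s*{TOKEN}\s*=\s*(?:{TOKEN}|{QUOTED}))*)\s*\Z"
)
PARAM_RE = re.compile(rf";\s*({TOKEN})\s*=\s*({TOKEN}|{QUOTED})")

DEFAULT = ("text", "plain", {"charset": "us-ascii"})


def effective_content_type(value):
    """Return (type, subtype, params); use the default if absent or invalid."""
    if value is None:
        return DEFAULT
    match = CONTENT_TYPE_RE.match(value)
    if match is None:
        return DEFAULT
    params = {}
    for name, val in PARAM_RE.findall(match.group(3)):
        if val.startswith('"'):
            # Drop the surrounding quotes and undo quoted pairs.
            val = re.sub(r"\\(.)", r"\1", val[1:-1])
        # Attribute names match case-insensitively; values are kept as given.
        params[name.lower()] = val
    # Type and subtype always match case-insensitively.
    return match.group(1).lower(), match.group(2).lower(), params
```

Note that the fallback for a syntactically invalid field is only a recommendation in the text above, and the sender's intent may have been otherwise.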
8. Content-Transfer-Encoding Header Field 8. Content-Transfer-Encoding Header Field
Many media types which could be usefully transported via email Many media types which could be usefully transported via email
are represented, in their "natural" format, as 8-bit character are represented, in their "natural" format, as 8bit character
or binary data. Such data cannot be transmitted over some or binary data. Such data cannot be transmitted over some
transfer protocols. For example, RFC 821 (SMTP) restricts transfer protocols. For example, RFC 821 (SMTP) restricts
mail messages to 7-bit US-ASCII data with lines no longer than mail messages to 7bit US-ASCII data with lines no longer than
1000 characters including any trailing CRLF line separator. 1000 characters including any trailing CRLF line separator.
It is necessary, therefore, to define a standard mechanism for It is necessary, therefore, to define a standard mechanism for
encoding such data into a 7-bit short line format. Proper encoding such data into a 7bit short line format. Proper
labelling of unencoded material in less restrictive formats labelling of unencoded material in less restrictive formats
for direct use over less restrictive transports is also for direct use over less restrictive transports is also
desirable. This document specifies that such encodings will desirable. This document specifies that such encodings will
be indicated by a new "Content-Transfer-Encoding" header be indicated by a new "Content-Transfer-Encoding" header
field. This field has not been defined by any previous field. This field has not been defined by any previous
standard. standard.
8.1. Content-Transfer-Encoding Syntax 8.1. Content-Transfer-Encoding Syntax
The Content-Transfer-Encoding field's value is a single token The Content-Transfer-Encoding field's value is a single token
skipping to change at page 17, line 37 skipping to change at page 18, line 7
Formally: Formally:
encoding := "Content-Transfer-Encoding" ":" mechanism encoding := "Content-Transfer-Encoding" ":" mechanism
mechanism := "7bit" / "8bit" / "binary" / mechanism := "7bit" / "8bit" / "binary" /
"quoted-printable" / "base64" / "quoted-printable" / "base64" /
ietf-token / x-token ietf-token / x-token
These values are not case sensitive -- Base64 and BASE64 and These values are not case sensitive -- Base64 and BASE64 and
bAsE64 are all equivalent. An encoding type of 7BIT requires bAsE64 are all equivalent. An encoding type of 7BIT requires
that the body is already in a 7-bit mail-ready representation. that the body is already in a 7bit mail-ready representation.
This is the default value -- that is, "Content-Transfer- This is the default value -- that is, "Content-Transfer-
Encoding: 7BIT" is assumed if the Content-Transfer-Encoding Encoding: 7BIT" is assumed if the Content-Transfer-Encoding
header field is not present. header field is not present.
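A receiver might normalize the field value along the following lines. This is a minimal sketch; the names normalize_mechanism and mechanism_kind are hypothetical, and ietf-token values are not modeled because none are listed here.

```python
STANDARD_MECHANISMS = frozenset(
    {"7bit", "8bit", "binary", "quoted-printable", "base64"}
)


def normalize_mechanism(value=None):
    """Lowercase the mechanism; a missing field means "7bit"."""
    if value is None:
        return "7bit"
    return value.strip().lower()


def mechanism_kind(value=None):
    """Classify a mechanism as "standard", "private" (x-token), or "unknown"."""
    name = normalize_mechanism(value)
    if name in STANDARD_MECHANISMS:
        return "standard"
    if name.startswith("x-") and len(name) > 2:
        return "private"
    return "unknown"
```

Comparing the normalized form means that "Base64", "BASE64", and "bAsE64" are all treated alike.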
8.2. Content-Transfer-Encodings Semantics 8.2. Content-Transfer-Encodings Semantics
This single Content-Transfer-Encoding token actually provides This single Content-Transfer-Encoding token actually provides
two pieces of information. It specifies what sort of encoding two pieces of information. It specifies what sort of encoding
transformation the body was subjected to, and it specifies transformation the body was subjected to, and it specifies
what the domain of the result is. what the domain of the result is.
skipping to change at page 18, line 25 skipping to change at page 18, line 39
terms "7bit data", "8bit data", and "binary data" are all terms "7bit data", "8bit data", and "binary data" are all
defined in Section 4. defined in Section 4.
The quoted-printable and base64 encodings transform their The quoted-printable and base64 encodings transform their
input from an arbitrary domain into material in the "7bit" input from an arbitrary domain into material in the "7bit"
range, thus making it safe to carry over restricted range, thus making it safe to carry over restricted
transports. The specific definitions of the transformations transports. The specific definitions of the transformations
are given below. are given below.
The proper Content-Transfer-Encoding label must always be The proper Content-Transfer-Encoding label must always be
used. Labelling unencoded data containing 8-bit characters as used. Labelling unencoded data containing 8bit characters as
"7bit" is not allowed, nor is labelling unencoded non-line- "7bit" is not allowed, nor is labelling unencoded non-line-
oriented data as anything other than "binary" allowed. oriented data as anything other than "binary" allowed.
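To illustrate the labelling requirement, the sketch below picks the least demanding label that is still truthful for a block of unencoded data. The definitions of the three domains live in Section 4, which is not reproduced here, so the checks are an assumption: lines of at most 998 octets before the CRLF, CR and LF only as CRLF pairs, no NUL octets, and no octets above 127 for "7bit". The name minimal_label is hypothetical.

```python
def minimal_label(body, max_line=998):
    """Return "7bit", "8bit", or "binary" for unencoded data (bytes)."""
    # Assumption: NUL octets are not allowed in 7bit or 8bit data.
    if b"\x00" in body:
        return "binary"
    for line in body.split(b"\r\n"):
        # A bare CR or LF, or an overlong line, means the data
        # is not line-oriented in the sense assumed here.
        if b"\r" in line or b"\n" in line or len(line) > max_line:
            return "binary"
    if any(octet > 127 for octet in body):
        return "8bit"
    return "7bit"
```

A sender that cannot transport the resulting domain would then apply quoted-printable or base64 instead of mislabelling the data.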
Unlike media subtypes, a proliferation of Content-Transfer- Unlike media subtypes, a proliferation of Content-Transfer-
Encoding values is both undesirable and unnecessary. However, Encoding values is both undesirable and unnecessary. However,
establishing only a single transformation into the "7bit" establishing only a single transformation into the "7bit"
domain does not seem possible. There is a tradeoff between domain does not seem possible. There is a tradeoff between
the desire for a compact and efficient encoding of largely- the desire for a compact and efficient encoding of largely-
binary data and the desire for a readable encoding of data binary data and the desire for a readable encoding of data
that is mostly, but not entirely, 7-bit. For this reason, at that is mostly, but not entirely, 7bit. For this reason, at
least two encoding mechanisms are necessary: a "readable" least two encoding mechanisms are necessary: a "readable"
encoding (quoted-printable) and a "dense" encoding (base64). encoding (quoted-printable) and a "dense" encoding (base64).
Mail transport for unencoded 8-bit data is defined in RFC 1652 Mail transport for unencoded 8bit data is defined in RFC 1652
[RFC-1652]. As of the publication of this document, there are [RFC-1652]. As of the initial publication of this document,
no standardized Internet mail transports for which it is there are no standardized Internet mail transports for which
legitimate to include unencoded binary data in mail bodies. it is legitimate to include unencoded binary data in mail
Thus there are no circumstances in which the "binary" bodies. Thus there are no circumstances in which the "binary"
Content-Transfer-Encoding is actually valid in Internet mail. Content-Transfer-Encoding is actually valid in Internet mail.
However, in the event that binary mail transport becomes a However, in the event that binary mail transport becomes a
reality in Internet mail, or when this document is used in reality in Internet mail, or when this document is used in
conjunction with any other binary-capable transport mechanism, conjunction with any other binary-capable transport mechanism,
binary bodies should be labelled as such using this mechanism. binary bodies should be labelled as such using this mechanism.
NOTE: The five values defined for the Content-Transfer- NOTE: The five values defined for the Content-Transfer-
Encoding field imply nothing about the media type other than Encoding field imply nothing about the media type other than
the algorithm by which it was encoded or the transport system the algorithm by which it was encoded or the transport system
requirements if unencoded. requirements if unencoded.
skipping to change at page 19, line 20 skipping to change at page 19, line 34
8.3. New Content-Transfer-Encodings 8.3. New Content-Transfer-Encodings
Implementors may, if necessary, define private Content- Implementors may, if necessary, define private Content-
Transfer-Encoding values, but must use an x-token, which is a Transfer-Encoding values, but must use an x-token, which is a
name prefixed by "X-", to indicate its non-standard status, name prefixed by "X-", to indicate its non-standard status,
e.g., "Content-Transfer-Encoding: x-my-new-encoding". e.g., "Content-Transfer-Encoding: x-my-new-encoding".
Additional standardized Content-Transfer-Encoding values must Additional standardized Content-Transfer-Encoding values must
be specified by a standards-track RFC. Additional be specified by a standards-track RFC. Additional
requirements such specifications must meet are given in RFC requirements such specifications must meet are given in RFC
REG. As such, all content-transfer-encoding namespace except REG. As such, all content-transfer-encoding namespace except
that beginning with "X-" is explicitly reserved to the IANA that beginning with "X-" is explicitly reserved to the IETF
for future use. for future use.
Unlike media types and subtypes, the creation of new Content- Unlike media types and subtypes, the creation of new Content-
Transfer-Encoding values is STRONGLY discouraged, as it seems Transfer-Encoding values is STRONGLY discouraged, as it seems
likely to hinder interoperability with little potential likely to hinder interoperability with little potential
benefit. benefit.
8.4. Interpretation and Use 8.4. Interpretation and Use
If a Content-Transfer-Encoding header field appears as part of If a Content-Transfer-Encoding header field appears as part of
a message header, it applies to the entire body of that a message header, it applies to the entire body of that
message. If a Content-Transfer-Encoding header field appears message. If a Content-Transfer-Encoding header field appears
as part of a body part's headers, it applies only to the body as part of an entity's headers, it applies only to the body of
of that body part. If an entity is of type "multipart" the that entity. If an entity is of type "multipart" the
Content-Transfer-Encoding is not permitted to have any value Content-Transfer-Encoding is not permitted to have any value
other than "7bit", "8bit" or "binary". Even more severe other than "7bit", "8bit" or "binary". Even more severe
restrictions apply to some subtypes of the "message" type. restrictions apply to some subtypes of the "message" type.
It should be noted that most media types are defined in terms It should be noted that most media types are defined in terms
of octets rather than bits, so that the mechanisms described of octets rather than bits, so that the mechanisms described
here are mechanisms for encoding arbitrary octet streams, not here are mechanisms for encoding arbitrary octet streams, not
bit streams. If a bit stream is to be encoded via one of bit streams. If a bit stream is to be encoded via one of
these mechanisms, it must first be converted to an 8-bit byte these mechanisms, it must first be converted to an 8bit byte
stream using the network standard bit order ("big-endian"), in stream using the network standard bit order ("big-endian"), in
which the earlier bits in a stream become the higher-order which the earlier bits in a stream become the higher-order
bits in an 8-bit byte. A bit stream not ending at an 8-bit bits in an 8bit byte. A bit stream not ending at an 8bit
boundary must be padded with zeroes. This document provides a boundary must be padded with zeroes. RFC MIME-IMT provides a
mechanism for noting the addition of such padding in the case mechanism for noting the addition of such padding in the case
of the application/octet-stream media type, which has a of the application/octet-stream media type, which has a
"padding" parameter. "padding" parameter.
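The conversion from a bit stream to an octet stream can be sketched as follows. The function name pack_bits is hypothetical, and the input is assumed to be a sequence of integers 0 and 1 in stream order.

```python
def pack_bits(bits):
    """Pack bits into octets, earliest bit in the highest-order position.

    Returns (octets, padding), where padding is the number of zero
    bits appended to reach an 8-bit boundary.
    """
    padding = (-len(bits)) % 8
    padded = list(bits) + [0] * padding
    out = bytearray()
    for start in range(0, len(padded), 8):
        octet = 0
        for bit in padded[start:start + 8]:
            # Shift earlier bits up so they end up higher-order.
            octet = (octet << 1) | bit
        out.append(octet)
    return bytes(out), padding
```

For example, the three-bit stream 1, 0, 1 becomes the single octet 0xA0 with five bits of padding; the padding count is the value that would be recorded in the "padding" parameter mentioned above.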
The encoding mechanisms defined here explicitly encode all The encoding mechanisms defined here explicitly encode all
data in US-ASCII. Thus, for example, suppose an entity has data in US-ASCII. Thus, for example, suppose an entity has
header fields such as: header fields such as:
Content-Type: text/plain; charset=ISO-8859-1 Content-Type: text/plain; charset=ISO-8859-1
Content-transfer-encoding: base64 Content-transfer-encoding: base64
skipping to change at page 20, line 32 skipping to change at page 20, line 47
to use any encodings other than "7bit", "8bit", or "binary" to use any encodings other than "7bit", "8bit", or "binary"
with any composite media type, i.e. one that recursively with any composite media type, i.e. one that recursively
includes other Content-Type fields. Currently the only includes other Content-Type fields. Currently the only
composite media types are "multipart" and "message". All composite media types are "multipart" and "message". All
encodings that are desired for bodies of type multipart or encodings that are desired for bodies of type multipart or
message must be done at the innermost level, by encoding the message must be done at the innermost level, by encoding the
actual body that needs to be encoded. actual body that needs to be encoded.
It should also be noted that, by definition, if a composite It should also be noted that, by definition, if a composite
entity has a transfer-encoding value such as "7bit", but one entity has a transfer-encoding value such as "7bit", but one
of the enclosed parts has a less restrictive value such as of the enclosed entities has a less restrictive value such as
"8bit", then either the outer "7bit" labelling is in error, "8bit", then either the outer "7bit" labelling is in error,
because 8-bit data are included, or the inner "8bit" labelling because 8bit data are included, or the inner "8bit" labelling
placed an unnecessarily high demand on the transport system placed an unnecessarily high demand on the transport system
because the actual included data were actually 7-bit-safe. because the actual included data were actually 7bit-safe.
NOTE ON ENCODING RESTRICTIONS: Though the prohibition against NOTE ON ENCODING RESTRICTIONS: Though the prohibition against
using content-transfer-encodings on composite body data may using content-transfer-encodings on composite body data may
seem overly restrictive, it is necessary to prevent nested seem overly restrictive, it is necessary to prevent nested
encodings, in which data are passed through an encoding encodings, in which data are passed through an encoding
algorithm multiple times, and must be decoded multiple times algorithm multiple times, and must be decoded multiple times
in order to be properly viewed. Nested encodings add in order to be properly viewed. Nested encodings add
considerable complexity to user agents: Aside from the considerable complexity to user agents: Aside from the
obvious efficiency problems with such multiple encodings, they obvious efficiency problems with such multiple encodings, they
can obscure the basic structure of a message. In particular, can obscure the basic structure of a message. In particular,
they can imply that several decoding operations are necessary they can imply that several decoding operations are necessary
simply to find out what types of bodies a message contains. simply to find out what types of bodies a message contains.
Banning nested encodings may complicate the job of certain Banning nested encodings may complicate the job of certain
mail gateways, but this seems less of a problem than the mail gateways, but this seems less of a problem than the
effect of nested encodings on user agents. effect of nested encodings on user agents.
Any entity with an unrecognized Content-Transfer-Encoding must
be treated as if it has a Content-Type of "application/octet-
stream", regardless of what the Content-Type header field
actually says.
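Two of the rules in this section lend themselves to a short sketch: the fallback to application/octet-stream for an unrecognized encoding, and the restriction of composite types to "7bit", "8bit", or "binary". The function names are hypothetical, the set of recognized mechanisms is assumed to be the five standard values, and the more severe restrictions on some "message" subtypes are not modeled.

```python
RECOGNIZED = frozenset({"7bit", "8bit", "binary", "quoted-printable", "base64"})
IDENTITY = frozenset({"7bit", "8bit", "binary"})
COMPOSITE_TYPES = ("multipart", "message")


def _mechanism(cte):
    # A missing Content-Transfer-Encoding field defaults to "7bit".
    return (cte or "7bit").strip().lower()


def effective_media_type(content_type, cte=None):
    """Treat an entity with an unrecognized encoding as application/octet-stream."""
    if _mechanism(cte) not in RECOGNIZED:
        return "application/octet-stream"
    return content_type


def encoding_permitted(content_type, cte=None):
    """Composite types may only carry "7bit", "8bit", or "binary"."""
    top_level = content_type.split("/", 1)[0].strip().lower()
    if top_level in COMPOSITE_TYPES:
        return _mechanism(cte) in IDENTITY
    return True
```

A user agent built this way never attempts to interpret a body whose encoding it cannot undo, whatever the Content-Type field claims.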
NOTE ON THE RELATIONSHIP BETWEEN CONTENT-TYPE AND CONTENT- NOTE ON THE RELATIONSHIP BETWEEN CONTENT-TYPE AND CONTENT-
TRANSFER-ENCODING: It may seem that the Content-Transfer- TRANSFER-ENCODING: It may seem that the Content-Transfer-
Encoding could be inferred from the characteristics of the Encoding could be inferred from the characteristics of the
media that is to be encoded, or, at the very least, that media that is to be encoded, or, at the very least, that
certain Content-Transfer-Encodings could be mandated for use certain Content-Transfer-Encodings could be mandated for use
with specific media types. There are several reasons why this with specific media types. There are several reasons why this
is not the case. First, given the varying types of transports is not the case. First, given the varying types of transports
used for mail, some encodings may be appropriate for some used for mail, some encodings may be appropriate for some
combinations of media types and transports but not for others. combinations of media types and transports but not for others.
(For example, in an 8-bit transport, no encoding would be (For example, in an 8bit transport, no encoding would be
required for text in certain character sets, while such required for text in certain character sets, while such
encodings are clearly required for 7-bit SMTP.) encodings are clearly required for 7bit SMTP.)
Second, certain media types may require different types of Second, certain media types may require different types of
transfer encoding under different circumstances. For example, transfer encoding under different circumstances. For example,
many PostScript bodies might consist entirely of short lines many PostScript bodies might consist entirely of short lines
of 7-bit data and hence require no encoding at all. Other of 7bit data and hence require no encoding at all. Other
PostScript bodies (especially those using Level 2 PostScript's PostScript bodies (especially those using Level 2 PostScript's
binary encoding mechanism) may only be reasonably represented binary encoding mechanism) may only be reasonably represented
using a binary transport encoding. Finally, since the using a binary transport encoding. Finally, since the
Content-Type field is intended to be an open-ended Content-Type field is intended to be an open-ended
specification mechanism, strict specification of an specification mechanism, strict specification of an
association between media types and encodings effectively association between media types and encodings effectively
couples the specification of an application protocol with a couples the specification of an application protocol with a
specific lower-level transport. This is not desirable since specific lower-level transport. This is not desirable since
the developers of a media type should not have to be aware of the developers of a media type should not have to be aware of
all the transports in use and what their limitations are. all the transports in use and what their limitations are.
8.5. Translating Encodings 8.5. Translating Encodings
The quoted-printable and base64 encodings are designed so that The quoted-printable and base64 encodings are designed so that
conversion between them is possible. The only issue that conversion between them is possible. The only issue that
arises in such a conversion is the handling of line breaks. arises in such a conversion is the handling of hard line
When converting from quoted-printable to base64 a line break breaks. When converting from quoted-printable to base64 a
must be converted into a CRLF sequence. Similarly, a CRLF hard line break must be converted into a CRLF sequence.
sequence in base64 data must be converted to a quoted- Similarly, a CRLF sequence in base64 data must be converted to
printable line break, but ONLY when converting text data. a quoted-printable hard line break, but ONLY when converting
text data.
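One direction of this conversion, from quoted-printable to base64 for text data, might look like the sketch below. It uses the quopri and base64 modules of the Python standard library; the function name qp_to_base64 is hypothetical, and the input is assumed to be text whose hard line breaks may appear as either LF or CRLF.

```python
import base64
import quopri
import re


def qp_to_base64(qp_data):
    """Re-encode quoted-printable text (bytes) as base64 (bytes)."""
    # Decoding removes soft line breaks and expands "=XX" sequences.
    decoded = quopri.decodestring(qp_data)
    # Every remaining (hard) line break must become a CRLF sequence.
    canonical = re.sub(rb"\r?\n", b"\r\n", decoded)
    # encodebytes emits base64 output in lines of at most 76 characters.
    return base64.encodebytes(canonical)
```

The reverse direction is not shown; it would need to know whether the data are text, since CRLF sequences become quoted-printable hard line breaks only in that case.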
8.6. Canonical Encoding Model 8.6. Canonical Encoding Model
There was some confusion, in the predecessors of this RFC, There was some confusion, in the predecessors of this RFC,
regarding the model for when email data was to be converted to regarding the model for when email data was to be converted to
canonical form and encoded, and in particular how this process canonical form and encoded, and in particular how this process
would affect the treatment of CRLFs, given that the would affect the treatment of CRLFs, given that the
representation of newlines varies greatly from system to representation of newlines varies greatly from system to
system, and the relationship between content-transfer- system, and the relationship between content-transfer-
encodings and character sets. A canonical model for encoding encodings and character sets. A canonical model for encoding
skipping to change at page 22, line 32 skipping to change at page 23, line 8
modified by mail transport. If the data being encoded are modified by mail transport. If the data being encoded are
mostly US-ASCII text, the encoded form of the data remains mostly US-ASCII text, the encoded form of the data remains
largely recognizable by humans. A body which is entirely US- largely recognizable by humans. A body which is entirely US-
ASCII may also be encoded in Quoted-Printable to ensure the ASCII may also be encoded in Quoted-Printable to ensure the
integrity of the data should the message pass through a integrity of the data should the message pass through a
character-translating, and/or line-wrapping gateway. character-translating, and/or line-wrapping gateway.
In this encoding, octets are to be represented as determined In this encoding, octets are to be represented as determined
by the following rules: by the following rules:
(1) (General 8-bit representation) Any octet, except a CR (1) (General 8bit representation) Any octet, except a CR or
or LF that is part of a CRLF line break of the LF that is part of a CRLF line break of the canonical
canonical (standard) form of the data being encoded, (standard) form of the data being encoded, may be
may be represented by an "=" followed by a two digit represented by an "=" followed by a two digit
hexadecimal representation of the octet's value. The hexadecimal representation of the octet's value. The
digits of the hexadecimal alphabet, for this purpose, digits of the hexadecimal alphabet, for this purpose,
are "0123456789ABCDEF". Uppercase letters must be used are "0123456789ABCDEF". Uppercase letters must be used
when sending hexadecimal data, though a robust when sending hexadecimal data, though a robust
implementation may choose to recognize lowercase implementation may choose to recognize lowercase
letters on receipt. Thus, for example, the decimal letters on receipt. Thus, for example, the decimal
value 12 (US-ASCII form feed) can be represented by value 12 (US-ASCII form feed) can be represented by
"=0C", and the decimal value 61 (US-ASCII EQUAL SIGN) "=0C", and the decimal value 61 (US-ASCII EQUAL SIGN)
can be represented by "=3D". This rule must be can be represented by "=3D". This rule must be
followed except when the following rules allow an followed except when the following rules allow an
skipping to change at page 23, line 25 skipping to change at page 23, line 44
of an encoded line. Any TAB (HT) or SPACE characters of an encoded line. Any TAB (HT) or SPACE characters
on an encoded line MUST thus be followed on that line on an encoded line MUST thus be followed on that line
by a printable character. In particular, an "=" at the by a printable character. In particular, an "=" at the
end of an encoded line, indicating a soft line break end of an encoded line, indicating a soft line break
(see rule #5) may follow one or more TAB (HT) or SPACE (see rule #5) may follow one or more TAB (HT) or SPACE
characters. It follows that an octet with decimal characters. It follows that an octet with decimal
value 9 or 32 appearing at the end of an encoded line value 9 or 32 appearing at the end of an encoded line
must be represented according to Rule #1. This rule is must be represented according to Rule #1. This rule is
necessary because some MTAs (Message Transport Agents, necessary because some MTAs (Message Transport Agents,
programs which transport messages from one user to programs which transport messages from one user to
another, or perform a part of such transfers) are known another, or perform a portion of such transfers) are
to pad lines of text with SPACEs, and others are known known to pad lines of text with SPACEs, and others are
to remove "white space" characters from the end of a known to remove "white space" characters from the end
line. Therefore, when decoding a Quoted-Printable of a line. Therefore, when decoding a Quoted-Printable
body, any trailing white space on a line must be body, any trailing white space on a line must be
deleted, as it will necessarily have been added by deleted, as it will necessarily have been added by
intermediate transport agents. intermediate transport agents.
(4) (Line Breaks) A line break in a text body, represented (4) (Line Breaks) A line break in a text body, represented
as a CRLF sequence in the text canonical form, must be as a CRLF sequence in the text canonical form, must be
represented by a (RFC 822) line break, which is also a represented by a (RFC 822) line break, which is also a
CRLF sequence, in the Quoted-Printable encoding. Since CRLF sequence, in the Quoted-Printable encoding. Since
the canonical representation of media types other than the canonical representation of media types other than
text does not generally include the representation of text does not generally include the representation of
skipping to change at page 24, line 36 skipping to change at page 25, line 10
Now's the time = Now's the time =
for all folk to come= for all folk to come=
to the aid of their country. to the aid of their country.
This provides a mechanism with which long lines are encoded in This provides a mechanism with which long lines are encoded in
such a way as to be restored by the user agent. The 76 such a way as to be restored by the user agent. The 76
character limit does not count the trailing CRLF, but counts character limit does not count the trailing CRLF, but counts
all other characters, including any equal signs. all other characters, including any equal signs.
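The decoding rules described so far (dropping trailing white space, joining lines at soft line breaks, and expanding "=XX" sequences) might be sketched roughly as follows. This is only an illustration under stated assumptions: the function name decode_quoted_printable is hypothetical, the body is assumed to arrive as text with CRLF line breaks, and malformed input is not validated.

```python
def decode_quoted_printable(body: str) -> bytes:
    """Decode a quoted-printable body whose lines are separated by CRLF.

    Hypothetical helper for illustration; it does not validate its input.
    """
    lines = body.split("\r\n")
    out = bytearray()
    for index, line in enumerate(lines):
        # Trailing white space can only have been added in transit, so drop it.
        line = line.rstrip(" \t")
        # A line ending in "=" carries a soft line break: it is joined to
        # the next line and the "=" itself is discarded.
        soft_break = line.endswith("=")
        if soft_break:
            line = line[:-1]
        pos = 0
        while pos < len(line):
            if line[pos] == "=":
                # "=" followed by two hex digits stands for a single octet.
                out.append(int(line[pos + 1:pos + 3], 16))
                pos += 3
            else:
                out.append(ord(line[pos]))
                pos += 1
        # A hard line break is restored as CRLF, except after the final line.
        if not soft_break and index < len(lines) - 1:
            out += b"\r\n"
    return bytes(out)


example = "Now's the time =\r\nfor all folk to come=\r\n to the aid of their country."
decoded = decode_quoted_printable(example)
```

Applied to the three-line example above, the soft line breaks disappear and the original single long line is recovered.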
Since the hyphen character ("-") is represented as itself in Since the hyphen character ("-") may be represented as itself
the Quoted-Printable encoding, care must be taken, when in the Quoted-Printable encoding, care must be taken, when
encapsulating a quoted-printable encoded body inside one or encapsulating a quoted-printable encoded body inside one or
more multipart entities, to ensure that the boundary delimiter more multipart entities, to ensure that the boundary delimiter
does not appear anywhere in the encoded body. (A good does not appear anywhere in the encoded body. (A good
strategy is to choose a boundary that includes a character strategy is to choose a boundary that includes a character
sequence such as "=_" which can never appear in a quoted- sequence such as "=_" which can never appear in a quoted-
printable body. See the definition of multipart messages in printable body. See the definition of multipart messages in
MIME-IMT.) MIME-IMT.)
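As a rough illustration of the boundary strategy, the sketch below builds a boundary that starts with "=_" and checks that it cannot collide with an encoded body. The helper name make_boundary and the random suffix are assumptions made for this example; the encoder used is the quopri module from the Python standard library.

```python
import quopri
import secrets


def make_boundary() -> str:
    """Return a boundary containing "=_" (hypothetical helper).

    In quoted-printable output an "=" is always followed by two hex digits
    or by the end of the line, so "=_" cannot occur in an encoded body.
    Because "=" is a tspecial, the boundary parameter value has to be
    quoted in the Content-Type header field.
    """
    return "=_" + secrets.token_hex(8)


# Even a body that literally contains "=_" is safe once encoded,
# because the "=" is rewritten as "=3D".
encoded = quopri.encodestring(b"a literal =_ in the body\n")
boundary = make_boundary()
```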
NOTE: The quoted-printable encoding represents something of a NOTE: The quoted-printable encoding represents something of a
compromise between readability and reliability in transport. compromise between readability and reliability in transport.
skipping to change at page 26, line 4 skipping to change at page 26, line 21
qp-line := *(qp-segment transport-padding CRLF) qp-line := *(qp-segment transport-padding CRLF)
qp-part transport-padding qp-part transport-padding
qp-part := qp-section qp-part := qp-section
; Maximum length of 76 characters ; Maximum length of 76 characters
qp-segment := qp-section *(SPACE / TAB) "=" qp-segment := qp-section *(SPACE / TAB) "="
; Maximum length of 76 characters ; Maximum length of 76 characters
qp-section := [*(ptext / SPACE / TAB) ptext] qp-section := [*(ptext / SPACE / TAB) ptext]
ptext := octet / safe-char
ptext := hex-octet / safe-char
safe-char := <any octet with decimal value of 33 through safe-char := <any octet with decimal value of 33 through
60 inclusive, and 62 through 126> 60 inclusive, and 62 through 126>
; Characters not listed as "mail-safe" in ; Characters not listed as "mail-safe" in
; RFC MIME-CONF are also not recommended. ; RFC MIME-CONF are also not recommended.
octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F") hex-octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F")
; Octet must be used for characters > 127, =, ; Octet must be used for characters > 127, =,
; SPACEs or TABs at the ends of lines, and is ; SPACEs or TABs at the ends of lines, and is
; recommended for any character not listed in ; recommended for any character not listed in
; RFC MIME-CONF as "mail-safe". ; RFC MIME-CONF as "mail-safe".
transport-padding := *LWSP-char transport-padding := *LWSP-char
; Composers MUST NOT generate ; Composers MUST NOT generate
; non-zero length transport ; non-zero length transport
; padding, but receivers MUST ; padding, but receivers MUST
; be able to handle padding ; be able to handle padding
; added by message transports. ; added by message transports.
IMPORTANT NOTE: The addition of LWSP between the elements IMPORTANT: The addition of LWSP between the elements shown in
shown in this BNF is NOT allowed since this BNF does not this BNF is NOT allowed since this BNF does not specify a
specify a structured header field. structured header field.
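One possible way to check a single encoded line against this grammar is a regular expression built from the ptext, qp-section, and qp-segment rules. This is a sketch only: the names PTEXT, QP_LINE, and is_valid_qp_line are hypothetical, and transport padding is stripped before the 76-character limit is applied, on the assumption that the limit covers the qp-segment or qp-part but not the padding.

```python
import re

# ptext := hex-octet / safe-char
# safe-char covers octets 33-60 ("!" to "<") and 62-126 (">" to "~").
PTEXT = r"(?:=[0-9A-F]{2}|[!-<>-~])"

# qp-section followed by an optional soft line break (white space, then "=").
QP_LINE = re.compile(rf"(?:(?:{PTEXT}|[ \t])*{PTEXT})?(?:[ \t]*=)?")


def is_valid_qp_line(line: str) -> bool:
    """Check one line (without its CRLF) against the grammar."""
    # Receivers must tolerate transport padding, so ignore it here.
    stripped = line.rstrip(" \t")
    return len(stripped) <= 76 and QP_LINE.fullmatch(stripped) is not None
```

Note that the grammar permits only upper-case hexadecimal digits in a hex-octet, so a line containing "=e9" is rejected by this check.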
8.8. Base64 Content-Transfer-Encoding 8.8. Base64 Content-Transfer-Encoding
The Base64 Content-Transfer-Encoding is designed to represent The Base64 Content-Transfer-Encoding is designed to represent
arbitrary sequences of octets in a form that need not be arbitrary sequences of octets in a form that need not be
humanly readable. The encoding and decoding algorithms are humanly readable. The encoding and decoding algorithms are
simple, but the encoded data are consistently only about 33 simple, but the encoded data are consistently only about 33
percent larger than the unencoded data. This encoding is percent larger than the unencoded data. This encoding is
virtually identical to the one used in Privacy Enhanced Mail virtually identical to the one used in Privacy Enhanced Mail
(PEM) applications, as defined in RFC 1421 [RFC-1421]. (PEM) applications, as defined in RFC 1421 [RFC-1421].
A 65-character subset of US-ASCII is used, enabling 6 bits to A 65-character subset of US-ASCII is used, enabling 6 bits to
be represented per printable character. (The extra 65th be represented per printable character. (The extra 65th
character, "=", is used to signify a special processing character, "=", is used to signify a special processing
function.) function.)
NOTE: This subset has the important property that it is NOTE: This subset has the important property that it is
represented identically in all versions of ISO 646, including represented identically in all versions of ISO 646, including
US-ASCII, and all characters in the subset are also US-ASCII, and all characters in the subset are also
represented identically in all versions of EBCDIC. Other represented identically in all versions of EBCDIC. Other
popular encodings, such as the encoding used by the uuencode popular encodings, such as the encoding used by the uuencode
utility and the base85 encoding specified as part of Level 2 utility, Macintosh binhex 4.0 [RFC-1741], and the base85
PostScript, do not share these properties, and thus do not encoding specified as part of Level 2 PostScript, do not share
fulfill the portability requirements a binary transport these properties, and thus do not fulfill the portability
encoding for mail must meet. requirements a binary transport encoding for mail must meet.
The encoding process represents 24-bit groups of input bits as The encoding process represents 24-bit groups of input bits as
output strings of 4 encoded characters. Proceeding from left output strings of 4 encoded characters. Proceeding from left
to right, a 24-bit input group is formed by concatenating 3 to right, a 24-bit input group is formed by concatenating 3
8-bit input groups. These 24 bits are then treated as 4 8bit input groups. These 24 bits are then treated as 4
concatenated 6-bit groups, each of which is translated into a concatenated 6-bit groups, each of which is translated into a
single digit in the base64 alphabet. When encoding a bit single digit in the base64 alphabet. When encoding a bit
stream via the base64 encoding, the bit stream must be stream via the base64 encoding, the bit stream must be
presumed to be ordered with the most-significant-bit first. presumed to be ordered with the most-significant-bit first.
That is, the first bit in the stream will be the high-order That is, the first bit in the stream will be the high-order
bit in the first 8-bit byte, and the eighth bit will be the bit in the first 8bit byte, and the eighth bit will be the
low-order bit in the first 8-bit byte, and so on. low-order bit in the first 8bit byte, and so on.
Each 6-bit group is used as an index into an array of 64 Each 6-bit group is used as an index into an array of 64
printable characters. The character referenced by the index printable characters. The character referenced by the index
is placed in the output string. These characters, identified is placed in the output string. These characters, identified
in Table 1, below, are selected so as to be universally in Table 1, below, are selected so as to be universally
representable, and the set excludes characters with particular representable, and the set excludes characters with particular
significance to SMTP (e.g., ".", CR, LF) and to the multipart significance to SMTP (e.g., ".", CR, LF) and to the multipart
boundary delimiters defined in MIME-IMT (e.g., "-"). boundary delimiters defined in MIME-IMT (e.g., "-").
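The grouping step can be sketched as below for one complete 24-bit input group. The names B64_ALPHABET and encode_group are hypothetical; the alphabet string is assumed to follow the order of Table 1 (A-Z, a-z, 0-9, "+", "/"), and the handling of a final group shorter than 24 bits is not covered here.

```python
# The 64 printable characters, in the index order of Table 1.
B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_group(three_octets: bytes) -> str:
    """Encode exactly three octets as four base64 characters."""
    if len(three_octets) != 3:
        raise ValueError("a full 24-bit group needs exactly 3 octets")
    # Concatenate the three 8-bit groups, most significant bit first.
    group = (three_octets[0] << 16) | (three_octets[1] << 8) | three_octets[2]
    # Split into four 6-bit groups and use each as an index into the alphabet.
    return "".join(
        B64_ALPHABET[(group >> shift) & 0x3F] for shift in (18, 12, 6, 0)
    )


encoded_group = encode_group(b"Man")  # "TWFu"
```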
Table 1: The Base64 Alphabet Table 1: The Base64 Alphabet
skipping to change at page 29, line 6 skipping to change at page 29, line 26
Care must be taken to use the proper octets for line breaks if Care must be taken to use the proper octets for line breaks if
base64 encoding is applied directly to text material that has base64 encoding is applied directly to text material that has
not been converted to canonical form. In particular, text not been converted to canonical form. In particular, text
line breaks must be converted into CRLF sequences prior to line breaks must be converted into CRLF sequences prior to
base64 encoding. The important thing to note is that this may base64 encoding. The important thing to note is that this may
be done directly by the encoder rather than in a prior be done directly by the encoder rather than in a prior
canonicalization step in some implementations. canonicalization step in some implementations.
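An encoder that performs the line break conversion itself might look like the following sketch. The function name base64_encode_text is hypothetical, treating a bare CR or bare LF as a local line break is an assumption about the input, and the output is not wrapped into lines.

```python
import base64
import re


def base64_encode_text(text: str, charset: str = "us-ascii") -> bytes:
    """Convert line breaks to CRLF, then base64-encode the result."""
    # Assumption: a bare LF, a bare CR, or an existing CRLF each mark one
    # line break in the local text representation.
    canonical = re.sub(r"\r\n|\r|\n", "\r\n", text)
    return base64.b64encode(canonical.encode(charset))


encoded_text = base64_encode_text("one\ntwo\n")
```

Text that already uses CRLF passes through the conversion unchanged, so applying the function to canonical text gives the same result.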
NOTE: There is no need to worry about quoting potential NOTE: There is no need to worry about quoting potential
boundary delimiters within base64-encoded parts of multipart boundary delimiters within base64-encoded bodies within
entities because no hyphen characters are used in the base64 multipart entities because no hyphen characters are used in
encoding. the base64 encoding.
9. Content-ID Header Field 9. Content-ID Header Field
In constructing a high-level user agent, it may be desirable In constructing a high-level user agent, it may be desirable
to allow one body to make reference to another. Accordingly, to allow one body to make reference to another. Accordingly,
bodies may be labelled using the "Content-ID" header field, bodies may be labelled using the "Content-ID" header field,
which is syntactically identical to the "Message-ID" header which is syntactically identical to the "Message-ID" header
field: field:
id := "Content-ID" ":" msg-id id := "Content-ID" ":" msg-id
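Because the value has the same syntax as a Message-ID, a generator for one can serve for the other. The sketch below uses make_msgid from the Python standard library; the helper name content_id_header and the domain "example.com" are placeholders chosen for illustration.

```python
from email.utils import make_msgid


def content_id_header(domain: str = "example.com") -> str:
    """Build a Content-ID header line with a msg-id style value."""
    # make_msgid returns a value of the form "<local-part@domain>".
    return "Content-ID: " + make_msgid(domain=domain)


header_line = content_id_header()
```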
skipping to change at page 30, line 25 skipping to change at page 30, line 45
message header fields. message header fields.
MIME-extension-field := <Any RFC 822 header field which MIME-extension-field := <Any RFC 822 header field which
begins with the string begins with the string
"Content-"> "Content-">
12. Summary 12. Summary
Using the MIME-Version, Content-Type, and Content-Transfer- Using the MIME-Version, Content-Type, and Content-Transfer-
Encoding header fields, it is possible to include, in a Encoding header fields, it is possible to include, in a
standardized way, arbitrary types of data objects with RFC 822 standardized way, arbitrary types of data with RFC 822
conformant mail messages. No restrictions imposed by either conformant mail messages. No restrictions imposed by either
RFC 821 or RFC 822 are violated, and care has been taken to RFC 821 or RFC 822 are violated, and care has been taken to
avoid problems caused by additional restrictions imposed by avoid problems caused by additional restrictions imposed by
the characteristics of some Internet mail transport mechanisms the characteristics of some Internet mail transport mechanisms
(see RFC MIME-CONF). (see RFC MIME-CONF).
The next document in this set, RFC MIME-IMT, specifies the The next document in this set, RFC MIME-IMT, specifies the
media types that can be labelled and transported using these initial set of media types that can be labelled and
headers. transported using these headers.
13. Security Considerations 13. Security Considerations
Security issues are discussed in the second document in this Security issues are discussed in the second document in this
set, RFC MIME-IMT. set, RFC MIME-IMT.
14. Authors' Addresses 14. Authors' Addresses
For more information, the authors of this document are best For more information, the authors of this document are best
contacted via Internet mail: contacted via Internet mail:
skipping to change at page 32, line 9 skipping to change at page 33, line 9
17060 Dallas Parkway 17060 Dallas Parkway
Dallas Texas, 75248 Dallas Texas, 75248
Email: greg.vaudreuil@ons.octel.com Email: greg.vaudreuil@ons.octel.com
Phone: +1 214 733 2722 Phone: +1 214 733 2722
Appendix A -- Collected Grammar Appendix A -- Collected Grammar
This appendix contains the complete BNF grammar for all the This appendix contains the complete BNF grammar for all the
syntax specified by this document. syntax specified by this document.
By itself, however, this grammar is incomplete. It refers to By itself, however, this grammar is incomplete. It refers by
several entities that are defined by RFC 822. Rather than name to several syntax rules that are defined by RFC 822.
reproduce those definitions here, and risk unintentional Rather than reproduce those definitions here, and risk
differences between the two, this document simply refers the unintentional differences between the two, this document
reader to RFC 822 for the remaining definitions. Wherever a simply refers the reader to RFC 822 for the remaining
term is undefined, it refers to the RFC 822 definition. definitions. Wherever a term is undefined, it refers to the
RFC 822 definition.
attribute := token attribute := token
; Matching of attributes
; is ALWAYS case-insensitive.
composite-type := "message" / "multipart" / extension-token composite-type := "message" / "multipart" / extension-token
content := "Content-Type" ":" type "/" subtype content := "Content-Type" ":" type "/" subtype
*(";" parameter) *(";" parameter)
; Matching of media type and subtype ; Matching of media type and subtype
; is ALWAYS case-insensitive ; is ALWAYS case-insensitive.
description := "Content-Description" ":" *text description := "Content-Description" ":" *text
discrete-type := "text" / "image" / "audio" / "video" / discrete-type := "text" / "image" / "audio" / "video" /
"application" / extension-token "application" / extension-token
encoding := "Content-Transfer-Encoding" ":" mechanism encoding := "Content-Transfer-Encoding" ":" mechanism
extension-token := iana-token / ietf-token / x-token entity-headers := [ content CRLF ]
[ encoding CRLF ]
[ id CRLF ]
[ description CRLF ]
*( MIME-extension-field CRLF )
extension-token := ietf-token / x-token
hex-octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F")
; Octet must be used for characters > 127, =,
; SPACEs or TABs at the ends of lines, and is
; recommended for any character not listed in
; RFC MIME-CONF as "mail-safe".
iana-token := <a publicly-defined extension token, iana-token := <a publicly-defined extension token,
registered with IANA, as specified in registered with IANA, as specified in
RFC MIME-REG> RFC MIME-REG>
ietf-token := <a publicly-defined extension token, ietf-token := <a publicly-defined extension token,
initially registered with IANA and initially registered with IANA and
subsequently standardized by the IETF> subsequently standardized by the IETF>
id := "Content-ID" ":" msg-id id := "Content-ID" ":" msg-id
skipping to change at page 33, line 4 skipping to change at page 34, line 19
mechanism := "7bit" / "8bit" / "binary" / mechanism := "7bit" / "8bit" / "binary" /
"quoted-printable" / "base64" / "quoted-printable" / "base64" /
ietf-token / x-token ietf-token / x-token
octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F") MIME-extension-field := <Any RFC 822 header field which
; Octet must be used for characters > 127, =, begins with the string
; SPACEs or TABs at the ends of lines, and is "Content-">
; recommended for any character not listed in
; RFC MIME-CONF as "mail-safe".
parameter := attribute "=" value MIME-message-headers := entity-headers
fields
version CRLF
; The ordering of the header
; fields implied by this BNF
; definition should be ignored.
ptext := octet / safe-char MIME-part-headers := entity-headers
[fields]
; Any field not beginning with
; "content-" can have no defined
; meaning and should be ignored.
; The ordering of the header
; fields implied by this BNF
; definition should be ignored.
parameter := attribute "=" value
ptext := hex-octet / safe-char
qp-line := *(qp-segment transport-padding CRLF) qp-line := *(qp-segment transport-padding CRLF)
qp-part transport-padding qp-part transport-padding
qp-part := qp-section qp-part := qp-section
; Maximum length of 76 characters ; Maximum length of 76 characters
qp-section := [*(ptext / SPACE / TAB) ptext] qp-section := [*(ptext / SPACE / TAB) ptext]
qp-segment := qp-section *(SPACE / TAB) "=" qp-segment := qp-section *(SPACE / TAB) "="
; Maximum length of 76 characters ; Maximum length of 76 characters
quoted-printable := qp-line *(CRLF qp-line) quoted-printable := qp-line *(CRLF qp-line)
safe-char := <any octet with decimal value of 33 through safe-char := <any octet with decimal value of 33 through
60 inclusive, and 62 through 126> 60 inclusive, and 62 through 126>
; Characters not listed as "mail-safe" in ; Characters not listed as "mail-safe" in
; RFC MIME-CONF are also not recommended. ; RFC MIME-CONF are also not recommended.
subtype := extension-token subtype := extension-token / iana-token
token := 1*<any (US-ASCII) CHAR except SPACE, CTLs, token := 1*<any (US-ASCII) CHAR except SPACE, CTLs,
or tspecials> or tspecials>
transport-padding := *LWSP-char transport-padding := *LWSP-char
; Composers MUST NOT generate ; Composers MUST NOT generate
; non-zero length transport ; non-zero length transport
; padding, but receivers MUST ; padding, but receivers MUST
; be able to handle padding ; be able to handle padding
; added by message transports. ; added by message transports.
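The comment on the MIME-part-headers rule above says that fields not beginning with "content-" should be ignored in a body part. A receiver might apply that as in the sketch below. The function name meaningful_part_headers and the representation of headers as (name, value) pairs are assumptions for this example; field names are compared without regard to case, as RFC 822 field names are case-insensitive.

```python
def meaningful_part_headers(headers):
    """Keep only the body part header fields whose names begin with "Content-".

    `headers` is assumed to be an iterable of (field-name, field-value) pairs.
    """
    return [
        (name, value)
        for name, value in headers
        if name.lower().startswith("content-")
    ]


part_headers = [
    ("Content-Type", "text/plain; charset=us-ascii"),
    ("Subject", "ignored inside a body part"),
    ("content-transfer-encoding", "quoted-printable"),
]
kept = meaningful_part_headers(part_headers)
```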
 End of changes. 89 change blocks. 
218 lines changed or deleted 269 lines changed or added
