| < draft-ietf-822ext-mime-imb-00.txt | draft-ietf-822ext-mime-imb-01.txt > | |||
|---|---|---|---|---|
| Network Working Group N. Borenstein | Network Working Group Nathaniel Borenstein | |||
| Internet Draft First Virtual Holdings | Internet Draft Ned Freed | |||
| Expires in six months N. Freed, Innosoft | <draft-ietf-822ext-mime-imb-01.txt> | |||
| May 1994 | ||||
| MIME (Multipurpose Internet Mail Extensions) Part One: | ||||
| Mechanisms for Specifying and Describing | ||||
| the Format of Internet Message Bodies | ||||
| <draft-ietf-822ext-mime-imb-00.txt> | ||||
| Status of this Memo | ||||
| This document is an Internet-Draft. Internet-Drafts are | ||||
| working documents of the Internet Engineering Task Force | ||||
| (IETF), its areas, and its working groups. Note that other | ||||
| groups may also distribute working documents as Internet- | ||||
| Drafts. | ||||
| Internet-Drafts are draft documents valid for a maximum of | ||||
| six months and may be updated, replaced, or obsoleted by | ||||
| other documents at any time. It is inappropriate to use | ||||
| Internet-Drafts as reference material or to cite them other | ||||
| than as ``work in progress.'' | ||||
| To learn the current status of any Internet-Draft, please | ||||
| check the ``1id-abstracts.txt'' listing contained in the | ||||
| Internet-Drafts Shadow Directories on ds.internic.net (US | ||||
| East Coast), nic.nordu.net (Europe), ftp.isi.edu (US West | ||||
| Coast), or munnari.oz.au (Pacific Rim). | ||||
| Abstract | ||||
| STD 11, RFC 822 defines a message representation protocol | ||||
| which specifies considerable detail about message headers, | ||||
| but which leaves the message content, or message body, as | ||||
| flat ASCII text. This document redefines the format of | ||||
| message bodies to allow multi-part textual and non-textual | ||||
| message bodies to be represented and exchanged without loss | ||||
| of information. This is based on earlier work documented | ||||
| in RFC 934, STD 11, and RFC 1049, but extends and revises | ||||
| that work. Because RFC 822 said so little about message | ||||
| bodies, this document is largely orthogonal to (rather than | ||||
| a revision of) RFC 822. | ||||
| In particular, this document is designed to provide | ||||
| facilities to include multiple objects in a single message, | ||||
| to represent body text in character sets other than US- | ||||
| ASCII, to represent formatted multi-font text messages, to | ||||
| represent non-textual material such as images and audio | ||||
| fragments, and generally to facilitate later extensions | ||||
| defining new types of Internet mail for use by cooperating | ||||
| mail agents. | ||||
| This document does NOT extend Internet mail header fields to | ||||
| permit anything other than US-ASCII text data. Such | ||||
| extensions are the subject of a companion document | ||||
| [RFC-1522]. | ||||
| This document is a revision of RFC 1521, which was a | ||||
| revision of RFC 1341. Significant differences from RFC 1521 | ||||
| are summarized in Appendix H. | ||||
| Expires 11/20/94 draft-ietf-822-mime-00.txt May 1994 | ||||
| 1 Introduction | ||||
| Since its publication in 1982, RFC 822 [RFC-822] has defined | ||||
| the standard format of textual mail messages on the | ||||
| Internet. Its success has been such that the RFC 822 format | ||||
| has been adopted, wholly or partially, well beyond the | ||||
| confines of the Internet and the Internet SMTP transport | ||||
| defined by RFC 821 [RFC-821]. As the format has seen wider | ||||
| use, a number of limitations have proven increasingly | ||||
| restrictive for the user community. | ||||
| RFC 822 was intended to specify a format for text messages. | ||||
| As such, non-text messages, such as multimedia messages that | ||||
| might include audio or images, are simply not mentioned. | ||||
| Even in the case of text, however, RFC 822 is inadequate for | ||||
| the needs of mail users whose languages require the use of | ||||
| character sets richer than US-ASCII [US-ASCII]. Since RFC | ||||
| 822 does not specify mechanisms for mail containing audio, | ||||
| video, Asian language text, or even text in most European | ||||
| languages, additional specifications are needed. | ||||
| One of the notable limitations of RFC 821/822 based mail | ||||
| systems is the fact that they limit the contents of | ||||
| electronic mail messages to relatively short lines of | ||||
| seven-bit ASCII. This forces users to convert any non- | ||||
| textual data that they may wish to send into seven-bit bytes | ||||
| representable as printable ASCII characters before invoking | ||||
| a local mail UA (User Agent, a program with which human | ||||
| users send and receive mail). Examples of such encodings | ||||
| currently used in the Internet include pure hexadecimal, | ||||
| uuencode, the 3-in-4 base 64 scheme specified in RFC 1421, | ||||
| the Andrew Toolkit Representation [ATK], and many others. | ||||
| The limitations of RFC 822 mail become even more apparent as | ||||
| gateways are designed to allow for the exchange of mail | ||||
| messages between RFC 822 hosts and X.400 hosts. X.400 | ||||
| [X400] specifies mechanisms for the inclusion of non-textual | ||||
| body parts within electronic mail messages. The current | ||||
| standards for the mapping of X.400 messages to RFC 822 | ||||
| messages specify either that X.400 non-textual body parts | ||||
| must be converted to (not encoded in) an ASCII format, or | ||||
| that they must be discarded, notifying the RFC 822 user that | ||||
| discarding has occurred. This is clearly undesirable, as | ||||
| information that a user may wish to receive is lost. Even | ||||
| though a user's UA may not have the capability of dealing | ||||
| with the non-textual body part, the user might have some | ||||
| mechanism external to the UA that can extract useful | ||||
| information from the body part. Moreover, it does not allow | ||||
| for the fact that the message may eventually be gatewayed | ||||
| back into an X.400 message handling system (i.e., the X.400 | ||||
| message is "tunneled" through Internet mail), where the | ||||
| non-textual information would definitely become useful | ||||
| again. | ||||
| This document describes several mechanisms that combine to | ||||
| solve most of these problems without introducing any serious | ||||
| incompatibilities with the existing world of RFC 822 mail. | ||||
| In particular, it describes: | ||||
| 1. A MIME-Version header field, which uses a version number | ||||
| to declare a message to be conformant with this | ||||
| specification and allows mail processing agents to | ||||
| distinguish between such messages and those generated | ||||
| by older or non-conformant software, which is presumed | ||||
| to lack such a field. | ||||
| 2. A Content-Type header field, generalized from RFC 1049 | ||||
| [RFC-1049], which can be used to specify the type and | ||||
| subtype of data in the body of a message and to fully | ||||
| specify the native representation (encoding) of such | ||||
| data. | ||||
| 2.a. A "text" Content-Type value, which can be used to | ||||
| represent textual information in a number of | ||||
| character sets and formatted text description | ||||
| languages in a standardized manner. | ||||
| 2.b. A "multipart" Content-Type value, which can be | ||||
| used to combine several body parts, possibly of | ||||
| differing types of data, into a single message. | ||||
| 2.c. An "application" Content-Type value, which can be | ||||
| used to transmit application data or binary data, | ||||
| and hence, among other uses, to implement an | ||||
| electronic mail file transfer service. | ||||
| 2.d. A "message" Content-Type value, for encapsulating | ||||
| another mail message. | ||||
| 2.e. An "image" Content-Type value, for transmitting | ||||
| still image (picture) data. | ||||
| 2.f. An "audio" Content-Type value, for transmitting | ||||
| audio or voice data. | ||||
| 2.g. A "video" Content-Type value, for transmitting | ||||
| video or moving image data, possibly with audio as | ||||
| part of the composite video data format. | ||||
| 3. A Content-Transfer-Encoding header field, which can be | ||||
| used to specify an auxiliary encoding that was applied | ||||
| to the data in order to allow it to pass through mail | ||||
| transport mechanisms which may have data or character | ||||
| set limitations. | ||||
| 4. Two additional header fields that can be used to further | ||||
| describe the data in a message body, the Content-ID and | ||||
| Content-Description header fields. | ||||
| MIME has been carefully designed as an extensible mechanism, | ||||
| and it is expected that the set of content-type/subtype | ||||
| pairs and their associated parameters will grow | ||||
| significantly with time. Several other MIME fields, notably | ||||
| including character set names, are likely to have new values | ||||
| defined over time. In order to ensure that the set of such | ||||
| values is developed in an orderly, well-specified, and | ||||
| public manner, MIME defines a registration process which | ||||
| uses the Internet Assigned Numbers Authority (IANA) as a | ||||
| central registry for such values. Appendix E provides | ||||
| details about how IANA registration is accomplished. | ||||
| Finally, to specify and promote interoperability, Appendix A | ||||
| of this document provides a basic applicability statement | ||||
| for a subset of the above mechanisms that defines a minimal | ||||
| level of "conformance" with this document. | ||||
| HISTORICAL NOTE: Several of the mechanisms | ||||
| described in this document may seem somewhat | ||||
| strange or even baroque at first reading. It is | ||||
| important to note that compatibility with existing | ||||
| standards AND robustness across existing practice | ||||
| were two of the highest priorities of the working | ||||
| group that developed this document. In | ||||
| particular, compatibility was always favored over | ||||
| elegance. | ||||
| MIME was first defined and published as RFCs 1341 and 1342 | ||||
| [RFC-1341] [RFC-1342], then revised as RFCs 1521 and 1522 | ||||
| [RFC-1521] [RFC-1522]. This document is a relatively minor | ||||
| updating of RFC 1521, and is intended to supersede it. The | ||||
| differences between this document and RFC 1521 are | ||||
| summarized in Appendix H. Please refer to the current | ||||
| edition of the "IAB Official Protocol Standards" for the | ||||
| standardization state and status of this protocol. | ||||
| Several other RFC documents will be of interest to the MIME | ||||
| implementor, in particular [RFC-1343], [RFC-1344], and | ||||
| [RFC-1345]. | ||||
| 2 Notations, Conventions, and Generic BNF Grammar | ||||
| This document is being published in two versions, one as | ||||
| plain ASCII text and one as PostScript [1]. The latter is | ||||
| recommended, though the textual contents are identical. An | ||||
| Andrew-format copy of this document is also available from | ||||
| the first author (Borenstein). | ||||
| Although the mechanisms specified in this document are all | ||||
| described in prose, most are also described formally in the | ||||
| modified BNF notation of RFC 822. Implementors will need to | ||||
| be familiar with this notation in order to understand this | ||||
| specification, and are referred to RFC 822 for a complete | ||||
| explanation of the modified BNF notation. | ||||
| Some of the modified BNF in this document makes reference to | ||||
| syntactic entities that are defined in RFC 822 and not in | ||||
| this document. A complete formal grammar, then, is obtained | ||||
| by combining the collected grammar appendix of this document | ||||
| with that of RFC 822 plus the modifications to RFC 822 | ||||
| defined in RFC 1123, which specifically changes the syntax | ||||
| for `return', `date' and `mailbox'. | ||||
| The term CRLF, in this document, refers to the sequence of | ||||
| the two ASCII characters CR (13) and LF (10) which, taken | ||||
| together, in this order, denote a line break in RFC 822 | ||||
| mail. | ||||
| The term "character set" is used in this document to refer | ||||
| to a method used with one or more tables to convert encoded | ||||
| text to a series of octets. This definition is intended to | ||||
| allow various kinds of text encodings, from simple single- | ||||
| table mappings such as ASCII to complex table switching | ||||
| methods such as those that use ISO 2022's techniques. | ||||
| [1] PostScript is a trademark of Adobe Systems Incorporated. | ||||
| However, a MIME character set name must fully specify the | ||||
| mapping to be performed. | ||||
| The term "message", when not further qualified, means either | ||||
| the (complete or "top-level") message being transferred on a | ||||
| network, or a message encapsulated in a body of type | ||||
| "message". | ||||
| The term "body part", in this document, means one of the | ||||
| parts of the body of a multipart entity. A body part has a | ||||
| header and a body, so it makes sense to speak about the body | ||||
| of a body part. | ||||
| The term "entity", in this document, means either a message | ||||
| or a body part. All kinds of entities share the property | ||||
| that they have a header and a body. | ||||
| The term "body", when not further qualified, means the body | ||||
| of an entity, that is, the body of either a message or of a | ||||
| body part. | ||||
| NOTE: The previous four definitions are clearly | ||||
| circular. This is unavoidable, since the overall | ||||
| structure of a MIME message is indeed recursive. | ||||
| In this document, all numeric and octet values are given in | ||||
| decimal notation. | ||||
| It must be noted that Content-Type values, subtypes, and | ||||
| parameter names as defined in this document are case- | ||||
| insensitive. However, parameter values are case-sensitive | ||||
| unless otherwise specified for the specific parameter. | ||||
| FORMATTING NOTE: This document has been carefully | ||||
| formatted for ease of reading. The PostScript | ||||
| version of this document, in particular, places | ||||
| notes like this one, which may be skipped by the | ||||
| reader, in a smaller, italicized font, and | ||||
| indents them as well. In the text version, only the | ||||
| indentation is preserved, so if you are reading | ||||
| the text version of this you might consider using | ||||
| the PostScript version instead. However, all such | ||||
| notes will be indented and preceded by "NOTE:" or | ||||
| some similar introduction, even in the text | ||||
| version. | ||||
| The primary purpose of these non-essential notes | ||||
| is to convey information about the rationale of | ||||
| this document, or to place this document in the | ||||
| proper historical or evolutionary context. Such | ||||
| information may be skipped by those who are | ||||
| focused entirely on building a conformant | ||||
| implementation, but may be of use to those who | ||||
| wish to understand why this document is written as | ||||
| it is. | ||||
| For ease of recognition, all BNF definitions have | ||||
| been placed in a fixed-width font in the | ||||
| PostScript version of this document. | ||||
| 3 The MIME-Version Header Field | ||||
| Since RFC 822 was published in 1982, there has really been | ||||
| only one format standard for Internet messages, and there | ||||
| has been little perceived need to declare the format | ||||
| standard in use. This document is an independent document | ||||
| that complements RFC 822. Although the extensions in this | ||||
| document have been defined in such a way as to be compatible | ||||
| with RFC 822, there are still circumstances in which it | ||||
| might be desirable for a mail-processing agent to know | ||||
| whether a message was composed with the new standard in | ||||
| mind. | ||||
| Therefore, this document defines a new header field, "MIME- | ||||
| Version", which is to be used to declare the version of the | ||||
| Internet message body format standard in use. | ||||
| Messages composed in accordance with this document MUST | ||||
| include such a header field, with the following verbatim | ||||
| text: | ||||
| MIME-Version: 1.0 | ||||
| The presence of this header field is an assertion that the | ||||
| message has been composed in compliance with this document. | ||||
| Since it is possible that a future document might extend the | ||||
| message format standard again, a formal BNF is given for the | ||||
| content of the MIME-Version field: | ||||
| version := "MIME-Version" ":" 1*DIGIT "." 1*DIGIT | ||||
| Thus, future format specifiers, which might replace or | ||||
| extend "1.0", are constrained to be two integer fields, | ||||
| separated by a period. If a message is received with a | ||||
| MIME-Version value other than "1.0", it cannot be assumed to | ||||
| conform with this specification. | ||||
| Note that the MIME-Version header field is required at the | ||||
| top level of a message. It is not required for each body | ||||
| part of a multipart entity. It is required for the embedded | ||||
| headers of a body of type "message" if and only if the | ||||
| embedded message is itself claimed to be MIME-conformant. | ||||
| It is not possible to fully specify how a mail reader that | ||||
| conforms with MIME as defined in this document should treat | ||||
| a message that might arrive in the future with some value of | ||||
| MIME-Version other than "1.0". However, conformant | ||||
| software is encouraged to check the version number and at | ||||
| least warn the user if an unrecognized MIME-Version is | ||||
| encountered. | ||||
| It is also worth noting that version control for specific | ||||
| content-types is not accomplished using the MIME-Version | ||||
| mechanism. In particular, some formats (such as | ||||
| application/postscript) have version numbering conventions | ||||
| that are internal to the document format. Where such | ||||
| conventions exist, MIME does nothing to supersede them. | ||||
| Where no such conventions exist, a MIME type might use a | ||||
| "version" parameter in the content-type field if necessary. | ||||
| NOTE TO IMPLEMENTORS: All header fields defined in this | ||||
| document, including MIME-Version, Content-type, etc., are | ||||
| subject to the general syntactic rules for header fields | ||||
| specified in RFC 822. In particular, all can include | ||||
| comments, which means that the following two MIME-Version | ||||
| fields are equivalent: | ||||
| MIME-Version: 1.0 | ||||
| MIME-Version: 1.0 (Generated by GBD-killer 3.7) | ||||
| 4 The Content-Type Header Field | ||||
| The purpose of the Content-Type field is to describe the | ||||
| data contained in the body fully enough that the receiving | ||||
| user agent can pick an appropriate agent or mechanism to | ||||
| present the data to the user, or otherwise deal with the | ||||
| data in an appropriate manner. | ||||
| HISTORICAL NOTE: The Content-Type header field | ||||
| was first defined in RFC 1049. RFC 1049 Content- | ||||
| types used a simpler and less powerful syntax, but | ||||
| one that is largely compatible with the mechanism | ||||
| given here. | ||||
| The Content-Type header field is used to specify the nature | ||||
| of the data in the body of an entity, by giving type and | ||||
| subtype identifiers, and by providing auxiliary information | ||||
| that may be required for certain types. After the type and | ||||
| subtype names, the remainder of the header field is simply a | ||||
| set of parameters, specified in an attribute/value notation. | ||||
| The set of meaningful parameters differs for the different | ||||
| types. In particular, there are NO globally-meaningful | ||||
| parameters that apply to all content-types. Global | ||||
| mechanisms are best addressed, in the MIME model, by the | ||||
| definition of additional Content-* header fields. The | ||||
| ordering of parameters is not significant. Among the | ||||
| defined parameters is a "charset" parameter by which the | ||||
| character set used in the body may be declared. Comments | ||||
| are allowed in accordance with RFC 822 rules for structured | ||||
| header fields. | ||||
| In general, the top-level Content-Type is used to declare | ||||
| the general type of data, while the subtype specifies a | ||||
| specific format for that type of data. Thus, a Content-Type | ||||
| of "image/xyz" is enough to tell a user agent that the data | ||||
| is an image, even if the user agent has no knowledge of the | ||||
| specific image format "xyz". Such information can be used, | ||||
| for example, to decide whether or not to show a user the raw | ||||
| data from an unrecognized subtype -- such an action might be | ||||
| reasonable for unrecognized subtypes of text, but not for | ||||
| unrecognized subtypes of image or audio. For this reason, | ||||
| registered subtypes of audio, image, text, and video should | ||||
| not contain embedded information that is really of a | ||||
| different type. Such compound types should be represented | ||||
| using the "multipart" or "application" types. | ||||
| Parameters are modifiers of the content-subtype, and do not | ||||
| fundamentally affect the requirements of the host system. | ||||
| Although most parameters make sense only with certain | ||||
| content-types, others are "global" in the sense that they | ||||
| might apply to any subtype. For example, the "boundary" | ||||
| parameter makes sense only for the "multipart" content-type, | ||||
| but the "charset" parameter might make sense with several | ||||
| content-types. | ||||
| An initial set of seven Content-Types is defined by this | ||||
| document. This set of top-level names is intended to be | ||||
| substantially complete. It is expected that additions to | ||||
| the larger set of supported types can generally be | ||||
| accomplished by the creation of new subtypes of these | ||||
| initial types. In the future, more top-level types may be | ||||
| defined only by an extension to this standard. If another | ||||
| primary type is to be used for any reason, it must be given | ||||
| a name starting with "X-" to indicate its non-standard | ||||
| status and to avoid a potential conflict with a future | ||||
| official name. | ||||
| In the Augmented BNF notation of RFC 822, a Content-Type | ||||
| header field value is defined as follows: | ||||
| content := "Content-Type" ":" type "/" subtype | ||||
| *(";" parameter) | ||||
| ; case-insensitive matching of type and subtype | ||||
| type := "application" / "audio" | ||||
| / "image" / "message" | ||||
| / "multipart" / "text" | ||||
| / "video" / extension-token | ||||
| ; All values case-insensitive | ||||
| extension-token := x-token / iana-token | ||||
| iana-token := <a publicly-defined extension token, | ||||
| registered with IANA, as specified in | ||||
| appendix E> | ||||
| x-token := <The two characters "X-" or "x-" followed, with | ||||
| no intervening white space, by any token> | ||||
| subtype := token ; case-insensitive | ||||
| parameter := attribute "=" value | ||||
| attribute := token ; case-insensitive | ||||
| value := token / quoted-string | ||||
| token := 1*<any (ASCII) CHAR except SPACE, CTLs, or | ||||
| tspecials> | ||||
| tspecials := "(" / ")" / "<" / ">" / "@" | ||||
| / "," / ";" / ":" / "\" / <"> | ||||
| / "/" / "[" / "]" / "?" / "=" | ||||
| ; Must be in quoted-string, | ||||
| ; to use within parameter values | ||||
| Note that the definition of "tspecials" is the same as the | Multipurpose Internet Mail Extensions | |||
| RFC 822 definition of "specials" with the addition of the | (MIME) Part One: | |||
| three characters "/", "?", and "=", and the removal of ".". | ||||
| Note also that a subtype specification is MANDATORY. There | Format of Internet Message Bodies | |||
| are no default subtypes. | ||||
| The type, subtype, and parameter names are not case | November 21, 1994 | |||
| sensitive. For example, TEXT, Text, and TeXt are all | ||||
| equivalent. Parameter values are normally case sensitive, | ||||
| but certain parameters are interpreted to be case- | ||||
| insensitive, depending on the intended use. (For example, | ||||
| multipart boundaries are case-sensitive, but the "access- | ||||
| type" for message/External-body is not case-sensitive.) | ||||
| Note that the value of a quoted string parameter does not | Status of this Memo | |||
| include the quotes. That is, the quotation marks in a | ||||
| quoted-string are not a part of the value of an object, but | ||||
| are merely used to delimit that object. Thus the following | ||||
| two forms: | ||||
| Content-type: text/plain; charset=us-ascii | This document is an Internet-Draft. Internet-Drafts are | |||
| Content-type: text/plain; charset="us-ascii" | working documents of the Internet Engineering Task Force | |||
| (IETF), its areas, and its working groups. Note that other | ||||
| groups may also distribute working documents as Internet- | ||||
| Drafts. | ||||
| are completely equivalent. | Internet-Drafts are draft documents valid for a maximum of six | |||
| months. Internet-Drafts may be updated, replaced, or obsoleted | ||||
| by other documents at any time. It is not appropriate to use | ||||
| Internet-Drafts as reference material or to cite them other | ||||
| than as a "working draft" or "work in progress". | ||||
| Beyond this syntax, the only constraint on the definition of | To learn the current status of any Internet-Draft, please | |||
| subtype names is the desire that their uses must not | check the 1id-abstracts.txt listing contained in the | |||
| conflict. That is, it would be undesirable to have two | Internet-Drafts Shadow Directories on ds.internic.net (US East | |||
| different communities using "Content-Type: | Coast), nic.nordu.net (Europe), ftp.isi.edu (US West Coast), | |||
| application/foobar" to mean two different things. The | or munnari.oz.au (Pacific Rim). | |||
|  | 1. Abstract | |||
| process of defining new content-subtypes, then, is not | STD 11, RFC 822 defines a message representation protocol | |||
| intended to be a mechanism for imposing restrictions, but | specifying considerable detail about message headers, but | |||
| simply a mechanism for publicizing the usages. There are, | which leaves the message content, or message body, as flat | |||
| therefore, two acceptable mechanisms for defining new | US-ASCII text. This document redefines the format of message | |||
| Content-Type subtypes: | bodies to allow multi-part textual and non-textual message | |||
| bodies to be represented and exchanged without loss of | ||||
| information. This is based on earlier work documented in RFC | ||||
| 934, STD 11, and RFC 1049, but extends and revises them. | ||||
| Because RFC 822 said so little about message bodies, this | ||||
| document is largely orthogonal to (rather than a revision of) | ||||
| RFC 822. | ||||
| 1. Private values (starting with "X-") may be | In particular, this document is designed to provide facilities | |||
| defined bilaterally between two cooperating | to include multiple parts in a single message, to represent | |||
| agents without outside registration or | body text in character sets other than US-ASCII, to represent | |||
| standardization. | formatted multi-font text messages, to represent non-textual | |||
| material such as images and audio fragments, and generally to | ||||
| facilitate later extensions defining new types of Internet | ||||
| mail for use by cooperating mail agents. | ||||
| 2. New standard values must be documented, | This document does NOT extend Internet mail header fields to | |||
| registered with, and approved by IANA, as | permit anything other than US-ASCII text data. Such | |||
| described in Appendix E. Where intended for | extensions are the subject of [RFC-MIME-HEADERS]. | |||
| public use, the formats they refer to must | ||||
| also be defined by a published specification, | ||||
| and possibly offered for standardization. | ||||
| The seven standard initial predefined Content-Types are | This document is a revision of RFC 1521, which was a revision | |||
| detailed in the bulk of this document. They are: | of RFC 1341. Significant differences from RFC 1521 are | |||
| summarized in Appendix G. | ||||
| text -- textual information. The primary subtype, | 2. Table of Contents | |||
| "plain", indicates plain (unformatted) text. No | ||||
| special software is required to get the full | ||||
| meaning of the text, aside from support for the | ||||
| indicated character set. Subtypes are to be used | ||||
| for enriched text in forms where application | ||||
| software may enhance the appearance of the text, | ||||
| but such software must not be required in order to | ||||
| get the general idea of the content. Possible | ||||
| subtypes thus include any readable word processor | ||||
| format. A very simple and portable subtype, | ||||
| richtext, was defined in RFC 1341 [RFC-1341], with | ||||
| a further revision in RFC 1563 [RFC-1563]. | ||||
| multipart -- data consisting of multiple parts of | ||||
| independent data types. Four initial subtypes | ||||
| are defined, including the primary "mixed" | ||||
| subtype, "alternative" for representing the same | ||||
| data in multiple formats, "parallel" for parts | ||||
| intended to be viewed simultaneously, and "digest" | ||||
| for multipart entities in which each part is of | ||||
| type "message". | ||||
| message -- an encapsulated message. A body of | ||||
| Content-Type "message" is itself all or part of a | ||||
| fully formatted RFC 822 conformant message which | ||||
| may contain its own different Content-Type header | ||||
|  | 1 Abstract .............................................. 2 | |||
| 2 Table of Contents ..................................... 3 | ||||
| 3 Introduction .......................................... 5 | ||||
| 4 Notations, Conventions, and Generic BNF Grammar ....... 9 | ||||
| 5 MIME Header Fields .................................... 12 | ||||
| 5.1 MIME-Version Header Field ........................... 12 | ||||
| 5.2 Content-Type Header Field ........................... 14 | ||||
| 5.2.1 Syntax of the Content-Type Header Field ........... 15 | ||||
| 5.2.2 Definition of a Top-Level Content-Type ............ 18 | ||||
| 5.2.3 Initial Set of Top-Level Content-Types ............ 18 | ||||
| 5.3 Content-Transfer-Encoding Header Field .............. 21 | ||||
| 5.3.1 Content-Transfer-Encoding Syntax .................. 21 | ||||
| 5.3.2 Content-Transfer-Encoding Semantics ............... 22 | ||||
| 5.3.3 Quoted-Printable Content-Transfer-Encoding ........ 26 | ||||
| 5.3.4 Base64 Content-Transfer-Encoding .................. 30 | ||||
| 5.4 Content-ID Header Field ............................. 32 | ||||
| 5.5 Content-Description Header Field .................... 33 | ||||
| 5.6 Additional MIME Header Fields ....................... 33 | ||||
| 6 Predefined Content-Type Values ........................ 34 | ||||
| 6.1 Discrete Content-Type Values ........................ 34 | ||||
| 6.1.1 Text Content-Type ................................. 34 | ||||
| 6.1.1.1 Representation of Line Breaks ................... 35 | ||||
| 6.1.1.2 Charset Parameter ............................... 35 | ||||
| 6.1.1.3 Plain Subtype ................................... 38 | ||||
| 6.1.1.4 Unrecognized Subtypes ........................... 38 | ||||
| 6.1.2 Image Content-Type ................................ 39 | ||||
| 6.1.3 Audio Content-Type ................................ 39 | ||||
| 6.1.4 Video Content-Type ................................ 40 | ||||
| 6.1.5 Application Content-Type .......................... 40 | ||||
| 6.1.5.1 Octet-Stream Subtype ............................ 41 | ||||
| 6.1.5.2 PostScript Subtype .............................. 42 | ||||
| 6.1.5.3 Other Application Subtypes ...................... 45 | ||||
| 6.2 Composite Content-Type Values ....................... 46 | ||||
| 6.2.1 Multipart Content-Type ............................ 46 | ||||
| 6.2.1.1 Common Syntax ................................... 48 | ||||
| 6.2.1.2 Handling Nested Messages and Multiparts ......... 53 | ||||
| 6.2.1.3 Mixed Subtype ................................... 53 | ||||
| 6.2.1.4 Alternative Subtype ............................. 53 | ||||
| 6.2.1.5 Digest Subtype .................................. 56 | ||||
| 6.2.1.6 Parallel Subtype ................................ 57 | ||||
| 6.2.1.7 Other Multipart Subtypes ........................ 57 | ||||
| 6.2.2 Message Content-Type .............................. 57 | ||||
| 6.2.2.1 RFC822 Subtype .................................. 58 | ||||
| 6.2.2.2 Partial Subtype ................................. 58 | ||||
| 6.2.2.2.1 Message Fragmentation and Reassembly .......... 59 | ||||
| 6.2.2.2.2 Fragmentation and Reassembly Example .......... 60 | ||||
| 6.2.2.3 External-Body Subtype ........................... 62 | ||||
| 6.2.2.3.1 General External-Body Parameters .............. 64 | ||||
| 6.2.2.3.2 The 'ftp' and 'tftp' Access-Types ............. 65 | ||||
| 6.2.2.3.3 The 'anon-ftp' Access-Type .................... 66 | ||||
| 6.2.2.3.4 The 'local-file' Access-Type .................. 66 | ||||
| 6.2.2.3.5 The 'mail-server' Access-Type ................. 66 | ||||
| 6.2.2.3.6 Examples and Further Explanations ............. 67 | ||||
| 6.2.2.4 Other Message Subtypes .......................... 70 | ||||
| 7 Experimental Content-Type Values ...................... 71 | ||||
| 8 Summary ............................................... 72 | ||||
| 9 Security Considerations ............................... 73 | ||||
| 10 Authors' Addresses ................................... 74 | ||||
| 11 Acknowledgements ..................................... 75 | ||||
| A MIME Conformance ...................................... 77 | ||||
| B Guidelines For Sending Email Data ..................... 80 | ||||
| C A Complex Multipart Example ........................... 83 | ||||
| D Collected Grammar ..................................... 85 | ||||
| F Summary of the Seven Content-types .................... 88 | ||||
| G Canonical Encoding Model .............................. 91 | ||||
| H Changes from RFC 1521 ................................. 94 | ||||
| I References ............................................ 97 | ||||
| 3. Introduction | ||||
| field. The primary subtype is "rfc822". The | Since its publication in 1982, RFC 822 [RFC-822] has defined | |||
| "partial" subtype is defined for partial messages, | the standard format of textual mail messages on the Internet. | |||
| to permit the fragmented transmission of bodies | Its success has been such that the RFC 822 format has been | |||
| that are thought to be too large to be passed | adopted, wholly or partially, well beyond the confines of the | |||
| through mail transport facilities. Another | Internet and the Internet SMTP transport defined by RFC 821 | |||
| subtype, "External-body", is defined for | [RFC-821]. As the format has seen wider use, a number of | |||
| specifying large bodies by reference to an | limitations have proven increasingly restrictive for the user | |||
| external data source. | community. | |||
| image -- image data. Image requires a display device | ||||
| (such as a graphical display, a printer, or a FAX | ||||
| machine) to view the information. Initial | ||||
| subtypes are defined for two widely-used image | ||||
| formats, jpeg and gif. | ||||
| audio -- audio data, with initial subtype "basic". | ||||
| Audio requires an audio output device (such as a | ||||
| speaker or a telephone) to "display" the contents. | ||||
| video -- video data. Video requires the capability to | ||||
| display moving images, typically including | ||||
| specialized hardware and software. The initial | ||||
| subtype is "mpeg". | ||||
| application -- some other kind of data, typically | ||||
| either uninterpreted binary data or information to | ||||
| be processed by a mail-based application. The | ||||
| primary subtype, "octet-stream", is to be used in | ||||
| the case of uninterpreted binary data, in which | ||||
| case the simplest recommended action is to offer | ||||
| to write the information into a file for the user. | ||||
| An additional subtype, "PostScript", is defined | ||||
| for transporting PostScript documents in bodies. | ||||
| Other expected uses for "application" include | ||||
| spreadsheets, data for mail-based scheduling | ||||
| systems, and languages for "active" | ||||
| (computational) email. (Note that active email | ||||
| and other application data may entail several | ||||
| security considerations, which are discussed later | ||||
| in this memo, particularly in the context of | ||||
| application/PostScript.) | ||||
| Default RFC 822 messages are typed by this protocol as plain | RFC 822 was intended to specify a format for text messages. | |||
| text in the US-ASCII character set, which can be explicitly | As such, non-text messages, such as multimedia messages that | |||
| specified as "Content-type: text/plain; charset=us-ascii". | might include audio or images, are simply not mentioned. Even | |||
| If no Content-Type is specified, this default is assumed. | in the case of text, however, RFC 822 is inadequate for the | |||
| In the presence of a MIME-Version header field, a receiving | needs of mail users whose languages require the use of | |||
| User Agent can also assume that plain US-ASCII text was the | character sets richer than US-ASCII. Since RFC 822 does not | |||
| sender's intent. In the absence of a MIME-Version | specify mechanisms for mail containing audio, video, Asian | |||
| specification, plain US-ASCII text must still be assumed, | language text, or even text in most European languages, | |||
| additional specifications are needed. | ||||
| One of the notable limitations of RFC 821/822 based mail | ||||
| systems is the fact that they limit the contents of electronic | ||||
| mail messages to relatively short lines of 7-bit US-ASCII. | ||||
| This forces users to convert any non-textual data that they | ||||
| may wish to send into seven-bit bytes representable as | ||||
| printable US-ASCII characters before invoking a local mail UA | ||||
| (User Agent, a program with which human users send and receive | ||||
| mail). Examples of such encodings currently used in the | ||||
| Internet include pure hexadecimal, uuencode, the 3-in-4 base | ||||
| 64 scheme specified in RFC 1421, the Andrew Toolkit | ||||
| Representation [ATK], and many others. | ||||
| but the sender's intent might have been otherwise. | The limitations of RFC 822 mail become even more apparent as | |||
| gateways are designed to allow for the exchange of mail | ||||
| messages between RFC 822 hosts and X.400 hosts. X.400 [X400] | ||||
| specifies mechanisms for the inclusion of non-textual body | ||||
| parts within electronic mail messages. The current standards | ||||
| for the mapping of X.400 messages to RFC 822 messages specify | ||||
| either that X.400 non-textual body parts must be converted to | ||||
| (not encoded in) IA5Text format, or that they must be | ||||
| discarded, notifying the RFC 822 user that discarding has | ||||
| occurred. This is clearly undesirable, as information that a | ||||
| user may wish to receive is lost. Even though a user agent | ||||
| may not have the capability of dealing with the non-textual | ||||
| body part, the user might have some mechanism external to the | ||||
| UA that can extract useful information from the body part. | ||||
| Moreover, it does not allow for the fact that the message may | ||||
| eventually be gatewayed back into an X.400 message handling | ||||
| system (i.e., the X.400 message is "tunneled" through Internet | ||||
| mail), where the non-textual information would definitely | ||||
| become useful again. | ||||
| RATIONALE: In the absence of any Content-Type | This document describes several mechanisms that combine to | |||
| header field or MIME-Version header field, it is | solve most of these problems without introducing any serious | |||
| impossible to be certain that a message is | incompatibilities with the existing world of RFC 822 mail. In | |||
| actually text in the US-ASCII character set, since | particular, it describes: | |||
| it might well be a message that, using the | ||||
| conventions that predate this document, includes | ||||
| text in another character set or non-textual data | ||||
| in a manner that cannot be automatically | ||||
| recognized (e.g., a uuencoded compressed UNIX tar | ||||
| file). Although there is no fully acceptable | ||||
| alternative to treating such untyped messages as | ||||
| "text/plain; charset=us-ascii", implementors | ||||
| should remain aware that if a message lacks both | ||||
| the MIME-Version and the Content-Type header | ||||
| fields, it may in practice contain almost | ||||
| anything. | ||||
| It should be noted that the list of Content-Type values | (1) A MIME-Version header field, which uses a version | |||
| given here may be augmented in time, via the mechanisms | number to declare a message to be conformant with this | |||
| described above, and that the set of subtypes is expected to | specification and allows mail processing agents to | |||
| grow substantially. | distinguish between such messages and those generated | |||
| by older or non-conformant software, which are presumed | ||||
| to lack such a field. | ||||
| When a mail reader encounters mail with an unknown Content- | (2) A Content-Type header field, generalized from RFC 1049 | |||
| type value, it should generally treat it as equivalent to | [RFC-1049], which can be used to specify the type and | |||
| "application/octet-stream", as described later in this | subtype of data in the body of a message and to fully | |||
| document. | specify the native representation (encoding) of such | |||
| data. | ||||
| (3) A Content-Transfer-Encoding header field, which can be | ||||
| used to specify an auxiliary encoding that was applied | ||||
| to the data in order to allow it to pass through mail | ||||
| transport mechanisms which may have data or character | ||||
| set limitations. | ||||
| 5 The Content-Transfer-Encoding Header Field | (4) Two additional header fields that can be used to | |||
| further describe the data in a body, the Content-ID and | ||||
| Content-Description header fields. | ||||
| Many Content-Types which could usefully be transported via | All of these header fields defined in this document are | |||
| email are represented, in their "natural" format, as 8-bit | subject to the general syntactic rules for header fields | |||
| character or binary data. Such data cannot be transmitted | specified in RFC 822. In particular, all of these header | |||
| over some transport protocols. For example, RFC 821 | fields can include RFC 822 comments, which have no semantic | |||
| restricts mail messages to 7-bit US-ASCII data with lines no | content and should be ignored during MIME processing. | |||
| longer than 1000 characters. | ||||
| It is necessary, therefore, to define a standard mechanism | The generalized Content-Type header field values can be used | |||
| for re-encoding such data into a 7-bit short-line format. | to identify both discrete and composite bodies. The following | |||
| This document specifies that such encodings will be | types of discrete bodies are currently defined: | |||
| indicated by a new "Content-Transfer-Encoding" header field. | ||||
| The Content-Transfer-Encoding field is used to indicate the | ||||
| type of transformation that has been used in order to | ||||
| represent the body in an acceptable manner for transport. | ||||
| Unlike Content-Types, a proliferation of Content-Transfer- | (1) A "text" Content-Type value, which can be used to | |||
| Encoding values is undesirable and unnecessary. However, | represent textual information in a number of character | |||
| establishing only a single Content-Transfer-Encoding | sets and formatted text description languages in a | |||
| mechanism does not seem possible. There is a tradeoff | standardized manner. | |||
| between the desire for a compact and efficient encoding of | ||||
| largely-binary data and the desire for a readable encoding | ||||
| of data that is mostly, but not entirely, 7-bit data. For | ||||
| this reason, at least two encoding mechanisms are necessary: | ||||
| a "readable" encoding and a "dense" encoding. | ||||
| The Content-Transfer-Encoding field is designed to specify | (2) An "image" Content-Type value, for transmitting still | |||
| an invertible mapping between the "native" representation of | image (picture) data. | |||
| a type of data and a representation that can be readily | ||||
| exchanged using 7 bit mail transport protocols, such as | ||||
| those defined by RFC 821 (SMTP). This field has not been | ||||
| defined by any previous standard. The field's value is a | ||||
| single token specifying the type of encoding, as enumerated | ||||
| below. Formally: | ||||
| encoding := "Content-Transfer-Encoding" ":" mechanism | (3) An "audio" Content-Type value, for transmitting audio | |||
| or voice data. | ||||
| mechanism := "7bit" ; case-insensitive | (4) A "video" Content-Type value, for transmitting video or | |||
| / "quoted-printable" | moving image data, possibly with audio as part of the | |||
| / "base64" | composite video data format. | |||
| / "8bit" | ||||
| / "binary" | ||||
| / x-token | ||||
| (5) An "application" Content-Type value, which can be used | ||||
| to transmit application data or binary data, and hence, | ||||
| among other uses, to implement an electronic mail file | ||||
| transfer service. | ||||
| These values are not case sensitive. That is, Base64 and | Two types of composite bodies are currently defined: | |||
| BASE64 and bAsE64 are all equivalent. An encoding type of | ||||
| 7BIT requires that the body is already in a seven-bit mail- | ||||
| ready representation. This is the default value -- that is, | ||||
| "Content-Transfer-Encoding: 7BIT" is assumed if the | ||||
| Content-Transfer-Encoding header field is not present. | ||||
| The values "8bit", "7bit", and "binary" all mean that NO | (1) A "multipart" Content-Type value, which can be used to | |||
| encoding has been performed. However, they are potentially | combine several body parts, possibly of differing types | |||
| useful as indications of the kind of data contained in the | of data, into a single message. | |||
| object, and therefore of the kind of encoding that might | ||||
| need to be performed for transmission in a given transport | ||||
| system. In particular: | ||||
| "7bit" means that the data is all represented as short | (2) A "message" Content-Type value, for encapsulating | |||
| lines of US-ASCII data. | another message or part of a message. | |||
| "8bit" means that the lines are short, but there may be | ||||
| non-ASCII characters (octets with the high-order | ||||
| bit set). | ||||
| "Binary" means that not only may non-ASCII characters | ||||
| be present, but also that the lines are not | ||||
| necessarily short enough for SMTP transport. | ||||
| The difference between "8bit" (or any other conceivable | MIME's Content-Type mechanism has been carefully designed to | |||
| bit-width token) and the "binary" token is that "binary" | be extensible, and it is expected that the set of content- | |||
| does not require adherence to any limits on line length or | type/subtype pairs and their associated parameters will grow | |||
| to the SMTP CRLF semantics, while the bit-width tokens do | significantly with time. Several other MIME entities, most | |||
| require such adherence. If the body contains data in any | notably the list of the names of character sets registered for | |||
| bit-width other than 7-bit, the appropriate bit-width | MIME usage, are likely to have new values defined over time. | |||
| Content-Transfer-Encoding token must be used (e.g., "8bit" | In order to ensure that the set of such values is developed in | |||
| for unencoded 8 bit wide data). If the body contains binary | an orderly, well-specified, and public manner, MIME sets up a | |||
| data, the "binary" Content-Transfer-Encoding token must be | registration process which uses the Internet Assigned Numbers | |||
| used. | Authority (IANA) as a central registry for MIME's extension | |||
| areas. The registration process is described in RFC REG [RFC- | ||||
| REG]. | ||||
| NOTE: The distinction between the Content- | Finally, to specify and promote interoperability, Appendix A | |||
| Transfer-Encoding values of "binary", "8bit", etc. | of this document provides a basic applicability statement for | |||
| may seem unimportant, in that all of them really | a subset of the above mechanisms that defines a minimal level | |||
| mean "none" -- that is, there has been no encoding | of "conformance" with this document. | |||
| of the data for transport. However, clear | ||||
| labeling will be of enormous value to gateways | ||||
| between future mail transport systems with | ||||
| differing capabilities in transporting data that | ||||
| do not meet the restrictions of RFC 821 transport. | ||||
| Mail transport for unencoded 8-bit data is defined | HISTORICAL NOTE: Several of the mechanisms described in this | |||
| in RFC-1426 [RFC-1426]. As of the publication of | document may seem somewhat strange or even baroque at first | |||
| reading. It is important to note that compatibility with | ||||
| existing standards AND robustness across existing practice | ||||
| were two of the highest priorities of the working group that | ||||
| developed this document. In particular, compatibility was | ||||
| always favored over elegance. | ||||
| MIME was first defined and published as RFC 1341 [RFC-1341] | ||||
| and RFC 1342 [RFC-1342], then revised in RFC 1521 [RFC-1521] | ||||
| and RFC 1522 [RFC-1522]. This document is a relatively minor | ||||
| updating of RFC 1521, and is intended to supersede it. The | ||||
| companion document RFC MIME-HEADERS [RFC-MIME-HEADERS] in turn | ||||
| supersedes RFC 1522. | ||||
| this document, there are no standardized Internet | The differences between this document and RFC 1521 are | |||
| mail transports for which it is legitimate to | summarized in Appendix G. Please refer to the current edition | |||
| include unencoded binary data in mail bodies. | of the "IAB Official Protocol Standards" for the | |||
| Thus there are no circumstances in which the | standardization state and status of this protocol. RFC 822 and | |||
| "binary" Content-Transfer-Encoding is actually | RFC 1123 [RFC-1123] also provide essential background for MIME | |||
| legal on the Internet. However, in the event that | since no conforming implementation of MIME can violate them. | |||
| binary mail transport becomes a reality in | In addition, several other informational RFC documents will be | |||
| Internet mail, or when this document is used in | of interest to the MIME implementor, in particular RFC 1344 | |||
| conjunction with any other binary-capable | [RFC-1344], RFC 1345 [RFC-1345], and RFC 1524 [RFC-1524]. | |||
| transport mechanism, binary bodies should be | ||||
| labeled as such using this mechanism. | ||||
| NOTE: The five values defined for the Content- | 4. Notations, Conventions, and Generic BNF Grammar | |||
| Transfer-Encoding field imply nothing about the | ||||
| Content-Type other than the algorithm by which it | ||||
| was encoded or the transport system requirements | ||||
| if unencoded. | ||||
| Implementors may, if necessary, define new Content- | Although the mechanisms specified in this document are all | |||
| Transfer-Encoding values, but must use an x-token, which is | described in prose, most are also described formally in the | |||
| a name prefixed by "X-" to indicate its non-standard status, | augmented BNF notation of RFC 822. Implementors will need to | |||
| e.g., "Content-Transfer-Encoding: x-my-new-encoding". | be familiar with this notation in order to understand this | |||
| However, unlike Content-Types and subtypes, the creation of | specification, and are referred to RFC 822 for a complete | |||
| new Content-Transfer-Encoding values is explicitly and | explanation of the augmented BNF notation. | |||
| strongly discouraged, as it seems likely to hinder | ||||
| interoperability with little potential benefit. Their use | ||||
| is allowed only as the result of an agreement between | ||||
| cooperating user agents. | ||||
| If a Content-Transfer-Encoding header field appears as part | Some of the augmented BNF in this document makes reference to | |||
| of a message header, it applies to the entire body of that | syntactic entities that are defined in RFC 822 and not in this | |||
| message. If a Content-Transfer-Encoding header field | document. A complete formal grammar, then, is obtained by combining | |||
| appears as part of a body part's headers, it applies only to | Appendix D of this document, the collected grammar, with the | |||
| the body of that body part. If an entity is of type | BNF of RFC 822 plus the modifications to RFC 822 defined in | |||
| "multipart" or "message", the Content-Transfer-Encoding is | RFC 1123, which specifically changes the syntax for `return', | |||
| not permitted to have any value other than a bit width | `date' and `mailbox'. | |||
| (e.g., "7bit", "8bit", etc.) or "binary". | ||||
| It should be noted that email is character-oriented, so that | The term CRLF, in this document, refers to the sequence of the | |||
| the mechanisms described here are mechanisms for encoding | two US-ASCII characters CR (decimal value 13) and LF (decimal | |||
| arbitrary octet streams, not bit streams. If a bit stream | value 10) which, taken together, in this order, denote a line | |||
| is to be encoded via one of these mechanisms, it must first | break in RFC 822 mail. | |||
| be converted to an 8-bit byte stream using the network | ||||
| standard bit order ("big-endian"), in which the earlier bits | ||||
| in a stream become the higher-order bits in a byte. A bit | ||||
| stream not ending at an 8-bit boundary must be padded with | ||||
| The term "character set" is used in this document to refer to | ||||
| a method used with one or more tables to convert a sequence of | ||||
| octets into a sequence of characters. Note that unconditional | ||||
| conversion in the other direction is not required, in that not | ||||
| all characters may be available in a given character set and a | ||||
| character set may provide more than one sequence of octets to | ||||
| represent a particular character. This definition is intended | ||||
| to allow various kinds of character encodings, from simple | ||||
| single-table mappings such as US-ASCII to complex table | ||||
| switching methods such as those that use ISO 2022's | ||||
| techniques. However, the definition associated with a MIME | ||||
| character set name must fully specify the mapping to be | ||||
| performed from octets to characters. In particular, use of | ||||
| external profiling information to determine the exact mapping | ||||
| is not permitted. | ||||
| zeroes. This document provides a mechanism for noting the | The term "message", when not further qualified, means either | |||
| addition of such padding in the case of the application | the (complete or "top-level") message being transferred on a | |||
| Content-Type, which has a "padding" parameter. | network, or a message encapsulated in a body part of type | |||
| "message". | ||||
| The encoding mechanisms defined here explicitly encode all | The term "body part", in this document, refers to either a | |||
| data in ASCII. Thus, for example, suppose an entity has | single part message or one of the parts in the body of a | |||
| header fields such as: | multipart entity. A body part has a header and a body, so it | |||
| makes sense to speak about the body of a body part. | ||||
| Content-Type: text/plain; charset=ISO-8859-1 | The term "entity", in this document, means either a message or | |||
| Content-transfer-encoding: base64 | a body part. All kinds of entities share the property that | |||
| they have a header and a body. | ||||
| This must be interpreted to mean that the body is a base64 | The term "body", when not further qualified, means the body of | |||
| ASCII encoding of data that was originally in ISO-8859-1, | an entity, that is the body of either a message or of a body | |||
| and will be in that character set again after decoding. | part. | |||
| The following sections will define the two standard encoding | NOTE: The previous four definitions are clearly circular. | |||
| mechanisms. The definition of new content-transfer- | This is unavoidable, since the overall structure of a MIME | |||
| encodings is explicitly discouraged and should only occur | message is indeed recursive. | |||
| when absolutely necessary. All content-transfer-encoding | ||||
| namespace except that beginning with "X-" is explicitly | ||||
| reserved to the IANA for future use. Private agreements | ||||
| about content-transfer-encodings are also explicitly | ||||
| discouraged. | ||||
| Certain Content-Transfer-Encoding values may only be used on | "7bit data" refers to data that is all represented as short | |||
| certain Content-Types. In particular, it is expressly | lines of US-ASCII. CR (decimal value 13) and LF (decimal | |||
| forbidden to use any encodings other than "7bit", "8bit", or | value 10) characters only occur as part of CRLF line | |||
| "binary" with any Content-Type that recursively includes | separation sequences and no NULs (US-ASCII value 0) are | |||
| other Content-Type fields, notably the "multipart" and | allowed. | |||
| "message" Content-Types. All encodings that are desired for | ||||
| bodies of type multipart or message must be done at the | ||||
| innermost level, by encoding the actual body that needs to | ||||
| be encoded. | ||||
| It should also be noted that, by definition, if a | (1) "8bit data" refers to data that is all represented as | |||
| "multipart" or "message" entity has a transfer-encoding | short lines, but there may be non-US-ASCII characters | |||
| value such as "7bit", but one of the enclosed parts has a | (octets with the high-order bit set) present. As with | |||
| less restrictive value such as "8bit", then either the outer | "7bit data" CR and LF characters only occur as part of | |||
| "7bit" labelling is in error, because 8 bit data are | CRLF line separation sequences and no NULs are allowed. | |||
| included, or the inner "8bit" labelling placed an | ||||
| unnecessarily high demand on the transport system because | ||||
| the actual included data were actually 7bit-safe. | ||||
| NOTE ON ENCODING RESTRICTIONS: Though the | (2) "Binary data" refers to data where any sequence of | |||
| prohibition against using content-transfer- | octets whatsoever is allowed. | |||
| encodings on data of type multipart or message may | ||||
| "Lines" are defined as sequences of octets separated by CRLF | ||||
| sequences. This is consistent with both RFC 821 and RFC 822. | ||||
| Lines in MIME bodies must also be terminated with a CRLF, but | ||||
| the terminating CRLF on the last line of the body may properly | ||||
| be part of a subsequent boundary marker rather than being part | ||||
| of the body itself. | ||||
| seem overly restrictive, it is necessary to | In this document, all numeric and octet values are given in | |||
| prevent nested encodings, in which data are passed | decimal notation. All Content-Type values, subtypes, and | |||
| through an encoding algorithm multiple times, and | parameter names as defined in this document are case- | |||
| must be decoded multiple times in order to be | insensitive. However, parameter values are case-sensitive | |||
| properly viewed. Nested encodings add | unless otherwise specified for the specific parameter. | |||
| considerable complexity to user agents: aside | ||||
| from the obvious efficiency problems with such | ||||
| multiple encodings, they can obscure the basic | ||||
| structure of a message. In particular, they can | ||||
| imply that several decoding operations are | ||||
| necessary simply to find out what types of objects | ||||
| a message contains. Banning nested encodings may | ||||
| complicate the job of certain mail gateways, but | ||||
| this seems less of a problem than the effect of | ||||
| nested encodings on user agents. | ||||
| NOTE ON THE RELATIONSHIP BETWEEN CONTENT-TYPE AND | FORMATTING NOTE: Notes, such as this one, provide additional | |||
| CONTENT-TRANSFER-ENCODING: It may seem that the | nonessential information which may be skipped by the reader | |||
| Content-Transfer-Encoding could be inferred from | without missing anything essential. The primary purpose of | |||
| the characteristics of the Content-Type that is to | these non-essential notes is to convey information about the | |||
| be encoded, or, at the very least, that certain | rationale of this document, or to place this document in the | |||
| Content-Transfer-Encodings could be mandated for | proper historical or evolutionary context. Such information | |||
| use with specific Content-Types. There are several | may in particular be skipped by those who are focused entirely | |||
| reasons why this is not the case. First, given the | on building a conformant implementation, but may be of use to | |||
| varying types of transports used for mail, some | those who wish to understand why certain design choices were | |||
| encodings may be appropriate for some Content- | made. | |||
| Type/transport combinations and not for others. | ||||
| (For example, in an 8-bit transport, no encoding | ||||
| would be required for text in certain character | ||||
| sets, while such encodings are clearly required | ||||
| for 7-bit SMTP.) | ||||
| Second, certain Content-Types may require | 5. MIME Header Fields | |||
| different types of transfer encoding under | ||||
| different circumstances. For example, many | ||||
| PostScript bodies might consist entirely of short | ||||
| lines of 7-bit data and hence require little or no | ||||
| encoding. Other PostScript bodies (especially | ||||
| those using Level 2 PostScript's binary encoding | ||||
| mechanism) may only be reasonably represented | ||||
| using a binary transport encoding. Finally, since | ||||
| Content-Type is intended to be an open-ended | ||||
| specification mechanism, strict specification of | ||||
| an association between Content-Types and encodings | ||||
| effectively couples the specification of an | ||||
| application protocol with a specific lower-level | ||||
| Expires 11/20/94 draft-ietf-822-mime-00.txt May 1994 | MIME defines a number of new RFC 822 header fields that are | |||
| used to describe the content of messages. These header fields | ||||
| occur in two contexts: | ||||
| transport. This is not desirable since the | (1) As part of a regular RFC 822 message header. | |||
| developers of a Content-Type should not have to be | ||||
| aware of all the transports in use and what their | ||||
| limitations are. | ||||
| NOTE ON TRANSLATING ENCODINGS: The quoted- | (2) In a MIME body part header within a multipart | |||
| printable and base64 encodings are designed so | construct. | |||
| that conversion between them is possible. The only | ||||
| issue that arises in such a conversion is the | ||||
| handling of line breaks. When converting from | ||||
| quoted-printable to base64 a line break must be | ||||
| converted into a CRLF sequence. Similarly, a CRLF | ||||
| sequence in base64 data must be converted to a | ||||
| quoted-printable line break, but ONLY when | ||||
| converting text data. | ||||
| NOTE ON CANONICAL ENCODING MODEL: There was some | The formal definition of these header fields is as follows: | |||
| confusion, in earlier drafts of this memo, | ||||
| regarding the model for when email data was to be | ||||
| converted to canonical form and encoded, and in | ||||
| particular how this process would affect the | ||||
| treatment of CRLFs, given that the representation | ||||
| of newlines varies greatly from system to system, | ||||
| and the relationship between content-transfer- | ||||
| encodings and character sets. For this reason, a | ||||
| canonical model for encoding is presented as | ||||
| Appendix G. | ||||
| MIME-message-headers := fields | ||||
| version CRLF | ||||
| [ content CRLF ] | ||||
| [ encoding CRLF ] | ||||
| [ id CRLF ] | ||||
| [ description CRLF ] | ||||
| *( mime-extension-field CRLF ) | ||||
| ; The ordering of the header | ||||
| ; fields implied by this BNF | ||||
| ; definition should be ignored | ||||
| 5.1 Quoted-Printable Content-Transfer-Encoding | MIME-part-headers := [ content CRLF ] | |||
| [ encoding CRLF ] | ||||
| [ id CRLF ] | ||||
| [ description CRLF ] | ||||
| *( mime-extension-field CRLF ) | ||||
| ; The ordering of the header | ||||
| ; fields implied by this BNF | ||||
| ; definition should be ignored | ||||
| The Quoted-Printable encoding is intended to represent data | The syntax of the various specific MIME header fields will be | |||
| that largely consists of octets that correspond to printable | described in the following sections. | |||
| characters in the ASCII character set. It encodes the data | ||||
| in such a way that the resulting octets are unlikely to be | ||||
| modified by mail transport. If the data being encoded are | ||||
| mostly ASCII text, the encoded form of the data remains | ||||
| largely recognizable by humans. A body which is entirely | ||||
| ASCII may also be encoded in Quoted-Printable to ensure the | ||||
| integrity of the data should the message pass through a | ||||
| character-translating, and/or line-wrapping gateway. | ||||
| In this encoding, octets are to be represented as determined | 5.1. MIME-Version Header Field | |||
| by the following rules: | ||||
| Rule #1: (General 8-bit representation) Any octet, | Since RFC 822 was published in 1982, there has really been | |||
| except those indicating a line break according to the | only one format standard for Internet messages, and there has | |||
| newline convention of the canonical (standard) form of | been little perceived need to declare the format standard in | |||
| the data being encoded, may be represented by an "=" | use. This document is an independent document that | |||
| followed by a two digit hexadecimal representation of | complements RFC 822. Although the extensions in this document | |||
| the octet's value. The digits of the hexadecimal | have been defined in such a way as to be compatible with RFC | |||
| alphabet, for this purpose, are "0123456789ABCDEF". | 822, there are still circumstances in which it might be | |||
| Uppercase letters must be used when sending hexadecimal | desirable for a mail-processing agent to know whether a | |||
| data, though a robust implementation may choose to | message was composed with the new standard in mind. | |||
| recognize lowercase letters on receipt. Thus, for | ||||
| example, the value 12 (ASCII form feed) can be | ||||
| represented by "=0C", and the value 61 (ASCII EQUAL | ||||
| SIGN) can be represented by "=3D". Except when the | ||||
| following rules allow an alternative encoding, this | ||||
| rule is mandatory. | ||||
| Rule #2: (Literal representation) Octets with decimal | Therefore, this document defines a new header field, "MIME- | |||
| values of 33 through 60 inclusive, and 62 through 126, | Version", which is to be used to declare the version of the | |||
| inclusive, MAY be represented as the ASCII characters | Internet message body format standard in use. | |||
| which correspond to those octets (EXCLAMATION POINT | ||||
| through LESS THAN, and GREATER THAN through TILDE, | ||||
| respectively). | ||||
| Rule #3: (White Space): Octets with values of 9 and 32 | Messages composed in accordance with this document MUST | |||
| MAY be represented as ASCII TAB (HT) and SPACE | include such a header field, with the following verbatim text: | |||
| characters, respectively, but MUST NOT be so | ||||
| represented at the end of an encoded line. Any TAB (HT) | ||||
| or SPACE characters on an encoded line MUST thus be | ||||
| followed on that line by a printable character. In | ||||
| particular, an "=" at the end of an encoded line, | ||||
| MIME-Version: 1.0 | ||||
| indicating a soft line break (see rule #5) may follow | The presence of this header field is an assertion that the | |||
| one or more TAB (HT) or SPACE characters. It follows | message has been composed in compliance with this document. | |||
| that an octet with value 9 or 32 appearing at the end | ||||
| of an encoded line must be represented according to | ||||
| Rule #1. This rule is necessary because some MTAs | ||||
| (Message Transport Agents, programs which transport | ||||
| messages from one user to another, or perform a part of | ||||
| such transfers) are known to pad lines of text with | ||||
| SPACEs, and others are known to remove "white space" | ||||
| characters from the end of a line. Therefore, when | ||||
| decoding a Quoted-Printable body, any trailing white | ||||
| space on a line must be deleted, as it will necessarily | ||||
| have been added by intermediate transport agents. | ||||
| Rule #4 (Line Breaks): A line break in a text body, | Since it is possible that a future document might extend the | |||
| independent of what its representation is following the | message format standard again, a formal BNF is given for the | |||
| canonical representation of the data being encoded, | content of the MIME-Version field: | |||
| must be represented by a (RFC 822) line break, which is | ||||
| a CRLF sequence, in the Quoted-Printable encoding. | ||||
| Since the canonical representation of types other than | ||||
| text does not generally include the representation of | ||||
| line breaks, no hard line breaks (i.e. line breaks that | ||||
| are intended to be meaningful and to be displayed to | ||||
| the user) should occur in the quoted-printable encoding | ||||
| of such types. Of course, occurrences of "=0D", "=0A", | ||||
| "=0A=0D" and "=0D=0A" will eventually be encountered. | ||||
| In general, however, base64 is preferred over quoted- | ||||
| printable for binary data. | ||||
| Note that many implementations may elect to encode the | version := "MIME-Version" ":" 1*DIGIT "." 1*DIGIT | |||
| local representation of various content types directly, | ||||
| as described in Appendix G. In particular, this may | ||||
| apply to plain text material on systems that use | ||||
| newline conventions other than CRLF delimiters. Such an | ||||
| implementation is permissible, but the generation of | ||||
| line breaks must be generalized to account for the case | ||||
| where alternate representations of newline sequences | ||||
| are used. | ||||
| Rule #5 (Soft Line Breaks): The Quoted-Printable | Thus, future format specifiers, which might replace or extend | |||
| encoding REQUIRES that encoded lines be no more than 76 | "1.0", are constrained to be two integer fields, separated by | |||
| characters long. If longer lines are to be encoded with | a period. If a message is received with a MIME-version value | |||
| the Quoted-Printable encoding, 'soft' line breaks must | other than "1.0", it cannot be assumed to conform with this | |||
| be used. An equal sign as the last character on an | specification. | |||
| encoded line indicates such a non-significant ('soft') | ||||
| line break in the encoded text. Thus if the "raw" form | ||||
| Note that the MIME-Version header field is required at the top | ||||
| level of a message. It is not required for each body part of | ||||
| a multipart entity. It is required for the embedded headers | ||||
| of a body of type "message" if and only if the embedded | ||||
| message is itself claimed to be MIME-conformant. | ||||
| of the line is a single unencoded line that says: | It is not possible to fully specify how a mail reader that | |||
| conforms with MIME as defined in this document should treat a | ||||
| message that might arrive in the future with some value of | ||||
| MIME-Version other than "1.0". | ||||
| Now's the time for all folk to come to the aid of | It is also worth noting that version control for specific | |||
| their country. | content-types is not accomplished using the MIME-Version | |||
| mechanism. In particular, some formats (such as | ||||
| application/postscript) have version numbering conventions | ||||
| that are internal to the document format. Where such | ||||
| conventions exist, MIME does nothing to supersede them. Where | ||||
| no such conventions exist, a MIME type might use a "version" | ||||
| parameter in the content-type field if necessary. | ||||
| This can be represented, in the Quoted-Printable | NOTE TO IMPLEMENTORS: When checking MIME-Version values any | |||
| encoding, as | RFC 822 comment strings that are present must be ignored. In | |||
| particular, the following four MIME-Version fields are | ||||
| equivalent: | ||||
| Now's the time = | MIME-Version: 1.0 | |||
| for all folk to come= | ||||
| to the aid of their country. | ||||
| This provides a mechanism with which long lines are | MIME-Version: 1.0 (produced by MetaSend Vx.x) | |||
| encoded in such a way as to be restored by the user | ||||
| agent. The 76 character limit does not count the | ||||
| trailing CRLF, but counts all other characters, | ||||
| including any equal signs. | ||||
| Since the hyphen character ("-") is represented as itself in | MIME-Version: (produced by MetaSend Vx.x) 1.0 | |||
| the Quoted-Printable encoding, care must be taken, when | ||||
| encapsulating a quoted-printable encoded body in a multipart | ||||
| entity, to ensure that the encapsulation boundary does not | ||||
| appear anywhere in the encoded body. (A good strategy is to | ||||
| choose a boundary that includes a character sequence such as | ||||
| "=_" which can never appear in a quoted-printable body. See | ||||
| the definition of multipart messages later in this | ||||
| document.) | ||||
| NOTE: The quoted-printable encoding represents | MIME-Version: 1.(produced by MetaSend Vx.x)0 | |||
| something of a compromise between readability and | ||||
| reliability in transport. Bodies encoded with the | ||||
| quoted-printable encoding will work reliably over | ||||
| most mail gateways, but may not work perfectly | ||||
| over a few gateways, notably those involving | ||||
| translation into EBCDIC. (In theory, an EBCDIC | ||||
| gateway could decode a quoted-printable body and | ||||
| re-encode it using base64, but such gateways do | ||||
| not yet exist.) A higher level of confidence is | ||||
| offered by the base64 Content-Transfer-Encoding. | ||||
| A way to get reasonably reliable transport through | ||||
| EBCDIC gateways is to also quote the ASCII | ||||
| characters | ||||
| !"#$@[\]^`{|}~ | 5.2. Content-Type Header Field | |||
| The purpose of the Content-Type field is to describe the data | ||||
| contained in the body fully enough that the receiving user | ||||
| agent can pick an appropriate agent or mechanism to present | ||||
| the data to the user, or otherwise deal with the data in an | ||||
| appropriate manner. | ||||
| according to rule #1. See Appendix B for more | HISTORICAL NOTE: The Content-Type header field was first | |||
| information. | defined in RFC 1049. RFC 1049 Content-types used a simpler | |||
| and less powerful syntax, but one that is largely compatible | ||||
| with the mechanism given here. | ||||
| Because quoted-printable data is generally assumed to be | The Content-Type header field is used to specify the nature of | |||
| line-oriented, it is to be expected that the representation | the data in the body of an entity, by giving type and subtype | |||
| of the breaks between the lines of quoted printable data may | identifiers, and by providing auxiliary information that may | |||
| be altered in transport, in the same manner that plain text | be required for certain types. After the type and subtype | |||
| mail has always been altered in Internet mail when passing | names, the remainder of the header field is simply a set of | |||
| between systems with differing newline conventions. If such | parameters, specified in an attribute/value notation. The | |||
| alterations are likely to constitute a corruption of the | ordering of parameters is not significant. | |||
| data, it is probably more sensible to use the base64 | ||||
| encoding rather than the quoted-printable encoding. | ||||
| WARNING TO IMPLEMENTORS: If binary data are encoded in | In general, the top-level Content-Type is used to declare the | |||
| quoted-printable, care must be taken to encode CR and LF | general type of data, while the subtype specifies a specific | |||
| characters as "=0D" and "=0A", respectively. In particular, | format for that type of data. Thus, a Content-Type of | |||
| a CRLF sequence in binary data should be encoded as | "image/xyz" is enough to tell a user agent that the data is an | |||
| "=0D=0A". Otherwise, if CRLF were represented as a hard | image, even if the user agent has no knowledge of the specific | |||
| line break, it might be incorrectly decoded on platforms | image format "xyz". Such information can be used, for | |||
| with different line break conventions. | example, to decide whether or not to show a user the raw data | |||
| from an unrecognized subtype -- such an action might be | ||||
| reasonable for unrecognized subtypes of text, but not for | ||||
| unrecognized subtypes of image or audio. For this reason, | ||||
| registered subtypes of text, image, audio, and video should | ||||
| not contain embedded information that is really of a different | ||||
| type. Such compound formats should be represented using the | ||||
| "multipart" or "application" types. | ||||
| For formalists, the syntax of quoted-printable data is | Parameters are modifiers of the content-subtype, and as such | |||
| described by the following grammar: | do not fundamentally affect the nature of the content. The set | |||
| of meaningful parameters depends on the content-type and | ||||
| subtype. Most parameters are associated with a single specific | ||||
| subtype. However, a given top-level content-type may define | ||||
| parameters which are applicable to any subtype of that type. | ||||
| For example, the "charset" parameter is applicable to any | ||||
| subtype of "text", while the "boundary" parameter is required | ||||
| for any subtype of the "multipart" content-type. | ||||
| quoted-printable := ([*(ptext / SPACE / TAB) ptext] ["="] | There are NO globally-meaningful parameters that apply to all | |||
| CRLF) | content-types. Truly global mechanisms are best addressed, in | |||
| ; Maximum line length of 76 characters excluding CRLF | the MIME model, by the definition of additional Content-* | |||
| header fields. | ||||
| ptext := octet / <any ASCII character except "=", SPACE, or | An initial set of seven top-level Content-Types is defined by | |||
| TAB> | this document. Five of these are discrete types whose content | |||
| ; characters not listed as "mail-safe" in Appendix B | is essentially opaque as far as MIME processing is concerned. | |||
| ; are also not recommended. | The remaining two are composite types whose contents require | |||
| additional handling by MIME processors. | ||||
| octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F") | This set of top-level Content-Types is intended to be | |||
| ; octet must be used for characters > 127, =, SPACE, or | substantially complete. It is expected that additions to the | |||
| TAB, | larger set of supported types can generally be accomplished by | |||
| ; and is recommended for any characters not listed in | the creation of new subtypes of these initial types. In the | |||
| ; Appendix B as "mail-safe". | future, more top-level types may be defined only by a | |||
| standards-track extension to this standard. If another top- | ||||
| level type is to be used for any reason, it must be given a | ||||
| name starting with "X-" to indicate its non-standard status | ||||
| and to avoid a potential conflict with a future official name. | ||||
| 5.2.1. Syntax of the Content-Type Header Field | ||||
| 5.2 Base64 Content-Transfer-Encoding | In the Augmented BNF notation of RFC 822, a Content-Type | |||
| header field value is defined as follows: | ||||
| The Base64 Content-Transfer-Encoding is designed to | content := "Content-Type" ":" type "/" subtype | |||
| represent arbitrary sequences of octets in a form that need | *(";" parameter) | |||
| not be humanly readable. The encoding and decoding | ; Matching of type and subtype is | |||
| algorithms are simple, but the encoded data are consistently | ; ALWAYS case-insensitive | |||
| only about 33 percent larger than the unencoded data. This | ||||
| encoding is virtually identical to the one used in Privacy | ||||
| Enhanced Mail (PEM) applications, as defined in RFC 1421. | ||||
| The base64 encoding is adapted from RFC 1421, with one | ||||
| change: base64 eliminates the "*" mechanism for embedded | ||||
| clear text. | ||||
| A 65-character subset of US-ASCII is used, enabling 6 bits | type := discrete-type / composite-type | |||
| to be represented per printable character. (The extra 65th | ||||
| character, "=", is used to signify a special processing | ||||
| function.) | ||||
| NOTE: This subset has the important property that | discrete-type := "text" / "image" / "audio" / "video" / | |||
| it is represented identically in all versions of | "application" / extension-token | |||
| ISO 646, including US ASCII, and all characters in | ||||
| the subset are also represented identically in all | ||||
| versions of EBCDIC. Other popular encodings, | ||||
| such as the encoding used by the uuencode utility | ||||
| and the base85 encoding specified as part of Level | ||||
| 2 PostScript, do not share these properties, and | ||||
| thus do not fulfill the portability requirements a | ||||
| binary transport encoding for mail must meet. | ||||
| The encoding process represents 24-bit groups of input bits | composite-type := "message" / "multipart" / extension-token | |||
| as output strings of 4 encoded characters. Proceeding from | ||||
| left to right, a 24-bit input group is formed by | ||||
| concatenating 3 8-bit input groups. These 24 bits are then | ||||
| treated as 4 concatenated 6-bit groups, each of which is | ||||
| translated into a single digit in the base64 alphabet. When | ||||
| encoding a bit stream via the base64 encoding, the bit | ||||
| stream must be presumed to be ordered with the most- | ||||
| significant-bit first. That is, the first bit in the stream | ||||
| will be the high-order bit in the first byte, and the eighth | ||||
| bit will be the low-order bit in the first byte, and so on. | ||||
| Each 6-bit group is used as an index into an array of 64 | extension-token := iana-token / ietf-token / x-token | |||
| printable characters. The character referenced by the index | ||||
| is placed in the output string. These characters, identified | ||||
| in Table 1, below, are selected so as to be universally | ||||
| representable, and the set excludes characters with | ||||
| iana-token := <a publicly-defined extension token, | ||||
| registered with IANA, as specified in | ||||
| RFC REG [RFC-REG]> | ||||
| particular significance to SMTP (e.g., ".", CR, LF) and to | ietf-token := <a publicly-defined extension token, | |||
| the encapsulation boundaries defined in this document (e.g., | initially registered with IANA and | |||
| "-"). | subsequently standardized by the IETF> | |||
| Table 1: The Base64 Alphabet | x-token := <The two characters "X-" or "x-" followed, with | |||
| no intervening white space, by any token> | ||||
| Value Encoding Value Encoding Value Encoding Value | subtype := extension-token | |||
| Encoding | ||||
| 0 A 17 R 34 i 51 z | ||||
| 1 B 18 S 35 j 52 0 | ||||
| 2 C 19 T 36 k 53 1 | ||||
| 3 D 20 U 37 l 54 2 | ||||
| 4 E 21 V 38 m 55 3 | ||||
| 5 F 22 W 39 n 56 4 | ||||
| 6 G 23 X 40 o 57 5 | ||||
| 7 H 24 Y 41 p 58 6 | ||||
| 8 I 25 Z 42 q 59 7 | ||||
| 9 J 26 a 43 r 60 8 | ||||
| 10 K 27 b 44 s 61 9 | ||||
| 11 L 28 c 45 t 62 + | ||||
| 12 M 29 d 46 u 63 / | ||||
| 13 N 30 e 47 v | ||||
| 14 O 31 f 48 w (pad) = | ||||
| 15 P 32 g 49 x | ||||
| 16 Q 33 h 50 y | ||||
| The output stream (encoded bytes) must be represented in | parameter := attribute "=" value | |||
| lines of no more than 76 characters each. All line breaks | ||||
| or other characters not found in Table 1 must be ignored by | ||||
| decoding software. In base64 data, characters other than | ||||
| those in Table 1, line breaks, and other white space | ||||
| probably indicate a transmission error, about which a | ||||
| warning message or even a message rejection might be | ||||
| appropriate under some circumstances. | ||||
| Special processing is performed if fewer than 24 bits are | attribute := token | |||
| available at the end of the data being encoded. A full | ||||
| encoding quantum is always completed at the end of a body. | ||||
| When fewer than 24 input bits are available in an input | ||||
| group, zero bits are added (on the right) to form an | ||||
| integral number of 6-bit groups. Padding at the end of the | ||||
| data is performed using the '=' character. Since all | ||||
| base64 input is an integral number of octets, only the | ||||
| following cases can arise: (1) the final quantum of encoding | ||||
| input is an integral multiple of 24 bits; here, the final | ||||
| unit of encoded output will be an integral multiple of 4 | ||||
| value := token / quoted-string | ||||
| characters with no "=" padding, (2) the final quantum of | token := 1*<any (US-ASCII) CHAR except SPACE, CTLs, | |||
| encoding input is exactly 8 bits; here, the final unit of | or tspecials> | |||
| encoded output will be two characters followed by two "=" | ||||
| padding characters, or (3) the final quantum of encoding | ||||
| input is exactly 16 bits; here, the final unit of encoded | ||||
| output will be three characters followed by one "=" padding | ||||
| character. | ||||
| Because it is used only for padding at the end of the data, | tspecials := "(" / ")" / "<" / ">" / "@" / | |||
| the occurrence of any '=' characters may be taken as | "," / ";" / ":" / "\" / <"> / | |||
| evidence that the end of the data has been reached (without | "/" / "[" / "]" / "?" / "=" | |||
| truncation in transit). No such assurance is possible, | ; Must be in quoted-string, | |||
| however, when the number of octets transmitted was a | ; to use within parameter values | |||
| multiple of three. | ||||
| Any characters outside of the base64 alphabet are to be | Note that the definition of "tspecials" is the same as the RFC | |||
| ignored in base64-encoded data. The same applies to any | 822 definition of "specials" with the addition of the three | |||
| illegal sequence of characters in the base64 encoding, such | characters "/", "?", and "=", and the removal of ".". | |||
| as "=====" | ||||
| Care must be taken to use the proper octets for line breaks | Note also that a subtype specification is MANDATORY -- it may | |||
| if base64 encoding is applied directly to text material that | not be omitted from a Content-Type header field. As such, | |||
| has not been converted to canonical form. In particular, | there are no default subtypes. | |||
| text line breaks must be converted into CRLF sequences prior | ||||
| to base64 encoding. The important thing to note is that this | ||||
| may be done directly by the encoder rather than in a prior | ||||
| canonicalization step in some implementations. | ||||
| NOTE: There is no need to worry about quoting | The type, subtype, and parameter names are not case sensitive. | |||
| apparent encapsulation boundaries within base64- | For example, TEXT, Text, and TeXt are all equivalent top-level | |||
| encoded parts of multipart entities because no | Content Types. Parameter values are normally case sensitive, | |||
| hyphen characters are used in the base64 encoding. | but sometimes are interpreted in a case-insensitive fashion, | |||
| depending on the intended use. (For example, multipart | ||||
| boundaries are case-sensitive, but the "access-type" parameter | ||||
| for message/External-body is not case-sensitive.) | ||||
| Note that the value of a quoted string parameter does not | ||||
| include the quotes. That is, the quotation marks in a | ||||
| quoted-string are not a part of the value of the parameter, | ||||
| but are merely used to delimit that parameter value. In | ||||
| addition, comments are allowed in accordance with RFC 822 | ||||
| rules for structured header fields. Thus the following two | ||||
| forms | ||||
| 6 Additional Content- Header Fields | Content-type: text/plain; charset=us-ascii (Plain text) | |||
| 6.1 Optional Content-ID Header Field | Content-type: text/plain; charset="us-ascii" | |||
| In constructing a high-level user agent, it may be desirable | are completely equivalent. | |||
| to allow one body to make reference to another. | ||||
| Accordingly, bodies may be labeled using the "Content-ID" | ||||
| header field, which is syntactically identical to the | ||||
| "Message-ID" header field: | ||||
| id := "Content-ID" ":" msg-id | Beyond this syntax, the only syntactic constraint on the | |||
| definition of subtype names is the desire that their uses must | ||||
| not conflict. That is, it would be undesirable to have two | ||||
| different communities using "Content-Type: application/foobar" | ||||
| to mean two different things. The process of defining new | ||||
| content-subtypes, then, is not intended to be a mechanism for | ||||
| imposing restrictions, but simply a mechanism for publicizing | ||||
| the usages. There are, therefore, two acceptable mechanisms | ||||
| for defining new Content-Type subtypes: | ||||
| Like the Message-ID values, Content-ID values must be | (1) Private values (starting with "X-") may be defined | |||
| generated to be world-unique. | bilaterally between two cooperating agents without | |||
| outside registration or standardization. | ||||
| The Content-ID value may be used for uniquely identifying | (2) New standard values MUST be documented, registered | |||
| MIME entities in several contexts, particularly for caching | with, and approved by IANA, as described in RFC REG. | |||
| data referenced by the message/external-body mechanism. | ||||
| Although the Content-ID header is generally optional, its | ||||
| use is mandatory in implementations which generate data of | ||||
| the optional MIME Content-type "message/external-body". | ||||
| That is, each message/external-body entity must have a | ||||
| Content-ID field to permit caching of such data. | ||||
| It is also worth noting that the Content-ID value has | 5.2.2. Definition of a Top-Level Content-Type | |||
| special semantics in the case of the multipart/alternative | ||||
| content-type. This is explained in the section of this | ||||
| document dealing with multipart/alternative. | ||||
| 6.2 Optional Content-Description Header Field | The definition of a top-level content-type consists of: | |||
| The ability to associate some descriptive information with a | (1) a name and a description of the type, including | |||
| given body is often desirable. For example, it may be | criteria for whether a particular type would qualify | |||
| useful to mark an "image" body as "a picture of the Space | under that type, | |||
| Shuttle Endeavour." Such text may be placed in the Content- | ||||
| Description header field. | ||||
| description := "Content-Description" ":" *text | (2) the names and definitions of parameters, if any, which | |||
| are defined for all subtypes of that type (including | ||||
| whether such parameters are required or optional), | ||||
| The description is presumed to be given in the US-ASCII | (3) how a user agent and/or gateway should handle unknown | |||
| character set, although the mechanism specified in [RFC- | subtypes of this type, | |||
| 1522] may be used for non-US-ASCII Content-Description | ||||
| values. | ||||
| (4) general considerations on gatewaying objects of this | ||||
| top-level type, if any, and | ||||
| 7 The Predefined Content-Type Values | (5) any restrictions on content-transfer-encodings for | |||
| objects of this top-level type. | ||||
| This document defines seven initial Content-Type values and | 5.2.3. Initial Set of Top-Level Content-Types | |||
| an extension mechanism for private or experimental types. | ||||
| Further standard types must be defined by new published | ||||
| specifications. It is expected that most innovation in new | ||||
| types of mail will take place as subtypes of the seven types | ||||
| defined here. The most essential characteristics of the | ||||
| seven content-types are summarized in Appendix F. | ||||
| 7.1 The Text Content-Type | The initial seven standard top-level Content-Types are | |||
| detailed in the bulk of this document. The five discrete top- | ||||
| level Content-Types are: | ||||
| The text Content-Type is intended for sending material which | (1) text -- textual information. The subtype "plain" in | |||
| is principally textual in form. It is the default Content- | particular indicates plain (unformatted) text. No | |||
| Type. A "charset" parameter may be used to indicate the | special software is required to get the full meaning of | |||
| character set of the body text for some text subtypes, | the text, aside from support for the indicated | |||
| notably including the primary subtype, "text/plain", which | character set. Other subtypes are to be used for | |||
| indicates plain (unformatted) text. The default Content- | enriched text in forms where application software may | |||
| Type for Internet mail is "text/plain; charset=us-ascii". | enhance the appearance of the text, but such software | |||
| must not be required in order to get the general idea | ||||
| of the content. Possible subtypes thus include any | ||||
| word processor format that can be read without | ||||
| resorting to software that understands the format. In | ||||
| particular, formats that employ embedded binary | ||||
| formatting information are not considered directly | ||||
| readable. A very simple and portable subtype, richtext, | ||||
| was defined in RFC 1341 [RFC-1341], with a further | ||||
| revision in RFC 1563 [RFC-1563] under the name | ||||
| "enriched". | ||||
| Beyond plain text, there are many formats for representing | (2) image -- image data. Image requires a display device | |||
| what might be known as "extended text" -- text with embedded | (such as a graphical display, a graphics printer, or a | |||
| formatting and presentation information. An interesting | FAX machine) to view the information. Initial subtypes | |||
| characteristic of many such representations is that they are | are defined for two widely-used image formats, jpeg and | |||
| to some extent readable even without the software that | gif. | |||
| interprets them. It is useful, then, to distinguish them, | ||||
| at the highest level, from such unreadable data as images, | ||||
| audio, or text represented in an unreadable form. In the | ||||
| absence of appropriate interpretation software, it is | ||||
| reasonable to show subtypes of text to the user, while it is | ||||
| not reasonable to do so with most nontextual data. | ||||
| Such formatted textual data should be represented using | (3) audio -- audio data. Audio requires an audio output | |||
| subtypes of text. Plausible subtypes of text are typically | device (such as a speaker or a telephone) to "display" | |||
| given by the common name of the representation format, e.g., | the contents. An initial subtype "basic" is defined in | |||
| "text/richtext" [RFC-1341]. | this document. | |||
| 7.1.1 The charset parameter | (4) video -- video data. Video requires the capability to | |||
| display moving images, typically including specialized | ||||
| hardware and software. An initial subtype "mpeg" is | ||||
| defined in this document. | ||||
| A critical parameter that may be specified in the Content- | (5) application -- some other kind of data, typically | |||
| Type field for text/plain data is the character set. This | either uninterpreted binary data or information to be | |||
| is specified with a "charset" parameter, as in: | processed by a mail-based application. The subtype | |||
| "octet-stream" is to be used in the case of | ||||
| uninterpreted binary data, in which case the simplest | ||||
| recommended action is to offer to write the information | ||||
| into a file for the user. The "PostScript" subtype is | ||||
| also defined for the transport of PostScript material. | ||||
| Other expected uses for "application" include | ||||
| spreadsheets, data for mail-based scheduling systems, | ||||
| languages for "active" (computational) email, and | ||||
| word processing formats that are not directly readable. | ||||
| Note that security considerations may exist for some | ||||
| types of application data, most notably | ||||
| application/PostScript and any form of active mail. | ||||
| These issues are discussed later in this document. | ||||
| Content-type: text/plain; charset=us-ascii | The two composite top-level Content-Types are: | |||
| (1) multipart -- data consisting of multiple parts of | ||||
| independent data types. Four subtypes are initially | ||||
| defined, including the basic "mixed" subtype specifying | ||||
| a generic mixed set of parts, "alternative" for | ||||
| representing the same data in multiple formats, | ||||
| "parallel" for parts intended to be viewed | ||||
| simultaneously, and "digest" for multipart entities in | ||||
| which each part is of type "message". | ||||
| Unlike some other parameter values, the values of the | (2) message -- an encapsulated message. A body of | |||
| charset parameter are NOT case sensitive. The default | Content-Type "message" is itself all or part of some | |||
| character set, which must be assumed in the absence of a | kind of message object. Such objects may in turn | |||
| charset parameter, is US-ASCII. | contain other messages and body parts of their own. | |||
| The "rfc822" subtype is used when the encapsulated | ||||
| content is itself an RFC 822 message. The "partial" | ||||
| subtype is defined for partial RFC 822 messages, to | ||||
| permit the fragmented transmission of bodies that are | ||||
| thought to be too large to be passed through mail | ||||
| transport facilities in one piece. Another subtype, | ||||
| "external-body", is defined for specifying large bodies | ||||
| by reference to an external data source. | ||||
| The specification for any future subtypes of "text" must | Default RFC 822 messages without a MIME Content-Type header | |||
| specify whether or not they will also utilize a "charset" | are taken by this protocol to be plain text in the US-ASCII | |||
| parameter, and may possibly restrict its values as well. | character set, which can be explicitly specified as: | |||
| When used with a particular body, the semantics of the | ||||
| "charset" parameter should be identical to those specified | ||||
| here for "text/plain", i.e., the body consists entirely of | ||||
| characters in the given charset. In particular, definers of | ||||
| future text subtypes should pay close attention to the | ||||
| implications of multibyte character sets for their subtype | ||||
| definitions. | ||||
| This RFC specifies the definition of the charset parameter | Content-type: text/plain; charset=us-ascii | |||
| for the purposes of MIME to be a unique mapping of a byte | ||||
| stream to glyphs, a mapping which does not require external | ||||
| profiling information. | ||||
| An initial list of predefined character set names can be | This default is assumed if no Content-Type is specified. In | |||
| found at the end of this section. Additional character sets | the presence of a MIME-Version header field, a receiving User | |||
| may be registered with IANA, although the standardization of | Agent can also assume that plain US-ASCII text was the | |||
| their use requires the usual IAB review and approval. Note | sender's intent. Plain US-ASCII text must still be assumed in | |||
| that if the specified character set includes 8-bit data, a | the absence of a MIME-Version specification, but the sender's | |||
| Content-Transfer-Encoding header field and a corresponding | intent might have been otherwise. | |||
| encoding on the data are required in order to transmit the | ||||
| body via some mail transfer protocols, such as SMTP. | ||||
| The default character set, US-ASCII, has been the subject of | RATIONALE: In the absence of any Content-Type header field or | |||
| some confusion and ambiguity in the past. Not only were | MIME-Version header field, it is impossible to be certain that | |||
| there some ambiguities in the definition, there have been | a message is actually text in the US-ASCII character set, | |||
| wide variations in practice. In order to eliminate such | since it might well be a message that, using some set of | |||
| ambiguity and variations in the future, it is strongly | nonstandard conventions that predate this document, includes | |||
| recommended that new user agents explicitly specify a | text in another character set or non-textual data in a manner | |||
| character set via the Content-Type header field. "US-ASCII" | that cannot be automatically recognized (e.g., a uuencoded | |||
| does not indicate an arbitrary seven-bit character code, but | compressed UNIX tar file). Although there is no fully | |||
| specifies that the body uses character coding that uses the | acceptable alternative to treating such untyped messages as | |||
| exact correspondence of codes to characters specified in | "text/plain; charset=us-ascii", implementors should remain | |||
| ASCII. National use variations of ISO 646 [ISO-646] are NOT | aware that if a message lacks both the MIME-Version and the | |||
| ASCII and their use in Internet mail is explicitly | Content-Type header fields, it may in practice contain almost | |||
| discouraged. The omission of the ISO 646 character set is | anything. | |||
| deliberate in this regard. The character set name of "US- | ||||
| ASCII" explicitly refers to ANSI X3.4-1986 [US-ASCII] only. | ||||
| The character set name "ASCII" is reserved and must not be | ||||
| It should be noted that the list of Content-Type values given | ||||
| here may be augmented in time, via the mechanisms described | ||||
| above, and that the set of subtypes is expected to grow | ||||
| substantially. | ||||
| used for any purpose. | When a mail reader encounters mail with an unknown Content- | |||
| type value, it should generally treat it as equivalent to | ||||
| "application/octet-stream", as described later in this | ||||
| document. | ||||
| NOTE: RFC 821 explicitly specifies "ASCII", and | 5.3. Content-Transfer-Encoding Header Field | |||
| references an earlier version of the American | ||||
| Standard. Insofar as one of the purposes of | ||||
| specifying a Content-Type and character set is to | ||||
| permit the receiver to unambiguously determine how | ||||
| the sender intended the coded message to be | ||||
| interpreted, assuming anything other than "strict | ||||
| ASCII" as the default would risk unintentional and | ||||
| incompatible changes to the semantics of messages | ||||
| now being transmitted. This also implies that | ||||
| messages containing characters coded according to | ||||
| national variations on ISO 646, or using code- | ||||
| switching procedures (e.g., those of ISO 2022), as | ||||
| well as 8-bit or multiple octet character | ||||
| encodings MUST use an appropriate character set | ||||
| specification to be consistent with this | ||||
| specification. | ||||
| The complete US-ASCII character set is listed in [US-ASCII]. | Many Content-Types which could be usefully transported via | |||
| Note that the control characters including DEL (0-31, 127) | email are represented, in their "natural" format, as 8-bit | |||
| have no defined meaning apart from the combination CRLF | character or binary data. Such data cannot be transmitted over | |||
| (ASCII values 13 and 10) indicating a new line. Two of the | some transport protocols. For example, RFC 821 (SMTP) | |||
| characters have de facto meanings in wide use: FF (12) often | restricts mail messages to 7-bit US-ASCII data with lines no | |||
| means "start subsequent text on the beginning of a new | longer than 1000 characters. | |||
| page"; and TAB or HT (9) often (though not always) means | ||||
| "move the cursor to the next available column after the | ||||
| current position where the column number is a multiple of 8 | ||||
| (counting the first column as column 0)." Apart from this, | ||||
| any use of the control characters or DEL in a body must be | ||||
| part of a private agreement between the sender and | ||||
| recipient. Such private agreements are discouraged and | ||||
| should be replaced by the other capabilities of this | ||||
| document. | ||||
| NOTE: Beyond US-ASCII, an enormous proliferation | It is necessary, therefore, to define a standard mechanism for | |||
| of character sets is possible. It is the opinion | encoding such data into a 7-bit short-line format. Proper | |||
| of the IETF working group that a large number of | labelling of unencoded material in less restrictive formats | |||
| character sets is NOT a good thing. We would | for direct use over less restrictive transports is also | |||
| prefer to specify a single character set that can | desirable. This document specifies that such encodings will | |||
| be used universally for representing all of the | be indicated by a new "Content-Transfer-Encoding" header | |||
| world's languages in electronic mail. | field. This field has not been defined by any previous | |||
| Unfortunately, existing practice in several | standard. | |||
| communities seems to point to the continued use of | ||||
| multiple character sets in the near future. For | ||||
| 5.3.1. Content-Transfer-Encoding Syntax | ||||
| this reason, we define names for a small number of | The Content-Transfer-Encoding field's value is a single token | |||
| character sets for which a strong constituent base | specifying the type of encoding, as enumerated below. | |||
| exists. | Formally: | |||
| The defined charset values are: | encoding := "Content-Transfer-Encoding" ":" mechanism | |||
| US-ASCII -- as defined in [US-ASCII]. | mechanism := "7bit" / "8bit" / "binary" / | |||
| "quoted-printable" / "base64" / | ||||
| ietf-token / x-token | ||||
| ISO-8859-X -- where "X" is to be replaced, as | These values are not case sensitive -- Base64 and BASE64 and | |||
| necessary, for the parts of ISO-8859 [ISO- | bAsE64 are all equivalent. An encoding type of 7BIT requires | |||
| 8859]. Note that the ISO 646 character sets | that the body is already in a 7-bit mail-ready representation. | |||
| have deliberately been omitted in favor of | This is the default value -- that is, "Content-Transfer- | |||
| their 8859 replacements, which are the | Encoding: 7BIT" is assumed if the Content-Transfer-Encoding | |||
| designated character sets for Internet mail. | header field is not present. | |||
| As of the publication of this document, the | ||||
| legitimate values for "X" are the digits 1 | ||||
| through 9. | ||||
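The Content-Transfer-Encoding syntax of Section 5.3.1 can be illustrated with a minimal Python sketch. It assumes the usual MIME definition of a token (printable US-ASCII without spaces, controls, or tspecials), which is not restated in this section. The function name `parse_mechanism` is hypothetical. An ietf-token would need a registry lookup that is not modeled here, so unknown unprefixed values are simply rejected.

```python
import re

# The five mechanisms named in the grammar of Section 5.3.1.
STANDARD_MECHANISMS = {"7bit", "8bit", "binary", "quoted-printable", "base64"}

# Assumption: a token is one or more printable US-ASCII characters,
# excluding space, controls, and the tspecials ()<>@,;:\"/[]?=
_TOKEN = re.compile(r"^[!#$%&'*+\-.0-9A-Z^_`a-z{|}~]+$")


def parse_mechanism(field_value=None):
    """Return the normalized mechanism for a Content-Transfer-Encoding value.

    field_value is the text after the colon, or None when the header
    field is absent, in which case "7bit" is assumed.
    """
    if field_value is None:
        return "7bit"
    # Values are not case sensitive: Base64, BASE64 and bAsE64 are equivalent.
    token = field_value.strip().lower()
    if token in STANDARD_MECHANISMS:
        return token
    # Private values must be x-tokens, i.e. names prefixed by "X-".
    if token.startswith("x-") and len(token) > 2 and _TOKEN.match(token):
        return token
    raise ValueError("unrecognized Content-Transfer-Encoding: %r" % field_value)
```

A receiver could call `parse_mechanism(None)` for an entity without the header field and `parse_mechanism("bAsE64")` for a labelled one, obtaining `"7bit"` and `"base64"` respectively.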
| The character sets specified above are the ones that were | 5.3.2. Content-Transfer-Encoding Semantics | |||
| relatively uncontroversial during the drafting of MIME. | ||||
| This document does not endorse the use of any particular | ||||
| character set other than US-ASCII, and recognizes that the | ||||
| future evolution of world character sets remains unclear. | ||||
| It is expected that in the future, additional character sets | ||||
| will be registered for use in MIME. | ||||
| Note that the character set used, if anything other than | This single token actually provides two pieces of information. | |||
| US-ASCII, must always be explicitly specified in the | It specifies what sort of encoding transformation the body was | |||
| Content-Type field. | subjected to, and it specifies what the domain of the result | |||
| is. | ||||
| No other character set name may be used in Internet mail | Three transformations are currently defined: identity, the | |||
| without the publication of a formal specification and its | "quoted-printable" encoding, and the "base64" encoding. The | |||
| registration with IANA, or by private agreement, in which | domains are "binary", "8bit" and "7bit". | |||
| case the character set name must begin with "X-". | ||||
| Implementors are discouraged from defining new character | The values "7bit", "8bit", and "binary" all mean that the | |||
| sets for mail use unless absolutely necessary. | identity (i.e. NO) encoding transformation has been performed. | |||
| As such, they serve simply as indicators of the domain of the | ||||
| body part data, and provide useful information about the sort | ||||
| of encoding that might be needed for transmission in a given | ||||
| transport system. The terms "7bit data", "8bit data", and | ||||
| "binary data" are all defined in Section 4. | ||||
| The "charset" parameter has been defined primarily for the | The quoted-printable and base64 encodings transform their | |||
| purpose of textual data, and is described in this section | input from an arbitrary domain into material in the "7bit" | |||
| for that reason. However, it is conceivable that non- | domain, thus making it safe to carry over restricted | |||
| textual data might also wish to specify a charset value for | transports. The specific definitions of the transformations are | |||
| some purpose, in which case the same syntax and values | given below. | |||
| should be used. | ||||
| The proper Content-Transfer-Encoding label must always be | ||||
| used. Labelling unencoded data containing 8-bit characters as | ||||
| "7bit" is not allowed, nor is labelling unencoded non-line- | ||||
| oriented data as anything other than "binary" allowed. | ||||
| In general, mail-sending software must always use the | Unlike Content-Type subtypes, a proliferation of Content- | |||
| "lowest common denominator" character set possible. For | Transfer-Encoding values is both undesirable and unnecessary. | |||
| example, if a body contains only US-ASCII characters, it | However, establishing only a single transformation into the | |||
| must be marked as being in the US-ASCII character set, not | "7bit" domain does not seem possible. There is a tradeoff | |||
| ISO-8859-1, which, like all the ISO-8859 family of character | between the desire for a compact and efficient encoding of | |||
| sets, is a superset of US-ASCII. More generally, if a | largely-binary data and the desire for a readable encoding of | |||
| widely-used character set is a subset of another character | data that is mostly, but not entirely, 7-bit. For this | |||
| set, and a body contains only characters in the widely-used | reason, at least two encoding mechanisms are necessary: a | |||
| subset, it must be labeled as being in that subset. This | "readable" encoding (quoted-printable) and a "dense" encoding | |||
| will increase the chances that the recipient will be able to | (base64). | |||
| view the mail correctly. | ||||
| 7.1.2 The Text/plain subtype | Mail transport for unencoded 8-bit data is defined in RFC 1652 | |||
| [RFC-1652]. As of the publication of this document, there are | ||||
| no standardized Internet mail transports for which it is | ||||
| legitimate to include unencoded binary data in mail bodies. | ||||
| The primary subtype of text is "plain". This indicates | Thus there are no circumstances in which the "binary" | |||
| plain (unformatted) text. The default Content-Type for | Content-Transfer-Encoding is actually valid on the Internet. | |||
| Internet mail, "text/plain; charset=us-ascii", describes | However, in the event that binary mail transport becomes a | |||
| existing Internet practice. That is, it is the type of body | reality in Internet mail, or when this document is used in | |||
| defined by RFC 822. | conjunction with any other binary-capable transport mechanism, | |||
| binary bodies should be labelled as such using this mechanism. | ||||
| No other text subtype is defined by this document. | NOTE: The five values defined for the Content-Transfer- | |||
| Encoding field imply nothing about the Content-Type other than | ||||
| the algorithm by which it was encoded or the transport system | ||||
| requirements if unencoded. | ||||
| The formal grammar for the content-type header field for | Implementors may, if necessary, define new Content-Transfer- | |||
| text is as follows: | Encoding values, but must use an x-token, which is a name | |||
| prefixed by "X-", to indicate its non-standard status, e.g., | ||||
| "Content-Transfer-Encoding: x-my-new-encoding". However, | ||||
| unlike Content-Types and subtypes, the creation of new | ||||
| Content-Transfer-Encoding values is STRONGLY discouraged, as | ||||
| it seems likely to hinder interoperability with little | ||||
| potential benefit. Such use is therefore allowed only as the | ||||
| result of an agreement between cooperating user agents. | ||||
| text-type := "text" "/" text-subtype [";" "charset" "=" | If a Content-Transfer-Encoding header field appears as part of | |||
| charset] | a message header, it applies to the entire body of that | |||
| message. If a Content-Transfer-Encoding header field appears | ||||
| as part of a body part's headers, it applies only to the body | ||||
| of that body part. If an entity is of type "multipart" the | ||||
| Content-Transfer-Encoding is not permitted to have any value | ||||
| other than "7bit", "8bit" or "binary". Even more severe | ||||
| restrictions apply to some subtypes of the "message" type. | ||||
| text-subtype := "plain" / extension-token | It should be noted that email is character-oriented, so that | |||
| the mechanisms described here are mechanisms for encoding | ||||
| arbitrary octet streams, not bit streams. If a bit stream is | ||||
| to be encoded via one of these mechanisms, it must first be | ||||
| converted to an 8-bit byte stream using the network standard | ||||
| bit order ("big-endian"), in which the earlier bits in a | ||||
| stream become the higher-order bits in an 8-bit byte. A bit | ||||
| stream not ending at an 8-bit boundary must be padded with | ||||
| zeroes. This document provides a mechanism for noting the | ||||
| addition of such padding in the case of the | ||||
| application/octet-stream Content-Type, which has a "padding" | ||||
| parameter. | ||||
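The bit-stream conversion described above can be sketched as follows. This is an illustration only: the function name `pack_bits` and the choice to return the padding count alongside the octets are assumptions. The count is what a sender might record in the "padding" parameter of application/octet-stream.

```python
def pack_bits(bits):
    """Pack a sequence of 0/1 values into octets, earliest bit first.

    Earlier bits become the higher-order bits of each octet
    ("big-endian" bit order). A stream that does not end on an
    8-bit boundary is padded with zero bits.
    Returns (octets, number_of_padding_bits).
    """
    bits = list(bits)
    padding = (-len(bits)) % 8          # zero bits needed to reach a boundary
    bits += [0] * padding
    out = bytearray()
    for i in range(0, len(bits), 8):
        octet = 0
        for bit in bits[i:i + 8]:
            octet = (octet << 1) | (bit & 1)   # shift earlier bits upward
        out.append(octet)
    return bytes(out), padding
```

For example, the three-bit stream 1, 0, 1 becomes the single octet 10100000 (hexadecimal A0) with five padding bits.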
| charset := "us-ascii" / "iso-8859-1" / "iso-8859-2" / "iso- | The encoding mechanisms defined here explicitly encode all | |||
| 8859-3" | data in US-ASCII. Thus, for example, suppose an entity has | |||
| / "iso-8859-4" / "iso-8859-5" / "iso-8859-6" / "iso- | header fields such as: | |||
| 8859-7" | ||||
| / "iso-8859-8" / "iso-8859-9" / extension-token | ||||
| ; case insensitive | ||||
| Content-Type: text/plain; charset=ISO-8859-1 | ||||
| Content-transfer-encoding: base64 | ||||
| 7.2 The Multipart Content-Type | This must be interpreted to mean that the body is a base64 | |||
| US-ASCII encoding of data that was originally in ISO-8859-1, | ||||
| and will be in that character set again after decoding. | ||||
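The layering in this example can be made concrete with a short Python sketch using the standard `base64` module. The helper name `decode_text_body` and the sample string are illustrative assumptions. The point is only that the transfer encoding is undone first, and the charset is applied to the resulting octets afterwards.

```python
import base64


def decode_text_body(body_ascii, charset="iso-8859-1"):
    """Undo the transfer encoding, then interpret the octets in charset."""
    raw_octets = base64.b64decode(body_ascii)   # base64 text -> original octets
    return raw_octets.decode(charset)           # octets -> characters


# Sender side: ISO-8859-1 text is turned into octets, then into base64,
# which consists solely of US-ASCII characters.
original = "caf\xe9"                             # "cafe" with e-acute, as ISO-8859-1 text
encoded = base64.b64encode(original.encode("iso-8859-1")).decode("us-ascii")
```

Here `encoded` is the US-ASCII string `Y2Fm6Q==`, and `decode_text_body(encoded)` returns the original ISO-8859-1 text.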
| In the case of multiple part entities, in which one or more | The following sections will define the two standard encoding | |||
| different sets of data are combined in a single body, a | mechanisms. The definition of new content-transfer-encodings | |||
| "multipart" Content-Type field must appear in the entity's | is explicitly discouraged and should only occur when | |||
| header. The body must then contain one or more "body parts," | absolutely necessary. All content-transfer-encoding namespace | |||
| each preceded by an encapsulation boundary, and the last one | except that beginning with "X-" is explicitly reserved to the | |||
| followed by a closing boundary. Each part starts with an | IANA for future use. Private agreements about content- | |||
| encapsulation boundary, and then contains a body part | transfer-encodings are also explicitly discouraged. | |||
| consisting of a header area, a blank line, and a body area. | ||||
| Thus a body part is similar to an RFC 822 message in syntax, | ||||
| but different in meaning. | ||||
| A body part is NOT to be interpreted as actually being an | Certain Content-Transfer-Encoding values may only be used on | |||
| RFC 822 message. To begin with, NO header fields are | certain Content-Types. In particular, it is EXPRESSLY | |||
| actually required in body parts. A body part that starts | FORBIDDEN to use any encodings other than "7bit", "8bit", or | |||
| with a blank line, therefore, is allowed and is a body part | "binary" with any composite Content-Type, i.e. one that | |||
| for which all default values are to be assumed. In such a | recursively includes other Content-Type fields. Currently the | |||
| case, the absence of a Content-Type header field implies | only composite Content-Types are "multipart" and "message". | |||
| that the corresponding body is plain US-ASCII text. The | All encodings that are desired for bodies of type multipart or | |||
| only header fields that have defined meaning for body parts | message must be done at the innermost level, by encoding the | |||
| are those the names of which begin with "Content-". All | actual body that needs to be encoded. | |||
| other header fields are generally to be ignored in body | ||||
| parts. Although they should generally be retained in mail | ||||
| processing, they may be discarded by gateways if necessary. | ||||
| Such other fields are permitted to appear in body parts but | ||||
| must not be depended on. "X-" fields may be created for | ||||
| experimental or private purposes, with the recognition that | ||||
| the information they contain may be lost at some gateways. | ||||
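The restriction on composite Content-Types stated in the newer draft (only "7bit", "8bit", or "binary" with "multipart" and "message") could be checked along these lines. The function name `encoding_allowed` is hypothetical, and the sketch ignores the stricter rules that apply to some "message" subtypes.

```python
# Mechanisms that perform the identity (i.e. no) transformation.
IDENTITY_ENCODINGS = {"7bit", "8bit", "binary"}
# Currently the only composite top-level Content-Types.
COMPOSITE_TYPES = {"multipart", "message"}


def encoding_allowed(content_type, transfer_encoding="7bit"):
    """Return True if transfer_encoding may label an entity of content_type.

    content_type is the Content-Type value, e.g. "multipart/mixed; boundary=x".
    Composite entities may only carry identity encodings; any real
    encoding must be applied to the innermost bodies instead.
    """
    top_level = content_type.split("/", 1)[0].strip().lower()
    mechanism = transfer_encoding.strip().lower()
    if top_level in COMPOSITE_TYPES:
        return mechanism in IDENTITY_ENCODINGS
    return True
```

A composing agent might run this check on every entity before sending, so that a base64 label on a multipart entity is caught early and the encoding is applied to the individual parts instead.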
| NOTE: The distinction between an RFC 822 message | It should also be noted that, by definition, if a composite | |||
| and a body part is subtle, but important. A | entity has a transfer-encoding value such as "7bit", but one | |||
| gateway between Internet and X.400 mail, for | of the enclosed parts has a less restrictive value such as | |||
| example, must be able to tell the difference | "8bit", then either the outer "7bit" labelling is in error, | |||
| between a body part that contains an image and a | because 8-bit data are included, or the inner "8bit" labelling | |||
| body part that contains an encapsulated message, | places an unnecessarily high demand on the transport system | |||
| the body of which is an image. In order to | because the included data were actually 7-bit-safe. | |||
| represent the latter, the body part must have | ||||
| "Content-Type: message", and its body (after the | ||||
| blank line) must be the encapsulated message, with | ||||
| its own "Content-Type: image" header field. The | ||||
| use of similar syntax facilitates the conversion | ||||
| of messages to body parts, and vice versa, but the | ||||
| distinction between the two must be understood by | ||||
| implementors. (For the special case in which all | ||||
| parts actually are messages, a "digest" subtype is | ||||
| NOTE ON ENCODING RESTRICTIONS: Though the prohibition against | ||||
| using content-transfer-encodings on composite body data may | ||||
| seem overly restrictive, it is necessary to prevent nested | ||||
| encodings, in which data are passed through an encoding | ||||
| algorithm multiple times, and must be decoded multiple times | ||||
| in order to be properly viewed. Nested encodings add | ||||
| considerable complexity to user agents: Aside from the | ||||
| obvious efficiency problems with such multiple encodings, they | ||||
| can obscure the basic structure of a message. In particular, | ||||
| they can imply that several decoding operations are necessary | ||||
| simply to find out what types of bodies a message contains. | ||||
| Banning nested encodings may complicate the job of certain | ||||
| mail gateways, but this seems less of a problem than the | ||||
| effect of nested encodings on user agents. | ||||
| also defined.) | NOTE ON THE RELATIONSHIP BETWEEN CONTENT-TYPE AND CONTENT- | |||
| TRANSFER-ENCODING: It may seem that the Content-Transfer- | ||||
| Encoding could be inferred from the characteristics of the | ||||
| Content-Type that is to be encoded, or, at the very least, | ||||
| that certain Content-Transfer-Encodings could be mandated for | ||||
| use with specific Content-Types. There are several reasons | ||||
| why this is not the case. First, given the varying types of | ||||
| transports used for mail, some encodings may be appropriate | ||||
| for some Content-Type/transport combinations and not for | ||||
| others. (For example, in an 8-bit transport, no encoding | ||||
| would be required for text in certain character sets, while | ||||
| such encodings are clearly required for 7-bit SMTP.) | ||||
| As stated previously, each body part is preceded by an | Second, certain Content-Types may require different types of | |||
| encapsulation boundary. The encapsulation boundary MUST NOT | transfer encoding under different circumstances. For example, | |||
| appear inside any of the encapsulated parts. Thus, it is | many PostScript bodies might consist entirely of short lines | |||
| crucial that the composing agent be able to choose and | of 7-bit data and hence require no encoding at all. Other | |||
| specify the unique boundary that will separate the parts. | PostScript bodies (especially those using Level 2 PostScript's | |||
| binary encoding mechanism) may only be reasonably represented | ||||
| using a binary transport encoding. Finally, since Content- | ||||
| Type is intended to be an open-ended specification mechanism, | ||||
| strict specification of an association between Content-Types | ||||
| and encodings effectively couples the specification of an | ||||
| application protocol with a specific lower-level transport. | ||||
| This is not desirable since the developers of a Content-Type | ||||
| should not have to be aware of all the transports in use and | ||||
| what their limitations are. | ||||
| All present and future subtypes of the "multipart" type must | NOTE ON TRANSLATING ENCODINGS: The quoted-printable and | |||
| use an identical syntax. Subtypes may differ in their | base64 encodings are designed so that conversion between them | |||
| semantics, and may impose additional restrictions on syntax, | is possible. The only issue that arises in such a conversion | |||
| but must conform to the required syntax for the multipart | is the handling of line breaks. When converting from quoted- | |||
| type. This requirement ensures that all conformant user | printable to base64 a line break must be converted into a CRLF | |||
| agents will at least be able to recognize and separate the | sequence. Similarly, a CRLF sequence in base64 data must be | |||
| parts of any multipart entity, even of an unrecognized | converted to a quoted-printable line break, but ONLY when | |||
| subtype. | converting text data. | |||
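The quoted-printable to base64 direction of the translation note can be sketched with Python's standard `quopri` and `base64` modules. This is a rough illustration for text data only. The function name `qp_to_base64` is an assumption, and the sketch treats both bare LF and CRLF in the decoded data as line breaks to be canonicalized to CRLF.

```python
import base64
import quopri


def qp_to_base64(qp_body):
    """Re-encode a quoted-printable text body (bytes) as base64 (bytes)."""
    # Decoding removes "=XX" escapes and soft line breaks but keeps
    # hard line breaks in whatever form the local system uses.
    decoded = quopri.decodestring(qp_body)
    # Each line break must become a CRLF sequence before base64 encoding.
    canonical = decoded.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
    return base64.encodebytes(canonical)
```

Decoding the result with `base64.decodebytes` yields the text with CRLF line breaks, which is the form a later conversion back to quoted-printable would start from.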
| As stated in the definition of the Content-Transfer-Encoding | NOTE ON CANONICAL ENCODING MODEL: There was some confusion, | |||
| field, no encoding other than "7bit", "8bit", or "binary" is | in earlier drafts of this document, regarding the model for | |||
| permitted for entities of type "multipart". The multipart | when email data was to be converted to canonical form and | |||
| delimiters and header fields are always represented as 7-bit | encoded, and in particular how this process would affect the | |||
| ASCII in any case (though the header fields may encode non- | treatment of CRLFs, given that the representation of newlines | |||
| ASCII header text as per [RFC-1522]), and data within the | varies greatly from system to system, and the relationship | |||
| body parts can be encoded on a part-by-part basis, with | between content-transfer-encodings and character sets. A | |||
| Content-Transfer-Encoding fields for each appropriate body | canonical model for encoding is presented as Appendix F for | |||
| part. | this reason. | |||
| Mail gateways, relays, and other mail handling agents are | 5.3.3. Quoted-Printable Content-Transfer-Encoding | |||
| commonly known to alter the top-level header of an RFC 822 | ||||
| message. In particular, they frequently add, remove, or | ||||
| reorder header fields. Such alterations are explicitly | ||||
| forbidden for the body part headers embedded in the bodies | ||||
| of messages of type "multipart." | ||||
| 7.2.1 Multipart: The common syntax | The Quoted-Printable encoding is intended to represent data | |||
| that largely consists of octets that correspond to printable | ||||
| characters in the US-ASCII character set. It encodes the data | ||||
| in such a way that the resulting octets are unlikely to be | ||||
| modified by mail transport. If the data being encoded are | ||||
| mostly US-ASCII text, the encoded form of the data remains | ||||
| largely recognizable by humans. A body which is entirely US- | ||||
| ASCII may also be encoded in Quoted-Printable to ensure the | ||||
| integrity of the data should the message pass through a | ||||
| character-translating and/or line-wrapping gateway. | ||||
| All subtypes of "multipart" share a common syntax, defined | In this encoding, octets are to be represented as determined | |||
| in this section. A simple example of a multipart message | by the following rules: | |||
| also appears in this section. An example of a more complex | ||||
| multipart message is given in Appendix C. | ||||
| The Content-Type field for multipart entities requires one | (1) (General 8-bit representation) Any octet, except those | |||
| parameter, "boundary", which is used to specify the | indicating a line break according to the newline | |||
| encapsulation boundary. The encapsulation boundary is | convention of the canonical (standard) form of the data | |||
| defined as a line consisting entirely of two hyphen | being encoded, may be represented by an "=" followed by | |||
| characters ("-", decimal code 45) followed by the boundary | a two digit hexadecimal representation of the octet's | |||
| value. The digits of the hexadecimal alphabet, for | ||||
| this purpose, are "0123456789ABCDEF". Uppercase | ||||
| letters must be used when sending hexadecimal data, | ||||
| though a robust implementation may choose to recognize | ||||
| lowercase letters on receipt. Thus, for example, the | ||||
| decimal value 12 (US-ASCII form feed) can be | ||||
| represented by "=0C", and the decimal value 61 (US- | ||||
| ASCII EQUAL SIGN) can be represented by "=3D". This | ||||
| rule must be followed except when the following rules | ||||
| allow an alternative encoding. | ||||
| (2) (Literal representation) Octets with decimal values of | ||||
| 33 through 60 inclusive, and 62 through 126, inclusive, | ||||
| MAY be represented as the US-ASCII characters which | ||||
| correspond to those octets (EXCLAMATION POINT through | ||||
| LESS THAN, and GREATER THAN through TILDE, | ||||
| respectively). | ||||
| parameter value from the Content-Type header field. | (3) (White Space) Octets with values of 9 and 32 MAY be | |||
| represented as US-ASCII TAB (HT) and SPACE characters, | ||||
| respectively, but MUST NOT be so represented at the end | ||||
| of an encoded line. Any TAB (HT) or SPACE characters on | ||||
| an encoded line MUST thus be followed on that line by a | ||||
| printable character. In particular, an "=" at the end | ||||
| of an encoded line, indicating a soft line break (see | ||||
| rule #5), may follow one or more TAB (HT) or SPACE | ||||
| characters. It follows that an octet with decimal | ||||
| value 9 or 32 appearing at the end of an encoded line | ||||
| must be represented according to rule #1. This rule is | ||||
| necessary because some MTAs (Message Transport Agents, | ||||
| programs which transport messages from one user to | ||||
| another, or perform a part of such transfers) are known | ||||
| to pad lines of text with SPACEs, and others are known | ||||
| to remove "white space" characters from the end of a | ||||
| line. Therefore, when decoding a Quoted-Printable body, | ||||
| any trailing white space on a line must be deleted, as | ||||
| it will necessarily have been added by intermediate | ||||
| transport agents. | ||||
| NOTE: The hyphens are for rough compatibility | (4) (Line Breaks) A line break in a text body, represented | |||
| with the earlier RFC 934 method of message | as a CRLF sequence in the text canonical form, must be | |||
| encapsulation, and for ease of searching for the | represented by a (RFC 822) line break, which is also a | |||
| boundaries in some implementations. However, it | CRLF sequence, in the Quoted-Printable encoding. Since | |||
| should be noted that multipart messages are NOT | the canonical representation of types other than text | |||
| completely compatible with RFC 934 encapsulations; | does not generally include the representation of line | |||
| in particular, they do not obey RFC 934 quoting | breaks as CRLF sequences, no hard line breaks (i.e. | |||
| conventions for embedded lines that begin with | line breaks that are intended to be meaningful and to | |||
| hyphens. This mechanism was chosen over the RFC | be displayed to the user) should occur in the quoted- | |||
| 934 mechanism because the latter causes lines to | printable encoding of such types. Sequences like "=0D", | |||
| grow with each level of quoting. The combination | "=0A", "=0A=0D" and "=0D=0A" will routinely appear in | |||
| of this growth with the fact that SMTP | non-text data represented in quoted-printable, of | |||
| implementations sometimes wrap long lines made the | course. | |||
| RFC 934 mechanism unsuitable for use in the event | ||||
| that deeply-nested multipart structuring is ever | ||||
| desired. | ||||
| WARNING TO IMPLEMENTORS: The grammar for parameters on the | Note that many implementations may elect to encode the | |||
| Content-type field is such that it is often necessary to | local representation of various content types directly, | |||
| enclose the boundaries in quotes on the Content-type line. | as described in Appendix F. In particular, this may | |||
| This is not always necessary, but never hurts. Implementors | apply to plain text material on systems that use | |||
| should be sure to study the grammar carefully in order to | newline conventions other than CRLF delimiters. Such | |||
| avoid producing illegal Content-type fields. Thus, a | an implementation is permissible, but the generation of | |||
| typical multipart Content-Type header field might look like | line breaks must be generalized to account for the case | |||
| this: | where alternate representations of newline sequences | |||
| are used. | ||||
| Content-Type: multipart/mixed; | (5) (Soft Line Breaks) The Quoted-Printable encoding | |||
| boundary=gc0p4Jq0M2Yt08jU534c0p | REQUIRES that encoded lines be no more than 76 | |||
| characters long. If longer lines are to be encoded | ||||
| with the Quoted-Printable encoding, "soft" line breaks | ||||
| must be used. An equal sign as the last character on an | ||||
| encoded line indicates such a non-significant ("soft") | ||||
| line break in the encoded text. | ||||
| But the following is illegal: | Thus if the "raw" form of the line is a single unencoded line | |||
| that says: | ||||
| Content-Type: multipart/mixed; | Now's the time for all folk to come to the aid of their country. | |||
| boundary=gc0p4Jq0M:2Yt08jU534c0p | ||||
| (because of the colon) and must instead be represented as | This can be represented, in the Quoted-Printable encoding, as: | |||
| Content-Type: multipart/mixed; | Now's the time = | |||
| boundary="gc0p4Jq0M:2Yt08jU534c0p" | for all folk to come= | |||
| to the aid of their country. | ||||
| This indicates that the entity consists of several parts, | This provides a mechanism with which long lines are encoded in | |||
| each itself with a structure that is syntactically identical | such a way as to be restored by the user agent. The 76 | |||
| to an RFC 822 message, except that the header area might be | character limit does not count the trailing CRLF, but counts | |||
| completely empty, and that the parts are each preceded by | all other characters, including any equal signs. | |||
| the line | ||||
| Since the hyphen character ("-") is represented as itself in | ||||
| the Quoted-Printable encoding, care must be taken, when | ||||
| encapsulating a quoted-printable encoded body in a multipart | ||||
| entity, to ensure that the encapsulation boundary does not | ||||
| appear anywhere in the encoded body. (A good strategy is to | ||||
| choose a boundary that includes a character sequence such as | ||||
| "=_" which can never appear in a quoted-printable body. See | ||||
| the definition of multipart messages later in this document.) | ||||
| --gc0p4Jq0M:2Yt08jU534c0p | NOTE: The quoted-printable encoding represents something of a | |||
| compromise between readability and reliability in transport. | ||||
| Bodies encoded with the quoted-printable encoding will work | ||||
| reliably over most mail gateways, but may not work perfectly | ||||
| over a few gateways, notably those involving translation into | ||||
| EBCDIC. A higher level of confidence is offered by the base64 | ||||
| Content-Transfer-Encoding. A way to get reasonably reliable | ||||
| transport through EBCDIC gateways is to also quote the US- | ||||
| ASCII characters | ||||
| Note that the encapsulation boundary must occur at the | !"#$@[\]^`{|}~ | |||
| beginning of a line, i.e., following a CRLF, and that the | ||||
| initial CRLF is considered to be attached to the | ||||
| encapsulation boundary rather than part of the preceding | ||||
| part. The boundary must be followed immediately either by | ||||
| another CRLF and the header fields for the next part, or by | ||||
| two CRLFs, in which case there are no header fields for the | ||||
| next part (and it is therefore assumed to be of Content-Type | ||||
| text/plain). | ||||
| NOTE: The CRLF preceding the encapsulation line | according to rule #1. See Appendix B for more information. | |||
| is conceptually attached to the boundary so that | ||||
| it is possible to have a part that does not end | ||||
| with a CRLF (line break). Body parts that must | ||||
| be considered to end with line breaks, therefore, | ||||
| must have two CRLFs preceding the encapsulation | ||||
| line, the first of which is part of the preceding | ||||
| body part, and the second of which is part of the | ||||
| encapsulation boundary. | ||||
| Encapsulation boundaries must not appear within the | Because quoted-printable data is generally assumed to be | |||
| encapsulations, and must be no longer than 70 characters, | line-oriented, it is to be expected that the representation of | |||
| not counting the two leading hyphens. | the breaks between the lines of quoted printable data may be | |||
| altered in transport, in the same manner that plain text mail | ||||
| has always been altered in Internet mail when passing between | ||||
| systems with differing newline conventions. If such | ||||
| alterations are likely to constitute a corruption of the data, | ||||
| it is probably more sensible to use the base64 encoding rather | ||||
| than the quoted-printable encoding. | ||||
| The encapsulation boundary following the last body part is a | WARNING TO IMPLEMENTORS: If binary data are encoded in | |||
| distinguished delimiter that indicates that no further body | quoted-printable, care must be taken to encode CR and LF | |||
| parts will follow. Such a delimiter is identical to the | characters as "=0D" and "=0A", respectively. In particular, a | |||
| previous delimiters, with the addition of two more hyphens | CRLF sequence in binary data should be encoded as "=0D=0A". | |||
| at the end of the line: | Otherwise, if CRLF were represented as a hard line break, it | |||
| might be incorrectly decoded on platforms with different line | ||||
| break conventions. | ||||
| --gc0p4Jq0M2Yt08jU534c0p-- | For formalists, the syntax of quoted-printable data is | |||
| described by the following grammar: | ||||
| There appears to be room for additional information prior to | quoted-printable := ([*(ptext / SPACE / TAB) ptext] | |||
| the first encapsulation boundary and following the final | ["="] CRLF) | |||
| boundary. These areas should generally be left blank, and | ; Maximum line length of 76 characters | |||
| implementations must ignore anything that appears before the | ; excluding CRLF | |||
| first boundary or after the last one. | ||||
| NOTE: These "preamble" and "epilogue" areas are | ptext := octet / safe-char | |||
| generally not used because of the lack of proper | ||||
| typing of these parts and the lack of clear | ||||
| semantics for handling these areas at gateways, | ||||
| particularly X.400 gateways. However, rather than | ||||
| leaving the preamble area blank, many MIME | ||||
| safe-char := <any US-ASCII character except "=", | ||||
| SPACE, or TAB> | ||||
| ; Characters not listed as "mail-safe" in | ||||
| ; Appendix B are also not recommended. | ||||
| implementations have found this to be a convenient | octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F") | |||
| place to insert an explanatory note for recipients | ; Octet must be used for characters > 127, =, | |||
| who read the message with pre-MIME software, since | ; SPACE, or TAB, and is recommended for any | |||
| such notes will be ignored by MIME-compliant | ; characters not listed in Appendix B as | |||
| software. | ; "mail-safe". | |||
| NOTE: Because encapsulation boundaries must not | IMPORTANT NOTE: The addition of LWSP between the elements | |||
| appear in the body parts being encapsulated, a | shown in this BNF is NOT allowed since this BNF does not | |||
| user agent must exercise care to choose a unique | specify a structured header field. | |||
| boundary. The boundary in the example above could | ||||
| have been the result of an algorithm designed to | ||||
| produce boundaries with a very low probability of | ||||
| already existing in the data to be encapsulated | ||||
| without having to prescan the data. Alternate | ||||
| algorithms might result in more 'readable' | ||||
| boundaries for a recipient with an old user agent, | ||||
| but would require more attention to the | ||||
| possibility that the boundary might appear in the | ||||
| encapsulated part. The simplest boundary possible | ||||
| is something like "---", with a closing boundary | ||||
| of "-----". | ||||
| As a very simple example, the following multipart message | 5.3.4. Base64 Content-Transfer-Encoding | |||
| has two parts, both of them plain text, one of them | ||||
| explicitly typed and one of them implicitly typed: | ||||
| From: Nathaniel Borenstein <nsb@bellcore.com> | The Base64 Content-Transfer-Encoding is designed to represent | |||
| To: Ned Freed <ned@innosoft.com> | arbitrary sequences of octets in a form that need not be | |||
| Subject: Sample message | humanly readable. The encoding and decoding algorithms are | |||
| MIME-Version: 1.0 | simple, but the encoded data are consistently only about 33 | |||
| Content-type: multipart/mixed; | percent larger than the unencoded data. This encoding is | |||
| boundary="simple boundary" | virtually identical to the one used in Privacy Enhanced Mail | |||
| (PEM) applications, as defined in RFC 1421 [RFC-1421]. | ||||
| This is the preamble. It is to be ignored, though it | A 65-character subset of US-ASCII is used, enabling 6 bits to | |||
| is a handy place for mail composers to include an | be represented per printable character. (The extra 65th | |||
| explanatory note to non-MIME conformant readers. | character, "=", is used to signify a special processing | |||
| --simple boundary | function.) | |||
| This is implicitly typed plain ASCII text. | NOTE: This subset has the important property that it is | |||
| It does NOT end with a linebreak. | represented identically in all versions of ISO 646, including | |||
| --simple boundary | US-ASCII, and all characters in the subset are also | |||
| Content-type: text/plain; charset=us-ascii | represented identically in all versions of EBCDIC. Other | |||
| popular encodings, such as the encoding used by the uuencode | ||||
| utility and the base85 encoding specified as part of Level 2 | ||||
| PostScript, do not share these properties, and thus do not | ||||
| fulfill the portability requirements a binary transport | ||||
| encoding for mail must meet. | ||||
| This is explicitly typed plain ASCII text. | The encoding process represents 24-bit groups of input bits as | |||
| output strings of 4 encoded characters. Proceeding from left | ||||
| to right, a 24-bit input group is formed by concatenating 3 | ||||
| 8-bit input groups. These 24 bits are then treated as 4 | ||||
| concatenated 6-bit groups, each of which is translated into a | ||||
| single digit in the base64 alphabet. When encoding a bit | ||||
| stream via the base64 encoding, the bit stream must be | ||||
| presumed to be ordered with the most-significant-bit first. | ||||
| That is, the first bit in the stream will be the high-order | ||||
| bit in the first 8-bit byte, and the eighth bit will be the | ||||
| low-order bit in the first 8-bit byte, and so on. | ||||
| Each 6-bit group is used as an index into an array of 64 | ||||
| printable characters. The character referenced by the index | ||||
| is placed in the output string. These characters, identified | ||||
| in Table 1, below, are selected so as to be universally | ||||
| representable, and the set excludes characters with particular | ||||
| significance to SMTP (e.g., ".", CR, LF) and to the | ||||
| encapsulation boundaries defined in this document (e.g., "-"). | ||||
| It DOES end with a linebreak. | Table 1: The Base64 Alphabet | |||
| --simple boundary-- | Value Encoding Value Encoding Value Encoding Value Encoding | |||
| This is the epilogue. It is also to be ignored. | 0 A 17 R 34 i 51 z | |||
| 1 B 18 S 35 j 52 0 | ||||
| 2 C 19 T 36 k 53 1 | ||||
| 3 D 20 U 37 l 54 2 | ||||
| 4 E 21 V 38 m 55 3 | ||||
| 5 F 22 W 39 n 56 4 | ||||
| 6 G 23 X 40 o 57 5 | ||||
| 7 H 24 Y 41 p 58 6 | ||||
| 8 I 25 Z 42 q 59 7 | ||||
| 9 J 26 a 43 r 60 8 | ||||
| 10 K 27 b 44 s 61 9 | ||||
| 11 L 28 c 45 t 62 + | ||||
| 12 M 29 d 46 u 63 / | ||||
| 13 N 30 e 47 v | ||||
| 14 O 31 f 48 w (pad) = | ||||
| 15 P 32 g 49 x | ||||
| 16 Q 33 h 50 y | ||||
| The use of a Content-Type of multipart in a body part within | The encoded output stream must be represented in lines of no | |||
| another multipart entity is explicitly allowed. In such | more than 76 characters each. All line breaks or other | |||
| cases, for obvious reasons, care must be taken to ensure | characters not found in Table 1 must be ignored by decoding | |||
| that each nested multipart entity uses a different | software. In base64 data, characters other than those in | |||
| boundary delimiter. See Appendix C for an example of nested | Table 1, line breaks, and other white space probably indicate | |||
| multipart entities. | a transmission error, about which a warning message or even a | |||
| message rejection might be appropriate under some | ||||
| circumstances. | ||||
| The use of the multipart Content-Type with only a single | Special processing is performed if fewer than 24 bits are | |||
| body part may be useful in certain contexts, and is | available at the end of the data being encoded. A full | |||
| explicitly permitted. | encoding quantum is always completed at the end of a body. | |||
| When fewer than 24 input bits are available in an input group, | ||||
| zero bits are added (on the right) to form an integral number | ||||
| of 6-bit groups. Padding at the end of the data is performed | ||||
| using the "=" character. Since all base64 input is an | ||||
| integral number of octets, only the following cases can arise: | ||||
| (1) the final quantum of encoding input is an integral | ||||
| multiple of 24 bits; here, the final unit of encoded output | ||||
| will be an integral multiple of 4 characters with no "=" | ||||
| padding, (2) the final quantum of encoding input is exactly 8 | ||||
| bits; here, the final unit of encoded output will be two | ||||
| characters followed by two "=" padding characters, or (3) the | ||||
| final quantum of encoding input is exactly 16 bits; here, the | ||||
| final unit of encoded output will be three characters followed | ||||
| by one "=" padding character. | ||||
| The only mandatory parameter for the multipart Content-Type | Because it is used only for padding at the end of the data, | |||
| is the boundary parameter, which consists of 1 to 70 | the occurrence of any "=" characters may be taken as evidence | |||
| characters from a set of characters known to be very robust | that the end of the data has been reached (without truncation | |||
| through email gateways, and NOT ending with white space. | in transit). No such assurance is possible, however, when the | |||
| (If a boundary appears to end with white space, the white | number of octets transmitted was a multiple of three. | |||
| space must be presumed to have been added by a gateway, and | ||||
| must be deleted.) It is formally specified by the following | ||||
| BNF: | ||||
| boundary := 0*69<bchars> bcharsnospace | Any characters outside of the base64 alphabet are to be | |||
| ignored in base64-encoded data. The same applies to any | ||||
| invalid sequence of characters in the base64 encoding, such as | ||||
| "=====" | ||||
| bchars := bcharsnospace / " " | Care must be taken to use the proper octets for line breaks if | |||
| base64 encoding is applied directly to text material that has | ||||
| not been converted to canonical form. In particular, text | ||||
| line breaks must be converted into CRLF sequences prior to | ||||
| base64 encoding. The important thing to note is that this may | ||||
| be done directly by the encoder rather than in a prior | ||||
| canonicalization step in some implementations. | ||||
| bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" / | NOTE: There is no need to worry about quoting apparent | |||
| "_" | encapsulation boundaries within base64-encoded parts of | |||
| / "," / "-" / "." / "/" / ":" / "=" / "?" | multipart entities because no hyphen characters are used in | |||
| the base64 encoding. | ||||
| Overall, the body of a multipart entity may be specified as | 5.4. Content-ID Header Field | |||
| follows: | ||||
| multipart-body := preamble 1*encapsulation | In constructing a high-level user agent, it may be desirable | |||
| close-delimiter epilogue | to allow one body to make reference to another. Accordingly, | |||
| bodies may be labelled using the "Content-ID" header field, | ||||
| which is syntactically identical to the "Message-ID" header | ||||
| field: | ||||
| encapsulation := delimiter body-part CRLF | id := "Content-ID" ":" msg-id | |||
| delimiter := "--" boundary CRLF ; taken from Content-Type | Like the Message-ID values, Content-ID values must be | |||
| field. | generated to be world-unique. | |||
| ; There must be no space | ||||
| The Content-ID value may be used for uniquely identifying MIME | ||||
| entities in several contexts, particularly for caching data | ||||
| referenced by the message/external-body mechanism. Although | ||||
| the Content-ID header is generally optional, its use is | ||||
| MANDATORY in implementations which generate data of the | ||||
| optional MIME Content-type "message/external-body". That is, | ||||
| each message/external-body entity must have a Content-ID field | ||||
| to permit caching of such data. | ||||
| ; between "--" and boundary. | It is also worth noting that the Content-ID value has special | |||
| semantics in the case of the multipart/alternative content- | ||||
| type. This is explained in the section of this document | ||||
| dealing with multipart/alternative. | ||||
| close-delimiter := "--" boundary "--" CRLF | 5.5. Content-Description Header Field | |||
| ; Again, no space by "--", | ||||
| preamble := discard-text ; to be ignored | The ability to associate some descriptive information with a | |||
| upon receipt. | given body is often desirable. For example, it may be useful | |||
| to mark an "image" body as "a picture of the Space Shuttle | ||||
| Endeavour." Such text may be placed in the Content-Description | ||||
| header field. This header field is always optional. | ||||
| epilogue := discard-text ; to be ignored | description := "Content-Description" ":" *text | |||
| upon receipt. | ||||
| discard-text := *(*text CRLF) | The description is presumed to be given in the US-ASCII | |||
| character set, although the mechanism specified in RFC MIME- | ||||
| HEADERS [RFC-MIME-HEADERS] may be used for non-US-ASCII | ||||
| Content-Description values. | ||||
| body-part := <"message" as defined in RFC 822, | 5.6. Additional MIME Header Fields | |||
| with all header fields optional, and with the | ||||
| specified delimiter not occurring anywhere in | ||||
| the message body, either on a line by itself | ||||
| or as a substring anywhere. Note that the | ||||
| semantics of a part differ from the semantics | ||||
| of a message, as described in the text.> | ||||
| NOTE: In certain transport enclaves, RFC 822 | Future documents may elect to define additional MIME header | |||
| restrictions such as the one that limits bodies to | fields for various purposes. Any new header field that | |||
| printable ASCII characters may not be in force. (That | further describes the content of a message should begin with | |||
| is, the transport domains may resemble standard | the string "Content-" to allow such fields which appear in a | |||
| Internet mail transport as specified in RFC821 and | message header to be distinguished from ordinary RFC 822 | |||
| assumed by RFC822, but without certain restrictions.) | message header fields. | |||
| The relaxation of these restrictions should be | ||||
| construed as locally extending the definition of | ||||
| bodies, for example to include octets outside of the | ||||
| ASCII range, as long as these extensions are supported | ||||
| by the transport and adequately documented in the | ||||
| Content-Transfer-Encoding header field. However, in | ||||
| no event are headers (either message headers or body- | ||||
| part headers) allowed to contain anything other than | ||||
| ASCII characters. | ||||
| NOTE: Conspicuously missing from the multipart | MIME-extension-field := <Any RFC 822 header field which | |||
| type is a notion of structured, related body | begins with the string | |||
| parts. In general, it seems premature to try to | "Content-"> | |||
| standardize interpart structure yet. It is | ||||
| recommended that those wishing to provide a more | ||||
| structured or integrated multipart messaging | ||||
| facility should define a subtype of multipart that | ||||
| is syntactically identical, but that always | ||||
| expects the inclusion of a distinguished part that | ||||
| 6. Predefined Content-Type Values | ||||
| can be used to specify the structure and | This document defines seven initial Content-Type values and an | |||
| integration of the other parts, probably referring | extension mechanism for private or experimental types. | |||
| to them by their Content-ID field. If this | Further standard types must be defined by new published | |||
| approach is used, other implementations will not | specifications. It is expected that most innovation in new | |||
| recognize the new subtype, but will treat it as | types of mail will take place as subtypes of the seven types | |||
| the primary subtype (multipart/mixed) and will | defined here. The most essential characteristics of the seven | |||
| thus be able to show the user the parts that are | content-types are summarized in Appendix E. | |||
| recognized. | ||||
| 7.2.2 The Multipart/mixed (primary) subtype | 6.1. Discrete Content-Type Values | |||
| The primary subtype for multipart, "mixed", is intended for | Five of the seven initial Content-Type values refer to | |||
| use when the body parts are independent and need to be | discrete bodies. The content of such entities is handled by | |||
| bundled in a particular order. Any multipart subtypes that | non-MIME mechanisms; they are opaque to MIME processors. | |||
| an implementation does not recognize must be treated as | ||||
| being of subtype "mixed". | ||||
| 7.2.3 The Multipart/alternative subtype | 6.1.1. Text Content-Type | |||
| The multipart/alternative type is syntactically identical to | The text Content-Type is intended for sending material which | |||
| multipart/mixed, but the semantics are different. In | is principally textual in form. A "charset" parameter may be | |||
| particular, each of the parts is an "alternative" version of | used to indicate the character set of the body text for some | |||
| the same information. | text subtypes, notably including the subtype "text/plain", | |||
| which indicates plain (unformatted) text. The default | ||||
| Content-Type for Internet mail if none is specified is | ||||
| "text/plain; charset=us-ascii". | ||||
| Systems should recognize that the content of the various | Beyond plain text, there are many formats for representing | |||
| parts is interchangeable. Systems should choose the | what might be known as "extended text" -- text with embedded | |||
| "best" type based on the local environment and preferences, | formatting and presentation information. An interesting | |||
| in some cases even through user interaction. As with | characteristic of many such representations is that they are | |||
| multipart/mixed, the order of body parts is significant. In | to some extent readable even without the software that | |||
| this case, the alternatives appear in an order of increasing | interprets them. It is useful, then, to distinguish them, at | |||
| faithfulness to the original content. In general, the best | the highest level, from such unreadable data as images, audio, | |||
| choice is the LAST part of a type supported by the recipient | or text represented in an unreadable form. In the absence of | |||
| system's local environment. | appropriate interpretation software, it is reasonable to show | |||
| subtypes of text to the user, while it is not reasonable to do | ||||
| so with most nontextual data. | ||||
| Multipart/alternative may be used, for example, to send mail | Such formatted textual data should be represented using | |||
| in a fancy text format in such a way that it can easily be | subtypes of text. Plausible subtypes of text are typically | |||
| displayed anywhere: | given by the common name of the representation format, e.g., | |||
| "text/enriched" [RFC-1563]. | ||||
| From: Nathaniel Borenstein <nsb@bellcore.com> | 6.1.1.1. Representation of Line Breaks | |||
| To: Ned Freed <ned@innosoft.com> | ||||
| Subject: Formatted text mail | ||||
| MIME-Version: 1.0 | ||||
| Content-Type: multipart/alternative; boundary=boundary42 | ||||
| --boundary42 | The canonical form of any MIME text type MUST represent a line | |||
| break as a CRLF sequence. Similarly, any occurrence of CRLF | ||||
| in text MUST represent a line break. Use of CR and LF outside | ||||
| of line break sequences is also forbidden. | ||||
| This rule applies regardless of format or character set or | ||||
| sets involved. | ||||
| Content-Type: text/plain; charset=us-ascii | 6.1.1.2. Charset Parameter | |||
| ...plain text version of message goes here.... | A critical parameter that may be specified in the Content-Type | |||
| field for text/plain data is the character set. This is | ||||
| specified with a "charset" parameter, as in: | ||||
| --boundary42 | Content-type: text/plain; charset=iso-8859-1 | |||
| Content-Type: text/richtext | ||||
| .... RFC 1341 richtext version of same message goes here ... | Unlike some other parameter values, the values of the charset | |||
| parameter are NOT case sensitive. The default character set, | ||||
| which must be assumed in the absence of a charset parameter, | ||||
| is US-ASCII. | ||||
| --boundary42 | The specification for any future subtypes of "text" must | |||
| Content-Type: text/x-whatever | specify whether or not they will also utilize a "charset" | |||
| parameter, and may possibly restrict its values as well. When | ||||
| used with a particular body, the semantics of the "charset" | ||||
| parameter should be identical to those specified here for | ||||
| "text/plain", i.e., the body consists entirely of characters | ||||
| in the given charset. In particular, definers of future text | ||||
| subtypes should pay close attention to the implications of | ||||
| multioctet character sets for their subtype definitions. | ||||
| .... fanciest version of same message goes here ... | This RFC specifies the definition of the charset parameter for | |||
| the purposes of MIME to be the name of a character set, as | ||||
| "character set" is defined in Section 4 of this document. The | ||||
| rules regarding line breaks detailed in the previous section | ||||
| must also be observed -- a character set whose definition does | ||||
| not conform to these rules cannot be used in a MIME text type. | ||||
| --boundary42-- | An initial list of predefined character set names can be found | |||
| at the end of this section. Additional character sets may be | ||||
| registered with IANA as described in RFC REG. | ||||
| In this example, users whose mail system understood the | Note that if the specified character set includes 8-bit data, | |||
| "text/x-whatever" format would see only the fancy version, | a Content-Transfer-Encoding header field and a corresponding | |||
| while other users would see only the richtext or plain text | encoding on the data are required in order to transmit the | |||
| version, depending on the capabilities of their system. | body via some mail transfer protocols, such as SMTP. | |||
| In general, user agents that compose multipart/alternative | The default character set, US-ASCII, has been the subject of | |||
| entities must place the body parts in increasing order of | some confusion and ambiguity in the past. Not only were there | |||
| preference, that is, with the preferred format last. For | some ambiguities in the definition, there have been wide | |||
| fancy text, the sending user agent should put the plainest | variations in practice. In order to eliminate such ambiguity | |||
| format first and the richest format last. Receiving user | and variations in the future, it is strongly recommended that | |||
| agents should pick and display the last format they are | new user agents explicitly specify a character set via the | |||
| capable of displaying. In the case where one of the | Content-Type header field. "US-ASCII" does not indicate an | |||
| alternatives is itself of type "multipart" and contains | arbitrary 7-bit character code, but specifies that the body | |||
| unrecognized sub-parts, the user agent may choose either to | uses character coding that uses the exact correspondence of | |||
| show that alternative, an earlier alternative, or both. | octets to characters specified in US-ASCII. National use | |||
| variations of ISO 646 [ISO-646] are NOT US-ASCII and their use | ||||
| in Internet mail is explicitly discouraged. The omission of | ||||
| the ISO 646 character set is deliberate in this regard. The | ||||
| character set name of "US-ASCII" explicitly refers to ANSI | ||||
| X3.4-1986 [US-ASCII] only. The character set name "ASCII" is | ||||
| reserved and must not be used for any purpose. | ||||
| NOTE: From an implementor's perspective, it might | NOTE: RFC 821 explicitly specifies "ASCII", and references an | |||
| seem more sensible to reverse this ordering, and | earlier version of the American Standard. Insofar as one of | |||
| have the plainest alternative last. However, | the purposes of specifying a Content-Type and character set is | |||
| placing the plainest alternative first is the | to permit the receiver to unambiguously determine how the | |||
| friendliest possible option when | sender intended the coded message to be interpreted, assuming | |||
| multipart/alternative entities are viewed using a | anything other than "strict ASCII" as the default would risk | |||
| non-MIME-conformant mail reader. While this | unintentional and incompatible changes to the semantics of | |||
| approach does impose some burden on conformant | messages now being transmitted. This also implies that | |||
| mail readers, interoperability with older mail | messages containing characters coded according to national | |||
| readers was deemed to be more important in this | variations on ISO 646, or using code-switching procedures | |||
| case. | (e.g., those of ISO 2022), as well as 8-bit or multiple octet | |||
| character encodings MUST use an appropriate character set | ||||
| specification to be consistent with this specification. | ||||
| It may be the case that some user agents, if they can | The complete US-ASCII character set is listed in ANSI X3.4- | |||
| recognize more than one of the formats, will prefer to offer | 1986. Note that the control characters including DEL (0-31, | |||
| 127) have no defined meaning apart from the combination CRLF | ||||
| (US-ASCII values 13 and 10) indicating a new line. Two of the | ||||
| characters have de facto meanings in wide use: FF (12) often | ||||
| means "start subsequent text on the beginning of a new page"; | ||||
| and TAB or HT (9) often (though not always) means "move the | ||||
| cursor to the next available column after the current position | ||||
| where the column number is a multiple of 8 (counting the first | ||||
| column as column 0)." Apart from this, any use of the control | ||||
| characters or DEL in a body must be part of a private | ||||
| agreement between the sender and recipient. Such private | ||||
| agreements are discouraged and should be replaced by the other | ||||
| capabilities of this document. | ||||
| NOTE: Beyond US-ASCII, an enormous proliferation of character | ||||
| sets is possible. It is the opinion of the IETF working group | ||||
| that a large number of character sets is NOT a good thing. We | ||||
| would prefer to specify a SINGLE character set that can be | ||||
| used universally for representing all of the world's languages | ||||
| in electronic mail. Unfortunately, existing practice in | ||||
| several communities seems to point to the continued use of | ||||
| multiple character sets in the near future. For this reason, | ||||
| we define names for a small number of character sets for which | ||||
| a strong constituent base exists. | ||||
| the user the choice of which format to view. This makes | The defined charset values are: | |||
| sense, for example, if mail includes both a nicely-formatted | ||||
| image version and an easily-edited text version. What is | ||||
| most critical, however, is that the user not automatically | ||||
| be shown multiple versions of the same data. Either the | ||||
| user should be shown the last recognized version or should | ||||
| be given the choice. | ||||
| NOTE ON THE SEMANTICS OF CONTENT-ID IN | (1) US-ASCII -- as defined in ANSI X3.4-1986 [US-ASCII]. | |||
| MULTIPART/ALTERNATIVE: Each part of a multipart/alternative | ||||
| entity represents the same data, but the mappings between | ||||
| the two are not necessarily without information loss. For | ||||
| example, information is lost when translating ODA to | ||||
| PostScript or plain text. It is recommended that each part | ||||
| should have a different Content-ID value in the case where | ||||
| the information content of the two parts is not identical. | ||||
| However, where the information content is identical -- for | ||||
| example, where several parts of type "message/external-body" | ||||
| specify alternate ways to access the identical data -- the | ||||
| same Content-ID field value should be used, to optimize any | ||||
| caching mechanisms that might be present on the recipient's | ||||
| end. However, it is recommended that the Content-ID values | ||||
| used by the parts should not be the same Content-ID value | ||||
| that describes the multipart/alternative as a whole, if | ||||
| there is any such Content-ID field. That is, one Content-ID | ||||
| value will refer to the multipart/alternative entity, while | ||||
| one or more other Content-ID values will refer to the parts | ||||
| inside it. | ||||
| (2) ISO-8859-X -- where "X" is to be replaced, as | ||||
| necessary, for the parts of ISO-8859 [ISO-8859]. Note | ||||
| that the ISO 646 character sets have deliberately been | ||||
| omitted in favor of their 8859 replacements, which are | ||||
| the designated character sets for Internet mail. As of | ||||
| the publication of this document, the legitimate values | ||||
| for "X" are the digits 1 through 9. | ||||
| 7.2.4 The Multipart/digest subtype | All of these character sets are used as pure 7- or 8-bit sets | |||
| without any shift or escape functions. The meaning of shift | ||||
| and escape sequences in these character sets is not defined. | ||||
| This document defines a "digest" subtype of the multipart | The character sets specified above are the ones that were | |||
| Content-Type. This type is syntactically identical to | relatively uncontroversial during the drafting of MIME. This | |||
| multipart/mixed, but the semantics are different. In | document does not endorse the use of any particular character | |||
| particular, in a digest, the default Content-Type value for | set other than US-ASCII, and recognizes that the future | |||
| a body part is changed from "text/plain" to | evolution of world character sets remains unclear. It is | |||
| "message/rfc822". This is done to allow a more readable | expected that in the future, additional character sets will be | |||
| digest format that is largely compatible (except for the | registered for use in MIME. | |||
| quoting convention) with RFC 934. | ||||
| A digest in this format might, then, look something like | Note that the character set used, if anything other than US- | |||
| this: | ASCII, must always be explicitly specified in the Content-Type | |||
| field. | ||||
| From: Moderator-Address | No other character set name may be used in Internet mail | |||
| To: Recipient-List | without the publication of a formal specification and its | |||
| MIME-Version: 1.0 | registration with IANA, or by private agreement, in which case | |||
| Subject: Internet Digest, volume 42 | the character set name must begin with "X-". | |||
| Content-Type: multipart/digest; | ||||
| boundary="---- next message ----" | ||||
| ------ next message ---- | Implementors are discouraged from defining new character sets | |||
| for mail use unless absolutely necessary. | ||||
| From: someone-else | The "charset" parameter has been defined primarily for the | |||
| Subject: my opinion | purpose of textual data, and is described in this section for | |||
| that reason. However, it is conceivable that non-textual data | ||||
| might also wish to specify a charset value for some purpose, | ||||
| in which case the same syntax and values should be used. | ||||
| ...body goes here ... | In general, mail-sending software should always use the | |||
| "lowest common denominator" character set possible. For | ||||
| example, if a body contains only US-ASCII characters, it | ||||
| should be marked as being in the US-ASCII character set, not | ||||
| ISO-8859-1, which, like all the ISO-8859 family of character | ||||
| sets, is a superset of US-ASCII. More generally, if a | ||||
| widely-used character set is a subset of another character | ||||
| set, and a body contains only characters in the widely-used | ||||
| subset, it should be labelled as being in that subset. This | ||||
| will increase the chances that the recipient will be able to | ||||
| view the mail correctly. | ||||
| ------ next message ---- | 6.1.1.3. Plain Subtype | |||
| From: someone-else-again | The simplest and most important subtype of text is "plain". | |||
| Subject: my different opinion | This indicates plain (unformatted) text. The default | |||
| Content-Type for Internet mail, "text/plain; charset=us- | ||||
| ascii", describes existing Internet practice. That is, it is | ||||
| the type of body defined by RFC 822. | ||||
| ... another body goes here... | No other text subtype is defined by this document. | |||
| ------ next message ------ | 6.1.1.4. Unrecognized Subtypes | |||
| 7.2.5 The Multipart/parallel subtype | Unrecognized subtypes of text should be treated as subtype | |||
| "plain" as long as the MIME implementation knows how to handle | ||||
| the charset. Unrecognized subtypes which also specify an | ||||
| unrecognized charset should be treated as "application/octet- | ||||
| stream". | ||||
| This document defines a "parallel" subtype of the multipart | 6.1.2. Image Content-Type | |||
| Content-Type. This type is syntactically identical to | ||||
| multipart/mixed, but the semantics are different. In | ||||
| particular, in a parallel entity, the order of body | ||||
| parts is not significant. | ||||
| A Content-Type of "image" indicates that the body contains an | ||||
| image. The subtype names the specific image format. These | ||||
| names are not case sensitive. Two initial subtypes are "jpeg" | ||||
| for the JPEG format, JFIF encoding, and "gif" for GIF format | ||||
| [GIF]. | ||||
| A common presentation of this type is to display all of the | The list of image subtypes given here is neither exclusive nor | |||
| parts simultaneously on hardware and software that are | exhaustive, and is expected to grow as more types are | |||
| capable of doing so. However, composing agents should be | registered with IANA, as described in RFC REG. | |||
| aware that many mail readers will lack this capability and | ||||
| will show the parts serially in any event. | ||||
| 7.2.6 Other Multipart subtypes | Unrecognized subtypes of image should at a minimum be treated | |||
| as "application/octet-stream". Implementations may optionally | ||||
| elect to pass subtypes of image that they do not specifically | ||||
| recognize to a robust general-purpose image viewing | ||||
| application, if such an application is available. | ||||
| Other multipart subtypes are expected in the future. MIME | 6.1.3. Audio Content-Type | |||
| implementations must in general treat unrecognized subtypes | ||||
| of multipart as being equivalent to "multipart/mixed". | ||||
| The formal grammar for content-type header fields for | A Content-Type of "audio" indicates that the body contains | |||
| multipart data is given by: | audio data. Although there is not yet a consensus on an | |||
| "ideal" audio format for use with computers, there is a | ||||
| pressing need for a format capable of providing interoperable | ||||
| behavior. | ||||
| multipart-type := "multipart" "/" multipart-subtype | The initial subtype of "basic" is specified to meet this | |||
| ";" "boundary" "=" boundary | requirement by providing an absolutely minimal lowest common | |||
| denominator audio format. It is expected that richer formats | ||||
| for higher quality and/or lower bandwidth audio will be | ||||
| defined by a later document. | ||||
| multipart-subtype := "mixed" / "parallel" / "digest" | The content of the "audio/basic" subtype is single channel | |||
| / "alternative" / extension-token | audio encoded using 8-bit ISDN mu-law [PCM] at a sample rate | |||
| of 8000 Hz. | ||||
| 7.3 The Message Content-Type | Unrecognized subtypes of audio should at a minimum be treated | |||
| as "application/octet-stream". Implementations may optionally | ||||
| elect to pass subtypes of audio that they do not specifically | ||||
| recognize to a robust general-purpose audio playing | ||||
| application, if such an application is available. | ||||
| It is frequently desirable, in sending mail, to encapsulate | 6.1.4. Video Content-Type | |||
| another mail message. For this common operation, a special | ||||
| Content-Type, "message", is defined. The primary subtype, | ||||
| message/rfc822, has no required parameters in the Content- | ||||
| Type field. Additional subtypes, "partial" and "External- | ||||
| body", do have required parameters. These subtypes are | ||||
| explained below. | ||||
| NOTE: It has been suggested that subtypes of | A Content-Type of "video" indicates that the body contains a | |||
| message might be defined for forwarded or rejected | time-varying-picture image, possibly with color and | |||
| messages. However, forwarded and rejected | coordinated sound. The term "video" is used extremely | |||
| messages can be handled as multipart messages in | generically, rather than with reference to any particular | |||
| which the first part contains any control or | technology or format, and is not meant to preclude subtypes | |||
| descriptive information, and a second part, of | such as animated drawings encoded compactly. The subtype | |||
| type message/rfc822, is the forwarded or rejected | "mpeg" refers to video coded according to the MPEG standard | |||
| message. Composing rejection and forwarding | [MPEG]. | |||
| messages in this manner will preserve the type | ||||
| information on the original message and allow it | ||||
| to be correctly presented to the recipient, and | ||||
| hence is strongly encouraged. | ||||
| As stated in the definition of the Content-Transfer-Encoding | Note that although in general this document strongly | |||
| field, no encoding other than "7bit", "8bit", or "binary" is | discourages the mixing of multiple media in a single body, it | |||
| is recognized that many so-called "video" formats include a | ||||
| representation for synchronized audio, and this is explicitly | ||||
| permitted for subtypes of "video". | ||||
| Unrecognized subtypes of video should at a minimum be treated | ||||
| as "application/octet-stream". Implementations may optionally | ||||
| elect to pass subtypes of video that they do not specifically | ||||
| recognize to a robust general-purpose video display | ||||
| application, if such an application is available. | ||||
| permitted for messages or parts of type "message". Even | 6.1.5. Application Content-Type | |||
| stronger restrictions apply to the subtypes | ||||
| "message/partial" and "message/external-body", as specified | ||||
| below. The message header fields are always US-ASCII in any | ||||
| case, and data within the body can still be encoded, in | ||||
| which case the Content-Transfer-Encoding header field in the | ||||
| encapsulated message will reflect this. Non-ASCII text in | ||||
| the headers of an encapsulated message can be specified | ||||
| using the mechanisms described in [RFC-1522]. | ||||
| Mail gateways, relays, and other mail handling agents are | The "application" Content-Type is to be used for discrete data | |||
| commonly known to alter the top-level header of an RFC 822 | which do not fit in any of the other categories, and | |||
| message. In particular, they frequently add, remove, or | particularly for data to be processed by mail-based uses of | |||
| reorder header fields. Such alterations are explicitly | application programs. This is information which must be | |||
| forbidden for the encapsulated headers embedded in the | processed by an application before it is viewable or usable to | |||
| bodies of messages of type "message." | a user. Expected uses for Content-Type application include | |||
| mail-based file transfer, spreadsheets, data for mail-based | ||||
| scheduling systems, and languages for "active" (computational) | ||||
| email. (The latter, in particular, can pose security problems | ||||
| which must be understood by implementors, and are considered | ||||
| in detail in the discussion of the application/PostScript | ||||
| content-type.) | ||||
| 7.3.1 The Message/rfc822 (primary) subtype | For example, a meeting scheduler might define a standard | |||
| representation for information about proposed meeting dates. | ||||
| An intelligent user agent would use this information to | ||||
| conduct a dialog with the user, and might then send further | ||||
| mail based on that dialog. More generally, there have been | ||||
| several "active" messaging languages developed in which | ||||
| programs in a suitably specialized language are sent through | ||||
| the mail and automatically run in the recipient's environment. | ||||
| A Content-Type of "message/rfc822" indicates that the body | Such applications may be defined as subtypes of the | |||
| contains an encapsulated message, with the syntax of an RFC | "application" Content-Type. This document defines two | |||
| 822 message. However, unlike top-level RFC 822 messages, | subtypes: octet-stream and PostScript. | |||
| the restriction that each message/rfc822 body must include a | ||||
| "From", "Date", and at least one destination header is | ||||
| removed and replaced with the requirement that at least one | ||||
| of "From", "Subject", or "Date" must be present. | ||||
| It should be noted that, despite the use of the numbers | The subtype of application will often be the name of the | |||
| "822", a message/rfc822 entity can include enhanced | application for which the data are intended. This does not | |||
| information as defined in this document. In other words, a | mean, however, that any application program name may be used | |||
| message/rfc822 message may be a MIME message. | freely as a subtype of application. Usage of any subtype | |||
| (other than subtypes beginning with "x-") must be registered | ||||
| with IANA, as described in RFC REG. | ||||
| 7.3.2 The Message/Partial subtype | 6.1.5.1. Octet-Stream Subtype | |||
| A subtype of message, "partial", is defined in order to | The "octet-stream" subtype is used to indicate that a body | |||
| allow large objects to be delivered as several separate | contains arbitrary binary data. The set of currently defined | |||
| pieces of mail and automatically reassembled by the | parameters is: | |||
| receiving user agent. (The concept is similar to IP | ||||
| fragmentation/reassembly in the basic Internet Protocols.) | ||||
| This mechanism can be used when intermediate transport | ||||
| agents limit the size of individual messages that can be | ||||
| sent. Content-Type "message/partial" thus indicates that | ||||
| the body contains a fragment of a larger message. | ||||
| Three parameters must be specified in the Content-Type field | (1) TYPE -- the general type or category of binary data. | |||
| of type message/partial: The first, "id", is a unique | This is intended as information for the human recipient | |||
| rather than for any automatic processing. | ||||
| (2) PADDING -- the number of bits of padding that were | ||||
| appended to the bit-stream comprising the actual | ||||
| contents to produce the enclosed 8-bit byte-oriented | ||||
| data. This is useful for enclosing a bit-stream in a | ||||
| body when the total number of bits is not a multiple of | ||||
| 8. | ||||
| identifier, as close to a world-unique identifier as | Both of these parameters are optional. | |||
| possible, to be used to match the parts together. (In | ||||
| general, the identifier is essentially a message-id; if | ||||
| placed in double quotes, it can be any message-id, in | ||||
| accordance with the BNF for "parameter" given earlier in | ||||
| this specification.) The second, "number", an integer, is | ||||
| the part number, which indicates where this part fits into | ||||
| the sequence of fragments. The third, "total", another | ||||
| integer, is the total number of parts. This third subfield | ||||
| is required on the final part, and is optional (though | ||||
| encouraged) on the earlier parts. Note also that these | ||||
| parameters may be given in any order. | ||||
| Thus, part 2 of a 3-part message may have either of the | An additional parameter, "CONVERSIONS", was defined in RFC | |||
| following header fields: | 1341 but has since been removed. RFC 1341 also defined the | |||
| use of a "NAME" parameter which gave a suggested file name to | ||||
| be used if the data were to be written to a file. This has | ||||
| been deprecated in anticipation of a separate Content- | ||||
| Disposition header field, to be defined in a subsequent RFC. | ||||
| Content-Type: Message/Partial; | The recommended action for an implementation that receives | |||
| number=2; total=3; | application/octet-stream mail is to simply offer to put the | |||
| id="oc=jpbe0M2Yt4s@thumper.bellcore.com" | data in a file, with any Content-Transfer-Encoding undone, or | |||
| perhaps to use it as input to a user-specified process. | ||||
| Content-Type: Message/Partial; | To reduce the danger of transmitting rogue programs through | |||
| id="oc=jpbe0M2Yt4s@thumper.bellcore.com"; | the mail, it is strongly recommended that implementations NOT | |||
| number=2 | implement a path-search mechanism whereby an arbitrary program | |||
| named in the Content-Type parameter (e.g., an "interpreter=" | ||||
| parameter) is found and executed using the mail body as input. | ||||
| But part 3 MUST specify the total number of parts: | 6.1.5.2. PostScript Subtype | |||
| Content-Type: Message/Partial; | A Content-Type of "application/postscript" indicates a | |||
| number=3; total=3; | PostScript program. Currently two variants of the PostScript | |||
| id="oc=jpbe0M2Yt4s@thumper.bellcore.com" | language are allowed; the original level 1 variant is | |||
| described in [POSTSCRIPT] and the more recent level 2 variant | ||||
| is described in [POSTSCRIPT2]. | ||||
| Note that part numbering begins with 1, not 0. | PostScript is a registered trademark of Adobe Systems, Inc. | |||
| Use of the MIME content-type "application/postscript" implies | ||||
| recognition of that trademark and all the rights it entails. | ||||
| When the parts of a message broken up in this manner are put | The PostScript language definition provides facilities for | |||
| together, the result is a complete MIME entity, which may | internal labelling of the specific language features a given | |||
| have its own Content-Type header field, and thus may contain | program uses. This labelling, called the PostScript document | |||
| any other data type. | structuring conventions, or DSC, is very general and provides | |||
| substantially more information than just the language level. | ||||
| The use of document structuring conventions, while not | ||||
| required, is strongly recommended as an aid to | ||||
| interoperability. Documents which lack proper structuring | ||||
| conventions cannot be tested to see whether or not they will | ||||
| work in a given environment. As such, some systems may assume | ||||
| the worst and refuse to process unstructured documents. | ||||
| Message fragmentation and reassembly: The semantics of a | The execution of general-purpose PostScript interpreters | |||
| reassembled partial message must be those of the "inner" | entails serious security risks, and implementors are | |||
| message, rather than of a message containing the inner | discouraged from simply sending PostScript email bodies to | |||
| message. This makes it possible, for example, to send a | "off-the-shelf" interpreters. While it is usually safe to | |||
| large audio message as several partial messages, and still | send PostScript to a printer, where the potential for harm is | |||
| have it appear to the recipient as a simple audio message | greatly constrained by typical printer environments, | |||
| rather than as an encapsulated message containing an audio | implementors should consider all of the following before they | |||
| message. That is, the encapsulation of the message is | add interactive display of PostScript bodies to their mail | |||
| considered to be "transparent". | readers. | |||
| The remainder of this section outlines some, though probably | ||||
| not all, of the possible problems with sending PostScript | ||||
| through the mail. | ||||
| When generating and reassembling the parts of a | (1) Dangerous operations in the PostScript language | |||
| message/partial message, the headers of the encapsulated | include, but may not be limited to, the PostScript | |||
| message must be merged with the headers of the enclosing | operators "deletefile", "renamefile", "filenameforall", | |||
| entities. In this process the following rules must be | and "file". "File" is only dangerous when applied to | |||
| observed: | something other than standard input or output. | |||
| Implementations may also define additional nonstandard | ||||
| file operators; these may also pose a threat to | ||||
| security. "Filenameforall", the wildcard file search | ||||
| operator, may appear at first glance to be harmless. | ||||
| Note, however, that this operator has the potential to | ||||
| reveal information about what files the recipient has | ||||
| access to, and this information may itself be | ||||
| sensitive. Message senders should avoid the use of | ||||
| potentially dangerous file operators, since these | ||||
| operators are quite likely to be unavailable in secure | ||||
| PostScript implementations. Message receiving and | ||||
| displaying software should either completely disable | ||||
| all potentially dangerous file operators or take | ||||
| special care not to delegate any special authority to | ||||
| their operation. These operators should be viewed as | ||||
| being done by an outside agency when interpreting | ||||
| PostScript documents. Such disabling and/or checking | ||||
| should be done completely outside of the reach of the | ||||
| PostScript language itself; care should be taken to | ||||
| ensure that no method exists for re-enabling full- | ||||
| function versions of these operators. | ||||
| (1) All of the header fields from the initial | (2) The PostScript language provides facilities for exiting | |||
| enclosing entity (part one), except those that | the normal interpreter, or server, loop. Changes made | |||
| start with "Content-" and the specific header | in this "outer" environment are customarily retained | |||
| fields "Subject", "Message-ID", "Encrypted", and | across documents, and may in some cases be retained | |||
| "MIME-Version", must be copied, in order, to the | semipermanently in nonvolatile memory. The operators | |||
| new message. | associated with exiting the interpreter loop have the | |||
| potential to interfere with subsequent document | ||||
| processing. As such, their unrestrained use constitutes | ||||
| a threat of service denial. PostScript operators that | ||||
| exit the interpreter loop include, but may not be | ||||
| limited to, the exitserver and startjob operators. | ||||
| Message sending software should not generate PostScript | ||||
| that depends on exiting the interpreter loop to | ||||
| operate, since the ability to exit will probably be | ||||
| unavailable in secure PostScript implementations. | ||||
| Message receiving and displaying software should | ||||
| completely disable the ability to make retained changes | ||||
| to the PostScript environment by eliminating or | ||||
| disabling the "startjob" and "exitserver" operations. | ||||
| If these operations cannot be eliminated or completely | ||||
| disabled, the password associated with them should at | ||||
| least be set to a hard-to-guess value. | ||||
| (2) Only those header fields in the enclosed | (3) PostScript provides operators for setting system-wide | |||
| message which start with "Content-" and "Subject", | and device-specific parameters. These parameter | |||
| "Message-ID", "Encrypted", and "MIME-Version" must | settings may be retained across jobs and may | |||
| be appended, in order, to the header fields of the | potentially pose a threat to the correct operation of | |||
| new message. Any header fields in the enclosed | the interpreter. The PostScript operators that set | |||
| message which do not start with "Content-" (except | system and device parameters include, but may not be | |||
| for "Message-ID", "Encrypted", and "MIME-Version") | limited to, the "setsystemparams" and "setdevparams" | |||
| will be ignored. | operators. Message sending software should not | |||
| generate PostScript that depends on the setting of | ||||
| system or device parameters to operate correctly. The | ||||
| ability to set these parameters will probably be | ||||
| unavailable in secure PostScript implementations. | ||||
| Message receiving and displaying software should | ||||
| disable the ability to change system and device | ||||
| parameters. If these operators cannot be completely | ||||
| disabled, the password associated with them should at | ||||
| least be set to a hard-to-guess value. | ||||
| (3) All of the header fields from the second and | (4) Some PostScript implementations provide nonstandard | |||
| any subsequent messages will be ignored. | facilities for the direct loading and execution of | |||
| machine code. Such facilities are quite obviously open | ||||
| to substantial abuse. Message sending software should | ||||
| not make use of such features. Besides being totally | ||||
| hardware-specific, they are also likely to be | ||||
| unavailable in secure implementations of PostScript. | ||||
| Message receiving and displaying software should not | ||||
| allow such operators to be used if they exist. | ||||
| For example, if an audio message is broken into two parts, | (5) PostScript is an extensible language, and many, if not | |||
| the first part might look something like this: | most, implementations of it provide a number of their | |||
| own extensions. This document does not deal with such | ||||
| extensions explicitly since they constitute an unknown | ||||
| factor. Message sending software should not make use | ||||
| of nonstandard extensions; they are likely to be | ||||
| missing from some implementations. Message receiving | ||||
| and displaying software should make sure that any | ||||
| nonstandard PostScript operators are secure and don't | ||||
| present any kind of threat. | ||||
| X-Weird-Header-1: Foo | (6) It is possible to write PostScript that consumes huge | |||
| From: Bill@host.com | amounts of various system resources. It is also | |||
| To: joe@otherhost.com | possible to write PostScript programs that loop | |||
| Subject: Audio mail (part 1 of 2) | indefinitely. Both types of programs have the | |||
| Message-ID: <id1@host.com> | potential to cause damage if sent to unsuspecting | |||
| MIME-Version: 1.0 | recipients. Message-sending software should avoid the | |||
| Content-type: message/partial; | construction and dissemination of such programs, which | |||
| id="ABC@host.com"; | is antisocial. Message receiving and displaying | |||
| number=1; total=2 | software should provide appropriate mechanisms to abort | |||
| processing of a document after a reasonable amount of | ||||
| time has elapsed. In addition, PostScript interpreters | ||||
| should be limited to the consumption of only a | ||||
| reasonable amount of any given system resource. | ||||
| X-Weird-Header-1: Bar | (7) It is possible to include raw binary information inside | |||
| X-Weird-Header-2: Hello | PostScript in various forms. This is not recommended | |||
| Message-ID: <anotherid@foo.com> | for use in email, both because it is not supported by | |||
| Subject: Audio mail | all PostScript interpreters and because it | |||
| MIME-Version: 1.0 | significantly complicates the use of a MIME Content- | |||
| Content-type: audio/basic | Transfer-Encoding. (Without such binary, PostScript | |||
| may typically be viewed as line-oriented data. The | ||||
| treatment of CRLF sequences becomes extremely | ||||
| problematic if binary and line-oriented data are mixed | ||||
| in a single PostScript data stream.) | ||||
| (8) Finally, bugs may exist in some PostScript interpreters | ||||
| which could possibly be exploited to gain unauthorized | ||||
| access to a recipient's system. Apart from noting this | ||||
| possibility, there is no specific action to take to | ||||
| prevent this, apart from the timely correction of such | ||||
| bugs if any are found. | ||||
| Content-transfer-encoding: base64 | 6.1.5.3. Other Application Subtypes | |||
| ... first half of encoded audio data goes here... | It is expected that many other subtypes of application will be | |||
| defined in the future. MIME implementations must at a minimum | ||||
| treat any unrecognized subtypes as being equivalent to | ||||
| "application/octet-stream". | ||||
| and the second half might look something like this: | 6.2. Composite Content-Type Values | |||
| From: Bill@host.com | The remaining two of the seven initial Content-Type values | |||
| To: joe@otherhost.com | refer to composite entities. Composite entities are handled | |||
| Subject: Audio mail (part 2 of 2) | using MIME mechanisms -- a MIME processor typically handles | |||
| MIME-Version: 1.0 | the body directly. | |||
| Message-ID: <id2@host.com> | ||||
| Content-type: message/partial; | ||||
| id="ABC@host.com"; number=2; total=2 | ||||
| ... second half of encoded audio data goes here... | 6.2.1. Multipart Content-Type | |||
| Then, when the fragmented message is reassembled, the | In the case of multiple part entities, in which one or more | |||
| resulting message to be displayed to the user should look | different sets of data are combined in a single body, a | |||
| something like this: | "multipart" Content-Type field must appear in the entity's | |||
| header. The body must then contain one or more "body parts," | ||||
| each preceded by an encapsulation boundary, and the last one | ||||
| followed by a closing boundary. Each part starts with an | ||||
| encapsulation boundary, and then contains a body part | ||||
| consisting of a header area, a blank line, and a body area. | ||||
| Thus a body part is similar to an RFC 822 message in syntax, | ||||
| but different in meaning. | ||||
| X-Weird-Header-1: Foo | A body part is NOT to be interpreted as actually being an RFC | |||
| From: Bill@host.com | 822 message. To begin with, NO header fields are actually | |||
| To: joe@otherhost.com | required in body parts. A body part that starts with a blank | |||
| Subject: Audio mail | line, therefore, is allowed and is a body part for which all | |||
| Message-ID: <anotherid@foo.com> | default values are to be assumed. In such a case, the | |||
| MIME-Version: 1.0 | absence of a Content-Type header indicates that the | |||
| Content-type: audio/basic | corresponding body has a content-type of "text/plain; | |||
| Content-transfer-encoding: base64 | charset=US-ASCII". | |||
| ... first half of encoded audio data goes here... | The only header fields that have defined meaning for body | |||
| ... second half of encoded audio data goes here... | parts are those the names of which begin with "Content-". All | |||
| other header fields are generally to be ignored in body parts. | ||||
| Although they should generally be retained in mail processing, | ||||
| they may be discarded by gateways if necessary. Such other | ||||
| fields are permitted to appear in body parts but must not be | ||||
| depended on. "X-" fields may be created for experimental or | ||||
| private purposes, with the recognition that the information | ||||
| they contain may be lost at some gateways. | ||||
| Note on encoding of MIME entities encapsulated inside | NOTE: The distinction between an RFC 822 message and a body | |||
| message/partial entities: Because data of type "message" | part is subtle, but important. A gateway between Internet and | |||
| may never be encoded in base64 or quoted-printable, a | X.400 mail, for example, must be able to tell the difference | |||
| problem might arise if message/partial entities are | between a body part that contains an image and a body part | |||
| constructed in an environment that supports binary or 8-bit | that contains an encapsulated message, the body of which is a | |||
| transport. The problem is that the binary data would be | GIF image. In order to represent the latter, the body part | |||
| split into multiple message/partial objects, each of them | must have "Content-Type: message/rfc822", and its body (after | |||
| requiring binary transport. If such objects were | the blank line) must be the encapsulated message, with its own | |||
| encountered at a gateway into a 7-bit transport environment, | "Content-Type: image/gif" header field. The use of similar | |||
| there would be no way to properly encode them for the 7-bit | syntax facilitates the conversion of messages to body parts, | |||
| world, aside from waiting for all of the parts, reassembling | and vice versa, but the distinction between the two must be | |||
| the message, and then encoding the reassembled data in | understood by implementors. (For the special case in which | |||
| base64 or quoted-printable. Since it is possible that | all parts actually are messages, a "digest" subtype is also | |||
| different parts might go through different gateways, even | defined.) | |||
| As stated previously, each body part is preceded by an | ||||
| encapsulation boundary. The encapsulation boundary MUST NOT | ||||
| appear inside any of the encapsulated parts. Thus, it is | ||||
| crucial that the composing agent be able to choose and specify | ||||
| a unique boundary that will separate the parts. | ||||
| this is not an acceptable solution. For this reason, it is | All present and future subtypes of the "multipart" type must | |||
| specified that MIME entities of type message/partial must | use an identical syntax. Subtypes may differ in their | |||
| always have a content-transfer-encoding of "7bit" (the | semantics, and may impose additional restrictions on syntax, | |||
| default). In particular, even in environments that support | but must conform to the required syntax for the multipart | |||
| binary or 8-bit transport, the use of a content-transfer- | type. This requirement ensures that all conformant user | |||
| encoding of "8bit" or "binary" is explicitly prohibited for | agents will at least be able to recognize and separate the | |||
| entities of type message/partial. | parts of any multipart entity, even of an unrecognized | |||
| subtype. | ||||
| It should be noted that, because some message transfer | As stated in the definition of the Content-Transfer-Encoding | |||
| agents may choose to automatically fragment large messages, | field, no encoding other than "7bit", "8bit", or "binary" is | |||
| and because such agents may use different fragmentation | permitted for entities of type "multipart". The multipart | |||
| thresholds, it is possible that the pieces of a partial | delimiters and header fields are always represented as 7-bit | |||
| message, upon reassembly, may prove themselves to comprise a | US-ASCII in any case (though the header fields may encode | |||
| partial message. This is explicitly permitted. | non-US-ASCII header text as per RFC MIME-HEADERS), and data | |||
| within the body parts can be encoded on a part-by-part basis, | ||||
| with Content-Transfer-Encoding fields for each appropriate | ||||
| body part. | ||||
| It should also be noted that the inclusion of a "References" | Message transport agents, relays, and gateways are commonly | |||
| field in the headers of the second and subsequent pieces of | known to alter the top-level header of an RFC 822 message. In | |||
| a fragmented message that references the Message-Id on the | particular, they frequently add, remove, or reorder header | |||
| previous piece may be of benefit to mail readers that | fields. Such alterations are explicitly forbidden for the | |||
| understand and track references. However, the generation of | headers of any body part which occurs within an enclosing | |||
| such "References" fields is entirely optional. | multipart body part. | |||
| Finally, it should be noted that the "Encrypted" header | 6.2.1.1. Common Syntax | |||
| field has been made obsolete by Privacy Enhanced Messaging | ||||
| (PEM), but the rules above are believed to describe the | ||||
| correct way to treat it if it is encountered in the context | ||||
| of conversion to and from message/partial fragments. | ||||
| 7.3.3 The Message/External-Body subtype | This section defines a common syntax for subtypes of | |||
| multipart. All subtypes of multipart must use this syntax. A | ||||
| simple example of a multipart message also appears in this | ||||
| section. An example of a more complex multipart message is | ||||
| given in Appendix C. | ||||
| The external-body subtype indicates that the actual body | The Content-Type field for multipart entities requires one | |||
| data are not included, but merely referenced. In this case, | parameter, "boundary", which is used to specify the | |||
| the parameters describe a mechanism for accessing the | encapsulation boundary. The encapsulation boundary is defined | |||
| external data. | as a line consisting entirely of two hyphen characters ("-", | |||
| decimal value 45) followed by the boundary parameter value | ||||
| from the Content-Type header field. | ||||
| When an entity is of type "message/external-body", it | NOTE: The hyphens are for rough compatibility with the | |||
| consists of a header, two consecutive CRLFs, and the message | earlier RFC 934 method of message encapsulation, and for ease | |||
| header for the encapsulated message. If another pair of | of searching for the boundaries in some implementations. | |||
| consecutive CRLFs appears, this of course ends the message | However, it should be noted that multipart messages are NOT | |||
| header for the encapsulated message. However, since the | completely compatible with RFC 934 encapsulations; in | |||
| encapsulated message's body is itself external, it does NOT | particular, they do not obey RFC 934 quoting conventions for | |||
| appear in the area that follows. For example, consider the | embedded lines that begin with hyphens. This mechanism was | |||
| following message: | chosen over the RFC 934 mechanism because the latter causes | |||
| lines to grow with each level of quoting. The combination of | ||||
| this growth with the fact that SMTP implementations sometimes | ||||
| wrap long lines made the RFC 934 mechanism unsuitable for use | ||||
| in the event that deeply-nested multipart structuring is ever | ||||
| desired. | ||||
| Content-type: message/external-body; | WARNING TO IMPLEMENTORS: The grammar for parameters on the | |||
| Content-type field is such that it is often necessary to | ||||
| enclose the boundaries in quotes on the Content-type line. | ||||
| This is not always necessary, but never hurts. Implementors | ||||
| should be sure to study the grammar carefully in order to | ||||
| avoid producing invalid Content-type fields. Thus, a typical | ||||
| multipart Content-Type header field might look like this: | ||||
| Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08j34c0p | ||||
| access-type=local-file; | But the following is not valid: | |||
| name="/u/nsb/Me.gif" | ||||
| Content-type: image/gif | Content-Type: multipart/mixed; boundary=gc0pJq0M:08jU534c0p | |||
| Content-ID: <id42@guppylake.bellcore.com> | ||||
| Content-Transfer-Encoding: binary | ||||
| THIS IS NOT REALLY THE BODY! | (because of the colon) and must instead be represented as | |||
| Content-Type: multipart/mixed; boundary="gc0pJq0M:08jU534c0p" | ||||
| The area at the end, which might be called the "phantom | This Content-Type value indicates that the content consists of | |||
| body", is ignored for most external-body messages. However, | one or more parts, each with a structure that is syntactically | |||
| it may be used to contain auxiliary information for some | identical to an RFC 822 message, except that the header area | |||
| such messages, as indeed it is when the access-type is | is allowed to be completely empty, and that the parts are each | |||
| "mail-server". Of the access-types defined by this | preceded by the line | |||
| document, the phantom body is used only when the access-type | ||||
| is "mail-server". In all other cases, the phantom body is | ||||
| ignored. | ||||
| The only always-mandatory parameter for message/external- | --gc0pJq0M:08jU534c0p | |||
| body is "access-type"; all of the other parameters may be | ||||
| mandatory or optional depending on the value of access-type. | ||||
| ACCESS-TYPE -- A case-insensitive word, indicating | The encapsulation boundary MUST occur at the beginning of a | |||
| the supported access mechanism by which the file | line, i.e., following a CRLF, and the initial CRLF is | |||
| or data may be obtained. Values include, but are | considered to be attached to the encapsulation boundary rather | |||
| not limited to, "FTP", "ANON-FTP", "TFTP", "AFS", | than part of the preceding part. The boundary must be | |||
| "LOCAL-FILE", and "MAIL-SERVER". Future values, | followed immediately either by another CRLF and the header | |||
| except for experimental values beginning with "X- | fields for the next part, or by two CRLFs, in which case there | |||
| ", must be registered with IANA, as described in | are no header fields for the next part (and it is therefore | |||
| Appendix E. | assumed to be of Content-Type text/plain). | |||
| In addition, the following three parameters are optional for | NOTE: The CRLF preceding the encapsulation line is | |||
| ALL access-types: | conceptually attached to the boundary so that it is possible | |||
| to have a part that does not end with a CRLF (line break). | ||||
| Body parts that must be considered to end with line breaks, | ||||
| therefore, must have two CRLFs preceding the encapsulation | ||||
| line, the first of which is part of the preceding body part, | ||||
| and the second of which is part of the encapsulation boundary. | ||||
| EXPIRATION -- The date (in the RFC 822 "date-time" | Encapsulation boundaries must not appear within the | |||
| syntax, as extended by RFC 1123 to permit 4 digits | encapsulations, and must be no longer than 70 characters, not | |||
| in the year field) after which the existence of | counting the two leading hyphens. | |||
| the external data is not guaranteed. | ||||
| SIZE -- The size (in octets) of the data. The | The encapsulation boundary following the last body part is a | |||
| intent of this parameter is to help the recipient | distinguished delimiter that indicates that no further body | |||
| decide whether or not to expend the necessary | parts will follow. Such a delimiter is identical to the | |||
| resources to retrieve the external data. Note | previous delimiters, with the addition of two more hyphens at | |||
| that this describes the size of the data in its | the end of the line: | |||
| canonical form, that is, before any Content- | ||||
| Transfer-Encoding has been applied or after the | ||||
| --gc0pJq0M:08jU534c0p-- | ||||
| data have been decoded. | There appears to be room for additional information prior to | |||
| the first encapsulation boundary and following the final | ||||
| boundary. These areas should generally be left blank, and | ||||
| implementations must ignore anything that appears before the | ||||
| first boundary or after the last one. | ||||
| PERMISSION -- A case-insensitive field that | NOTE: These "preamble" and "epilogue" areas are generally not | |||
| indicates whether or not it is expected that | used because of the lack of proper typing of these parts and | |||
| clients might also attempt to overwrite the data. | the lack of clear semantics for handling these areas at | |||
| By default, or if permission is "read", the | gateways, particularly X.400 gateways. However, rather than | |||
| assumption is that they are not, and that if the | leaving the preamble area blank, many MIME implementations | |||
| data is retrieved once, it is never needed again. | have found this to be a convenient place to insert an | |||
| If PERMISSION is "read-write", this assumption is | explanatory note for recipients who read the message with | |||
| invalid, and any local copy must be considered no | pre-MIME software, since such notes will be ignored by MIME- | |||
| more than a cache. "Read" and "Read-write" are | compliant software. | |||
| the only defined values of permission. | ||||
| The precise semantics of the access-types defined here are | NOTE: Because encapsulation boundaries must not appear in the | |||
| described in the sections that follow. | body parts being encapsulated, a user agent must exercise care | |||
| to choose a unique boundary. The boundary in the example | ||||
| above could have been the result of an algorithm designed to | ||||
| produce boundaries with a very low probability of already | ||||
| existing in the data to be encapsulated without having to | ||||
| prescan the data. Alternate algorithms might result in more | ||||
| "readable" boundaries for a recipient with an old user agent, | ||||
| but would require more attention to the possibility that the | ||||
| boundary might appear in the encapsulated part. The simplest | ||||
| boundary possible is something like "---", with a closing | ||||
| boundary of "-----". | ||||
| The encapsulated headers in ALL message/external-body | As a very simple example, the following multipart message has | |||
| entities MUST include a Content-ID header field to give a | two parts, both of them plain text, one of them explicitly | |||
| unique identifier by which to reference the data. This | typed and one of them implicitly typed: | |||
| identifier may be used for caching mechanisms, and for | ||||
| recognizing the receipt of the data when the access-type is | ||||
| "mail-server". | ||||
| Note that, as specified here, the tokens that describe | From: Nathaniel Borenstein <nsb@bellcore.com> | |||
| external-body data, such as file names and mail server | To: Ned Freed <ned@innosoft.com> | |||
| commands, are required to be in the US-ASCII character set. | Date: Sun, 21 Mar 1993 23:56:48 -0800 (PST) | |||
| If this proves problematic in practice, a new mechanism may | Subject: Sample message | |||
| be required as a future extension to MIME, either as newly | MIME-Version: 1.0 | |||
| defined access-types for message/external-body or by some | Content-type: multipart/mixed; boundary="simple boundary" | |||
| other mechanism. | ||||
| As with message/partial, it is specified that MIME entities | This is the preamble. It is to be ignored, though it | |||
| of type message/external-body must always have a content- | is a handy place for mail composers to include an | |||
| transfer-encoding of "7bit" (the default). In particular, | explanatory note to non-MIME conformant readers. | |||
| even in environments that support binary or 8-bit transport, | ||||
| the use of a content-transfer-encoding of "8bit" or "binary" | ||||
| is explicitly prohibited for entities of type | ||||
| message/external-body. | ||||
| 7.3.3.1 The "ftp" and "tftp" access-types | --simple boundary | |||
| An access-type of FTP or TFTP indicates that the message | This is implicitly typed plain US-ASCII text. | |||
| body is accessible as a file using the FTP [RFC-959] or TFTP | It does NOT end with a linebreak. | |||
| [RFC-783] protocols, respectively. For these access-types, | --simple boundary | |||
| the following additional parameters are mandatory: | Content-type: text/plain; charset=us-ascii | |||
| This is explicitly typed plain US-ASCII text. | ||||
| It DOES end with a linebreak. | ||||
| --simple boundary-- | ||||
| NAME -- The name of the file that contains the | This is the epilogue. It is also to be ignored. | |||
| actual body data. | ||||
| SITE -- A machine from which the file may be | The use of a Content-Type of multipart in a body part within | |||
| obtained, using the given protocol. This must be | another multipart entity is explicitly allowed. In such | |||
| a fully qualified domain name, not a nickname. | cases, for obvious reasons, care must be taken to ensure that | |||
| each nested multipart entity uses a different boundary | ||||
| delimiter. See Appendix C for an example of nested multipart | ||||
| entities. | ||||
| Before any data are retrieved, using FTP, the user will | The use of the multipart Content-Type with only a single body | |||
| generally need to be asked to provide a login id and a | part may be useful in certain contexts, and is explicitly | |||
| password for the machine named by the site parameter. For | permitted. | |||
| security reasons, such an id and password are not specified | ||||
| as content-type parameters, but must be obtained from the | ||||
| user. | ||||
| In addition, the following parameters are optional: | The only mandatory global parameter for the multipart | |||
| Content-Type is the boundary parameter, which consists of 1 to | ||||
| 70 characters from a set of characters known to be very robust | ||||
| through email gateways, and NOT ending with white space. (If a | ||||
| boundary appears to end with white space, the white space must | ||||
| be presumed to have been added by a gateway, and must be | ||||
| deleted.) It is formally specified by the following BNF: | ||||
| DIRECTORY -- A directory from which the data named | boundary := 0*69<bchars> bcharsnospace | |||
| by NAME should be retrieved. | ||||
| MODE -- A case-insensitive string indicating the | bchars := bcharsnospace / " " | |||
| mode to be used when retrieving the information. | ||||
| The legal values for access-type "TFTP" are | ||||
| "NETASCII", "OCTET", and "MAIL", as specified by | ||||
| the TFTP protocol [RFC-783]. The legal values for | ||||
| access-type "FTP" are "ASCII", "EBCDIC", "IMAGE", | ||||
| and "LOCALn" where "n" is a decimal integer, | ||||
| typically 8. These correspond to the | ||||
| representation types "A" "E" "I" and "L n" as | ||||
| specified by the FTP protocol [RFC-959]. Note | ||||
| that "BINARY" and "TENEX" are not valid values for | ||||
| MODE, but that "OCTET" or "IMAGE" or "LOCAL8" | ||||
| should be used instead. If MODE is not specified, | ||||
| the default value is "NETASCII" for TFTP and | ||||
| "ASCII" otherwise. | ||||
| 7.3.3.2 The "anon-ftp" access-type | bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / | |||
| "+" / "_" / "," / "-" / "." / | ||||
| "/" / ":" / "=" / "?" | ||||
| The "anon-ftp" access-type is identical to the "ftp" access | Overall, the body of a multipart entity may be specified as | |||
| type, except that the user need not be asked to provide a | follows: | |||
| name and password for the specified site. Instead, the ftp | ||||
| protocol will be used with login "anonymous" and a password | ||||
| that corresponds to the user's email address. | ||||
| 7.3.3.3 The "local-file" and "afs" access-types | dash-boundary := "--" boundary | |||
| ; boundary taken from Content-Type | ||||
| ; field. | ||||
| multipart-body := preamble dash-boundary | ||||
| [*LWSP-char] CRLF | ||||
| body-part *encapsulation | ||||
| close-delimiter [*LWSP-char] | ||||
| CRLF epilogue | ||||
| An access-type of "local-file" indicates that the actual | encapsulation := delimiter [*LWSP-char] | |||
| body is accessible as a file on the local machine. An | CRLF body-part | |||
| access-type of "afs" indicates that the file is accessible | ||||
| via the global AFS file system. In both cases, only a | ||||
| single parameter is required: | ||||
| NAME -- The name of the file that contains the | delimiter := CRLF dash-boundary | |||
| actual body data. | ||||
| The following optional parameter may be used to describe the | close-delimiter := CRLF dash-boundary "--" | |||
| locality of reference for the data, that is, the site or | ||||
| sites at which the file is expected to be visible: | ||||
| SITE -- A domain specifier for a machine or set of | preamble := discard-text | |||
| machines that are known to have access to the data | ||||
| file. Asterisks may be used for wildcard matching | ||||
| to a part of a domain name, such as | ||||
| "*.bellcore.com", to indicate a set of machines on | ||||
| which the data should be directly visible, while a | ||||
| single asterisk may be used to indicate a file | ||||
| that is expected to be universally available, | ||||
| e.g., via a global file system. | ||||
| 7.3.3.4 The "mail-server" access-type | epilogue := discard-text | |||
| The "mail-server" access-type indicates that the actual body | discard-text := *text *(*text CRLF) | |||
| is available from a mail server. The mandatory parameter | ; To be ignored upon receipt. | |||
| for this access-type is: | ||||
| SERVER -- The email address of the mail server | body-part := <"message" as defined in RFC 822, with all | |||
| from which the actual body data can be obtained. | header fields optional, not starting with the | |||
| specified dash-boundary, and with the | ||||
| delimiter not occurring anywhere in the | ||||
| message body. Note that the semantics of a | ||||
| part differ from the semantics of a message, | ||||
| as described in the text.> | ||||
| Because mail servers accept a variety of syntaxes, some of | IMPORTANT NOTE: The addition of LWSP between the elements | |||
| which are multiline, the full command to be sent to a mail | shown in this BNF is NOT allowed since this BNF does not | |||
| server is not included as a parameter on the content-type | specify a structured header field. | |||
| line. Instead, it is provided as the "phantom body" when | ||||
| the content-type is message/external-body and the access- | ||||
| type is mail-server. | ||||
| An optional parameter for this access-type is: | NOTE: In certain transport enclaves, RFC 822 restrictions | |||
| such as the one that limits bodies to printable US-ASCII | ||||
| characters may not be in force. (That is, the transport | ||||
| domains may resemble standard Internet mail transport as | ||||
| specified in RFC 821 and assumed by RFC 822, but without | ||||
| certain restrictions.) The relaxation of these restrictions | ||||
| should be construed as locally extending the definition of | ||||
| bodies, for example to include octets outside of the US-ASCII | ||||
| range, as long as these extensions are supported by the | ||||
| transport and adequately documented in the Content-Transfer- | ||||
| Encoding header field. However, in no event are headers | ||||
| (either message headers or body-part headers) allowed to | ||||
| contain anything other than US-ASCII characters. | ||||
| SUBJECT -- The subject that is to be used in the | NOTE: Conspicuously missing from the multipart type is a | |||
| mail that is sent to obtain the data. Note that | notion of structured, related body parts. In general, it | |||
| keying mail servers on Subject lines is NOT | seems premature to try to standardize interpart structure yet. | |||
| recommended, but such mail servers are known to | It is recommended that those wishing to provide a more | |||
| exist. | structured or integrated multipart messaging facility should | |||
| define a subtype of multipart that is syntactically identical, | ||||
| but that always expects the inclusion of a distinguished part | ||||
| that can be used to specify the structure and integration of | ||||
| the other parts, probably referring to them by their Content- | ||||
| ID field. If this approach is used, other implementations | ||||
| will not recognize the new subtype, but will treat it as the | ||||
| primary subtype (multipart/mixed) and will thus be able to | ||||
| show the user the parts that are recognized. | ||||
| Expires 11/20/94 draft-ietf-822-mime-00.txt May 1994 | 6.2.1.2. Handling Nested Messages and Multiparts | |||
| Note that MIME does not define a mail server syntax. | The "message/rfc822" subtype defined in a subsequent section | |||
| Rather, it allows the inclusion of arbitrary mail server | of this document has no terminating condition other than | |||
| commands in the phantom body. Implementations must include | running out of data. Similarly, an improperly truncated | |||
| the phantom body in the body of the message they send to the | multipart object may not have any terminating boundary marker, | |||
| mail server address to retrieve the relevant data. | and such objects do arise in practice due to mail system malfunctions. | |||
| It is worth noting that, unlike other access-types, mail- | It is essential that such objects be handled correctly when | |||
| server access is asynchronous and will happen at an | they are themselves embedded inside another multipart | |||
| unpredictable time in the future. For this reason, it is | structure. MIME implementations are therefore required to | |||
| important that there be a mechanism by which the returned | recognize outer level boundary markers at ANY level of inner | |||
| data can be matched up with the original message/external- | nesting. It is not sufficient to only check for the next | |||
| body entity. MIME mailservers must use the same Content-ID | expected marker or other terminating condition. | |||
| field on the returned message that was used in the original | ||||
| message/external-body entity, to facilitate such matching. | ||||
| 7.3.3.5 Examples and Further Explanations | 6.2.1.3. Mixed Subtype | |||
| With the emerging possibility of very wide-area file | The "mixed" subtype of multipart is intended for use when the | |||
| systems, it becomes very hard to know in advance the set of | body parts are independent and need to be bundled in a | |||
| machines where a file will and will not be accessible | particular order. Any multipart subtypes that an | |||
| directly from the file system. Therefore it may make sense | implementation does not recognize must be treated as being of | |||
| to provide both a file name, to be tried directly, and the | subtype "mixed". | |||
| name of one or more sites from which the file is known to be | ||||
| accessible. An implementation can try to retrieve remote | ||||
| files using FTP or any other protocol, using anonymous file | ||||
| retrieval or prompting the user for the necessary name and | ||||
| password. If an external body is accessible via multiple | ||||
| mechanisms, the sender may include multiple parts of type | ||||
| message/external-body within an entity of type | ||||
| multipart/alternative. | ||||
| However, the external-body mechanism is not intended to be | 6.2.1.4. Alternative Subtype | |||
| limited to file retrieval, as shown by the mail-server | ||||
| access-type. Beyond this, one can imagine, for example, | ||||
| using a video server for external references to video clips. | ||||
| If an entity is of type "message/external-body", then the | The multipart/alternative type is syntactically identical to | |||
| body of the entity will contain the header fields of the | multipart/mixed, but the semantics are different. In | |||
| encapsulated message. The body itself is to be found in the | particular, each of the parts is an "alternative" version of | |||
| external location. This means that if the body of the | the same information. | |||
| "message/external-body" message contains two consecutive | ||||
| CRLFs, everything after those pairs is NOT part of the | ||||
| message itself. For most message/external-body messages, | ||||
| this trailing area must simply be ignored. However, it is a | ||||
| convenient place for additional data that cannot be included | ||||
| Systems should recognize that the content of the various parts | ||||
| is interchangeable. Systems should choose the "best" type | ||||
| based on the local environment and preferences, in some cases | ||||
| even through user interaction. As with multipart/mixed, the | ||||
| order of body parts is significant. In this case, the | ||||
| alternatives appear in an order of increasing faithfulness to | ||||
| the original content. In general, the best choice is the LAST | ||||
| part of a type supported by the recipient system's local | ||||
| environment. | ||||
| in the content-type header field. In particular, if the | Multipart/alternative may be used, for example, to send mail | |||
| "access-type" value is "mail-server", then the trailing area | in a fancy text format in such a way that it can easily be | |||
| must contain commands to be sent to the mail server at the | displayed anywhere: | |||
| address given by the value of the SERVER parameter. | ||||
| The embedded message header fields which appear in the body | From: Nathaniel Borenstein <nsb@bellcore.com> | |||
| of the message/external-body data must be used to declare | To: Ned Freed <ned@innosoft.com> | |||
| the Content-type of the external body if it is anything | Date: Mon, 22 Mar 1993 09:41:09 -0800 (PST) | |||
| other than plain ASCII text, since the external body does | Subject: Formatted text mail | |||
| not have a header section to declare its type. Similarly, | MIME-Version: 1.0 | |||
| any Content-transfer-encoding other than "7bit" must also be | Content-Type: multipart/alternative; boundary=boundary42 | |||
| declared here. Thus a complete message/external-body | ||||
| message, referring to a document in PostScript format, might | ||||
| look like this: | ||||
| From: Whomever | --boundary42 | |||
| To: Someone | Content-Type: text/plain; charset=us-ascii | |||
| Subject: whatever | ||||
| MIME-Version: 1.0 | ||||
| Message-ID: <id1@host.com> | ||||
| Content-Type: multipart/alternative; boundary=42 | ||||
| Content-ID: <id001@guppylake.bellcore.com> | ||||
| --42 | ... plain text version of message goes here ... | |||
| Content-Type: message/external-body; | ||||
| name="BodyFormats.ps"; | ||||
| site="thumper.bellcore.com"; | ||||
| access-type=ANON-FTP; | ||||
| directory="pub"; | ||||
| mode="image"; | ||||
| expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" | ||||
| Content-type: application/postscript | --boundary42 | |||
| Content-ID: <id42@guppylake.bellcore.com> | Content-Type: text/enriched | |||
| --42 | ... RFC 1563 text/enriched version of same message | |||
| Content-Type: message/external-body; | goes here ... | |||
| name="/u/nsb/writing/rfcs/RFC-MIME.ps"; | ||||
| site="thumper.bellcore.com"; | ||||
| access-type=AFS; | ||||
| expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" | ||||
| Content-type: application/postscript | --boundary42 | |||
| Content-ID: <id42@guppylake.bellcore.com> | Content-Type: application/x-whatever | |||
| ... fanciest version of same message goes here ... | ||||
| --42 | --boundary42-- | |||
| Content-Type: message/external-body; | ||||
| access-type=mail-server; | ||||
| server="listserv@bogus.bitnet"; | ||||
| expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" | ||||
| Content-type: application/postscript | In this example, users whose mail system understood the | |||
| Content-ID: <id42@guppylake.bellcore.com> | "application/x-whatever" format would see only the fancy | |||
| version, while other users would see only the enriched or | ||||
| plain text version, depending on the capabilities of their | ||||
| system. | ||||
| get RFC-MIME.DOC | In general, user agents that compose multipart/alternative | |||
| entities must place the body parts in increasing order of | ||||
| preference, that is, with the preferred format last. For | ||||
| fancy text, the sending user agent should put the plainest | ||||
| format first and the richest format last. Receiving user | ||||
| agents should pick and display the last format they are | ||||
| capable of displaying. In the case where one of the | ||||
| alternatives is itself of type "multipart" and contains | ||||
| unrecognized sub-parts, the user agent may choose either to | ||||
| show that alternative, an earlier alternative, or both. | ||||
| --42-- | NOTE: From an implementor's perspective, it might seem more | |||
| sensible to reverse this ordering, and have the plainest | ||||
| alternative last. However, placing the plainest alternative | ||||
| first is the friendliest possible option when | ||||
| multipart/alternative entities are viewed using a non-MIME- | ||||
| conformant mail reader. While this approach does impose some | ||||
| burden on conformant mail readers, interoperability with older | ||||
| mail readers was deemed to be more important in this case. | ||||
| Note that in the above examples, the default Content- | It may be the case that some user agents, if they can | |||
| transfer-encoding of "7bit" is assumed for the external | recognize more than one of the formats, will prefer to offer | |||
| PostScript data. | the user the choice of which format to view. This makes | |||
| sense, for example, if mail includes both a nicely-formatted | ||||
| image version and an easily-edited text version. What is most | ||||
| critical, however, is that the user not automatically be shown | ||||
| multiple versions of the same data. Either the user should be | ||||
| shown the last recognized version or should be given the | ||||
| choice. | ||||
| Like the message/partial type, the message/external-body | NOTE ON THE SEMANTICS OF CONTENT-ID IN MULTIPART/ALTERNATIVE: | |||
| type is intended to be transparent, that is, to convey the | Each part of a multipart/alternative entity represents the | |||
| data type in the external body rather than to convey a | same data, but the mappings between the two are not | |||
| message with a body of that type. Thus the headers on the | necessarily without information loss. For example, | |||
| outer and inner parts must be merged using the same rules as | information is lost when translating ODA to PostScript or | |||
| for message/partial. In particular, this means that the | plain text. It is recommended that each part should have a | |||
| Content-type header is overridden, but the From and Subject | different Content-ID value in the case where the information | |||
| headers are preserved. | content of the two parts is not identical. And when the | |||
| information content is identical -- for example, where several | ||||
| parts of type "message/external-body" specify alternate ways | ||||
| to access the identical data -- the same Content-ID field | ||||
| value should be used, to optimize any caching mechanisms that | ||||
| might be present on the recipient's end. However, the | ||||
| Content-ID values used by the parts should NOT be the same | ||||
| Content-ID value that describes the multipart/alternative as a | ||||
| whole, if there is any such Content-ID field. That is, one | ||||
| Content-ID value will refer to the multipart/alternative | ||||
| entity, while one or more other Content-ID values will refer | ||||
| to the parts inside it. | ||||
| Note that since the external bodies are not transported as | 6.2.1.5. Digest Subtype | |||
| mail, they need not conform to the 7-bit and line length | ||||
| requirements, but might in fact be binary files. Thus a | ||||
| Content-Transfer-Encoding is not generally necessary, though | ||||
| it is permitted. | ||||
| Note that the body of a message of type "message/external- | This document defines a "digest" subtype of the multipart | |||
| body" is governed by the basic syntax for an RFC 822 | Content-Type. This type is syntactically identical to | |||
| message. In particular, anything before the first | multipart/mixed, but the semantics are different. In | |||
| consecutive pair of CRLFs is header information, while | particular, in a digest, the default Content-Type value for a | |||
| anything after it is body information, which is ignored for | body part is changed from "text/plain" to "message/rfc822". | |||
| most access-types. | This is done to allow a more readable digest format that is | |||
| largely compatible (except for the quoting convention) with | ||||
| RFC 934. | ||||
| The formal grammar for content-type header fields for data | A digest in this format might, then, look something like this: | |||
| of type message is given by: | ||||
| message-type := "message" "/" message-subtype | From: Moderator-Address | |||
| To: Recipient-List | ||||
| Date: Mon, 22 Mar 1994 13:34:51 +0000 | ||||
| Subject: Internet Digest, volume 42 | ||||
| MIME-Version: 1.0 | ||||
| Content-Type: multipart/digest; | ||||
| boundary="---- next message ----" | ||||
| message-subtype := "rfc822" | ------ next message ---- | |||
| From: someone-else | ||||
| Date: Fri, 26 Mar 1993 11:13:32 +0200 | ||||
| Subject: my opinion | ||||
| / "partial" 2#3partial-param | ...body goes here ... | |||
| / "external-body" 1*external-param | ||||
| / extension-token | ||||
| partial-param := (";" "id" "=" value) | ------ next message ---- | |||
| / (";" "number" "=" 1*DIGIT) | ||||
| / (";" "total" "=" 1*DIGIT) | ||||
| ; id & number required; total required for last part | ||||
| external-param := (";" "access-type" "=" atype) | From: someone-else-again | |||
| / (";" "expiration" "=" date-time) | Date: Fri, 26 Mar 1993 10:07:13 -0500 | |||
| ; Note that date-time is quoted | Subject: my different opinion | |||
| / (";" "size" "=" 1*DIGIT) | ||||
| / (";" "permission" "=" ("read" / "read-write")) | ||||
| ; Permission is case-insensitive | ||||
| / (";" "name" "=" value) | ||||
| / (";" "site" "=" value) | ||||
| / (";" "dir" "=" value) | ||||
| / (";" "mode" "=" value) | ||||
| / (";" "server" "=" value) | ||||
| / (";" "subject" "=" value) | ||||
| ; access-type required; others required based on access-type | ||||
| atype := "ftp" / "anon-ftp" / "tftp" / "local-file" | ... another body goes here ... | |||
| / "afs" / "mail-server" / extension-token | ||||
| ; Case-insensitive | ||||
| ------ next message ------ | ||||
| 7.4 The Application Content-Type | 6.2.1.6. Parallel Subtype | |||
| The "application" Content-Type is to be used for data which | This document defines a "parallel" subtype of the multipart | |||
| do not fit in any of the other categories, and particularly | Content-Type. This type is syntactically identical to | |||
| for data to be processed by mail-based uses of application | multipart/mixed, but the semantics are different. In | |||
| programs. This is information which must be processed by an | particular, in a parallel entity, the order of body parts is | |||
| application before it is viewable or usable to a user. | not significant. | |||
| Expected uses for Content-Type application include mail- | ||||
| based file transfer, spreadsheets, data for mail-based | ||||
| scheduling systems, and languages for "active" | ||||
| (computational) email. (The latter, in particular, can pose | ||||
| security problems which must be understood by implementors, | ||||
| and are considered in detail in the discussion of the | ||||
| application/PostScript content-type.) | ||||
| For example, a meeting scheduler might define a standard | A common presentation of this type is to display all of the | |||
| representation for information about proposed meeting dates. | parts simultaneously on hardware and software that are capable | |||
| An intelligent user agent would use this information to | of doing so. However, composing agents should be aware that | |||
| conduct a dialog with the user, and might then send further | many mail readers will lack this capability and will show the | |||
| mail based on that dialog. More generally, there have been | parts serially in any event. | |||
| several "active" messaging languages developed in which | ||||
| programs in a suitably specialized language are sent through | ||||
| the mail and automatically run in the recipient's | ||||
| environment. | ||||
| Such applications may be defined as subtypes of the | 6.2.1.7. Other Multipart Subtypes | |||
| "application" Content-Type. This document defines two | ||||
| subtypes: octet-stream, and PostScript. | ||||
| In general, the subtype of application will often be the | Other multipart subtypes are expected in the future. MIME | |||
| name of the application for which the data are intended. | implementations must in general treat unrecognized subtypes of | |||
| This does not mean, however, that any application program | multipart as being equivalent to "multipart/mixed". | |||
| name may be used freely as a subtype of application. Such | ||||
| usages (other than subtypes beginning with "x-") must be | ||||
| registered with IANA, as described in Appendix E. | ||||
| 7.4.1 The Application/Octet-Stream (primary) subtype | 6.2.2. Message Content-Type | |||
| The primary subtype of application, "octet-stream", may be | It is frequently desirable, in sending mail, to encapsulate | |||
| used to indicate that a body contains binary data. The set | another mail message. A special Content-Type, "message", is | |||
| of possible parameters includes, but is not limited to: | defined to facilitate this. In particular, the "rfc822" | |||
| subtype of "message" is used to encapsulate RFC 822 messages. | ||||
| TYPE -- the general type or category of binary | NOTE: It has been suggested that subtypes of message might be | |||
| data. This is intended as information for the | defined for forwarded or rejected messages. However, | |||
| human recipient rather than for any automatic | forwarded and rejected messages can be handled as multipart | |||
| processing. | messages in which the first part contains any control or | |||
| descriptive information, and a second part, of type | ||||
| message/rfc822, is the forwarded or rejected message. | ||||
| Composing rejection and forwarding messages in this manner | ||||
| will preserve the type information on the original message and | ||||
| allow it to be correctly presented to the recipient, and hence | ||||
| is strongly encouraged. | ||||
| Subtypes of message often impose restrictions on what | ||||
| encodings are allowed. These restrictions are described in | ||||
| conjunction with each specific subtype. | ||||
| PADDING -- the number of bits of padding that were | Mail gateways, relays, and other mail handling agents are | |||
| appended to the bit-stream comprising the actual | commonly known to alter the top-level header of an RFC 822 | |||
| contents to produce the enclosed byte-oriented | message. In particular, they frequently add, remove, or | |||
| data. This is useful for enclosing a bit-stream | reorder header fields. Such alterations are explicitly | |||
| in a body when the total number of bits is not a | forbidden for the encapsulated headers embedded in the bodies | |||
| multiple of the byte size. | of messages of type "message." | |||
| An additional parameter, "conversions", was defined in | 6.2.2.1. RFC822 Subtype | |||
| [RFC-1341] but has been removed. | ||||
| RFC 1341 also defined the use of a "NAME" parameter which | A Content-Type of "message/rfc822" indicates that the body | |||
| gave a suggested file name to be used if the data were to be | contains an encapsulated message, with the syntax of an RFC | |||
| written to a file. This has been deprecated in anticipation | 822 message. However, unlike top-level RFC 822 messages, the | |||
| of a separate Content-Disposition header field, to be | restriction that each message/rfc822 body must include a | |||
| defined in a subsequent RFC. | "From", "Date", and at least one destination header is removed | |||
| and replaced with the requirement that at least one of "From", | ||||
| "Subject", or "Date" must be present. | ||||
| The recommended action for an implementation that receives | No encoding other than "7bit", "8bit", or "binary" is | |||
| application/octet-stream mail is to simply offer to put the | permitted for parts of type "message/rfc822". The message | |||
| data in a file, with any Content-Transfer-Encoding undone, | header fields are always US-ASCII in any case, and data within | |||
| or perhaps to use it as input to a user-specified process. | the body can still be encoded, in which case the Content- | |||
| Transfer-Encoding header field in the encapsulated message | ||||
| will reflect this. Non-US-ASCII text in the headers of an | ||||
| encapsulated message can be specified using the mechanisms | ||||
| described in RFC MIME-HEADERS. | ||||
| To reduce the danger of transmitting rogue programs through | It should be noted that, despite the use of the numbers "822", | |||
| the mail, it is strongly recommended that implementations | a message/rfc822 entity can include enhanced information as | |||
| NOT implement a path-search mechanism whereby an arbitrary | defined in this document. In other words, a message/rfc822 | |||
| program named in the Content-Type parameter (e.g., an | message may be a MIME message. | |||
| "interpreter=" parameter) is found and executed using the | ||||
| mail body as input. | ||||
| 7.4.2 The Application/PostScript subtype | 6.2.2.2. Partial Subtype | |||
| A Content-Type of "application/postscript" indicates a | The "partial" subtype is defined to allow large entities to be | |||
| PostScript program. Currently two variants of the | delivered as several separate pieces of mail and automatically | |||
| PostScript language are allowed; the original level 1 | reassembled by the receiving user agent. (The concept is | |||
| variant is described in [POSTSCRIPT] and the more recent | similar to IP fragmentation and reassembly in the basic | |||
| level 2 variant is described in [POSTSCRIPT2]. | Internet Protocols.) This mechanism can be used when | |||
| intermediate transport agents limit the size of individual | ||||
| messages that can be sent. Content-Type "message/partial" | ||||
| thus indicates that the body contains a fragment of a larger | ||||
| message. | ||||
| PostScript is a registered trademark of Adobe Systems, Inc. | Three parameters must be specified in the Content-Type field | |||
| Use of the MIME content-type "application/postscript" | of type message/partial: The first, "id", is a unique | |||
| implies recognition of that trademark and all the rights it | identifier, as close to a world-unique identifier as possible, | |||
| entails. | to be used to match the parts together. (In general, the | |||
| identifier is essentially a message-id; if placed in double | ||||
| quotes, it can be ANY message-id, in accordance with the BNF | ||||
| for "parameter" given earlier in this specification.) The | ||||
| second, "number", an integer, is the part number, which | ||||
| indicates where this part fits into the sequence of fragments. | ||||
| The third, "total", another integer, is the total number of | ||||
| parts. This third subfield is required on the final part, and | ||||
| is optional (though encouraged) on the earlier parts. Note | ||||
| also that these parameters may be given in any order. | ||||
| The PostScript language definition provides facilities for | Thus, part 2 of a 3-part message may have either of the | |||
| internal labeling of the specific language features a given | following header fields: | |||
| program uses. This labeling, called the PostScript document | ||||
| structuring conventions, is very general and provides | ||||
| substantially more information than just the language level. | ||||
| Content-Type: Message/Partial; number=2; total=3; | ||||
| id="oc=jpbe0M2Yt4s@thumper.bellcore.com" | ||||
| The use of document structuring conventions, while not | Content-Type: Message/Partial; | |||
| required, is strongly recommended as an aid to | id="oc=jpbe0M2Yt4s@thumper.bellcore.com"; | |||
| interoperability. Documents which lack proper structuring | number=2 | |||
| conventions cannot be tested to see whether or not they will | ||||
| work in a given environment. As such, some systems may | ||||
| assume the worst and refuse to process unstructured | ||||
| documents. | ||||
| The execution of general-purpose PostScript interpreters | But part 3 MUST specify the total number of parts: | |||
| entails serious security risks, and implementors are | ||||
| discouraged from simply sending PostScript email bodies to | ||||
| "off-the-shelf" interpreters. While it is usually safe to | ||||
| send PostScript to a printer, where the potential for harm | ||||
| is greatly constrained, implementors should consider all of | ||||
| the following before they add interactive display of | ||||
| PostScript bodies to their mail readers. | ||||
| The remainder of this section outlines some, though probably | Content-Type: Message/Partial; number=3; total=3; | |||
| not all, of the possible problems with sending PostScript | id="oc=jpbe0M2Yt4s@thumper.bellcore.com" | |||
| through the mail. | ||||
| Dangerous operations in the PostScript language include, but | Note that part numbering begins with 1, not 0. | |||
| may not be limited to, the PostScript operators deletefile, | ||||
| renamefile, filenameforall, and file. File is only | ||||
| dangerous when applied to something other than standard | ||||
| input or output. Implementations may also define additional | ||||
| nonstandard file operators; these may also pose a threat to | ||||
| security. Filenameforall, the wildcard file search | ||||
| operator, may appear at first glance to be harmless. Note, | ||||
| however, that this operator has the potential to reveal | ||||
| information about what files the recipient has access to, | ||||
| and this information may itself be sensitive. Message | ||||
| senders should avoid the use of potentially dangerous file | ||||
| operators, since these operators are quite likely to be | ||||
| unavailable in secure PostScript implementations. Message- | ||||
| receiving and -displaying software should either completely | ||||
| disable all potentially dangerous file operators or take | ||||
| special care not to delegate any special authority to their | ||||
| operation. These operators should be viewed as being done by | ||||
| an outside agency when interpreting PostScript documents. | ||||
| Such disabling and/or checking should be done completely | ||||
| outside of the reach of the PostScript language itself; care | ||||
| should be taken to ensure that no method exists for | ||||
| re-enabling full-function versions of these operators. | ||||
| When the parts of a message broken up in this manner are put | ||||
| together, the result is a complete MIME entity, which may have | ||||
| its own Content-Type header field, and thus may contain any | ||||
| other data type. | ||||
| The PostScript language provides facilities for exiting the | 6.2.2.2.1. Message Fragmentation and Reassembly | |||
| normal interpreter, or server, loop. Changes made in this | ||||
| "outer" environment are customarily retained across | ||||
| documents, and may in some cases be retained semipermanently | ||||
| in nonvolatile memory. The operators associated with | ||||
| exiting the interpreter loop have the potential to interfere | ||||
| with subsequent document processing. As such, their | ||||
| unrestrained use constitutes a threat of service denial. | ||||
| PostScript operators that exit the interpreter loop include, | ||||
| but may not be limited to, the exitserver and startjob | ||||
| operators. Message-sending software should not generate | ||||
| PostScript that depends on exiting the interpreter loop to | ||||
| operate. The ability to exit will probably be unavailable in | ||||
| secure PostScript implementations. Message-receiving and | ||||
| -displaying software should, if possible, disable the | ||||
| ability to make retained changes to the PostScript | ||||
| environment, and eliminate the startjob and exitserver | ||||
| commands. If these commands cannot be eliminated, the | ||||
| password associated with them should at least be set to a | ||||
| hard-to-guess value. | ||||
| PostScript provides operators for setting system-wide and | The semantics of a reassembled partial message must be those | |||
| device-specific parameters. These parameter settings may be | of the "inner" message, rather than of a message containing | |||
| retained across jobs and may potentially pose a threat to | the inner message. This makes it possible, for example, to | |||
| the correct operation of the interpreter. The PostScript | send a large audio message as several partial messages, and | |||
| operators that set system and device parameters include, but | still have it appear to the recipient as a simple audio | |||
| may not be limited to, the setsystemparams and setdevparams | message rather than as an encapsulated message containing an | |||
| operators. Message-sending software should not generate | audio message. That is, the encapsulation of the message is | |||
| PostScript that depends on the setting of system or device | considered to be "transparent". | |||
| parameters to operate correctly. The ability to set these | ||||
| parameters will probably be unavailable in secure PostScript | ||||
| implementations. Message-receiving and -displaying software | ||||
| should, if possible, disable the ability to change system | ||||
| and device parameters. If these operators cannot be | ||||
| disabled, the password associated with them should at least | ||||
| be set to a hard-to-guess value. | ||||
| Some PostScript implementations provide nonstandard | When generating and reassembling the parts of a | |||
| facilities for the direct loading and execution of machine | message/partial message, the headers of the encapsulated | |||
| code. Such facilities are quite obviously open to | message must be merged with the headers of the enclosing | |||
| substantial abuse. Message-sending software should not | entities. In this process the following rules must be | |||
| make use of such features. Besides being totally hardware- | observed: | |||
| specific, they are also likely to be unavailable in secure | ||||
| implementations of PostScript. Message-receiving and | ||||
| -displaying software should not allow such operators to be | ||||
| used if they exist. | ||||
| (1) All of the header fields from the initial enclosing | ||||
| entity (part one), except those that start with | ||||
| "Content-" and the specific header fields "Subject", | ||||
| "Message-ID", "Encrypted", and "MIME-Version", must be | ||||
| copied, in order, to the new message. | ||||
| PostScript is an extensible language, and many, if not most, | (2) Only those header fields in the enclosed message which | |||
| implementations of it provide a number of their own | start with "Content-" and "Subject", "Message-ID", | |||
| extensions. This document does not deal with such extensions | "Encrypted", and "MIME-Version" must be appended, in | |||
| explicitly since they constitute an unknown factor. | order, to the header fields of the new message. Any | |||
| Message-sending software should not make use of nonstandard | header fields in the enclosed message which do not | |||
| extensions; they are likely to be missing from some | start with "Content-" (except for "Subject", "Message-ID", | |||
| implementations. Message-receiving and -displaying software | "Encrypted", and "MIME-Version") will be ignored. | |||
| should make sure that any nonstandard PostScript operators | ||||
| are secure and don't present any kind of threat. | ||||
| It is possible to write PostScript that consumes huge | (3) All of the header fields from the second and any | |||
| amounts of various system resources. It is also possible to | subsequent messages will be ignored. | |||
| write PostScript programs that loop infinitely. Both types | ||||
| of programs have the potential to cause damage if sent to | ||||
| unsuspecting recipients. Message-sending software should | ||||
| avoid the construction and dissemination of such programs, | ||||
| which is antisocial. Message-receiving and -displaying | ||||
| software should provide appropriate mechanisms to abort | ||||
| processing of a document after a reasonable amount of time | ||||
| has elapsed. In addition, PostScript interpreters should be | ||||
| limited to the consumption of only a reasonable amount of | ||||
| any given system resource. | ||||
| It is possible to include raw binary information inside | 6.2.2.2.2. Fragmentation and Reassembly Example | |||
| PostScript in various forms. This is not recommended for | ||||
| use in email, both because it is not supported by all | ||||
| PostScript interpreters and because it significantly | ||||
| complicates the use of a MIME Content-Transfer-Encoding. | ||||
| (Without such binary, PostScript may typically be viewed as | ||||
| line-oriented data. The treatment of CRLF sequences becomes | ||||
| extremely problematic if binary and line-oriented data are | ||||
| mixed in a single PostScript data stream.) | ||||
| Finally, bugs may exist in some PostScript interpreters | If an audio message is broken into two parts, the first part | |||
| which could possibly be exploited to gain unauthorized | might look something like this: | |||
| access to a recipient's system. Apart from noting this | ||||
| possibility, there is no specific action to take to prevent | ||||
| this, apart from the timely correction of such bugs if any | ||||
| are found. | ||||
| 7.4.3 Other Application subtypes | X-Weird-Header-1: Foo | |||
| From: Bill@host.com | ||||
| To: joe@otherhost.com | ||||
| Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST) | ||||
| Subject: Audio mail (part 1 of 2) | ||||
| Message-ID: <id1@host.com> | ||||
| MIME-Version: 1.0 | ||||
| Content-type: message/partial; id="ABC@host.com"; | ||||
| number=1; total=2 | ||||
| It is expected that many other subtypes of application will | X-Weird-Header-1: Bar | |||
| be defined in the future. MIME implementations must | X-Weird-Header-2: Hello | |||
| generally treat any unrecognized subtypes as being | Message-ID: <anotherid@foo.com> | |||
| equivalent to application/octet-stream. | Subject: Audio mail | |||
| MIME-Version: 1.0 | ||||
| Content-type: audio/basic | ||||
| Content-transfer-encoding: base64 | ||||
| Expires 11/20/94 draft-ietf-822-mime-00.txt May 1994 | ... first half of encoded audio data goes here ... | |||
| The formal grammar for content-type header fields for | and the second half might look something like this: | |||
| application data is given by: | ||||
| application-type := "application" "/" application-subtype | From: Bill@host.com | |||
| To: joe@otherhost.com | ||||
| Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST) | ||||
| Subject: Audio mail (part 2 of 2) | ||||
| MIME-Version: 1.0 | ||||
| Message-ID: <id2@host.com> | ||||
| Content-type: message/partial; | ||||
| id="ABC@host.com"; number=2; total=2 | ||||
| application-subtype := ("octet-stream" *stream-param) | ... second half of encoded audio data goes here ... | |||
| / "postscript" / extension-token | ||||
| stream-param := (";" "type" "=" value) | Then, when the fragmented message is reassembled, the | |||
| / (";" "padding" "=" padding) | resulting message to be displayed to the user should look | |||
| something like this: | ||||
| padding := "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" | X-Weird-Header-1: Foo | |||
| From: Bill@host.com | ||||
| To: joe@otherhost.com | ||||
| Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST) | ||||
| Subject: Audio mail | ||||
| Message-ID: <anotherid@foo.com> | ||||
| MIME-Version: 1.0 | ||||
| Content-type: audio/basic | ||||
| Content-transfer-encoding: base64 | ||||
| 7.5 The Image Content-Type | ... first half of encoded audio data goes here ... | |||
| ... second half of encoded audio data goes here ... | ||||
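The header-merging rules for message/partial reassembly can be sketched in Python as follows. This is only an illustration under stated assumptions, not a reference implementation: the names SPECIAL_FIELDS, is_inner_field and merge_headers are hypothetical, headers are modelled as simple (name, value) pairs, and "Subject" is taken from the enclosed message, as the example above does. The fields taken from the enclosed message keep the order they had there, so the relative order of "Subject" and "Message-ID" may differ from the illustrative listing above.

```python
# Hypothetical sketch of the message/partial header-merging rules.
# Fields that are always taken from the enclosed message, in addition
# to every field whose name starts with "Content-".
SPECIAL_FIELDS = {"subject", "message-id", "encrypted", "mime-version"}


def is_inner_field(name):
    """Return True if the field comes from the enclosed message."""
    lowered = name.lower()
    return lowered.startswith("content-") or lowered in SPECIAL_FIELDS


def merge_headers(outer, inner):
    """Merge header lists of (name, value) pairs.

    outer: header fields of the first enclosing entity (part one)
    inner: header fields of the message enclosed in part one
    Header fields of the second and later parts are never consulted.
    """
    # Rule (1): copy, in order, the enclosing fields that are not overridden.
    merged = [(n, v) for n, v in outer if not is_inner_field(n)]
    # Rule (2): append, in order, the overriding fields of the enclosed
    # message; all its other fields are ignored.
    merged += [(n, v) for n, v in inner if is_inner_field(n)]
    return merged
```

Applied to the two-part audio example, the sketch keeps "X-Weird-Header-1: Foo" from part one and drops "X-Weird-Header-1: Bar" and "X-Weird-Header-2: Hello" from the enclosed message.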
| A Content-Type of "image" indicates that the body contains | Because data of type "message" may never be encoded in base64 | |||
| an image. The subtype names the specific image format. | or quoted-printable, a problem might arise if message/partial | |||
| These names are case insensitive. Two initial subtypes are | entities are constructed in an environment that supports | |||
| "jpeg" for the JPEG format, JFIF encoding, and "gif" for GIF | binary or 8-bit transport. The problem is that the binary | |||
| format [GIF]. | data would be split into multiple message/partial messages, | |||
| each of them requiring binary transport. If such messages | ||||
| were encountered at a gateway into a 7-bit transport | ||||
| environment, there would be no way to properly encode them for | ||||
| the 7-bit world, aside from waiting for all of the fragments, | ||||
| reassembling the inner message, and then encoding the | ||||
| reassembled data in base64 or quoted-printable. Since it is | ||||
| possible that different fragments might go through different | ||||
| gateways, even this is not an acceptable solution. For this | ||||
| reason, it is specified that MIME entities of type | ||||
| message/partial must always have a content-transfer-encoding | ||||
| of 7-bit (the default). In particular, even in environments | ||||
| that support binary or 8-bit transport, the use of a content- | ||||
| transfer-encoding of "8bit" or "binary" is explicitly | ||||
| prohibited for entities of type message/partial. | ||||
| The list of image subtypes given here is neither exclusive | Because some message transfer agents may choose to | |||
| nor exhaustive, and is expected to grow as more types are | automatically fragment large messages, and because such agents | |||
| registered with IANA, as described in Appendix E. | may use very different fragmentation thresholds, it is | |||
| possible that the pieces of a partial message, upon | ||||
| reassembly, may prove themselves to comprise a partial | ||||
| message. This is explicitly permitted. | ||||
| The formal grammar for the content-type header field for | The inclusion of a "References" field in the headers of the | |||
| data of type image is given by: | second and subsequent pieces of a fragmented message that | |||
| references the Message-Id on the previous piece may be of | ||||
| benefit to mail readers that understand and track references. | ||||
| However, the generation of such "References" fields is | ||||
| entirely optional. | ||||
| image-type := "image" "/" ("gif" / "jpeg" / extension-token) | Finally, it should be noted that the "Encrypted" header field | |||
| has been made obsolete by Privacy Enhanced Messaging (PEM) | ||||
| [RFC1421, RFC1422, RFC1423, and RFC1424], but the rules above | ||||
| are nevertheless believed to describe the correct way to treat | ||||
| it if it is encountered in the context of conversion to and | ||||
| from message/partial fragments. | ||||
| 7.6 The Audio Content-Type | 6.2.2.3. External-Body Subtype | |||
| A Content-Type of "audio" indicates that the body contains | The external-body subtype indicates that the actual body data | |||
| audio data. Although there is not yet a consensus on an | are not included, but merely referenced. In this case, the | |||
| "ideal" audio format for use with computers, there is a | parameters describe a mechanism for accessing the external | |||
| pressing need for a format capable of providing | data. | |||
| interoperable behavior. | ||||
| The initial subtype of "basic" is specified to meet this | When an entity is of type "message/external-body", it consists | |||
| requirement by providing an absolutely minimal lowest common | of a header, two consecutive CRLFs, and the message header for | |||
| denominator audio format. It is expected that richer | the encapsulated message. If another pair of consecutive | |||
| formats for higher quality and/or lower bandwidth audio will | CRLFs appears, this of course ends the message header for the | |||
| be defined by a later document. | encapsulated message. However, since the encapsulated | |||
| message's body is itself external, it does NOT appear in the | ||||
| area that follows. For example, consider the following | ||||
| message: | ||||
| Expires 11/20/94 draft-ietf-822-mime-00.txt May 1994 | Content-type: message/external-body; | |||
| access-type=local-file; | ||||
| name="/u/nsb/Me.gif" | ||||
| Content-type: image/gif | ||||
| Content-ID: <id42@guppylake.bellcore.com> | ||||
| Content-Transfer-Encoding: binary | ||||
| The content of the "audio/basic" subtype is audio encoded | THIS IS NOT REALLY THE BODY! | |||
| using 8-bit ISDN mu-law [PCM]. When this subtype is | ||||
| present, a sample rate of 8000 Hz and a single channel is | ||||
| assumed. | ||||
| The formal grammar for the content-type header field for | The area at the end, which might be called the "phantom body", | |||
| data of type audio is given by: | is ignored for most external-body messages. However, it may | |||
| be used to contain auxiliary information for some such | ||||
| messages, as indeed it is when the access-type is "mail- | ||||
| server". The only access-type defined in this document that | ||||
| uses the phantom body is "mail-server", but other access-types | ||||
| may be defined in the future in other documents that use this | ||||
| area. | ||||
| audio-type := "audio" "/" ("basic" / extension-token) | The encapsulated headers in ALL message/external-body entities | |||
| MUST include a Content-ID header field to give a unique | ||||
| identifier by which to reference the data. This identifier | ||||
| may be used for caching mechanisms, and for recognizing the | ||||
| receipt of the data when the access-type is "mail-server". | ||||
| 7.7 The Video Content-Type | Note that, as specified here, the tokens that describe | |||
| external-body data, such as file names and mail server | ||||
| commands, are required to be in the US-ASCII character set. | ||||
| If this proves problematic in practice, a new mechanism may be | ||||
| required as a future extension to MIME, either as newly | ||||
| defined access-types for message/external-body or by some | ||||
| other mechanism. | ||||
| A Content-Type of "video" indicates that the body contains a | As with message/partial, MIME entities of type | |||
| time-varying-picture image, possibly with color and | message/external-body MUST have a content-transfer-encoding of | |||
| coordinated sound. The term "video" is used extremely | 7-bit (the default). In particular, even in environments that | |||
| generically, rather than with reference to any particular | support binary or 8-bit transport, the use of a content- | |||
| technology or format, and is not meant to preclude subtypes | transfer-encoding of "8bit" or "binary" is explicitly | |||
| such as animated drawings encoded compactly. The subtype | prohibited for entities of type message/external-body. | |||
| "mpeg" refers to video coded according to the MPEG standard | ||||
| [MPEG]. | ||||
| Note that although in general this document strongly | 6.2.2.3.1. General External-Body Parameters | |||
| discourages the mixing of multiple media in a single body, | ||||
| it is recognized that many so-called "video" formats include | ||||
| a representation for synchronized audio, and this is | ||||
| explicitly permitted for subtypes of "video". | ||||
| The formal grammar for the content-type header field for | The parameters that may be used with any message/external-body | |||
| data of type video is given by: | are: | |||
| video-type := "video" "/" ("mpeg" / extension-token) | (1) ACCESS-TYPE -- A word indicating the supported access | |||
| mechanism by which the file or data may be obtained. | ||||
| This word is not case sensitive. Values include, but | ||||
| are not limited to, "FTP", "ANON-FTP", "TFTP", "LOCAL- | ||||
| FILE", and "MAIL-SERVER". Future values, except for | ||||
| experimental values beginning with "X-", must be | ||||
| registered with IANA, as described in RFC REG. This | ||||
| parameter is unconditionally mandatory and MUST be | ||||
| present on EVERY message/external-body. | ||||
| 7.8 Experimental Content-Type Values | (2) EXPIRATION -- The date (in the RFC 822 "date-time" | |||
| syntax, as extended by RFC 1123 to permit 4 digits in | ||||
| the year field) after which the existence of the | ||||
| external data is not guaranteed. This parameter may be | ||||
| used with ANY access-type and is ALWAYS optional. | ||||
| A Content-Type value beginning with the characters "X-" is a | (3) SIZE -- The size (in octets) of the data. The intent | |||
| private value, to be used by consenting mail systems by | of this parameter is to help the recipient decide | |||
| mutual agreement. Any format without a rigorous and public | whether or not to expend the necessary resources to | |||
| definition must be named with an "X-" prefix, and publicly | retrieve the external data. Note that this describes | |||
| specified values shall never begin with "X-". (Older | the size of the data in its canonical form, that is, | |||
| versions of the widely-used Andrew system use the "X-BE2" | before any Content-Transfer-Encoding has been applied | |||
| name, so new systems should probably choose a different | or after the data have been decoded. This parameter | |||
| name.) | may be used with ANY access-type and is ALWAYS | |||
| optional. | ||||
| In general, the use of "X-" top-level types is strongly | (4) PERMISSION -- A case-insensitive field that indicates | |||
| discouraged. Implementors should invent subtypes of the | whether or not it is expected that clients might also | |||
| existing types whenever possible. The invention of new | attempt to overwrite the data. By default, or if | |||
| permission is "read", the assumption is that they are | ||||
| not, and that if the data is retrieved once, it is | ||||
| never needed again. If PERMISSION is "read-write", | ||||
| this assumption is invalid, and any local copy must be | ||||
| considered no more than a cache. "Read" and "Read- | ||||
| write" are the only defined values of permission. This | ||||
| parameter may be used with ANY access-type and is | ||||
| ALWAYS optional. | ||||
| Expires 11/20/94 draft-ietf-822-mime-00.txt May 1994 | The precise semantics of the access-types defined here are | |||
| described in the sections that follow. | ||||
| types is intended to be restricted primarily to the | 6.2.2.3.2. The 'ftp' and 'tftp' Access-Types | |||
| development of new media types for email, such as digital | ||||
| odors or holography, and not for new data formats in | ||||
| general. In many cases, a subtype of application will be | ||||
| more appropriate than a new top-level type. | ||||
| Summary | An access-type of FTP or TFTP indicates that the message body | |||
| is accessible as a file using the FTP [RFC-959] or TFTP [RFC- | ||||
| 783] protocols, respectively. For these access-types, the | ||||
| following additional parameters are mandatory: | ||||
| Using the MIME-Version, Content-Type, and Content-Transfer- | (1) NAME -- The name of the file that contains the actual | |||
| Encoding header fields, it is possible to include, in a | body data. | |||
| standardized way, arbitrary types of data objects with RFC | ||||
| 822 conformant mail messages. No restrictions imposed by | ||||
| either RFC 821 or RFC 822 are violated, and care has been | ||||
| taken to avoid problems caused by additional restrictions | ||||
| imposed by the characteristics of some Internet mail | ||||
| transport mechanisms (see Appendix B). The "multipart" and | ||||
| "message" Content-Types allow mixing and hierarchical | ||||
| structuring of objects of different types in a single | ||||
| message. Further Content-Types provide a standardized | ||||
| mechanism for tagging messages or body parts as audio, | ||||
| image, or several other kinds of data. A distinguished | ||||
| parameter syntax allows further specification of data format | ||||
| details, particularly the specification of alternate | ||||
| character sets. Additional optional header fields provide | ||||
| mechanisms for certain extensions deemed desirable by many | ||||
| implementors. Finally, a number of useful Content-Types are | ||||
| defined for general use by consenting user agents, notably | ||||
| message/partial, and message/external-body. | ||||
| Security Considerations | (2) SITE -- A machine from which the file may be obtained, | |||
| using the given protocol. This must be a fully | ||||
| qualified domain name, not a nickname. | ||||
| Security issues are discussed in Section 7.4.2 and in | (3) Before any data are retrieved, using FTP, the user will | |||
| Appendix F. Implementors should pay special attention to | generally need to be asked to provide a login id and a | |||
| the security implications of any mail content-types that can | password for the machine named by the site parameter. | |||
| cause the remote execution of any actions in the recipient's | For security reasons, such an id and password are not | |||
| environment. In such cases, the discussion of the | specified as content-type parameters, but must be | |||
| application/postscript content-type in Section 7.4.2 may | obtained from the user. | |||
| serve as a model for considering other content-types with | ||||
| remote execution capabilities. | ||||
| Expires 11/20/94 draft-ietf-822-mime-00.txt May 1994 | In addition, the following parameters are optional: | |||
| Authors' Addresses | (1) DIRECTORY -- A directory from which the data named by | |||
| NAME should be retrieved. | ||||
| For more information, the authors of this document may be | (2) MODE -- A case-insensitive string indicating the mode | |||
| contacted via Internet mail: | to be used when retrieving the information. The valid | |||
| values for access-type "TFTP" are "NETASCII", "OCTET", | ||||
| and "MAIL", as specified by the TFTP protocol [RFC- | ||||
| 783]. The valid values for access-type "FTP" are | ||||
| "ASCII", "EBCDIC", "IMAGE", and "LOCALn" where "n" is a | ||||
| decimal integer, typically 8. These correspond to the | ||||
| representation types "A" "E" "I" and "L n" as specified | ||||
| by the FTP protocol [RFC-959]. Note that "BINARY" and | ||||
| "TENEX" are not valid values for MODE and that "OCTET" | ||||
| "IMAGE" or "LOCAL8" should be used instead. If MODE | ||||
| is not specified, the default value is "NETASCII" for | ||||
| TFTP and "ASCII" otherwise. | ||||
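As an illustration of the MODE rules above, the following Python sketch normalizes and checks a MODE value for a given access-type. The function name resolve_mode and the constant names are hypothetical, and the sketch assumes that any access-type other than TFTP follows the FTP rules, in line with the stated defaults.

```python
import re

# Valid MODE values as listed in the text above.
TFTP_MODES = {"NETASCII", "OCTET", "MAIL"}
FTP_FIXED_MODES = {"ASCII", "EBCDIC", "IMAGE"}


def resolve_mode(access_type, mode=None):
    """Return the upper-cased MODE to use, or raise ValueError.

    MODE is case-insensitive. When it is absent, the default is
    "NETASCII" for TFTP and "ASCII" otherwise.
    """
    is_tftp = access_type.upper() == "TFTP"
    if mode is None:
        return "NETASCII" if is_tftp else "ASCII"
    value = mode.upper()
    if is_tftp:
        valid = value in TFTP_MODES
    else:
        # "LOCALn", where n is a decimal integer (typically 8).
        valid = (value in FTP_FIXED_MODES
                 or re.fullmatch(r"LOCAL[0-9]+", value) is not None)
    if not valid:
        raise ValueError("invalid MODE %r for access-type %r"
                         % (mode, access_type))
    return value
```

Values such as "BINARY" and "TENEX" are rejected by this sketch, matching the note that "OCTET", "IMAGE" or "LOCAL8" should be used instead.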
| Nathaniel S. Borenstein | 6.2.2.3.3. The 'anon-ftp' Access-Type | |||
| First Virtual Holdings | ||||
| 25 Washington Avenue | ||||
| Morristown, NJ 07960 | ||||
| Email: nsb@nsb.fv.com | The "anon-ftp" access-type is identical to the "ftp" access | |||
| Phone: +1 201 540 8967 | type, except that the user need not be asked to provide a name | |||
| Fax: +1 201 993 3032 | and password for the specified site. Instead, the ftp | |||
| protocol will be used with login "anonymous" and a password | ||||
| that corresponds to the user's email address. | ||||
| Ned Freed | 6.2.2.3.4. The 'local-file' Access-Type | |||
| Innosoft International, Inc. | ||||
| 250 West First Street | ||||
| Suite 240 | ||||
| Claremont, CA 91711 | ||||
| Phone: +1 909 624 7907 | An access-type of "local-file" indicates that the actual body | |||
| Fax: +1 909 621 5319 | is accessible as a file on the local machine. Two additional | |||
| Email: ned@innosoft.com | parameters are defined for this access type: | |||
| MIME is a result of the work of the Internet Engineering | (1) NAME -- The name of the file that contains the actual | |||
| Task Force Working Group on Email Extensions. The chairman | body data. This parameter is mandatory for the "local- | |||
| of that group, Greg Vaudreuil, may be reached at: | file" access-type. | |||
| Gregory M. Vaudreuil | (2) SITE -- A domain specifier for a machine or set of | |||
| Tigon Corporation | machines that are known to have access to the data | |||
| 17060 Dallas Parkway | file. This optional parameter is used to describe the | |||
| Dallas Texas, 75248 | locality of reference for the data, that is, the site | |||
| 214-733-2722 | or sites at which the file is expected to be visible. | |||
| Email: gvaudre@cnri.reston.va.us | Asterisks may be used for wildcard matching to a part | |||
| of a domain name, such as "*.bellcore.com", to indicate | ||||
| a set of machines on which the data should be directly | ||||
| visible, while a single asterisk may be used to | ||||
| indicate a file that is expected to be universally | ||||
| available, e.g., via a global file system. | ||||
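The SITE wildcard convention for "local-file" might be checked along the following lines. The text does not define the matching semantics precisely, so this Python sketch rests on assumptions: an asterisk is honoured only as the whole value or as the leading label of a domain name, and comparison is case-insensitive. The function name site_matches is hypothetical.

```python
def site_matches(pattern, hostname):
    """Return True if hostname falls within the SITE pattern.

    Assumed semantics: "*" matches every host, "*.example.com" matches
    any host below example.com, and anything else must match exactly.
    """
    pattern = pattern.lower()
    hostname = hostname.lower()
    if pattern == "*":
        # A single asterisk: the file is expected to be visible everywhere.
        return True
    if pattern.startswith("*."):
        # Keep the leading dot so only names below the domain match.
        return hostname.endswith(pattern[1:])
    return pattern == hostname
```

A recipient could call site_matches with its own fully qualified domain name to decide whether to try opening the named file directly before falling back to another access mechanism.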
| Expires 11/20/94 draft-ietf-822-mime-00.txt May 1994 | 6.2.2.3.5. The 'mail-server' Access-Type | |||
| Acknowledgements | The "mail-server" access-type indicates that the actual body | |||
| is available from a mail server. Two additional parameters | ||||
| are defined for this access-type: | ||||
| This document is the result of the collective effort of a | (1) SERVER -- The email address of the mail server from | |||
| large number of people, at several IETF meetings, on the | which the actual body data can be obtained. This | |||
| IETF-SMTP and IETF-822 mailing lists, and elsewhere. | parameter is mandatory for the "mail-server" access- | |||
| Although any enumeration seems doomed to suffer from | type. | |||
| egregious omissions, the following are among the many | ||||
| contributors to this effort: | ||||
| Harald Tveit Alvestrand Timo Lehtinen | (2) SUBJECT -- The subject that is to be used in the mail | |||
| Randall Atkinson John R. MacMillan | that is sent to obtain the data. Note that keying mail | |||
| Philippe Brandon Rick McGowan | servers on Subject lines is NOT recommended, but such | |||
| Kevin Carosso Leo Mclaughlin | mail servers are known to exist. This is an optional | |||
| Uhhyung Choi Goli Montaser-Kohsari | parameter. | |||
| Cristian Constantinof Keith Moore | ||||
| Mark Crispin Tom Moore | ||||
| Dave Crocker Erik Naggum | ||||
| Terry Crowley Mark Needleman | ||||
| Walt Daniels John Noerenberg | ||||
| Frank Dawson Mats Ohrman | ||||
| Hitoshi Doi Julian Onions | ||||
| Kevin Donnelly Michael Patton | ||||
| Keith Edwards David J. Pepper | ||||
| Chris Eich Blake C. Ramsdell | ||||
| Johnny Eriksson Luc Rooijakkers | ||||
| Craig Everhart Marshall T. Rose | ||||
| Patrik Fältström Jonathan Rosenberg | ||||
| Erik E. Fair Jan Rynning | ||||
| Roger Fajman Harri Salminen | ||||
| Alain Fontaine Michael Sanderson | ||||
| James M. Galvin Masahiro Sekiguchi | ||||
| Philip Gladstone Mark Sherman | ||||
| Thomas Gordon Keld Simonsen | ||||
| Phill Gross Bob Smart | ||||
| James Hamilton Peter Speck | ||||
| Steve Hardcastle-Kille Henry Spencer | ||||
| David Herron Einar Stefferud | ||||
| Bruce Howard Michael Stein | ||||
| Bill Janssen Klaus Steinberger | ||||
| Olle Järnefors Peter Svanberg | ||||
| Risto Kankkunen James Thompson | ||||
| Phil Karn Steve Uhler | ||||
| Alan Katz Stuart Vance | ||||
| Tim Kehres Erik van der Poel | ||||
| Expires 11/20/94 draft-ietf-822-mime-00.txt May 1994 | Because mail servers accept a variety of syntaxes, some of | |||
| which are multiline, the full command to be sent to a mail | ||||
| server is not included as a parameter on the content-type | ||||
| line. Instead, it is provided as the "phantom body" when the | ||||
| content-type is message/external-body and the access-type is | ||||
| mail-server. | ||||
| Neil Katin Guido van Rossum | Note that MIME does not define a mail server syntax. Rather, | |||
| Kyuho Kim Peter Vanderbilt | it allows the inclusion of arbitrary mail server commands in | |||
| Anders Klemets Greg Vaudreuil | the phantom body. Implementations must include the phantom | |||
| John Klensin Ed Vielmetti | body in the body of the message they send to the mail server | |||
| Valdis Kletniek Ryan Waldron | address to retrieve the relevant data. | |||
| Jim Knowles Wally Wedel | ||||
| Stev Knowles Sven-Ove Westberg | ||||
| Bob Kummerfeld Brian Wideen | ||||
| Pekka Kytolaakso John Wobus | ||||
| Stellan Lagerström Glenn Wright | ||||
| Vincent Lau Rayan Zachariassen | ||||
| Donald Lindsay David Zimmerman | ||||
| Marc Andreessen Bob Braden | Unlike other access-types, mail-server access is asynchronous | |||
| Brian Capouch Peter Clitherow | and will happen at an unpredictable time in the future. For | |||
| Dave Collier-Brown John Coonrod | this reason, it is important that there be a mechanism by | |||
| Stephen Crocker Jim Davis | which the returned data can be matched up with the original | |||
| Axel Deininger Dana S Emery | message/external-body entity. MIME mailservers must use the | |||
| Martin Forssen Stephen Gildea | same Content-ID field on the returned message that was used in | |||
| Terry Gray Mark Horton | the original message/external-body entity, to facilitate such | |||
| Warner Losh Carlyn Lowery | matching. | |||
| Laurence Lundblade Charles Lynn | ||||
| Larry Masinter Michael J. McInerny | ||||
| Jon Postel Christer Romson | ||||
| Yutaka Sato Markku Savela | ||||
| Richard Alan Schafer Larry W. Virden | ||||
| Rhys Weatherly Jay Weber | ||||
| Dave Wecker | ||||
| The authors apologize for any omissions from this list, | 6.2.2.3.6. Examples and Further Explanations | |||
| which are certainly unintentional. | ||||
| Expires 11/20/94 draft-ietf-822-mime-00.txt May 1994 | When the external-body mechanism is used in conjunction with | |||
| the multipart/alternative Content-Type it extends the | ||||
| functionality of multipart/alternative to include the case | ||||
| where the same object is provided in the same format but via | ||||
| different access mechanisms. When this is done the originator | ||||
| of the message must order the parts first in terms of preferred | ||||
| formats and then by preferred access mechanisms. The | ||||
| recipient's viewer should then evaluate the list both in terms | ||||
| of format and access mechanisms. | ||||
| Appendix A -- Minimal MIME-Conformance | With the emerging possibility of very wide-area file systems, | |||
| it becomes very hard to know in advance the set of machines | ||||
| where a file will and will not be accessible directly from the | ||||
| file system. Therefore it may make sense to provide both a | ||||
| file name, to be tried directly, and the name of one or more | ||||
| sites from which the file is known to be accessible. An | ||||
| implementation can try to retrieve remote files using FTP or | ||||
| any other protocol, using anonymous file retrieval or | ||||
| prompting the user for the necessary name and password. If an | ||||
| external body is accessible via multiple mechanisms, the | ||||
| sender may include multiple parts of type message/external- | ||||
| body within an entity of type multipart/alternative. | ||||
| The mechanisms described in this document are open-ended. | However, the external-body mechanism is not intended to be | |||
| It is definitely not expected that all implementations will | limited to file retrieval, as shown by the mail-server | |||
| support all of the Content-Types described, nor that they | access-type. Beyond this, one can imagine, for example, using | |||
| will all share the same extensions. In order to promote | a video server for external references to video clips. | |||
| interoperability, however, it is useful to define the | ||||
| concept of "MIME-conformance" to define a certain level of | ||||
| implementation that allows the useful interworking of | ||||
| messages with content that differs from US ASCII text. In | ||||
| this section, we specify the requirements for such | ||||
| conformance. | ||||
| A mail user agent that is MIME-conformant MUST: | The embedded message header fields which appear in the body of | |||
| the message/external-body data must be used to declare the | ||||
| Content-type of the external body if it is anything other than | ||||
| plain US-ASCII text, since the external body does not have a | ||||
| header section to declare its type. Similarly, any Content- | ||||
| transfer-encoding other than "7bit" must also be declared | ||||
| here. Thus a complete message/external-body message, | ||||
| referring to a document in PostScript format, might look like | ||||
| this: | ||||
| 1. Always generate a "MIME-Version: 1.0" header | From: Whomever | |||
| field. | To: Someone | |||
| Date: Whenever | ||||
| Subject: whatever | ||||
| MIME-Version: 1.0 | ||||
| Message-ID: <id1@host.com> | ||||
| Content-Type: multipart/alternative; boundary=42 | ||||
| Content-ID: <id001@guppylake.bellcore.com> | ||||
| 2. Recognize the Content-Transfer-Encoding header | --42 | |||
| field, and decode all received data encoded with | Content-Type: message/external-body; name="BodyFormats.ps"; | |||
| either the quoted-printable or base64 | site="thumper.bellcore.com"; mode="image"; | |||
| implementations. Encode any data sent that is | access-type=ANON-FTP; directory="pub"; | |||
| not in seven-bit mail-ready representation using | expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" | |||
| one of these transformations and include the | ||||
| appropriate Content-Transfer-Encoding header | ||||
| field, unless the underlying transport mechanism | ||||
| supports non-seven-bit data, as SMTP does not. | ||||
| 3. Recognize and interpret the Content-Type | Content-type: application/postscript | |||
| header field, and avoid showing users raw data | Content-ID: <id42@guppylake.bellcore.com> | |||
| with a Content-Type field other than text. Be | ||||
| able to send at least text/plain messages, with | ||||
| the character set specified as a parameter if it | ||||
| is not US-ASCII. | ||||
| 4. Explicitly handle the following Content-Type | --42 | |||
| values, to at least the following extents: | Content-Type: message/external-body; access-type=local-file; | |||
| name="/u/nsb/writing/rfcs/RFC-MIME.ps"; | ||||
| site="thumper.bellcore.com"; | ||||
| expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" | ||||
| Text: | Content-type: application/postscript | |||
| -- Recognize and display "text" mail | Content-ID: <id42@guppylake.bellcore.com> | |||
| with the character set "US-ASCII." | ||||
| -- Recognize other character sets at | ||||
| least to the extent of being able | ||||
| to inform the user about what | ||||
| character set the message uses. | ||||
| Expires 11/20/94 draft-ietf-822-mime-00.txt May 1994 | --42 | |||
| Content-Type: message/external-body; | ||||
| access-type=mail-server; | ||||
| server="listserv@bogus.bitnet"; | ||||
| expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" | ||||
| -- Recognize the "ISO-8859-*" character | Content-type: application/postscript | |||
| sets to the extent of being able to | Content-ID: <id42@guppylake.bellcore.com> | |||
| display those characters that are | ||||
| common to ISO-8859-* and US-ASCII, | ||||
| namely all characters represented | ||||
| by octet values 0-127. | ||||
| -- For unrecognized subtypes, show or | ||||
| offer to show the user the "raw" | ||||
| version of the data after | ||||
| conversion of the content from | ||||
| canonical form to local form. | ||||
| Message: | ||||
| -- Recognize and display at least the | ||||
| primary (822) encapsulation in such | ||||
| a way as to preserve any recursive | ||||
| structure, that is, displaying or | ||||
| offering to display the | ||||
| encapsulated data in accordance | ||||
| with its Content-type. | ||||
| Multipart: | ||||
| -- Recognize the primary (mixed) | ||||
| subtype. Display all relevant | ||||
| information on the message level | ||||
| and the body part header level and | ||||
| then display or offer to display | ||||
| each of the body parts | ||||
| individually. | ||||
| -- Recognize the "alternative" subtype, | ||||
| and avoid showing the user | ||||
| redundant parts of | ||||
| multipart/alternative mail. | ||||
| -- Recognize the "multipart/digest" | ||||
| subtype, specifically using | ||||
| "message/rfc822" rather than | ||||
| "text/plain" as the default | ||||
| content-type for encapsulations | ||||
| inside "multipart/digest" entities. | ||||
| -- Treat any unrecognized subtypes as if | ||||
| they were "mixed". | ||||
| Application: | ||||
| -- Offer the ability to remove either of | ||||
| the two types of Content-Transfer- | ||||
| Encoding defined in this document | ||||
| and put the resulting information | ||||
| in a user file. | ||||
| get RFC-MIME.DOC | ||||
| 5. Upon encountering any unrecognized Content- | --42-- | |||
| Type, an implementation must treat it as if it had | ||||
| a Content-Type of "application/octet-stream" with | ||||
| no parameter sub-arguments. How such data are | ||||
| handled is up to an implementation, but likely | ||||
| options for handling such unrecognized data | ||||
| include offering the user to write it into a file | ||||
| (decoded from its mail transport format) or | ||||
| offering the user to name a program to which the | ||||
| decoded data should be passed as input. | ||||
| Unrecognized predefined types, which in a MIME- | ||||
| conformant mailer might still include audio, | ||||
| image, or video, should also be treated in this | ||||
| way. | ||||
| A user agent that meets the above conditions is said to be | Note that in the above examples, the default Content- | |||
| MIME-conformant. The meaning of this phrase is that it is | transfer-encoding of "7bit" is assumed for the external | |||
| assumed to be "safe" to send virtually any kind of | postscript data. | |||
| properly-marked data to users of such mail systems, because | ||||
| such systems will at least be able to treat the data as | ||||
| undifferentiated binary, and will not simply splash it onto | ||||
| the screen of unsuspecting users. There is another sense | ||||
| in which it is always "safe" to send data in a format that | ||||
| is MIME-conformant, which is that such data will not break | ||||
| or be broken by any known systems that are conformant with | ||||
| RFC 821 and RFC 822. User agents that are MIME-conformant | ||||
| have the additional guarantee that the user will not be | ||||
| shown data that were never intended to be viewed as text. | ||||
| Like the message/partial type, the message/external-body type | ||||
| is intended to be transparent, that is, to convey the data | ||||
| type in the external body rather than to convey a message with | ||||
| a body of that type. Thus the headers on the outer and inner | ||||
| parts must be merged using the same rules as for | ||||
| message/partial. In particular, this means that the Content- | ||||
| type header is overridden, but the From and Subject headers | ||||
| are preserved. | ||||
| Appendix B -- General Guidelines For Sending Email Data | Note that since the external bodies are not transported as | |||
| mail, they need not conform to the 7-bit and line length | ||||
| requirements, but might in fact be binary files. Thus a | ||||
| Content-Transfer-Encoding is not generally necessary, though | ||||
| it is permitted. | ||||
| Internet email is not a perfect, homogeneous system. Mail | Note that the body of a message of type "message/external- | |||
| may become corrupted at several stages in its travel to a | body" is governed by the basic syntax for an RFC 822 message. | |||
| final destination. Specifically, email sent throughout the | In particular, anything before the first consecutive pair of | |||
| Internet may travel across many networking technologies. | CRLFs is header information, while anything after it is body | |||
| Many networking and mail technologies do not support the | information, which is ignored for most access-types. | |||
| full functionality possible in the SMTP transport | ||||
| environment. Mail traversing these systems is likely to be | ||||
| modified in such a way that it can be transported. | ||||
| There exist many widely-deployed non-conformant MTAs in the | 6.2.2.4. Other Message Subtypes | |||
| Internet. These MTAs, speaking the SMTP protocol, alter | ||||
| messages on the fly to take advantage of the internal data | ||||
| structure of the hosts they are implemented on, or are just | ||||
| plain broken. | ||||
| The following guidelines may be useful to anyone devising a | MIME implementations must in general treat unrecognized | |||
| data format (Content-Type) that will survive the widest | subtypes of message as being equivalent to | |||
| range of networking technologies and known broken MTAs | "application/octet-stream". | |||
| unscathed. Note that anything encoded in the base64 | ||||
| encoding will satisfy these rules, but that some well-known | ||||
| mechanisms, notably the UNIX uuencode facility, will not. | ||||
| Note also that anything encoded in the Quoted-Printable | ||||
| encoding will survive most gateways intact, but possibly not | ||||
| some gateways to systems that use the EBCDIC character set. | ||||
| (1) Under some circumstances the encoding used for data | 7. Experimental Content-Type Values | |||
| may change as part of normal gateway or user agent | ||||
| operation. In particular, conversion from base64 to | ||||
| quoted-printable and vice versa may be necessary. This | ||||
| may result in the confusion of CRLF sequences with line | ||||
| breaks in text bodies. As such, the persistence of CRLF | ||||
| as something other than a line break must not be relied | ||||
| on. | ||||
| (2) Many systems may elect to represent and store text | A Content-Type value beginning with the characters "X-" is a | |||
| data using local newline conventions. Local newline | private value, to be used by consenting mail systems by mutual | |||
| conventions may not match the RFC822 CRLF convention -- | agreement. Any format without a rigorous and public | |||
| systems are known that use plain CR, plain LF, CRLF, or | definition must be named with an "X-" prefix, and publicly | |||
| counted records. The result is that isolated CR and LF | specified values shall never begin with "X-". (Older versions | |||
| characters are not well tolerated in general; they | of the widely used Andrew system use the "X-BE2" name, so new | |||
| may be lost or converted to delimiters on some systems, | systems should probably choose a different name.) | |||
| and hence must not be relied on. | ||||
| In general, the use of "X-" top-level types is strongly | ||||
| discouraged. Implementors should invent subtypes of the | ||||
| existing types whenever possible. The invention of new types | ||||
| is intended to be restricted primarily to the development of | ||||
| new media types for email, such as digital odors or | ||||
| holography, and not for new data formats in general. In many | ||||
| cases, a subtype of application will be more appropriate than | ||||
| a new top-level type. | ||||
| (3) TAB (HT) characters may be misinterpreted or may be | 8. Summary | |||
| automatically converted to variable numbers of spaces. | ||||
| This is unavoidable in some environments, notably those | ||||
| not based on the ASCII character set. Such conversion | ||||
| is STRONGLY DISCOURAGED, but it may occur, and mail | ||||
| formats must not rely on the persistence of TAB (HT) | ||||
| characters. | ||||
| (4) Lines longer than 76 characters may be wrapped or | Using the MIME-Version, Content-Type, and Content-Transfer- | |||
| truncated in some environments. Line wrapping and line | Encoding header fields, it is possible to include, in a | |||
| truncation are STRONGLY DISCOURAGED, but unavoidable in | standardized way, arbitrary types of data objects with RFC 822 | |||
| some cases. Applications which require long lines must | conformant mail messages. No restrictions imposed by either | |||
| somehow differentiate between soft and hard line | RFC 821 or RFC 822 are violated, and care has been taken to | |||
| breaks. (A simple way to do this is to use the | avoid problems caused by additional restrictions imposed by | |||
| quoted-printable encoding.) | the characteristics of some Internet mail transport mechanisms | |||
| (see Appendix B). The "multipart" and "message" Content-Types | ||||
| allow mixing and hierarchical structuring of objects of | ||||
| different types in a single message. Further Content-Types | ||||
| provide a standardized mechanism for tagging messages or body | ||||
| parts as audio, image, or several other kinds of data. A | ||||
| distinguished parameter syntax allows further specification of | ||||
| data format details, particularly the specification of | ||||
| alternate character sets. Additional optional header fields | ||||
| provide mechanisms for certain extensions deemed desirable by | ||||
| many implementors. Finally, a number of useful Content-Types | ||||
| are defined for general use by consenting user agents, notably | ||||
| message/partial, and message/external-body. | ||||
| (5) Trailing "white space" characters (SPACE, TAB | 9. Security Considerations | |||
| (HT)) on a line may be discarded by some transport | ||||
| agents, while other transport agents may pad lines with | ||||
| these characters so that all lines in a mail file are | ||||
| of equal length. The persistence of trailing white | ||||
| space, therefore, must not be relied on. | ||||
| (6) Many mail domains use variations on the ASCII | Security issues are discussed in the context of the | |||
| character set, or use character sets such as EBCDIC | application/postscript type and in Appendix E. Implementors | |||
| which contain most but not all of the US-ASCII | should pay special attention to the security implications of | |||
| characters. The correct translation of characters not | any mail content-types that can cause the remote execution of | |||
| in the "invariant" set cannot be depended on across | any actions in the recipient's environment. In such cases, | |||
| character converting gateways. For example, this | the discussion of the application/postscript type may serve as | |||
| situation is a problem when sending uuencoded | a model for considering other content-types with remote | |||
| information across BITNET, an EBCDIC system. Similar | execution capabilities. | |||
| problems can occur without crossing a gateway, since | ||||
| many Internet hosts use character sets other than ASCII | ||||
| internally. The definition of Printable Strings in | ||||
| X.400 adds further restrictions in certain special | ||||
| cases. In particular, the only characters that are | ||||
| known to be consistent across all gateways are the 73 | ||||
| characters that correspond to the upper and lower case | ||||
| letters A-Z and a-z, the 10 digits 0-9, and the | ||||
| following eleven special characters: | ||||
| "'" (ASCII code 39) | 10. Authors' Addresses | |||
| "(" (ASCII code 40) | ||||
| ")" (ASCII code 41) | ||||
| For more information, the authors of this document may be | ||||
| contacted via Internet mail: | ||||
| "+" (ASCII code 43) | Nathaniel S. Borenstein | |||
| "," (ASCII code 44) | First Virtual Holdings | |||
| "-" (ASCII code 45) | 25 Washington Avenue | |||
| "." (ASCII code 46) | Morristown, NJ 07960 | |||
| "/" (ASCII code 47) | USA | |||
| ":" (ASCII code 58) | ||||
| "=" (ASCII code 61) | ||||
| "?" (ASCII code 63) | ||||
| A maximally portable mail representation, such as the | Email: nsb@nsb.fv.com | |||
| base64 encoding, will confine itself to relatively | Phone: +1 201 540 8967 | |||
| short lines of text in which the only meaningful | Fax: +1 201 993 3032 | |||
| characters are taken from this set of 73 characters. | ||||
| (7) Some mail transport agents will corrupt data that | Ned Freed | |||
| includes certain literal strings. In particular, a | Innosoft International, Inc. | |||
| period (".") alone on a line is known to be corrupted | 1050 East Garvey Avenue South | |||
| by some (incorrect) SMTP implementations, and a line | West Covina, CA 91790 | |||
| that starts with the five characters "From " (the fifth | USA | |||
| character is a SPACE) is commonly corrupted as well. | ||||
| A careful composition agent can prevent these | ||||
| corruptions by encoding the data (e.g., in the quoted- | ||||
| printable encoding, "=46rom " in place of "From " at | ||||
| the start of a line, and "=2E" in place of "." alone on | ||||
| a line). | ||||
| Please note that the above list is NOT a list of recommended | Email: ned@innosoft.com | |||
| practices for MTAs. RFC 821 MTAs are prohibited from | Phone: +1 818 919 3600 | |||
| altering the character of white space or wrapping long | Fax: +1 818 919 3614 | |||
| lines. These BAD and illegal practices are known to occur | ||||
| on established networks, and implementations should be | ||||
| robust in dealing with the bad effects they can cause. | ||||
| MIME is a result of the work of the Internet Engineering Task | ||||
| Force Working Group on Email Extensions. The chairman of that | ||||
| group, Greg Vaudreuil, may be reached at: | ||||
| Appendix C -- A Complex Multipart Example | Gregory M. Vaudreuil | |||
| Tigon Corporation | ||||
| 17060 Dallas Parkway | ||||
| Dallas Texas, 75248 | ||||
| What follows is the outline of a complex multipart message. | Email: greg.vaudreuil@ons.octel.com | |||
| This message has five parts to be displayed serially: two | Phone: +1 214 733 2722 | |||
| introductory plain text parts, an embedded multipart | 11. Acknowledgements | |||
| message, a richtext part, and a closing encapsulated text | ||||
| message in a non-ASCII character set. The embedded | ||||
| multipart message has two parts to be displayed in parallel, | ||||
| a picture and an audio fragment. | ||||
| MIME-Version: 1.0 | This document is the result of the collective effort of a | |||
| From: Nathaniel Borenstein <nsb@bellcore.com> | large number of people, at several IETF meetings, on the | |||
| To: Ned Freed <ned@innosoft.com> | IETF-SMTP and IETF-822 mailing lists, and elsewhere. Although | |||
| Subject: A multipart example | any enumeration seems doomed to suffer from egregious | |||
| Content-Type: multipart/mixed; | omissions, the following are among the many contributors to | |||
| boundary=unique-boundary-1 | this effort: | |||
| This is the preamble area of a multipart message. | Harald Tveit Alvestrand Marc Andreessen | |||
| Mail readers that understand multipart format | Randall Atkinson Bob Braden | |||
| should ignore this preamble. | Philippe Brandon Brian Capouch | |||
| If you are reading this text, you might want to | Kevin Carosso Uhhyung Choi | |||
| consider changing to a mail reader that understands | Peter Clitherow Dave Collier-Brown | |||
| how to properly display multipart messages. | Cristian Constantinof John Coonrod | |||
| --unique-boundary-1 | Mark Crispin Dave Crocker | |||
| Stephen Crocker Terry Crowley | ||||
| Walt Daniels Jim Davis | ||||
| Frank Dawson Axel Deininger | ||||
| Hitoshi Doi Kevin Donnelly | ||||
| Steve Dorner Keith Edwards | ||||
| Chris Eich Dana S. Emery | ||||
| Johnny Eriksson Craig Everhart | ||||
| Patrik Faltstrom Erik E. Fair | ||||
| Roger Fajman Alain Fontaine | ||||
| Martin Forssen James M. Galvin | ||||
| Stephen Gildea Philip Gladstone | ||||
| Thomas Gordon Keld Simonsen | ||||
| Terry Gray Phill Gross | ||||
| James Hamilton David Herron | ||||
| Mark Horton Bruce Howard | ||||
| Bill Janssen Olle Jarnefors | ||||
| Risto Kankkunen Phil Karn | ||||
| Alan Katz Tim Kehres | ||||
| Neil Katin Steve Kille | ||||
| Kyuho Kim Anders Klemets | ||||
| John Klensin Valdis Kletniek | ||||
| Jim Knowles Stev Knowles | ||||
| Bob Kummerfeld Pekka Kytolaakso | ||||
| Stellan Lagerstrom Vincent Lau | ||||
| Timo Lehtinen Donald Lindsay | ||||
| Warner Losh Carlyn Lowery | ||||
| Laurence Lundblade Charles Lynn | ||||
| John R. MacMillan Larry Masinter | ||||
| Rick McGowan Michael J. McInerny | ||||
| Leo Mclaughlin Goli Montaser-Kohsari | ||||
| Keith Moore Tom Moore | ||||
| Erik Naggum Mark Needleman | ||||
| John Noerenberg Mats Ohrman | ||||
| Julian Onions Michael Patton | ||||
| David J. Pepper Erik van der Poel | ||||
| Jon Postel Blake C. Ramsdell | ||||
| Christer Romson Luc Rooijakkers | ||||
| Marshall T. Rose Jonathan Rosenberg | ||||
| Guido van Rossum Jan Rynning | ||||
| Harri Salminen Michael Sanderson | ||||
| Yutaka Sato Markku Savela | ||||
| Richard Alan Schafer Masahiro Sekiguchi | ||||
| Mark Sherman Bob Smart | ||||
| Peter Speck Henry Spencer | ||||
| Einar Stefferud Michael Stein | ||||
| Klaus Steinberger Peter Svanberg | ||||
| James Thompson Steve Uhler | ||||
| Stuart Vance Peter Vanderbilt | ||||
| Greg Vaudreuil Ed Vielmetti | ||||
| Larry W. Virden Ryan Waldron | ||||
| Rhys Weatherly Jay Weber | ||||
| Dave Wecker Wally Wedel | ||||
| Sven-Ove Westberg Brian Wideen | ||||
| John Wobus Glenn Wright | ||||
| Rayan Zachariassen David Zimmerman | ||||
| ...Some text appears here... | The authors apologize for any omissions from this list, which | |||
| [Note that the preceding blank line means | are certainly unintentional. | |||
| no header fields were given and this is text, | ||||
| with charset US-ASCII. It could have been | ||||
| done with explicit typing as in the next part.] | ||||
| --unique-boundary-1 | Appendix A -- MIME Conformance | |||
| Content-type: text/plain; charset=US-ASCII | ||||
| This could have been part of the previous part, | The mechanisms described in this document are open-ended. It | |||
| but illustrates explicit versus implicit | is definitely not expected that all implementations will | |||
| typing of body parts. | support all of the Content-Types described, nor that they will | |||
| all share the same extensions. In order to promote | ||||
| interoperability, however, it is useful to define the concept | ||||
| of "MIME-conformance" to define a certain level of | ||||
| implementation that allows the useful interworking of messages | ||||
| with content that differs from US-ASCII text. In this | ||||
| section, we specify the requirements for such conformance. | ||||
| --unique-boundary-1 | A mail user agent that is MIME-conformant MUST: | |||
| Content-Type: multipart/parallel; | ||||
| boundary=unique-boundary-2 | ||||
| --unique-boundary-2 | (1) Always generate a "MIME-Version: 1.0" header field. | |||
| (2) Recognize the Content-Transfer-Encoding header field | ||||
| and decode all received data encoded with either the | ||||
| quoted-printable or base64 implementations. Any non-7- | ||||
| bit data that is sent without encoding must be properly | ||||
| labelled with a content-transfer-encoding of 8bit or | ||||
| binary, as appropriate. If the underlying transport | ||||
| does not support 8bit or binary (as SMTP [RFC821] does | ||||
| not), the sender is required to both encode and label | ||||
| data using an appropriate Content-Transfer-Encoding | ||||
| such as quoted-printable or base64. | ||||
| Content-Type: audio/basic | (3) Recognize and interpret the Content-Type header field, | |||
| Content-Transfer-Encoding: base64 | and avoid showing users raw data with a Content-Type | |||
| field other than text. Be able to send at least | ||||
| text/plain messages, with the character set specified | ||||
| as a parameter if it is not US-ASCII. | ||||
| ... base64-encoded 8000 Hz single-channel | (4) Explicitly handle the following Content-Type values, to | |||
| mu-law-format audio data goes here.... | at least the following extents: | |||
| --unique-boundary-2 | Text: | |||
| Content-Type: image/gif | ||||
| Content-Transfer-Encoding: base64 | ||||
| ... base64-encoded image data goes here.... | -- Recognize and display "text" mail with the | |||
| character set "US-ASCII." | ||||
| -- Recognize other character sets at least to the | ||||
| extent of being able to inform the user about what | ||||
| character set the message uses. | ||||
| --unique-boundary-2-- | -- Recognize the "ISO-8859-*" character sets to the | |||
| extent of being able to display those characters that | ||||
| are common to ISO-8859-* and US-ASCII, namely all | ||||
| characters represented by octet values 0-127. | ||||
| --unique-boundary-1 | -- For unrecognized subtypes in a known character | |||
| Content-type: text/richtext | set, show or offer to show the user the "raw" version | |||
| of the data after conversion of the content from | ||||
| canonical form to local form. | ||||
| This is <bold><italic>richtext.</italic></bold> | -- Treat material in an unknown character set as if | |||
| <smaller>as defined in RFC 1341</smaller> | it were "application/octet-stream". | |||
| <nl><nl>Isn't it | ||||
| <bigger><bigger>cool?</bigger></bigger> | ||||
| --unique-boundary-1 | Image, audio, and video: | |||
| Content-Type: message/rfc822 | ||||
| From: (mailbox in US-ASCII) | -- At a minimum provide facilities to treat any | |||
| To: (address in US-ASCII) | unrecognized subtypes as if they were | |||
| Subject: (subject in US-ASCII) | "application/octet-stream". | |||
| Content-Type: Text/plain; charset=ISO-8859-1 | ||||
| Content-Transfer-Encoding: Quoted-printable | ||||
| ... Additional text in ISO-8859-1 goes here ... | Application: | |||
| --unique-boundary-1-- | -- Offer the ability to remove either of the quoted- | |||
| printable or base64 encodings defined in this | ||||
| document if they were used and put the resulting | ||||
| information in a user file. | ||||
| Multipart: | ||||
| Appendix D -- Collected Grammar | -- Recognize the mixed subtype. Display all relevant | |||
| information on the message level and the body part | ||||
| header level and then display or offer to display | ||||
| each of the body parts individually. | ||||
| This appendix contains the complete BNF grammar for all the | -- Recognize the "alternative" subtype, and avoid | |||
| syntax specified by this document. | showing the user redundant parts of | |||
| multipart/alternative mail. | ||||
| By itself, however, this grammar is incomplete. It refers | -- Recognize the "multipart/digest" subtype, | |||
| to several entities that are defined by RFC 822. Rather | specifically using "message/rfc822" rather than | |||
| than reproduce those definitions here, and risk | "text/plain" as the default content-type for | |||
| unintentional differences between the two, this document | encapsulations inside "multipart/digest" entities. | |||
| simply refers the reader to RFC 822 for the remaining | ||||
| definitions. Wherever a term is undefined, it refers to the | ||||
| RFC 822 definition. | ||||
| application-subtype := ("octet-stream" *stream-param) | -- Treat any unrecognized subtypes as if they were | |||
| / "postscript" / extension-token | "mixed". | |||
| application-type := "application" "/" application-subtype | Message: | |||
| attribute := token ; case-insensitive | -- Recognize and display at least the primary | |||
| (RFC822) encapsulation in such a way as to preserve | ||||
| any recursive structure, that is, displaying or | ||||
| offering to display the encapsulated data in | ||||
| accordance with its Content-type. | ||||
| atype := "ftp" / "anon-ftp" / "tftp" / "local-file" | -- Treat any unrecognized subtypes as if they were | |||
| / "afs" / "mail-server" / extension-token | "application/octet-stream". | |||
| ; Case-insensitive | ||||
| audio-type := "audio" "/" ("basic" / extension-token) | (5) Upon encountering any unrecognized Content-Type, an | |||
| implementation must treat it as if it had a Content- | ||||
| Type of "application/octet-stream" with no parameter | ||||
| sub-arguments. How such data are handled is up to an | ||||
| implementation, but likely options for handling such | ||||
| unrecognized data include offering the user to write it | ||||
| into a file (decoded from its mail transport format) or | ||||
| offering the user to name a program to which the | ||||
| decoded data should be passed as input. | ||||
| body-part := <"message" as defined in RFC 822, | A user agent that meets the above conditions is said to be | |||
| with all header fields optional, and with the | MIME-conformant. The meaning of this phrase is that it is | |||
| specified delimiter not occurring anywhere in | assumed to be "safe" to send virtually any kind of properly- | |||
| the message body, either on a line by itself | marked data to users of such mail systems, because such | |||
| or as a substring anywhere.> | systems will at least be able to treat the data as | |||
| undifferentiated binary, and will not simply splash it onto | ||||
| the screen of unsuspecting users. | ||||
| NOTE: In certain transport enclaves, RFC 822 | There is another sense in which it is always "safe" to send | |||
| restrictions such as the one that limits bodies to | data in a format that is MIME-conformant, which is that such | |||
| printable ASCII characters may not be in force. (That | data will not break or be broken by any known systems that are | |||
| is, the transport domains may resemble standard | conformant with RFC 821 and RFC 822. User agents that are | |||
| Internet mail transport as specified in RFC821 and | MIME-conformant have the additional guarantee that the user | |||
| assumed by RFC822, but without certain restrictions.) | will not be shown data that were never intended to be viewed | |||
| The relaxation of these restrictions should be | as text. | |||
| construed as locally extending the definition of | ||||
| bodies, for example to include octets outside of the | ||||
| ASCII range, as long as these extensions are supported | ||||
| by the transport and adequately documented in the | ||||
| Content-Transfer-Encoding header field. However, in | ||||
| no event are headers (either message headers or body- | ||||
| part headers) allowed to contain anything other than | ||||
| Appendix B -- Guidelines For Sending Email Data | ||||
| ASCII characters. | Internet email is not a perfect, homogeneous system. Mail may | |||
| become corrupted at several stages in its travel to a final | ||||
| destination. Specifically, email sent throughout the Internet | ||||
| may travel across many networking technologies. Many | ||||
| networking and mail technologies do not support the full | ||||
| functionality possible in the SMTP transport environment. | ||||
| Mail traversing these systems is likely to be modified in such | ||||
| a way that it can be transported. | ||||
| boundary := 0*69<bchars> bcharsnospace | There exist many widely-deployed non-conformant MTAs in the | |||
| Internet. These MTAs, speaking the SMTP protocol, alter | ||||
| messages on the fly to take advantage of the internal data | ||||
| structure of the hosts they are implemented on, or are just | ||||
| plain broken. | ||||
| bchars := bcharsnospace / " " | The following guidelines may be useful to anyone devising a | |||
| data format (Content-Type) that will survive the widest range | ||||
| of networking technologies and known broken MTAs unscathed. | ||||
| Note that anything encoded in the base64 encoding will satisfy | ||||
| these rules, but that some well-known mechanisms, notably the | ||||
| UNIX uuencode facility, will not. Note also that anything | ||||
| encoded in the Quoted-Printable encoding will survive most | ||||
| gateways intact, but possibly not some gateways to systems | ||||
| that use the EBCDIC character set. | ||||
| bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_" | (1) Under some circumstances the encoding used for data may | |||
| / "," / "-" / "." / "/" / ":" / "=" / "?" | change as part of normal gateway or user agent | |||
| operation. In particular, conversion from base64 to | ||||
| quoted-printable and vice versa may be necessary. This | ||||
| may result in the confusion of CRLF sequences with line | ||||
| breaks in text bodies. As such, the persistence of | ||||
| CRLF as something other than a line break must not be | ||||
| relied on. | ||||
| charset := "us-ascii" / "iso-8859-1" / "iso-8859-2" | (2) Many systems may elect to represent and store text data | |||
| / "iso-8859-3" / "iso-8859-4" / "iso-8859-5" | using local newline conventions. Local newline | |||
| / "iso-8859-6" / "iso-8859-7" / "iso-8859-8" | conventions may not match the RFC822 CRLF convention -- | |||
| / "iso-8859-9" / extension-token | systems are known that use plain CR, plain LF, CRLF, or | |||
| ; case insensitive | counted records. The result is that isolated CR and LF | |||
| characters are not well tolerated in general; they may | ||||
| be lost or converted to delimiters on some systems, and | ||||
| hence must not be relied on. | ||||
| close-delimiter := "--" boundary "--" CRLF | (3) TAB (HT) characters may be misinterpreted or may be | |||
| ; Again, no space by "--", | automatically converted to variable numbers of spaces. | |||
| This is unavoidable in some environments, notably those | ||||
| not based on the US-ASCII character set. Such | ||||
| conversion is STRONGLY DISCOURAGED, but it may occur, | ||||
| and mail formats must not rely on the persistence of | ||||
| TAB (HT) characters. | ||||
| content := "Content-Type" ":" type "/" subtype | (4) Lines longer than 76 characters may be wrapped or | |||
| *(";" parameter) | truncated in some environments. Line wrapping and line | |||
| ; case-insensitive matching of type and subtype | truncation are STRONGLY DISCOURAGED, but unavoidable in | |||
| some cases. Applications which require long lines must | ||||
| somehow differentiate between soft and hard line | ||||
| breaks. (A simple way to do this is to use the | ||||
| quoted-printable encoding.) | ||||
| delimiter := "--" boundary CRLF ; taken from Content-Type | (5) Trailing "white space" characters (SPACE, TAB (HT)) on | |||
| ; field. | a line may be discarded by some transport agents, while | |||
| ; There must be no space | other transport agents may pad lines with these | |||
| ; between "--" and boundary. | characters so that all lines in a mail file are of | |||
| equal length. The persistence of trailing white space, | ||||
| therefore, must not be relied on. | ||||
| description := "Content-Description" ":" *text | (6) Many mail domains use variations on the US-ASCII | |||
| character set, or use character sets such as EBCDIC | ||||
| which contain most but not all of the US-ASCII | ||||
| characters. The correct translation of characters not | ||||
| in the "invariant" set cannot be depended on across | ||||
| character converting gateways. For example, this | ||||
| situation is a problem when sending uuencoded | ||||
| information across BITNET, an EBCDIC system. Similar | ||||
| problems can occur without crossing a gateway, since | ||||
| many Internet hosts use character sets other than US- | ||||
| ASCII internally. The definition of Printable Strings | ||||
| in X.400 adds further restrictions in certain special | ||||
| cases. In particular, the only characters that are | ||||
| known to be consistent across all gateways are the 73 | ||||
| characters that correspond to the upper and lower case | ||||
| letters A-Z and a-z, the 10 digits 0-9, and the | ||||
| following eleven special characters: | ||||
| discard-text := *(*text CRLF) | "'" (US-ASCII decimal value 39) | |||
| "(" (US-ASCII decimal value 40) | ||||
| ")" (US-ASCII decimal value 41) | ||||
| "+" (US-ASCII decimal value 43) | ||||
| "," (US-ASCII decimal value 44) | ||||
| "-" (US-ASCII decimal value 45) | ||||
| "." (US-ASCII decimal value 46) | ||||
| "/" (US-ASCII decimal value 47) | ||||
| ":" (US-ASCII decimal value 58) | ||||
| "=" (US-ASCII decimal value 61) | ||||
| "?" (US-ASCII decimal value 63) | ||||
| encapsulation := delimiter body-part CRLF | A maximally portable mail representation, such as the | |||
| base64 encoding, will confine itself to relatively | ||||
| short lines of text in which the only meaningful | ||||
| characters are taken from this set of 73 characters. | ||||
| encoding := "Content-Transfer-Encoding" ":" mechanism | (7) Some mail transport agents will corrupt data that | |||
| includes certain literal strings. In particular, a | ||||
| period (".") alone on a line is known to be corrupted | ||||
| by some (incorrect) SMTP implementations, and a line | ||||
| that starts with the five characters "From " (the fifth | ||||
| character is a SPACE) is commonly corrupted as well. | ||||
| A careful composition agent can prevent these | ||||
| corruptions by encoding the data (e.g., in the quoted- | ||||
| printable encoding, "=46rom " in place of "From " at | ||||
| the start of a line, and "=2E" in place of "." alone on | ||||
| a line). | ||||
| epilogue := discard-text ; to be ignored | Please note that the above list is NOT a list of recommended | |||
| upon receipt. | practices for MTAs. RFC 821 MTAs are prohibited from altering | |||
| the character of white space or wrapping long lines. These | ||||
| BAD and invalid practices are known to occur on established | ||||
| networks, and implementations should be robust in dealing with | ||||
| the bad effects they can cause. | ||||
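Item (7) above can be illustrated with a small sketch. It is not a full quoted-printable encoder: it assumes the body will be labeled quoted-printable, and it handles only a literal "=", a lone ".", and a leading "From ". The names MAIL_SAFE and protect_line are hypothetical and do not come from the draft.

```python
import string

# The 73 characters that item (6) lists as consistent across gateways:
# 52 letters, 10 digits, and eleven special characters.
MAIL_SAFE = set(string.ascii_letters + string.digits + "'()+,-./:=?")


def protect_line(line: str) -> str:
    """Protect one line (without its CRLF) against the corruptions in item (7).

    Assumption: the result is sent with Content-Transfer-Encoding:
    quoted-printable, so "=XX" sequences are decoded by the receiver.
    """
    # A literal "=" must itself be encoded once "=" is used as an escape.
    line = line.replace("=", "=3D")
    if line == ".":
        # A period alone on a line may be mangled by some SMTP software.
        return "=2E"
    if line.startswith("From "):
        # Encode the "F" so the line no longer begins with "From ".
        return "=46" + line[1:]
    return line
```

A complete encoder would also have to deal with trailing white space, characters outside US-ASCII, and the 76-character line limit.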
| extension-token := x-token / iana-token | Appendix C -- A Complex Multipart Example | |||
| external-param := (";" "access-type" "=" atype) | What follows is the outline of a complex multipart message. | |||
| / (";" "expiration" "=" date-time) | This message has five parts to be displayed serially: two | |||
| introductory plain text parts, an embedded multipart message, | ||||
| a text/enriched part, and a closing encapsulated text message | ||||
| in a non-ASCII character set. The embedded multipart message | ||||
| has two parts to be displayed in parallel, a picture and an | ||||
| audio fragment. | ||||
| Expires 11/20/94 draft-ietf-822-mime-00.txt May 1994 | MIME-Version: 1.0 | |||
| From: Nathaniel Borenstein <nsb@bellcore.com> | ||||
| To: Ned Freed <ned@innosoft.com> | ||||
| Date: Fri, 07 Oct 1994 16:15:05 -0700 (PDT) | ||||
| Subject: A multipart example | ||||
| Content-Type: multipart/mixed; | ||||
| boundary=unique-boundary-1 | ||||
| ; Note that date-time is quoted | This is the preamble area of a multipart message. | |||
| / (";" "size" "=" 1*DIGIT) | Mail readers that understand multipart format | |||
| / (";" "permission" "=" ("read" / "read- | should ignore this preamble. | |||
| write")) | ||||
| ; Permission is case-insensitive | ||||
| / (";" "name" "=" value) | ||||
| / (";" "site" "=" value) | ||||
| / (";" "dir" "=" value) | ||||
| / (";" "mode" "=" value) | ||||
| / (";" "server" "=" value) | ||||
| / (";" "subject" "=" value) | ||||
| ; access-type required; others required based on | ||||
| access-type | ||||
| iana-token := <a publicly-defined extension token, | If you are reading this text, you might want to | |||
| registered with IANA, as specified in | consider changing to a mail reader that understands | |||
| appendix E> | how to properly display multipart messages. | |||
| id := "Content-ID" ":" msg-id | --unique-boundary-1 | |||
| image-type := "image" "/" ("gif" / "jpeg" / extension-token) | ... Some text appears here ... | |||
| mechanism := "7bit" ; case-insensitive | [Note that the blank between the boundary and the start | |||
| / "quoted-printable" | of the text in this part means no header fields were | |||
| / "base64" | given and this is text in the US-ASCII character set. | |||
| / "8bit" | It could have been done with explicit typing as in the | |||
| / "binary" | next part.] | |||
| / x-token | ||||
| message-subtype := "rfc822" | --unique-boundary-1 | |||
| / "partial" 2#3partial-param | Content-type: text/plain; charset=US-ASCII | |||
| / "external-body" 1*external-param | ||||
| / extension-token | ||||
| message-type := "message" "/" message-subtype | This could have been part of the previous part, but | |||
| illustrates explicit versus implicit typing of body | ||||
| parts. | ||||
| multipart-body := preamble 1*encapsulation close-delimiter | --unique-boundary-1 | |||
| epilogue | Content-Type: multipart/parallel; boundary=unique-boundary-2 | |||
| multipart-subtype := "mixed" / "parallel" / "digest" | --unique-boundary-2 | |||
| / "alternative" / extension-token | Content-Type: audio/basic | |||
| Content-Transfer-Encoding: base64 | ||||
| multipart-type := "multipart" "/" multipart-subtype | ... base64-encoded 8000 Hz single-channel | |||
| ";" "boundary" "=" boundary | mu-law-format audio data goes here ... | |||
| --unique-boundary-2 | ||||
| Content-Type: image/gif | ||||
| Content-Transfer-Encoding: base64 | ||||
| octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F") | ... base64-encoded image data goes here ... | |||
| ; octet must be used for characters > 127, =, SPACE, or | ||||
| TAB, | ||||
| ; and is recommended for any characters not listed in | ||||
| ; Appendix B as "mail-safe". | ||||
| padding := "0" / "1" / "2" / "3" / "4" / "5" / "6" / "7" | --unique-boundary-2-- | |||
| parameter := attribute "=" value | --unique-boundary-1 | |||
| Content-type: text/enriched | ||||
| partial-param := (";" "id" "=" value) | This is <bold><italic>enriched.</italic></bold> | |||
| / (";" "number" "=" 1*DIGIT) | <smaller>as defined in RFC 1563</smaller> | |||
| / (";" "total" "=" 1*DIGIT) | ||||
| ; id & number required; total required for last | ||||
| part | ||||
| preamble := discard-text ; to be ignored | Isn't it | |||
| upon receipt. | <bigger><bigger>cool?</bigger></bigger> | |||
| ptext := octet / <any ASCII character except "=", SPACE, or | --unique-boundary-1 | |||
| TAB> | Content-Type: message/rfc822 | |||
| ; characters not listed as "mail-safe" in Appendix B | ||||
| ; are also not recommended. | ||||
| quoted-printable := ([*(ptext / SPACE / TAB) ptext] ["="] | From: (mailbox in US-ASCII) | |||
| CRLF) | To: (address in US-ASCII) | |||
| ; Maximum line length of 76 characters excluding CRLF | Subject: (subject in US-ASCII) | |||
| Content-Type: Text/plain; charset=ISO-8859-1 | ||||
| Content-Transfer-Encoding: Quoted-printable | ||||
| stream-param := (";" "type" "=" value) | ... Additional text in ISO-8859-1 goes here ... | |||
| / (";" "padding" "=" padding) | ||||
| subtype := token ; case-insensitive | --unique-boundary-1-- | |||
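The structure of the example above can be explored with a short sketch. It uses Python's standard email package, which implements the later published MIME specifications rather than this draft, and it feeds in a cut-down version of the message with only the first two body parts. The name RAW and the text of the second part are illustrative.

```python
from email import message_from_string

# A reduced form of the Appendix C message: preamble, one implicitly
# typed part, one explicitly typed part, and the closing delimiter.
RAW = (
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/mixed; boundary=unique-boundary-1\n"
    "\n"
    "This is the preamble area of a multipart message.\n"
    "--unique-boundary-1\n"
    "\n"
    "... Some text appears here ...\n"
    "--unique-boundary-1\n"
    "Content-type: text/plain; charset=US-ASCII\n"
    "\n"
    "Explicitly typed text.\n"
    "--unique-boundary-1--\n"
)

msg = message_from_string(RAW)
parts = msg.get_payload()  # a list of sub-messages for multipart bodies

# A part with no header fields defaults to text/plain in US-ASCII,
# so both parts report the same type and character set.
summary = [(p.get_content_type(), p.get_content_charset("us-ascii")) for p in parts]
```

Both entries in summary should be identical, which matches the bracketed note in the example about implicit versus explicit typing.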
| Appendix D -- Collected Grammar | ||||
| text-subtype := "plain" / extension-token | This appendix contains the complete BNF grammar for all the | |||
| syntax specified by this document. | ||||
| text-type := "text" "/" text-subtype [";" "charset" "=" | By itself, however, this grammar is incomplete. It refers to | |||
| charset] | several entities that are defined by RFC 822. Rather than | |||
| reproduce those definitions here, and risk unintentional | ||||
| differences between the two, this document simply refers the | ||||
| reader to RFC 822 for the remaining definitions. Wherever a | ||||
| term is undefined, it refers to the RFC 822 definition. | ||||
| token := 1*<any (ASCII) CHAR except SPACE, CTLs, or | attribute := token | |||
| tspecials> | ||||
| tspecials := "(" / ")" / "<" / ">" / "@" | boundary := 0*69<bchars> bcharsnospace | |||
| / "," / ";" / ":" / "\" / <"> | ||||
| / "/" / "[" / "]" / "?" / "=" | ||||
| bchars := bcharsnospace / " " | ||||
| ; Must be in quoted-string, | bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / | |||
| ; to use within parameter values | "+" / "_" / "," / "-" / "." / | |||
| "/" / ":" / "=" / "?" | ||||
| type := "application" / "audio" ; case- | body-part := <"message" as defined in RFC 822, with all | |||
| insensitive | header fields optional, not starting with the | |||
| / "image" / "message" | specified dash-boundary, and with the | |||
| / "multipart" / "text" | delimiter not occurring anywhere in the | |||
| / "video" / extension-token | message body. Note that the semantics of a | |||
| ; All values case-insensitive | part differ from the semantics of a message, | |||
| as described in the text.> | ||||
| value := token / quoted-string | close-delimiter := CRLF dash-boundary "--" | |||
| version := "MIME-Version" ":" 1*DIGIT "." 1*DIGIT | composite-type := "message" / "multipart" / extension-token | |||
| video-type := "video" "/" ("mpeg" / extension-token) | content := "Content-Type" ":" type "/" subtype | |||
| *(";" parameter) | ||||
| ; Matching of type and subtype is | ||||
| ; ALWAYS case-insensitive | ||||
| x-token := <The two characters "X-" or "x-" followed, with | dash-boundary := "--" boundary | |||
| no | ; boundary taken from Content-Type | |||
| intervening white space, by any token> | ; field. | |||
| delimiter := CRLF dash-boundary | ||||
| Appendix E -- IANA Registration Procedures | description := "Content-Description" ":" *text | |||
| MIME has been carefully designed to have extensible | discard-text := *(*text CRLF) | |||
| mechanisms, and it is expected that the set of content- | ; To be ignored upon receipt. | |||
| type/subtype pairs and their associated parameters will grow | ||||
| significantly with time. Several other MIME fields, notably | ||||
| character set names, access-type parameters for the | ||||
| message/external-body type, and possibly even Content- | ||||
| Transfer-Encoding values, are likely to have new values | ||||
| defined over time. In order to ensure that the set of such | ||||
| values is developed in an orderly, well-specified, and | ||||
| public manner, MIME defines a registration process which | ||||
| uses the Internet Assigned Numbers Authority (IANA) as a | ||||
| central registry for such values. | ||||
| In general, parameters in the content-type header field are | discrete-type := "text" / "image" / "audio" / "video" / | |||
| used to convey supplemental information for various content | "application" / extension-token | |||
| types, and their use is defined when the content-type and | ||||
| subtype are defined. New parameters should not be defined | ||||
| as a way to introduce new functionality. | ||||
| In order to simplify and standardize the registration | encapsulation := delimiter [*LWSP-char] | |||
| process, this appendix gives templates for the registration | CRLF body-part | |||
| of new values with IANA. Each of these is given in the form | ||||
| of an email message template, to be filled in by the | ||||
| registering party. | ||||
| E.1 Registration of New Content-type/subtype Values | encoding := "Content-Transfer-Encoding" ":" mechanism | |||
| Note that MIME is generally expected to be extended by | epilogue := discard-text | |||
| subtypes. If a new fundamental top-level type is needed, | ||||
| its specification must be published as an RFC or submitted | ||||
| in a form suitable to become an RFC, and be subject to the | ||||
| Internet standards process. | ||||
| To: IANA@isi.edu | extension-token := iana-token / ietf-token / x-token | |||
| Subject: Registration of new MIME | ||||
| content-type/subtype | ||||
| MIME type name: | iana-token := <a publicly-defined extension token, | |||
| registered with IANA, as specified in | ||||
| RFC REG [REF-REG]> | ||||
| (If the above is not an existing top-level MIME type, | ietf-token := <a publicly-defined extension token, | |||
| please explain why an existing type cannot be used.) | initially registered with IANA and | |||
| subsequently standardized by the IETF> | ||||
| MIME subtype name: | id := "Content-ID" ":" msg-id | |||
| mechanism := "7bit" / "8bit" / "binary" / | ||||
| "quoted-printable" / "base64" / | ||||
| ietf-token / x-token | ||||
| Required parameters: | multipart-body := preamble dash-boundary | |||
| [*LWSP-char] CRLF | ||||
| body-part *encapsulation | ||||
| close-delimiter [*LWSP-char] | ||||
| CRLF epilogue | ||||
| Optional parameters: | octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F") | |||
| ; Octet must be used for characters > 127, =, | ||||
| ; SPACE, or TAB, and is recommended for any | ||||
| ; characters not listed in Appendix B as | ||||
| ; "mail-safe". | ||||
| Encoding considerations: | parameter := attribute "=" value | |||
| Security considerations: | preamble := discard-text | |||
| Published specification: | ptext := octet / safe-char | |||
| (The published specification must be an Internet RFC or | quoted-printable := ([*(ptext / SPACE / TAB) ptext] ["="] CRLF) | |||
| RFC-to-be if a new top-level type is being defined, and | ; Maximum line length of 76 characters | |||
| must be a publicly available specification in any | ; excluding CRLF | |||
| case.) | ||||
| Person & email address to contact for further | safe-char := <any US-ASCII character except "=", | |||
| information: | SPACE, or TAB> | |||
| ; Characters not listed as "mail-safe" in | ||||
| ; Appendix B are also not recommended. | ||||
| E.2 Registration of New Access-type Values for | subtype := extension-token | |||
| Message/external-body | ||||
| To: IANA@isi.edu | token := 1*<any (US-ASCII) CHAR except SPACE, CTLs, | |||
| Subject: Registration of new MIME Access-type for | or tspecials> | |||
| Message/external-body content-type | ||||
| MIME access-type name: | tspecials := "(" / ")" / "<" / ">" / "@" / | |||
| "," / ";" / ":" / "\" / <"> | ||||
| "/" / "[" / "]" / "?" / "=" | ||||
| ; Must be in quoted-string, | ||||
| ; to use within parameter values | ||||
| Required parameters: | type := discrete-type / composite-type | |||
| Optional parameters: | value := token / quoted-string | |||
| Published specification: | version := "MIME-Version" ":" 1*DIGIT "." 1*DIGIT | |||
| (The published specification must be an Internet RFC or | x-token := <The two characters "X-" or "x-" followed, with | |||
| RFC-to-be.) | no intervening white space, by any token> | |||
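As an illustration of how the collected grammar can be applied, the following sketch checks a candidate string against the revised "boundary" rule (0*69<bchars> bcharsnospace): 1 to 70 characters, with a space allowed anywhere except the final position. The names BOUNDARY_RE and is_valid_boundary are hypothetical, and the regular expression is one possible transcription of the rule.

```python
import re

# bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_" /
#                  "," / "-" / "." / "/" / ":" / "=" / "?"
_BCHARSNOSPACE = r"0-9A-Za-z'()+_,\-./:=?"

# boundary := 0*69<bchars> bcharsnospace
# (bchars adds the space character to bcharsnospace.)
BOUNDARY_RE = re.compile(rf"[{_BCHARSNOSPACE} ]{{0,69}}[{_BCHARSNOSPACE}]")


def is_valid_boundary(candidate: str) -> bool:
    """Return True if candidate matches the boundary production."""
    return BOUNDARY_RE.fullmatch(candidate) is not None
```

For instance, "unique-boundary-1" from the Appendix C example is accepted, while a string ending in a space or one longer than 70 characters is rejected.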
| Person & email address to contact for further | Appendix E -- Summary of the Seven Content-types | |||
| information: | ||||
| Content type: text | ||||
| Appendix F -- Summary of the Seven Content-types | Subtypes defined by this document: plain | |||
| Content-type: text | Important parameters: charset | |||
| Subtypes defined by this document: plain | Encoding notes: quoted-printable generally preferred if an | |||
| encoding is needed and the character set is mostly a US- | ||||
| ASCII superset. | ||||
| Important Parameters: charset | Security considerations: Rich text formats such as TeX and | |||
| Troff often contain mechanisms for executing arbitrary | ||||
| commands or file system operations, and should not be used | ||||
| automatically unless these security problems have been | ||||
| addressed. Even plain text may contain control characters | ||||
| that can be used to exploit the capabilities of | ||||
| "intelligent" terminals and cause security violations. User | ||||
| interfaces designed to run on such terminals should be aware | ||||
| of and try to prevent such problems. | ||||
| Encoding notes: quoted-printable generally preferred if an | Content type: image | |||
| encoding is needed and the character set is mostly an | ||||
| ASCII superset. | ||||
| Security considerations: Rich text formats such as TeX and | Subtypes defined by this document: jpeg, gif | |||
| Troff often contain mechanisms for executing arbitrary | ||||
| commands or file system operations, and should not be | ||||
| used automatically unless these security problems have | ||||
| been addressed. Even plain text may contain control | ||||
| characters that can be used to exploit the capabilities | ||||
| of "intelligent" terminals and cause security | ||||
| violations. User interfaces designed to run on such | ||||
| terminals should be aware of and try to prevent such | ||||
| problems. | ||||
| ________________________________________________________________ | ||||
| Content-type: multipart | Important parameters: none | |||
| Subtypes defined by this document: mixed, alternative, | Encoding notes: base64 generally preferred | |||
| digest, parallel. | ||||
| Important Parameters: boundary | Content type: audio | |||
| Encoding notes: No content-transfer-encoding is permitted. | Subtypes defined by this document: basic | |||
| ________________________________________________________________ | Important parameters: none | |||
| Content-type: message | Encoding notes: base64 generally preferred | |||
| Subtypes defined by this document: rfc822, partial, | Content type: video | |||
| external-body | ||||
| Important Parameters: id, number, total, access-type, | Subtypes defined by this document: mpeg | |||
| expiration, size, permission, name, site, directory, | ||||
| mode, server, subject | ||||
| Encoding notes: No content-transfer-encoding is permitted. | Important parameters: none | |||
| Specifically, only "7bit" is permitted for | Encoding notes: base64 generally preferred | |||
| Content type: application | ||||
| "message/partial" or "message/external-body", and only | Subtypes defined by this document: octet-stream, postscript | |||
| "7bit", "8bit", or "binary" are permitted for other | ||||
| subtypes of "message". | ||||
| ________________________________________________________________ | Important parameters: type, padding | |||
| Content-type: application | Deprecated parameters: name and conversions were defined in | |||
| RFC 1341, and have since been deleted. | ||||
| Subtypes defined by this document: octet-stream, postscript | Encoding notes: base64 preferred for unreadable subtypes. | |||
| Important Parameters: type, padding | Security considerations: This type is intended for the | |||
| transmission of data to be interpreted by locally-installed | ||||
| programs. Severe security problems could result if this | ||||
| type is used to transmit binary programs or programs in | ||||
| general-purpose interpreted languages, such as LISP programs | ||||
| or shell scripts, without taking special precautions. | ||||
| Authors of mail-reading agents are cautioned against giving | ||||
| their systems the power to execute mail-based application | ||||
| data without carefully considering the security | ||||
| implications. While it is certainly possible to define safe | ||||
| application formats and even safe interpreters for unsafe | ||||
| formats, each interpreter should be evaluated separately for | ||||
| possible security problems. | ||||
| Deprecated Parameters: name and conversions were defined in | Content type: multipart | |||
| RFC 1341. | ||||
| Encoding notes: base64 preferred for unreadable subtypes. | Subtypes defined by this document: mixed, alternative, | |||
| digest, parallel. | ||||
| Security considerations: This type is intended for the | Important parameters: boundary | |||
| transmission of data to be interpreted by locally-installed | ||||
| programs. If used, for example, to transmit executable | ||||
| binary programs or programs in general-purpose interpreted | ||||
| languages, such as LISP programs or shell scripts, severe | ||||
| security problems could result. Authors of mail-reading | ||||
| agents are cautioned against giving their systems the power | ||||
| to execute mail-based application data without carefully | ||||
| considering the security implications. While it is | ||||
| certainly possible to define safe application formats and | ||||
| even safe interpreters for unsafe formats, each interpreter | ||||
| should be evaluated separately for possible security | ||||
| problems. | ||||
| ________________________________________________________________ | ||||
| Content-type: image | Encoding notes: No content-transfer-encoding other than | |||
| "7bit", "8bit", or "binary" is permitted. | ||||
| Subtypes defined by this document: jpeg, gif | Content type: message | |||
| Important Parameters: none | Subtypes defined by this document: rfc822, partial, | |||
| external-body | ||||
| Encoding notes: base64 generally preferred | Important parameters: id, number, total, access-type, | |||
| expiration, size, permission, name, site, directory, mode, | ||||
| server, subject | ||||
| Encoding notes: Only "7bit" is permitted for | ||||
| "message/partial" or "message/external-body", and only | ||||
| "7bit", "8bit", or "binary" are permitted for other subtypes | ||||
| of "message". | ||||
| ________________________________________________________________ | Appendix F -- Canonical Encoding Model | |||
| Content-type: audio | There was some confusion, in earlier drafts of this memo, | |||
| regarding the model for when email data was to be converted to | ||||
| canonical form and encoded, and in particular how this process | ||||
| would affect the treatment of CRLFs, given that the | ||||
| representation of newlines varies greatly from system to | ||||
| system. For this reason, a canonical model for encoding is | ||||
| presented below. | ||||
| Subtypes defined by this document: basic | The process of composing a MIME entity can be modeled as being | |||
| done in a number of steps. Note that these steps are roughly | ||||
| similar to those steps used in PEM [RFC1421] and are performed | ||||
| for each "innermost level" body: | ||||
| (1) Creation of local form. | ||||
| Important Parameters: none | The body to be transmitted is created in the system's | |||
| native format. The native character set is used, and | ||||
| where appropriate local end of line conventions are | ||||
| used as well. The body may be a UNIX-style text file, | ||||
| or a Sun raster image, or a VMS indexed file, or audio | ||||
| data in a system-dependent format stored only in | ||||
| memory, or anything else that corresponds to the local | ||||
| model for the representation of some form of | ||||
| information. Fundamentally, the data is created in the | ||||
| "native" form that corresponds to the type specified by | ||||
| the content type. | ||||
| Encoding notes: base64 generally preferred | (2) Conversion to canonical form. | |||
| ________________________________________________________________ | The entire body, including "out-of-band" information | |||
| such as record lengths and possibly file attribute | ||||
| information, is converted to a universal canonical | ||||
| form. The specific content type of the body as well as | ||||
| its associated attributes dictate the nature of the | ||||
| canonical form that is used. Conversion to the proper | ||||
| canonical form may involve character set conversion, | ||||
| transformation of audio data, compression, or various | ||||
| other operations specific to the various content types. | ||||
| If character set conversion is involved, however, care | ||||
| must be taken to understand the semantics of the | ||||
| content-type, which may have strong implications for | ||||
| any character set conversion, e.g. with regard to | ||||
| syntactically meaningful characters in a text subtype | ||||
| other than "plain". | ||||
| Content-type: video | For example, in the case of text/plain data, the text | |||
| must be converted to a supported character set and | ||||
| lines must be delimited with CRLF delimiters in | ||||
| accordance with RFC 822. Note that the restriction on | ||||
| line lengths implied by RFC 822 is eliminated if the | ||||
| next step employs either quoted-printable or base64 | ||||
| encoding. | ||||
| Subtypes defined by this document: mpeg | (3) Apply transfer encoding. | |||
| Important Parameters: none | A Content-Transfer-Encoding appropriate for this body | |||
| is applied. Note that there is no fixed relationship | ||||
| between the content type and the transfer encoding. In | ||||
| particular, it may be appropriate to base the choice of | ||||
| base64 or quoted-printable on character frequency | ||||
| counts which are specific to a given instance of a | ||||
| body. | ||||
| Encoding notes: base64 generally preferred | (4) Insertion into entity. | |||
| The encoded object is inserted into a MIME entity with | ||||
| appropriate headers. The entity is then inserted into | ||||
| the body of a higher-level entity (message or | ||||
| multipart) if needed. | ||||
| Appendix G -- Canonical Encoding Model | It is vital to note that these steps are only a model; they | |||
| are specifically NOT a blueprint for how an actual system | ||||
| would be built. In particular, the model fails to account for | ||||
| two common designs: | ||||
| There was some confusion, in earlier drafts of this memo, | (1) In many cases the conversion to a canonical form prior | |||
| regarding the model for when email data was to be converted | to encoding will be subsumed into the encoder itself, | |||
| to canonical form and encoded, and in particular how this | which understands local formats directly. For example, | |||
| process would affect the treatment of CRLFs, given that the | the local newline convention for text bodies might be | |||
| representation of newlines varies greatly from system to | carried through to the encoder itself along with | |||
| system. For this reason, a canonical model for encoding is | knowledge of what that format is. | |||
| presented below. | ||||
| The process of composing a MIME entity can be modeled as | (2) The output of the encoders may have to pass through one | |||
| being done in a number of steps. Note that these steps are | or more additional steps prior to being transmitted as | |||
| roughly similar to those steps used in RFC 1421 and are | a message. As such, the output of the encoder may not | |||
| performed for each 'innermost level' body: | be conformant with the formats specified by RFC 822. | |||
| Step 1. Creation of local form. | In particular, once again it may be appropriate for the | |||
| converter's output to be expressed using local newline | ||||
| conventions rather than using the standard RFC 822 CRLF | ||||
| delimiters. | ||||
| The body to be transmitted is created in the system's native | Other implementation variations are conceivable as well. The | |||
| format. The native character set is used, and where | vital aspect of this discussion is that, in spite of any | |||
| appropriate local end of line conventions are used as well. | optimizations, collapsings of required steps, or insertion of | |||
| The body may be a UNIX-style text file, or a Sun raster | additional processing, the resulting messages must be | |||
| image, or a VMS indexed file, or audio data in a system- | consistent with those produced by the model described here. | |||
| dependent format stored only in memory, or anything else | For example, a message with the following header fields: | |||
| that corresponds to the local model for the representation | ||||
| of some form of information. Fundamentally, the data is | ||||
| created in the "native" form specified by the type/subtype | ||||
| information. | ||||
| Step 2. Conversion to canonical form. | Content-type: text/foo; charset=bar | |||
| Content-Transfer-Encoding: base64 | ||||
| The entire body, including "out-of-band" information such as | must be first represented in the text/foo form, then (if | |||
| record lengths and possibly file attribute information, is | necessary) represented in the "bar" character set, and finally | |||
| converted to a universal canonical form. The specific | transformed via the base64 algorithm into a mail-safe form. | |||
| content type of the body as well as its associated | ||||
| attributes dictate the nature of the canonical form that is | ||||
| used. Conversion to the proper canonical form may involve | ||||
| character set conversion, transformation of audio data, | ||||
| compression, or various other operations specific to the | ||||
| various content types. If character set conversion is | ||||
| involved, however, care must be taken to understand the | ||||
| semantics of the content-type, which may have strong | ||||
| implications for any character set conversion, e.g. with | ||||
| regard to syntactically meaningful characters in a text | ||||
| subtype other than "plain". | ||||
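The four steps of the canonical encoding model can be sketched for the simple case of text/plain data. This is only an illustration under stated assumptions: the local form is a Python string using "\n" line ends, the target character set is ISO-8859-1, and base64 is the chosen transfer encoding. The function name compose_text_entity is hypothetical, and, as the model itself stresses, a real system need not be built this way.

```python
import base64


def compose_text_entity(local_text: str, charset: str = "iso-8859-1") -> str:
    """Model steps (2) to (4) for a text/plain body given in local form."""
    # Step 2: canonical form. Normalize line ends to CRLF, then represent
    # the text in the declared character set.
    canonical = local_text.replace("\r\n", "\n").replace("\n", "\r\n")
    canonical_bytes = canonical.encode(charset)

    # Step 3: transfer encoding. encodebytes() wraps its output at 76
    # characters using "\n"; convert those line ends to CRLF as well.
    encoded = base64.encodebytes(canonical_bytes).decode("ascii")
    encoded = encoded.replace("\n", "\r\n")

    # Step 4: insertion into an entity with the appropriate header fields.
    headers = (
        f"Content-Type: text/plain; charset={charset}\r\n"
        "Content-Transfer-Encoding: base64\r\n"
    )
    return headers + "\r\n" + encoded
```

Decoding the body of the returned entity with base64 yields the canonical CRLF-delimited bytes, not the local "\n" form, which is the point the model makes about the treatment of newlines.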
| Appendix G -- Changes from RFC 1521 | ||||
| For example, in the case of text/plain data, the text must | This document is a revision of RFC 1521. For the convenience | |||
| be converted to a supported character set and lines must be | of those familiar with RFC 1521, the changes from that | |||
| delimited with CRLF delimiters in accordance with RFC822. | document are summarized in this appendix. For further history, | |||
| Note that the restriction on line lengths implied by RFC822 | note that Appendix H in RFC 1521 specified how that document | |||
| is eliminated if the next step employs either quoted- | differed from its predecessor, RFC 1341. | |||
| printable or base64 encoding. | ||||
| Step 3. Apply transfer encoding. | (1) This document has been completely reformatted. This was | |||
| done to improve the quality of the plain text version | ||||
| of this document, which is required to be the reference | ||||
| copy. | ||||
| A Content-Transfer-Encoding appropriate for this body is | (2) BNF describing the overall structure of MIME message | |||
| applied. Note that there is no fixed relationship between | and part headers has been added. This is a | |||
| the content type and the transfer encoding. In particular, | documentation change only -- the underlying syntax has | |||
| it may be appropriate to base the choice of base64 or | not changed in any way. | |||
| quoted-printable on character frequency counts which are | ||||
| specific to a given instance of a body. | ||||
| (3) The specific BNF for the seven content types in MIME | ||||
| has been removed. This BNF was incorrect, incomplete, | ||||
| and inconsistent with the type-independent BNF. And | ||||
| since the type-independent BNF already fully specifies | ||||
| the syntax of the various MIME headers, the type- | ||||
| specific BNF was, in the final analysis, completely | ||||
| unnecessary and caused more problems than it solved. | ||||
| Step 4. Insertion into entity. | (4) The more specific "US-ASCII" character set name has | |||
| replaced the use of the term ASCII in many parts of | ||||
| this specification. | ||||
| The encoded object is inserted into a MIME entity with | (5) The informal concept of a primary subtype has been | |||
| appropriate headers. The entity is then inserted into the | removed. | |||
| body of a higher-level entity (message or multipart) if | ||||
| needed. | ||||
| It is vital to note that these steps are only a model; they | (6) The term "object" was being used inconsistently. This | |||
| are specifically NOT a blueprint for how an actual system | term has been replaced with the more precise terms | |||
| would be built. In particular, the model fails to account | "body", "body part", and "entity" where appropriate. | |||
| for two common designs: | ||||
| 1. In many cases the conversion to a canonical | (7) The BNF for the multipart content-type has been | |||
| form prior to encoding will be subsumed into the | rearranged to make it clear that the CRLF preceding | |||
| encoder itself, which understands local formats | the boundary marker is actually part of the marker | |||
| directly. For example, the local newline | itself rather than the preceding body part. | |||
| convention for text bodies might be carried | ||||
| through to the encoder itself along with knowledge | ||||
| of what that format is. | ||||
| 2. The output of the encoders may have to pass | (8) In the rules on reassembling "message/partial" MIME | |||
| through one or more additional steps prior to | entities, "Subject" is added to the list of headers to | |||
| being transmitted as a message. As such, the | take from the inner message, and the example is | |||
| output of the encoder may not be conformant with | modified to clarify this point. | |||
| the formats specified by RFC822. In particular, | ||||
| once again it may be appropriate for the | ||||
| converter's output to be expressed using local | ||||
| newline conventions rather than using the standard | ||||
| RFC822 CRLF delimiters. | ||||
| Other implementation variations are conceivable as well. | (9) In the discussion of the application/postscript type, | |||
| The vital aspect of this discussion is that, in spite of any | an additional paragraph has been added warning about | |||
| optimizations, collapsings of required steps, or insertion | possible interoperability problems caused by embedding | |||
| of additional processing, the resulting messages must be | of binary data inside a PostScript MIME entity. | |||
| consistent with those produced by the model described here. | ||||
| For example, a message with the following header fields: | ||||
| Content-type: text/foo; charset=bar | (10) Added a clarifying note to the basic syntax rules for | |||
| Content-Transfer-Encoding: base64 | Content-Type to make it clear that the following two | |||
| forms: | ||||
| must be first represented in the text/foo form, then (if | Content-type: text/plain; charset=us-ascii (comment) | |||
| necessary) represented in the "bar" character set, and | ||||
| finally transformed via the base64 algorithm into a mail- | ||||
| safe form. | ||||
| Content-type: text/plain; charset="us-ascii" | ||||
| Appendix H -- Changes from RFC 1521 | are completely equivalent. | |||
| This document is a very minor revision of RFC 1521. For the | (11) The following sentence has been removed from the | |||
| convenience of those familiar with RFC 1521, the changes | discussion of the MIME-Version header: "However, | |||
| from that document are summarized in this appendix. For | conformant software is encouraged to check the version | |||
| further history, note that Appendix H in RFC 1521 specified | number and at least warn the user if an unrecognized | |||
| how that document differed from its predecessor, RFC 1341. | MIME-version is encountered." | |||
| 1. In the rules on reassembling "message/partial" MIME | (12) A typo was fixed that said "application/external-body" | |||
| entities in section 7.3.2, "Subject" is added to the list of | instead of "message/external-body". | |||
| headers to take from the inner message, and the example is | ||||
| modified to clarify this point. | ||||
| 2. In the discussion of the application/postscript type in | (13) The definition of a character set has been reorganized | |||
| section 7.4.2, an additional paragraph has been added | to make the requirements clearer. | |||
| warning against the embedding of binary data inside a | ||||
| PostScript MIME entity. | ||||
| 3. Added a clarifying note to the basic syntax rules in | (14) The definitions of "7bit" and "8bit" have been | |||
| section 4 to make it clear that the following two forms: | tightened so that use of bare CR, LF, and NUL | |||
| characters is no longer allowed. | ||||
| Content-type: text/plain; charset=us-ascii | (15) The definition of canonical text in MIME has been | |||
| Content-type: text/plain; charset="us-ascii" | tightened so that line breaks must be represented by a | |||
| CRLF sequence. CR and LF characters are not allowed | ||||
| outside of this usage. The definition of quoted- | ||||
| printable encoding has been altered accordingly. | ||||
| are completely equivalent. | (16) Prose was added to clarify the use of the "7bit", | |||
| "8bit", and "binary" transfer-encodings on multipart or | ||||
| message entities encapsulating "8bit" or "binary" data. | ||||
| 4. In section 7.2.3, a typo was fixed that said | (17) In Appendix A, "multipart/digest" support was added to | |||
| "application/external-body" instead of "message/external- | the list of requirements for minimal MIME conformance. | |||
| body". | Also, the requirements for "message/rfc822" support were | |||
| strengthened to clarify the importance of recognizing | ||||
| recursive structure. | ||||
| 5. In section 5, the following paragraph was added to | (18) The various restrictions on subtypes of "message" are | |||
| clarify the use of the "7bit" transfer-encoding in multipart | now specified entirely on a subtype by subtype basis. | |||
| or message entities encapsulating "8bit" or "binary" data: | ||||
| It should also be noted that, by definition, if a | (19) The definition of "message/rfc822" was changed to | |||
| "multipart" or "message" entity has a transfer- | indicate that at least one of the "From", "Subject", or | |||
| encoding value such as "7bit", but one of the | "Date" headers must be present. | |||
| enclosed parts has a less restrictive value such | ||||
| as "8bit", then either the outer "7bit" labelling | ||||
| is in error, because 8 bit data are included, or | ||||
| the inner "8bit" labelling placed an unnecessarily | ||||
| high demand on the transport system because the | ||||
| actual included data were actually 7bit-safe. | ||||
| 6. In Appendix A, "multipart/digest" support was added to | (20) The required handling of unrecognized subtypes as | |||
| the list of requirements for minimal MIME conformance. | "application/octet-stream" has been made more explicit | |||
| in both the type definitions sections and the | ||||
| conformance guidelines. | ||||
| (21) Examples using text/richtext were changed to | ||||
| text/enriched. | ||||
| Also, the requirements for "message/rfc822" support were | (22) The BNF definition of subtype has been changed to make | |||
| strengthened to clarify the importance of recognizing | it clear that either an IANA registered subtype or a | |||
| recursive structure. | nonstandard "X-" subtype must be used in a Content-Type | |||
| header field. | ||||
| 7. In section 7.3.1, the definition of "message/rfc822" was | (23) The use of escape and shift mechanisms in the US-ASCII | |||
| changed to indicate that at least one of the "From", | and ISO-8859-X character sets this specification | |||
| "Subject", or "Date" headers must be present. | defines has been clarified: Such mechanisms should | |||
| never be used in conjunction with these character sets | ||||
| and their effect if they are used is undefined. | ||||
| (24) The definition of the AFS access-type for | ||||
| message/external-body has been removed. | ||||
| References | (25) Entities that are simply registered for use and those | |||
| that are standardized by the IETF are now distinguished | ||||
| in the MIME BNF. | ||||
| [US-ASCII] Coded Character Set--7-Bit American Standard Code | (26) The handling of the combination of | |||
| for Information Interchange, ANSI X3.4-1986. | multipart/alternative and message/external-body is now | |||
| specifically addressed. | ||||
| [ATK] Borenstein, Nathaniel S., Multimedia Applications | Appendix H -- References | |||
| Development with the Andrew Toolkit, Prentice-Hall, 1990. | ||||
| [GIF] Graphics Interchange Format (Version 89a), Compuserve, | [ATK] | |||
| Inc., Columbus, Ohio, 1990. | Borenstein, Nathaniel S., Multimedia Applications | |||
| Development with the Andrew Toolkit, Prentice-Hall, 1990. | ||||
| [ISO-2022] International Standard--Information Processing-- | [GIF] | |||
| ISO 7-bit and 8-bit coded character sets--Code extension | Graphics Interchange Format (Version 89a), Compuserve, | |||
| techniques, ISO 2022:1986. | Inc., Columbus, Ohio, 1990. | |||
| [ISO-8859] Information Processing -- 8-bit Single-Byte Coded | [ISO-2022] | |||
| Graphic Character Sets -- Part 1: Latin Alphabet No. 1, ISO | International Standard -- Information Processing -- ISO | |||
| 8859-1:1987. Part 2: Latin alphabet No. 2, ISO 8859-2, | 7-bit and 8-bit Coded Character Sets -- Code Extension | |||
| 1987. Part 3: Latin alphabet No. 3, ISO 8859-3, 1988. Part | Techniques, ISO 2022:1986. | |||
| 4: Latin alphabet No. 4, ISO 8859-4, 1988. Part 5: | ||||
| Latin/Cyrillic alphabet, ISO 8859-5, 1988. Part 6: | ||||
| Latin/Arabic alphabet, ISO 8859-6, 1987. Part 7: | ||||
| Latin/Greek alphabet, ISO 8859-7, 1987. Part 8: | ||||
| Latin/Hebrew alphabet, ISO 8859-8, 1988. Part 9: Latin | ||||
| alphabet No. 5, ISO 8859-9, 1990. | ||||
| [ISO-646] International Standard--Information Processing-- | [ISO-8859] | |||
| ISO 7-bit coded character set for information interchange, | International Standard -- Information Processing -- 8-bit | |||
| ISO 646:1983. | Single-Byte Coded Graphic Character Sets -- Part 1: Latin | |||
| Alphabet No. 1, ISO 8859-1:1987. Part 2: Latin alphabet | ||||
| No. 2, ISO 8859-2, 1987. Part 3: Latin alphabet No. 3, | ||||
| ISO 8859-3, 1988. Part 4: Latin alphabet No. 4, ISO | ||||
| 8859-4, 1988. Part 5: Latin/Cyrillic alphabet, ISO | ||||
| 8859-5, 1988. Part 6: Latin/Arabic alphabet, ISO 8859-6, | ||||
| 1987. Part 7: Latin/Greek alphabet, ISO 8859-7, 1987. | ||||
| Part 8: Latin/Hebrew alphabet, ISO 8859-8, 1988. Part 9: | ||||
| Latin alphabet No. 5, ISO 8859-9, 1990. | ||||
| [MPEG] Video Coding Draft Standard ISO 11172 CD, ISO | [ISO-646] | |||
| IEC/JTC1/SC2/WG11 (Motion Picture Experts Group), May, 1991. | International Standard -- Information Processing -- ISO | |||
| 7-bit Coded Character Set For Information Interchange, | ||||
| ISO 646:1983. | ||||
| [PCM] CCITT, Fascicle III.4 - Recommendation G.711, "Pulse | [MPEG] | |||
| Code Modulation (PCM) of Voice Frequencies", Geneva, 1972. | Video Coding Draft Standard ISO 11172 CD, ISO | |||
| IEC/JTC1/SC2/WG11 (Motion Picture Experts Group), May, | ||||
| 1991. | ||||
| [POSTSCRIPT] Adobe Systems, Inc., PostScript Language | [PCM] | |||
| Reference Manual, Addison-Wesley, 1985. | CCITT, Fascicle III.4 - Recommendation G.711, "Pulse Code | |||
| Modulation (PCM) of Voice Frequencies", Geneva, 1972. | ||||
| [POSTSCRIPT2] Adobe Systems, Inc., PostScript Language | [POSTSCRIPT] | |||
| Reference Manual, Addison-Wesley, Second Edition, 1990. | Adobe Systems, Inc., PostScript Language Reference | |||
| Manual, Addison-Wesley, 1985. | ||||
| [X400] Schicker, Pietro, "Message Handling Systems, X.400", | [POSTSCRIPT2] | |||
| Message Handling Systems and Distributed Applications, E. | Adobe Systems, Inc., PostScript Language Reference | |||
| Stefferud, O-j. Jacobsen, and P. Schicker, eds., North- | Manual, Addison-Wesley, Second Edition, 1990. | |||
| Holland, 1989, pp. 3-41. | ||||
| [RFC-783] | ||||
| Sollins, K.R., "TFTP Protocol (revision 2)", RFC-783, | ||||
| MIT, June 1981. | ||||
| [RFC-783] Sollins, K.R. "TFTP Protocol (revision 2)", | [RFC-821] | |||
| RFC-783, MIT, June 1981. | Postel, J.B., "Simple Mail Transfer Protocol", STD 10, | |||
| RFC 821, USC/Information Sciences Institute, August 1982. | ||||
| [RFC-821] Postel, J.B. "Simple Mail Transfer | [RFC-822] | |||
| Protocol", STD 10, RFC 821, USC/Information Sciences | Crocker, D., "Standard for the Format of ARPA Internet | |||
| Institute, August 1982. | Text Messages", STD 11, RFC 822, UDEL, August 1982. | |||
| [RFC-822] Crocker, D., "Standard for the Format of ARPA | [RFC-934] | |||
| Internet Text Messages", STD 11, RFC 822, UDEL, August 1982. | Rose, M., and E. Stefferud, "Proposed Standard for | |||
| Message Encapsulation", RFC 934, Delaware and NMA, | ||||
| January 1985. | ||||
| [RFC-934] Rose, M., and E. Stefferud, "Proposed Standard for | [RFC-959] | |||
| Message Encapsulation", RFC 934, Delaware and NMA, January | Postel, J. and J. Reynolds, "File Transfer Protocol", STD | |||
| 1985. | 9, RFC 959, USC/Information Sciences Institute, October | |||
| 1985. | ||||
| [RFC-959] Postel, J. and J. Reynolds, "File Transfer | [RFC-1049] | |||
| Protocol", STD 9, RFC 959, USC/Information Sciences | Sirbu, M., "Content-Type Header Field for Internet | |||
| Institute, October 1985. | Messages", STD 11, RFC 1049, CMU, March 1988. | |||
| [RFC-1049] Sirbu, M., "Content-Type Header Field for | [RFC-1154] | |||
| Internet Messages", STD 11, RFC 1049, CMU, March 1988. | Robinson, D. and R. Ullmann, "Encoding Header Field for | |||
| Internet Messages", RFC 1154, Prime Computer, Inc., April | ||||
| 1990. | ||||
| [RFC-1421] Linn, J., "Privacy Enhancement for Internet | [RFC-1341] | |||
| Electronic Mail: Part I - Message Encryption and | Borenstein, N., and N. Freed, "MIME (Multipurpose | |||
| Authentication Procedures", RFC 1421, IAB IRTF PSRG, IETF | Internet Mail Extensions): Mechanisms for Specifying and | |||
| PEM WG, February 1993. | Describing the Format of Internet Message Bodies", RFC | |||
| 1341, Bellcore, Innosoft, June 1992. | ||||
| [RFC-1154] Robinson, D. and R. Ullmann, "Encoding Header | [RFC-1342] | |||
| Field for Internet Messages", RFC 1154, Prime Computer, | Moore, K., "Representation of Non-Ascii Text in Internet | |||
| Inc., April 1990. | Message Headers", RFC 1342, University of Tennessee, June | |||
| 1992. | ||||
| [RFC-1341] Borenstein, N., and N. Freed, "MIME | [RFC-1344] | |||
| (Multipurpose Internet Mail Extensions): Mechanisms for | Borenstein, N., "Implications of MIME for Internet Mail | |||
| Specifying and Describing the Format of Internet Message | Gateways", RFC 1344, Bellcore, June 1992. | |||
| Bodies", RFC 1341, Bellcore, Innosoft, June 1992. | ||||
| [RFC-1342] Moore, K., "Representation of Non-Ascii Text in | [RFC-1345] | |||
| Internet Message Headers", RFC 1342, University of | Simonsen, K., "Character Mnemonics & Character Sets", RFC | |||
| Tennessee, June 1992. | 1345, Rationel Almen Planlaegning, June 1992. | |||
| [RFC-1343] Borenstein, N., "A User Agent Configuration | [RFC-1421] | |||
| Mechanism for Multimedia Mail Format Information", RFC 1343, | Linn, J., "Privacy Enhancement for Internet Electronic | |||
| Bellcore, June 1992. | Mail: Part I -- Message Encryption and Authentication | |||
| Procedures", RFC 1421, IAB IRTF PSRG, IETF PEM WG, | ||||
| February 1993. | ||||
| [RFC-1344] Borenstein, N., "Implications of MIME for | [RFC-1422] | |||
| Internet Mail Gateways", RFC 1344, Bellcore, June 1992. | Kent, S., "Privacy Enhancement for Internet Electronic | |||
| Mail: Part II -- Certificate-Based Key Management", RFC | ||||
| 1422, IAB IRTF PSRG, IETF PEM WG, February 1993. | ||||
| [RFC-1423] | ||||
| Balenson, D., "Privacy Enhancement for Internet | ||||
| Electronic Mail: Part III -- Algorithms, Modes, and | ||||
| Identifiers", RFC 1423, IAB IRTF PSRG, IETF PEM WG, February 1993. | ||||
| [RFC-1345] Simonsen, K., "Character Mnemonics & Character | [RFC-1424] | |||
| Sets", RFC 1345, Rationel Almen Planlaegning, June 1992. | Kaliski, B., "Privacy Enhancement for Internet Electronic | |||
| Mail: Part IV -- Key Certification and Related | ||||
| Services", RFC 1424, IAB IRTF PSRG, IETF PEM WG, February 1993. | ||||
| [RFC-1426] Klensin, J., (WG Chair), Freed, N., (Editor), | [RFC-1521] | |||
| Rose, M., Stefferud, E., and D. Crocker, "SMTP Service | Borenstein, N., and N. Freed, "MIME (Multipurpose | |||
| Extension for 8bit-MIME transport", RFC 1426, United Nations | Internet Mail Extensions): Mechanisms for Specifying and | |||
| University, Innosoft, Dover Beach Consulting, Inc., Network | Describing the Format of Internet Message Bodies", RFC | |||
| Management Associates, Inc., The Branch Office, February | 1521, Bellcore, Innosoft, September, 1993. | |||
| 1993. | ||||
| [RFC-1522] Moore, K., "Representation of Non-Ascii Text in | [RFC-1522] | |||
| Internet Message Headers", RFC 1522, University of Tennessee, | Moore, K., "Representation of Non-ASCII Text in Internet | |||
| September 1993. | Message Headers", RFC 1522, University of Tennessee, | |||
| September 1993. | ||||
| [RFC-1340] Reynolds, J., and J. Postel, "Assigned Numbers", | [RFC-1524] | |||
| STD 2, RFC 1340, USC/Information Sciences Institute, July | Borenstein, N., "A User Agent Configuration Mechanism for | |||
| 1992. | Multimedia Mail Format Information", RFC 1524, Bellcore, | |||
| September 1993. | ||||
| [RFC-1521] Borenstein, N., and N. Freed, "MIME | [RFC-1563] | |||
| (Multipurpose Internet Mail Extensions): Mechanisms for | Borenstein, N., "The text/enriched MIME Content-type", | |||
| Specifying and Describing the Format of Internet Message | RFC 1563, Bellcore, January, 1994. | |||
| Bodies", RFC 1521, Bellcore, Innosoft, September, 1993. | ||||
| [RFC-1563] Borenstein, N., "The text/enriched MIME Content- | [RFC-1652] | |||
| type", RFC 1563, Bellcore, January, 1994. | Klensin, J., (WG Chair), Freed, N., (Editor), Rose, M., | |||
| Stefferud, E., and Crocker, D., "SMTP Service Extension | ||||
| for 8bit-MIME transport", RFC 1652, United Nations | ||||
| University, Innosoft, Dover Beach Consulting, Inc., | ||||
| Network Management Associates, Inc., The Branch Office, | ||||
| July 1994. | ||||
| [RFC-1700] | ||||
| Reynolds, J., and J. Postel, "Assigned Numbers", STD 2, | ||||
| RFC 1700, USC/Information Sciences Institute, October | ||||
| 1994. | ||||
| [RFC-MIME-HEADERS] | ||||
| Moore, K., "Representation of Non-Ascii Text in Internet | ||||
| Message Headers", RFC MIME-HEADERS, University of | ||||
| Tennessee, ?. | ||||
| Table of Contents | [RFC-REG] | |||
| Postel, J., "Media Type Registration Procedure", RFC REG, | ||||
| ?. | ||||
| 1 Introduction....................................... 4 | [US-ASCII] | |||
| 2 Notations, Conventions, and Generic BNF Grammar.... 4 | Coded Character Set -- 7-Bit American Standard Code for | |||
| 3 The MIME-Version Header Field...................... 6 | Information Interchange, ANSI X3.4-1986. | |||
| 4 The Content-Type Header Field...................... 8 | ||||
| 5 The Content-Transfer-Encoding Header Field......... 14 | ||||
| 5.1 Quoted-Printable Content-Transfer-Encoding......... 20 | ||||
| 5.2 Base64 Content-Transfer-Encoding................... 24 | ||||
| 6 Additional Content- Header Fields.................. 27 | ||||
| 6.1 Optional Content-ID Header Field................... 27 | ||||
| 6.2 Optional Content-Description Header Field.......... 27 | ||||
| 7 The Predefined Content-Type Values................. 28 | ||||
| 7.1 The Text Content-Type.............................. 28 | ||||
| 7.1.1 The charset parameter.............................. 28 | ||||
| 7.1.2 The Text/plain subtype............................. 32 | ||||
| 7.2 The Multipart Content-Type......................... 33 | ||||
| 7.2.1 Multipart: The common syntax...................... 34 | ||||
| 7.2.2 The Multipart/mixed (primary) subtype.............. 40 | ||||
| 7.2.3 The Multipart/alternative subtype.................. 40 | ||||
| 7.2.4 The Multipart/digest subtype....................... 43 | ||||
| 7.2.5 The Multipart/parallel subtype..................... 43 | ||||
| 7.3 The Message Content-Type........................... 44 | ||||
| 7.3.1 The Message/rfc822 (primary) subtype............... 45 | ||||
| 7.3.2 The Message/Partial subtype........................ 45 | ||||
| 7.3.3 The Message/External-Body subtype.................. 49 | ||||
| 7.4 The Application Content-Type....................... 58 | ||||
| 7.4.1 The Application/Octet-Stream (primary) subtype..... 58 | ||||
| 7.4.2 The Application/PostScript subtype................. 59 | ||||
| 7.4.3 Other Application subtypes......................... 62 | ||||
| 7.5 The Image Content-Type............................. 63 | ||||
| 7.6 The Audio Content-Type............................. 63 | ||||
| 7.7 The Video Content-Type............................. 64 | ||||
| 7.8 Experimental Content-Type Values................... 64 | ||||
| Summary............................................ 65 | ||||
| Security Considerations............................ 65 | ||||
| Authors' Addresses................................. 66 | ||||
| Acknowledgements................................... 67 | ||||
| Appendix A -- Minimal MIME-Conformance............. 69 | ||||
| Appendix B -- General Guidelines For Sending Email Data 72 | ||||
| Appendix C -- A Complex Multipart Example.......... 75 | ||||
| Appendix D -- Collected Grammar.................... 77 | ||||
| Appendix E -- IANA Registration Procedures......... 82 | ||||
| E.1 Registration of New Content-type/subtype Values.. 82 | ||||
| E.2 Registration of New Access-type Values for Message/external-body 83 | [X400] | |||
| Appendix F -- Summary of the Seven Content-types... 84 | Schicker, Pietro, "Message Handling Systems, X.400", | |||
| Appendix G -- Canonical Encoding Model............. 87 | Message Handling Systems and Distributed Applications, E. | |||
| Appendix H -- Changes from RFC 1521................ 90 | Stefferud, O-j. Jacobsen, and P. Schicker, eds., North- | |||
| References......................................... 92 | Holland, 1989, pp. 3-41. | |||
| End of changes. 729 change blocks. | ||||
| 3583 lines changed or deleted | 3335 lines changed or added | |||