idnits 2.17.1

draft-ietf-822ext-mime-imt-01.txt:

  ** The Abstract section seems to be numbered

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.
     Expected boilerplate is as follows today (2024-04-26) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:

        This Internet-Draft is submitted in full conformance with the
        provisions of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:

        Copyright (c) 2024 IETF Trust and the persons identified as the
        document authors. All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:

        This document is subject to BCP 78 and the IETF Trust's Legal
        Provisions Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document. Please review these documents
        carefully, as they describe your rights and restrictions with respect
        to this document. Code Components extracted from this document must
        include Simplified BSD License text as described in Section 4.e of
        the Trust Legal Provisions and are provided without warranty as
        described in the Simplified BSD License.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date. The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents.

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?
  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts.

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 318: '...y MIME text type MUST represent a line...'
     RFC 2119 keyword, line 320: '...in text MUST represent a line break. ...'
     RFC 2119 keyword, line 402: '...racter encodings MUST use an appropria...'
     RFC 2119 keyword, line 861: '...undary delimiter MUST NOT appear insid...'
     RFC 2119 keyword, line 943: '...undary delimiter MUST occur at the beg...'
     (7 more instances...)

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Line 486 has weird spacing: '...of text is "p...'

  == Line 956 has weird spacing: '...F (line break...'

  == Line 1703 has weird spacing: '...ed, the defau...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (May 5, 1995) is 10584 days in the past. Is this
     intentional?
  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? 'RFC-1341' on line 216 looks like a reference

  -- Missing reference section? 'RFC-1563' on line 331 looks like a reference

  -- Missing reference section? 'ISO-646' on line 382 looks like a reference

  -- Missing reference section? 'US-ASCII' on line 433 looks like a reference

  -- Missing reference section? 'ISO-8859' on line 436 looks like a reference

  -- Missing reference section? 'PCM' on line 533 looks like a reference

  -- Missing reference section? 'MPEG' on line 551 looks like a reference

  -- Missing reference section? 'POSTSCRIPT' on line 643 looks like a reference

  -- Missing reference section? 'POSTSCRIPT2' on line 644 looks like a reference

  -- Missing reference section? 'MIME-IMB' on line 878 looks like a reference

  -- Missing reference section? 'RFC-959' on line 1700 looks like a reference

  -- Missing reference section? 'RFC-783' on line 1695 looks like a reference

     Summary: 9 errors (**), 0 flaws (~~), 4 warnings (==), 14 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1    Network Working Group                              Nathaniel Borenstein
2    Internet Draft                                                Ned Freed
3

5    Multipurpose Internet Mail Extensions
6    (MIME) Part Two:

8    Media Types

10   May 5, 1995

12   Status of this Memo

14   This document is an Internet-Draft. Internet-Drafts are
15   working documents of the Internet Engineering Task Force
16   (IETF), its areas, and its working groups. Note that other
17   groups may also distribute working documents as Internet-
18   Drafts.

20   Internet-Drafts are draft documents valid for a maximum of six
21   months. Internet-Drafts may be updated, replaced, or obsoleted
22   by other documents at any time. It is not appropriate to use
23   Internet-Drafts as reference material or to cite them other
24   than as a "working draft" or "work in progress".

26   To learn the current status of any Internet-Draft, please
27   check the 1id-abstracts.txt listing contained in the
28   Internet-Drafts Shadow Directories on ds.internic.net (US East
29   Coast), nic.nordu.net (Europe), ftp.isi.edu (US West Coast),
30   or munnari.oz.au (Pacific Rim).

32   1. Abstract

34   STD 11, RFC 822 defines a message representation protocol
35   specifying considerable detail about US-ASCII message headers,
36   but which leaves the message content, or message body, as flat
37   US-ASCII text. This set of documents, collectively called the
38   Multipurpose Internet Mail Extensions, or MIME, redefines the
39   format of messages to allow for
40   (1) textual message bodies in character sets other than
41   US-ASCII,

43   (2) non-textual message bodies,

45   (3) multi-part message bodies, and

47   (4) textual header information in character sets other than
48   US-ASCII.

50   These documents are based on earlier work documented in RFC
51   934, STD 11, and RFC 1049, but extend and revise them.
52   Because RFC 822 said so little about message bodies, these
53   documents are largely orthogonal to (rather than a revision
54   of) RFC 822.

56   In particular, these documents are designed to provide
57   facilities to include multiple parts in a single message, to
58   represent body and header text in character sets other than
59   US-ASCII, to represent formatted multi-font text messages, to
60   represent non-textual material such as images and audio
61   fragments, and generally to facilitate later extensions
62   defining new types of Internet mail for use by cooperating
63   mail agents.

65   The initial document in this set, RFC MIME-IMB, specifies the
66   various headers used to describe the structure of MIME
67   messages. This second document defines the general structure
68   of the MIME media typing system and defines an initial set of
69   media types. The third document, RFC MIME-HEADERS, describes
70   extensions to RFC 822 to allow non-US-ASCII text data in
71   Internet mail header fields. The fourth document, RFC MIME-
72   REG, specifies various IANA registration procedures for MIME-
73   related entities. The fifth and final document, RFC MIME-
74   CONF, describes MIME conformance criteria as well as providing
75   some illustrative examples of MIME message formats,
76   acknowledgements, and the bibliography.

78   These documents are revisions of RFCs 1521 and 1522, which
79   themselves were revisions of RFCs 1341 and 1342. An appendix
80   in RFC MIME-CONF describes differences and changes from
81   previous versions.

83   2. Table of Contents

85   1 Abstract
86   2 Table of Contents
87   3 Introduction
88   4 Definition of a Top-Level Media Type
89   5 Overview Of The Initial Top-Level Media Types
90   6 Discrete Media Type Values
91   6.1 Text Media Type
92   6.1.1 Representation of Line Breaks
93   6.1.2 Charset Parameter
94   6.1.3 Plain Subtype
95   6.1.4 Unrecognized Subtypes
96   6.2 Image Media Type
97   6.3 Audio Media Type
98   6.4 Video Media Type
99   6.5 Application Media Type
100  6.5.1 Octet-Stream Subtype
101  6.5.2 PostScript Subtype
102  6.5.3 Other Application Subtypes
103  7 Composite Media Type Values
104  7.1 Multipart Media Type
105  7.1.1 Common Syntax
106  7.1.2 Handling Nested Messages and Multiparts
107  7.1.3 Mixed Subtype
108  7.1.4 Alternative Subtype
109  7.1.5 Digest Subtype
110  7.1.6 Parallel Subtype
111  7.1.7 Other Multipart Subtypes
112  7.2 Message Media Type
113  7.2.1 RFC822 Subtype
114  7.2.2 Partial Subtype
115  7.2.2.1 Message Fragmentation and Reassembly
116  7.2.2.2 Fragmentation and Reassembly Example
117  7.2.3 External-Body Subtype
118  7.2.4 Other Message Subtypes
119  8 Experimental Media Type Values
120  9 Summary
121  10 Security Considerations
122  11 Authors' Addresses
123  A Collected Grammar
124  3. Introduction

126  The first document in this set, RFC MIME-IMB, defines a number
127  of header fields, including Content-Type. The Content-Type
128  field is used to specify the nature of the data in the body of
129  an entity, by giving media type and subtype identifiers, and
130  by providing auxiliary information that may be required for
131  certain media types. After the type and subtype names, the
132  remainder of the header field is simply a set of parameters,
133  specified in an attribute/value notation. The ordering of
134  parameters is not significant.
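
As an illustration only (it is not part of the draft), the structure of a
Content-Type value described above can be sketched with Python's standard
"email" package. The helper name parse_content_type and the parameter names
"a" and "b" are hypothetical, and the exact grammar of the field is the one
given in RFC MIME-IMB, which this sketch does not attempt to enforce.

```python
from email.message import Message


def parse_content_type(value):
    """Split a Content-Type value into (type, subtype, parameters).

    Hypothetical helper for illustration; it relies on the standard
    library parser rather than on the grammar in RFC MIME-IMB.
    """
    msg = Message()
    msg["Content-Type"] = value
    # get_params() returns (name, value) pairs. The first pair is the
    # "type/subtype" token itself, so it is skipped. Parameters are put
    # in a dict because their order carries no meaning.
    params = {name.lower(): val for name, val in msg.get_params()[1:]}
    return msg.get_content_maintype(), msg.get_content_subtype(), params


# The same parameters in a different order describe the same content.
first = parse_content_type('image/xyz; a=1; b="2"')
second = parse_content_type('image/xyz; b="2"; a=1')
print(first == second)
```

Keeping the parameters in a dictionary rather than a list is one way of
reflecting the rule that parameter order is not significant.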
136  In general, the top-level media type is used to declare the
137  general type of data, while the subtype specifies a specific
138  format for that type of data. Thus, a media type of
139  "image/xyz" is enough to tell a user agent that the data is an
140  image, even if the user agent has no knowledge of the specific
141  image format "xyz". Such information can be used, for
142  example, to decide whether or not to show a user the raw data
143  from an unrecognized subtype -- such an action might be
144  reasonable for unrecognized subtypes of text, but not for
145  unrecognized subtypes of image or audio. For this reason,
146  registered subtypes of text, image, audio, and video should
147  not contain embedded information that is really of a different
148  type. Such compound formats should be represented using the
149  "multipart" or "application" types.

151  Parameters are modifiers of the media subtype, and as such do
152  not fundamentally affect the nature of the content. The set
153  of meaningful parameters depends on the media type and
154  subtype. Most parameters are associated with a single
155  specific subtype. However, a given top-level media type may
156  define parameters which are applicable to any subtype of that
157  type. Parameters may be required by their defining media type
158  or subtype or they may be optional. MIME implementations must
159  also ignore any parameters whose names they do not recognize.

161  MIME's Content-Type header field and media type mechanism have
162  been carefully designed to be extensible, and it is expected
163  that the set of media type/subtype pairs and their associated
164  parameters will grow significantly over time. Several other
165  MIME entities, most notably the list of names of character
166  sets registered for MIME usage, are likely to have new values
167  defined over time. In order to ensure that the set of such
168  values is developed in an orderly, well-specified, and public
169  manner, MIME sets up a registration process which uses the
170  Internet Assigned Numbers Authority (IANA) as a central
171  registry for MIME's extension areas. The registration process
172  is described in a companion document, RFC MIME-REG.

174  The initial seven standard top-level media types are defined
175  and described in the remainder of this document.

177  4. Definition of a Top-Level Media Type

179  The definition of a top-level media type consists of:

181  (1) a name and a description of the type, including
182  criteria for whether a particular type would qualify
183  under that type,

185  (2) the names and definitions of parameters, if any, which
186  are defined for all subtypes of that type (including
187  whether such parameters are required or optional),

189  (3) how a user agent and/or gateway should handle unknown
190  subtypes of this type,

192  (4) general considerations on gatewaying objects of this
193  top-level type, if any, and

195  (5) any restrictions on content-transfer-encodings for
196  objects of this top-level type.

198  5. Overview Of The Initial Top-Level Media Types

200  The five discrete top-level media types are:

202  (1) text -- textual information. The subtype "plain" in
203  particular indicates plain (unformatted) text. No
204  special software is required to get the full meaning of
205  the text, aside from support for the indicated
206  character set. Other subtypes are to be used for
207  enriched text in forms where application software may
208  enhance the appearance of the text, but such software
209  must not be required in order to get the general idea
210  of the content. Possible subtypes thus include any
211  word processor format that can be read without
212  resorting to software that understands the format. In
213  particular, formats that employ embedded binary
214  formatting information are not considered directly
215  readable. A very simple and portable subtype,
216  richtext, was defined in RFC 1341 [RFC-1341], with a
217  further revision in RFC 1563 [RFC-1563] under the name
218  "enriched".

220  (2) image -- image data. Image requires a display device
221  (such as a graphical display, a graphics printer, or a
222  FAX machine) to view the information. An initial
223  subtype is defined for the widely-used image format
224  JPEG.

226  (3) audio -- audio data. Audio requires an audio output
227  device (such as a speaker or a telephone) to "display"
228  the contents. An initial subtype "basic" is defined in
229  this document.

231  (4) video -- video data. Video requires the capability to
232  display moving images, typically including specialized
233  hardware and software. An initial subtype "mpeg" is
234  defined in this document.

236  (5) application -- some other kind of data, typically
237  either uninterpreted binary data or information to be
238  processed by an application. The subtype "octet-
239  stream" is to be used in the case of uninterpreted
240  binary data, in which case the simplest recommended
241  action is to offer to write the information into a file
242  for the user. The "PostScript" subtype is also defined
243  for the transport of PostScript material. Other
244  expected uses for "application" include spreadsheets,
245  data for mail-based scheduling systems, languages
246  for "active" (computational) messaging, and word
247  processing formats that are not directly readable.
248  Note that security considerations may exist for some
249  types of application data, most notably
250  application/PostScript and any form of active
251  messaging. These issues are discussed later in this
252  document.

254  The two composite top-level media types are:

256  (1) multipart -- data consisting of multiple parts of
257  independent data types. Four subtypes are initially
258  defined, including the basic "mixed" subtype specifying
259  a generic mixed set of parts, "alternative" for
260  representing the same data in multiple formats,
261  "parallel" for parts intended to be viewed
262  simultaneously, and "digest" for multipart entities in
263  which each part has a default type of "message/rfc822".

265  (2) message -- an encapsulated message. A body of media
266  type "message" is itself all or part of some kind of
267  message object. Such objects may in turn contain other
268  messages and body parts of their own. The "rfc822"
269  subtype is used when the encapsulated content is itself
270  an RFC 822 message. The "partial" subtype is defined
271  for partial RFC 822 messages, to permit the fragmented
272  transmission of bodies that are thought to be too large
273  to be passed through transport facilities in one piece.
274  Another subtype, "external-body", is defined for
275  specifying large bodies by reference to an external
276  data source.

278  It should be noted that the list of media type values given
279  here may be augmented in time, via the mechanisms described
280  above, and that the set of subtypes is expected to grow
281  substantially.

283  6. Discrete Media Type Values

285  Five of the seven initial media type values refer to discrete
286  bodies. The content of such entities is handled by non-MIME
287  mechanisms; they are opaque to MIME processors.

289  6.1. Text Media Type

291  The text media type is intended for sending material which is
292  principally textual in form. A "charset" parameter may be
293  used to indicate the character set of the body text for some
294  text subtypes, notably including the subtype "text/plain",
295  which indicates plain (unformatted) text. The default media
296  type for Internet mail if none is specified is "text/plain;
297  charset=us-ascii".
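
The default just stated can be sketched as follows. This is an illustrative
assumption about how a reader of messages might apply it, not text from the
draft: the helper name content_type_of and the sample messages are
hypothetical, and the parsing is delegated to Python's standard "email"
package.

```python
from email import message_from_string

# Default media type for Internet mail when no Content-Type is present.
DEFAULT_CONTENT_TYPE = "text/plain; charset=us-ascii"


def content_type_of(raw_message):
    """Return the Content-Type value of a message, applying the default.

    Hypothetical helper: header lookup is case-insensitive, and the
    default is used only when the field is absent altogether.
    """
    msg = message_from_string(raw_message)
    return msg.get("Content-Type", DEFAULT_CONTENT_TYPE)


# A message with no Content-Type field, as plain RFC 822 mail would be.
unlabelled = "Subject: hello\n\nJust some text.\n"
# A message that labels its body explicitly.
labelled = "Content-Type: image/jpeg\n\n(image data would go here)\n"

print(content_type_of(unlabelled))
print(content_type_of(labelled))
```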
299  Beyond plain text, there are many formats for representing
300  what might be known as "extended text" -- text with embedded
301  formatting and presentation information. An interesting
302  characteristic of many such representations is that they are
303  to some extent readable even without the software that
304  interprets them. It is useful, then, to distinguish them, at
305  the highest level, from such unreadable data as images, audio,
306  or text represented in an unreadable form. In the absence of
307  appropriate interpretation software, it is reasonable to show
308  subtypes of text to the user, while it is not reasonable to do
309  so with most nontextual data.

311  Such formatted textual data should be represented using
312  subtypes of text. Plausible subtypes of text are typically
313  given by the common name of the representation format, e.g.,
314  "text/enriched" [RFC-1563].

316  6.1.1. Representation of Line Breaks

318  The canonical form of any MIME text type MUST represent a line
319  break as a CRLF sequence. Similarly, any occurrence of CRLF
320  in text MUST represent a line break. Use of CR and LF outside
321  of line break sequences is also forbidden.

323  This rule applies regardless of format or character set or
324  sets involved.

326  NOTE: The proper interpretation of line breaks when a body is
327  displayed depends on the media type. In particular, while it
328  is appropriate to treat a line break as a transition to a new
329  line when displaying a text/plain body, this treatment is
330  actually incorrect for other subtypes of text like
331  text/enriched [RFC-1563].

333  6.1.2. Charset Parameter

335  A critical parameter that may be specified in the Content-Type
336  field for text/plain data is the character set. This is
337  specified with a "charset" parameter, as in:

339      Content-type: text/plain; charset=iso-8859-1

341  Unlike some other parameter values, the values of the charset
342  parameter are NOT case sensitive. The default character set,
343  which must be assumed in the absence of a charset parameter,
344  is US-ASCII.

346  The specification for any future subtypes of "text" must
347  specify whether or not they will also utilize a "charset"
348  parameter, and may possibly restrict its values as well. When
349  used with a particular body, the semantics of the "charset"
350  parameter should be identical to those specified here for
351  "text/plain", i.e., the body consists entirely of characters
352  in the given charset. In particular, definers of future text
353  subtypes should pay close attention to the implications of
354  multioctet character sets for their subtype definitions.

356  This RFC specifies the definition of the charset parameter for
357  the purposes of MIME to be the name of a character set, as
358  "character set" is defined in MIME-IMB. The rules regarding
359  line breaks detailed in the previous section must also be
360  observed -- a character set whose definition does not conform
361  to these rules cannot be used in a MIME text type.

363  An initial list of predefined character set names can be found
364  at the end of this section. Additional character sets may be
365  registered with IANA as described in RFC MIME-REG.

367  Note that if the specified character set includes 8-bit data,
368  a Content-Transfer-Encoding header field and a corresponding
369  encoding on the data are required in order to transmit the
370  body via some mail transfer protocols, such as SMTP.

372  The default character set, US-ASCII, has been the subject of
373  some confusion and ambiguity in the past. Not only were there
374  some ambiguities in the definition, there have been wide
375  variations in practice. In order to eliminate such ambiguity
376  and variations in the future, it is strongly recommended that
377  new user agents explicitly specify a character set as a media
378  type parameter in the Content-Type header field. "US-ASCII"
379  does not indicate an arbitrary 7-bit character code, but
380  specifies that the body uses character coding that uses the
381  exact correspondence of octets to characters specified in US-
382  ASCII. National use variations of ISO 646 [ISO-646] are NOT
383  US-ASCII and their use in Internet mail is explicitly
384  discouraged. The omission of the ISO 646 character set is
385  deliberate in this regard. The character set name of "US-
386  ASCII" explicitly refers to ANSI X3.4-1986 [US-ASCII] only.

388  The character set name "ASCII" is reserved and must not be
389  used for any purpose.

391  NOTE: RFC 821 explicitly specifies "ASCII", and references an
392  earlier version of the American Standard. Insofar as one of
393  the purposes of specifying a media type and character set is
394  to permit the receiver to unambiguously determine how the
395  sender intended the coded message to be interpreted, assuming
396  anything other than "strict ASCII" as the default would risk
397  unintentional and incompatible changes to the semantics of
398  messages now being transmitted. This also implies that
399  messages containing characters coded according to national
400  variations on ISO 646, or using code-switching procedures
401  (e.g., those of ISO 2022), as well as 8-bit or multiple octet
402  character encodings MUST use an appropriate character set
403  specification to be consistent with this specification.

405  The complete US-ASCII character set is listed in ANSI X3.4-
406  1986. Note that the control characters including DEL (0-31,
407  127) have no defined meaning apart from the combination CRLF
408  (US-ASCII values 13 and 10) indicating a new line. Two of the
409  characters have de facto meanings in wide use: FF (12) often
410  means "start subsequent text on the beginning of a new page";
411  and TAB or HT (9) often (though not always) means "move the
412  cursor to the next available column after the current position
413  where the column number is a multiple of 8 (counting the first
414  column as column 0)." Apart from this, any use of the control
415  characters or DEL in a body must be part of a private
416  agreement between the sender and recipient. Such private
417  agreements are discouraged and should be replaced by the other
418  capabilities of this document.

420  NOTE: Beyond US-ASCII, an enormous proliferation of character
421  sets is possible. It is the opinion of the IETF working group
422  that a large number of character sets is NOT a good thing. We
423  would prefer to specify a SINGLE character set that can be
424  used universally for representing all of the world's languages
425  in Internet mail. Unfortunately, existing practice in several
426  communities seems to point to the continued use of multiple
427  character sets in the near future. For this reason, we define
428  names for a small number of character sets for which a strong
429  constituent base exists.

431  The defined charset values are:

433  (1) US-ASCII -- as defined in ANSI X3.4-1986 [US-ASCII].

435  (2) ISO-8859-X -- where "X" is to be replaced, as
436  necessary, for the parts of ISO-8859 [ISO-8859]. Note
437  that the ISO 646 character sets have deliberately been
438  omitted in favor of their 8859 replacements, which are
439  the designated character sets for Internet mail. As of
440  the publication of this document, the legitimate values
441  for "X" are the digits 1 through 9.

443  All of these character sets are used as pure 7- or 8-bit sets
444  without any shift or escape functions. The meaning of shift
445  and escape sequences in these character sets is not defined.
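
A minimal sketch of the charset names defined above, assuming a simple
membership test is all that is wanted. The names DEFINED_CHARSETS and
is_defined_charset are hypothetical and do not come from the draft; the
sketch covers only the names listed in this section, compared without regard
to case as the text requires.

```python
# Charset names defined in this section: US-ASCII plus ISO-8859-1 through
# ISO-8859-9. They are stored in lower case because charset values are
# not case sensitive.
DEFINED_CHARSETS = {"us-ascii"} | {f"iso-8859-{part}" for part in range(1, 10)}


def is_defined_charset(name):
    """Return True if name is one of the charset values defined here."""
    return name.lower() in DEFINED_CHARSETS


print(is_defined_charset("ISO-8859-1"))
print(is_defined_charset("us-ascii"))
# "ASCII" is reserved and must not be used, so it is not in the set.
print(is_defined_charset("ASCII"))
```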
447  The character sets specified above are the ones that were
448  relatively uncontroversial during the drafting of MIME. This
449  document does not endorse the use of any particular character
450  set other than US-ASCII, and recognizes that the future
451  evolution of world character sets remains unclear. It is
452  expected that in the future, additional character sets will be
453  registered for use in MIME.

455  Note that the character set used, if anything other than US-
456  ASCII, must always be explicitly specified in the Content-Type
457  field.

459  No other character set name may be used in Internet mail
460  without the publication of a formal specification and its
461  registration with IANA, or by private agreement, in which case
462  the character set name must begin with "X-".

464  Implementors are discouraged from defining new character sets
465  unless absolutely necessary.

467  The "charset" parameter has been defined primarily for the
468  purpose of textual data, and is described in this section for
469  that reason. However, it is conceivable that non-textual data
470  might also wish to specify a charset value for some purpose,
471  in which case the same syntax and values should be used.

473  In general, composition software should always use the "lowest
474  common denominator" character set possible. For example, if a
475  body contains only US-ASCII characters, it should be marked as
476  being in the US-ASCII character set, not ISO-8859-1, which,
477  like all the ISO-8859 family of character sets, is a superset
478  of US-ASCII. More generally, if a widely-used character set
479  is a subset of another character set, and a body contains only
480  characters in the widely-used subset, it should be labelled as
481  being in that subset. This will increase the chances that the
482  recipient will be able to view the resulting object correctly.

484  6.1.3. Plain Subtype

486  The simplest and most important subtype of text is "plain".
487  This indicates plain (unformatted) text. The default media
488  type of "text/plain; charset=us-ascii" for Internet mail
489  describes existing Internet practice. That is, it is the type
490  of body defined by RFC 822.

492  No other text subtype is defined by this document.

494  6.1.4. Unrecognized Subtypes

496  Unrecognized subtypes of text should be treated as subtype
497  "plain" as long as the MIME implementation knows how to handle
498  the charset. Unrecognized subtypes which also specify an
499  unrecognized charset should be treated as "application/octet-
500  stream".

502  6.2. Image Media Type

504  A media type of "image" indicates that the body contains an
505  image. The subtype names the specific image format. These
506  names are not case sensitive. An initial subtype is "jpeg" for
507  the JPEG format using JFIF encoding.

509  The list of image subtypes given here is neither exclusive nor
510  exhaustive, and is expected to grow as more types are
511  registered with IANA, as described in RFC MIME-REG.

513  Unrecognized subtypes of image should at a minimum be treated
514  as "application/octet-stream". Implementations may optionally
515  elect to pass subtypes of image that they do not specifically
516  recognize to a robust general-purpose image viewing
517  application, if such an application is available.

519  6.3. Audio Media Type

521  A media type of "audio" indicates that the body contains audio
522  data. Although there is not yet a consensus on an "ideal"
523  audio format for use with computers, there is a pressing need
524  for a format capable of providing interoperable behavior.

526  The initial subtype of "basic" is specified to meet this
527  requirement by providing an absolutely minimal lowest common
528  denominator audio format. It is expected that richer formats
529  for higher quality and/or lower bandwidth audio will be
530  defined by a later document.
532  The content of the "audio/basic" subtype is single channel
533  audio encoded using 8-bit ISDN mu-law [PCM] at a sample rate
534  of 8000 Hz.

536  Unrecognized subtypes of audio should at a minimum be treated
537  as "application/octet-stream". Implementations may optionally
538  elect to pass subtypes of audio that they do not specifically
539  recognize to a robust general-purpose audio playing
540  application, if such an application is available.

542  6.4. Video Media Type

544  A media type of "video" indicates that the body contains a
545  time-varying-picture image, possibly with color and
546  coordinated sound. The term "video" is used extremely
547  generically, rather than with reference to any particular
548  technology or format, and is not meant to preclude subtypes
549  such as animated drawings encoded compactly. The subtype
550  "mpeg" refers to video coded according to the MPEG standard
551  [MPEG].

553  Note that although in general this document strongly
554  discourages the mixing of multiple media in a single body, it
555  is recognized that many so-called "video" formats include a
556  representation for synchronized audio, and this is explicitly
557  permitted for subtypes of "video".

559  Unrecognized subtypes of video should at a minimum be treated
560  as "application/octet-stream". Implementations may optionally
561  elect to pass subtypes of video that they do not specifically
562  recognize to a robust general-purpose video display
563  application, if such an application is available.

565  6.5. Application Media Type

567  The "application" media type is to be used for discrete data
568  which do not fit in any of the other categories, and
569  particularly for data to be processed by some type of
570  application program. This is information which must be
571  processed by an application before it is viewable or usable by
572  a user. Expected uses for the application media type include
573  file transfer, spreadsheets, data for mail-based scheduling
574  systems, and languages for "active" (computational) messages.
575  (The latter, in particular, can pose security problems which
576  must be understood by implementors, and are considered in
577  detail in the discussion of the application/PostScript media
578  type.)

580  For example, a meeting scheduler might define a standard
581  representation for information about proposed meeting dates.
582  An intelligent user agent would use this information to
583  conduct a dialog with the user, and might then send additional
584  material based on that dialog. More generally, there have
585  been several "active" messaging languages developed in which
586  programs in a suitably specialized language are transported to
587  a remote location and automatically run in the recipient's
588  environment.

590  Such applications may be defined as subtypes of the
591  "application" media type. This document defines two subtypes:
592  octet-stream, and PostScript.

594  The subtype of application will often be the name of the
595  application for which the data are intended. This does not
596  mean, however, that any application program name may be used
597  freely as a subtype of application. Usage of any subtype
598  (other than subtypes beginning with "x-") must be registered
599  with IANA, as described in RFC MIME-REG.

601  6.5.1. Octet-Stream Subtype

603  The "octet-stream" subtype is used to indicate that a body
604  contains arbitrary binary data. The set of currently defined
605  parameters is:

607  (1) TYPE -- the general type or category of binary data.
608  This is intended as information for the human recipient
609  rather than for any automatic processing.

611  (2) PADDING -- the number of bits of padding that were
612  appended to the bit-stream comprising the actual
613  contents to produce the enclosed 8-bit byte-oriented
614  data. This is useful for enclosing a bit-stream in a
615  body when the total number of bits is not a multiple of
616  8.

618  Both of these parameters are optional.

620  An additional parameter, "CONVERSIONS", was defined in RFC
621  1341 but has since been removed. RFC 1341 also defined the
622  use of a "NAME" parameter which gave a suggested file name to
623  be used if the data were to be written to a file. This has
624  been deprecated in anticipation of a separate Content-
625  Disposition header field, to be defined in a subsequent RFC.

627  The recommended action for an implementation that receives an
628  application/octet-stream object is to simply offer to put the
629  data in a file, with any Content-Transfer-Encoding undone, or
630  perhaps to use it as input to a user-specified process.

632  To reduce the danger of transmitting rogue programs, it is
633  strongly recommended that implementations NOT implement a
634  path-search mechanism whereby an arbitrary program named in
635  the Content-Type parameter (e.g., an "interpreter=" parameter)
636  is found and executed using the message body as input.

638  6.5.2. PostScript Subtype

640  A media type of "application/postscript" indicates a
641  PostScript program. Currently two variants of the PostScript
642  language are allowed; the original level 1 variant is
643  described in [POSTSCRIPT] and the more recent level 2 variant
644  is described in [POSTSCRIPT2].

646  PostScript is a registered trademark of Adobe Systems, Inc.
647  Use of the MIME media type "application/postscript" implies
648  recognition of that trademark and all the rights it entails.

650  The PostScript language definition provides facilities for
651  internal labelling of the specific language features a given
652  program uses. This labelling, called the PostScript document
653  structuring conventions, or DSC, is very general and provides
654  substantially more information than just the language level.
655 The use of document structuring conventions, while not 656 required, is strongly recommended as an aid to 657 interoperability. Documents which lack proper structuring 658 conventions cannot be tested to see whether or not they will 659 work in a given environment. As such, some systems may assume 660 the worst and refuse to process unstructured documents. 662 The execution of general-purpose PostScript interpreters 663 entails serious security risks, and implementors are 664 discouraged from simply sending PostScript bodies to "off- 665 the-shelf" interpreters. While it is usually safe to send 666 PostScript to a printer, where the potential for harm is 667 greatly constrained by typical printer environments, 668 implementors should consider all of the following before they 669 add interactive display of PostScript bodies to their MIME 670 readers. 672 The remainder of this section outlines some, though probably 673 not all, of the possible problems with the transport of 674 PostScript objects. 676 (1) Dangerous operations in the PostScript language 677 include, but may not be limited to, the PostScript 678 operators "deletefile", "renamefile", "filenameforall", 679 and "file". "File" is only dangerous when applied to 680 something other than standard input or output. 681 Implementations may also define additional nonstandard 682 file operators; these may also pose a threat to 683 security. "Filenameforall", the wildcard file search 684 operator, may appear at first glance to be harmless. 685 Note, however, that this operator has the potential to 686 reveal information about what files the recipient has 687 access to, and this information may itself be 688 sensitive. Message senders should avoid the use of 689 potentially dangerous file operators, since these 690 operators are quite likely to be unavailable in secure 691 PostScript implementations. 
Message receiving and 692 displaying software should either completely disable 693 all potentially dangerous file operators or take 694 special care not to delegate any special authority to 695 their operation. These operators should be viewed as 696 being done by an outside agency when interpreting 697 PostScript documents. Such disabling and/or checking 698 should be done completely outside of the reach of the 699 PostScript language itself; care should be taken to 700 ensure that no method exists for re-enabling full- 701 function versions of these operators. 703 (2) The PostScript language provides facilities for exiting 704 the normal interpreter, or server, loop. Changes made 705 in this "outer" environment are customarily retained 706 across documents, and may in some cases be retained 707 semipermanently in nonvolatile memory. The operators 708 associated with exiting the interpreter loop have the 709 potential to interfere with subsequent document 710 processing. As such, their unrestrained use 711 constitutes a threat of service denial. PostScript 712 operators that exit the interpreter loop include, but 713 may not be limited to, the exitserver and startjob 714 operators. Message sending software should not 715 generate PostScript that depends on exiting the 716 interpreter loop to operate, since the ability to exit 717 will probably be unavailable in secure PostScript 718 implementations. Message receiving and displaying 719 software should completely disable the ability to make 720 retained changes to the PostScript environment by 721 eliminating or disabling the "startjob" and 722 "exitserver" operations. If these operations cannot be 723 eliminated or completely disabled the password 724 associated with them should at least be set to a hard- 725 to-guess value. 727 (3) PostScript provides operators for setting system-wide 728 and device-specific parameters. 
These parameter 729 settings may be retained across jobs and may 730 potentially pose a threat to the correct operation of 731 the interpreter. The PostScript operators that set 732 system and device parameters include, but may not be 733 limited to, the "setsystemparams" and "setdevparams" 734 operators. Message sending software should not 735 generate PostScript that depends on the setting of 736 system or device parameters to operate correctly. The 737 ability to set these parameters will probably be 738 unavailable in secure PostScript implementations. 739 Message receiving and displaying software should 740 disable the ability to change system and device 741 parameters. If these operators cannot be completely 742 disabled the password associated with them should at 743 least be set to a hard-to-guess value. 745 (4) Some PostScript implementations provide nonstandard 746 facilities for the direct loading and execution of 747 machine code. Such facilities are quite obviously open 748 to substantial abuse. Message sending software should 749 not make use of such features. Besides being totally 750 hardware-specific, they are also likely to be 751 unavailable in secure implementations of PostScript. 752 Message receiving and displaying software should not 753 allow such operators to be used if they exist. 755 (5) PostScript is an extensible language, and many, if not 756 most, implementations of it provide a number of their 757 own extensions. This document does not deal with such 758 extensions explicitly since they constitute an unknown 759 factor. Message sending software should not make use 760 of nonstandard extensions; they are likely to be 761 missing from some implementations. Message receiving 762 and displaying software should make sure that any 763 nonstandard PostScript operators are secure and don't 764 present any kind of threat. 766 (6) It is possible to write PostScript that consumes huge 767 amounts of various system resources. 
It is also 768 possible to write PostScript programs that loop 769 indefinitely. Both types of programs have the 770 potential to cause damage if sent to unsuspecting 771 recipients. Message-sending software should avoid the 772 construction and dissemination of such programs, which 773 is antisocial. Message receiving and displaying 774 software should provide appropriate mechanisms to abort 775 processing of a document after a reasonable amount of 776 time has elapsed. In addition, PostScript interpreters 777 should be limited to the consumption of only a 778 reasonable amount of any given system resource. 780 (7) It is possible to include raw binary information inside 781 PostScript in various forms. This is not recommended 782 for use in Internet mail, both because it is not 783 supported by all PostScript interpreters and because it 784 significantly complicates the use of a MIME Content- 785 Transfer-Encoding. (Without such binary, PostScript 786 may typically be viewed as line-oriented data. The 787 treatment of CRLF sequences becomes extremely 788 problematic if binary and line-oriented data are mixed 789 in a single PostScript data stream.) 791 (8) Finally, bugs may exist in some PostScript interpreters 792 which could possibly be exploited to gain unauthorized 793 access to a recipient's system. Apart from noting this 794 possibility, there is no specific action to take to 795 prevent this, apart from the timely correction of such 796 bugs if any are found. 798 6.5.3. Other Application Subtypes 800 It is expected that many other subtypes of application will be 801 defined in the future. MIME implementations must at a minimum 802 treat any unrecognized subtypes as being equivalent to 803 "application/octet-stream". 805 7. Composite Media Type Values 807 The remaining two of the seven initial Content-Type values 808 refer to composite entities. 
Composite entities are handled 809 using MIME mechanisms -- a MIME processor typically handles 810 the body directly. 812 7.1. Multipart Media Type 814 In the case of multiple part entities, in which one or more 815 different sets of data are combined in a single body, a 816 "multipart" media type field must appear in the entity's 817 header. The body must then contain one or more "body parts," 818 each preceded by a boundary delimiter line, and the last one 819 followed by a closing boundary delimiter line. After its 820 boundary delimiter line, each body part then consists of a 821 header area, a blank line, and a body area. Thus a body part 822 is similar to an RFC 822 message in syntax, but different in 823 meaning. 825 A body part is NOT to be interpreted as actually being an RFC 826 822 message. To begin with, NO header fields are actually 827 required in body parts. A body part that starts with a blank 828 line, therefore, is allowed and is a body part for which all 829 default values are to be assumed. In such a case, the absence 830 of a Content-Type header usually indicates that the 831 corresponding body has a content-type of "text/plain; 832 charset=US-ASCII". 834 The only header fields that have defined meaning for body 835 parts are those the names of which begin with "Content-". All 836 other header fields are generally to be ignored in body parts. 837 Although they should generally be retained if at all possible, 838 they may be discarded by gateways if necessary. Such other 839 fields are permitted to appear in body parts but must not be 840 depended on. "X-" fields may be created for experimental or 841 private purposes, with the recognition that the information 842 they contain may be lost at some gateways. 844 NOTE: The distinction between an RFC 822 message and a body 845 part is subtle, but important. 
A gateway between Internet and 846 X.400 mail, for example, must be able to tell the difference 847 between a body part that contains an image and a body part 848 that contains an encapsulated message, the body of which is a 849 JPEG image. In order to represent the latter, the body part 850 must have "Content-Type: message/rfc822", and its body (after 851 the blank line) must be the encapsulated message, with its own 852 "Content-Type: image/jpeg" header field. The use of similar 853 syntax facilitates the conversion of messages to body parts, 854 and vice versa, but the distinction between the two must be 855 understood by implementors. (For the special case in which 856 all parts actually are messages, a "digest" subtype is also 857 defined.) 859 As stated previously, each body part is preceded by a boundary 860 delimiter line that contains the boundary delimiter. The 861 boundary delimiter MUST NOT appear inside any of the 862 encapsulated parts, on a line by itself or as the prefix of 863 any line. This implies that it is crucial that the composing 864 agent be able to choose and specify a unique boundary 865 parameter value that does not contain the boundary parameter 866 value of an enclosing multipart as a prefix. 868 All present and future subtypes of the "multipart" type must 869 use an identical syntax. Subtypes may differ in their 870 semantics, and may impose additional restrictions on syntax, 871 but must conform to the required syntax for the multipart 872 type. This requirement ensures that all conformant user 873 agents will at least be able to recognize and separate the 874 parts of any multipart entity, even those of an unrecognized 875 subtype. 877 As stated in the definition of the Content-Transfer-Encoding 878 field [MIME-IMB], no encoding other than "7bit", "8bit", or 879 "binary" is permitted for entities of type "multipart". 
The 880 multipart boundary delimiters and header fields are always 881 represented as 7-bit US-ASCII in any case (though the header 882 fields may encode non-US-ASCII header text as per RFC MIME- 883 HEADERS) and data within the body parts can be encoded on a 884 part-by-part basis, with Content-Transfer-Encoding fields for 885 each appropriate body part. 887 7.1.1. Common Syntax 889 This section defines a common syntax for subtypes of 890 multipart. All subtypes of multipart must use this syntax. A 891 simple example of a multipart message also appears in this 892 section. An example of a more complex multipart message is 893 given in RFC MIME-CONF. 895 The Content-Type field for multipart entities requires one 896 parameter, "boundary". The boundary delimiter line is then 897 defined as a line consisting entirely of two hyphen characters 898 ("-", decimal value 45) followed by the boundary parameter 899 value from the Content-Type header field, optional linear 900 whitespace, and a terminating CRLF. 902 NOTE: The hyphens are for rough compatibility with the 903 earlier RFC 934 method of message encapsulation, and for ease 904 of searching for the boundaries in some implementations. 905 However, it should be noted that multipart messages are NOT 906 completely compatible with RFC 934 encapsulations; in 907 particular, they do not obey RFC 934 quoting conventions for 908 embedded lines that begin with hyphens. This mechanism was 909 chosen over the RFC 934 mechanism because the latter causes 910 lines to grow with each level of quoting. The combination of 911 this growth with the fact that SMTP implementations sometimes 912 wrap long lines made the RFC 934 mechanism unsuitable for use 913 in the event that deeply-nested multipart structuring is ever 914 desired. 916 WARNING TO IMPLEMENTORS: The grammar for parameters on the 917 Content-type field is such that it is often necessary to 918 enclose the boundary parameter values in quotes on the 919 Content-type line. 
This is not always necessary, but never 920 hurts. Implementors should be sure to study the grammar 921 carefully in order to avoid producing invalid Content-type 922 fields. Thus, a typical multipart Content-Type header field 923 might look like this: 925 Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08j34c0p 927 But the following is not valid: 929 Content-Type: multipart/mixed; boundary=gc0pJq0M:08jU534c0p 931 (because of the colon) and must instead be represented as 933 Content-Type: multipart/mixed; boundary="gc0pJq0M:08jU534c0p" 935 This Content-Type value indicates that the content consists of 936 one or more parts, each with a structure that is syntactically 937 identical to an RFC 822 message, except that the header area 938 is allowed to be completely empty, and that the parts are each 939 preceded by the line 941 --gc0pJq0M:08jU534c0p 943 The boundary delimiter MUST occur at the beginning of a line, 944 i.e., following a CRLF, and the initial CRLF is considered to 945 be attached to the boundary delimiter line rather than part of 946 the preceding part. The boundary may be followed by zero or 947 more characters of linear whitespace. It is then terminated by 948 either another CRLF and the header fields for the next part, 949 or by two CRLFs, in which case there are no header fields for 950 the next part. If no Content-Type field is present it is 951 assumed to be of message/rfc822 in a multipart/digest and 952 text/plain otherwise. 954 NOTE: The CRLF preceding the boundary delimiter line is 955 conceptually attached to the boundary so that it is possible 956 to have a part that does not end with a CRLF (line break). 957 Body parts that must be considered to end with line breaks, 958 therefore, must have two CRLFs preceding the boundary 959 delimiter line, the first of which is part of the preceding 960 body part, and the second of which is part of the 961 encapsulation boundary. 
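The quoting rule and the delimiter line definition above can be illustrated with a short sketch. It is not part of the specification. The function names are hypothetical, and the set of characters that force quoting is the "tspecials" set from the Content-Type parameter grammar of the companion MIME document, which is assumed here rather than defined in this section. The delimiter check follows the definition literally (two hyphens, the boundary value, optional linear whitespace); a receiver may reasonably be more lenient.

```python
# Characters that cannot appear in an unquoted parameter value
# (assumed: the "tspecials" set of the Content-Type grammar).
TSPECIALS = set('()<>@,;:\\"/[]?=')


def format_boundary_param(boundary: str) -> str:
    """Return a 'boundary=...' parameter, quoting the value when needed."""
    needs_quotes = any(ch in TSPECIALS or ch.isspace() for ch in boundary)
    if needs_quotes:
        # Inside a quoted-string, backslash and double quote are escaped.
        escaped = boundary.replace("\\", "\\\\").replace('"', '\\"')
        return f'boundary="{escaped}"'
    return f"boundary={boundary}"


def is_delimiter_line(line: bytes, boundary: str) -> bool:
    """True if `line` (given without its CRLF) is "--" + boundary,
    optionally followed by linear whitespace and nothing else."""
    prefix = b"--" + boundary.encode("ascii")
    return line.startswith(prefix) and line[len(prefix):].strip(b" \t") == b""


print(format_boundary_param("gc0p4Jq0M2Yt08j34c0p"))
print(format_boundary_param("gc0pJq0M:08jU534c0p"))
```

Quoting unconditionally would also be valid, since quoting "never hurts"; the conditional form is shown only to make the rule visible.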
963 Boundary delimiters must not appear within the encapsulated 964 material, and must be no longer than 70 characters, not 965 counting the two leading hyphens. 967 The boundary delimiter line following the last body part is a 968 distinguished delimiter that indicates that no further body 969 parts will follow. Such a delimiter line is identical to the 970 previous delimiter lines, with the addition of two more 971 hyphens after the boundary parameter value. 973 --gc0pJq0M:08jU534c0p-- 975 NOTE TO IMPLEMENTORS: Boundary string comparisons must 976 compare the boundary value with the beginning of each 977 candidate line. An exact match of the entire candidate line 978 is not required; it is sufficient that the boundary appear in 979 its entirety following the CRLF. 981 There appears to be room for additional information prior to 982 the first boundary delimiter line and following the final 983 boundary delimiter line. These areas should generally be left 984 blank, and implementations must ignore anything that appears 985 before the first boundary delimiter line or after the last 986 one. 988 NOTE: These "preamble" and "epilogue" areas are generally not 989 used because of the lack of proper typing of these parts and 990 the lack of clear semantics for handling these areas at 991 gateways, particularly X.400 gateways. However, rather than 992 leaving the preamble area blank, many MIME implementations 993 have found this to be a convenient place to insert an 994 explanatory note for recipients who read the message with 995 pre-MIME software, since such notes will be ignored by MIME- 996 compliant software. 998 NOTE: Because boundary delimiters must not appear in the body 999 parts being encapsulated, a user agent must exercise care to 1000 choose a unique boundary parameter value. 
The boundary 1001 parameter value in the example above could have been the 1002 result of an algorithm designed to produce boundary delimiters 1003 with a very low probability of already existing in the data to 1004 be encapsulated without having to prescan the data. Alternate 1005 algorithms might result in more "readable" boundary delimiters 1006 for a recipient with an old user agent, but would require more 1007 attention to the possibility that the boundary delimiter might 1008 appear at the beginning of some line in the encapsulated part. 1009 The simplest boundary delimiter line possible is something 1010 like "---", with a closing boundary delimiter line of "-----". 1012 As a very simple example, the following multipart message has 1013 two parts, both of them plain text, one of them explicitly 1014 typed and one of them implicitly typed: 1016 From: Nathaniel Borenstein 1017 To: Ned Freed 1018 Date: Sun, 21 Mar 1993 23:56:48 -0800 (PST) 1019 Subject: Sample message 1020 MIME-Version: 1.0 1021 Content-type: multipart/mixed; boundary="simple boundary" 1023 This is the preamble. It is to be ignored, though it 1024 is a handy place for composition agents to include an 1025 explanatory note to non-MIME conformant readers. 1027 --simple boundary 1029 This is implicitly typed plain US-ASCII text. 1030 It does NOT end with a linebreak. 1031 --simple boundary 1032 Content-type: text/plain; charset=us-ascii 1034 This is explicitly typed plain US-ASCII text. 1035 It DOES end with a linebreak. 1037 --simple boundary-- 1039 This is the epilogue. It is also to be ignored. 1041 The use of a media type of multipart in a body part within 1042 another multipart entity is explicitly allowed. In such 1043 cases, for obvious reasons, care must be taken to ensure that 1044 each nested multipart entity uses a different boundary 1045 delimiter. See RFC MIME-CONF for an example of nested 1046 multipart entities. 
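As an illustration only, the sample message body can be separated into its parts with a few lines of code. The sketch below is a minimal reading of the rules in this section, not a complete parser: the function name is hypothetical, nested multiparts are not descended into, and a body that lacks its closing delimiter loses its final part. It treats the CRLF before each delimiter as part of the delimiter, discards the preamble and epilogue, and tolerates trailing whitespace on delimiter lines.

```python
import re
from typing import List


def split_multipart(body: bytes, boundary: str) -> List[bytes]:
    """Split a multipart body into raw body parts (header area,
    blank line, body area), dropping preamble and epilogue."""
    b = re.escape(boundary.encode("ascii"))
    # CRLF "--" boundary, an optional closing "--", optional padding,
    # and then the end of the line (or of the data).
    pattern = re.compile(rb"\r\n--" + b + rb"(--)?[ \t]*(?=\r\n|\Z)")
    data = b"\r\n" + body  # lets a delimiter on the first line match
    parts = []
    start = None
    for match in pattern.finditer(data):
        if start is not None:
            parts.append(data[start:match.start()])
        if match.group(1):        # closing delimiter: the rest is epilogue
            break
        start = match.end() + 2   # skip the CRLF ending the delimiter line
    return parts


body = (
    b"This is the preamble.  It is to be ignored.\r\n"
    b"\r\n"
    b"--simple boundary\r\n"
    b"\r\n"
    b"This is implicitly typed plain US-ASCII text.\r\n"
    b"It does NOT end with a linebreak.\r\n"
    b"--simple boundary\r\n"
    b"Content-type: text/plain; charset=us-ascii\r\n"
    b"\r\n"
    b"This is explicitly typed plain US-ASCII text.\r\n"
    b"It DOES end with a linebreak.\r\n"
    b"\r\n"
    b"--simple boundary--\r\n"
    b"\r\n"
    b"This is the epilogue.  It is also to be ignored.\r\n"
)
parts = split_multipart(body, "simple boundary")
for part in parts:
    print(repr(part))
```

The first part begins with an empty header area and does not end with a line break, while the second part keeps its final CRLF because two CRLFs precede the closing delimiter.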
1048 The use of the multipart media type with only a single body 1049 part may be useful in certain contexts, and is explicitly 1050 permitted. 1052 The only mandatory global parameter for the multipart media 1053 type is the boundary parameter, which consists of 1 to 70 1054 characters from a set of characters known to be very robust 1055 through mail gateways, and NOT ending with white space. (If a 1056 boundary delimiter line appears to end with white space, the 1057 white space must be presumed to have been added by a gateway, 1058 and must be deleted.) It is formally specified by the 1059 following BNF: 1061 boundary := 0*69<bchars> bcharsnospace 1063 bchars := bcharsnospace / " " 1065 bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / 1066 "+" / "_" / "," / "-" / "." / 1067 "/" / ":" / "=" / "?" 1069 Overall, the body of a multipart entity may be specified as 1070 follows: 1072 dash-boundary := "--" boundary 1073 ; boundary taken from the value of 1074 ; boundary parameter of the 1075 ; Content-Type field. 1077 multipart-body := [preamble CRLF] 1078 dash-boundary transport-padding CRLF 1079 body-part *encapsulation 1080 close-delimiter transport-padding 1081 [CRLF epilogue] 1083 transport-padding := *LWSP-char 1084 ; Composers MUST NOT generate 1085 ; non-zero length transport 1086 ; padding, but receivers MUST 1087 ; be able to handle padding 1088 ; added by message transports. 1090 encapsulation := delimiter transport-padding 1091 CRLF body-part 1093 delimiter := CRLF dash-boundary 1095 close-delimiter := delimiter "--" 1097 preamble := discard-text 1099 epilogue := discard-text 1101 discard-text := *(*text CRLF) *text 1102 ; To be ignored upon receipt. 1104 body-part := <"message" as defined in RFC 822, with all 1105 header fields optional, not starting with the 1106 specified dash-boundary, and with the 1107 delimiter not occurring anywhere in the 1108 body part. 
Note that the semantics of a 1109 part differ from the semantics of a message, 1110 as described in the text.> 1112 IMPORTANT NOTE: The free insertion of linear-white-space and 1113 RFC 822 comments between the elements shown in this BNF is NOT 1114 allowed since this BNF does not specify a structured header 1115 field. 1117 NOTE: In certain transport enclaves, RFC 822 restrictions 1118 such as the one that limits bodies to printable US-ASCII 1119 characters may not be in force. (That is, the transport 1120 domains may resemble standard Internet mail transport as 1121 specified in RFC 821 and assumed by RFC 822, but without 1122 certain restrictions.) The relaxation of these restrictions 1123 should be construed as locally extending the definition of 1124 bodies, for example to include octets outside of the US-ASCII 1125 range, as long as these extensions are supported by the 1126 transport and adequately documented in the Content-Transfer- 1127 Encoding header field. However, in no event are headers 1128 (either message headers or body-part headers) allowed to 1129 contain anything other than US-ASCII characters. 1131 NOTE: Conspicuously missing from the multipart type is a 1132 notion of structured, related body parts. In general, it 1133 seems premature to try to standardize interpart structure yet. 1134 It is recommended that those wishing to provide a more 1135 structured or integrated multipart messaging facility should 1136 define a subtype of multipart that is syntactically identical, 1137 but that always expects the inclusion of a distinguished part 1138 that can be used to specify the structure and integration of 1139 the other parts, probably referring to them by their Content- 1140 ID field. If this approach is used, other implementations 1141 will not recognize the new subtype, but will treat it as the 1142 primary subtype (multipart/mixed) and will thus be able to 1143 show the user the parts that are recognized. 1145 7.1.2. 
Handling Nested Messages and Multiparts 1147 The "message/rfc822" subtype defined in a subsequent section 1148 of this document has no terminating condition other than 1149 running out of data. Similarly, an improperly truncated 1150 multipart object may not have any terminating boundary marker, 1151 and can turn up operationally due to mail system malfunctions. 1153 It is essential that such objects be handled correctly when 1154 they are themselves embedded inside of another multipart 1155 structure. MIME implementations are therefore required to 1156 recognize outer level boundary markers at ANY level of inner 1157 nesting. It is not sufficient to only check for the next 1158 expected marker or other terminating condition. 1160 7.1.3. Mixed Subtype 1162 The "mixed" subtype of multipart is intended for use when the 1163 body parts are independent and need to be bundled in a 1164 particular order. Any multipart subtypes that an 1165 implementation does not recognize must be treated as being of 1166 subtype "mixed". 1168 7.1.4. Alternative Subtype 1170 The multipart/alternative type is syntactically identical to 1171 multipart/mixed, but the semantics are different. In 1172 particular, each of the parts is an "alternative" version of 1173 the same information. 1175 Systems should recognize that the contents of the various parts 1176 are interchangeable. Systems should choose the "best" type 1177 based on the local environment and preferences, in some cases 1178 even through user interaction. As with multipart/mixed, the 1179 order of body parts is significant. In this case, the 1180 alternatives appear in an order of increasing faithfulness to 1181 the original content. In general, the best choice is the LAST 1182 part of a type supported by the recipient system's local 1183 environment. 
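The selection rule just described, taking the LAST part whose type the recipient supports, can be sketched as follows. This is an illustration under simplifying assumptions rather than a prescribed algorithm: the function name is hypothetical, each part is represented only by its media type string, and the set of supported types is assumed to be given in lower case.

```python
from typing import Optional, Sequence, Set


def choose_alternative(part_types: Sequence[str],
                       supported: Set[str]) -> Optional[int]:
    """Return the index of the last alternative whose media type is
    supported, or None if no alternative can be displayed."""
    # Alternatives are ordered by increasing faithfulness, so scan
    # from the end and stop at the first supported type.
    for index in range(len(part_types) - 1, -1, -1):
        if part_types[index].lower() in supported:
            return index
    return None


types = ["text/plain", "text/enriched", "application/x-whatever"]
print(choose_alternative(types, {"text/plain", "text/enriched"}))
```

A user agent that prefers to ask the user could instead collect every supported index and offer the choice, provided it does not display several versions automatically.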
1185 Multipart/alternative may be used, for example, to send a 1186 message in a fancy text format in such a way that it can 1187 easily be displayed anywhere: 1189 From: Nathaniel Borenstein 1190 To: Ned Freed 1191 Date: Mon, 22 Mar 1993 09:41:09 -0800 (PST) 1192 Subject: Formatted text mail 1193 MIME-Version: 1.0 1194 Content-Type: multipart/alternative; boundary=boundary42 1196 --boundary42 1197 Content-Type: text/plain; charset=us-ascii 1199 ... plain text version of message goes here ... 1201 --boundary42 1202 Content-Type: text/enriched 1204 ... RFC 1563 text/enriched version of same message 1205 goes here ... 1207 --boundary42 1208 Content-Type: application/x-whatever 1210 ... fanciest version of same message goes here ... 1212 --boundary42-- 1214 In this example, users whose mail systems understood the 1215 "application/x-whatever" format would see only the fancy 1216 version, while other users would see only the enriched or 1217 plain text version, depending on the capabilities of their 1218 system. 1220 In general, user agents that compose multipart/alternative 1221 entities must place the body parts in increasing order of 1222 preference, that is, with the preferred format last. For 1223 fancy text, the sending user agent should put the plainest 1224 format first and the richest format last. Receiving user 1225 agents should pick and display the last format they are 1226 capable of displaying. In the case where one of the 1227 alternatives is itself of type "multipart" and contains 1228 unrecognized sub-parts, the user agent may choose either to 1229 show that alternative, an earlier alternative, or both. 1231 NOTE: From an implementor's perspective, it might seem more 1232 sensible to reverse this ordering, and have the plainest 1233 alternative last. However, placing the plainest alternative 1234 first is the friendliest possible option when 1235 multipart/alternative entities are viewed using a non-MIME- 1236 conformant viewer. 
While this approach does impose some 1237 burden on conformant MIME viewers, interoperability with older 1238 mail readers was deemed to be more important in this case. 1240 It may be the case that some user agents, if they can 1241 recognize more than one of the formats, will prefer to offer 1242 the user the choice of which format to view. This makes 1243 sense, for example, if a message includes both a nicely- 1244 formatted image version and an easily-edited text version. 1245 What is most critical, however, is that the user not 1246 automatically be shown multiple versions of the same data. 1247 Either the user should be shown the last recognized version or 1248 should be given the choice. 1250 NOTE ON THE SEMANTICS OF CONTENT-ID IN MULTIPART/ALTERNATIVE: 1251 Each part of a multipart/alternative entity represents the 1252 same data, but the mappings between the two are not 1253 necessarily without information loss. For example, 1254 information is lost when translating ODA to PostScript or 1255 plain text. It is recommended that each part should have a 1256 different Content-ID value in the case where the information 1257 content of the two parts is not identical. And when the 1258 information content is identical -- for example, where several 1259 parts of type "message/external-body" specify alternate ways 1260 to access the identical data -- the same Content-ID field 1261 value should be used, to optimize any caching mechanisms that 1262 might be present on the recipient's end. However, the 1263 Content-ID values used by the parts should NOT be the same 1264 Content-ID value that describes the multipart/alternative as a 1265 whole, if there is any such Content-ID field. That is, one 1266 Content-ID value will refer to the multipart/alternative 1267 entity, while one or more other Content-ID values will refer 1268 to the parts inside it. 1270 7.1.5. Digest Subtype 1272 This document defines a "digest" subtype of the multipart 1273 Content-Type. 
This type is syntactically identical to 1274 multipart/mixed, but the semantics are different. In 1275 particular, in a digest, the default Content-Type value for a 1276 body part is changed from "text/plain" to "message/rfc822". 1277 This is done to allow a more readable digest format that is 1278 largely compatible (except for the quoting convention) with 1279 RFC 934. 1281 A digest in this format might, then, look something like this: 1283 From: Moderator-Address 1284 To: Recipient-List 1285 Date: Mon, 22 Mar 1994 13:34:51 +0000 1286 Subject: Internet Digest, volume 42 1287 MIME-Version: 1.0 1288 Content-Type: multipart/digest; 1289 boundary="---- next message ----" 1291 ------ next message ---- 1293 From: someone-else 1294 Date: Fri, 26 Mar 1993 11:13:32 +0200 1295 Subject: my opinion 1297 ...body goes here ... 1299 ------ next message ---- 1301 From: someone-else-again 1302 Date: Fri, 26 Mar 1993 10:07:13 -0500 1303 Subject: my different opinion 1305 ... another body goes here ... 1307 ------ next message ------ 1309 7.1.6. Parallel Subtype 1311 This document defines a "parallel" subtype of the multipart 1312 Content-Type. This type is syntactically identical to 1313 multipart/mixed, but the semantics are different. In 1314 particular, in a parallel entity, the order of body parts is 1315 not significant. 1317 A common presentation of this type is to display all of the 1318 parts simultaneously on hardware and software that are capable 1319 of doing so. However, composing agents should be aware that 1320 many mail readers will lack this capability and will show the 1321 parts serially in any event. 1323 7.1.7. Other Multipart Subtypes 1325 Other multipart subtypes are expected in the future. MIME 1326 implementations must in general treat unrecognized subtypes of 1327 multipart as being equivalent to "multipart/mixed". 1329 7.2. Message Media Type 1331 It is frequently desirable, in sending mail, to encapsulate 1332 another mail message. 
A special media type, "message", is 1333 defined to facilitate this. In particular, the "rfc822" 1334 subtype of "message" is used to encapsulate RFC 822 messages. 1336 NOTE: It has been suggested that subtypes of message might be 1337 defined for forwarded or rejected messages. However, 1338 forwarded and rejected messages can be handled as multipart 1339 messages in which the first part contains any control or 1340 descriptive information, and a second part, of type 1341 message/rfc822, is the forwarded or rejected message. 1342 Composing rejection and forwarding messages in this manner 1343 will preserve the type information on the original message and 1344 allow it to be correctly presented to the recipient, and hence 1345 is strongly encouraged. 1347 Subtypes of message often impose restrictions on what 1348 encodings are allowed. These restrictions are described in 1349 conjunction with each specific subtype. 1351 Mail gateways, relays, and other mail handling agents are 1352 commonly known to alter the top-level header of an RFC 822 1353 message. In particular, they frequently add, remove, or 1354 reorder header fields. Such alterations are explicitly 1355 forbidden for the encapsulated headers embedded in the bodies 1356 of messages of type "message." 1358 7.2.1. RFC822 Subtype 1360 A media type of "message/rfc822" indicates that the body 1361 contains an encapsulated message, with the syntax of an RFC 1362 822 message. However, unlike top-level RFC 822 messages, the 1363 restriction that each message/rfc822 body must include a 1364 "From", "Date", and at least one destination header is removed 1365 and replaced with the requirement that at least one of "From", 1366 "Subject", or "Date" must be present. 1368 No encoding other than "7bit", "8bit", or "binary" is 1369 permitted for body parts of type "message/rfc822". 
The 1370 message header fields are always US-ASCII in any case, and 1371 data within the body can still be encoded, in which case the 1372 Content-Transfer-Encoding header field in the encapsulated 1373 message will reflect this. Non-US-ASCII text in the headers 1374 of an encapsulated message can be specified using the 1375 mechanisms described in RFC MIME-HEADERS. 1377 It should be noted that, despite the use of the numbers "822", 1378 a message/rfc822 entity can include enhanced information as 1379 defined in this document. In other words, a message/rfc822 1380 message may be a MIME message. 1382 7.2.2. Partial Subtype 1384 The "partial" subtype is defined to allow large entities to be 1385 delivered as several separate pieces of mail and automatically 1386 reassembled by a receiving user agent. (The concept is 1387 similar to IP fragmentation and reassembly in the basic 1388 Internet Protocols.) This mechanism can be used when 1389 intermediate transport agents limit the size of individual 1390 messages that can be sent. The media type "message/partial" 1391 thus indicates that the body contains a fragment of a larger 1392 entity. 1394 Three parameters must be specified in the Content-Type field 1395 of type message/partial: The first, "id", is a unique 1396 identifier, as close to a world-unique identifier as possible, 1397 to be used to match the fragments together. (In general, the 1398 identifier is essentially a message-id; if placed in double 1399 quotes, it can be ANY message-id, in accordance with the BNF 1400 for "parameter" given earlier in this specification.) The 1401 second, "number", an integer, is the fragment number, which 1402 indicates where this fragment fits into the sequence of 1403 fragments. The third, "total", another integer, is the total 1404 number of fragments. This third subfield is required on the 1405 final fragment, and is optional (though encouraged) on the 1406 earlier fragments. 
Note also that these parameters may be 1407 given in any order. 1409 Thus, the second piece of a 3-piece message may have either of 1410 the following header fields: 1412 Content-Type: Message/Partial; number=2; total=3; 1413 id="oc=jpbe0M2Yt4s@thumper.bellcore.com" 1415 Content-Type: Message/Partial; 1416 id="oc=jpbe0M2Yt4s@thumper.bellcore.com"; 1417 number=2 1419 But the third piece MUST specify the total number of 1420 fragments: 1422 Content-Type: Message/Partial; number=3; total=3; 1423 id="oc=jpbe0M2Yt4s@thumper.bellcore.com" 1425 Note that fragment numbering begins with 1, not 0. 1427 When the fragments of an entity broken up in this manner are 1428 put together, the result is a complete MIME entity, which may 1429 have its own Content-Type header field, and thus may contain 1430 any other data type. 1432 7.2.2.1. Message Fragmentation and Reassembly 1434 The semantics of a reassembled partial message must be those 1435 of the "inner" message, rather than of a message containing 1436 the inner message. This makes it possible, for example, to 1437 send a large audio message as several partial messages, and 1438 still have it appear to the recipient as a simple audio 1439 message rather than as an encapsulated message containing an 1440 audio message. That is, the encapsulation of the message is 1441 considered to be "transparent". 1443 When generating and reassembling the pieces of a 1444 message/partial message, the headers of the encapsulated 1445 message must be merged with the headers of the enclosing 1446 entities. In this process the following rules must be 1447 observed: 1449 (1) All of the header fields from the initial enclosing 1450 message, except those that start with "Content-" and 1451 the specific header fields "Subject", "Message-ID", 1452 "Encrypted", and "MIME-Version", must be copied, in 1453 order, to the new message. 
1455 (2) The header fields in the enclosed message which start 1456 with "Content-", plus the "Subject", "Message-ID", 1457 "Encrypted", and "MIME-Version" fields, must be 1458 appended, in order, to the header fields of the new 1459 message. Any header fields in the enclosed message 1460 which do not start with "Content-" (except for the 1461 "Subject", "Message-ID", "Encrypted", and "MIME- 1462 Version" fields) will be ignored and dropped. 1464 (3) All of the header fields from the second and any 1465 subsequent enclosing messages are discarded by the 1466 reassembly process. 1468 7.2.2.2. Fragmentation and Reassembly Example 1470 If an audio message is broken into two pieces, the first piece 1471 might look something like this: 1473 X-Weird-Header-1: Foo 1474 From: Bill@host.com 1475 To: joe@otherhost.com 1476 Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST) 1477 Subject: Audio mail (part 1 of 2) 1478 Message-ID: 1479 MIME-Version: 1.0 1480 Content-type: message/partial; id="ABC@host.com"; 1481 number=1; total=2 1483 X-Weird-Header-1: Bar 1484 X-Weird-Header-2: Hello 1485 Message-ID: 1486 Subject: Audio mail 1487 MIME-Version: 1.0 1488 Content-type: audio/basic 1489 Content-transfer-encoding: base64 1491 ... first half of encoded audio data goes here ... 1493 and the second half might look something like this: 1495 From: Bill@host.com 1496 To: joe@otherhost.com 1497 Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST) 1498 Subject: Audio mail (part 2 of 2) 1499 MIME-Version: 1.0 1500 Message-ID: 1501 Content-type: message/partial; 1502 id="ABC@host.com"; number=2; total=2 1504 ... second half of encoded audio data goes here ... 
1506 Then, when the fragmented message is reassembled, the 1507 resulting message to be displayed to the user should look 1508 something like this: 1510 X-Weird-Header-1: Foo 1511 From: Bill@host.com 1512 To: joe@otherhost.com 1513 Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST) 1514 Subject: Audio mail 1515 Message-ID: 1516 MIME-Version: 1.0 1517 Content-type: audio/basic 1518 Content-transfer-encoding: base64 1520 ... first half of encoded audio data goes here ... 1521 ... second half of encoded audio data goes here ... 1523 Because data of type "message" may never be encoded in base64 1524 or quoted-printable, a problem might arise if message/partial 1525 entities are constructed in an environment that supports 1526 binary or 8-bit transport. The problem is that the binary 1527 data would be split into multiple message/partial messages, 1528 each of them requiring binary transport. If such messages 1529 were encountered at a gateway into a 7-bit transport 1530 environment, there would be no way to properly encode them for 1531 the 7-bit world, aside from waiting for all of the fragments, 1532 reassembling the inner message, and then encoding the 1533 reassembled data in base64 or quoted-printable. Since it is 1534 possible that different fragments might go through different 1535 gateways, even this is not an acceptable solution. For this 1536 reason, it is specified that MIME entities of type 1537 message/partial must always have a content-transfer-encoding 1538 of 7-bit (the default). In particular, even in environments 1539 that support binary or 8-bit transport, the use of a content- 1540 transfer-encoding of "8bit" or "binary" is explicitly 1541 prohibited for entities of type message/partial. 
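As a minimal sketch, and not part of this specification, the header-merging rules of Section 7.2.2.1 and the ordering of fragments might be expressed in Python as follows. The names merge_headers and join_bodies, and the representation of a header as a list of (name, value) pairs, are assumptions made purely for illustration. Because rule (2) appends the enclosed fields after the retained enclosing fields, the resulting field order may differ slightly from the reassembled example shown above.

```python
from typing import List, Optional, Tuple

Header = Tuple[str, str]

# Fields that rule (2) takes from the enclosed (inner) message.
INNER_FIELDS = {"subject", "message-id", "encrypted", "mime-version"}


def taken_from_inner(name: str) -> bool:
    """True for "Content-*" fields and the four specifically named fields."""
    lowered = name.lower()
    return lowered.startswith("content-") or lowered in INNER_FIELDS


def merge_headers(outer: List[Header], inner: List[Header]) -> List[Header]:
    """Merge the first fragment's enclosing and enclosed header fields.

    Rule (3) is satisfied by never passing in the header fields of the
    second and later enclosing messages.
    """
    # Rule (1): copy enclosing fields, in order, except the inner-owned ones.
    merged = [(n, v) for n, v in outer if not taken_from_inner(n)]
    # Rule (2): append inner-owned fields, in order; drop all other inner fields.
    merged += [(n, v) for n, v in inner if taken_from_inner(n)]
    return merged


def join_bodies(fragments: List[Tuple[int, Optional[int], str]]) -> str:
    """Concatenate (number, total, body) fragments; numbering begins with 1.

    "total" may be None on any fragment except the final one.
    """
    ordered = sorted(fragments, key=lambda f: f[0])
    totals = {t for _, t, _ in ordered if t is not None}
    if len(totals) != 1:
        raise ValueError("total is unknown or inconsistent")
    total = totals.pop()
    if [n for n, _, _ in ordered] != list(range(1, total + 1)):
        raise ValueError("missing or duplicate fragments")
    return "".join(body for _, _, body in ordered)
```

A caller would first group fragments by their "id" parameter, then apply merge_headers to the first fragment only and join_bodies to the whole group.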
1543 Because some message transfer agents may choose to 1544 automatically fragment large messages, and because such agents 1545 may use very different fragmentation thresholds, it is 1546 possible that the pieces of a partial message, upon 1547 reassembly, may prove themselves to comprise a partial 1548 message. This is explicitly permitted. 1550 The inclusion of a "References" field in the headers of the 1551 second and subsequent pieces of a fragmented message that 1552 references the Message-Id on the previous piece may be of 1553 benefit to mail readers that understand and track references. 1554 However, the generation of such "References" fields is 1555 entirely optional. 1557 Finally, it should be noted that the "Encrypted" header field 1558 has been made obsolete by Privacy Enhanced Messaging (PEM) 1559 [RFC1421, RFC1422, RFC1423, and RFC1424], but the rules above 1560 are nevertheless believed to describe the correct way to treat 1561 it if it is encountered in the context of conversion to and 1562 from message/partial fragments. 1564 7.2.3. External-Body Subtype 1566 The external-body subtype indicates that the actual body data 1567 are not included, but merely referenced. In this case, the 1568 parameters describe a mechanism for accessing the external 1569 data. 1571 When an entity is of type "message/external-body", it consists 1572 of a header, two consecutive CRLFs, and the message header for 1573 the encapsulated message. If another pair of consecutive 1574 CRLFs appears, this of course ends the message header for the 1575 encapsulated message. However, since the encapsulated 1576 message's body is itself external, it does NOT appear in the 1577 area that follows. For example, consider the following 1578 message: 1580 Content-type: message/external-body; 1581 access-type=local-file; 1582 name="/u/nsb/Me.jpeg" 1584 Content-type: image/jpeg 1585 Content-ID: 1586 Content-Transfer-Encoding: binary 1588 THIS IS NOT REALLY THE BODY! 
1590 The area at the end, which might be called the "phantom body", 1591 is ignored for most external-body messages. However, it may 1592 be used to contain auxiliary information for some such 1593 messages, as indeed it is when the access-type is "mail- 1594 server". The only access-type defined in this document that 1595 uses the phantom body is "mail-server", but other access-types 1596 may be defined in the future in other documents that use this 1597 area. 1599 The encapsulated headers in ALL message/external-body entities 1600 MUST include a Content-ID header field to give a unique 1601 identifier by which to reference the data. This identifier 1602 may be used for caching mechanisms, and for recognizing the 1603 receipt of the data when the access-type is "mail-server". 1605 Note that, as specified here, the tokens that describe 1606 external-body data, such as file names and mail server 1607 commands, are required to be in the US-ASCII character set. 1608 If this proves problematic in practice, a new mechanism may be 1609 required as a future extension to MIME, either as newly 1610 defined access-types for message/external-body or by some 1611 other mechanism. 1613 As with message/partial, MIME entities of type 1614 message/external-body MUST have a content-transfer-encoding of 1615 7-bit (the default). In particular, even in environments that 1616 support binary or 8-bit transport, the use of a content- 1617 transfer-encoding of "8bit" or "binary" is explicitly 1618 prohibited for entities of type message/external-body. 1620 7.2.3.1. General External-Body Parameters 1622 The parameters that may be used with any message/external-body 1623 are: 1625 (1) ACCESS-TYPE -- A word indicating the supported access 1626 mechanism by which the file or data may be obtained. 1627 This word is not case sensitive. Values include, but 1628 are not limited to, "FTP", "ANON-FTP", "TFTP", "LOCAL- 1629 FILE", and "MAIL-SERVER". 
Future values, except for 1630 experimental values beginning with "X-", must be 1631 registered with IANA, as described in RFC MIME-REG. 1632 This parameter is unconditionally mandatory and MUST be 1633 present on EVERY message/external-body. 1635 (2) EXPIRATION -- The date (in the RFC 822 "date-time" 1636 syntax, as extended by RFC 1123 to permit 4 digits in 1637 the year field) after which the existence of the 1638 external data is not guaranteed. This parameter may be 1639 used with ANY access-type and is ALWAYS optional. 1641 (3) SIZE -- The size (in octets) of the data. The intent 1642 of this parameter is to help the recipient decide 1643 whether or not to expend the necessary resources to 1644 retrieve the external data. Note that this describes 1645 the size of the data in its canonical form, that is, 1646 before any Content-Transfer-Encoding has been applied 1647 or after the data have been decoded. This parameter 1648 may be used with ANY access-type and is ALWAYS 1649 optional. 1651 (4) PERMISSION -- A case-insensitive field that indicates 1652 whether or not it is expected that clients might also 1653 attempt to overwrite the data. By default, or if 1654 permission is "read", the assumption is that they are 1655 not, and that if the data is retrieved once, it is 1656 never needed again. If PERMISSION is "read-write", 1657 this assumption is invalid, and any local copy must be 1658 considered no more than a cache. "Read" and "Read- 1659 write" are the only defined values of permission. This 1660 parameter may be used with ANY access-type and is 1661 ALWAYS optional. 1663 The precise semantics of the access-types defined here are 1664 described in the sections that follow. 1666 7.2.3.2. The 'ftp' and 'tftp' Access-Types 1668 An access-type of FTP or TFTP indicates that the message body 1669 is accessible as a file using the FTP [RFC-959] or TFTP [RFC- 1670 783] protocols, respectively. 
For these access-types, the 1671 following additional parameters are mandatory: 1673 (1) NAME -- The name of the file that contains the actual 1674 body data. 1676 (2) SITE -- A machine from which the file may be obtained, 1677 using the given protocol. This must be a fully 1678 qualified domain name, not a nickname. 1680 (3) Before any data are retrieved using FTP, the user will 1681 generally need to be asked to provide a login id and a 1682 password for the machine named by the site parameter. 1683 For security reasons, such an id and password are not 1684 specified as content-type parameters, but must be 1685 obtained from the user. 1687 In addition, the following parameters are optional: 1689 (1) DIRECTORY -- A directory from which the data named by 1690 NAME should be retrieved. 1692 (2) MODE -- A case-insensitive string indicating the mode 1693 to be used when retrieving the information. The valid 1694 values for access-type "TFTP" are "NETASCII", "OCTET", 1695 and "MAIL", as specified by the TFTP protocol [RFC- 1696 783]. The valid values for access-type "FTP" are 1697 "ASCII", "EBCDIC", "IMAGE", and "LOCALn" where "n" is a 1698 decimal integer, typically 8. These correspond to the 1699 representation types "A" "E" "I" and "L n" as specified 1700 by the FTP protocol [RFC-959]. Note that "BINARY" and 1701 "TENEX" are not valid values for MODE and that "OCTET" 1702 or "IMAGE" or "LOCAL8" should be used instead. If MODE 1703 is not specified, the default value is "NETASCII" for 1704 TFTP and "ASCII" otherwise. 1706 7.2.3.3. The 'anon-ftp' Access-Type 1708 The "anon-ftp" access-type is identical to the "ftp" access 1709 type, except that the user need not be asked to provide a name 1710 and password for the specified site. Instead, the FTP 1711 protocol will be used with login "anonymous" and a password 1712 that corresponds to the user's mail address. 1714 7.2.3.4.
The 'local-file' Access-Type 1716 An access-type of "local-file" indicates that the actual body 1717 is accessible as a file on the local machine. Two additional 1718 parameters are defined for this access type: 1720 (1) NAME -- The name of the file that contains the actual 1721 body data. This parameter is mandatory for the 1722 "local-file" access-type. 1724 (2) SITE -- A domain specifier for a machine or set of 1725 machines that are known to have access to the data 1726 file. This optional parameter is used to describe the 1727 locality of reference for the data, that is, the site 1728 or sites at which the file is expected to be visible. 1729 Asterisks may be used for wildcard matching to a part 1730 of a domain name, such as "*.bellcore.com", to indicate 1731 a set of machines on which the data should be directly 1732 visible, while a single asterisk may be used to 1733 indicate a file that is expected to be universally 1734 available, e.g., via a global file system. 1736 7.2.3.5. The 'mail-server' Access-Type 1738 The "mail-server" access-type indicates that the actual body 1739 is available from a mail server. Two additional parameters 1740 are defined for this access-type: 1742 (1) SERVER -- The email address of the mail server from 1743 which the actual body data can be obtained. This 1744 parameter is mandatory for the "mail-server" access- 1745 type. 1747 (2) SUBJECT -- The subject that is to be used in the mail 1748 that is sent to obtain the data. Note that keying mail 1749 servers on Subject lines is NOT recommended, but such 1750 mail servers are known to exist. This is an optional 1751 parameter. 1753 Because mail servers accept a variety of syntaxes, some of 1754 which are multiline, the full command to be sent to a mail 1755 server is not included as a parameter in the content-type 1756 header field. Instead, it is provided as the "phantom body" 1757 when the media type is message/external-body and the access- 1758 type is mail-server.
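The following Python fragment is an illustrative sketch only, under stated assumptions, of how a user agent might turn a mail-server external-body entity into the request it sends to the mail server. It uses the standard "email" package; the function name build_mail_server_request and the wording of the Comments field are hypothetical. The Comments field follows the recommendation in Section 7.2.3.6 that automatically generated requests be clearly marked. The sketch assumes simple, unencoded parameter values and assumes the user has already approved the action.

```python
import email
from email.message import EmailMessage


def build_mail_server_request(entity_text: str) -> EmailMessage:
    """Build the request for a message/external-body, access-type=mail-server entity."""
    outer = email.message_from_string(entity_text)
    if outer.get_content_type() != "message/external-body":
        raise ValueError("not a message/external-body entity")

    # ACCESS-TYPE is mandatory and is not case sensitive.
    access_type = outer.get_param("access-type")
    if not isinstance(access_type, str) or access_type.lower() != "mail-server":
        raise ValueError("access-type is not mail-server")

    # SERVER is mandatory for this access-type; SUBJECT is optional.
    server = outer.get_param("server")
    if not isinstance(server, str):
        raise ValueError("missing server parameter")
    subject = outer.get_param("subject")

    # The parser treats the body of any message/* entity as a nested
    # message, so the phantom body is the payload of that nested message.
    inner = outer.get_payload(0)
    phantom_body = inner.get_payload()

    request = EmailMessage()
    request["To"] = server
    if isinstance(subject, str):
        request["Subject"] = subject
    # Hypothetical wording; the text only asks for a clear indication.
    request["Comments"] = (
        "Generated automatically to resolve a MIME "
        "message/external-body reference"
    )
    request.set_content(phantom_body)
    return request
```

The Content-ID of the encapsulated header (available here as inner["Content-ID"]) would be recorded by the caller so that the mail server's eventual reply can be matched to the original entity.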
1760 Note that MIME does not define a mail server syntax. Rather, 1761 it allows the inclusion of arbitrary mail server commands in 1762 the phantom body. An implementation must include the phantom 1763 body in the body of the message it sends to the mail server 1764 address to retrieve the relevant data. 1766 Unlike other access-types, mail-server access is asynchronous 1767 and will happen at an unpredictable time in the future. For 1768 this reason, it is important that there be a mechanism by 1769 which the returned data can be matched up with the original 1770 message/external-body entity. MIME mail servers must use the 1771 same Content-ID field on the returned message that was used in 1772 the original message/external-body entity, to facilitate such 1773 matching. 1775 7.2.3.6. External-Body Security Issues 1777 Message/external-body entities give rise to two important 1778 security issues: 1780 (1) Accessing data via a message/external-body reference 1781 effectively results in the message recipient performing 1782 an operation that was specified by the message 1783 originator. It is therefore possible for the message 1784 originator to trick a recipient into doing something 1785 they would not have done otherwise. For example, an 1786 originator could specify an action that attempts 1787 retrieval of material that the recipient is not 1788 authorized to obtain, causing the recipient to 1789 unwittingly violate some security policy. For this 1790 reason, user agents capable of resolving external 1791 references must always take steps to describe the 1792 action they are to take to the recipient and ask for 1793 explicit permission prior to performing it. 1795 The 'mail-server' access-type is particularly 1796 vulnerable, in that it causes the recipient to send a 1797 new message whose contents are specified by the 1798 original message's originator.
Given the potential for 1799 abuse, any such request messages that are constructed 1800 should contain a clear indication that they were 1801 generated automatically (e.g., in a Comments: header 1802 field) in an attempt to resolve a MIME 1803 message/external-body reference. 1805 (2) MIME will sometimes be used in environments that 1806 provide some guarantee of message integrity and 1807 authenticity. If present, such guarantees may apply 1808 only to the actual direct content of messages -- they 1809 may or may not apply to data accessed through MIME's 1810 message/external-body mechanism. In particular, it may 1811 be possible to subvert certain access mechanisms even 1812 when the messaging system itself is secure. 1814 It should be noted that this problem exists either with 1815 or without the availability of MIME mechanisms. A 1816 casual reference to an FTP site containing a document 1817 in the text of a secure message brings up similar 1818 issues -- the only difference is that MIME provides for 1819 automatic retrieval of such material, and users may 1820 place unwarranted trust in such automatic retrieval 1821 mechanisms. 1823 7.2.3.7. Examples and Further Explanations 1825 When the external-body mechanism is used in conjunction with 1826 the multipart/alternative media type, it extends the 1827 functionality of multipart/alternative to include the case 1828 where the same object is provided in the same format but via 1829 different access mechanisms. When this is done, the originator 1830 of the message must order the parts first in terms of preferred 1831 formats and then by preferred access mechanisms. The 1832 recipient's viewer should then evaluate the list both in terms 1833 of format and access mechanisms. 1835 With the emerging possibility of very wide-area file systems, 1836 it becomes very hard to know in advance the set of machines 1837 where a file will and will not be accessible directly from the 1838 file system.
Therefore it may make sense to provide both a 1839 file name, to be tried directly, and the name of one or more 1840 sites from which the file is known to be accessible. An 1841 implementation can try to retrieve remote files using FTP or 1842 any other protocol, using anonymous file retrieval or 1843 prompting the user for the necessary name and password. If an 1844 external body is accessible via multiple mechanisms, the 1845 sender may include multiple parts of type message/external- 1846 body within an entity of type multipart/alternative. 1848 However, the external-body mechanism is not intended to be 1849 limited to file retrieval, as shown by the mail-server 1850 access-type. Beyond this, one can imagine, for example, using 1851 a video server for external references to video clips. 1853 The embedded message header fields which appear in the body of 1854 the message/external-body data must be used to declare the 1855 media type of the external body if it is anything other than 1856 plain US-ASCII text, since the external body does not have a 1857 header section to declare its type. Similarly, any Content- 1858 transfer-encoding other than "7bit" must also be declared 1859 here. 
Thus a complete message/external-body message, 1860 referring to a document in PostScript format, might look like 1861 this: 1863 From: Whomever 1864 To: Someone 1865 Date: Whenever 1866 Subject: whatever 1867 MIME-Version: 1.0 1868 Message-ID: 1869 Content-Type: multipart/alternative; boundary=42 1870 Content-ID: 1872 --42 1873 Content-Type: message/external-body; name="BodyFormats.ps"; 1874 site="thumper.bellcore.com"; mode="image"; 1875 access-type=ANON-FTP; directory="pub"; 1876 expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" 1878 Content-type: application/postscript 1879 Content-ID: 1881 --42 1882 Content-Type: message/external-body; access-type=local-file; 1883 name="/u/nsb/writing/rfcs/RFC-MIME.ps"; 1884 site="thumper.bellcore.com"; 1885 expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" 1887 Content-type: application/postscript 1888 Content-ID: 1890 --42 1891 Content-Type: message/external-body; 1892 access-type=mail-server; 1893 server="listserv@bogus.bitnet"; 1894 expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)" 1896 Content-type: application/postscript 1897 Content-ID: 1899 get RFC-MIME.DOC 1901 --42-- 1903 Note that in the above examples, the default Content- 1904 transfer-encoding of "7bit" is assumed for the external 1905 PostScript data. 1907 Like the message/partial type, the message/external-body media 1908 type is intended to be transparent, that is, to convey the 1909 data type in the external body rather than to convey a message 1910 with a body of that type. Thus the headers on the outer and 1911 inner parts must be merged using the same rules as for 1912 message/partial. In particular, this means that the Content- 1913 type header is overridden, but the From and Subject headers 1914 are preserved. 1916 Note that since the external bodies are not transported along 1917 with the external body reference, they need not conform to 1918 transport limitations that apply to the reference itself.
In 1919 particular, Internet mail transports may impose 7-bit and line 1920 length limits, but these do not automatically apply to binary 1921 external body references. Thus a Content-Transfer-Encoding is 1922 not generally necessary, though it is permitted. 1924 Note that the body of a message of type "message/external- 1925 body" is governed by the basic syntax for an RFC 822 message. 1926 In particular, anything before the first consecutive pair of 1927 CRLFs is header information, while anything after it is body 1928 information, which is ignored for most access-types. 1930 7.2.4. Other Message Subtypes 1932 MIME implementations must in general treat unrecognized 1933 subtypes of message as being equivalent to 1934 "application/octet-stream". 1936 8. Experimental Media Type Values 1938 A media type value beginning with the characters "X-" is a 1939 private value, to be used by consenting systems by mutual 1940 agreement. Any format without a rigorous and public 1941 definition must be named with an "X-" prefix, and publicly 1942 specified values shall never begin with "X-". (Older versions 1943 of the widely used Andrew system use the "X-BE2" name, so new 1944 systems should probably choose a different name.) 1946 In general, the use of "X-" top-level types is strongly 1947 discouraged. Implementors should invent subtypes of the 1948 existing types whenever possible. In many cases, a subtype of 1949 application will be more appropriate than a new top-level 1950 type. 1952 9. Summary 1954 The five discrete media types provide a standardized 1955 mechanism for tagging messages or body parts as audio, image, 1956 or several other kinds of data. The composite "multipart" and 1957 "message" media types allow mixing and hierarchical 1958 structuring of objects of different types in a single message. 1959 A distinguished parameter syntax allows further specification 1960 of data format details, particularly the specification of 1961 alternate character sets.
Additional optional header fields 1962 provide mechanisms for certain extensions deemed desirable by 1963 many implementors. Finally, a number of useful media types are 1964 defined for general use by consenting user agents, notably 1965 message/partial, and message/external-body. 1967 10. Security Considerations 1969 Security issues are discussed in the context of the 1970 application/postscript type, the message/external-body type, 1971 and in RFC MIME-REG. Implementors should pay special 1972 attention to the security implications of any media types that 1973 can cause the remote execution of any actions in the 1974 recipient's environment. In such cases, the discussion of the 1975 application/postscript type may serve as a model for 1976 considering other media types with remote execution 1977 capabilities. 1979 11. Authors' Addresses 1981 For more information, the authors of this document are best 1982 contacted via Internet mail: 1984 Nathaniel S. Borenstein 1985 First Virtual Holdings 1986 25 Washington Avenue 1987 Morristown, NJ 07960 1988 USA 1990 Email: nsb@nsb.fv.com 1991 Phone: +1 201 540 8967 1992 Fax: +1 201 993 3032 1994 Ned Freed 1995 Innosoft International, Inc. 1996 1050 East Garvey Avenue South 1997 West Covina, CA 91790 1998 USA 2000 Email: ned@innosoft.com 2001 Phone: +1 818 919 3600 2002 Fax: +1 818 919 3614 2004 MIME is a result of the work of the Internet Engineering Task 2005 Force Working Group on Email Extensions. The chairman of that 2006 group, Greg Vaudreuil, may be reached at: 2008 Gregory M. Vaudreuil 2009 Tigon Corporation 2010 17060 Dallas Parkway 2011 Dallas Texas, 75248 2013 Email: greg.vaudreuil@ons.octel.com 2014 Phone: +1 214 733 2722 2015 Appendix A -- Collected Grammar 2017 This appendix contains the complete BNF grammar for all the 2018 syntax specified by this document. 2020 By itself, however, this grammar is incomplete. It refers to 2021 several entities that are defined by RFC 822. 
Rather than 2022 reproduce those definitions here, and risk unintentional 2023 differences between the two, this document simply refers the 2024 reader to RFC 822 for the remaining definitions. Wherever a 2025 term is undefined, it refers to the RFC 822 definition. 2027 boundary := 0*69<bchars> bcharsnospace 2029 bchars := bcharsnospace / " " 2031 bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / 2032 "+" / "_" / "," / "-" / "." / 2033 "/" / ":" / "=" / "?" 2035 body-part := <"message" as defined in RFC 822, with all 2036 header fields optional, not starting with the 2037 specified dash-boundary, and with the 2038 delimiter not occurring anywhere in the 2039 body part. Note that the semantics of a 2040 part differ from the semantics of a message, 2041 as described in the text.> 2043 close-delimiter := delimiter "--" 2045 dash-boundary := "--" boundary 2046 ; boundary taken from the value of 2047 ; boundary parameter of the 2048 ; Content-Type field. 2050 delimiter := CRLF dash-boundary 2052 discard-text := *(*text CRLF) 2053 ; To be ignored upon receipt. 2055 encapsulation := delimiter transport-padding 2056 CRLF body-part 2058 epilogue := discard-text 2060 multipart-body := [preamble CRLF] 2061 dash-boundary transport-padding CRLF 2062 body-part *encapsulation 2063 close-delimiter transport-padding 2064 [CRLF epilogue] 2066 preamble := discard-text 2068 transport-padding := *LWSP-char 2069 ; Composers MUST NOT generate 2070 ; non-zero length transport 2071 ; padding, but receivers MUST 2072 ; be able to handle padding 2073 ; added by message transports.
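As an informal illustration of the "boundary" production above, the following Python sketch checks a candidate boundary parameter value. It assumes that a boundary consists of 0 to 69 characters drawn from "bchars" (which permits the space character) followed by one final character drawn from "bcharsnospace", so that a boundary is 1 to 70 characters long and never ends with a space. This reading is consistent with the digest example, whose boundary contains embedded spaces. The names BOUNDARY_RE and is_valid_boundary are hypothetical.

```python
import re

# bcharsnospace: DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_" / "," /
#                "-" / "." / "/" / ":" / "=" / "?"
_BCHARSNOSPACE = r"0-9A-Za-z'()+_,\-./:=?"

# Up to 69 bchars (bcharsnospace or space), then one bcharsnospace.
BOUNDARY_RE = re.compile(rf"[{_BCHARSNOSPACE} ]{{0,69}}[{_BCHARSNOSPACE}]")


def is_valid_boundary(value: str) -> bool:
    """Return True if value matches the assumed "boundary" production."""
    return BOUNDARY_RE.fullmatch(value) is not None
```

A composing agent could apply such a check to a boundary value before emitting it. A receiving agent would more likely be lenient about what it accepts.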