idnits 2.17.1 draft-ietf-822ext-mime-conf-05.txt: ** The Abstract section seems to be numbered Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-25) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? 
** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. ** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 238: '...orming user agents MUST include proper...' RFC 2119 keyword, line 794: '...multipart object MUST NOT contain any ...' Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 98 has weird spacing: '...ortions of MI...' == Line 141 has weird spacing: '...tations must ...' -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (March 1996) is 10268 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Missing reference section? 
'RFC821' on line 130 looks like a reference -- Missing reference section? 'RFC1421' on line 431 looks like a reference -- Missing reference section? 'ATK' on line 903 looks like a reference -- Missing reference section? 'ISO-2022' on line 907 looks like a reference -- Missing reference section? 'ISO-8859' on line 912 looks like a reference -- Missing reference section? 'ISO-646' on line 924 looks like a reference -- Missing reference section? 'JPEG' on line 929 looks like a reference -- Missing reference section? 'MPEG' on line 932 looks like a reference -- Missing reference section? 'PCM' on line 937 looks like a reference -- Missing reference section? 'POSTSCRIPT' on line 941 looks like a reference -- Missing reference section? 'POSTSCRIPT2' on line 945 looks like a reference -- Missing reference section? 'RFC-783' on line 949 looks like a reference -- Missing reference section? 'RFC-821' on line 953 looks like a reference -- Missing reference section? 'RFC-822' on line 957 looks like a reference -- Missing reference section? 'RFC-934' on line 961 looks like a reference -- Missing reference section? 'RFC-959' on line 965 looks like a reference -- Missing reference section? 'RFC-1049' on line 970 looks like a reference -- Missing reference section? 'RFC-1154' on line 974 looks like a reference -- Missing reference section? 'RFC-1341' on line 979 looks like a reference -- Missing reference section? 'RFC-1342' on line 985 looks like a reference -- Missing reference section? 'RFC-1344' on line 990 looks like a reference -- Missing reference section? 'RFC-1345' on line 994 looks like a reference -- Missing reference section? 'RFC-1421' on line 998 looks like a reference -- Missing reference section? 'RFC-1422' on line 1004 looks like a reference -- Missing reference section? 'RFC-1423' on line 1009 looks like a reference -- Missing reference section? 'RFC-1424' on line 1014 looks like a reference -- Missing reference section? 
'RFC-1521' on line 1019 looks like a reference -- Missing reference section? 'RFC-1522' on line 1025 looks like a reference -- Missing reference section? 'RFC-1524' on line 1030 looks like a reference -- Missing reference section? 'RFC-1543' on line 1035 looks like a reference -- Missing reference section? 'RFC-1563' on line 1039 looks like a reference -- Missing reference section? 'RFC-1590' on line 1043 looks like a reference -- Missing reference section? 'RFC-1602' on line 1047 looks like a reference -- Missing reference section? 'RFC-1652' on line 1052 looks like a reference -- Missing reference section? 'RFC-1700' on line 1060 looks like a reference -- Missing reference section? 'RFC-1741' on line 1065 looks like a reference -- Missing reference section? 'RFC-MIME-IMB' on line 1069 looks like a reference -- Missing reference section? 'RFC-MIME-IMT' on line 1074 looks like a reference -- Missing reference section? 'RFC-MIME-HEADERS' on line 1079 looks like a reference -- Missing reference section? 'RFC-MIME-REG' on line 1085 looks like a reference -- Missing reference section? 'RFC-MIME-CONF' on line 1090 looks like a reference -- Missing reference section? 'US-ASCII' on line 1095 looks like a reference -- Missing reference section? 'X400' on line 1099 looks like a reference Summary: 9 errors (**), 0 flaws (~~), 3 warnings (==), 45 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group Nathaniel Borenstein 3 Internet Draft Ned Freed 4 6 Multipurpose Internet Mail Extensions 7 (MIME) Part Five: 9 Conformance Criteria and Examples 11 March 1996 13 Status of this Memo 15 This document is an Internet-Draft. Internet-Drafts are 16 working documents of the Internet Engineering Task Force 17 (IETF), its areas, and its working groups. 
Note that other 18 groups may also distribute working documents as Internet- 19 Drafts. 21 Internet-Drafts are draft documents valid for a maximum of six 22 months. Internet-Drafts may be updated, replaced, or obsoleted 23 by other documents at any time. It is not appropriate to use 24 Internet-Drafts as reference material or to cite them other 25 than as a "working draft" or "work in progress". 27 To learn the current status of any Internet-Draft, please 28 check the 1id-abstracts.txt listing contained in the 29 Internet-Drafts Shadow Directories on ds.internic.net (US East 30 Coast), nic.nordu.net (Europe), ftp.isi.edu (US West Coast), 31 or munnari.oz.au (Pacific Rim). 33 1. Abstract 35 STD 11, RFC 822, defines a message representation protocol 36 specifying considerable detail about US-ASCII message headers, 37 and leaves the message content, or message body, as flat US- 38 ASCII text. This set of documents, collectively called the 39 Multipurpose Internet Mail Extensions, or MIME, redefines the 40 format of messages to allow for 41 (1) textual message bodies in character sets other than 42 US-ASCII, 44 (2) an extensible set of different formats for non-textual 45 message bodies, 47 (3) multi-part message bodies, and 49 (4) textual header information in character sets other than 50 US-ASCII. 52 These documents are based on earlier work documented in RFC 53 934, STD 11, and RFC 1049, but extend and revise them. 54 Because RFC 822 said so little about message bodies, these 55 documents are largely orthogonal to (rather than a revision 56 of) RFC 822. 58 The initial document in this set, RFC MIME-IMB, specifies the 59 various headers used to describe the structure of MIME 60 messages. The second document defines the general structure of 61 the MIME media typing system and defines an initial set of 62 media types. The third document, RFC MIME-HEADERS, describes 63 extensions to RFC 822 to allow non-US-ASCII text data in 64 Internet mail header fields. 
The fourth document, RFC MIME- 65 REG, specifies various IANA registration procedures for MIME- 66 related facilities. This fifth and final document describes 67 MIME conformance criteria and provides some 68 illustrative examples of MIME message formats, 69 acknowledgements, and the bibliography. 71 These documents are revisions of RFCs 1521, 1522, and 1590, 72 which themselves were revisions of RFCs 1341 and 1342. 73 Appendix B of this document describes differences and changes 74 from previous versions. 76 2. Table of Contents 78 1 Abstract 79 2 Table of Contents 80 3 Introduction 81 4 MIME Conformance 82 5 Guidelines for Sending Email Data 83 6 Canonical Encoding Model 84 7 Summary 85 8 Security Considerations 86 9 Authors' Addresses 87 10 Acknowledgements 88 A A Complex Multipart Example 89 B Changes from RFC 1521, 1522, and 1590 90 C References 92 3. Introduction 94 The first and second documents in this set define MIME header 95 fields and the initial set of MIME media types. The third 96 document describes extensions to RFC822 formats to allow for 97 character sets other than US-ASCII. This document describes 98 what portions of MIME must be supported by a conformant MIME 99 implementation. It also describes various pitfalls of 100 contemporary messaging systems as well as the canonical 101 encoding model MIME is based on. 103 4. MIME Conformance 105 The mechanisms described in these documents are open-ended. 
106 It is definitely not expected that all implementations will 107 support all available media types, nor that they will all 108 share the same extensions. In order to promote 109 interoperability, however, it is useful to define the concept 110 of "MIME-conformance" to identify a certain level of 111 implementation that allows the useful interworking of messages 112 with content that differs from US-ASCII text. In this 113 section, we specify the requirements for such conformance. 115 A mail user agent that is MIME-conformant MUST: 117 (1) Always generate a "MIME-Version: 1.0" header field in 118 any message it creates. 120 (2) Recognize the Content-Transfer-Encoding header field 121 and decode all received data encoded by either quoted- 122 printable or base64 implementations. The identity 123 transformations 7bit, 8bit, and binary must also be 124 recognized. 126 Any non-7bit data that is sent without encoding must be 127 properly labelled with a content-transfer-encoding of 128 8bit or binary, as appropriate. If the underlying 129 transport does not support 8bit or binary (as SMTP 130 [RFC821] does not), the sender is required to both 131 encode and label data using an appropriate Content- 132 Transfer-Encoding such as quoted-printable or base64. 134 (3) Treat any unrecognized Content-Transfer-Encoding 135 as if it had a Content-Type of "application/octet- 136 stream", regardless of whether or not the actual 137 Content-Type is recognized. 139 (4) Recognize and interpret the Content-Type header field, 140 and avoid showing users raw data with a Content-Type 141 field other than text. Implementations must be able 142 to send at least text/plain messages, with the 143 character set specified with the charset parameter if 144 it is not US-ASCII. 146 (5) Ignore any content type parameters whose names it does 147 not recognize. 
149 (6) Explicitly handle the following media type values, to 150 at least the following extents: 152 Text: 154 -- Recognize and display "text" mail with the 155 character set "US-ASCII". 157 -- Recognize other character sets at least to the 158 extent of being able to inform the user about what 159 character set the message uses. 161 -- Recognize the "ISO-8859-*" character sets to the 162 extent of being able to display those characters that 163 are common to ISO-8859-* and US-ASCII, namely all 164 characters represented by octet values 1-127. 166 -- For unrecognized subtypes in a known character 167 set, show or offer to show the user the "raw" version 168 of the data after conversion of the content from 169 canonical form to local form. 171 -- Treat material in an unknown character set as if 172 it were "application/octet-stream". 174 Image, audio, and video: 176 -- At a minimum provide facilities to treat any 177 unrecognized subtypes as if they were 178 "application/octet-stream". 180 Application: 182 -- Offer the ability to remove either of the quoted- 183 printable or base64 encodings defined in RFC MIME-IMB 184 if they were used and put the resulting 185 information in a user file. 187 Multipart: 189 -- Recognize the "mixed" subtype. Display all relevant 190 information on the message level and the body part 191 header level and then display or offer to display 192 each of the body parts individually. 194 -- Recognize the "alternative" subtype, and avoid 195 showing the user redundant parts of 196 multipart/alternative mail. 198 -- Recognize the "multipart/digest" subtype, 199 specifically using "message/rfc822" rather than 200 "text/plain" as the default media type for body parts 201 inside "multipart/digest" entities. 203 -- Treat any unrecognized subtypes as if they were 204 "mixed". 
206 Message: 208 -- Recognize and display at least the RFC822 message 209 encapsulation (message/rfc822) in such a way as to 210 preserve any recursive structure, that is, displaying 211 or offering to display the encapsulated data in 212 accordance with its media type. 214 -- Treat any unrecognized subtypes as if they were 215 "application/octet-stream". 217 (7) Upon encountering any unrecognized Content-Type field, 218 an implementation must treat it as if it had a media 219 type of "application/octet-stream" with no parameter 220 sub-arguments. How such data are handled is up to an 221 implementation, but likely options for handling such 222 unrecognized data include offering to write it 223 into a file for the user (decoded from its mail transport format) or 224 offering to let the user name a program to which the 225 decoded data should be passed as input. 227 (8) Conformant user agents are required, if they provide 228 non-standard support for non-MIME messages employing 229 character sets other than US-ASCII, to do so on 230 received messages only. Conforming user agents must not 231 send non-MIME messages containing anything other than 232 US-ASCII text. 234 In particular, the use of non-US-ASCII text in mail 235 messages without a MIME-Version field is strongly 236 discouraged as it impedes interoperability when sending 237 messages between regions with different localization 238 conventions. Conforming user agents MUST include proper 239 MIME labelling when sending anything other than plain 240 text in the US-ASCII character set. 242 In addition, non-MIME user agents should be upgraded if 243 at all possible to include appropriate MIME header 244 information in the messages they send even if nothing 245 else in MIME is supported. This upgrade will have 246 little, if any, effect on non-MIME recipients and will 247 aid MIME user agents in correctly displaying such messages. It 248 also provides a smooth transition path to eventual 249 adoption of other MIME capabilities. 
251 (9) Conforming user agents must ensure that any string of 252 non-white-space printable US-ASCII characters within a 253 "*text" or "*ctext" that begins with "=?" and ends with 254 "?=" be a valid encoded-word. ("begins" means: At the 255 start of the field-body or immediately following 256 linear-white-space; "ends" means: At the end of the 257 field-body or immediately preceding linear-white- 258 space.) In addition, any "word" within a "phrase" that 259 begins with "=?" and ends with "?=" must be a valid 260 encoded-word. 262 (10) Conforming user agents must be able to distinguish 263 encoded-words from "text", "ctext", or "word"s, 264 according to the rules in section 6, anytime they 265 appear in appropriate places in message headers. They 266 must support both the "B" and "Q" encodings for any 267 character set which they support. They must be 268 able to display the unencoded text if the character set 269 is "US-ASCII". For the ISO-8859-* character sets, they 270 must at least be able to display 271 the characters which are also in the US-ASCII set. 273 A user agent that meets the above conditions is said to be 274 MIME-conformant. The meaning of this phrase is that it is 275 assumed to be "safe" to send virtually any kind of properly- 276 marked data to users of such mail systems, because such 277 systems will at least be able to treat the data as 278 undifferentiated binary, and will not simply splash it onto 279 the screen of unsuspecting users. 281 There is another sense in which it is always "safe" to send 282 data in a format that is MIME-conformant, which is that such 283 data will not break or be broken by any known systems that are 284 conformant with RFC 821 and RFC 822. User agents that are 285 MIME-conformant have the additional guarantee that the user 286 will not be shown data that were never intended to be viewed 287 as text. 289 5. 
Guidelines for Sending Email Data 291 Internet email is not a perfect, homogeneous system. Mail may 292 become corrupted at several stages in its travel to a final 293 destination. Specifically, email sent throughout the Internet 294 may travel across many networking technologies. Many 295 networking and mail technologies do not support the full 296 functionality possible in the SMTP transport environment. 297 Mail traversing these systems is likely to be modified in 298 order that it can be transported. 300 There exist many widely-deployed non-conformant MTAs in the 301 Internet. These MTAs, speaking the SMTP protocol, alter 302 messages on the fly to take advantage of the internal data 303 structure of the hosts they are implemented on, or are just 304 plain broken. 306 The following guidelines may be useful to anyone devising a 307 data format (media type) that is supposed to survive the 308 widest range of networking technologies and known broken MTAs 309 unscathed. Note that anything encoded in the base64 encoding 310 will satisfy these rules, but that some well-known mechanisms, 311 notably the UNIX uuencode facility, will not. Note also that 312 anything encoded in the Quoted-Printable encoding will survive 313 most gateways intact, but possibly not some gateways to 314 systems that use the EBCDIC character set. 316 (1) Under some circumstances the encoding used for data may 317 change as part of normal gateway or user agent 318 operation. In particular, conversion from base64 to 319 quoted-printable and vice versa may be necessary. This 320 may result in the confusion of CRLF sequences with line 321 breaks in text bodies. As such, the persistence of 322 CRLF as something other than a line break must not be 323 relied on. 325 (2) Many systems may elect to represent and store text data 326 using local newline conventions. 
Local newline 327 conventions may not match the RFC822 CRLF convention -- 328 systems are known that use plain CR, plain LF, CRLF, or 329 counted records. The result is that isolated CR and LF 330 characters are not well tolerated in general; they may 331 be lost or converted to delimiters on some systems, and 332 hence must not be relied on. 334 (3) The transmission of NULs (US-ASCII value 0) is 335 problematic in Internet mail. (This is largely the 336 result of NULs being used as a termination character by 337 many of the standard runtime library routines in the C 338 programming language.) The practice of using NULs as 339 termination characters is so entrenched now that 340 messages should not rely on them being preserved. 342 (4) TAB (HT) characters may be misinterpreted or may be 343 automatically converted to variable numbers of spaces. 344 This is unavoidable in some environments, notably those 345 not based on the US-ASCII character set. Such 346 conversion is STRONGLY DISCOURAGED, but it may occur, 347 and mail formats must not rely on the persistence of 348 TAB (HT) characters. 350 (5) Lines longer than 76 characters may be wrapped or 351 truncated in some environments. Line wrapping and line 352 truncation are STRONGLY DISCOURAGED, but unavoidable in 353 some cases. Applications which require long lines must 354 somehow differentiate between soft and hard line 355 breaks. (A simple way to do this is to use the 356 quoted-printable encoding.) 358 (6) Trailing "white space" characters (SPACE, TAB (HT)) on 359 a line may be discarded by some transport agents, while 360 other transport agents may pad lines with these 361 characters so that all lines in a mail file are of 362 equal length. The persistence of trailing white space, 363 therefore, must not be relied on. 365 (7) Many mail domains use variations on the US-ASCII 366 character set, or use character sets such as EBCDIC 367 which contain most but not all of the US-ASCII 368 characters. 
The correct translation of characters not 369 in the "invariant" set cannot be depended on across 370 character converting gateways. For example, this 371 situation is a problem when sending uuencoded 372 information across BITNET, an EBCDIC system. Similar 373 problems can occur without crossing a gateway, since 374 many Internet hosts use character sets other than US- 375 ASCII internally. The definition of Printable Strings 376 in X.400 adds further restrictions in certain special 377 cases. In particular, the only characters that are 378 known to be consistent across all gateways are the 73 379 characters that correspond to the upper and lower case 380 letters A-Z and a-z, the 10 digits 0-9, and the 381 following eleven special characters: 383 "'" (US-ASCII decimal value 39) 384 "(" (US-ASCII decimal value 40) 385 ")" (US-ASCII decimal value 41) 386 "+" (US-ASCII decimal value 43) 387 "," (US-ASCII decimal value 44) 388 "-" (US-ASCII decimal value 45) 389 "." (US-ASCII decimal value 46) 390 "/" (US-ASCII decimal value 47) 391 ":" (US-ASCII decimal value 58) 392 "=" (US-ASCII decimal value 61) 393 "?" (US-ASCII decimal value 63) 395 A maximally portable mail representation will confine 396 itself to relatively short lines of text in which the 397 only meaningful characters are taken from this set of 398 73 characters. The base64 encoding follows this rule. 400 (8) Some mail transport agents will corrupt data that 401 includes certain literal strings. In particular, a 402 period (".") alone on a line is known to be corrupted 403 by some (incorrect) SMTP implementations, and a line 404 that starts with the five characters "From " (the fifth 405 character is a SPACE) is commonly corrupted as well. 406 A careful composition agent can prevent these 407 corruptions by encoding the data (e.g., in the quoted- 408 printable encoding using "=46rom " in place of "From " 409 at the start of a line, and "=2E" in place of "." alone 410 on a line). 
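As an illustration of guidelines (7) and (8) above, the following Python sketch checks a body against the 73-character set and escapes the two literal strings named in guideline (8). It is a minimal sketch, not part of this specification: the names INVARIANT, uses_only_invariant, protect_line, and protect_body are hypothetical, and the code assumes the body is already otherwise valid quoted-printable text (for example, that any literal "=" has already been encoded) with CRLF line delimiters.

```python
import string

# Assumption: this set mirrors the 73 characters listed in guideline (7):
# 52 letters, 10 digits, and the eleven special characters.
INVARIANT = set(string.ascii_letters + string.digits + "'()+,-./:=?")


def uses_only_invariant(text: str) -> bool:
    """Return True if every character other than CR and LF is in the set."""
    return all(ch in INVARIANT for ch in text if ch not in "\r\n")


def protect_line(line: str) -> str:
    """Escape the two literal strings named in guideline (8).

    Assumes the line is already quoted-printable apart from these cases.
    """
    if line == ".":
        return "=2E"              # a period alone on a line
    if line.startswith("From "):
        return "=46" + line[1:]   # "=46" is the encoding of "F"
    return line


def protect_body(body: str) -> str:
    """Apply protect_line to each CRLF-delimited line of a body."""
    return "\r\n".join(protect_line(line) for line in body.split("\r\n"))
```

For example, protect_body applied to the three lines "Hi", "From x", and "." leaves the first line unchanged and rewrites the other two as "=46rom x" and "=2E". A base64 line such as "SGVsbG8=" passes uses_only_invariant, while text containing a SPACE does not, since SPACE is not among the 73 characters.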
412 Please note that the above list is NOT a list of recommended 413 practices for MTAs. RFC 821 MTAs are prohibited from altering 414 the character of white space or wrapping long lines. These 415 BAD and invalid practices are known to occur on established 416 networks, and implementations should be robust in dealing with 417 the bad effects they can cause. 419 6. Canonical Encoding Model 421 There was some confusion, in earlier versions of these 422 documents, regarding the model for when email data was to be 423 converted to canonical form and encoded, and in particular how 424 this process would affect the treatment of CRLFs, given that 425 the representation of newlines varies greatly from system to 426 system. For this reason, a canonical model for encoding is 427 presented below. 429 The process of composing a MIME entity can be modeled as being 430 done in a number of steps. Note that these steps are roughly 431 similar to those steps used in PEM [RFC1421] and are performed 432 for each "innermost level" body: 434 (1) Creation of local form. 436 The body to be transmitted is created in the system's 437 native format. The native character set is used and, 438 where appropriate, local end of line conventions are 439 used as well. The body may be a UNIX-style text file, 440 or a Sun raster image, or a VMS indexed file, or audio 441 data in a system-dependent format stored only in 442 memory, or anything else that corresponds to the local 443 model for the representation of some form of 444 information. Fundamentally, the data is created in the 445 "native" form that corresponds to the type specified by 446 the media type. 448 (2) Conversion to canonical form. 450 The entire body, including "out-of-band" information 451 such as record lengths and possibly file attribute 452 information, is converted to a universal canonical 453 form. 
The specific media type of the body as well as 454 its associated attributes dictate the nature of the 455 canonical form that is used. Conversion to the proper 456 canonical form may involve character set conversion, 457 transformation of audio data, compression, or various 458 other operations specific to the various media types. 459 If character set conversion is involved, however, care 460 must be taken to understand the semantics of the media 461 type, which may have strong implications for any 462 character set conversion, e.g. with regard to 463 syntactically meaningful characters in a text subtype 464 other than "plain". 466 For example, in the case of text/plain data, the text 467 must be converted to a supported character set and 468 lines must be delimited with CRLF delimiters in 469 accordance with RFC 822. Note that the restriction on 470 line lengths implied by RFC 822 is eliminated if the 471 next step employs either quoted-printable or base64 472 encoding. 474 (3) Apply transfer encoding. 476 A Content-Transfer-Encoding appropriate for this body 477 is applied. Note that there is no fixed relationship 478 between the media type and the transfer encoding. In 479 particular, it may be appropriate to base the choice of 480 base64 or quoted-printable on character frequency 481 counts which are specific to a given instance of a 482 body. 484 (4) Insertion into entity. 486 The encoded body is inserted into a MIME entity with 487 appropriate headers. The entity is then inserted into 488 the body of a higher-level entity (message or 489 multipart) as needed. 491 Conversion from entity form to local form is accomplished by 492 reversing these steps. Note that reversal of these steps may 493 produce differing results since there is no guarantee that the 494 original and final local forms are the same. 496 It is vital to note that these steps are only a model; they 497 are specifically NOT a blueprint for how an actual system 498 would be built. 
In particular, the model fails to account for 499 two common designs: 501 (1) In many cases the conversion to a canonical form prior 502 to encoding will be subsumed into the encoder itself, 503 which understands local formats directly. For example, 504 the local newline convention for text bodies might be 505 carried through to the encoder itself along with 506 knowledge of what that format is. 508 (2) The output of the encoders may have to pass through one 509 or more additional steps prior to being transmitted as 510 a message. As such, the output of the encoder may not 511 be conformant with the formats specified by RFC 822. 512 In particular, once again it may be appropriate for the 513 converter's output to be expressed using local newline 514 conventions rather than using the standard RFC 822 CRLF 515 delimiters. 517 Other implementation variations are conceivable as well. The 518 vital aspect of this discussion is that, in spite of any 519 optimizations, collapsings of required steps, or insertion of 520 additional processing, the resulting messages must be 521 consistent with those produced by the model described here. 522 For example, a message with the following header fields: 524 Content-type: text/foo; charset=bar 525 Content-Transfer-Encoding: base64 527 must be first represented in the text/foo form, then (if 528 necessary) represented in the "bar" character set, and finally 529 transformed via the base64 algorithm into a mail-safe form. 531 NOTE: Some confusion has been caused by systems that represent 532 messages in a format which uses local newline conventions 533 which differ from the RFC822 CRLF convention. It is important 534 to note that these formats are not canonical RFC822/MIME. 535 These formats are instead *encodings* of RFC822, where CRLF 536 sequences in the canonical representation of the message are 537 encoded as the local newline convention. 
Note that formats 538 which encode CRLF sequences as, for example, LF are not 539 capable of representing MIME messages containing binary data 540 which contains LF octets not part of CRLF line separation 541 sequences. 543 7. Summary 545 This document defines what is meant by MIME Conformance. It 546 also details various problems known to exist in the Internet 547 email system and how to use MIME to overcome them. Finally, it 548 describes MIME's canonical encoding model. 550 8. Security Considerations 552 Security issues are discussed in the second document in this 553 set, RFC MIME-IMT. 555 9. Authors' Addresses 557 For more information, the authors of this document are best 558 contacted via Internet mail: 560 Nathaniel S. Borenstein 561 First Virtual Holdings 562 25 Washington Avenue 563 Morristown, NJ 07960 564 USA 566 Email: nsb@nsb.fv.com 567 Phone: +1 201 540 8967 568 Fax: +1 201 993 3032 570 Ned Freed 571 Innosoft International, Inc. 572 1050 East Garvey Avenue South 573 West Covina, CA 91790 574 USA 576 Email: ned@innosoft.com 577 Phone: +1 818 919 3600 578 Fax: +1 818 919 3614 580 MIME is a result of the work of the Internet Engineering Task 581 Force Working Group on Email Extensions. The chairman of that 582 group, Greg Vaudreuil, may be reached at: 584 Gregory M. Vaudreuil 585 Octel Network Services 586 17080 Dallas Parkway 587 Dallas, TX 75248-1905 588 USA 590 Email: Greg.Vaudreuil@Octel.Com 591 10. Acknowledgements 593 This document is the result of the collective effort of a 594 large number of people, at several IETF meetings, on the 595 IETF-SMTP and IETF-822 mailing lists, and elsewhere. 
Although 596 any enumeration seems doomed to suffer from egregious 597 omissions, the following are among the many contributors to 598 this effort: 600 Harald Tveit Alvestrand Marc Andreessen 601 Randall Atkinson Bob Braden 602 Philippe Brandon Brian Capouch 603 Kevin Carosso Uhhyung Choi 604 Peter Clitherow Dave Collier-Brown 605 Cristian Constantinof John Coonrod 606 Mark Crispin Dave Crocker 607 Stephen Crocker Terry Crowley 608 Walt Daniels Jim Davis 609 Frank Dawson Axel Deininger 610 Hitoshi Doi Kevin Donnelly 611 Steve Dorner Keith Edwards 612 Chris Eich Dana S. Emery 613 Johnny Eriksson Craig Everhart 614 Patrik Faltstrom Erik E. Fair 615 Roger Fajman Alain Fontaine 616 Martin Forssen James M. Galvin 617 Stephen Gildea Philip Gladstone 618 Thomas Gordon Keld Simonsen 619 Terry Gray Phill Gross 620 James Hamilton David Herron 621 Mark Horton Bruce Howard 622 Bill Janssen Olle Jarnefors 623 Risto Kankkunen Phil Karn 624 Alan Katz Tim Kehres 625 Neil Katin Steve Kille 626 Kyuho Kim Anders Klemets 627 John Klensin Valdis Kletniek 628 Jim Knowles Stev Knowles 629 Bob Kummerfeld Pekka Kytolaakso 630 Stellan Lagerstrom Vincent Lau 631 Timo Lehtinen Donald Lindsay 632 Warner Losh Carlyn Lowery 633 Laurence Lundblade Charles Lynn 634 John R. MacMillan Larry Masinter 635 Rick McGowan Michael J. McInerny 636 Leo Mclaughlin Goli Montaser-Kohsari 637 Tom Moore John Gardiner Myers 638 Erik Naggum Mark Needleman 639 Chris Newman John Noerenberg 640 Mats Ohrman Julian Onions 641 Michael Patton David J. Pepper 642 Erik van der Poel Blake C. Ramsdell 643 Christer Romson Luc Rooijakkers 644 Marshall T. 
Rose Jonathan Rosenberg 645 Guido van Rossum Jan Rynning 646 Harri Salminen Michael Sanderson 647 Yutaka Sato Markku Savela 648 Richard Alan Schafer Masahiro Sekiguchi 649 Mark Sherman Bob Smart 650 Peter Speck Henry Spencer 651 Einar Stefferud Michael Stein 652 Klaus Steinberger Peter Svanberg 653 James Thompson Steve Uhler 654 Stuart Vance Peter Vanderbilt 655 Greg Vaudreuil Ed Vielmetti 656 Larry W. Virden Ryan Waldron 657 Rhys Weatherly Jay Weber 658 Dave Wecker Wally Wedel 659 Sven-Ove Westberg Brian Wideen 660 John Wobus Glenn Wright 661 Rayan Zachariassen David Zimmerman 663 The authors apologize for any omissions from this list, which 664 are certainly unintentional. 666 Appendix A -- A Complex Multipart Example 668 What follows is the outline of a complex multipart message. 669 This message contains five parts that are to be displayed 670 serially: two introductory plain text objects, an embedded 671 multipart message, a text/enriched object, and a closing 672 encapsulated text message in a non-ASCII character set. The 673 embedded multipart message itself contains two objects to be 674 displayed in parallel, a picture and an audio fragment. 676 MIME-Version: 1.0 677 From: Nathaniel Borenstein 678 To: Ned Freed 679 Date: Fri, 07 Oct 1994 16:15:05 -0700 (PDT) 680 Subject: A multipart example 681 Content-Type: multipart/mixed; 682 boundary=unique-boundary-1 684 This is the preamble area of a multipart message. 685 Mail readers that understand multipart format 686 should ignore this preamble. 688 If you are reading this text, you might want to 689 consider changing to a mail reader that understands 690 how to properly display multipart messages. 692 --unique-boundary-1 694 ... Some text appears here ... 696 [Note that the blank between the boundary and the start 697 of the text in this part means no header fields were 698 given and this is text in the US-ASCII character set. 699 It could have been done with explicit typing as in the 700 next part.] 
702 --unique-boundary-1 703 Content-type: text/plain; charset=US-ASCII 705 This could have been part of the previous part, but 706 illustrates explicit versus implicit typing of body 707 parts. 709 --unique-boundary-1 710 Content-Type: multipart/parallel; boundary=unique-boundary-2 712 --unique-boundary-2 713 Content-Type: audio/basic 714 Content-Transfer-Encoding: base64 716 ... base64-encoded 8000 Hz single-channel 717 mu-law-format audio data goes here ... 719 --unique-boundary-2 720 Content-Type: image/jpeg 721 Content-Transfer-Encoding: base64 723 ... base64-encoded image data goes here ... 725 --unique-boundary-2-- 727 --unique-boundary-1 728 Content-type: text/enriched 730 This is enriched. 731 as defined in RFC 1563 733 Isn't it 734 cool? 736 --unique-boundary-1 737 Content-Type: message/rfc822 739 From: (mailbox in US-ASCII) 740 To: (address in US-ASCII) 741 Subject: (subject in US-ASCII) 742 Content-Type: Text/plain; charset=ISO-8859-1 743 Content-Transfer-Encoding: Quoted-printable 745 ... Additional text in ISO-8859-1 goes here ... 747 --unique-boundary-1-- 748 Appendix B -- Changes from RFC 1521, 1522, and 1590 750 These documents are a revision of RFC 1521, 1522, and 1590. 751 For the convenience of those familiar with the earlier 752 documents, the changes from those documents are summarized in 753 this appendix. For further history, note that Appendix H in 754 RFC 1521 specified how that document differed from its 755 predecessor, RFC 1341. 757 (1) This document has been completely reformatted and split 758 into multiple documents. This was done to improve the 759 quality of the plain text version of this document, 760 which is required to be the reference copy. 762 (2) BNF describing the overall structure of MIME object 763 headers has been added. This is a documentation change 764 only -- the underlying syntax has not changed in any 765 way. 767 (3) The specific BNF for the seven media types in MIME has 768 been removed. 
This BNF was incorrect, incomplete, and 769 inconsistent with the type-independent BNF. And 770 since the type-independent BNF already fully specifies 771 the syntax of the various MIME headers, the type- 772 specific BNF was, in the final analysis, completely 773 unnecessary and caused more problems than it solved. 775 (4) The more specific "US-ASCII" character set name has 776 replaced the use of the term ASCII in many parts of 777 this specification. 779 (5) The informal concept of a primary subtype has been 780 removed. 782 (6) The term "object" was being used inconsistently. The 783 definition of this term has been clarified, along with 784 the related terms "body", "body part", and "entity", 785 and usage has been corrected where appropriate. 787 (7) The BNF for the multipart media type has been 788 rearranged to make it clear that the CRLF preceding 789 the boundary marker is actually part of the marker 790 itself rather than the preceding body part. 792 (8) The prose and BNF describing the multipart media type 793 have been changed to make it clear that the body parts 794 within a multipart object MUST NOT contain any lines 795 beginning with the boundary parameter string. 797 (9) In the rules on reassembling "message/partial" MIME 798 entities, "Subject" is added to the list of headers to 799 take from the inner message, and the example is 800 modified to clarify this point. 802 (10) In the discussion of the application/postscript type, 803 an additional paragraph has been added warning about 804 possible interoperability problems caused by embedding 805 of binary data inside a PostScript MIME entity. 807 (11) Added a clarifying note to the basic syntax rules for 808 the Content-Type header field to make it clear that the 809 following two forms: 811 Content-type: text/plain; charset=us-ascii (comment) 813 Content-type: text/plain; charset="us-ascii" 815 are completely equivalent.
817 (12) The following sentence has been removed from the 818 discussion of the MIME-Version header: "However, 819 conformant software is encouraged to check the version 820 number and at least warn the user if an unrecognized 821 MIME-version is encountered." 823 (13) A typo was fixed that said "application/external-body" 824 instead of "message/external-body". 826 (14) The definition of a character set has been reorganized 827 to make the requirements clearer. 829 (15) The definition of the "image/gif" media type has been 830 moved to a separate document. This change was made 831 because of potential conflicts with IETF rules 832 governing the standardization of patented technology. 834 (16) The definitions of "7bit" and "8bit" have been 835 tightened so that CR and LF can only be used 836 as part of CRLF end-of-line sequences. The document also no longer 837 requires that NUL characters be preserved, which brings 838 MIME into alignment with real-world implementations. 840 (17) The definition of canonical text in MIME has been 841 tightened so that line breaks must be represented by a 842 CRLF sequence. CR and LF characters are not allowed 843 outside of this usage. The definition of quoted- 844 printable encoding has been altered accordingly. 846 (18) Prose was added to clarify the use of the "7bit", 847 "8bit", and "binary" transfer-encodings on multipart or 848 message entities encapsulating "8bit" or "binary" data. 850 (19) In the section on MIME Conformance, "multipart/digest" 851 support was added to the list of requirements for 852 minimal MIME conformance. Also, the requirements for 853 "message/rfc822" support were strengthened to clarify 854 the importance of recognizing recursive structure. 856 (20) The various restrictions on subtypes of "message" are 857 now specified entirely on a subtype by subtype basis.
859 (21) The definition of "message/rfc822" was changed to 860 indicate that at least one of the "From", "Subject", or 861 "Date" headers must be present. 863 (22) The required handling of unrecognized subtypes as 864 "application/octet-stream" has been made more explicit 865 in both the type definitions sections and the 866 conformance guidelines. 868 (23) Examples using text/richtext were changed to 869 text/enriched. 871 (24) The BNF definition of subtype has been changed to make 872 it clear that either an IANA registered subtype or a 873 nonstandard "X-" subtype must be used in a Content-Type 874 header field. 876 (25) MIME media types that are simply registered for use and 877 those that are standardized by the IETF are now 878 distinguished in the MIME BNF. 880 (26) All of the various MIME registration procedures have 881 been extensively revised. IANA registration procedures 882 for character sets have been moved to a separate 883 document that isn't part of the MIME specification set. 885 (27) The use of escape and shift mechanisms in the US-ASCII 886 and ISO-8859-X character sets this specification 887 defines has been clarified: Such mechanisms should 888 never be used in conjunction with these character sets 889 and their effect if they are used is undefined. 891 (28) The definition of the AFS access-type for 892 message/external-body has been removed. 894 (29) The handling of the combination of 895 multipart/alternative and message/external-body is now 896 specifically addressed. 898 (30) Security issues specific to message/external-body are 899 now discussed in some detail. 901 Appendix C -- References 903 [ATK] 904 Borenstein, Nathaniel S., Multimedia Applications 905 Development with the Andrew Toolkit, Prentice-Hall, 1990. 907 [ISO-2022] 908 International Standard -- Information Processing -- ISO 909 7-bit and 8-bit Coded Character Sets -- Code Extension 910 Techniques, ISO 2022:1986. 
912 [ISO-8859] 913 International Standard -- Information Processing -- 8-bit 914 Single-Byte Coded Graphic Character Sets -- Part 1: Latin 915 Alphabet No. 1, ISO 8859-1:1987. Part 2: Latin alphabet 916 No. 2, ISO 8859-2, 1987. Part 3: Latin alphabet No. 3, 917 ISO 8859-3, 1988. Part 4: Latin alphabet No. 4, ISO 918 8859-4, 1988. Part 5: Latin/Cyrillic alphabet, ISO 919 8859-5, 1988. Part 6: Latin/Arabic alphabet, ISO 8859-6, 920 1987. Part 7: Latin/Greek alphabet, ISO 8859-7, 1987. 921 Part 8: Latin/Hebrew alphabet, ISO 8859-8, 1988. Part 9: 922 Latin alphabet No. 5, ISO 8859-9, 1990. 924 [ISO-646] 925 International Standard -- Information Processing -- ISO 926 7-bit Coded Character Set For Information Interchange, 927 ISO 646:1983. 929 [JPEG] 930 JPEG Draft Standard ISO 10918-1 CD. 932 [MPEG] 933 Video Coding Draft Standard ISO 11172 CD, ISO 934 IEC/JTC1/SC2/WG11 (Motion Picture Experts Group), May, 935 1991. 937 [PCM] 938 CCITT, Fascicle III.4 - Recommendation G.711, "Pulse Code 939 Modulation (PCM) of Voice Frequencies", Geneva, 1972. 941 [POSTSCRIPT] 942 Adobe Systems, Inc., PostScript Language Reference 943 Manual, Addison-Wesley, 1985. 945 [POSTSCRIPT2] 946 Adobe Systems, Inc., PostScript Language Reference 947 Manual, Addison-Wesley, Second Edition, 1990. 949 [RFC-783] 950 Sollins, K.R., "TFTP Protocol (revision 2)", RFC-783, 951 MIT, June 1981. 953 [RFC-821] 954 Postel, J.B., "Simple Mail Transfer Protocol", STD 10, 955 RFC 821, USC/Information Sciences Institute, August 1982. 957 [RFC-822] 958 Crocker, D., "Standard for the Format of ARPA Internet 959 Text Messages", STD 11, RFC 822, UDEL, August 1982. 961 [RFC-934] 962 Rose, M. and E. Stefferud, "Proposed Standard for Message 963 Encapsulation", RFC 934, Delaware and NMA, January 1985. 965 [RFC-959] 966 Postel, J. and J. Reynolds, "File Transfer Protocol", STD 967 9, RFC 959, USC/Information Sciences Institute, October 968 1985. 
970 [RFC-1049] 971 Sirbu, M., "Content-Type Header Field for Internet 972 Messages", RFC 1049, CMU, March 1988. 974 [RFC-1154] 975 Robinson, D. and R. Ullmann, "Encoding Header Field for 976 Internet Messages", RFC 1154, Prime Computer, Inc., April 977 1990. 979 [RFC-1341] 980 Borenstein, N., and N. Freed, "MIME (Multipurpose 981 Internet Mail Extensions): Mechanisms for Specifying and 982 Describing the Format of Internet Message Bodies", RFC 983 1341, Bellcore, Innosoft, June 1992. 985 [RFC-1342] 986 Moore, K., "Representation of Non-Ascii Text in Internet 987 Message Headers", RFC 1342, University of Tennessee, June 988 1992. 990 [RFC-1344] 991 Borenstein, N., "Implications of MIME for Internet Mail 992 Gateways", RFC 1344, Bellcore, June 1992. 994 [RFC-1345] 995 Simonsen, K., "Character Mnemonics & Character Sets", RFC 996 1345, Rationel Almen Planlaegning, June 1992. 998 [RFC-1421] 999 Linn, J., "Privacy Enhancement for Internet Electronic 1000 Mail: Part I -- Message Encryption and Authentication 1001 Procedures", RFC 1421, IAB IRTF PSRG, IETF PEM WG, 1002 February 1993. 1004 [RFC-1422] 1005 Kent, S., "Privacy Enhancement for Internet Electronic 1006 Mail: Part II -- Certificate-Based Key Management", RFC 1007 1422, IAB IRTF PSRG, IETF PEM WG, February 1993. 1009 [RFC-1423] 1010 Balenson, D., "Privacy Enhancement for Internet 1011 Electronic Mail: Part III -- Algorithms, Modes, and 1012 Identifiers", RFC 1423, IAB IRTF PSRG, IETF PEM WG, February 1993. 1014 [RFC-1424] 1015 Kaliski, B., "Privacy Enhancement for Internet Electronic 1016 Mail: Part IV -- Key Certification and Related 1017 Services", RFC 1424, IAB IRTF PSRG, IETF PEM WG, February 1993. 1019 [RFC-1521] 1020 Borenstein, N. and Freed, N., "MIME (Multipurpose 1021 Internet Mail Extensions): Mechanisms for Specifying and 1022 Describing the Format of Internet Message Bodies", RFC 1023 1521, Bellcore, Innosoft, September, 1993.
1025 [RFC-1522] 1026 Moore, K., "Representation of Non-ASCII Text in Internet 1027 Message Headers", RFC 1522, University of Tennessee, 1028 September 1993. 1030 [RFC-1524] 1031 Borenstein, N., "A User Agent Configuration Mechanism for 1032 Multimedia Mail Format Information", RFC 1524, Bellcore, 1033 September 1993. 1035 [RFC-1543] 1036 Postel, J., "Instructions to RFC Authors", RFC 1543, 1037 USC/Information Sciences Institute, October 1993. 1039 [RFC-1563] 1040 Borenstein, N., "The text/enriched MIME Content-type", 1041 RFC 1563, Bellcore, January, 1994. 1043 [RFC-1590] 1044 Postel, J., "Media Type Registration Procedure", RFC 1045 1590, USC/Information Sciences Institute, March 1994. 1047 [RFC-1602] 1048 Internet Architecture Board, Internet Engineering 1049 Steering Group, Huitema, C., Gross, P., "The Internet 1050 Standards Process -- Revision 2", RFC 1602, March 1994. 1052 [RFC-1652] 1053 Klensin, J., (WG Chair), Freed, N., (Editor), Rose, M., 1054 Stefferud, E., and Crocker, D., "SMTP Service Extension 1055 for 8bit-MIME transport", RFC 1652, United Nations 1056 University, Innosoft, Dover Beach Consulting, Inc., 1057 Network Management Associates, Inc., The Branch Office, 1058 March 1994. 1060 [RFC-1700] 1061 Reynolds, J. and J. Postel, "Assigned Numbers", STD 2, 1062 RFC 1700, USC/Information Sciences Institute, October 1063 1994. 1065 [RFC-1741] 1066 Faltstrom, P., Crocker, D., and Fair, E., "MIME Content 1067 Type for BinHex Encoded Files", RFC 1741, December 1994. 1069 [RFC-MIME-IMB] 1070 Borenstein, N. and Freed, N., "Multipurpose Internet Mail 1071 Extensions (MIME) Part One: Format of Internet Message 1072 Bodies", RFC MIME-IMB, Bellcore, Innosoft, March 1996. 1074 [RFC-MIME-IMT] 1075 Borenstein, N. and Freed, N., "Multipurpose Internet Mail 1076 Extensions (MIME) Part Two: Media Types", RFC MIME-IMT, 1077 Bellcore, Innosoft, March 1996.
1079 [RFC-MIME-HEADERS] 1080 Moore, K., "Multipurpose Internet Mail Extensions (MIME) 1081 Part Three: Representation of Non-Ascii Text in Internet 1082 Message Headers", RFC MIME-HEADERS, University of 1083 Tennessee, ?. 1085 [RFC-MIME-REG] 1086 Freed, N., Klensin, J., Postel, J., "Multipurpose Internet 1087 Mail Extensions (MIME) Part Four: MIME Registration 1088 Procedures", RFC MIME-REG, ISI, Innosoft, March 1996. 1090 [RFC-MIME-CONF] 1091 Borenstein, N. and Freed, N., "Multipurpose Internet Mail 1092 Extensions (MIME) Part Five: Conformance Criteria and 1093 Examples", RFC MIME-CONF, Bellcore, Innosoft, March 1996. 1095 [US-ASCII] 1096 Coded Character Set -- 7-Bit American Standard Code for 1097 Information Interchange, ANSI X3.4-1986. 1099 [X400] 1100 Schicker, Pietro, "Message Handling Systems, X.400", 1101 Message Handling Systems and Distributed Applications, E. 1102 Stefferud, O-j. Jacobsen, and P. Schicker, eds., North- 1103 Holland, 1989, pp. 3-41.