idnits 2.17.1 draft-ietf-822ext-mime-conf-04.txt: ** The Abstract section seems to be numbered Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-23) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity -- however, there's a paragraph with a matching beginning. Boilerplate error? 
** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. ** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 242: '...orming user agents MUST include proper...' RFC 2119 keyword, line 776: '...multipart object MUST NOT contain any ...' Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 102 has weird spacing: '...ortions of MI...' == Line 145 has weird spacing: '...tations must ...' -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (January 1996) is 10326 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Missing reference section? 
'RFC821' on line 134 looks like a reference -- Missing reference section? 'RFC1421' on line 413 looks like a reference -- Missing reference section? 'ATK' on line 880 looks like a reference -- Missing reference section? 'ISO-2022' on line 884 looks like a reference -- Missing reference section? 'ISO-8859' on line 889 looks like a reference -- Missing reference section? 'ISO-646' on line 901 looks like a reference -- Missing reference section? 'JPEG' on line 906 looks like a reference -- Missing reference section? 'MPEG' on line 909 looks like a reference -- Missing reference section? 'PCM' on line 914 looks like a reference -- Missing reference section? 'POSTSCRIPT' on line 918 looks like a reference -- Missing reference section? 'POSTSCRIPT2' on line 922 looks like a reference -- Missing reference section? 'RFC-783' on line 926 looks like a reference -- Missing reference section? 'RFC-821' on line 930 looks like a reference -- Missing reference section? 'RFC-822' on line 934 looks like a reference -- Missing reference section? 'RFC-934' on line 938 looks like a reference -- Missing reference section? 'RFC-959' on line 942 looks like a reference -- Missing reference section? 'RFC-1049' on line 947 looks like a reference -- Missing reference section? 'RFC-1154' on line 951 looks like a reference -- Missing reference section? 'RFC-1341' on line 956 looks like a reference -- Missing reference section? 'RFC-1342' on line 962 looks like a reference -- Missing reference section? 'RFC-1344' on line 967 looks like a reference -- Missing reference section? 'RFC-1345' on line 971 looks like a reference -- Missing reference section? 'RFC-1421' on line 975 looks like a reference -- Missing reference section? 'RFC-1422' on line 981 looks like a reference -- Missing reference section? 'RFC-1423' on line 986 looks like a reference -- Missing reference section? 'RFC-1424' on line 991 looks like a reference -- Missing reference section? 
'RFC-1521' on line 996 looks like a reference -- Missing reference section? 'RFC-1522' on line 1002 looks like a reference -- Missing reference section? 'RFC-1524' on line 1007 looks like a reference -- Missing reference section? 'RFC-1543' on line 1012 looks like a reference -- Missing reference section? 'RFC-1563' on line 1016 looks like a reference -- Missing reference section? 'RFC-1590' on line 1020 looks like a reference -- Missing reference section? 'RFC-1602' on line 1024 looks like a reference -- Missing reference section? 'RFC-1652' on line 1029 looks like a reference -- Missing reference section? 'RFC-1700' on line 1037 looks like a reference -- Missing reference section? 'RFC-1741' on line 1042 looks like a reference -- Missing reference section? 'RFC-MIME-IMB' on line 1046 looks like a reference -- Missing reference section? 'RFC-MIME-IMT' on line 1051 looks like a reference -- Missing reference section? 'RFC-MIME-HEADERS' on line 1056 looks like a reference -- Missing reference section? 'RFC-MIME-REG' on line 1062 looks like a reference -- Missing reference section? 'RFC-MIME-CONF' on line 1067 looks like a reference -- Missing reference section? 'US-ASCII' on line 1073 looks like a reference -- Missing reference section? 'X400' on line 1077 looks like a reference Summary: 9 errors (**), 0 flaws (~~), 3 warnings (==), 45 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Network Working Group Nathaniel Borenstein 2 Internet Draft Ned Freed 3 5 Multipurpose Internet Mail Extensions 6 (MIME) Part Five: 8 Conformance Criteria and Examples 10 January 1996 12 Status of this Memo 14 This document is an Internet-Draft. Internet-Drafts are 15 working documents of the Internet Engineering Task Force 16 (IETF), its areas, and its working groups. 
Note that other 17 groups may also distribute working documents as Internet- 18 Drafts. 20 Internet-Drafts are draft documents valid for a maximum of six 21 months. Internet-Drafts may be updated, replaced, or obsoleted 22 by other documents at any time. It is not appropriate to use 23 Internet-Drafts as reference material or to cite them other 24 than as a "working draft" or "work in progress". 26 To learn the current status of any Internet-Draft, please 27 check the 1id-abstracts.txt listing contained in the 28 Internet-Drafts Shadow Directories on ds.internic.net (US East 29 Coast), nic.nordu.net (Europe), ftp.isi.edu (US West Coast), 30 or munnari.oz.au (Pacific Rim). 32 1. Abstract 34 STD 11, RFC 822, defines a message representation protocol 35 specifying considerable detail about US-ASCII message headers, 36 and leaves the message content, or message body, as flat US- 37 ASCII text. This set of documents, collectively called the 38 Multipurpose Internet Mail Extensions, or MIME, redefines the 39 format of messages to allow for 40 (1) textual message bodies in character sets other than 41 US-ASCII, 43 (2) non-textual message bodies, 45 (3) multi-part message bodies, and 47 (4) textual header information in character sets other than 48 US-ASCII. 50 These documents are based on earlier work documented in RFC 51 934, STD 11, and RFC 1049, but extend and revise them. 52 Because RFC 822 said so little about message bodies, these 53 documents are largely orthogonal to (rather than a revision 54 of) RFC 822. 56 In particular, these documents are designed to provide 57 facilities to include multiple parts in a single message, to 58 represent body and header text in character sets other than 59 US-ASCII, to represent formatted multi-font text messages, to 60 represent non-textual material such as images and audio clips, 61 and generally to facilitate later extensions defining new 62 types of Internet mail for use by cooperating mail agents.
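As a rough illustration of the facilities listed above (multiple parts in one message, body and header text outside US-ASCII, and non-textual material), the following sketch builds such a message with the Python standard library "email" package. The package is a present-day tool and not part of these documents; the addresses, subject, body text, and attachment name are hypothetical values chosen only for the example.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_example() -> EmailMessage:
    """Build a two-part message: non-ASCII text plus binary data."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"       # hypothetical address
    msg["To"] = "recipient@example.com"      # hypothetical address
    msg["Subject"] = "Grüße"                 # header text outside US-ASCII

    # Text body in a character set other than US-ASCII, explicitly
    # labelled and encoded with quoted-printable.
    msg.set_content("Très bien, merci.\n", charset="utf-8",
                    cte="quoted-printable")

    # Non-textual part; adding it turns the message into multipart/mixed.
    # Binary attachments are base64-encoded by default.
    msg.add_attachment(b"\x00\x01\x02\xff", maintype="application",
                       subtype="octet-stream", filename="blob.bin")
    return msg


msg = build_example()
wire = msg.as_bytes()                        # serialized form of the message
parsed = message_from_bytes(wire, policy=policy.default)
parts = list(parsed.iter_parts())
```

After a round trip through the serialized form, "parsed" and "parts" give back the decoded subject, the text body, and the original octets of the binary part.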
64 The initial document in this set, RFC MIME-IMB, specifies the 65 various headers used to describe the structure of MIME 66 messages. The second document defines the general structure of 67 the MIME media typing system and defines an initial set of 68 media types. The third document, RFC MIME-HEADERS, describes 69 extensions to RFC 822 to allow non-US-ASCII text data in 70 Internet mail header fields. The fourth document, RFC MIME- 71 REG, specifies various IANA registration procedures for MIME- 72 related facilities. This fifth and final document describes 73 MIME conformance criteria and provides some 74 illustrative examples of MIME message formats, 75 acknowledgements, and the bibliography. 77 These documents are revisions of RFCs 1521, 1522, and 1590, 78 which themselves were revisions of RFCs 1341 and 1342. 79 Appendix B of this document describes differences and changes 80 from previous versions. 82 2. Table of Contents 84 1 Abstract .............................................. 1 85 2 Table of Contents ..................................... 3 86 3 Introduction .......................................... 3 87 4 MIME Conformance ...................................... 3 88 5 Guidelines for Sending Email Data ..................... 7 89 6 Canonical Encoding Model .............................. 10 90 7 Summary ............................................... 13 91 8 Security Considerations ............................... 13 92 9 Authors' Addresses .................................... 13 93 10 Acknowledgements ..................................... 15 94 A A Complex Multipart Example ........................... 17 95 B Changes from RFC 1521, 1522, and 1590 ................. 19 96 C References ............................................ 23 98 3. Introduction 100 The first and second documents in this set defined the MIME header 101 fields and the initial set of MIME media types.
This document 102 describes what portions of MIME must be supported by a 103 conformant MIME implementation. It also describes various 104 pitfalls of contemporary messaging systems as well as the 105 canonical encoding model MIME is based on. 107 4. MIME Conformance 109 The mechanisms described in these documents are open-ended. 110 It is definitely not expected that all implementations will 111 support all available media types, nor that they will all 112 share the same extensions. In order to promote 113 interoperability, however, it is useful to define the concept 114 of "MIME-conformance" to define a certain level of 115 implementation that allows the useful interworking of messages 116 with content that differs from US-ASCII text. In this 117 section, we specify the requirements for such conformance. 119 A mail user agent that is MIME-conformant MUST: 121 (1) Always generate a "MIME-Version: 1.0" header field in 122 any message it creates. 124 (2) Recognize the Content-Transfer-Encoding header field 125 and decode all received data encoded by either quoted- 126 printable or base64 implementations. The identity 127 transformations 7bit, 8bit, and binary must also be 128 recognized. 130 Any non-7bit data that is sent without encoding must be 131 properly labelled with a content-transfer-encoding of 132 8bit or binary, as appropriate. If the underlying 133 transport does not support 8bit or binary (as SMTP 134 [RFC821] does not), the sender is required to both 135 encode and label data using an appropriate Content- 136 Transfer-Encoding such as quoted-printable or base64. 138 (3) Treat any unrecognized Content-Transfer-Encoding 139 as if it had a Content-Type of "application/octet- 140 stream", regardless of whether or not the actual 141 Content-Type is recognized. 143 (4) Recognize and interpret the Content-Type header field, 144 and avoid showing users raw data with a Content-Type 145 field other than text.
Implementations must be able 146 to send at least text/plain messages, with the 147 character set specified with the charset parameter if 148 it is not US-ASCII. 150 (5) Ignore any content type parameters whose names they do 151 not recognize. 153 (6) Explicitly handle the following media type values, to 154 at least the following extents: 156 Text: 158 -- Recognize and display "text" mail with the 159 character set "US-ASCII." 161 -- Recognize other character sets at least to the 162 extent of being able to inform the user about what 163 character set the message uses. 165 -- Recognize the "ISO-8859-*" character sets to the 166 extent of being able to display those characters that 167 are common to ISO-8859-* and US-ASCII, namely all 168 characters represented by octet values 1-127. 170 -- For unrecognized subtypes in a known character 171 set, show or offer to show the user the "raw" version 172 of the data after conversion of the content from 173 canonical form to local form. 175 -- Treat material in an unknown character set as if 176 it were "application/octet-stream". 178 Image, audio, and video: 180 -- At a minimum provide facilities to treat any 181 unrecognized subtypes as if they were 182 "application/octet-stream". 184 Application: 186 -- Offer the ability to remove either of the quoted- 187 printable or base64 encodings defined in this 188 document if they were used and put the resulting 189 information in a user file. 191 Multipart: 193 -- Recognize the mixed subtype. Display all relevant 194 information on the message level and the body part 195 header level and then display or offer to display 196 each of the body parts individually. 198 -- Recognize the "alternative" subtype, and avoid 199 showing the user redundant parts of 200 multipart/alternative mail.
202 -- Recognize the "multipart/digest" subtype, 203 specifically using "message/rfc822" rather than 204 "text/plain" as the default media type for body parts 205 inside "multipart/digest" entities. 207 -- Treat any unrecognized subtypes as if they were 208 "mixed". 210 Message: 212 -- Recognize and display at least the RFC822 message 213 encapsulation (message/rfc822) in such a way as to 214 preserve any recursive structure, that is, displaying 215 or offering to display the encapsulated data in 216 accordance with its media type. 218 -- Treat any unrecognized subtypes as if they were 219 "application/octet-stream". 221 (7) Upon encountering any unrecognized Content-Type field, 222 an implementation must treat it as if it had a media 223 type of "application/octet-stream" with no parameter 224 sub-arguments. How such data are handled is up to an 225 implementation, but likely options for handling such 226 unrecognized data include offering the user to write it 227 into a file (decoded from its mail transport format) or 228 offering the user to name a program to which the 229 decoded data should be passed as input. 231 (8) Conformant user agents are required, if they provide 232 non-standard support for non-MIME messages employing 233 character sets other than US-ASCII, to do so on 234 received messages only. Conforming user agents must not 235 send non-MIME messages containing anything other than 236 US-ASCII text. 238 In particular, the use of non-US-ASCII text in mail 239 messages without a MIME-Version field is strongly 240 discouraged as it impedes interoperability when sending 241 messages between regions with different localization 242 conventions. Conforming user agents MUST include proper 243 MIME labelling when sending anything other than plain 244 text in the US-ASCII character set. 
246 In addition, non-MIME user agents should be upgraded if 247 at all possible to include appropriate MIME header 248 information in the messages they send even if nothing 249 else in MIME is supported. This upgrade will have 250 little, if any, effect on non-MIME recipients and will 251 aid MIME in correctly displaying such messages. It 252 also provides a smooth transition path to eventual 253 adoption of other MIME capabilities. 255 A user agent that meets the above conditions is said to be 256 MIME-conformant. The meaning of this phrase is that it is 257 assumed to be "safe" to send virtually any kind of properly- 258 marked data to users of such mail systems, because such 259 systems will at least be able to treat the data as 260 undifferentiated binary, and will not simply splash it onto 261 the screen of unsuspecting users. 263 There is another sense in which it is always "safe" to send 264 data in a format that is MIME-conformant, which is that such 265 data will not break or be broken by any known systems that are 266 conformant with RFC 821 and RFC 822. User agents that are 267 MIME-conformant have the additional guarantee that the user 268 will not be shown data that were never intended to be viewed 269 as text. 271 5. Guidelines for Sending Email Data 273 Internet email is not a perfect, homogeneous system. Mail may 274 become corrupted at several stages in its travel to a final 275 destination. Specifically, email sent throughout the Internet 276 may travel across many networking technologies. Many 277 networking and mail technologies do not support the full 278 functionality possible in the SMTP transport environment. 279 Mail traversing these systems is likely to be modified in 280 order that it can be transported. 282 There exist many widely-deployed non-conformant MTAs in the 283 Internet. 
These MTAs, speaking the SMTP protocol, alter 284 messages on the fly to take advantage of the internal data 285 structure of the hosts they are implemented on, or are just 286 plain broken. 288 The following guidelines may be useful to anyone devising a 289 data format (media type) that is supposed to survive the 290 widest range of networking technologies and known broken MTAs 291 unscathed. Note that anything encoded in the base64 encoding 292 will satisfy these rules, but that some well-known mechanisms, 293 notably the UNIX uuencode facility, will not. Note also that 294 anything encoded in the Quoted-Printable encoding will survive 295 most gateways intact, but possibly not some gateways to 296 systems that use the EBCDIC character set. 298 (1) Under some circumstances the encoding used for data may 299 change as part of normal gateway or user agent 300 operation. In particular, conversion from base64 to 301 quoted-printable and vice versa may be necessary. This 302 may result in the confusion of CRLF sequences with line 303 breaks in text bodies. As such, the persistence of 304 CRLF as something other than a line break must not be 305 relied on. 307 (2) Many systems may elect to represent and store text data 308 using local newline conventions. Local newline 309 conventions may not match the RFC822 CRLF convention -- 310 systems are known that use plain CR, plain LF, CRLF, or 311 counted records. The result is that isolated CR and LF 312 characters are not well tolerated in general; they may 313 be lost or converted to delimiters on some systems, and 314 hence must not be relied on. 316 (3) The transmission of NULs (US-ASCII value 0) is 317 problematic in Internet mail. (This is largely the 318 result of NULs being used as a termination character by 319 many of the standard runtime library routines in the C 320 programming language.) 
The practice of using NULs as 321 termination characters is so entrenched now that 322 messages should not rely on them being preserved. 324 (4) TAB (HT) characters may be misinterpreted or may be 325 automatically converted to variable numbers of spaces. 326 This is unavoidable in some environments, notably those 327 not based on the US-ASCII character set. Such 328 conversion is STRONGLY DISCOURAGED, but it may occur, 329 and mail formats must not rely on the persistence of 330 TAB (HT) characters. 332 (5) Lines longer than 76 characters may be wrapped or 333 truncated in some environments. Line wrapping and line 334 truncation are STRONGLY DISCOURAGED, but unavoidable in 335 some cases. Applications which require long lines must 336 somehow differentiate between soft and hard line 337 breaks. (A simple way to do this is to use the 338 quoted-printable encoding.) 340 (6) Trailing "white space" characters (SPACE, TAB (HT)) on 341 a line may be discarded by some transport agents, while 342 other transport agents may pad lines with these 343 characters so that all lines in a mail file are of 344 equal length. The persistence of trailing white space, 345 therefore, must not be relied on. 347 (7) Many mail domains use variations on the US-ASCII 348 character set, or use character sets such as EBCDIC 349 which contain most but not all of the US-ASCII 350 characters. The correct translation of characters not 351 in the "invariant" set cannot be depended on across 352 character converting gateways. For example, this 353 situation is a problem when sending uuencoded 354 information across BITNET, an EBCDIC system. Similar 355 problems can occur without crossing a gateway, since 356 many Internet hosts use character sets other than US- 357 ASCII internally. The definition of Printable Strings 358 in X.400 adds further restrictions in certain special 359 cases. 
In particular, the only characters that are 360 known to be consistent across all gateways are the 73 361 characters that correspond to the upper and lower case 362 letters A-Z and a-z, the 10 digits 0-9, and the 363 following eleven special characters: 365 "'" (US-ASCII decimal value 39) 366 "(" (US-ASCII decimal value 40) 367 ")" (US-ASCII decimal value 41) 368 "+" (US-ASCII decimal value 43) 369 "," (US-ASCII decimal value 44) 370 "-" (US-ASCII decimal value 45) 371 "." (US-ASCII decimal value 46) 372 "/" (US-ASCII decimal value 47) 373 ":" (US-ASCII decimal value 58) 374 "=" (US-ASCII decimal value 61) 375 "?" (US-ASCII decimal value 63) 377 A maximally portable mail representation will confine 378 itself to relatively short lines of text in which the 379 only meaningful characters are taken from this set of 380 73 characters. The base64 encoding follows this rule. 382 (8) Some mail transport agents will corrupt data that 383 includes certain literal strings. In particular, a 384 period (".") alone on a line is known to be corrupted 385 by some (incorrect) SMTP implementations, and a line 386 that starts with the five characters "From " (the fifth 387 character is a SPACE) is commonly corrupted as well. 388 A careful composition agent can prevent these 389 corruptions by encoding the data (e.g., in the quoted- 390 printable encoding using "=46rom " in place of "From " 391 at the start of a line, and "=2E" in place of "." alone 392 on a line). 394 Please note that the above list is NOT a list of recommended 395 practices for MTAs. RFC 821 MTAs are prohibited from altering 396 the character of white space or wrapping long lines. These 397 BAD and invalid practices are known to occur on established 398 networks, and implementations should be robust in dealing with 399 the bad effects they can cause. 401 6.
Canonical Encoding Model 403 There was some confusion, in earlier versions of these 404 documents, regarding the model for when email data was to be 405 converted to canonical form and encoded, and in particular how 406 this process would affect the treatment of CRLFs, given that 407 the representation of newlines varies greatly from system to 408 system. For this reason, a canonical model for encoding is 409 presented below. 411 The process of composing a MIME entity can be modeled as being 412 done in a number of steps. Note that these steps are roughly 413 similar to those steps used in PEM [RFC1421] and are performed 414 for each "innermost level" body: 416 (1) Creation of local form. 418 The body to be transmitted is created in the system's 419 native format. The native character set is used and, 420 where appropriate, local end of line conventions are 421 used as well. The body may be a UNIX-style text file, 422 or a Sun raster image, or a VMS indexed file, or audio 423 data in a system-dependent format stored only in 424 memory, or anything else that corresponds to the local 425 model for the representation of some form of 426 information. Fundamentally, the data is created in the 427 "native" form that corresponds to the type specified by 428 the media type. 430 (2) Conversion to canonical form. 432 The entire body, including "out-of-band" information 433 such as record lengths and possibly file attribute 434 information, is converted to a universal canonical 435 form. The specific media type of the body as well as 436 its associated attributes dictate the nature of the 437 canonical form that is used. Conversion to the proper 438 canonical form may involve character set conversion, 439 transformation of audio data, compression, or various 440 other operations specific to the various media types. 
441 If character set conversion is involved, however, care 442 must be taken to understand the semantics of the media 443 type, which may have strong implications for any 444 character set conversion, e.g. with regard to 445 syntactically meaningful characters in a text subtype 446 other than "plain". 448 For example, in the case of text/plain data, the text 449 must be converted to a supported character set and 450 lines must be delimited with CRLF delimiters in 451 accordance with RFC 822. Note that the restriction on 452 line lengths implied by RFC 822 is eliminated if the 453 next step employs either quoted-printable or base64 454 encoding. 456 (3) Apply transfer encoding. 458 A Content-Transfer-Encoding appropriate for this body 459 is applied. Note that there is no fixed relationship 460 between the media type and the transfer encoding. In 461 particular, it may be appropriate to base the choice of 462 base64 or quoted-printable on character frequency 463 counts which are specific to a given instance of a 464 body. 466 (4) Insertion into entity. 468 The encoded body is inserted into a MIME entity with 469 appropriate headers. The entity is then inserted into 470 the body of a higher-level entity (message or 471 multipart) as needed. 473 Conversion from entity form to local form is accomplished by 474 reversing these steps. Note that reversal of these steps may 475 produce differing results since there is no guarantee that the 476 original and final local forms are the same. 478 It is vital to note that these steps are only a model; they 479 are specifically NOT a blueprint for how an actual system 480 would be built. In particular, the model fails to account for 481 two common designs: 483 (1) In many cases the conversion to a canonical form prior 484 to encoding will be subsumed into the encoder itself, 485 which understands local formats directly. 
For example, 486 the local newline convention for text bodies might be 487 carried through to the encoder itself along with 488 knowledge of what that format is. 490 (2) The output of the encoders may have to pass through one 491 or more additional steps prior to being transmitted as 492 a message. As such, the output of the encoder may not 493 be conformant with the formats specified by RFC 822. 494 In particular, once again it may be appropriate for the 495 converter's output to be expressed using local newline 496 conventions rather than using the standard RFC 822 CRLF 497 delimiters. 499 Other implementation variations are conceivable as well. The 500 vital aspect of this discussion is that, in spite of any 501 optimizations, collapsings of required steps, or insertion of 502 additional processing, the resulting messages must be 503 consistent with those produced by the model described here. 504 For example, a message with the following header fields: 506 Content-type: text/foo; charset=bar 507 Content-Transfer-Encoding: base64 509 must be first represented in the text/foo form, then (if 510 necessary) represented in the "bar" character set, and finally 511 transformed via the base64 algorithm into a mail-safe form. 513 NOTE: Some confusion has been caused by systems that represent 514 messages in a format which uses local newline conventions 515 which differ from the RFC822 CRLF convention. It is important 516 to note that these formats are not canonical RFC822/MIME. 517 These formats are instead *encodings* of RFC822, where CRLF 518 sequences in the canonical representation of the message are 519 encoded as the local newline convention. Note that formats 520 which encode CRLF sequences as, for example, LF are not 521 capable of representing MIME messages containing binary data 522 which contains LF octets not part of CRLF line separation 523 sequences. 525 7. Summary 527 This document defines what is meant by MIME Conformance. 
It 528 also details various problems known to exist in the Internet 529 email system and how to use MIME to overcome them. Finally, it 530 describes MIME's canonical encoding model. 532 8. Security Considerations 534 Security issues are discussed in the second document in this 535 set, RFC MIME-IMT. 537 9. Authors' Addresses 539 For more information, the authors of this document are best 540 contacted via Internet mail: 542 Nathaniel S. Borenstein 543 First Virtual Holdings 544 25 Washington Avenue 545 Morristown, NJ 07960 546 USA 548 Email: nsb@nsb.fv.com 549 Phone: +1 201 540 8967 550 Fax: +1 201 993 3032 552 Ned Freed 553 Innosoft International, Inc. 554 1050 East Garvey Avenue South 555 West Covina, CA 91790 556 USA 558 Email: ned@innosoft.com 559 Phone: +1 818 919 3600 560 Fax: +1 818 919 3614 562 MIME is a result of the work of the Internet Engineering Task 563 Force Working Group on Email Extensions. The chairman of that 564 group, Greg Vaudreuil, may be reached at: 566 Gregory M. Vaudreuil 567 Tigon Corporation 568 17060 Dallas Parkway 569 Dallas Texas, 75248 571 Email: greg.vaudreuil@ons.octel.com 572 Phone: +1 214 733 2722 573 10. Acknowledgements 575 This document is the result of the collective effort of a 576 large number of people, at several IETF meetings, on the 577 IETF-SMTP and IETF-822 mailing lists, and elsewhere. Although 578 any enumeration seems doomed to suffer from egregious 579 omissions, the following are among the many contributors to 580 this effort: 582 Harald Tveit Alvestrand Marc Andreessen 583 Randall Atkinson Bob Braden 584 Philippe Brandon Brian Capouch 585 Kevin Carosso Uhhyung Choi 586 Peter Clitherow Dave Collier-Brown 587 Cristian Constantinof John Coonrod 588 Mark Crispin Dave Crocker 589 Stephen Crocker Terry Crowley 590 Walt Daniels Jim Davis 591 Frank Dawson Axel Deininger 592 Hitoshi Doi Kevin Donnelly 593 Steve Dorner Keith Edwards 594 Chris Eich Dana S. 
Emery 595 Johnny Eriksson Craig Everhart 596 Patrik Faltstrom Erik E. Fair 597 Roger Fajman Alain Fontaine 598 Martin Forssen James M. Galvin 599 Stephen Gildea Philip Gladstone 600 Thomas Gordon Keld Simonsen 601 Terry Gray Phill Gross 602 James Hamilton David Herron 603 Mark Horton Bruce Howard 604 Bill Janssen Olle Jarnefors 605 Risto Kankkunen Phil Karn 606 Alan Katz Tim Kehres 607 Neil Katin Steve Kille 608 Kyuho Kim Anders Klemets 609 John Klensin Valdis Kletniek 610 Jim Knowles Stev Knowles 611 Bob Kummerfeld Pekka Kytolaakso 612 Stellan Lagerstrom Vincent Lau 613 Timo Lehtinen Donald Lindsay 614 Warner Losh Carlyn Lowery 615 Laurence Lundblade Charles Lynn 616 John R. MacMillan Larry Masinter 617 Rick McGowan Michael J. McInerny 618 Leo Mclaughlin Goli Montaser-Kohsari 619 Tom Moore John Gardiner Myers 620 Erik Naggum Mark Needleman 621 Chris Newman John Noerenberg 622 Mats Ohrman Julian Onions 623 Michael Patton David J. Pepper 624 Erik van der Poel Blake C. Ramsdell 625 Christer Romson Luc Rooijakkers 626 Marshall T. Rose Jonathan Rosenberg 627 Guido van Rossum Jan Rynning 628 Harri Salminen Michael Sanderson 629 Yutaka Sato Markku Savela 630 Richard Alan Schafer Masahiro Sekiguchi 631 Mark Sherman Bob Smart 632 Peter Speck Henry Spencer 633 Einar Stefferud Michael Stein 634 Klaus Steinberger Peter Svanberg 635 James Thompson Steve Uhler 636 Stuart Vance Peter Vanderbilt 637 Greg Vaudreuil Ed Vielmetti 638 Larry W. Virden Ryan Waldron 639 Rhys Weatherly Jay Weber 640 Dave Wecker Wally Wedel 641 Sven-Ove Westberg Brian Wideen 642 John Wobus Glenn Wright 643 Rayan Zachariassen David Zimmerman 645 The authors apologize for any omissions from this list, which 646 are certainly unintentional. 648 Appendix A -- A Complex Multipart Example 650 What follows is the outline of a complex multipart message. 
651 This message contains five parts that are to be displayed 652 serially: two introductory plain text objects, an embedded 653 multipart message, a text/enriched object, and a closing 654 encapsulated text message in a non-ASCII character set. The 655 embedded multipart message itself contains two objects to be 656 displayed in parallel, a picture and an audio fragment. 658 MIME-Version: 1.0 659 From: Nathaniel Borenstein 660 To: Ned Freed 661 Date: Fri, 07 Oct 1994 16:15:05 -0700 (PDT) 662 Subject: A multipart example 663 Content-Type: multipart/mixed; 664 boundary=unique-boundary-1 666 This is the preamble area of a multipart message. 667 Mail readers that understand multipart format 668 should ignore this preamble. 670 If you are reading this text, you might want to 671 consider changing to a mail reader that understands 672 how to properly display multipart messages. 674 --unique-boundary-1 676 ... Some text appears here ... 678 [Note that the blank between the boundary and the start 679 of the text in this part means no header fields were 680 given and this is text in the US-ASCII character set. 681 It could have been done with explicit typing as in the 682 next part.] 684 --unique-boundary-1 685 Content-type: text/plain; charset=US-ASCII 687 This could have been part of the previous part, but 688 illustrates explicit versus implicit typing of body 689 parts. 691 --unique-boundary-1 692 Content-Type: multipart/parallel; boundary=unique-boundary-2 694 --unique-boundary-2 695 Content-Type: audio/basic 696 Content-Transfer-Encoding: base64 698 ... base64-encoded 8000 Hz single-channel 699 mu-law-format audio data goes here ... 701 --unique-boundary-2 702 Content-Type: image/jpeg 703 Content-Transfer-Encoding: base64 705 ... base64-encoded image data goes here ... 707 --unique-boundary-2-- 709 --unique-boundary-1 710 Content-type: text/enriched 712 This is <bold><italic>enriched.</italic></bold> 713 <smaller>as defined in RFC 1563</smaller> 715 Isn't it 716 <bigger><bigger>cool?</bigger></bigger>
718 --unique-boundary-1 719 Content-Type: message/rfc822 721 From: (mailbox in US-ASCII) 722 To: (address in US-ASCII) 723 Subject: (subject in US-ASCII) 724 Content-Type: Text/plain; charset=ISO-8859-1 725 Content-Transfer-Encoding: Quoted-printable 727 ... Additional text in ISO-8859-1 goes here ... 729 --unique-boundary-1-- 730 Appendix B -- Changes from RFC 1521, 1522, and 1590 732 These documents are a revision of RFC 1521, 1522, and 1590. 733 For the convenience of those familiar with the earlier 734 documents, the changes from those documents are summarized in 735 this appendix. For further history, note that Appendix H in 736 RFC 1521 specified how that document differed from its 737 predecessor, RFC 1341. 739 (1) This document has been completely reformatted and split 740 into multiple documents. This was done to improve the 741 quality of the plain text version of this document, 742 which is required to be the reference copy. 744 (2) BNF describing the overall structure of MIME object 745 headers has been added. This is a documentation change 746 only -- the underlying syntax has not changed in any 747 way. 749 (3) The specific BNF for the seven media types in MIME has 750 been removed. This BNF was incorrect, incomplete, and 751 inconsistent with the type-independent BNF. And 752 since the type-independent BNF already fully specifies 753 the syntax of the various MIME headers, the type- 754 specific BNF was, in the final analysis, completely 755 unnecessary and caused more problems than it solved. 757 (4) The more specific "US-ASCII" character set name has 758 replaced the use of the term ASCII in many parts of 759 this specification. 761 (5) The informal concept of a primary subtype has been 762 removed. 764 (6) The term "object" was being used inconsistently. The 765 definition of this term has been clarified, along with 766 the related terms "body", "body part", and "entity", 767 and usage has been corrected where appropriate.
769 (7) The BNF for the multipart media type has been 770 rearranged to make it clear that the CRLF preceding 771 the boundary marker is actually part of the marker 772 itself rather than the preceding body part. 774 (8) The prose and BNF describing the multipart media type 775 have been changed to make it clear that the body parts 776 within a multipart object MUST NOT contain any lines 777 beginning with the boundary parameter string. 779 (9) In the rules on reassembling "message/partial" MIME 780 entities, "Subject" is added to the list of headers to 781 take from the inner message, and the example is 782 modified to clarify this point. 784 (10) In the discussion of the application/postscript type, 785 an additional paragraph has been added warning about 786 possible interoperability problems caused by embedding 787 of binary data inside a PostScript MIME entity. 789 (11) Added a clarifying note to the basic syntax rules for 790 the Content-Type header field to make it clear that the 791 following two forms: 793 Content-type: text/plain; charset=us-ascii (comment) 795 Content-type: text/plain; charset="us-ascii" 797 are completely equivalent. 799 (12) The following sentence has been removed from the 800 discussion of the MIME-Version header: "However, 801 conformant software is encouraged to check the version 802 number and at least warn the user if an unrecognized 803 MIME-version is encountered." 805 (13) A typo was fixed that said "application/external-body" 806 instead of "message/external-body". 808 (14) The definition of a character set has been reorganized 809 to make the requirements clearer. 811 (15) The definition of the "image/gif" media type has been 812 moved to a separate document. This change was made 813 because of potential conflicts with IETF rules 814 governing the standardization of patented technology.
816 (16) The definitions of "7bit" and "8bit" have been 817 tightened so that CR and LF can only occur as part of 818 CRLF end-of-line sequences. The document also no longer 819 requires that NUL characters be preserved, which brings 820 MIME into alignment with real-world implementations. 822 (17) The definition of canonical text in MIME has been 823 tightened so that line breaks must be represented by a 824 CRLF sequence. CR and LF characters are not allowed 825 outside of this usage. The definition of quoted- 826 printable encoding has been altered accordingly. 828 (18) Prose was added to clarify the use of the "7bit", 829 "8bit", and "binary" transfer-encodings on multipart or 830 message entities encapsulating "8bit" or "binary" data. 832 (19) In the section on MIME Conformance, "multipart/digest" 833 support was added to the list of requirements for 834 minimal MIME conformance. Also, the requirements for 835 "message/rfc822" support were strengthened to clarify 836 the importance of recognizing recursive structure. 838 (20) The various restrictions on subtypes of "message" are 839 now specified entirely on a subtype by subtype basis. 841 (21) The definition of "message/rfc822" was changed to 842 indicate that at least one of the "From", "Subject", or 843 "Date" headers must be present. 845 (22) The required handling of unrecognized subtypes as 846 "application/octet-stream" has been made more explicit 847 in both the type definitions sections and the 848 conformance guidelines. 850 (23) Examples using text/richtext were changed to 851 text/enriched. 853 (24) The BNF definition of subtype has been changed to make 854 it clear that either an IANA registered subtype or a 855 nonstandard "X-" subtype must be used in a Content-Type 856 header field.
858 (25) The use of escape and shift mechanisms in the US-ASCII 859 and ISO-8859-X character sets this specification 860 defines has been clarified: Such mechanisms should 861 never be used in conjunction with these character sets 862 and their effect if they are used is undefined. 864 (26) The definition of the AFS access-type for 865 message/external-body has been removed. 867 (27) MIME objects that are simply registered for use and 868 those that are standardized by the IETF are now 869 distinguished in the MIME BNF. 871 (28) The handling of the combination of 872 multipart/alternative and message/external-body is now 873 specifically addressed. 875 (29) Security issues specific to message/external-body are 876 now discussed in some detail. 878 Appendix C -- References 880 [ATK] 881 Borenstein, Nathaniel S., Multimedia Applications 882 Development with the Andrew Toolkit, Prentice-Hall, 1990. 884 [ISO-2022] 885 International Standard -- Information Processing -- ISO 886 7-bit and 8-bit Coded Character Sets -- Code Extension 887 Techniques, ISO 2022:1986. 889 [ISO-8859] 890 International Standard -- Information Processing -- 8-bit 891 Single-Byte Coded Graphic Character Sets -- Part 1: Latin 892 Alphabet No. 1, ISO 8859-1:1987. Part 2: Latin alphabet 893 No. 2, ISO 8859-2, 1987. Part 3: Latin alphabet No. 3, 894 ISO 8859-3, 1988. Part 4: Latin alphabet No. 4, ISO 895 8859-4, 1988. Part 5: Latin/Cyrillic alphabet, ISO 896 8859-5, 1988. Part 6: Latin/Arabic alphabet, ISO 8859-6, 897 1987. Part 7: Latin/Greek alphabet, ISO 8859-7, 1987. 898 Part 8: Latin/Hebrew alphabet, ISO 8859-8, 1988. Part 9: 899 Latin alphabet No. 5, ISO 8859-9, 1990. 901 [ISO-646] 902 International Standard -- Information Processing -- ISO 903 7-bit Coded Character Set For Information Interchange, 904 ISO 646:1983. 906 [JPEG] 907 JPEG Draft Standard ISO 10918-1 CD. 909 [MPEG] 910 Video Coding Draft Standard ISO 11172 CD, ISO 911 IEC/JTC1/SC2/WG11 (Motion Picture Experts Group), May, 912 1991. 
914 [PCM] 915 CCITT, Fascicle III.4 - Recommendation G.711, "Pulse Code 916 Modulation (PCM) of Voice Frequencies", Geneva, 1972. 918 [POSTSCRIPT] 919 Adobe Systems, Inc., PostScript Language Reference 920 Manual, Addison-Wesley, 1985. 922 [POSTSCRIPT2] 923 Adobe Systems, Inc., PostScript Language Reference 924 Manual, Addison-Wesley, Second Edition, 1990. 926 [RFC-783] 927 Sollins, K.R., "TFTP Protocol (revision 2)", RFC-783, 928 MIT, June 1981. 930 [RFC-821] 931 Postel, J.B., "Simple Mail Transfer Protocol", STD 10, 932 RFC 821, USC/Information Sciences Institute, August 1982. 934 [RFC-822] 935 Crocker, D., "Standard for the Format of ARPA Internet 936 Text Messages", STD 11, RFC 822, UDEL, August 1982. 938 [RFC-934] 939 Rose, M. and E. Stefferud, "Proposed Standard for Message 940 Encapsulation", RFC 934, Delaware and NMA, January 1985. 942 [RFC-959] 943 Postel, J. and J. Reynolds, "File Transfer Protocol", STD 944 9, RFC 959, USC/Information Sciences Institute, October 945 1985. 947 [RFC-1049] 948 Sirbu, M., "Content-Type Header Field for Internet 949 Messages", RFC 1049, CMU, March 1988. 951 [RFC-1154] 952 Robinson, D. and R. Ullmann, "Encoding Header Field for 953 Internet Messages", RFC 1154, Prime Computer, Inc., April 954 1990. 956 [RFC-1341] 957 Borenstein, N., and N. Freed, "MIME (Multipurpose 958 Internet Mail Extensions): Mechanisms for Specifying and 959 Describing the Format of Internet Message Bodies", RFC 960 1341, Bellcore, Innosoft, June 1992. 962 [RFC-1342] 963 Moore, K., "Representation of Non-Ascii Text in Internet 964 Message Headers", RFC 1342, University of Tennessee, June 965 1992. 967 [RFC-1344] 968 Borenstein, N., "Implications of MIME for Internet Mail 969 Gateways", RFC 1344, Bellcore, June 1992. 971 [RFC-1345] 972 Simonsen, K., "Character Mnemonics & Character Sets", RFC 973 1345, Rationel Almen Planlaegning, June 1992. 
975 [RFC-1421] 976 Linn, J., "Privacy Enhancement for Internet Electronic 977 Mail: Part I -- Message Encryption and Authentication 978 Procedures", RFC 1421, IAB IRTF PSRG, IETF PEM WG, 979 February 1993. 981 [RFC-1422] 982 Kent, S., "Privacy Enhancement for Internet Electronic 983 Mail: Part II -- Certificate-Based Key Management", RFC 984 1422, IAB IRTF PSRG, IETF PEM WG, February 1993. 986 [RFC-1423] 987 Balenson, D., "Privacy Enhancement for Internet 988 Electronic Mail: Part III -- Algorithms, Modes, and 989 Identifiers", RFC 1423, IAB IRTF PSRG, IETF PEM WG, February 1993. 991 [RFC-1424] 992 Kaliski, B., "Privacy Enhancement for Internet Electronic 993 Mail: Part IV -- Key Certification and Related 994 Services", RFC 1424, IAB IRTF PSRG, IETF PEM WG, February 1993. 996 [RFC-1521] 997 Borenstein, N. and Freed, N., "MIME (Multipurpose 998 Internet Mail Extensions): Mechanisms for Specifying and 999 Describing the Format of Internet Message Bodies", RFC 1000 1521, Bellcore, Innosoft, September, 1993. 1002 [RFC-1522] 1003 Moore, K., "Representation of Non-ASCII Text in Internet 1004 Message Headers", RFC 1522, University of Tennessee, 1005 September 1993. 1007 [RFC-1524] 1008 Borenstein, N., "A User Agent Configuration Mechanism for 1009 Multimedia Mail Format Information", RFC 1524, Bellcore, 1010 September 1993. 1012 [RFC-1543] 1013 Postel, J., "Instructions to RFC Authors", RFC 1543, 1014 USC/Information Sciences Institute, October 1993. 1016 [RFC-1563] 1017 Borenstein, N., "The text/enriched MIME Content-type", 1018 RFC 1563, Bellcore, January, 1994. 1020 [RFC-1590] 1021 Postel, J., "Media Type Registration Procedure", RFC 1022 1590, USC/Information Sciences Institute, March 1994. 1024 [RFC-1602] 1025 Internet Architecture Board, Internet Engineering 1026 Steering Group, Huitema, C., Gross, P., "The Internet 1027 Standards Process -- Revision 2", RFC 1602, March 1994.
1029 [RFC-1652] 1030 Klensin, J., (WG Chair), Freed, N., (Editor), Rose, M., 1031 Stefferud, E., and Crocker, D., "SMTP Service Extension 1032 for 8bit-MIME transport", RFC 1652, United Nations 1033 University, Innosoft, Dover Beach Consulting, Inc., 1034 Network Management Associates, Inc., The Branch Office, 1035 March 1994. 1037 [RFC-1700] 1038 Reynolds, J. and J. Postel, "Assigned Numbers", STD 2, 1039 RFC 1700, USC/Information Sciences Institute, October 1040 1994. 1042 [RFC-1741] 1043 Faltstrom, P., Crocker, D., and Fair, E., "MIME Content 1044 Type for BinHex Encoded Files", RFC 1741, December 1994. 1046 [RFC-MIME-IMB] 1047 Borenstein, N. and Freed, N., "Multipurpose Internet Mail 1048 Extensions (MIME) Part One: Format of Internet Message 1049 Bodies", RFC MIME-IMB, Bellcore, Innosoft, January 1996. 1051 [RFC-MIME-IMT] 1052 Borenstein, N. and Freed, N., "Multipurpose Internet Mail 1053 Extensions (MIME) Part Two: Media Types", RFC MIME-IMT, 1054 Bellcore, Innosoft, January 1996. 1056 [RFC-MIME-HEADERS] 1057 Moore, K., "Multipurpose Internet Mail Extensions (MIME) 1058 Part Three: Representation of Non-Ascii Text in Internet 1059 Message Headers", RFC MIME-HEADERS, University of 1060 Tennessee, ?. 1062 [RFC-MIME-REG] 1063 Postel, J. and Freed, N., "Multipurpose Internet Mail 1064 Extensions (MIME) Part Four: Media Type Registration 1065 Procedure", RFC MIME-REG, ISI, Innosoft, January 1996. 1067 [RFC-MIME-CONF] 1068 Borenstein, N. and Freed, N., "Multipurpose Internet Mail 1069 Extensions (MIME) Part Five: Conformance Criteria and 1070 Examples", RFC MIME-CONF, Bellcore, Innosoft, January 1071 1996. 1073 [US-ASCII] 1074 Coded Character Set -- 7-Bit American Standard Code for 1075 Information Interchange, ANSI X3.4-1986. 1077 [X400] 1078 Schicker, Pietro, "Message Handling Systems, X.400", 1079 Message Handling Systems and Distributed Applications, E. 1080 Stefferud, O-j. Jacobsen, and P. Schicker, eds., North- 1081 Holland, 1989, pp. 3-41.
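
The ordering required by the canonical encoding model (canonical form with CRLF line breaks first, then the declared character set, and finally the transfer encoding) can be sketched in Python. This is a minimal, non-normative sketch: the helper names encode_text_body and decode_text_body are hypothetical, and ISO-8859-1 stands in for the placeholder "bar" character set used in the text/foo example.

```python
import base64
import re


def encode_text_body(text: str, charset: str = "iso-8859-1") -> bytes:
    """Hypothetical helper: local text -> canonical CRLF -> charset -> base64."""
    # Step 1: map any local newline convention (CRLF, CR, or LF) to canonical CRLF.
    canonical = re.sub(r"\r\n|\r|\n", "\r\n", text)
    # Step 2: represent the canonical text in the declared character set.
    octets = canonical.encode(charset)
    # Step 3: apply the transfer encoding; encodebytes wraps output at 76 characters.
    encoded = base64.encodebytes(octets)
    # encodebytes ends each output line with LF; mail transport expects CRLF.
    return encoded.replace(b"\n", b"\r\n")


def decode_text_body(data: bytes, charset: str = "iso-8859-1") -> str:
    """Inverse of the above; the result is still in canonical CRLF form."""
    # b64decode ignores the CRLF line breaks because they are outside the alphabet.
    return base64.b64decode(data).decode(charset)
```

Note that the decoder returns text in canonical form; converting CRLF back to a local newline convention is a separate step, and is only safe for text, since binary data may contain LF octets that are not part of a line break.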