Network Working Group                                        Jacob Palme
Internet Draft                                  Stockholm University/KTH
draft-ietf-mhtml-info-11.txt               Category-to-be: Informational
Expires: September 1998                                       March 1999

     Sending HTML in MIME, an informational supplement to the RFC:
    MIME Encapsulation of Aggregate Documents, such as HTML (MHTML)

Status of this Memo

This document is an Internet-Draft and is in full conformance
with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-
Drafts as reference material or to cite them other than as
"work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

Copyright (C) The Internet Society 1998.
All Rights Reserved.

1. Abstract

The memo "MIME Encapsulation of Aggregate Documents, such as HTML
(MHTML)" (draft-ietf-mhtml-rev-05.txt) specifies how to send packaged
aggregate HTML objects in MIME format. This memo is an accompanying
informational document, intended to be an aid to developers. This
document is not an Internet standard.

Issues discussed are implementation methods, caching strategies, problems
with rewriting of URIs, making messages suitable both for mailers which
can and which cannot handle Multipart/related, and handling recipients
which do not have full Internet connectivity.

The latest version of this document is available in HTML format at:
http://www.dsv.su.se/~jpalme/ietf/mhtml-info.html

Differences from the previous versions 9 and 10 of this draft

(1) A paragraph about one disadvantage with MAILTO action elements has
    been added to section 10.

(2) A new section 13: Default font size has been added.

(3) A new temporary section "Issue list" immediately below has been
    added.

Issue list

Section in   Issue description
this draft

4            Should more methods of communication between html
             viewer and e-mail program be described? Are the methods
             correctly described?

5            Are there any more problems with rewriting URIs which
             should be described in section 5?

8            Is it OK to say that senders should not assume that
             recipients will show the value of Content-Description
             inside Multipart/Related (since HTML has other methods of
             showing this, for example the element)?

9            Should we recommend Multipart/related as done in section 9?

9            Section 9 describes two ways of using
             Multipart/alternative, 9.1 with Multipart/alternative
             inside Multipart/related, and 9.2 with
             Multipart/alternative outside Multipart/related.

             Note: I have tested with a few existing mailers. Eudora
             4.0.1 puts multipart/related outside multipart/alternative;
             Netscape puts multipart/alternative outside
             multipart/related. I did not know how to put images into a
             message with Outlook Express, so I am not sure how it would
             handle this.

             The advantage with multipart/related outside, as Eudora
             does it, is that the image will be shown to recipients
             whose mailers can handle attachments but not html.

             Should we recommend support for both alternatives or for
             only one of them?

10           Is the description of pros and cons of mailto versus http
             ACTION element in forms OK?

12           Section 12 contains the figure which was removed from the
             standard, because people said it was not correct, but which
             I feel described the character encoding issues better than
             the text in the standard. If, however, the figure is still
             incorrect, we should perhaps remove that section?

13           Is the description about conversion from HTTP to MIME
             correct?

14           Is the new section 13 on default font size correct?

2. Table of Contents

1.  Abstract
2.  Table of Contents
3.  Introduction
4.  Implementation methods
    4.1 Method 1: Combining viewer and MIME receiving program
    4.2 Method 2: Rewriting the HTML
    4.3 Method 3: Using a translation table
    4.4 Method 4: Using a proxy HTTP server to retrieve referenced body
        parts
    4.5 Method 5: Putting the mail client into a proxy HTTP server
    4.6 Other methods
    4.7 Combined methods
    4.8 Communication between document viewer and mail client
5.  Problems with rewriting URIs when copying HTML documents
6.  Caching of body parts
7.  "Save as" command
8.  Recipients which cannot handle the Multipart/related Content-Type
9.  Use of the Content-Type: Multipart/alternative
    9.1 Multipart/alternative inside Multipart/related
    9.2 Multipart/alternative outside Multipart/related
    9.3 Comparing the two methods
    9.4 Reducing the download time
10. Recipient may not have full Internet connectivity
11. Encoding of non-ascii characters
12. Conversion from HTTP to MIME
13. Default font size
14. Acknowledgments
15. References
16. Author's Address

Mailing List Information

Further discussion on this document should be done through the mailing
list MHTML@SEGATE.SUNET.SE.

To subscribe to this list, send a message to
   LISTSERV@SEGATE.SUNET.SE
which contains the text
   SUB MHTML

Archives of this list are available by anonymous ftp from
   FTP://SEGATE.SUNET.SE/lists/mHTML/
The archives are also available by email. Send a message to
LISTSERV@SEGATE.SUNET.SE with the text "INDEX MHTML" to get a list of the
archive files, and then a new message "GET " to retrieve the
archive files.

Comments on less important details may also be sent to the editor, Jacob
Palme.

More information may also be available at URL:
   HTTP://www.dsv.su.se/~jpalme/ietf/mhtml.html

3. Introduction

[MHTML] specifies how to send packaged aggregate HTML objects in MIME
multipart format. This memo is an accompanying informational document,
intended to be an aid to developers. This document is not an Internet
standard.

4. Implementation methods

The [MHTML] standard has been intentionally written to be implementable
both in cases where an HTML document viewer (web browser) and a program
receiving MIME objects, such as an email program, are combined, and when
they are separate programs. Implementation is of course easier if the
document viewer is combined with the MIME receiving client.

Different implementation methods are described below. Real
implementations may sometimes combine ideas from more than one of these
methods.

Note: Some document viewers can take a whole document of "Content-Type:
message" or "Content-Type: multipart" as one single file to be displayed.
When such viewers are known to be used, the problems described below
become much easier to handle: just submit the whole combined MIME message
as a single file to the viewer.

4.1 Method 1: Combining viewer and MIME receiving program

This is the architecturally simplest approach. A web browser with a
built-in MIME receiving program (such as an email program) will be able
to use its own document viewer capabilities to display HTML-formatted
messages. Since it is the same program, that program will more easily be
able to connect a URL in the HTML text to a body part in the message.

4.2 Method 2: Rewriting the HTML

    +----------+                          +--------+
    | Document |                          |  Mail  |
    |  viewer  |                          | client |
    +-------+--+                          +-+------+
            |                               |
         +--+-------------------------------+--+
         | +----------+ +--+ +--+              |
         | | Start    | |  | |  | Related      |   Figure 1
         | | HTML     | |  | |  | body part    |
         | | document | |  | |  | parts        |
         | +----------+ +--+ +--+              |
         +-------------------------------------+

If the document viewer is separate from the MIME receiving client, the
MIME client might turn over the HTML body part to the document viewer and
ask it to display it (Figure 1). One way of doing this is to store the
HTML body part in a file, and ask the document viewer to display this
file. If multipart/related is used, this can be implemented by storing
all the body parts within the multipart/related in an otherwise empty
folder/directory.

The mail client may have to rewrite the HTML, replacing URI-s with
(possibly relative) URL-s which the Document viewer can resolve as file
names in the same directory/folder where the HTML document itself is
stored when turning it over to the Document viewer. Problems with such
rewriting of URIs are discussed in section 5 below.
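
The storing step of method 2 can be sketched as follows. This is only a
minimal illustration, assuming that the Content-Location values are
simple file names so that no rewriting of the HTML is needed. The
function name unpack_related, the fallback file names, and the choice of
the first Text/HTML part as the start document are assumptions made for
this sketch and are not taken from [MHTML].

```python
import email
import os
import tempfile
from email import policy


def unpack_related(raw_message, target_dir=None):
    """Store every body part of a multipart/related message as a file.

    Returns the path of the first text/html part, which a separate
    document viewer could then be asked to open. Hypothetical helper.
    """
    if target_dir is None:
        # An otherwise empty directory, as suggested in the text above.
        target_dir = tempfile.mkdtemp()
    msg = email.message_from_string(raw_message, policy=policy.default)
    start_path = ""
    for index, part in enumerate(msg.iter_parts()):
        is_html = part.get_content_type() == "text/html"
        fallback = "start.html" if is_html else f"part{index}.bin"
        location = str(part.get("Content-Location", "")).strip().strip('"')
        # Keep only the last path segment so nothing is written
        # outside target_dir.
        name = os.path.basename(location)
        if name in ("", ".", ".."):
            name = fallback
        path = os.path.join(target_dir, name)
        with open(path, "wb") as handle:
            handle.write(part.get_payload(decode=True) or b"")
        if is_html and not start_path:
            start_path = path
    return start_path
```

A mail client could pass the returned path to whatever "open this file"
mechanism the platform offers. Messages whose parts are referenced by
CID or by absolute URIs would still need the rewriting discussed in
section 5.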

4.3 Method 3: Using a translation table

    +----------+                         +--------+
    | Document |                         |  Mail  |
    |  viewer  |                         | client |
    +-------+--+                         +-+------+
            |                              |
         +--+------------------------------+-+
         | +--------+ +--+ +--+              |
         | | Trans- | |  | |  | Related      |   Figure 2
         | | lation | |  | |  | body part    |
         | | table  | |  | |  | parts        |
         | +--------+ +--+ +--+              |
         +-----------------------------------+

An alternative to rewriting the HTML file before turning it over to the
Document viewer may be to use a translation table, in case the Document
viewer has the capability to use such a table to rewrite URL-s on the fly
while displaying the document (Figure 2). This requires that the Document
viewer is capable of receiving CID: URL-s and resolving them using this
translation table in the same way as for other URL-s.

4.4 Method 4: Using a proxy HTTP server to retrieve referenced body
    parts

    +--------+       +-----------+       +--------+
    | Proxy  |       | Data base |       |  Mail  |
    | web    |-------| of cached |-------| server |
    | server |       | objects   |       |        |
    +----+---+       +-----------+       +----+---+
         |                                    |
    +----+-----+                         +----+---+   Figure 3
    | Document |                         |  Mail  |
    |  viewer  |                         | client |
    +-------+--+                         +-+------+
            |                              |
         +--+------------------------------+-+
         |         Start HTML object         |
         +-----------------------------------+

Yet another method is to use a proxy web server, to which the document
viewer requests are sent, and which will then use the cached body parts
instead of normal web retrieval from the network (Figure 3). If the
Document viewer is set to use this proxy server for all URL-s, including
CID URL-s, no rewriting of the HTML will be necessary.
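
The lookup that methods 3 and 4 both rely on can be sketched as a table
from URLs to body parts. This is a hedged illustration only: the names
build_translation_table and resolve are hypothetical, the table is
matched on exact strings, and resolution of relative URIs against a base
is deliberately left out.

```python
import email
from email import policy


def build_translation_table(raw_message):
    """Map cid: URLs and Content-Location values to body parts.

    Hypothetical helper for this memo; exact-string matching only.
    """
    msg = email.message_from_string(raw_message, policy=policy.default)
    table = {}
    for part in msg.iter_parts():
        content_id = part.get("Content-ID")
        if content_id:
            # Content-ID: <foo@bar> is referenced as cid:foo@bar
            table["cid:" + str(content_id).strip().strip("<>")] = part
        location = part.get("Content-Location")
        if location:
            table[str(location).strip().strip('"')] = part
    return table


def resolve(table, url):
    """Return the body part for url, or None if the message has none.

    A None result is where a proxy (method 4) would fall back to
    normal retrieval from the network.
    """
    return table.get(url)
```

A viewer with a translation table would consult resolve for every URL
it encounters, while a proxy would do the same for every incoming
request before going to the network.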

4.5 Method 5: Putting the mail client into a proxy HTTP server

    +--------+--------+
    | Proxy  |  Mail  |
    | HTTP   | client |
    | server |        |
    +--------+--------+
             |
       HTTP protocol            Figure 4
             |
        +----+-----+
        | Document |
        | Viewer   |
        +----------+

A mail client can also be included in an HTTP server (Figure 4). The user
will then not have to install any mail client software in his personal
computer; all the mail functionality is mapped on HTTP and HTML elements.

4.6 Other methods

The mail client and the document viewer can of course communicate in
other ways, such as using inter-process communication.

4.7 Combined methods

Several of the methods described above can also be combined. The mailer
might for example display simpler HTML documents itself, but
automatically or manually transfer the HTML documents to a separate HTML
viewer for more complex documents.

A common practice in HTML viewers is to simply ignore all markups which
the viewer does not understand. This practice, if implemented in a mailer
with limited HTML viewing capabilities, might mean that the user is shown
a very incomplete message without any warning that information is
missing. In this case, it is better to give the user some kind of
warning, combined with a command to view the letter with a separate HTML
viewer, or turn the document over automatically to a separate viewer when
the document contains markup which the mailer cannot render itself.

4.8 Communication between document viewer and mail client

Many document viewers (web browsers) have API-s to allow other programs
to communicate with them. There is however no accepted real or de-facto
standard for such API-s, which means that a mail program which relies on
such API-s will only be able to use those document viewers whose API
it supports.

Note however, that most of the methods described above can be implemented
with a very minimal such API. The only API function needed is to be able
to tell a document viewer, when it is started, to open a particular file.
This API function is a standardized part of the operating system on
most platforms. In particular, methods 1 and 3 above use the
functionality that a relative URL is resolved with the location of the
base document as base. This means that if the base document is a file,
relative URL-s will be resolved as FILE URL-s in the same
directory/folder where the HTML document itself is placed.

There is a need for buttons in the Web page which the user can use to get
back to the mail program again after reading the mail with the document
viewer. A common technique to achieve this is to define a new MIME data
type for this button. The document viewer is then configured to transfer
control to the mail client when the user pushes this button, i.e.
downloads a file of this new MIME type.

5. Problems with rewriting URIs when copying HTML documents

Sending of HTML-formatted messages is based on the assumption that an
HTML document, together with in-line objects like images, applets and
frames, can be copied into a MIME message. Such copying may require
rewriting of URIs containing references between the different message
parts. The MHTML standard [MHTML] has been carefully prepared to allow
existing web pages to be copied without such rewriting, through the use
of the Content-Location MIME content heading field.

There is however a problem if the source HTML document contains relative
URIs in parameters to objects and applets, such as in the example below:

    From: foo1@bar.net
    To: foo2@bar.net
    Subject: A simple example
    Mime-Version: 1.0
    Content-Type: multipart/related; boundary="boundary-example-1";
                  type="Text/HTML"
    Content-Base: "http://www.ietf.cnri.reston.va.us"

    --boundary-example-1
    Content-Type: Text/HTML; charset=US-ASCII

    ... text of the HTML document...

    ...etc...

    --boundary-example-1
    Content-Location: "image.gif"
    Content-Type: IMAGE/GIF
    Content-Transfer-Encoding: BASE64

    R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
    ..etc...

    --boundary-example-1--

Only the object might know that the imageurl parameter is a relative URI.
It is nearly impossible for the HTML parser to understand that the
parameter is a relative URI. Simply searching for "image.gif" is not
robust, as the string "image.gif" may be used elsewhere. URIs in scripts
can have similar problems.

One might envisage even more difficult cases: an applet might take a
parameter "subject" and another parameter "range", and when
subject="auto" and range="1-5" it could compute, and try to use,
auto1.gif, auto2.gif ... auto5.gif as relative URLs.

Some implementation methods described in section 4 above, for example
method 2 described in section 4.2, may require rewriting of the URIs in
the HTML document.

There is no perfect solution to this problem.

One way of alleviating the problem is to produce the original document
using only absolute URIs, preferably of the CID type, since they are more
easily identifiable.

Another way of alleviating the problem is to make all URIs and
Content-Locations into simple relative URIs containing file names only
(without paths, preferably using a file name format common to most
platforms, i.e. 1-6 ascii letters or digits, a period, and 1-3 extension
ascii letters or digits). An implementation using method 2 described in
section 4.2 above can then just store the parts as files in an empty
directory on the recipient computer with the Content-Locations as file
names. It can then turn the start HTML file over to a document viewer,
and need not rewrite the URIs at all. This simple variant of use of the
MHTML standard is probably the most robust, and those implementors who
can control the production of the HTML documents to be sent are thus
recommended to use this variant.

6. Caching of body parts

Suppose a message contains body parts with the Content-Location header as
defined in [MHTML]. A receiving agent might then put such a body part
into a web cache, with the URI in the Content-Location as its name, so
that later retrievals of this URI use the cached body part. There is
however no guarantee that such a cached item is correct. Such caching is
thus not recommended for use in other ways than for resolution of links
within one particular MIME message.

The MHTML standard does not cover links between different messages, but
if you want to implement this, use of Content-ID and/or Message-ID,
rather than Content-Location, is recommended.

If incoming messages are stored in a store where messages can be
automatically deleted (purged), purging of body parts should not occur
before purging of the whole message to which they belong.

If an incoming message contains a body part which is linked via Content-
Location, then no HTTP lookup should be performed to check if the body
part is recent. The message should thus still contain the old HTML
document, even if the HTTP-available document has been revised. (Example:
"Here is the weather map of October 29, 1997").
Exceptions to this are:

(a) If the linked document is not enclosed in the message, but referred
    to via Content-Type: message/external-body, then the latest version
    should be shown using ordinary HTTP caching conventions.

(b) If a new message is sent with a Supersedes reference to the old
    message, the old message should still show the old version of all
    the body parts, but it might be wise to inform the user that a
    superseding message is available.

7. "Save as" command

Many HTML viewers have a "Save as" command to save an HTML document in a
local file. Usually, this command has two variants, "Save as text" which
converts the HTML document to plain text before saving it, and "Save as
source" which saves the HTML document as an HTML-formatted document.

These two variants may not be enough in the case of MHTML documents.
There is a third option, which might be named "Save as aggregate". This
option would save the HTML plus all related parts in a file with the
Content-Type: Multipart/related. The file would thus begin with the
heading of the Multipart/related body part.

There are two variants of this: Saving the document as it looked
when you got it, or saving the document including all inline body parts,
even those you had to retrieve from the Internet when showing the message
to the user. The second format is of special value, because it provides
an archiving format of the full document, allowing the user to view it in
the future as it looked at one particular time, even though web
content may change in the future.

Finally, a user may also want to save the e-mail or http heading fields
of an incoming message. This is sometimes the same as "Save as
aggregate", but may include additional body parts before or outside of
the multipart/related aggregate.
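
The archiving variant of "Save as aggregate" amounts to appending each
separately retrieved object to the Multipart/related before writing it
to a file. The following is only a sketch under stated assumptions: the
function name add_fetched_part is hypothetical, the retrieval itself is
not shown, and the caller is assumed to supply the bytes and media type
it obtained when displaying the message.

```python
import email
from email import policy
from email.message import EmailMessage


def add_fetched_part(aggregate, location, data, maintype, subtype):
    """Append separately retrieved data to a multipart/related message.

    aggregate is a parsed multipart message; location becomes the
    Content-Location of the new body part. Hypothetical helper.
    """
    if not aggregate.is_multipart():
        raise ValueError("aggregate must be a multipart message")
    part = EmailMessage()
    # Binary content is base64-encoded by default.
    part.set_content(data, maintype=maintype, subtype=subtype)
    part["Content-Location"] = location
    aggregate.attach(part)
    return aggregate
```

The result of aggregate.as_bytes() could then be written to the saved
file, giving a copy which no longer depends on the referenced object
staying unchanged on the web.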

To indicate whether such a saved document was received by e-mail or http,
it might be saved with an additional surrounding body part of content-
type message/rfc822 or message/http.

Example: suppose you receive by e-mail the following message:

    MAIL FROM:
    RCPT TO:
    DATA
    From: Alice
    To: Bob
    Date: 23 Jan 1998 10:51
    Subject: A simple example
    Mime-Version: 1.0
    Content-Type: multipart/related; boundary="boundary-example-1";
                  type="text/html"; start=

    --boundary-example-1
    Content-Type: text/html;charset=US-ASCII
    Content-ID:

    Here is the IETF logo with white background:
    IETF logo with white background

    And here is the IETF logo with transparent background:

    --boundary-example-1
    Content-Location: ietflogo.gif
    Content-Base: http://www.ietf.cnri.reston.va.us/images/
    Content-Type: IMAGE/GIF
    Content-Transfer-Encoding: BASE64

    R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
    NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
    etc...

    --boundary-example-1--
    .

Saving the above message as text might give the following file:

    From: Alice
    To: Bob
    Date: 23 Jan 1998 10:51
    Subject: A simple example

    Here is the IETF logo with white background:
    IETF logo with white background
    And here is the IETF logo with transparent background:
    IETF logo with transparent background

Saving the same text as html source might give the following file:

    Here is the IETF logo with white background:
    IETF logo with white background

    And here is the IETF logo with transparent background:

Saving the same text as aggregate might give the following file:

    From: Alice
    To: Bob
    Date: 23 Jan 1998 10:51
    Subject: A simple example
    Mime-Version: 1.0
    Content-Type: multipart/related; boundary="boundary-example-1";
                  type="text/html"; start=

    --boundary-example-1
    Content-Type: text/html;charset=US-ASCII
    Content-ID:

    Here is the IETF logo with white background:
    IETF logo with white background

    And here is the IETF logo with transparent background:

    --boundary-example-1
    Content-Location: ietflogo.gif
    Content-Base: http://www.ietf.cnri.reston.va.us/images/
    Content-Type: IMAGE/GIF
    Content-Transfer-Encoding: BASE64

    R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
    NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
    etc...

    --boundary-example-1--

Saving the same text as archiving aggregate might give the following file
(where the missing body part is fetched through http and added to the
saved file):

    From: Alice
    To: Bob
    Date: 23 Jan 1998 10:51
    Subject: A simple example
    Mime-Version: 1.0
    Content-Type: multipart/related; boundary="boundary-example-1";
                  type="text/html"; start=

    --boundary-example-1
    Content-Type: text/html;charset=US-ASCII
    Content-ID:

    Here is the IETF logo with white background:
    IETF logo with white background

    And here is the IETF logo with transparent background:

    --boundary-example-1
    Content-Location: ietflogo.gif
    Content-Base: http://www.ietf.cnri.reston.va.us/images/
    Content-Type: IMAGE/GIF
    Content-Transfer-Encoding: BASE64

    R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
    NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
    etc...

    --boundary-example-1
    Content-Location: ietflogo2e.gif
    Content-Base: http://www.ietf.cnri.reston.va.us/images/
    Content-Type: IMAGE/GIF
    Content-Transfer-Encoding: BASE64

    R0lGODlhGAGgANX/ACkpKTExMTk5OUJCQkpKSlJSUlpaWmNjY2tra3Nzc3t7e4
    SEhIyMjJSUlJycnKWlpa2trbW1tcDAwM7Ozv/eQnNzjHNzlGtrjGNjhFpae1pa
    etc...

    --boundary-example-1--

Saving the same message as message might give the following file:

    from:
    To:
    Mime-Version: 1.0
    Content-Type: Message/rfc822; boundary="boundary-example-2"

    --boundary-example-2
    From: Alice
    To: Bob
    Date: 23 Jan 1998 10:51
    Subject: A simple example
    Mime-Version: 1.0
    Content-Type: multipart/related; boundary="boundary-example-1";
                  type="text/html"; start=

    --boundary-example-1
    Content-Type: text/html;charset=US-ASCII
    Content-ID:

    Here is the IETF logo with white background:
    IETF logo with white background

    And here is the IETF logo with transparent background:

    --boundary-example-1
    Content-Location: ietflogo.gif
    Content-Base: http://www.ietf.cnri.reston.va.us/images/
    Content-Type: IMAGE/GIF
    Content-Transfer-Encoding: BASE64

    R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5
    NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A
    etc...

    --boundary-example-1--
    --boundary-example-2--

8. Recipients which cannot handle the Multipart/related Content-Type

A message sent according to the specifications in [MHTML] may have
recipients whose mailers cannot handle the Multipart/related
Content-Type in the way specified in [MHTML].

According to [MIME1], a mailer which encounters an unknown subtype of
Multipart should handle it as Multipart/mixed.

To improve this, Multipart/alternative can be used as discussed in
section 9 of this memo.

Content-Disposition, as specified in [CONDISP] and in [MHTML], section
10, can also be used as an aid to mailers which do not understand
Multipart/related.

Captions on images, which are included in the HTML text, might for
non-HTML-capable recipients be found in the Content-Description header
[CONDISP].
Do not assume, however, that HTML-capable user agents will
display the Content-Description header; they may assume that this
information is included in the HTML text instead.

9. Use of the Content-Type: Multipart/alternative

If the message is sent to recipients, not all of which may have mailers
capable of handling the Text/HTML content-type, then the "Content-Type:
Multipart/Alternative" [MIME1] can be used in two ways:

9.1 Multipart/alternative inside Multipart/related

The Multipart/alternative is put inside the "Content-Type:
Multipart/related". Body parts can be specified with "Content-Type:
Text/plain" as the first choice, and "Content-Type: Text/HTML" as the
second choice.

Example:

    Content-Type: Multipart/related; boundary="boundary-example-1";
                  type="MULTIPART/ALTERNATIVE"

    --boundary-example-1
    Content-Type: MULTIPART/ALTERNATIVE;
                  boundary="boundary-example-2"

    --boundary-example-2
    Content-Type: Text/plain

    ... plain text version of the document for recipients
    whose mailers cannot handle Text/HTML ...

    --boundary-example-2
    Content-Type: Text/HTML; charset=US-ASCII
    Content-ID: <content-id-example@example.host>

    ... text of the HTML document ...

    --boundary-example-2--
    --boundary-example-1
    Content-Type: Image/GIF

    ... a body part, to which the HTML document has a link ...
    --boundary-example-1--

Note that the type parameter of Multipart/related in this case should be
Multipart/alternative and not Text/HTML.

9.2 Multipart/alternative outside Multipart/related

The multipart/alternative is put outside the Multipart/Related, with
Multipart/Related as one alternative and Multipart/Mixed as the other
alternative. Note however that [MHTML] does not recommend links from
inside Multipart/Related to objects outside of the Multipart/Related, so
putting inline images outside the Multipart/Related is not suitable.
Instead, such inline images may have to be repeated in both branches of
the multipart/alternative with this method.

Example:

     Content-Type: MULTIPART/ALTERNATIVE;
                   boundary="boundary-example-1"

     --boundary-example-1
     Content-Type: Multipart/mixed; boundary="boundary-example-3"

     --boundary-example-3
     Content-Type: Text/plain; charset=US-ASCII

     ... plain text version of the message for recipients
     whose mailers cannot handle Text/HTML ...

     --boundary-example-3
     Content-Type: Image/GIF

     ... A picture associated with the plain text message ...
     --boundary-example-3--

     --boundary-example-1
     Content-Type: Multipart/related; boundary="boundary-example-2";
                   type="Text/HTML"

     --boundary-example-2
     Content-Type: Text/HTML; charset=US-ASCII
     Content-ID: <content-id-example@example.host>

     ... text of the HTML document ...

     --boundary-example-2
     Content-Type: Image/GIF

     ... a body part, to which the HTML document has a link ...
     --boundary-example-2--
     --boundary-example-1--

9.3 Comparing the two methods

When choosing between these two methods of employing
multipart/alternative, note the following:

(1) Clients which do not support Multipart/related, and which thus will
    interpret it as Multipart/mixed, will with choice 9.1 display
    the inline objects. Thus, a recipient whose mailer can handle
    image/gif but not multipart/related will still be shown the images;
    they will not be suppressed by being inside a suppressed branch of
    the Multipart/alternative.

(2) Choice 9.2 will not show inline images in the Multipart/Related,
    unless this information is repeated in both branches of the
    Multipart/Alternative.

A general warning: Some mailers do not support "Content-Type:
Multipart/alternative", and may then interpret it as Multipart/mixed,
even though support of multipart/alternative is required for MIME
conformance.
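
As an illustration only, the nesting of method 9.1 can be sketched with
the Python standard library "email" package. The function name and the
sample arguments are hypothetical and not part of this memo or of
[MHTML]; boundaries are generated by the library rather than taken from
the examples above.

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_alternative_inside_related(plain, html, gif_bytes):
    """Return a Multipart/related whose first part is Multipart/alternative.

    Hypothetical helper: it mirrors the layout of section 9.1, with
    Text/plain as the first choice and Text/HTML as the second choice.
    """
    alternative = MIMEMultipart("alternative")
    alternative.attach(MIMEText(plain, "plain", "us-ascii"))
    alternative.attach(MIMEText(html, "html", "us-ascii"))

    # The type parameter names the root part, which is the
    # Multipart/alternative here and not Text/HTML.
    related = MIMEMultipart("related", type="multipart/alternative")
    related.attach(alternative)

    # The inline image stays outside the alternative, so a mailer that
    # falls back to Multipart/mixed will still display it.
    image = MIMEImage(gif_bytes, "gif")
    image["Content-Location"] = "ietflogo.gif"
    related.attach(image)
    return related
```

For method 9.2 the nesting is inverted: a Multipart/alternative would
hold a Multipart/mixed branch and a Multipart/related branch, each with
its own copy of any inline image.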
9.4 Reducing the download time

If a message is sent as multipart/alternative, this would normally mean
that the mail client downloads both variants, and then shows only one of
them to the user. This will thus increase the download time. A way of
avoiding this problem is to use the FETCH command of IMAP, which allows a
client to download only certain body parts from a multipart message.

10. Textual alternatives to HTML forms

One important use of HTML in e-mail is to send forms, which the
recipients fill in and return. This raises the problem of how to handle
recipients whose mailers do not support HTML. One way is to use a
textual encoding of the forms, designed so that returning the form is
simple also for those who have only textual e-mail systems. It is
important that such users are not forced to write complex commands in
special command languages. Instead, the form should be written so that
the user need only make simple changes to it before sending it back,
such as deleting or adding single characters.

Below is an example which shows how this can be done. The main
principle is that every line beginning with ";" is an explanation for
the reader, and every line beginning with "!" is a text which the user
can convert into a command by just deleting the "!" in front of the
line.

The users will thus have to learn a very simple rule for filling in
forms: Just delete the "!" in front of your selections.

Technically, the recipient of a filled-in textual form should regard
all lines beginning with ";" or "!" as comments, and interpret all
other lines as commands.

10.1 Form in HTML format

     Which meeting date do you prefer?

          1 December 1997
          7 December 1997
          14 December 1997
          21 December 1997

     Who should be the chairman?

          Mary
          John

     Do you want simultaneous translation during the meeting?

          To and from English
          To and from French
          To and from Japanese

     Please propose issues to discuss during the meeting:
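
The processing rule stated in section 10 (lines beginning with ";" or
"!" are comments, all other lines are commands) might be sketched as
follows in Python. This is an illustration under stated assumptions,
not a prescribed implementation: the function name is hypothetical, and
skipping blank lines and stripping leading ">" quoting marks are
assumptions made for this sketch.

```python
def extract_commands(reply_text):
    """Return the command lines of a filled-in textual form.

    Hypothetical helper. Lines starting with ";" or "!" are treated as
    comments; every other non-empty line is returned as a command.
    """
    commands = []
    for raw_line in reply_text.splitlines():
        line = raw_line.strip()
        # Assumption: drop ">" or "> " quoting added by the reply mailer.
        while line.startswith(">"):
            line = line[1:].lstrip()
        # Assumption: blank lines carry no information and are skipped.
        if not line or line.startswith((";", "!")):
            continue
        commands.append(line)
    return commands
```

A reply in which the user deleted the "!" in front of "Good" would thus
yield "Good" as a command, together with the question and option lines
that give it context.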

10.2 The same form in textual format

     ; This is a computer-generated form. Please fill it in and return it
     ; to meeting-scheduler@ietf.org. To fill in the form, just copy its
     ; text into your reply and remove the exclamation mark (!) in front
     ; of your choices.

     ; If your mailer adds ">" or "> " in front of lines, you can keep
     ; these or remove them as you prefer.

     Question 1: Which meeting date do you prefer?

     Option 1.1: 1 December 1997
     ! Very good
     ! Good
     ! Acceptable
     ! Bad
     ! Very bad

     Option 1.2: 7 December 1997
     ! Very good
     ! Good
     ! Acceptable
     ! Bad
     ! Very bad

     Option 1.3: 14 December 1997
     ! Very good
     ! Good
     ! Acceptable
     ! Bad
     ! Very bad

     Option 1.4: 21 December 1997
     ! Very good
     ! Good
     ! Acceptable
     ! Bad
     ! Very bad

     Question 2: Who should be the chairman?
     ! Mary
     ! John

     Question 3: Do you want simultaneous translation during the meeting?

     Option 3.1: To and from English
     ! Yes
     ! No

     Option 3.2: To and from French
     ! Yes
     ! No

     Option 3.3: To and from Japanese
     ! Yes
     ! No

     Question 4: Please propose issues to discuss during the meeting.
     Write your proposal on the empty lines below.

     -- End of Question 4

11. Recipient may not have full Internet connectivity

The recipient of a message sent by email may not always have full
Internet connectivity. The recipient may be behind a gateway or firewall
which prohibits or restricts Internet connectivity.

This means that the recipient may not be able to resolve URIs in an
email message, unless the referred-to documents are included in the email
message itself. Thus, it is often suitable to include in an email message
all documents which are referred to (directly or indirectly) by URIs in
the message.
This is of course not always possible: in some cases the
set of referred-to documents (directly or indirectly) may be the whole
WWW document space, i.e. millions of documents. A choice must then be
made of how much to include. It is most important to include all
inline objects, i.e. objects linked by such hyperlinks as IMG,
which specify that the linked objects are to be shown to the user
immediately.

In the case of ACTION attributes in HTML forms, giving these ACTION
attributes a "mailto:" URL rather than an "http:" URL will
enable also recipients without full Internet connectivity to fill in
and send in your forms. The HTML specification [HTML2] allows a default
action when no ACTION attribute is included, but this default action may
not be suitable when sending the HTML document via email. Thus, it is
better to always put an explicit ACTION attribute into HTML forms sent
by email.

A disadvantage of the "mailto:" URL as ACTION, however, is that it
may not work if the user has not specified an e-mail address in the
preferences of the HTML viewer. This is common for multi-user
workstations.

12. Encoding of non-ASCII characters

             Displayed text                    Displayed text
                    |                                 ^
                    V                                 |
             +-------------+                 +----------------+
             | HTML editor |                 | HTML viewer    |
             |             |                 | or Web browser |
             +-------------+                 +----------------+
                    |                                 ^
                    V                                 |
               HTML markup                       HTML markup
                    |                                 ^
                    V                                 |
+---------+ +---------------+ +-------------+ +---------------+
| MIME    | | MIME content- | | MIME        | | MIME content- |
| encap-  | | transfer-     | | heading     | | transfer-     |
| sulator | | encoder       | | interpreter | | decoder       |
+---------+ +---------------+ +-------------+ +---------------+
     |              |                      ^          ^
     V              V        +-----------+ |          |
MIME heading + MIME content->| Transport |->MIME heading + MIME content
                             +-----------+

                              Figure 5

Definitions (see Figure 5):

Displayed text     A visual representation of the intended text.

HTML markup        A sequence of characters formatted according to the
                   HTML specification [HTML2].

MIME content       A sequence of octets physically forwarded via email;
                   may use MIME content-transfer-encoding as specified
                   in [MIME1].

HTML editor        Software used to produce HTML markup.

MIME content-      Software used to encode non-US-ASCII characters
transfer-encoder   as specified in [MIME1].

MIME content-      Software used to decode non-US-ASCII characters
transfer-decoder   as specified in [MIME1].

MIME heading       Software used to interpret the information in MIME
interpreter        headings.

HTML viewer        Software used to display HTML documents to
                   recipients.

Some implementations may have a choice of whether to represent non-ASCII
characters at the HTML layer (using "&" entity references or numeric
character references as defined in [HTML2] section 3.2.1) or at the MIME
layer (using Content-Transfer-Encoding as defined in [MIME1] section 6).
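
The two representations can be compared with a minimal Python sketch.
It is an illustration only: the function names are hypothetical, the
ISO-8859-1 charset is an assumption made for the example, and
quoted-printable is just one of the Content-Transfer-Encodings that
[MIME1] defines.

```python
import quopri


def encode_at_html_layer(markup):
    """Replace non-ASCII characters with numeric character references.

    The result is pure US-ASCII, so no Content-Transfer-Encoding is
    needed, but the HTML markup itself has been modified.
    """
    return markup.encode("ascii", "xmlcharrefreplace")


def encode_at_mime_layer(markup, charset="iso-8859-1"):
    """Leave the markup unchanged and apply quoted-printable instead.

    Assumption: the message would carry a matching charset parameter and
    "Content-Transfer-Encoding: quoted-printable" in its MIME heading.
    """
    return quopri.encodestring(markup.encode(charset))
```

For the text "Caf" followed by an e with acute accent (U+00E9), the
first function produces "Caf&#233;" and the second produces "Caf=E9".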
In choosing between these two representation methods, note the following
effects:

(1) Modifying HTML markup may disrupt security content integrity
    checksums. If the checksums are computed between the HTML editor
    and the MIME encapsulator, then making the encoding in the MIME
    encapsulator will not break the checksums.

(2) The choice of modifying HTML markup may be more suitable for
    recipients whose mailers do not support MIME.

(3) Using MIME Content-Transfer-Encoding may be more suitable for
    recipients who have MIME-compliant mailers but do pass the text over
    to a document viewer (web browser).

13. Conversion from HTTP to MIME

Information received or retrieved using HTTP cannot always be sent
unchanged as email using the "Content-Type: Text/HTML", because of the
restrictions which MIME places on the format of "Content-Type:
Text/HTML". The same problem may occur for documents retrieved via HTTP
which are in textual formats other than HTML. In particular, note the
following:

(a) Content-encodings allowed in HTTP, but not allowed in MIME, must
    be removed.

(b) HTTP allows line breaks as bare CRs or bare LFs or something
    else, while MIME only allows line breaks as CRLF in subtypes
    of the Text content-type.

(c) HTTP allows character sets like Unicode-1-1, which do not
    represent line breaks as CRLFs; such text may have to be
    rewritten to character sets like Unicode-1-1-UTF-7 in which
    line breaks are represented as CRLFs.

A good overview of the differences between MIME and HTTP with regard to
the use of "Content-Type: Text" can be found in [HTTP] appendix C.
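
Items (a) and (b) above might be sketched as follows in Python. This is
a minimal illustration, not a complete converter: the function name is
hypothetical, only the gzip content-encoding is handled, and the body is
assumed to use a character set in which CR and LF are single octets
(item (c) is not addressed).

```python
import gzip
import re

# Matches CRLF, a bare CR or a bare LF; CRLF is tried first so that it
# is replaced as one unit.
_ANY_LINE_BREAK = re.compile(rb"\r\n|\r|\n")


def http_body_to_mime_text(body, content_encoding=""):
    """Undo an HTTP content-encoding and rewrite line breaks as CRLF."""
    encoding = content_encoding.strip().lower()
    if encoding in ("gzip", "x-gzip"):
        # (a) Remove a content-encoding that MIME does not allow.
        body = gzip.decompress(body)
    elif encoding:
        raise ValueError("unsupported Content-Encoding: " + content_encoding)
    # (b) Bare CRs and bare LFs become CRLF, as MIME requires for
    # subtypes of the Text content-type.
    return _ANY_LINE_BREAK.sub(b"\r\n", body)
```

Note that any such rewriting changes the octets of the document, so an
integrity checksum computed over the original HTTP body will no longer
match.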
If you want to provide web documents which can be sent through e-mail
without modification (which might break integrity checksums), then you
SHOULD provide them in the canonical form, with line breaks as CRLF,
and avoid lines longer than 76 characters.

If you want to send HTTP unchanged via email, you might consider using
the "Content-Type: Message/HTTP" instead of the "Content-Type:
Text/HTML". Note that with this Content-Type, the whole object, as sent
through HTTP, can be encoded as a single object with, for example, BASE64
encoding. After decoding of the BASE64, the resulting object can have
formats peculiar to HTTP, like single LF or single CR between lines.
However, some mailers may not be capable of handling the Message/HTTP
Content-Type.

For example, the encoded part of the following message

     Content-Type: message/http
     Content-Transfer-Encoding: base64

     SFRUUC8xLjEgMjAwIE9LDURhdGU6IFNhdCwgMTQgRmViIDE5OTggMTM6MDM6MzggR01U
     DVNlcnZlcjogQXBhY2hlLzEuMi40DUxhc3QtTW9kaWZpZWQ6IFdlZCwgMjMgSnVsIDE5
     ... ... ...

might, when the base64 encoding above is decoded, yield:

     HTTP/1.1 200 OK
     Date: Sat, 14 Feb 1998 13:03:38 GMT
     ETag: "43788-124-33d658c5"
     Content-Length: 292
     Accept-Ranges: bytes
     Content-Type: text/html

     ... ...

14. Default font size

Many HTML editors and viewers allow the user to specify the size of the
default font ( or according to personal
wishes, for example 10 pt or 12 pt or 14 pt depending on eye sight and
screen distance. This setting should *not* cause a change in the FONT
SIZE= value in the generated HTML which is produced and sent. The reason
for this is that otherwise users may inadvertently send whole letters
with the text in or , which may be easy to
read for the sender but difficult to read for some recipients.
Similarly, a user choice of default FONT, to for example GENEVA or ARIAL,
should not cause or to be sent. Users
who wish to send e-mail with or must
explicitly specify this, for example using a FONT command in their HTML
editor or e-mail text editor.

15. Copyright and disclaimer

   The IETF takes no position regarding the validity or scope of
   any intellectual property or other rights that might be claimed
   to pertain to the implementation or use of the technology
   described in this document or the extent to which any license
   under such rights might or might not be available; neither does
   it represent that it has made any effort to identify any such
   rights. Information on the IETF's procedures with respect to
   rights in standards-track and standards-related documentation
   can be found in BCP-11. Copies of claims of rights made
   available for publication and any assurances of licenses to be
   made available, or the result of an attempt made to obtain a
   general license or permission for the use of such proprietary
   rights by implementors or users of this specification can be
   obtained from the IETF Secretariat.

   The IETF invites any interested party to bring to its attention
   any copyrights, patents or patent applications, or other
   proprietary rights which may cover technology that may be
   required to practice this standard. Please address the
   information to the IETF Executive Director.

   Copyright (C) The Internet Society (date). All Rights Reserved.

   This document and translations of it may be copied and
   furnished to others, and derivative works that comment on or
   otherwise explain it or assist in its implementation may be
   prepared, copied, published and distributed, in whole or in
   part, without restriction of any kind, provided that the above
   copyright notice and this paragraph are included on all such
   copies and derivative works. However, this document itself may
   not be modified in any way, such as by removing the copyright
   notice or references to the Internet Society or other Internet
   organizations, except as needed for the purpose of developing
   Internet standards in which case the procedures for copyrights
   defined in the Internet Standards process must be followed, or
   as required to translate it into languages other than English.

   The limited permissions granted above are perpetual and will
   not be revoked by the Internet Society or its successors or
   assigns.

16. Acknowledgments

Harald Tveit Alvestrand, Richard Baker, Dave Crocker, Martin J. Duerst,
Roy Fielding, Lewis Geer, Al Gilman, Paul Hoffman, Alexander Hopmann,
Mark K. Joseph, Greg Herlihy, Valdis Kletnieks, Daniel LaLiberte, Ed
Levinson, Jay Levitt, Albert Lunde, Larry Masinter, Keith Moore, Gavin
Nicol, Pete Resnick, Jon Smirl, Einar Stefferud, Jamie Zawinski and
several other people have helped with preparing this memo. I alone
take responsibility for any errors which may still be in the memo.

17. References

Temporary note: This list contains some references to Internet drafts. It
is anticipated that these Internet drafts will become RFCs before this
memo. The references in this memo will then be changed to refer to the
corresponding RFC instead. This list also includes some RFCs which are
not up to date, and which will be replaced by new memos presently in
IETF draft status.

Ref.       Author, title
---------  -------------------------------------------------------

[CONDISP]  R. Troost, S. Dorner: "Communicating Presentation
           Information in Internet Messages: The
           Content-Disposition Header", RFC 1806, June 1995.

[HOSTS]    R. Braden (editor): "Requirements for Internet Hosts --
           Application and Support", STD-3, RFC 1123, October
           1989.

[HTML2]    T. Berners-Lee, D. Connolly: "Hypertext Markup Language
           - 2.0", RFC 1866, November 1995.

[HTTP]     T. Berners-Lee, R. Fielding, H. Frystyk: "Hypertext
           Transfer Protocol -- HTTP/1.0", RFC 1945, May 1996.

[MHTML]    J. Palme & A. Hopmann: "Packaging Aggregate HTML
           Objects in MIME Email", draft-ietf-mhtml-rev-02.txt,
           October 1997.

[MIDCID]   E. Levinson: "Message/External-Body Content-ID Access
           Type", draft-ietf-mhtml-cid-v2-00.txt, July 1997.

[MIME1]    N. Freed & N. Borenstein: "MIME (Multipurpose Internet
           Mail Extensions) Part One: Mechanisms for Specifying
           and Describing the Format of Internet Message Bodies",
           RFC 2045, November 1996.

[MIME2]    N. Freed & N. Borenstein: "Multipurpose Internet Mail
           Extensions (MIME) Part Two: Media Types", RFC 2046,
           November 1996.

[NEWS]     M.R. Horton, R. Adams: "Standard for interchange of
           USENET messages", RFC 1036, December 1987.

[REL]      Harald Tveit Alvestrand, Edward Levinson: "The MIME
           Multipart/Related Content-type", , August 1997.

[RELURL]   R. Fielding: "Relative Uniform Resource Locators", RFC
           1808, June 1995.

[RFC822]   D. Crocker: "Standard for the format of ARPA Internet
           text messages", STD 11, RFC 822, August 1982.

[SMTP]     J. Postel: "Simple Mail Transfer Protocol", STD 10, RFC
           821, August 1982.

[URL]      T. Berners-Lee, L. Masinter, M. McCahill: "Uniform
           Resource Locators (URL)", RFC 1738, December 1994.

[URLBODY]  N. Freed and Keith Moore: "Definition of the URL MIME
           External-Body Access-Type", RFC 2017, October 1996.

18. Author's Address

Jacob Palme                          Phone: +46-8-16 16 67
Stockholm University and KTH         Fax: +46-8-783 08 29
Electrum 230                         Email: jpalme@dsv.su.se
S-164 40 Kista, Sweden

Working group chairman:

Einar Stefferud