Network Working Group E. Levinson July 24, 1997 Standards Track Content-ID and Message-ID Uniform Resource Locators This draft document is being circulated for comment. Please send your comments to the authors or to the mimesgml mail list mhtml@segate.sunet.se. Status of this Memo This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet- Drafts as reference material or to cite them other than as "work in progress." To view the entire list of current Internet-Drafts, please check the "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast). Abstract The Uniform Resource Locator (URL) schemes, "cid:" and "mid:" allow references to messages and the body parts of messages. For example, within a single multipart message, one HTML body part might include embedded references to other parts of the same message. Changes from (RFC 2111) Clarified the example on page 3 on of converting cid URLs to Content- IDs. The example now uses a cid URL instead of an mid. Corrected the example messages to have the correct Content-ID form; they now use the angle brackets. Added a Message-ID header to the second example. 1. Introduction The use of [MIME] within email to convey Web pages and their associated images requires a URL scheme to permit the HTML to refer to the images or other data included in the message. The Content-ID Uniform Resource Locator, "cid:", serves that purpose. Similarly Net News readers use Message-IDs to link related messages together. The Message-ID URL provides a scheme, "mid:", to refer to such messages as a "resource". The "mid" (Message-ID) and "cid" (Content-ID) URL schemes provide identifiers for messages and their body parts. The "mid" scheme uses (a part of) the message-id of an email message to refer to a specific message. The "cid" scheme refers to a specific body part of a message; its use is generally limited to references to other body parts in the same message as the referring body part. The "mid" scheme may also refer to a specific body part within a designated message, by including the content-ID's address. A note on terminology. The terms "body part" and "MIME entity" are used interchangeably. They refer to the headers and body of a MIME Levinson Expires January 1998 [Page 1] Internet Draft Message- & Content-ID URLs July, 1997 message, either the message itself or one of the body parts contained in a Multipart message. 2. The MID and CID URL Schemes RFC1738 [URL] reserves the "mid" and "cid" schemes for Message-ID and Content-ID respectively. This memorandum defines the syntax for those URLs. Because they use the same syntactic elements they are presented together. The URLs take the form content-id = url-addr-spec message-id = url-addr-spec url-addr-spec = addr-spec ; URL encoding of RFC 822 addr-spec cid-url = "cid" ":" content-id mid-url = "mid" ":" message-id [ "/" content-id ] Notes: In Internet mail messages, the addr-spec in a Content-ID [MIME] or Message-ID [822] header is enclosed in angle brackets (<>). Since addr-spec in a Message-ID or Content-ID might contain characters not allowed within a URL; any such character (including "/", which is reserved within the "mid" scheme) must be hex-encoded using the %hh escape mechanism in [URL]. A "mid" URL with only a "message-id" refers to an entire message. With the appended "content-id", it refers to a body part within a message, as does a "cid" URL. The Content-ID of a MIME body part is required to be globally unique. However, in many systems that store messages, body parts are not indexed independently their context (message). The "mid" URL long form was designed to supply the context needed to support interoperability with such systems. A implementation conforming to this specification is required to support the "mid" URL long form (message-id/content-id). Conforming implementations can choose to, but are not required to, take advantage of the content-id's uniqueness and interpret a "cid" URL to refer to any body part within the message store. In limited circumstances (e.g., within multipart/alternate), a single message may contain several body parts that have the same Content-ID. That occurs, for example, when identical data can be accessed through Levinson Expires January 1998 [Page 2] Internet Draft Message- & Content-ID URLs July, 1997 different methods. In those cases, conforming implementations are required to use the rules of the containing MIME entity (e.g., multipart/alternate) to select the body part to which the Content-ID refers. A "cid" URL is converted to the corresponding Content-ID message header [MIME] by removing the "cid:" prefix, converting the % encoded character to their equivalent US-ASCII characters, and enclosing the remaining parts with an angle bracket pair, "<" and ">". For example, "cid:foo4%25foo1@bar.net" corresponds to Content-ID: Reversing the process and converting URL special characters to their % encodings produces the original cid. A "mid" URL is converted to a Message-ID or Message-ID/Content-ID pair in a similar fashion. Both message-id and content-id are required to be globally unique. That is, no two different messages will ever have the same Message-ID addr-spec; no different body parts will ever have the same Content-ID addr-spec. A common technique used by many message systems is to use a time and date stamp along with the local host's domain name, e.g., 950124.162336@XIson.com. Some Examples The following message contains an HTML body part that refers to an image contained in another body part. Both body parts are contained in a Multipart/Related MIME entity. The HTML IMG tag contains a cidurl which points to the image. From: foo1@bar.net To: foo2@bar.net Subject: A simple example Mime-Version: 1.0 Content-Type: multipart/related; boundary="boundary-example-1"; type=Text/HTML --boundary-example 1 Content-Type: Text/HTML; charset=US-ASCII ... text of the HTML document, which might contain a hyperlink to the other body part, for example through a statement such as: IETF logo --boundary-example-1 Levinson Expires January 1998 [Page 3] Internet Draft Message- & Content-ID URLs July, 1997 Content-ID: Content-Type: IMAGE/GIF Content-Transfer-Encoding: BASE64 R0lGODlhGAGgAPEAAP/////ZRaCgoAAAACH+PUNvcHlyaWdodCAoQykgMTk5 NSBJRVRGLiBVbmF1dGhvcml6ZWQgZHVwbGljYXRpb24gcHJvaGliaXRlZC4A etc... --boundary-example-1-- The following message points to another message (hopefully still in the recipient's message store). From: bar@none.com To: phooey@all.com Subject: Here's how to do it Message-ID: <970701.32784@VIers.none.com> Content-type: text/html; charset=usascii ... The items in my previous message, shows how the approach you propose can be used to accomplish ... 3. Security Considerations The URLs defined here provide an addressing or referencing mechanism. The values of these URLs disclose no more about the originators environment than the corresponding Message-ID and Content-ID values. Where concern exists about such disclosures the originator of a message using mid and cid URLs must take precautions to insure that confidential information is not disclosed. Those precautions should already be in place to handle existing mail use of the Message-ID and Content-ID. 4. References [822] Crocker, D., "Standard for the Format of ARPA Internet Text Messages," August 1982, University of Delaware, STD 11, RFC 822. [MIME] N. Borenstein, N. Freed, "MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describ- ing the Format of Internet Message Bodies," September 1993, RFC 1521. Levinson Expires January 1998 [Page 4] Internet Draft Message- & Content-ID URLs July, 1997 [URL] Berners-Lee, T., Masinter, L., and McCahill, M., "Uniform Resource Locators (URL)," December 1994. [MULREL] E. Levinson, "The MIME Multipart/Related Content-type," Decem- ber 1995, RFC 1874. 5. Acknowledgments The original concept of "mid" and "cid" URLs were part of the Tim Berners-Lee's original vision of the World Wide Web. The ideas and design have benefited greatly by discussions with Harald Alvestrand, Dan Connolly, Roy Fielding, Larry Masinter, Jacob Palme, and others in the MHTML working group. 6. Author's Address Edward Levinson 47 Clive Street Metuchen, NJ 08840-1060 USA +1 908 549 3716 Levinson Expires January 1998 [Page 5] --Tiger-Lily Content-Type: Text/plain Content-Description: MIMEHTML Working Group E. Levinson July 24, 1997 Internet Draft The MIME Multipart/Related Content-type This draft document is being circulated for comment. Please send your comments to the authors or to the mail list . Abstract The Multipart/Related content-type provides a common mechanism for representing objects that are aggregates of related MIME body parts. This document defines the Multipart/Related content-type and provides examples of its use. 0. Changes from previous draft (RFC 2112) Corrected cid urls to conform to RFC 2111; the angle brackets were removed. 1. Introduction Several applications of MIME, including MIME-PEM, and MIME-Macintosh and other proposals, require multiple body parts that make sense only in the aggregate. The present approach to these compound objects has been to define specific multipart subtypes for each new object. In keeping with the MIME philosophy of having one mechanism to achieve the same goal for different purposes, this document describes a single mechanism for such aggregate or compound objects. The Multipart/Related content-type addresses the MIME representation of compound objects. The object is categorized by a "type" parameter. Additional parameters are provided to indicate a specific starting body part or root and auxiliary information which may be required when unpacking or processing the object. Multipart/Related MIME entities may contain Content-Disposition headers that provide suggestions for the storage and display of a body part. Multipart/Related processing takes precedence over Content-Disposition; the interaction between them is discussed in section 4. Responsibility for the display or processing of a Multipart/Related's constituent entities rests with the Levinson Expires January 1998 [Page 1] Internet Draft Multipart/Related July, 1997 application that handles the compound object. 2. Multipart/Related Registration Information The following form is copied from RFC 1590, Appendix A. To: IANA@isi.edu Subject: Registration of new Media Type content-type/subtype Media Type name: Multipart Media subtype name: Related Required parameters: Type, a media type/subtype. Optional parameters: Start Start-info Encoding considerations: Multipart content-types cannot have encodings. Security considerations: Depends solely on the referenced type. Published specification: RFC-REL (this document). Person & email address to contact for further information: Edward Levinson 47 Clive Street Metuchen, NJ 08840-1060 +1 908 494 1606 XIson@cnj.digex.net 3. Intended usage The Multipart/Related media type is intended for compound objects consisting of several inter-related body parts. For a Multipart/Related object, proper display cannot be achieved by individually displaying the constituent body parts. The content-type of the Multipart/Related object is specified by the type parameter. The "start" parameter, if given, points, via a content-ID, to the body part that contains the object root. The default root is the first body part within the Multipart/Related body. The relationships among the body parts of a compound object distinguishes it from other object types. These relationships are often represented by links internal to the object's components that reference the other components. Within a single operating Levinson Expires January 1998 [Page 2] Internet Draft Multipart/Related July, 1997 environment the links are often file names, such links may be represented within a MIME message using content-IDs or the value of some other "Content-" headers. 3.1. The Type Parameter The type parameter must be specified and its value is the MIME media type of the "root" body part. It permits a MIME user agent to determine the content-type without reference to the enclosed body part. If the value of the type parameter and the root body part's content-type differ then the User Agent's behavior is undefined. 3.2. The Start Parameter The start parameter, if given, is the content-ID of the compound object's "root". If not present the "root" is the first body part in the Multipart/Related entity. The "root" is the element the applications processes first. 3.3. The Start-Info Parameter Additional information can be provided to an application by the start-info parameter. It contains either a string or points, via a content-ID, to another MIME entity in the message. A typical use might be to provide additional command line parameters or a MIME entity giving auxiliary information for processing the compound object. Applications that use Multipart/Related must specify the interpretation of start-info. User Agents shall provide the parameter's value to the processing application. Processes can distinguish a start-info reference from a token or quoted-string by examining the first non-white-space character, "<" indicates a reference. 3.4. Syntax related-param := [ ";" "start" "=" cid ] [ ";" "start-info" "=" ( cid-list / value ) ] [ ";" "type" "=" type "/" subtype ] ; order independent cid-list := cid cid-list cid := msg-id ; c.f. [822] value := token / quoted-string ; c.f. [MIME] Levinson Expires January 1998 [Page 3] Internet Draft Multipart/Related July, 1997 ; value cannot begin with "<" Note that the parameter values will usually require quoting. Msg-id contains the special characters "<", ">", "@", and perhaps other special characters. If msg-id contains quoted-strings, those quote marks must be escaped. Similarly, the type parameter contains the special character "/". 4. Handling Content-Disposition Headers Content-Disposition Headers [DISP] suggest presentation styles for MIME body parts. [DISP] describes two presentation styles, called the disposition type, INLINE and ATTACHMENT. These, used within a multipart entity, allow the sender to suggest presentation information. [DISP] also provides for an optional storage (file) name. Content-Disposition headers could appear in one or more body parts contained within a Multipart/Related entity. Using Content-Disposition headers in addition to Multipart/Related provides presentation information to User Agents that do not recognize Multipart/Related. They will treat the multipart as Multipart/Mixed and they may find the Content-Disposition information useful. With Multipart/Related however, the application processing the compound object determines the presentation style for all the contained parts. In that context the Content-Disposition header information is redundant or even misleading. Hence, User Agents that understand Multipart/Related shall ignore the disposition type within a Multipart/Related body part. It may be possible for a User Agent capable of handling both Multipart/Related and Content-Disposition headers to provide the invoked application the Content-Disposition header's optional filename parameter to the Multipart/Related. The use of that information will depend on the specific application and should be specified when describing the handling of the corresponding compound object. Such descriptions would be appropriate in an RFC registering that object's media type. 5. Examples 5.1 Application/X-FixedRecord The X-FixedRecord content-type consists of one or more octet-streams and a list of the lengths of each record. The root, which lists the record lengths of each record within the streams. The record length list, type Application/X-FixedRecord, consists of a set of INTEGERs Levinson Expires January 1998 [Page 4] Internet Draft Multipart/Related July, 1997 in ASCII format, one per line. Each INTEGER gives the number of octets from the octet-stream body part that constitute the next "record". The example below, uses a single data block. Content-Type: Multipart/Related; boundary=example-1 start="<950120.aaCC@XIson.com>"; type="Application/X-FixedRecord" start-info="-o ps" --example-1 Content-Type: Application/X-FixedRecord Content-ID: <950120.aaCC@XIson.com> 25 10 34 10 25 21 26 10 --example-1 Content-Type: Application/octet-stream Content-Description: The fixed length records Content-Transfer-Encoding: base64 Content-ID: <950120.aaCB@XIson.com> T2xkIE1hY0RvbmFsZCBoYWQgYSBmYXJtCkUgSS BFIEkgTwpBbmQgb24gaGlzIGZhcm0gaGUgaGFk IHNvbWUgZHVja3MKRSBJIEUgSSBPCldpdGggYS BxdWFjayBxdWFjayBoZXJlLAphIHF1YWNrIHF1 YWNrIHRoZXJlLApldmVyeSB3aGVyZSBhIHF1YW NrIHF1YWNrCkUgSSBFIEkgTwo= --example-1-- 5.2 Text/X-Okie The Text/X-Okie is an invented markup language permitting the inclusion of images with text. A feature of this example is the inclusion of two additional body parts, both picture. They are referred to internally by the encapsulated document via each picture's body part content-ID. Usage of "cid:", as in this example, may be useful for a variety of compound objects. It is not, however, a part of the Multipart/Related specification. Levinson Expires January 1998 [Page 5] Internet Draft Multipart/Related July, 1997 Content-Type: Multipart/Related; boundary=example-2; start="<950118.AEBH@XIson.com>" type="Text/x-Okie" --example-2 Content-Type: Text/x-Okie; charset=iso-8859-1; declaration="<950118.AEB0@XIson.com>" Content-ID: <950118.AEBH@XIson.com> Content-Description: Document {doc} This picture was taken by an automatic camera mounted ... {image file=cid:950118.AECB@XIson.com} {para} Now this is an enlargement of the area ... {image file=cid:950118:AFDH@XIson.com} {/doc} --example-2 Content-Type: image/jpeg Content-ID: <950118.AFDH@XIson.com> Content-Transfer-Encoding: BASE64 Content-Description: Picture A [encoded jpeg image] --example-2 Content-Type: image/jpeg Content-ID: <950118.AECB@XIson.com> Content-Transfer-Encoding: BASE64 Content-Description: Picture B [encoded jpeg image] --example-2-- 5.3 Content-Disposition In the above example each image body part could also have a Content- Disposition header. For example, ... --example-2 Content-Type: image/jpeg Content-ID: <950118.AECB@XIson.com> Content-Transfer-Encoding: BASE64 Content-Description: Picture B Content-Disposition: INLINE [encoded jpeg image] --example-2-- Levinson Expires January 1998 [Page 6] Internet Draft Multipart/Related July, 1997 User Agents that recognize Multipart/Related will ignore the Content- Disposition header's disposition type. Other User Agents will process the Multipart/Related as Multipart/Mixed and may make use of that header's information. 6. User Agent Requirements User agents that do not recognize Multipart/Related shall, in accordance with [MIME], treat the entire entity as Multipart/Mixed. MIME User Agents that do recognize Multipart/Related entities but are unable to process the given type should give the user the option of suppressing the entire Multipart/Related body part shall be. Existing MIME-capable mail user agents (MUAs) handle the existing media types in a straightforward manner. For discrete media types (e.g. text, image, etc.) the body of the entity can be directly passed to a display process. Similarly the existing composite subtypes can be reduced to handing one or more discrete types. Handling Multipart/Related differs in that processing cannot be reduced to handling the individual entities. The following sections discuss what information the processing application requires. It is possible that an application specific "receiving agent" will manipulate the entities for display prior to invoking actual application process. Okie, above, is an example of this; it may need a receiving agent to parse the document and substitute local file names for the originator's file names. Other applications may just require a table showing the correspondence between the local file names and the originator's. The receiving agent takes responsibility for such processing. 6.1 Data Requirements MIME-capable mail user agents (MUAs) are required to provide the application: (a) the bodies of the MIME entities and the entity Content-* headers, (b) the parameters of the Multipart/Related Content-type header, and (c) the correspondence between each body's local file name, that body's header data, and, if present, the body part's content-ID. 6.2 Storing Multipart/Related Entities The Multipart/Related media type will be used for objects that have Levinson Expires January 1998 [Page 7] Internet Draft Multipart/Related July, 1997 internal linkages between the body parts. When the objects are stored the linkages may require processing by the application or its receiving agent. 6.3 Recursion MIME is a recursive structure. Hence one must expect a Multi- part/Related entity to contain other Multipart/Related entities. When a Multipart/Related entity is being processed for display or storage, any enclosed Multipart/Related entities shall be processed as though they were being stored. 6.4 Configuration Considerations It is suggested that MUAs that use configuration mechanisms, see [CFG] for an example, refer to Multipart/Related as Multi- part/Related/, were is the value of the "type" parame- ter. 7. Security considerations Security considerations relevant to Multipart/Related are identical to those of the underlying content-type. 8. Acknowledgments This proposal is the result of conversations the author has had with many people. In particular, Harald A. Alvestrand, James Clark, Charles Goldfarb, Gary Houston, Ned Freed, Ray Moody, and Don Stinch- field, provided both encouragement and invaluable help. The author, however, take full responsibility for all errors contained in this document. 9. References [822] Crocker, D., "Standard for the Format of ARPA Internet Text Messages", August 1982, University of Delaware, RFC 822. [CID] E. Levinson, J. Clark, "Message/External-Body Content-ID Access Type", 12/26/1995, RFC 1873 Levinson, E., "Mes- sage/External-Body Content-ID Access Type", work in progress, ftp://ds.internic.net/internet-drafts/draft-ietf- mimesgml-access-cid-01.txt. [CFG] Borenstein, N., "A User Agent Configuration Mechanism For Multimedia Mail Format Information", September 23, 1993, RFC 1524 Levinson Expires January 1998 [Page 8] Internet Draft Multipart/Related July, 1997 [DISP] R. Troost, S. Dorner, "Communicating Presentation Informa- tion in Internet Messages: The Content-Disposition Header", June 7, 1995, RFC 1806 [MIME] Borenstein, N. and Freed, N., "MIME (Multipurpose Internet Mail Extensions): Mechanisms for Specifying and Describing the Format of Internet Message Bodies", June 1992, RFC 1341. 9. Author's address Edward Levinson 47 Clive Street Metuchen, NJ 08840-1060 USA +1 908 494 1606 Levinson Expires January 1998 [Page 9]