idnits 2.17.1

draft-hall-mime-app-mbox-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate. You should update
     this to the boilerplate described in the IETF Trust License Policy
     document (see https://trustee.ietf.org/license-info), which is
     required now.

  -- Found old boilerplate from RFC 3667, Section 5.1 on line 14.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 381.

  ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on
     line 372), which is fine, but *also* found old RFC 2026, Section
     10.4C, paragraph 1 text on line 35.

  ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure
     Acknowledgement -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer,
     instead of the newer disclaimer which includes the IETF Trust
     according to RFC 4748.

  ** The document seems to lack an RFC 3979 Section 5, para. 1 IPR
     Disclosure Acknowledgement.

  ** The document seems to lack an RFC 3979 Section 5, para. 2 IPR
     Disclosure Acknowledgement.

  ** The document seems to lack an RFC 3979 Section 5, para. 3 IPR
     Disclosure Invitation.

  ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate
     instead of verbatim RFC 3978 boilerplate. After 6 May 2005,
     submission of drafts without verbatim RFC 3978 boilerplate is not
     accepted.

     The following non-3978 patterns matched text found in the document.
     That text should be removed or replaced:

       By submitting this Internet-Draft, I certify that any applicable
       patent or other IPR claims of which I am aware have been disclosed,
       or will be disclosed, and any of which I become aware will be
       disclosed, in accordance with RFC 3668.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 2) being 100 lines

  == It seems as if not all pages are separated by form feeds - found 0
     form feeds but 9 pages

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 2 instances of too long lines in the document, the longest
     one being 1 character in excess of 72.

  ** The abstract seems to contain references ([RFC2048]), which it
     shouldn't. Please replace those with straight textual mentions of the
     documents in question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Unrecognized Status in ' Category: Standards-Track', assuming Proposed
     Standard (Expected one of 'Standards Track', 'Full Standard', 'Draft
     Standard', 'Proposed Standard', 'Best Current Practice',
     'Informational', 'Experimental', 'Informational', 'Historic'.)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to
     grant the BCP78 rights to the IETF Trust, then this is fine, and you
     can ignore this comment. If not, you may need to add the pre-RFC5378
     disclaimer. (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)
  -- The document date (February 2005) is 7008 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative
     references to lower-maturity documents in RFCs)

  == Missing Reference: 'RFC2045' is mentioned on line 157, but not defined

  ** Obsolete normative reference: RFC 2048 (Obsoleted by RFC 4288, RFC 4289)

  ** Obsolete normative reference: RFC 2822 (Obsoleted by RFC 5322)

     Summary: 13 errors (**), 0 flaws (~~), 5 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information
     about the items above.

--------------------------------------------------------------------------------

2    INTERNET-DRAFT                                            Eric A. Hall
3    Document: draft-hall-mime-app-mbox-04.txt                February 2005
4    Expires: August, 2005
5    Category: Standards-Track

7                     The APPLICATION/MBOX Media-Type

9    Status of this Memo

11      By submitting this Internet-Draft, I certify that any applicable
12      patent or other IPR claims of which I am aware have been
13      disclosed, and any of which I become aware will be disclosed, in
14      accordance with RFC 3668.

16      Internet-Drafts are working documents of the Internet Engineering
17      Task Force (IETF), its areas, and its working groups. Note that
18      other groups may also distribute working documents as Internet-
19      Drafts.

21      Internet-Drafts are draft documents valid for a maximum of six
22      months and may be updated, replaced, or obsoleted by other
23      documents at any time. It is inappropriate to use Internet-Drafts
24      as reference material or to cite them other than as "work in
25      progress."

27      The list of current Internet-Drafts can be accessed at
28      http://www.ietf.org/ietf/1id-abstracts.txt.

30      The list of Internet-Draft Shadow Directories can be accessed at
31      http://www.ietf.org/shadow.html.

33   Copyright Notice

35      Copyright (C) The Internet Society (2004). All Rights Reserved.

37   Abstract

39      This memo requests that the application/mbox media-type be
40      authorized for allocation by the IESG, according to the terms
41      specified in RFC 2048 [RFC2048]. This memo also defines a default
42      format for the mbox database, which must be supported by all
43      conformant implementations.

45   1.  Background and Overview

47      UNIX-like operating systems have historically made widespread use
48      of "mbox" database files for a variety of local email purposes. In
49      the common case, mbox files store linear sequences of one or more
50      electronic mail messages, with local email clients treating the
51      database as a logical folder of email messages. mbox databases are
52      also used by a variety of other messaging tools, such as mailing
53      list management programs, archiving and filtering utilities,
54      messaging servers, and other related applications. In recent
55      years, mbox databases have also become common on a large number of
56      non-UNIX computing platforms, for similar kinds of purposes.

58      The increased pervasiveness of these files has led to an increased
59      demand for a standardized, network-wide interchange of these files
60      as discrete database objects. In turn, this dictates a need for a
61      media-type definition for mbox files in general, which is the
62      subject and purpose of this memo.

64   2.  About the mbox Database

66      The mbox database format is not documented in an authoritative
67      specification, but instead exists as a well-known output format
68      that is anecdotally documented, or which is only authoritatively
69      documented for a specific platform or tool.

71      mbox databases typically contain a linear sequence of electronic
72      mail messages. Each message begins with a separator line that
73      identifies the message sender, and also identifies the date and
74      time at which the message was received by the final recipient
75      (either the last-hop system in the transfer path, or the system
76      which serves as the recipient's mailstore).
        Each message is
77      typically terminated by an empty line. The end of the database is
78      usually recognized by either the absence of any additional data,
79      or by the presence of an explicit end-of-file marker.

81      The structure of the separator lines varies across implementations,
82      but usually contains the exact character sequence of "From",
83      followed by a single Space character (0x20), an email address of
84      some kind, another Space character, a timestamp sequence of some
85      kind, and an end-of-line marker. However, due to the lack of any
86      authoritative specification, each of these attributes is known to
87      vary widely across implementations. For example, the email address
88      can reflect any addressing syntax which has ever been used on any
89      messaging system in all of history (specifically including address
90      forms which are not compatible with Internet messages, as defined
91      by RFC 2822 [RFC2822]). Similarly, the timestamp sequences can
92      also vary according to system output, while the end-of-line
93      sequences will often reflect platform-specific requirements.
94      Different data formats can even appear within a single database as
95      a result of multiple mbox files being concatenated together, or
96      because a single file was accessed by multiple messaging clients
97      which have each used their own syntax for the separator line.

99      Message data within mbox databases often reflects site-specific
100     peculiarities. For example, it is entirely possible for the
101     message body or headers in an mbox database to contain untagged
102     eight-bit character data that implicitly reflects a site-specific
103     default language or locale, or for timestamps and email addresses
104     to reflect local defaults, with none of this data being widely
105     portable beyond the local scope. Similarly, message data can also
106     contain unencoded eight-bit binary data, or can use encoding
107     formats which represent a specific platform (e.g., BINHEX or
108     UUENCODE sequences).
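
The usual separator-line shape described above ("From", a Space, an address of some kind, another Space, a timestamp of some kind, and an end-of-line marker) can be illustrated with a minimal Python sketch. The names LOOSE_SEPARATOR and split_separator are hypothetical and are not taken from any existing tool. The pattern is deliberately lenient, so it also accepts ordinary prose lines that happen to begin with "From ", which is exactly the ambiguity that makes undeclared mbox data hard to parse reliably.

```python
import re

# Hypothetical, deliberately lenient pattern: "From", one space, a run of
# non-space characters (an address in any syntax), one or more spaces, and
# the rest of the line (a timestamp in any syntax).
LOOSE_SEPARATOR = re.compile(r"^From (\S+) +(\S.*)$")


def split_separator(line):
    """Return (address, timestamp) if the line has the usual separator
    shape, otherwise None. The end-of-line marker may be LF or CRLF."""
    match = LOOSE_SEPARATOR.match(line.rstrip("\r\n"))
    if match is None:
        return None
    return match.group(1), match.group(2)


# An Internet-style address and a UUCP-style address are both accepted.
print(split_separator("From jdoe@example.com Tue Feb  1 12:00:00 2005\n"))
print(split_separator("From host!user Tue Feb  1 12:00:00 2005\r\n"))
# A header field is rejected, but a prose line is (wrongly) accepted.
print(split_separator("From: jdoe@example.com\n"))
print(split_separator("From here on, things changed\n"))
```

A recognizer this loose is only suitable as a first-pass heuristic; the stricter rules of the "default" format in Appendix A remove most of the ambiguity.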

110     Many implementations are also known to escape message body lines
111     that begin with the character sequence of "From ", so as to
112     prevent confusion with overly-liberal parsers that do not search
113     for full separator lines. In the common case, a leading Greater-
114     Than symbol (0x3E) is used for this purpose (with "From " becoming
115     ">From "). However, other implementations are known not to escape
116     such lines unless they are immediately preceded by a blank line or
117     if they also appear to contain an email address and a timestamp.
118     Other implementations are also known to perform secondary escapes
119     against these lines if they are already escaped or quoted, while
120     others ignore these mechanisms altogether.

122     A comprehensive description of mbox database files on UNIX-like
123     systems can be found at http://qmail.org./man/man5/mbox.html,
124     which should be treated as mostly authoritative for those
125     variations which are otherwise only documented in anecdotal form.
126     However, readers are advised that many other platforms and tools
127     make use of mbox databases, and that there are many more potential
128     variations that can be encountered in the wild.

130     In order to mitigate errors that may arise from such vagaries,
131     this specification defines a "format" parameter to the
132     APPLICATION/MBOX media-type declaration, which can be used to
133     identify the specific kind of mbox database that is being
134     transferred. Furthermore, this specification defines a "default"
135     database format which MUST be supported by implementations that
136     claim to be compliant with this specification, and which is to be
137     used as the implicit format for undeclared APPLICATION/MBOX data
138     objects. Additional format types are to be defined in subsequent
139     specifications.
        Messaging systems which receive an mbox database
140     with an unknown format parameter value SHOULD treat the data as an
141     opaque binary object, as if the data had been declared as
142     APPLICATION/OCTET-STREAM.

144     Refer to Appendix A for a description of the default mbox format.

146     Note that RFC 2046 [RFC2046] defines the multipart/digest media-
147     type for transferring platform-independent message files. Since
148     that specification defines a set of neutral and strict formatting
149     rules, the multipart/digest media-type already facilitates highly-
150     predictable transfer and conversion operations, and as such
151     implementers are strongly encouraged to support and use that
152     media-type where possible.

154  3.  Prerequisites and Terminology

156     Readers of this document are expected to be familiar with the
157     specification for MIME [RFC2045] and MIME-type registrations
158     [RFC2048].

160     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
161     NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL"
162     in this document are to be interpreted as described in RFC 2119
163     [RFC2119].

165  4.  The APPLICATION/MBOX Media-Type Registration

167     This section provides the media-type registration application (as
168     per [RFC2048]), which will be submitted to IANA after IESG
169     approval of this specification.

171     MIME media type name: application

173     MIME subtype name: mbox

175     Required parameters: none

177     Optional parameters: The "format" parameter identifies the format
178     of the mbox database and the messages contained therein. The
179     default value for the "format" parameter is "default", and refers
180     to the formatting rules defined in Appendix A of this memo. mbox
181     databases that do not have a "format" parameter SHOULD be
182     interpreted as having the implicit "format" value of "default".
183     mbox databases that have an unknown value for the "format"
184     parameter SHOULD be treated as opaque data objects, as if the
185     media-type had been specified as APPLICATION/OCTET-STREAM.
186     Additional values for the format parameter are to be defined in
187     subsequent specifications, and registered with IANA.

189     Encoding considerations: If an email client receives an mbox
190     database as a message attachment, and then stores that attachment
191     within a local mbox database, the contents of the two database
192     files may become irreversibly intermingled, such that neither
193     database is any longer independently recognizable. In order to
194     avoid these collisions, messaging systems which support this
195     specification MUST encode an mbox database (or at a minimum, the
196     separator lines) with a non-transparent transfer encoding (such as
197     BASE64 or Quoted-Printable) whenever an APPLICATION/MBOX object is
198     transferred via messaging protocols. Other transfer services are
199     generally encouraged to adopt similar encoding strategies to allow
200     for any subsequent retransmission which might occur, but are not
201     explicitly required to do so. Implementers should also be prepared
202     to encode mbox data locally if non-compliant data is received.

204     Security considerations: mbox data is passive, and does not
205     generally represent a unique or new security threat. However,
206     there is risk in sharing any kind of data, in that unintentional
207     information may be exposed, and this risk certainly applies to
208     mbox data as well.

210     Interoperability considerations: Due to the lack of a single
211     authoritative specification for mbox databases, there are a large
212     number of variations between database formats (refer to the
213     introduction text for common examples), and it is expected that
214     non-conformant data will be erroneously tagged or exchanged.
215     Although the "default" format specified in this memo does not
216     allow for these kinds of vagaries, prior negotiation or agreement
217     between humans may sometimes be needed.

219     Published specification: see Appendix A.

221     Applications which use this media type: hundreds of messaging
222     products make use of the mbox database format, in one form or
223     another.

225     Magic number(s): mbox database files can be recognized by having a
226     leading character sequence of "From", followed by a single Space
227     character (0x20), followed by additional printable character data
228     (refer to the description in Appendix A for details). However,
229     implementers are cautioned that not all such files will be
230     compliant with all of the formatting rules, so implementers should
231     treat these files with an appropriate amount of circumspection.

233     File extension(s): mbox database files sometimes have an ".mbox"
234     extension, but this is neither required nor expected. As with magic
235     numbers, implementers should avoid reflexive assumptions about the
236     contents of such files.

238     Macintosh File Type Code(s): None are known to be common.

240     Person & email address to contact for further information: Eric A.
241     Hall (ehall@ntrg.com)

243     Intended usage: COMMON

245  5.  Security Considerations

247     See the discussion in section 4.

249  6.  IANA Considerations

251     Upon IESG approval, IANA would be expected to register the
252     APPLICATION/MBOX media-type in the MIME registry, using the
253     application provided in section 4 above.

255     Furthermore, IANA would be expected to establish and maintain a
256     registry of values for the "format" parameter as described in this
257     memo. The first registration would be the "default" value, using
258     the description provided in Appendix A. Subsequent values for the
259     "format" parameter MUST be accompanied by some form of
260     recognizable, complete, and legitimate specification, such as an
261     IESG-approved specification,
        or some kind of authoritative vendor
262     documentation.

264  7.  Normative References

266        [RFC2046]     Freed, N., Borenstein, N., "Multipurpose
267                      Internet Mail Extensions (MIME) Part Two:
268                      Media Types", RFC 2046, November 1996.

270        [RFC2048]     Freed, N., Klensin, J., Postel, J.,
271                      "Multipurpose Internet Mail Extensions (MIME)
272                      Part Four: Registration Procedures", BCP 13,
273                      RFC 2048, November 1996.

275        [RFC2119]     Bradner, S., "Key words for use in RFCs to
276                      Indicate Requirement Levels", BCP 14, RFC
277                      2119, March 1997.

279        [RFC2822]     Resnick, P., "Internet Message Format", RFC
280                      2822, April 2001.

282  Appendix A.  The "default" mbox Database Format

284     In order to improve interoperability among messaging systems, this
285     memo defines a "default" mbox database format, which MUST be
286     supported by all implementations claiming to be compliant with
287     this specification.

289     The "default" mbox database format uses a linear sequence of
290     Internet messages, with each message being immediately prefaced by
291     a separator line, and being terminated by an empty line. More
292     specifically:

294        o  Each message within the database MUST follow the syntax and
295           formatting rules defined in RFC 2822 [RFC2822] and its
296           related specifications, with the exception that the canonical
297           mbox database MUST use a single Line-Feed character (0x0A) as
298           the end-of-line sequence, and MUST NOT use a Carriage-
299           Return/Line-Feed pair (NB: this requirement only applies to
300           the canonical mbox database as transferred, and does not
301           override any other specifications). This usage represents the
302           most common historical representation of the mbox database
303           format, and allows for the least amount of conversion.

305        o  Messages within the default mbox database MUST consist of
306           seven-bit characters within an eight-bit stream.
              Eight-bit
307           data within the stream MUST be converted to a seven-bit form
308           (using an appropriate, standardized encoding) and
309           appropriately tagged (with the correct header fields) before
310           the database is transferred.

312        o  Message headers and data in the default mbox database MUST be
313           fully-qualified, as per the relevant specification[s]. For
314           example, email addresses in the various header fields MUST
315           have legitimate domain names (as per RFC 2822), while
316           extended characters and encodings MUST be specified in the
317           appropriate location (as per the appropriate MIME
318           specifications), and so forth.

320        o  Each message in the mbox database MUST be immediately
321           preceded by a single separator line, which MUST conform to
322           the following syntax:

324              The exact character sequence of "From";

326              a single Space character (0x20);

328              the email address of the message sender (as obtained from
329              the message envelope or other authoritative source),
330              conformant with the "addr-spec" syntax from RFC 2822;

332              a single Space character;

334              a timestamp indicating the UTC date and time when the
335              message was originally received, conformant with the
336              syntax of the traditional UNIX 'ctime' output sans
337              timezone (note that the use of UTC precludes the need for
338              a timezone indicator);

340              an end-of-line marker.

342        o  Each message in the database MUST be terminated by an empty
343           line, containing a single end-of-line marker.

345     Note that the first message in an mbox database will only be
346     prefaced by a separator line, while every other message will begin
347     with two end-of-line sequences (one at the end of the message
348     itself, and another to mark the end of the message within the mbox
349     database file stream) and a separator line (marking the new
350     message). The end of the database is implicitly reached when no
351     more message data or separator lines are found.
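
The rules above can be sketched as a small Python reader for the "default" format. This is a minimal illustration under stated assumptions, not a reference implementation: the names SEPARATOR, MboxEntry, parse_separator, and split_default_mbox are hypothetical, the address pattern is a simplified stand-in for the RFC 2822 "addr-spec" grammar, and the 'ctime' layout is assumed to be "Www Mmm dd hh:mm:ss yyyy" with English day and month names.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timezone

# Assumption: "local@domain" approximates addr-spec; the timestamp follows
# the traditional ctime layout without a timezone, e.g.
# "Tue Feb  1 12:00:00 2005".
SEPARATOR = re.compile(
    r"^From (?P<sender>[^\s@]+@[^\s@]+) "
    r"(?P<stamp>[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2} \d{4})$"
)


@dataclass
class MboxEntry:
    sender: str         # address taken from the separator line
    received: datetime  # separator timestamp, interpreted as UTC
    text: str           # message text with LF line endings


def parse_separator(line):
    """Return (sender, received) for a full separator line, else None."""
    match = SEPARATOR.match(line)
    if match is None:
        return None
    try:
        stamp = datetime.strptime(match.group("stamp"), "%a %b %d %H:%M:%S %Y")
    except ValueError:
        return None
    return match.group("sender"), stamp.replace(tzinfo=timezone.utc)


def _finish(sender, received, body_lines):
    # Drop the empty line that terminates each message in the database.
    if body_lines and body_lines[-1] == "":
        body_lines = body_lines[:-1]
    return MboxEntry(sender, received, "".join(l + "\n" for l in body_lines))


def split_default_mbox(data):
    """Split bytes in the "default" format into a list of MboxEntry."""
    text = data.decode("ascii")  # raises UnicodeDecodeError on eight-bit data
    if "\r" in text:
        raise ValueError("the default format uses bare LF line endings")
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # artifact of the final end-of-line marker
    entries, current = [], None
    for line in lines:
        parsed = parse_separator(line)
        if parsed is not None:
            # No escape syntax is prescribed, so any full separator line
            # starts a new message.
            if current is not None:
                entries.append(_finish(*current))
            current = (parsed[0], parsed[1], [])
        elif current is not None:
            current[2].append(line)
        elif line:
            raise ValueError("data found before the first separator line")
    if current is not None:
        entries.append(_finish(*current))
    return entries
```

Because only full separator lines are matched, a body line such as "From here on, ..." stays inside its message without any escaping, whereas a body line that happens to be a complete, valid separator line would still split the message.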

353     Also note that this specification does not prescribe any escape
354     syntax for message body lines that begin with the character
355     sequence of "From ". Recipient systems are expected to parse full
356     separator lines as they are documented above.

358  Acknowledgments

360     Funding for the RFC editor function is currently provided by the
361     Internet Society.

363  Authors' Addresses

365     Eric A. Hall
366     ehall@ntrg.com

368  Full Copyright Statement

370     Copyright (C) The Internet Society 2004. This document is subject
371     to the rights, licenses and restrictions contained in BCP 78, and
372     except as set forth therein, the authors retain all their rights.

374     This document and the information contained herein are provided on
375     an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
376     REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND
377     THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES,
378     EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT
379     THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR
380     ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A
381     PARTICULAR PURPOSE.