INTERNET-DRAFT                                          Maurizio Codogno
draft-codogno-mime-nntp8bit-00.txt                                 CSELT
Expires: February 11, 1999                         Date: August 06, 1998

The MIME application/nntp8bit Content-type

Status of this Memo

This document is an Internet Draft; Internet Drafts are working
documents of the Internet Engineering Task Force (IETF), its Areas,
and Working Groups. Note that other groups may also distribute
working documents as Internet Drafts.

Internet Drafts are draft documents valid for a maximum of six
months. They may be updated, replaced, or obsoleted by other
documents at any time. It is not appropriate to use Internet Drafts
as reference material or to cite them other than as a "working draft"
or "work in progress".

Please check the abstract listing in each Internet Draft directory
for the current status of this or any other Internet Draft.

To view the entire list of current Internet-Drafts, please check
the "1id-abstracts.txt" listing contained in the Internet-Drafts
Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net
(Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au
(Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu
(US West Coast).

Abstract

The application/nntp8bit content-type is proposed and defined as an
efficient and simple way to transmit raw ("binary") data over an NNTP
connection, taking into account the foreseeable limitations of that
standard.

1. Introduction

Usenet News [NNTP, NEWS] is a very popular data transmission format:
at the time of writing, there are tens of thousands of different
discussion groups, and the traffic generated per site can be as
much as 10 GB/day.

The vast majority of the data consists of binary files (images,
audio or video clips, software programs...), which make up as much as
90% of the global traffic. Unfortunately, the two main ways to encode
binary data, UUENCODE and MIME application/octet-stream with
Content-Transfer-Encoding base64, add a 33% overhead to the size of
the file sent.

The new NNTP specification now being worked on [NEWNNTP] requires an
8-bit-wide channel, and the companion new definition of the Usenet
message format [USEFOR] does not object to the presence of 8-bit
data. There is, however, a problem that does not allow raw data to be
sent directly: the body of an article cannot contain an ASCII NUL
(0x00) character, and ASCII CR and LF (0x0d, 0x0a) must appear
together. Moreover, each line in the body must be at most 998 octets
long and must end with the CR-LF sequence (not counted in the
998-octet limit).

A rather simple way to cope with these limitations is to define a
MIME Content-Type that codes the data so that it complies with them.
This solution has been preferred to the definition of a new
Content-Transfer-Encoding because it is simpler to get working: if a
newsreader does not understand the format, the article can be saved
and processed with an external filter.

2. application/nntp8bit Registration Information

The following form is copied from RFC 1590, Appendix A; the new media
type will be registered in due course.

To: IANA@isi.edu
Subject: Registration of new Media Type content-type/subtype

Media Type name: application

Media subtype name: nntp8bit

Required parameters: Type, a media type/subtype

Optional parameters: Name, the name of the file

Encoding considerations: it must be encoded "8bit" or "binary".

Security considerations: NONE

Published specification: RFC-REL (this document).

Person & email address to contact for further information:
   Maurizio Codogno
   CSELT CF/IM Dept.
   Via G. Reiss Romoli, 274
   I-10148 Torino TO
   Italy
   +39 011 228 6132

3. Definition of the coding

Since it is expected that, at least in the beginning, the MIME type
application/nntp8bit will not be widely deployed, the specification
of the coding has deliberately been kept simple. Moreover, most
binary files sent over Usenet News can be assumed to be already
compressed; it is therefore enough simply to escape the offending
characters. A single exception has been made: since someone may send
uncompressed files, which tend to contain a large number of NUL
characters, NUL is coded with a single octet.

Since no chunk of data between CRLF pairs can be longer than 998
octets, it is also necessary to add CRLF pairs in suitable places.
The coding algorithm, written in C, runs as follows:

----------------- cut ----------------------
#include <stdio.h>

enum { NUL = 0x00, LF = 0x0a, CR = 0x0d,
       X80 = 0x80, X81 = 0x81, X8A = 0x8a, X8D = 0x8d };

int main(void)
{
    int c;          /* int, so that EOF differs from every octet */
    int nchar = 0;  /* octets written on the current line */

    while ((c = getchar()) != EOF) {
        if (c == NUL)           /* NUL is coded with a single octet */
            { putchar(X80); nchar++; }
        else if (c == CR)
            { putchar(X81); putchar(X8D); nchar += 2; }
        else if (c == LF)
            { putchar(X81); putchar(X8A); nchar += 2; }
        else if (c == X80)      /* 0x80 is escaped as 0x81 0x80 */
            { putchar(X81); putchar(X80); nchar += 2; }
        else if (c == X81)      /* 0x81 is escaped as 0x81 0x81 */
            { putchar(X81); putchar(X81); nchar += 2; }
        else
            { putchar(c); nchar++; }

        if (nchar >= 997)       /* no line exceeds 998 octets */
            { putchar(CR); putchar(LF); nchar = 0; }
    }
    return 0;
}
----------------- cut ----------------------

while the decoding algorithm is the following:

----------------- cut ----------------------
#include <stdio.h>

enum { NUL = 0x00, LF = 0x0a, CR = 0x0d,
       X80 = 0x80, X81 = 0x81, X8A = 0x8a, X8D = 0x8d };

int main(void)
{
    int c;          /* int, so that EOF differs from every octet */

    while ((c = getchar()) != EOF) {
        if (c == CR)
            c = getchar();      /* eat CRLF */
        else if (c == X80)
            putchar(NUL);
        else if (c == X81) {
            c = getchar();      /* get escaped char */
            if (c == X80) putchar(X80);
            else if (c == X81) putchar(X81);
            else if (c == X8A) putchar(LF);
            else if (c == X8D) putchar(CR);
        }
        else
            putchar(c);
    }
    return 0;
}
----------------- cut ----------------------
Note that a real implementation should of course check for malformed
input data and report an error accordingly.

The overhead induced by this coding can be roughly measured as
follows:

- four octets out of 256 are coded with two octets, increasing
  the total size by about 1.6% (4/256) on average, assuming that all
  octet values are equally likely;
- two extra octets (CRLF) are added every 997 or 998 octets, adding
  a further 0.2%;
- there is the MIME header overhead, which is negligible for large
  files.

It is therefore possible to code a typical article with about 2%
overhead, rather than the 33% of UUENCODE or base64 encoding.

4. User Agent Requirements

User agents that do not recognize application/nntp8bit shall, in
accordance with [MIME], treat the entire entity as
application/octet-stream. This is acceptable, since the data can then
be saved to an external file and processed offline.

MIME User Agents that recognize application/nntp8bit will decode the
stream of data and present it to the user as a file whose content
type is given by the Type parameter.

4.1 Recursion

MIME is a recursive structure. Hence one must expect an
application/nntp8bit entity to contain other application/nntp8bit
entities. When an application/nntp8bit entity is being processed for
display or storage, any enclosed application/nntp8bit entities shall
be processed as though they were being stored.

5. Further work

It would be possible to define a way to process articles that are
split before transmission because of their large size. Two possible
ways to do this are:

- add an optional MIME parameter that says which part of the file is
  being sent;
- use an escape sequence "0x81 0xnn", with nn going from 01 to 79, at
  the beginning of the stream data to indicate which part is being
  sent.

The latter scheme limits the size of the complete file being sent,
but it is more compact.

6. Security considerations

It may be possible to prepare a coded stream that causes malicious
programs to be executed if a newsreader does not understand this MIME
media type. It should be noted, however, that the specifications for
Usenet messages would allow such a message anyway, so this media type
should not add any new security issue.

7. Acknowledgments

[I hope someone in the USEFOR IETF group will help me!]
The author, however, takes full responsibility for all errors
contained in this document.

8. References

[MIME]    Borenstein, N. and Freed, N., "MIME (Multipurpose Internet
          Mail Extensions): Mechanisms for Specifying and Describing
          the Format of Internet Message Bodies", RFC 1341, June 1992.

[NEWS]    Horton, M. and Adams, R., "Standard for Interchange of
          USENET Messages", AT&T Bell Labs and Center for Seismic
          Studies, RFC 1036, December 1987.

[NEWNNTP] Barber, S., "Network News Transport Protocol", work in
          progress,
          ftp://ds.internic.net/internet-drafts/draft-ietf-nntpext-base-04.txt

[NNTP]    Kantor, B. and Lapsley, P., "Network News Transfer
          Protocol", U.C. San Diego and U.C. Berkeley, RFC 977,
          February 1986.

[USEFOR]  Ritter, D., N., "User Article Format", work in progress,
          ftp://ds.internic.net/internet-drafts/draft-ietf-usefor-article-01.txt

9. Author's address

Maurizio Codogno
CSELT CF/IM Dept.
Via G. Reiss Romoli, 274
I-10148 Torino TO
Italy
+39 011 228 6132
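
Section 3 notes that a real implementation should check for malformed
input. The following is a minimal sketch in C of how such a check
might look; it is an illustration, not a reference implementation.
The function names nntp8bit_encode and nntp8bit_decode, the
buffer-based interface and the NNTP8BIT_ERR return value are
assumptions made for this example and are not defined by this
document. The decoder rejects a CR not followed by LF, a bare LF or
NUL, a truncated escape, and an unknown octet after 0x81.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Octet values from Section 3; NNTP8BIT_ERR is a hypothetical error code. */
enum { NUL = 0x00, LF = 0x0a, CR = 0x0d,
       X80 = 0x80, X81 = 0x81, X8A = 0x8a, X8D = 0x8d };
#define NNTP8BIT_ERR ((size_t)-1)

/* Encode n octets of in[] into out[]; returns the octets written,
   or NNTP8BIT_ERR if out[] (capacity cap) is too small. */
size_t nntp8bit_encode(const unsigned char *in, size_t n,
                       unsigned char *out, size_t cap)
{
    size_t o = 0, line = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (o + 4 > cap)              /* worst case: escape plus CRLF */
            return NNTP8BIT_ERR;
        switch (in[i]) {
        case NUL: out[o++] = X80; line += 1; break;
        case CR:  out[o++] = X81; out[o++] = X8D; line += 2; break;
        case LF:  out[o++] = X81; out[o++] = X8A; line += 2; break;
        case X80: out[o++] = X81; out[o++] = X80; line += 2; break;
        case X81: out[o++] = X81; out[o++] = X81; line += 2; break;
        default:  out[o++] = in[i]; line += 1; break;
        }
        if (line >= 997) {            /* keep each line within 998 octets */
            out[o++] = CR; out[o++] = LF; line = 0;
        }
    }
    return o;
}

/* Decode n octets of in[] into out[]; returns the octets written,
   or NNTP8BIT_ERR on malformed input or insufficient capacity. */
size_t nntp8bit_decode(const unsigned char *in, size_t n,
                       unsigned char *out, size_t cap)
{
    size_t o = 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        unsigned char c = in[i], d;
        if (c == CR) {                /* CR must be followed by LF */
            if (i + 1 >= n || in[i + 1] != LF)
                return NNTP8BIT_ERR;
            i++;                      /* skip the line break */
            continue;
        }
        if (c == LF || c == NUL)      /* never emitted by the encoder */
            return NNTP8BIT_ERR;
        if (c == X80) {
            d = NUL;
        } else if (c == X81) {
            if (i + 1 >= n)           /* truncated escape sequence */
                return NNTP8BIT_ERR;
            switch (in[++i]) {
            case X80: d = X80; break;
            case X81: d = X81; break;
            case X8A: d = LF;  break;
            case X8D: d = CR;  break;
            default:  return NNTP8BIT_ERR;   /* unknown escape */
            }
        } else {
            d = c;
        }
        if (o >= cap)
            return NNTP8BIT_ERR;
        out[o++] = d;
    }
    return o;
}
```

Working on buffers rather than on standard input keeps the sketch
easy to test: encoding the six octets 00 0d 0a 80 81 41 and decoding
the result gives back the same six octets, while the sequence 81 41 is
rejected.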