Network Working Group                                   Arnt Gulbrandsen
Request for Comments: DRAFT                       Oryx Mail Systems GmbH
Intended Status: Proposed Standard                            April 2007

                      The IMAP COMPRESS Extension
                  draft-ietf-lemonade-compress-08.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt. The list of
   Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   The COMPRESS extension allows an IMAP connection to be effectively
   and efficiently compressed.

Table of Contents

   1.  Conventions Used in This Document
   2.  Introduction and Overview
   3.  The COMPRESS Command
   4.  Compression Efficiency
   5.  Formal Syntax
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgements
   9.  References
       9.1.  Normative References
       9.2.  Informative References
   10. Author's Address

1. Conventions Used in This Document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   Formal syntax is defined by [RFC4234] as modified by [RFC3501].

   In the examples, "C:" and "S:" indicate lines sent by the client and
   server respectively. "[...]" denotes elision.

2. Introduction and Overview

   A server which supports the COMPRESS extension indicates this with
   one or more capability names consisting of "COMPRESS=" followed by a
   supported compression algorithm name as described in this document.

   The goal of COMPRESS is to reduce the bandwidth usage of IMAP.

   Compared to PPP compression (see [RFC1962]) and modem-based
   compression (see [MNP] and [V42BIS]), COMPRESS offers much better
   compression efficiency.
   COMPRESS can be used together with TLS [RFC4346], SASL encryption,
   VPNs etc. Compared to TLS compression [RFC3749], COMPRESS has the
   following (dis)advantages:

   - COMPRESS can be implemented easily both by IMAP servers and
     clients.

   - IMAP COMPRESS benefits from an intimate knowledge of the IMAP
     protocol's state machine, allowing for dynamic and aggressive
     optimization of the underlying compression algorithm's parameters.

   - When the TLS layer implements compression, any protocol using that
     layer can transparently benefit from that compression (e.g. SMTP
     and IMAP). COMPRESS is specific to IMAP.

   In order to increase interoperation, it is desirable to have as few
   different compression algorithms as possible, so this document
   specifies only one. The DEFLATE algorithm (defined in [RFC1951]) is
   standard, widely available and fairly efficient, so it is the only
   algorithm defined by this document.

   In order to increase interoperation, IMAP servers which advertise
   this extension SHOULD also advertise the TLS DEFLATE compression
   mechanism as defined in [RFC3749]. IMAP clients MAY use either
   COMPRESS or TLS compression.

   The extension adds one new command (COMPRESS) and no new responses.

3. The COMPRESS Command

   Arguments: Name of compression mechanism: "DEFLATE".

   Responses: None

   Result: OK The server will compress its responses and expects the
              client to compress its commands.
           NO Compression is already active via another layer.
          BAD Command unknown, invalid or unknown argument, or COMPRESS
              already active.

   The COMPRESS command instructs the server to use the named
   compression mechanism ("DEFLATE" is the only one defined) for all
   commands and/or responses after COMPRESS.

   The client MUST NOT send any further commands until it has seen the
   result of COMPRESS. If the response was OK, the client MUST compress
   starting with the first command after COMPRESS.
   If the server response was BAD or NO, the client MUST NOT turn on
   compression.

   If the server responds NO because it knows that the same mechanism
   is active already (e.g. because TLS has negotiated the same
   mechanism), it MUST send COMPRESSIONACTIVE as resp-text-code (see
   [RFC3501] section 7.1), and the resp-text SHOULD say which layer
   compresses.

   If the server issues an OK response, the server MUST compress
   starting immediately after the CRLF which ends the tagged OK
   response. (Responses issued by the server before the OK response
   will, of course, still be uncompressed.) If the server issues a BAD
   or NO response, the server MUST NOT turn on compression.

   For DEFLATE (as for many other compression mechanisms), the
   compressor can trade speed against quality. When decompressing
   there isn't much of a tradeoff. Consequently, the client and server
   are both free to pick the best reasonable rate of compression for
   the data they send.

   When COMPRESS is combined with TLS (see [RFC4346]) or SASL (see
   [RFC4422]) security layers, the sending order of the three
   extensions MUST be first COMPRESS, then SASL, and finally TLS. That
   is, before data is transmitted it is first compressed. Second, if a
   SASL security layer has been negotiated, the compressed data is then
   signed and/or encrypted accordingly. Third, if a TLS security layer
   has been negotiated, the data from the previous step is signed
   and/or encrypted accordingly. When receiving data, the processing
   order MUST be reversed. This ensures that before sending, data is
   compressed before it is encrypted, independent of the order in which
   the client issues COMPRESS, AUTHENTICATE, and STARTTLS.

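   The layering order described above can be illustrated with a minimal
   Python sketch. It is not part of the specification. DeflateLayer,
   ToyXorLayer, make_stack, send and receive are hypothetical names
   chosen for illustration, and the XOR layer is only a placeholder
   for a real SASL or TLS security layer; it provides no security.

```python
import zlib


class DeflateLayer:
    """Raw DEFLATE in both directions (illustrative, not normative)."""

    def __init__(self) -> None:
        # Negative wbits selects raw DEFLATE, without a zlib header.
        self._deflater = zlib.compressobj(6, zlib.DEFLATED, -15)
        self._inflater = zlib.decompressobj(-15)

    def wrap(self, data: bytes) -> bytes:
        # Z_SYNC_FLUSH emits all pending output so the peer can decode now.
        return self._deflater.compress(data) + self._deflater.flush(
            zlib.Z_SYNC_FLUSH
        )

    def unwrap(self, data: bytes) -> bytes:
        return self._inflater.decompress(data)


class ToyXorLayer:
    """Hypothetical stand-in for a SASL or TLS layer. NOT encryption."""

    def __init__(self, key: int) -> None:
        self._key = key

    def wrap(self, data: bytes) -> bytes:
        return bytes(b ^ self._key for b in data)

    unwrap = wrap  # XOR with the same key is its own inverse


def make_stack():
    # Sending order: COMPRESS first, then SASL, finally TLS.
    return [DeflateLayer(), ToyXorLayer(0x5A), ToyXorLayer(0xA5)]


def send(data: bytes, stack) -> bytes:
    for layer in stack:
        data = layer.wrap(data)
    return data


def receive(data: bytes, stack) -> bytes:
    # The receiving side processes the layers in reverse order.
    for layer in reversed(stack):
        data = layer.unwrap(data)
    return data
```

   Each peer keeps its own stack, since the compressor and decompressor
   carry state across commands; the sender's send() output is fed to
   the receiver's receive().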
   The following example illustrates how commands and responses are
   compressed during a simple login sequence:

        S: * OK [CAPABILITY IMAP4REV1 STARTTLS COMPRESS=DEFLATE]
        C: a starttls
        S: a OK TLS active

      From this point on, everything is encrypted.

        C: b login arnt tnra
        S: b OK Logged in as arnt
        C: c compress deflate
        S: c OK DEFLATE active

      From this point on, everything is compressed before being
      encrypted.

   The following example demonstrates how a server may refuse to
   compress twice:

        S: * OK [CAPABILITY IMAP4REV1 STARTTLS COMPRESS=DEFLATE]
        [...]
        C: c compress deflate
        S: c NO [COMPRESSIONACTIVE] DEFLATE active via TLS

4. Compression Efficiency

   This section is informative, not normative.

   IMAP poses some unusual problems for a compression layer.

   Upstream is fairly simple. Most IMAP clients send the same few
   commands again and again, so any compression algorithm which can
   exploit repetition works efficiently. The APPEND command is an
   exception; clients which send many APPEND commands may want to
   surround large literals with flushes in the same way as is
   recommended for servers later in this section.

   Downstream has the unusual property that several kinds of data are
   sent, confusing all dictionary-based compression algorithms.

   One type is IMAP responses. These are highly compressible; zlib
   using its least CPU-intensive setting compresses typical responses
   to 25-40% of their original size.

   Another is email headers. These are equally compressible, and
   benefit from using the same dictionary as the IMAP responses.

   A third is email body text. Text is usually fairly short and
   includes much ASCII, so the same compression dictionary will do a
   good job here, too. When multiple messages in the same thread are
   read at the same time, quoted lines etc. can often be compressed
   almost to zero.

   Finally, attachments (non-text email bodies) are transmitted, either
   in binary form or encoded with base-64.

   When attachments are retrieved in binary form, DEFLATE may be able
   to compress them, but the format of the attachment is usually not
   IMAP-like, so the dictionary built while compressing IMAP does not
   help. The compressor has to adapt its dictionary from IMAP to the
   attachment's format, and then back. A few file formats aren't
   compressible at all using deflate, e.g. .gz, .zip and .jpg files.

   When attachments are retrieved in base-64 form, the same problems
   apply, but the base-64 encoding adds another problem. 8-bit
   compression algorithms such as deflate work well on 8-bit file
   formats; however, base-64 turns a file into something resembling
   6-bit bytes, hiding most of the 8-bit file format from the
   compressor.

   When using the zlib library (see [RFC1951]), the functions
   deflateInit2(), deflate(), inflateInit2() and inflate() suffice to
   implement this extension. The windowBits value must be in the range
   -8 to -15, or else deflateInit2() uses the wrong format.
   deflateParams() can be used to improve compression rate and resource
   use. The Z_FULL_FLUSH argument to deflate() can be used to clear the
   dictionary (the receiving peer does not need to do anything).

   A client can improve downstream compression by implementing BINARY
   (defined in [RFC3516]) and using FETCH BINARY instead of FETCH BODY.
   In the author's experience, the improvement ranges from 5% to 40%
   depending on the attachment being downloaded.

   A server can improve downstream compression if it hints to the
   compressor that the data type is about to change strongly, e.g. by
   sending a Z_FULL_FLUSH at the start and end of large non-text
   literals (before and after '*CHAR8' in the definition of literal in
   RFC 3501, page 86). Small literals are best left alone. A possible
   boundary is 5k.

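   As an informal illustration of the zlib usage and flush hint
   described above, the following sketch uses Python's zlib module
   rather than the C functions named in the text. new_compressor,
   send_literal and LARGE_LITERAL are hypothetical names, and the 5k
   threshold is only the possible boundary mentioned above, not a
   tuned value.

```python
import zlib

# Assumed threshold, taken from "a possible boundary is 5k".
LARGE_LITERAL = 5 * 1024


def new_compressor(level: int = 1):
    # A negative wbits value (-8 to -15) selects raw DEFLATE, which is
    # the format this extension expects; -15 is the largest window.
    return zlib.compressobj(level, zlib.DEFLATED, -15)


def send_literal(comp, payload: bytes) -> bytes:
    """Compress one literal, isolating large ones with full flushes."""
    big = len(payload) >= LARGE_LITERAL
    out = b""
    if big:
        # Clear the dictionary before the data type changes.
        out += comp.flush(zlib.Z_FULL_FLUSH)
    out += comp.compress(payload)
    # Clear it again after a large literal; otherwise just make sure
    # the peer can decode everything sent so far.
    out += comp.flush(zlib.Z_FULL_FLUSH if big else zlib.Z_SYNC_FLUSH)
    return out
```

   The receiving side needs no matching logic: a single
   zlib.decompressobj(-15) decodes the stream whether or not the sender
   inserted full flushes.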
   A server can improve the CPU efficiency both of the server and the
   client if it adjusts the compression level (e.g. using the
   deflateParams() function in zlib) at these points, to avoid trying
   to compress uncompressible attachments. A very simple strategy is to
   change the level to 0 at the start of a literal provided the first
   two bytes are either 0x1F 0x8B (as in gzip files, which carry
   deflate-compressed data) or 0xFF 0xD8 (JPEG), and to keep it at 1-5
   the rest of the time. More complex strategies are possible.

5. Formal Syntax

   The following syntax specification uses the Augmented Backus-Naur
   Form (ABNF) notation as specified in [RFC4234]. This syntax augments
   the grammar specified in [RFC3501]. [RFC4234] defines SP and
   [RFC3501] defines command-auth, capability and resp-text-code.

   Except as noted otherwise, all alphabetic characters are
   case-insensitive. The use of upper or lower case characters to
   define token strings is for editorial clarity only. Implementations
   MUST accept these strings in a case-insensitive fashion.

   command-auth =/ compress

   compress    = "COMPRESS" SP algorithm

   capability  =/ "COMPRESS=" algorithm
                 ;; multiple COMPRESS capabilities allowed

   algorithm   = "DEFLATE"
   resp-text-code =/ "COMPRESSIONACTIVE"

   Note that due to the syntax of capability names, future algorithm
   names must be atoms.

6. Security Considerations

   As for TLS compression [RFC3749].

7. IANA Considerations

   The IANA is requested to add COMPRESS=DEFLATE to the list of IMAP
   capabilities. [Note to IANA: This is at
   http://www.iana.org/assignments/imap4-capabilities]

   Note to IANA: This RFC does not specify the creation of a registry
   for compression mechanisms. The current feeling of the IMAP
   community is that it is unlikely that another compression mechanism
   will be added in the future.
   However, if this RFC is extended in the future by another RFC, and
   another compression mechanism is added at that time, it would then
   be appropriate to create a registry.

8. Acknowledgements

   Eric Burger, Dave Cridland, Tony Finch, Ned Freed, Philip Guenther,
   Randall Gellens, Tony Hansen, Cullen Jennings, Stephane Maes, Alexey
   Melnikov, Lyndon Nerenberg and Zoltan Ordogh have all helped with
   this document.

   The author would also like to thank various people in the rooms at
   meetings, whose help is real, but not reflected in the author's
   mailbox.

9. References

9.1. Normative References

   [RFC1951]  Deutsch, "DEFLATE Compressed Data Format Specification
              version 1.3", RFC 1951, Aladdin Enterprises, May 1996.

   [RFC2119]  Bradner, "Key words for use in RFCs to Indicate
              Requirement Levels", RFC 2119, Harvard University, March
              1997.

   [RFC3501]  Crispin, "Internet Message Access Protocol - Version
              4rev1", RFC 3501, University of Washington, June 2003.

   [RFC4234]  Crocker, Overell, "Augmented BNF for Syntax
              Specifications: ABNF", RFC 4234, Brandenburg
              Internetworking, Demon Internet Ltd, October 2005.

9.2. Informative References

   [RFC1962]  Rand, "The PPP Compression Control Protocol (CCP)", RFC
              1962, June 1996.

   [RFC3516]  Nerenberg, "IMAP4 Binary Content Extension", RFC 3516,
              Orthanc Systems, April 2003.

   [RFC3749]  Hollenbeck, "Transport Layer Security Protocol
              Compression Methods", RFC 3749, VeriSign, May 2004.

   [RFC4346]  Dierks, Rescorla, "The Transport Layer Security (TLS)
              Protocol, Version 1.1", RFC 4346, April 2006.

   [RFC4422]  Melnikov, Zeilenga, "Simple Authentication and Security
              Layer (SASL)", RFC 4422, Isode Limited, June 2006.

   [V42BIS]   ITU, "V.42bis: Data compression procedures for data
              circuit-terminating equipment (DCE) using error
              correction procedures",
              http://www.itu.int/rec/T-REC-V.42bis, January 1990.

   [MNP]      Gilbert Held, "The Complete Modem Reference", Second
              Edition, Wiley Professional Computing, ISBN
              0-471-00852-4, May 1994.

10. Author's Address

   Arnt Gulbrandsen
   Oryx Mail Systems GmbH
   Schweppermannstr. 8
   D-81671 Muenchen
   Germany

   Fax: +49 89 4502 9758

   Email: arnt@oryx.com

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

Copyright Statement

   Copyright (C) The IETF Trust (2007). This document is subject to
   the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Disclaimer of Validity

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.