< draft-ietf-lemonade-compress-01.txt   draft-ietf-lemonade-compress-02.txt >
Network Working Group Arnt Gulbrandsen Network Working Group Arnt Gulbrandsen
Request for Comments: DRAFT Oryx Mail Systems GmbH Request for Comments: DRAFT Oryx Mail Systems GmbH
draft-ietf-lemonade-compress-01.txt June 2006 July 2006
The IMAP COMPRESS=DEFLATE Extension The IMAP COMPRESS=DEFLATE Extension
draft-ietf-lemonade-compress-02.txt
Status of this Memo Status of this Memo
By submitting this Internet-Draft, each author represents that any By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79. aware will be disclosed, in accordance with Section 6 of BCP 79.
Internet-Drafts are working documents of the Internet Engineering Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that Task Force (IETF), its areas, and its working groups. Note that
skipping to change at page 1, line 40 skipping to change at page 1, line 41
Copyright Notice Copyright Notice
Copyright (C) The Internet Society 2006. Copyright (C) The Internet Society 2006.
Abstract Abstract
The COMPRESS=DEFLATE extension allows an IMAP connection to be The COMPRESS=DEFLATE extension allows an IMAP connection to be
compressed using the DEFLATE algorithm, such that effective compressed using the DEFLATE algorithm, such that effective
compression is available even when TLS is used. compression is available even when TLS is used.
Conventions Used in This Document Table of Contents
The key words "REQUIRED", "MUST", "MUST NOT", "SHOULD", "SHOULD 1. Conventions Used in This Document
NOT", and "MAY" in this document are to be interpreted as described 2. Introduction and Overview
in "Key words for use in RFCs to Indicate Requirement Levels" 3. The COMPRESS Command
4. Compression Efficiency
5. Formal Syntax
6. Security Considerations
7. IANA Considerations
8. Acknowledgements
9. References
9.1. Normative References
9.2. Informative References
10. Author's Address
11. Open Issues
[KEYWORDS]. Formal syntax is defined by [ABNF] as modified by 1. Conventions Used in This Document
[IMAP].
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [KEYWORDS].
Formal syntax is defined by [ABNF] as modified by [IMAP].
In the example, "C:" and "S:" indicate lines sent by the client and In the example, "C:" and "S:" indicate lines sent by the client and
server respectively. server respectively.
Introduction and Overview 2. Introduction and Overview
An IMAP server that supports this extension announces An IMAP server that supports this extension announces
"COMPRESS=DEFLATE" as one of its capabilities. "COMPRESS=DEFLATE" as one of its capabilities.
The goal of COMPRESS=DEFLATE is to reduce the bandwidth usage of The goal of COMPRESS=DEFLATE is to reduce the bandwidth usage of
IMAP. On regular IMAP connections, the PPP or MNP compression used IMAP. On regular IMAP connections, the PPP or MNP compression used
with many low-bandwidth links compresses IMAP well. However, when with many low-bandwidth links compresses IMAP well. However, when
TLS is used, PPP/MNP compression is ineffective. TLS too may provide TLS is used, PPP/MNP compression is ineffective. TLS too may provide
compression, but few or no implementations do so in practice. compression, but a careful IMAP implementation can do much better.
In order to increase interoperation, it is desirable to have as few In order to increase interoperation, it is desirable to have as few
different compression algorithms as possible, so this document different compression algorithms as possible, so this document
specifies only one. The DEFLATE algorithm is standard, widely specifies only one. The DEFLATE algorithm is standard, widely
available, unencumbered by patents and fairly efficient. Hopefully available, unencumbered by patents and fairly efficient. Hopefully
it will not be necessary to define additional algorithms. it will not be necessary to define additional algorithms.
The extension adds one new command (COMPRESS) and no new responses. The extension adds one new command (COMPRESS) and no new responses.
The COMPRESS Command 3. The COMPRESS Command
Arguments: Name of compression mechanism: "DEFLATE". Arguments: Name of compression mechanism: "DEFLATE".
Direction: "UP", "DOWN" or "BOTH".
Responses: None Responses: None
Result: OK The server will compress its responses (if the direction Result: OK The server will compress its responses and expects the
is DOWN or BOTH) and expects the client to compress its client to compress its commands.
commands (if the direction is UP or BOTH). NO The server doesn't support the requested mechanism.
NO The connection already is compressed, or the server BAD Command unknown, invalid argument, or COMPRESS already
doesn't support the requested mechanism, or the direction active.
specified is unknown.
BAD Command unknown or invalid argument.
The COMPRESS command instructs the server to use the named The COMPRESS command instructs the server to use the named
compression mechanism ("DEFLATE" is the only one defined) for future compression mechanism ("DEFLATE" is the only one defined) for all
commands and/or responses. If the direction specified is "UP", only commands and/or responses after COMPRESS.
commands are compressed. If the direction specified is "DOWN", only
The client MUST NOT send any commands until it has seen the result
of COMPRESS.
For DEFLATE (as for many other compression mechanisms), the For DEFLATE (as for many other compression mechanisms), the
compressor can trade speed against quality. When decompressing compressor can trade speed against quality. When decompressing
there isn't much of a tradeoff. Consequently, the client and server there isn't much of a tradeoff. Consequently, the client and server
are both free to pick the best reasonable rate of compression for are both free to pick the best reasonable rate of compression for
the data they send. the data they send.
The client MUST NOT send additional commands until it has seen the If both [STARTTLS] and COMPRESS are in use, the data should be
result of COMPRESS.
If both SASL/TLS and COMPRESS are in use, the data should be
compressed before it is encrypted (and decrypted before it is compressed before it is encrypted (and decrypted before it is
decompressed), independent of the order in which the client issues decompressed), independent of the order in which the client issues
COMPRESS, AUTHENTICATE and STARTTLS. COMPRESS, AUTHENTICATE and STARTTLS.
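The layering described above might be sketched in Python as follows. This is only an illustration: the class name CompressedTransport and its methods are hypothetical, the transport is assumed to offer sendall() and recv() (as Python sockets and ssl.SSLSocket objects do), and the use of a raw DEFLATE stream (wbits=-15) is an assumption, since the text here does not spell out the framing. Because the wrapper sits above the transport, wrapping a TLS socket would give compression before encryption on send and decryption before decompression on receive, whatever the order of the commands.

```python
import socket
import zlib


class CompressedTransport:
    """Hypothetical DEFLATE layer above a transport with sendall()/recv()."""

    def __init__(self, transport, level=6):
        self._transport = transport
        # One compressor and one decompressor for the life of the connection.
        # wbits=-15 selects a raw DEFLATE stream (an assumption of this sketch).
        self._deflater = zlib.compressobj(level, zlib.DEFLATED, -15)
        self._inflater = zlib.decompressobj(-15)

    def send(self, data: bytes) -> None:
        # Compress first; the transport (e.g. a TLS socket) encrypts afterwards.
        chunk = self._deflater.compress(data)
        # Z_SYNC_FLUSH pushes out everything so the peer can decode it now.
        chunk += self._deflater.flush(zlib.Z_SYNC_FLUSH)
        self._transport.sendall(chunk)

    def recv(self, bufsize: int = 4096) -> bytes:
        # The transport decrypts (if TLS is active); decompress afterwards.
        return self._inflater.decompress(self._transport.recv(bufsize))


def demo() -> bytes:
    # A local socket pair stands in for the (possibly TLS-protected) connection.
    client_sock, server_sock = socket.socketpair()
    try:
        client = CompressedTransport(client_sock)
        server = CompressedTransport(server_sock)
        client.send(b"c login arnt tnra\r\n")
        return server.recv()
    finally:
        client_sock.close()
        server_sock.close()
```

In this sketch, demo() sends one command through the compression layer and returns what the other end decodes.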
Example The following example illustrates how commands and responses are
compressed during a simple login sequence:
This example shows a simple login sequence. The client uses TLS for
privacy and [DEFLATE] for compression.
S: * OK [CAPABILITY IMAP4REV1 STARTTLS COMPRESS=DEFLATE] S: * OK [CAPABILITY IMAP4REV1 STARTTLS COMPRESS=DEFLATE]
C: a starttls C: a starttls
S: a OK S: a OK TLS active
From this point on, everything is encrypted.
C: b compress deflate C: b compress deflate
S: b OK S: b OK DEFLATE active
From this point on, everything is compressed before being
encrypted.
C: c login arnt tnra C: c login arnt tnra
S: c OK S: c OK Logged in as arnt
Compression Efficiency 4. Compression Efficiency
IMAP poses some unusual problems for a compression layer. IMAP poses some unusual problems for a compression layer.
Upstream is fairly simple. Most IMAP clients send the same few Upstream is fairly simple. Most IMAP clients send the same few
commands again and again, so any compression algorithm which can commands again and again, so any compression algorithm which can
exploit quotes works efficiently. The APPEND command is an exploit repetition works efficiently. The APPEND command is an
exception; clients which send many APPEND commands may want to take exception; clients which send many APPEND commands may want to take
special care. special care of literals, in the same way that servers do.
Downstream has the unusual property that 3-4 kinds of data are sent, Downstream has the unusual property that several kinds of data are
confusing all dictionary-based compression algorithms. sent, confusing all dictionary-based compression algorithms.
The first type is IMAP responses. These are highly compressible; One type is IMAP responses. These are highly compressible; zlib
zlib using its least CPU-intensive setting compresses typical using its least CPU-intensive setting compresses typical responses
responses to 25-40% of their original size. to 25-40% of their original size.
The second is email headers. These are equally compressible, and Another is email headers. These are equally compressible, and
benefit from using the same dictionary as the IMAP responses. benefit from using the same dictionary as the IMAP responses.
The third is email body text. Text is usually fairly short and A third is email body text. Text is usually fairly short and
includes much ASCII, so the same compression dictionary will do a includes much ASCII, so the same compression dictionary will do a
good job here, too. When multiple messages in the same thread are good job here, too. When multiple messages in the same thread are
read at the same time, quoted lines etc. can often be compressed read at the same time, quoted lines etc. can often be compressed
almost to zero. almost to zero.
Finally, attachments (non-text email bodies) are transmitted, either Finally, attachments (non-text email bodies) are transmitted, either
in [BINARY] form or encoded with base-64. in [BINARY] form or encoded with base-64.
When attachments are retrieved in [BINARY] form, DEFLATE may be able When attachments are retrieved in [BINARY] form, DEFLATE may be able
to compress them, but the format of the attachment is usually not to compress them, but the format of the attachment is usually not
IMAP-like, so the dictionary built while compressing IMAP does not IMAP-like, so the dictionary built while compressing IMAP does not
help. The compressor has to adapt from IMAP to the attachment's help. The compressor has to adapt its dictionary from IMAP to the
format, and then back. attachment's format, and then back. A few file formats aren't
compressible at all using deflate, e.g. .gz, .zip and .jpg files.
When attachments are retrieved in base-64 form, the same problems When attachments are retrieved in base-64 form, the same problems
apply, but the base-64 encoding adds another problem. 8-bit apply, but the base-64 encoding adds another problem. 8-bit
compression algorithms such as deflate work well on 8-bit file compression algorithms such as deflate work well on 8-bit file
formats, however base-64 turns a file into something resembling a formats, however base-64 turns a file into something resembling
6-bit bytes in an 8-bit format. 6-bit bytes, hiding most of the 8-bit file format from the
compressor.
A few file formats aren't compressible using deflate, e.g. .gz, .zip
and .jpg files.
According to the author's measurements, the compression level used
makes little difference. zlib's level 1 compresses IMAP almost as
well as level 9, and for the receiver, level 1 seems to require
(just a tiny bit) more CPU than level 9. Independent verification
is strongly desired.
Implementation Notes
When using the zlib library (see [DEFLATE]), the functions When using the zlib library (see [DEFLATE]), the functions
deflateInit(), deflate(), inflateInit() and inflate() suffice to deflateInit(), deflate(), inflateInit() and inflate() suffice to
implement this extension. implement this extension. deflateParams() can be used to improve
compression rate and resource use.
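A minimal sketch of that usage, written with Python's zlib module instead of the C functions named above: compressobj()/compress() play the role of deflateInit()/deflate(), and decompressobj()/decompress() that of inflateInit()/inflate(). The helper names, the sample commands and the raw DEFLATE setting (wbits=-15) are assumptions made for illustration. One compressor is kept for the whole connection, so a repeated command can refer back to the earlier one.

```python
import zlib


def compress_commands(commands, level=6):
    """Compress each command on one long-lived stream; return the chunks."""
    # wbits=-15 (raw DEFLATE) is an assumption of this sketch.
    deflater = zlib.compressobj(level, zlib.DEFLATED, -15)
    chunks = []
    for command in commands:
        chunk = deflater.compress(command)
        # Flush after each command so the receiver can decode it immediately.
        chunk += deflater.flush(zlib.Z_SYNC_FLUSH)
        chunks.append(chunk)
    return chunks


def decompress_chunks(chunks):
    """Decode the chunks in order with one long-lived decompressor."""
    inflater = zlib.decompressobj(-15)
    return [inflater.decompress(chunk) for chunk in chunks]


# Two nearly identical commands, as a client polling a mailbox might send.
COMMANDS = [
    b"a1 UID FETCH 1:* (FLAGS INTERNALDATE RFC822.SIZE)\r\n",
    b"a2 UID FETCH 1:* (FLAGS INTERNALDATE RFC822.SIZE)\r\n",
]
CHUNKS = compress_commands(COMMANDS)
```

The second chunk is smaller than the first because the stream's history already contains almost the same command.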
Note that when using TLS, compression may actually decrease the CPU
usage, depending on which algorithms are used in TLS. This is
because fewer bytes need to be encrypted, and encryption is
generally more expensive than compression.
A client can improve downstream compression by implementing [BINARY] A client can improve downstream compression by implementing [BINARY]
and using FETCH BINARY instead of FETCH BODY. and using FETCH BINARY instead of FETCH BODY. In the author's
experience, the improvement ranges from 5% to 40% depending on the
attachment being downloaded.
A server can improve downstream compression if it hints to the A server can improve downstream compression if it hints to the
compressor that the data type is about to change strongly, e.g. by compressor that the data type is about to change strongly, e.g. by
sending a Z_FULL_FLUSH at the start and end of large non-text sending a Z_FULL_FLUSH at the start and end of large non-text
literals (before and after '*CHAR8' in the definition of literal in literals (before and after '*CHAR8' in the definition of literal in
RFC 3501, page 86). RFC 3501, page 86). Small literals are best left alone.
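The hint described above could look roughly like this in Python. The function name, the is_text flag and the 4096-octet cut-off are hypothetical, since the text gives no figure for what counts as a large literal; only flush() with Z_FULL_FLUSH is an actual zlib facility.

```python
import zlib

# Hypothetical cut-off for what counts as a "large" literal.
LARGE_LITERAL = 4096


def compress_literal(deflater, literal: bytes, is_text: bool) -> bytes:
    """Compress one literal's octets, isolating large non-text data."""
    isolate = not is_text and len(literal) >= LARGE_LITERAL
    out = b""
    if isolate:
        # Forget the IMAP-flavoured history before the attachment starts.
        out += deflater.flush(zlib.Z_FULL_FLUSH)
    out += deflater.compress(literal)
    if isolate:
        # Forget the attachment's history before IMAP text resumes.
        out += deflater.flush(zlib.Z_FULL_FLUSH)
    return out
```

The caller keeps using the same compressor object for the surrounding IMAP responses, so the receiver sees one continuous stream and needs no special handling.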
A server can improve the CPU efficiency both of the server and the A server can improve the CPU efficiency both of the server and the
client if it adjusts the compression level (e.g. using the client if it adjusts the compression level (e.g. using the
deflateParams() function in zlib) at these points. A very simple deflateParams() function in zlib) at these points. A very simple
strategy is to change the level to 0 at the start of a literal strategy is to change the level to 0 at the start of a literal
provided the first two bytes are either 0x1F 0x8B (as in deflate- provided the first two bytes are either 0x1F 0x8B (as in deflate-
compressed files) or 0xFF 0xD8 (JPEG), and to keep it at 1-5 the compressed files) or 0xFF 0xD8 (JPEG), and to keep it at 1-5 the
rest of the time. rest of the time.
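The decision part of that strategy can be sketched in a few lines of Python. The function name and the default level of 3 (picked from the 1-5 range mentioned above) are hypothetical. Python's standard zlib module does not appear to expose deflateParams(), so the sketch only computes the level; a C implementation would presumably pass the result to deflateParams().

```python
# Magic numbers named in the text: 0x1F 0x8B and 0xFF 0xD8 (JPEG).
INCOMPRESSIBLE_PREFIXES = (b"\x1f\x8b", b"\xff\xd8")


def choose_level(literal_start: bytes, normal_level: int = 3) -> int:
    """Return 0 (store only) for literals that look already compressed."""
    if literal_start[:2] in INCOMPRESSIBLE_PREFIXES:
        return 0
    return normal_level
```

The check needs only the first two octets of the literal, so it can be made before any of the literal has been handed to the compressor.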
Formal Syntax Note that when using TLS, compression may actually decrease the CPU
usage, depending on which algorithms are used in TLS. This is
because fewer bytes need to be encrypted, and encryption is
generally more expensive than compression.
5. Formal Syntax
The following syntax specification uses the Augmented Backus-Naur The following syntax specification uses the Augmented Backus-Naur
Form (ABNF) notation as specified in [ABNF]. Non-terminals Form (ABNF) notation as specified in [ABNF]. Non-terminals
referenced but not defined below are as defined by [ABNF] (SP, CRLF) referenced but not defined below are as defined by [ABNF] (SP, CRLF)
or [IMAP] (all others). or [IMAP] (all others).
Except as noted otherwise, all alphabetic characters are case- Except as noted otherwise, all alphabetic characters are case-
insensitive. The use of upper or lower case characters to define insensitive. The use of upper or lower case characters to define
token strings is for editorial clarity only. Implementations MUST token strings is for editorial clarity only. Implementations MUST
accept these strings in a case-insensitive fashion. accept these strings in a case-insensitive fashion.
command-any =/ compress command-any =/ compress
compress = "COMPRESS" SP algorithm SP ( "UP" / "DOWN" / compress = "COMPRESS" SP algorithm
"BOTH" )
algorithm = "DEFLATE" algorithm = "DEFLATE"
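As an illustration of the grammar in the -02 column (no direction argument), a simplified Python parser might look like the following. The function name and error messages are hypothetical, and the tag is not checked against the full tag syntax in [IMAP]; a server would presumably answer NO for an unsupported algorithm and BAD for the other failures.

```python
def parse_compress(line: bytes):
    """Parse '<tag> COMPRESS <algorithm>' CRLF.

    Returns (tag, algorithm) on success and raises ValueError otherwise.
    """
    if not line.endswith(b"\r\n"):
        raise ValueError("command must end with CRLF")
    parts = line[:-2].split(b" ")
    if len(parts) != 3 or not parts[0]:
        raise ValueError("expected: tag SP COMPRESS SP algorithm")
    tag, command, algorithm = parts
    # Token strings are matched case-insensitively.
    if command.upper() != b"COMPRESS":
        raise ValueError("not a COMPRESS command")
    if algorithm.upper() != b"DEFLATE":
        raise ValueError("unsupported algorithm")
    return tag.decode("ascii"), algorithm.upper().decode("ascii")
```

A command that still carries a direction argument from the -01 grammar is rejected by this sketch, because it no longer has exactly three space-separated parts.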
Security considerations 6. Security Considerations
(As for [TLSCOMP] RFC 3749.) As for [TLSCOMP] RFC 3749.
IANA Considerations 7. IANA Considerations
The IANA is requested to add COMPRESS=DEFLATE to the list of IMAP The IANA is requested to add COMPRESS=DEFLATE to the list of IMAP
extensions. extensions.
Credits 8. Acknowledgements
Quite a few people on the LEMONADE mailing list have offered
comments, including Dave Cridland, Ned Freed and Tony Hansen. And
various people in the rooms at meetings. Send me mail, I'll add you.
Open Issues Eric Burger, Dave Cridland, Tony Finch, Ned Freed, Philip Guenther,
Randall Gellens, Tony Hansen, Alexey Melnikov, Lyndon Nerenberg and
Zoltan Ordogh have all helped with this document.
Both ends can already disable compression at any point by calling The author would also like to thank various people in the rooms at
deflateParams(). The only missing feature is for the client to meetings, whose help is real, but not reflected in the author's
request that the server stop compressing - are there use-cases for mailbox.
that? It requires adding more server-side state, so I'm wary.
What text and numbers are needed wrt. compression levels? A bit of 9. References
solid information is not amiss.
Normative References 9.1. Normative References
[ABNF] Crocker, Overell, "Augmented BNF for Syntax [ABNF] Crocker, Overell, "Augmented BNF for Syntax
Specifications: ABNF", RFC 2234, Internet Mail Specifications: ABNF", RFC 4234, Brandenburg
Consortium, Demon Internet Ltd, November 1997. Internetworking, Demon Internet Ltd, October 2005.
[IMAP] Crispin, "Internet Message Access Protocol - Version [IMAP] Crispin, "Internet Message Access Protocol - Version
4rev1", RFC 3501, University of Washington, June 2003. 4rev1", RFC 3501, University of Washington, June 2003.
[KEYWORDS] Bradner, "Key words for use in RFCs to Indicate [KEYWORDS] Bradner, "Key words for use in RFCs to Indicate
Requirement Levels", RFC 2119, Harvard University, March Requirement Levels", RFC 2119, Harvard University, March
1997. 1997.
[DEFLATE] Deutsch, "DEFLATE Compressed Data Format Specification [DEFLATE] Deutsch, "DEFLATE Compressed Data Format Specification
version 1.3", RFC 1951, Aladdin Enterprises, May 1996. version 1.3", RFC 1951, Aladdin Enterprises, May 1996.
[STARTTLS] Newman, C. "Using TLS with IMAP, POP3 and ACAP", RFC [STARTTLS] Newman, C. "Using TLS with IMAP, POP3 and ACAP", RFC
2595, June 1999. 2595, June 1999.
Informative References 9.2. Informative References
[TLSCOMP] Hollenbeck, "Transport Layer Security Protocol [TLSCOMP] Hollenbeck, "Transport Layer Security Protocol
Compression Methods", RFC 3749, VeriSign, May 2004. Compression Methods", RFC 3749, VeriSign, May 2004.
Author's Address [BINARY] Nerenberg, "IMAP4 Binary Content Extension", RFC 3516,
Orthanc Systems, April 2003.
10. Author's Address
Arnt Gulbrandsen Arnt Gulbrandsen
Oryx Mail Systems GmbH Oryx Mail Systems GmbH
Schweppermannstr. 8 Schweppermannstr. 8
D-81671 Muenchen D-81671 Muenchen
Germany Germany
Fax: +49 89 4502 9758 Fax: +49 89 4502 9758
Email: arnt@oryx.com Email: arnt@oryx.com
11. Open Issues
What text and numbers are needed wrt. compression levels? A bit of
solid information is not amiss.
Intellectual Property Statement Intellectual Property Statement
The IETF takes no position regarding the validity or scope of any The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to Intellectual Property Rights or other rights that might be claimed
pertain to the implementation or use of the technology described in this to pertain to the implementation or use of the technology described
document or the extent to which any license under such rights might or in this document or the extent to which any license under such
might not be available; nor does it represent that it has made any rights might or might not be available; nor does it represent that
independent effort to identify any such rights. Information on the it has made any independent effort to identify any such rights.
procedures with respect to rights in RFC documents can be found in BCP 78 Information on the procedures with respect to rights in RFC
and BCP 79. documents can be found in BCP 78 and BCP 79.
Copies of IPR disclosures made to the IETF Secretariat and any assurances Copies of IPR disclosures made to the IETF Secretariat and any
of licenses to be made available, or the result of an attempt made to assurances of licenses to be made available, or the result of an
obtain a general license or permission for the use of such proprietary attempt made to obtain a general license or permission for the use
rights by implementers or users of this specification can be obtained from of such proprietary rights by implementers or users of this
the IETF on-line IPR repository at http://www.ietf.org/ipr. specification can be obtained from the IETF on-line IPR repository
at http://www.ietf.org/ipr.
The IETF invites any interested party to bring to its attention any The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary rights copyrights, patents or patent applications, or other proprietary
that may cover technology that may be required to implement this standard. rights that may cover technology that may be required to implement
Please address the information to the IETF at ietf-ipr@ietf.org. this standard. Please address the information to the IETF at ietf-
ipr@ietf.org.
Copyright Statement Copyright Statement
Copyright (C) The Internet Society (2006). Copyright (C) The Internet Society (2006).
This document is subject to the rights, licenses and restrictions This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors retain contained in BCP 78, and except as set forth therein, the authors
all their rights. retain all their rights.
Disclaimer of Validity Disclaimer of Validity
This document and the information contained herein are provided on an "AS This document and the information contained herein are provided on
IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET ENGINEERING REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
FITNESS FOR A PARTICULAR PURPOSE. WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Acknowledgment Acknowledgment
Funding for the RFC Editor function is currently provided by the Internet Funding for the RFC Editor function is currently provided by the
Society. Internet Society.
 End of changes. 48 change blocks. 
112 lines changed or deleted 128 lines changed or added
