idnits 2.17.1 draft-sabin-lzs-payload-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-26) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. ** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. 
== No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 6 longer pages, the longest (page 2) being 59 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** The abstract seems to contain references ([Atkins96], [RFC-1974]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 191: '... sender MUST reset the compres...' RFC 2119 keyword, line 192: '... Thus, the HIST_RESET bit MUST be set...' RFC 2119 keyword, line 199: '... The sender MUST flush the compre...' RFC 2119 keyword, line 229: '...n implementation SHOULD monitor the re...' RFC 2119 keyword, line 231: '...se, the uncompressed payload SHOULD be...' Miscellaneous warnings: ---------------------------------------------------------------------------- -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) 
-- Couldn't find a document date in the document -- date freshness check skipped. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Possible downref: Non-RFC (?) normative reference: ref. 'ANSI94' -- Possible downref: Non-RFC (?) normative reference: ref. 'Atkins96' -- Possible downref: Non-RFC (?) normative reference: ref. 'Calgary' ** Downref: Normative reference to an Informational RFC: RFC 1974 Summary: 10 errors (**), 0 flaws (~~), 2 warnings (==), 5 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet Draft November 1996 (Expires May 1997) 4 M. Sabin, Consultant 5 R. Monsour, Hi/fn Inc. 7 LZS Payload Compression Transform for ESP 8 10 Status of this Memo 12 This document is an Internet-Draft. Internet-Drafts are working 13 documents of the Internet Engineering Task Force (IETF), its areas, 14 and its working groups. Note that other groups may also distribute 15 working documents as Internet-Drafts. 17 Internet-Drafts are draft documents valid for a maximum of six months 18 and may be updated, replaced, or obsoleted by other documents at any 19 time. It is inappropriate to use Internet-Drafts as reference 20 material or to cite them other than as "work in progress." 22 To learn the current status of any Internet-Draft, please check the 23 "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow 24 Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), 25 munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or 26 ftp.isi.edu (US West Coast). 28 It is intended that a future version of this draft be submitted to 29 the IESG for publication as an Informational RFC. 
Comments about 30 this draft should be submitted to the authors or to the IPSEC mailing 31 list (ipsec@tis.com). 33 Abstract 35 This memo proposes a "payload compression transform" based on the LZS 36 compression algorithm. The transform can be used to compress the 37 payload field of an IP datagram that uses the Encapsulating Security 38 Payload (ESP) format. The compression transform proposed here is 39 stateless, meaning that a datagram can be decompressed independently 40 of any other datagram. Compression is performed prior to the 41 encryption operation of ESP, which has the side benefit of reducing 42 the amount of data that must be encrypted. 44 This memo anticipates a forthcoming draft of ESP that will supersede 45 [Atkins96]. The forthcoming draft will allow for ESP payloads to be 46 compressed via transforms such as the one described in this memo. 48 Sabin, et al [Page 1] 49 Acknowledgments 51 The LZS details presented here are similar to those in "PPP Stac LZS 52 Compression Protocol," by R. Friend and W. A. Simpson [RFC-1974]. 54 The authors wish to thank the many participants of the IPSEC mailing 55 list whose discussion made possible the architecture for integrating 56 compression with ESP. 58 Table of Contents 60 1. Introduction 61 2. Format of Transformed Payload 62 3. Compression Procedure 63 4. Decompression Procedure 64 5. Security Considerations 65 6. References 66 7. Authors' Addresses 67 8. Appendix: Compression Efficiency versus Datagram Size 69 1. Introduction 71 Encrypted data is random in nature and not compressible. When an IP 72 datagram is encrypted, compression methods used at lower protocol 73 layers -- e.g., PPP compression [RFC-1962] -- no longer work. If 74 both compression and encryption are desired, compression must be 75 performed first. 77 A side benefit of compressing the data first is that the amount of 78 data which must be encrypted is reduced.
In some implementations, 79 compression is done in hardware and encryption is done in software, 80 and this can represent a significant reduction in software overhead. 82 The Encapsulating Security Payload (ESP) format is well suited to 83 combining compression with encryption, for a variety of reasons: 85 - ESP is the place where encryption is applied to an IP datagram. 86 It is straightforward to precede the encryption step by an 87 optional compression step. The compression step transforms an 88 uncompressed ESP payload into a compressed ESP payload. This 89 "payload compression transform" can be independently defined and 90 used with any ESP transform. 92 - ESP provides a security parameters index (SPI) that links a 93 datagram to security parameters negotiated elsewhere. A 94 destination uses the SPI to look up the ESP transform needed to 95 decode an incoming datagram. If compression details are included 96 among the negotiated parameters, a destination can also use the 97 SPI to look up the compression transform needed to decode the ESP 99 Sabin, et al [Page 2] 100 payload. 102 This memo proposes a payload compression transform based on the LZS 103 compression algorithm. The transform can be used to compress any ESP 104 payload. The transform is stateless, meaning that the payload of a 105 datagram can be decompressed independently of any other datagram. 106 This is important in IP, where the delivery of individual datagrams 107 is not guaranteed. 109 1.1 Background of LZS Compression 111 The LZS algorithm is a lossless compression method that uses a 112 sliding window of 2,048 bytes. During compression, redundant 113 sequences of data are replaced with tokens that represent those 114 sequences. During decompression, the original sequences are 115 substituted for the tokens, in such a way that the original data 116 is exactly recovered.
LZS differs from lossy schemes, such as 117 those often used for video compression, that do not exactly 118 reproduce the original data. 120 Details of LZS formatting can be found in [ANSI94]. 122 The efficiency of the LZS algorithm depends on the degree of 123 redundancy in the original data. A typical compression ratio 124 is 2:1. LZS achieves a compression ratio of 2.34:1 on 125 the University of Calgary Text Compression Corpus [Calgary]. 127 1.2 Licensing 129 Source and object licenses for LZS are available on a 130 non-discriminatory basis. Hardware implementations are also 131 available. For more information, contact Hi/fn at the address 132 listed with the authors' addresses. 134 1.3 Requirements Terminology 136 In this document, the words that are used to define the 137 significance of each particular requirement are usually 138 capitalized. These words are: 140 - MUST: This word, or the adjective "REQUIRED," means that the 141 item is an absolute requirement of the specification. 143 - SHOULD: This word, or the adjective "RECOMMENDED," means 144 that there might exist valid reasons in particular 145 circumstances to ignore this item, but the full implications 146 should be understood and the case carefully weighed before 147 taking a different course. 149 Sabin, et al [Page 3] 150 - MAY: This word, or the adjective "OPTIONAL," means that the 151 item is truly optional. One vendor might choose to include the 152 item because of a particular marketplace requirement or because 153 it enhances the product, while another vendor might omit the 154 item. 156 2. Format of Transformed Payload 158 The input to the payload compression transform is a payload to be 159 encapsulated by ESP. 
The output of the transform is a new payload of the 160 following format:

162 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
163 |      CC       |                                               |
164 +-+-+-+-+-+-+-+-+                                               |
165 |                                                               |
166 |           Payload Data (compressed or uncompressed)           |
167 |                                                               |
168 |                                               +-+-+-+-+-+-+-+-|
169 |                                               |
170 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

172 2.1 Compression Control 174 The Compression Control (CC) field is a single, bit-mapped byte. 175 The bits are numbered 7 (most significant) to 0 (least 176 significant). The following bits are defined: 178 - COMPRESSED (bit 7) 180 This bit is set to 1 to indicate the payload is compressed. It 181 is cleared to 0 to indicate the payload is not compressed. 183 - HIST_RESET (bit 6) 185 This bit is set to 1 to indicate that the compression history 186 associated with this datagram's SPI was reset prior to 187 processing this datagram's payload. It is cleared to 0 to 188 indicate the compression history was not reset. 190 In order to make the transform stateless between datagrams, the 191 sender MUST reset the compression history prior to processing 192 each datagram's payload. Thus, the HIST_RESET bit MUST be set 193 to 1 in every datagram. (The HIST_RESET bit is defined here 194 for upwards compatibility with future transforms that may allow 196 Sabin, et al [Page 4] 197 statefulness.) 199 The sender MUST flush the compressor each time it transmits a 200 compressed datagram. Flushing means that all data going into 201 the compressor is included in the output, i.e., no data is held 202 back in the hope of achieving better compression. Flushing is 203 necessary to prevent a datagram's data from spilling over into 204 a later datagram. 206 2.2 Payload Data 208 The Payload Data is either compressed or uncompressed. The value 209 of the COMPRESSED bit of the CC field is set accordingly. The 210 Payload Data field is an integral number of bytes in length. 212 3.
Compression Procedure 214 The compression procedure consists of the following steps: 216 i) The sender resets the compression history and sets the 217 HIST_RESET bit of the CC field to 1. 219 ii) The sender decides whether or not to compress the payload. 221 - If the sender chooses to compress the payload, the LZS 222 algorithm is applied. The resulting compressed data is 223 formatted according to [ANSI94]. The COMPRESSED bit of the CC 224 field is set to 1. 226 - If the sender chooses not to compress the payload, the 227 COMPRESSED bit of the CC field is set to 0. 229 An implementation SHOULD monitor the results of the payload 230 compression operation and reject the operation if it results in 231 expansion. In such a case, the uncompressed payload SHOULD be 232 transmitted with the COMPRESSED bit set to 0. 234 After the payload has been transformed by these steps, the 235 transformed payload is submitted to the encode procedure of the 236 selected ESP transform. 238 4. Decompression Procedure 240 Prior to applying the decompression procedure, the decode procedure 241 of the selected ESP transform is applied to extract the payload. 243 The decompression procedure consists of the following steps: 245 Sabin, et al [Page 5] 246 i) The receiver checks the HIST_RESET bit of the CC field. If 247 HIST_RESET = 1, the decompression history is reset. If HIST_RESET 248 = 0, the datagram is discarded. 250 ii) The receiver checks the COMPRESSED bit of the CC field. If 251 COMPRESSED = 1, the LZS decompression algorithm is applied to the 252 payload data. If COMPRESSED = 0, decompression is not applied. 254 5. Security Considerations 256 Security issues are not discussed in this memo. 258 6. References 260 [ANSI94] American National Standards Institute, Inc., "Data 261 Compression Method for Information Systems," ANSI X3.241-1994, 262 August 1994. 264 [Atkins96] Atkinson, R., "IP Encapsulating Security Protocol," 265 RFC-xxxx, June 1996.
267 [Calgary] Text Compression Corpus, University of Calgary, available 268 at 269 ftp://ftp.cpsc.ucalgary.ca/pub/projects/text.compression.corpus. 271 [RFC-1962] Rand, D., "The PPP Compression Control Protocol (CCP)," 272 RFC-1962, June 1996. 274 [RFC-1974] Friend, R., and Simpson, W.A., "PPP Stac LZS Compression 275 Protocol," RFC-1974, August 1996. 277 7. Authors' Addresses 279 Michael Sabin 280 883 Mango Avenue 281 Sunnyvale, CA 94087 282 Email: mike.sabin@worldnet.att.net 284 Robert Monsour 285 Hi/fn Inc. 286 12636 High Bluff Drive 287 San Diego, CA 92130 288 Email: rmonsour@earthlink.net 290 8. Appendix: Compression Efficiency versus Datagram Size 292 The following table offers some guidance on the compression 294 Sabin, et al [Page 6] 295 efficiency that can be achieved as a function of datagram size. Each 296 entry in the table shows the compression ratio that was achieved when 297 the proposed transform was applied to a test file using datagrams of 298 a specified size. 300 The test file was the University of Calgary Text Compression Corpus 301 [Calgary]. The length of the file prior to compression was 3,278,000 302 bytes. When the entire file was compressed as a single payload, a 303 compression ratio of 2.34 resulted.

305 Datagram size,|  64   128   256   512  1024  2048  4096  8192 16384
306 bytes         |
307 --------------|----------------------------------------------------
308 Compression   |1.18  1.28  1.43  1.58  1.74  1.91  2.04  2.11  2.14
309 ratio         |

311 Sabin, et al [Page 7]
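Illustrative sketch (not part of the proposed transform): the CC byte handling of Sections 2 through 4 might be expressed roughly as follows. This is a minimal sketch under stated assumptions. Python's standard library has no LZS codec, so zlib stands in for the LZS compressor and decompressor. The function names transform and inverse_transform are hypothetical. Treating an empty input as a discard is likewise an assumption, since the memo does not address that case.

```python
import zlib
from typing import Callable, Optional

COMPRESSED = 0x80  # CC bit 7: payload data is compressed
HIST_RESET = 0x40  # CC bit 6: history was reset before this payload


def transform(payload: bytes,
              compress: Callable[[bytes], bytes] = zlib.compress) -> bytes:
    """Section 3 sketch: return CC byte followed by payload data.

    ASSUMPTION: zlib.compress stands in for LZS. Each call starts with an
    empty history and emits all of its output, which mirrors the reset and
    flush requirements of Section 2.1.
    """
    cc = HIST_RESET                  # step i: history is reset for every datagram
    body = compress(payload)         # step ii: attempt compression
    if len(body) < len(payload):
        cc |= COMPRESSED             # keep the compressed form only if it is smaller
    else:
        body = payload               # no gain: send uncompressed, COMPRESSED = 0
    return bytes([cc]) + body


def inverse_transform(data: bytes,
                      decompress: Callable[[bytes], bytes] = zlib.decompress
                      ) -> Optional[bytes]:
    """Section 4 sketch: return the payload, or None if the datagram is discarded."""
    if not data:
        return None                  # ASSUMPTION: missing CC byte is treated as a discard
    cc, body = data[0], data[1:]
    if not cc & HIST_RESET:
        return None                  # step i: HIST_RESET = 0, datagram is discarded
    if cc & COMPRESSED:
        return decompress(body)      # step ii: COMPRESSED = 1, decompress
    return body                      # COMPRESSED = 0, payload passes through
```

In this sketch a highly redundant payload comes back with both CC bits set, while a very short payload is sent uncompressed with only HIST_RESET set, because the stand-in compressor would expand it. A real implementation would pass LZS routines as the compress and decompress arguments.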