idnits 2.17.1 draft-ietf-pppext-bsd-compress-05.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-19) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** Missing expiration date. The document expiration date should appear on the first and last page. ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity. ** The document seems to lack a 1id_guidelines paragraph about the list of current Internet-Drafts. ** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. ** The document is more than 15 pages and seems to lack a Table of Contents. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There are 38 instances of too long lines in the document, the longest one being 6 characters in excess of 72. ** The abstract seems to contain references ([2], [1]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 112: '...or, the receiver MAY momentary force C...' RFC 2119 keyword, line 117: '... it SHOULD send a Reset-Request L...' RFC 2119 keyword, line 122: '... The receiver MUST discard all comp...' RFC 2119 keyword, line 130: '... receiver MUST clear its compress...' RFC 2119 keyword, line 139: '... MAY transmit an additional Reset...' (5 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 360 has weird spacing: '... u_int hsiz...' == Line 361 has weird spacing: '... u_char hshif...' == Line 362 has weird spacing: '... u_char n_bit...' == Line 363 has weird spacing: '... u_char debug...' == Line 364 has weird spacing: '... u_char unit;...' == (10 more instances...) -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (October 1995) is 10414 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: '0' is mentioned on line 655, but not defined -- Looks like a reference, but probably isn't: '--i' on line 597 == Unused Reference: '4' is defined on line 1127, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. '1' -- Possible downref: Non-RFC (?) normative reference: ref. '2' -- Possible downref: Non-RFC (?) normative reference: ref. '3' -- Possible downref: Non-RFC (?) normative reference: ref. '4' Summary: 12 errors (**), 0 flaws (~~), 9 warnings (==), 8 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group Schryver 3 Internet Draft 4 expires in six months October 1995 6 PPP BSD Compression Protocol 7 draft-ietf-pppext-bsd-compress-05.txt 9 Status of this Memo 11 This document is the product of the Point-to-Point Protocol Working 12 Group of the Internet Engineering Task Force (IETF). Comments should 13 be submitted to the ietf-ppp@merit.edu mailing list. 15 Distribution of this memo is unlimited. 17 This document is an Internet Draft. Internet Drafts are working 18 documents of the Internet Engineering Task Force (IETF), its Areas, 19 and its Working Groups. Note that other groups may also distribute 20 working documents as Internet Drafts. 22 Internet Drafts are draft documents valid for a maximum of six 23 months. Internet Drafts may be updated, replaced, or obsoleted by 24 other documents at any time. It is not appropriate to use Internet 25 Drafts as reference material or to cite them other than as a 26 ``working draft'' or ``work in progress.'' 28 Please check the 1id-abstracts.txt listing contained in the 29 internet-drafts Shadow Directories on nic.ddn.mil, nnsc.nsf.net, 30 nic.nordu.net, ftp.nisc.sri.com, or munnari.oz.au to learn the 31 current status of any Internet Draft. 33 Abstract 35 The Point-to-Point Protocol (PPP) [1] provides a standard method for 36 transporting multi-protocol datagrams over point-to-point links. 38 The PPP Compression Control Protocol [2] provides a method to 39 negotiate and utilize compression protocols over PPP encapsulated 40 links. 42 This document describes the use of the Unix Compress compression 43 protocol for compressing PPP encapsulated packets. 45 1. Introduction 47 UNIX compress as embodied in the freely and widely distributed BSD 48 source has the following features: 50 - dynamic table clearing when compression becomes less effective. 52 - automatic turning off of compression when the overall result 53 is not smaller than the input. 55 - dynamic choice of code width within predetermined limits. 57 - heavily used for many years in networks, on modem and other 58 point-to-point links to transfer netnews. 60 - an effective code width requires less than 64KBytes of memory 61 on both sender and receive. 63 1.1. Licensing 65 BSD Unix compress command source is widely and freely available, with 66 no additional license for many computer vendors. The included source 67 code is based on the BSD compress command source and carries only the 68 copyright of The Regents of the University of California. Use the 69 code entirely at your own risk. It has no warranties or 70 indemnifications of any sort. Note that there are patents on LZW. 72 2. BSD Compress Packets 74 Before any BSD Compress packets may be communicated, PPP must reach 75 the Network-Layer Protocol phase, and the CCP Control Protocol must 76 reach the Opened state. 78 Exactly one BSD Compress datagram is encapsulated in the PPP 79 Information field, where the PPP Protocol field contains 0xFD or 80 0xFB. 0xFD is used when the PPP multilink protocol is not used or 81 "above" multilink. 0xFB is used "below" multilink, to compress 82 independently on individual links of a multilink bundle. 84 The maximum length of the BSD Compress datagram transmitted over a 85 PPP link is the same as the maximum length of the Information field 86 of a PPP encapsulated packet. 88 Only packets with PPP Protocol numbers in the range 0x0000 to 0x3FFF 89 and neither 0xFD nor 0xFB are compressed. Other PPP packets are 90 always sent uncompressed. Control packets are infrequent and should 91 not be compressed for robustness. 93 Padding 95 BSD Compress packets require the previous negotiation of the 96 Self-Describing-Padding Configuration Option [3] if padding is 97 added to packets. If no padding is added, than Self-Describing- 98 Padding is not required. 100 Reliability and Sequencing 102 BSD Compress requires the packets to be delivered in sequence. It 103 relies on Reset-Request and Reset-Ack LCP packets or on 104 renegotiation of the Compression Control Protocol [2] to indicate 105 loss of synchronization between the transmitter and receiver. The 106 LCP FCS detects corrupted packets and the normal mechanisms 107 discard them. Missing or out of order packets are detected by the 108 sequence number in each packet. The packet sequence number ought 109 to be checked before decoding the packet. 111 Instead of transmitting a Reset-Request packet when detecting a 112 decompression error, the receiver MAY momentary force CCP to drop 113 out of the Opened state by transmitting a new CCP Configure- 114 Request. This method is more expensive than using Reset-Requests. 116 When the receiver first encounters an unexpected sequence number 117 it SHOULD send a Reset-Request LCP packet as defined in the 118 Compression Control Protocol. When the transmitter sends the 119 Reset-Ack or when the receiver receives a Reset-ACK, they must 120 reset the sequence number to zero, clear the compression 121 dictionary, and resume sending and receiving compressed packets. 122 The receiver MUST discard all compressed packets after detecting 123 an error and until it receives a Reset-Ack. This strategy can be 124 thought of as abandoning the transmission of one "file" and 125 starting the transmission of a new "file." 127 The transmitter must clear its compression dictionary and respond 128 with a Reset-Ack each time it receives a Reset-Request, because it 129 cannot know if previous Reset-Acks reached the receiver. The 130 receiver MUST clear its compression dictionary each time it 131 receives a Reset-Ack, because the transmitter will have cleared 132 its compression dictionary. 134 When the link is busy, one decompression error is usually followed 135 by several more before the Reset-Ack can be received. It is 136 undesirable to transmit Reset-Requests more frequently than the 137 round-trip-time of the link, because redundant Reset-Requests 138 cause unnecessary compression dictionary clearing. The receiver 139 MAY transmit an additional Reset-Request each time it receives a 140 compressed or uncompressed packet until it finally receives a 141 Reset-Ack, but the receiver ought not transmit another Reset- 142 Request until the Reset-Ack for the previous one is late. The 143 receiver MUST transmit enough Reset-Request packets to ensure that 144 the transmitter receives at least one. For example, the receiver 145 might choose to not transmit another Reset-Request until after one 146 second (or, of course, a Reset-Ack has been received and 147 decompression resumed). 149 Data Expansion 151 When significant data expansion is detected, the PPP packet MUST 152 be sent without compression. Packets that would expand by fewer 153 than 3 bytes SHOULD be sent without compression, but MAY be sent 154 compressed provided the result does not exceed the MTU of the 155 link. This makes moot standards document exegesises about exactly 156 which bytes, such as the Protocol fields, count toward expansion. 158 When a packet is received with PPP Protocol numbers in the range 159 0x0000 to 0x3FFF, (except, of course, 0xFD and 0xFB) it is assumed 160 that the packet would have caused expansion. The packet is 161 locally compressed to update the compression history. 163 Sending incompressible packets in their native encapsulation 164 avoids maximum transmission unit complications. If uncompressed 165 packets could be larger than their native form, then it would be 166 necessary for the upper layers of an implementation to treat the 167 PPP link as if it had a smaller MTU, to ensure that compressed 168 incompressible packets are never larger than the negotiated PPP 169 MTU. 171 Using native encapsulation for incompressible packets complicates 172 the implementation. The transmitter and the receiver must start 173 putting information into the compression dictionary starting with 174 the same packets, without relying upon seeing a compressed packet 175 for synchronization. The first few packets after clearing the 176 dictionary are usually incompressible, and so are likely to sent 177 in their native encapsulation, just like packets before 178 compression is turned on. If CCP or LCP packets are handled 179 separately from Network-Layer packets (e.g. a "daemon" for control 180 packets and "kernel code" for data packets), care must be taken to 181 ensure that the transmitter synchronizes clearing the dictionary 182 with the transmission of the configure-ACK or Reset-Ack that 183 starts compression, and the receiver must similarly ensure that 184 its dictionary is cleared before it processes the next packet. 186 A difficulty caused by sending data that would expand uncompressed 187 is that the receiver must adaptively clear its dictionary at 188 precisely the same times as the sender. In the classic BSD 189 compression code, the dictionary clearing is signaled by the 190 reserved code 256. Because data that would expend is sent without 191 compression, there is no reliable way for the sender to signal 192 explicitly when it has cleared its dictionary. This difficulty is 193 resolved by specifying the parameters that control the dictionary 194 clearing, and having both sender and receiver clear their 195 dictionaries at the same times. 197 2.1. Packet Format 199 A summary of the BSD Compress packet format is shown below. 201 The fields are transmitted from left to right. 203 0 1 2 3 204 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 205 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 206 | PPP Protocol | Sequence 207 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 208 | Data ... 209 +-+-+-+-+-+-+-+-+ 211 PPP Protocol 213 The PPP Protocol field is described in the Point-to-Point Protocol 214 Encapsulation [1]. 216 When the BSD Compress compression protocol is successfully 217 negotiated by the PPP Compression Control Protocol [2], the value 218 of the protocol field is 0xFD or 0xFB. This value MAY be 219 compressed when Protocol-Field-Compression is negotiated. 221 Sequence 223 The sequence number is sent most significant octet first. It 224 starts at 0 when the dictionary is cleared, and is incremented by 225 1 for each packet, including uncompressed packets. The sequence 226 number after 65535 is zero. In other words, the sequence number 227 "wraps" in the usual way. 229 The sequence number ensures that lost or out of order packets do 230 not cause the compression databases of the peers to become 231 unsynchronized. When an unexpected sequence number is 232 encountered, the dictionaries must be resynchronized with a CCP 233 Reset-Request or Configure-Request. The packet sequence number 234 can be checked before a compressed packet is decoded. 236 Data 238 The compressed PPP encapsulated packet, consisting of the Protocol 239 and Data fields of the original, uncompressed packet follows. 241 The Protocol field compression MUST be applied to the protocol 242 field in the original packet before the sequence number is 243 computed or the entire packet is compressed, regardless of whether 244 the PPP protocol field compression has been negotiated. Thus, if 245 the original protocol number was less than 0x100, it must be 246 compressed to a single byte. 248 The format of the compressed data is more precisely described by 249 the example code in the "BSD Compress Algorithm" appendix. 251 3. Configuration Option Format 253 Description 255 The CCP BSD Compress Configuration Option negotiates the use of 256 BSD Compress on the link. By default or ultimate disagreement, no 257 compression is used. 259 A summary of the BSD Compress Configuration Option format is shown 260 below. The fields are transmitted from left to right. 262 0 1 2 263 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 264 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 265 | Type | Length | Vers| Dict | 266 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 268 Type 270 21 or 0x15 for BSD compress. 272 Length 274 3 276 Vers 278 Must be the binary number 001. 280 Dict 282 The size in bits of the largest code used. It can range from 9 to 283 16. A common choice is 12. The code included below can support 284 code sizes from 9 to 15. 286 It is convenient to treat the byte containing the Vers and Dict 287 fields as a single field with legal values ranging from 0x29 to 288 0x30. 290 Note that the peer receiving compressed data must use the same 291 code size as the peer sending data. It is not practical for the 292 receiver to use a larger dictionary or code size, because both 293 dictionaries must be cleared at the same time, even when the data 294 is not compressible, so that uncompressed packets are being sent, 295 and so the receiver cannot receive LZW "CLEAR" codes. 297 When a received Configure-Request specifies a smaller dictionary 298 than the local preference, it is often best to accept it instead 299 of using a Configure-Nak to ask the peer to specify a larger 300 dictionary. 302 A. BSD Compress Algorithm 304 This code is the core of a commercial workstation implementation. It 305 was derived by transliterating the 4.*BSD compress command. It is 306 unlikely to be of direct use in any system that does not have the 307 same mixture of mbufs and STREAMS buffers. It may need to be retuned 308 for CPU's other than RISC's with many registers and certain 309 addressing modes. However, the code is the most accurate and 310 unambiguous way of defining the changes to the BSD compress source 311 required to apply it to a stream instead of a file. 313 Note that it assumes a "short" contains 16 bits and an "int" contains 314 at least 32 bits. Where it would matter if more than 32 bits were in 315 an "int" or "long," __uint32_t is used instead. 317 /* Because this code is derived from the 4.3BSD compress source: 318 * 319 * 320 * Copyright (c) 1985, 1986 The Regents of the University of California. 321 * All rights reserved. 322 * 323 * This code is derived from software contributed to Berkeley by 324 * James A. Woods, derived from original work by Spencer Thomas 325 * and Joseph Orost. 326 * 327 * Redistribution and use in source and binary forms, with or without 328 * modification, are permitted provided that the following conditions 329 * are met: 330 * 1. Redistributions of source code must retain the above copyright 331 * notice, this list of conditions and the following disclaimer. 332 * 2. Redistributions in binary form must reproduce the above copyright 333 * notice, this list of conditions and the following disclaimer in the 334 * documentation and/or other materials provided with the distribution. 335 * 3. All advertising materials mentioning features or use of this software 336 * must display the following acknowledgement: 337 * This product includes software developed by the University of 338 * California, Berkeley and its contributors. 339 * 4. Neither the name of the University nor the names of its contributors 340 * may be used to endorse or promote products derived from this software 341 * without specific prior written permission. 342 * 343 * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND 344 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 345 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE 346 * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE 347 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 348 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS 349 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 350 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT 351 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY 352 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF 353 * SUCH DAMAGE. 354 */ 356 /* ***************** */ 358 struct bsd_db { 359 int totlen; /* length of this structure */ 360 u_int hsize; /* size of the hash table */ 361 u_char hshift; /* used in hash function */ 362 u_char n_bits; /* current bits/code */ 363 u_char debug; 364 u_char unit; 365 u_short mru; 366 u_short seqno; /* # of last byte of packet */ 367 u_int maxmaxcode; /* largest valid code */ 368 u_int max_ent; /* largest code in use */ 369 u_int in_count; /* uncompressed bytes */ 370 u_int bytes_out; /* compressed bytes */ 371 u_int ratio; /* recent compression ratio */ 372 u_int checkpoint; /* when to next check the ratio */ 373 int clear_count; /* times dictionary cleared */ 374 int incomp_count; /* incompressible packets */ 375 int decomp_count; /* packets decompressed */ 376 int overshoot; /* excess decompression buf */ 377 int undershoot; /* insufficient decomp. buf */ 378 u_short *lens; /* array of lengths of codes */ 379 struct bsd_dict { 380 union { /* hash value */ 381 __uint32_t fcode; 382 struct { 383 #ifdef BSD_LITTLE_ENDIAN 384 u_short prefix; /* preceding code */ 385 u_char suffix; /* last character of new code */ 386 u_char pad; 387 #else 388 u_char pad; 389 u_char suffix; /* last character of new code */ 390 u_short prefix; /* preceding code */ 391 #endif 392 } hs; 393 } f; 394 u_short codem1; /* output of hash table -1 */ 395 u_short cptr; /* map code to hash table entry */ 396 } dict[1]; 397 }; 398 #define BSD_OVHD (2+2) /* BSD compress overhead/packet */ 399 #define MIN_BSD_BITS 9 400 #define MAX_BSD_BITS 15 /* implementation limit */ 401 #define BSD_VERS 1 /* when shifted */ 402 #ifdef _KERNEL 403 extern struct bsd_db *pf_bsd_init(struct bsd_db*, int, int, int); 404 extern int pf_bsd_comp(struct bsd_db*,u_char*,int,struct mbuf*,int); 405 extern mblk_t* pf_bsd_decomp(struct bsd_db*, mblk_t*); 406 extern void pf_bsd_incomp(struct bsd_db*, mblk_t*, u_int); 407 #endif 409 /* ***************** */ 410 /* PPP "BSD compress" compression 411 * The differences between this compression and the classic BSD LZW 412 * source are obvious from the requirement that the classic code worked 413 * with files while this handles arbitrarily long streams that 414 * are broken into packets. They are: 415 * 416 * When the code size expands, a block of junk is not emitted by 417 * the compressor and not expected by the decompressor. 418 * 419 * New codes are not necessarily assigned every time an old 420 * code is output by the compressor. This is because a packet 421 * end forces a code to be emitted, but does not imply that a 422 * new sequence has been seen. 423 * 424 * The compression ratio is checked at the first end of a packet 425 * after the appropriate gap. Besides simplifying and speeding 426 * things up, this makes it more likely that the transmitter 427 * and receiver will agree when the dictionary is cleared when 428 * compression is not going well. 429 */ 431 /* 432 * the next two codes should not be changed lightly, as they must not 433 * lie within the contiguous general code space. 434 */ 435 #define CLEAR 256 /* table clear output code */ 436 #define FIRST 257 /* first free entry */ 437 #define LAST 255 439 #define BSD_INIT_BITS MIN_BSD_BITS 441 #define MAXCODE(b) ((1 << (b)) - 1) 442 #define BADCODEM1 MAXCODE(MAX_BSD_BITS); 443 #define BSD_HASH(prefix,suffix,hshift) ((((__uint32_t)(suffix)) << (hshift)) \ 444 ^ (__uint32_t)(prefix)) 445 #define BSD_KEY(prefix,suffix) ((((__uint32_t)(suffix)) << 16) \ 446 + (__uint32_t)(prefix)) 448 #define CHECK_GAP 10000 /* Ratio check interval */ 450 #define RATIO_SCALE_LOG 8 451 #define RATIO_SCALE (1<>RATIO_SCALE_LOG) 454 /* clear the dictionary 455 */ 456 static void 457 pf_bsd_clear(struct bsd_db *db) 458 { 459 db->clear_count++; 460 db->max_ent = FIRST-1; 461 db->n_bits = BSD_INIT_BITS; 462 db->ratio = 0; 463 db->bytes_out = 0; 464 db->in_count = 0; 465 db->incomp_count = 0; 466 db->decomp_count = 0; 467 db->overshoot = 0; 468 db->undershoot = 0; 469 db->checkpoint = CHECK_GAP; 470 } 472 /* If the dictionary is full, then see if it is time to reset it. 473 * 474 * Compute the compression ratio using fixed-point arithmetic 475 * with 8 fractional bits. 476 * 477 * Since we have an infinite stream instead of a single file, 478 * watch only the local compression ratio. 479 * 480 * Since both peers must reset the dictionary at the same time even in 481 * the absence of CLEAR codes (while packets are incompressible), they 482 * must compute the same ratio. 483 */ 484 static int /* 1=output CLEAR */ 485 pf_bsd_check(struct bsd_db *db) 486 { 487 register u_int new_ratio; 489 if (db->in_count >= db->checkpoint) { 490 /* age the ratio by limiting the size of the counts */ 491 if (db->in_count >= RATIO_MAX 492 || db->bytes_out >= RATIO_MAX) { 493 db->in_count -= db->in_count/4; 494 db->bytes_out -= db->bytes_out/4; 495 } 497 db->checkpoint = db->in_count + CHECK_GAP; 499 if (db->max_ent >= db->maxmaxcode) { 500 /* Reset the dictionary only if the ratio is worse, 501 * or if it looks as if it has been poisoned 502 * by incompressible data. 503 * 504 * This does not overflow, because 505 * db->in_count <= RATIO_MAX. 506 */ 507 new_ratio = db->in_count<bytes_out != 0) 509 new_ratio /= db->bytes_out; 511 if (new_ratio < db->ratio 512 || new_ratio < 1*RATIO_SCALE) { 513 pf_bsd_clear(db); 514 return 1; 515 } 516 db->ratio = new_ratio; 517 } 518 } 519 return 0; 520 } 522 /* Initialize the database. 523 */ 524 struct bsd_db * 525 pf_bsd_init(struct bsd_db *db, /* initialize this database */ 526 int unit, /* for debugging */ 527 int bits, /* size of LZW code word */ 528 int mru) /* MRU for input, 0 for output */ 529 { 530 register int i; 531 register u_short *lens; 532 register u_int newlen, hsize, hshift, maxmaxcode; 534 switch (bits) { 535 case 9: /* needs 82152 for both directions */ 536 case 10: /* needs 84144 */ 537 case 11: /* needs 88240 */ 538 case 12: /* needs 96432 */ 539 hsize = 5003; 540 hshift = 4; 541 break; 542 case 13: /* needs 176784 */ 543 hsize = 9001; 544 hshift = 5; 545 break; 546 case 14: /* needs 353744 */ 547 hsize = 18013; 548 hshift = 6; 549 break; 550 case 15: /* needs 691440 */ 551 hsize = 35023; 552 hshift = 7; 553 break; 554 case 16: /* needs 1366160--far too much, */ 555 /* hsize = 69001; */ /* and 69001 is too big for cptr */ 556 /* hshift = 8; */ /* in struct bsd_db */ 557 /* break; */ 558 default: 559 if (db) { 560 if (db->lens) 561 kern_free(db->lens); 562 kern_free(db); 563 } 564 return 0; 565 } 566 maxmaxcode = MAXCODE(bits); 567 newlen = sizeof(*db) + (hsize-1)*(sizeof(db->dict[0])); 569 if (db) { 570 lens = db->lens; 571 if (db->totlen != newlen) { 572 if (lens) 573 kern_free(lens); 574 kern_free(db); 575 db = 0; 576 } 577 } 578 if (!db) { 579 db = (struct bsd_db*)kern_malloc(newlen); 580 if (!db) 581 return 0; 582 if (mru == 0) { 583 lens = 0; 584 } else { 585 lens = (u_short*)kern_malloc((maxmaxcode+1) 586 * sizeof(*lens)); 587 if (!lens) { 588 kern_free(db); 589 return 0; 590 } 591 i = LAST+1; 592 while (i != 0) 593 lens[--i] = 1; 594 } 595 i = hsize; 596 while (i != 0) { 597 db->dict[--i].codem1 = BADCODEM1; 598 db->dict[i].cptr = 0; 599 } 600 } 602 bzero(db,sizeof(*db)-sizeof(db->dict)); 603 db->lens = lens; 604 db->unit = unit; 605 db->mru = mru; 606 db->hsize = hsize; 607 db->hshift = hshift; 608 db->maxmaxcode = maxmaxcode; 609 db->clear_count = -1; 611 pf_bsd_clear(db); 613 return db; 614 } 616 /* compress a packet 617 * Assume the protocol is known to be >= 0x21 and < 0xff. 618 * One change from the BSD compress command is that when the 619 * code size expands, we do not output a bunch of padding. 620 */ 621 int /* new slen */ 622 pf_bsd_comp(struct bsd_db *db, 623 u_char *cp_buf, /* compress into here */ 624 int proto, /* this original PPP protocol */ 625 struct mbuf *m, /* from here */ 626 int slen) 627 { 628 register int hshift = db->hshift; 629 register u_int max_ent = db->max_ent; 630 register u_int n_bits = db->n_bits; 631 register u_int bitno = 32; 632 register __uint32_t accum = 0; 633 register struct bsd_dict *dictp; 634 register __uint32_t fcode; 635 register u_char c; 636 register int hval, disp, ent; 637 register u_char *rptr, *wptr; 638 struct mbuf *n; 640 #define OUTPUT(ent) { \ 641 bitno -= n_bits; \ 642 accum |= ((ent) << bitno); \ 643 do { \ 644 *wptr++ = accum>>24; \ 645 accum <<= 8; \ 646 bitno += 8; \ 647 } while (bitno <= 24); \ 648 } 650 /* start with the protocol byte */ 651 ent = proto; 652 db->in_count++; 654 /* install sequence number */ 655 cp_buf[0] = db->seqno>>8; 656 cp_buf[1] = db->seqno; 657 db->seqno++; 659 wptr = &cp_buf[2]; 660 slen = m->m_len; 661 db->in_count += slen; 662 rptr = mtod(m, u_char*); 663 n = m->m_next; 664 for (;;) { 665 if (slen == 0) { 666 if (!n) 667 break; 668 slen = n->m_len; 669 rptr = mtod(n, u_char*); 670 n = n->m_next; 671 if (!slen) 672 continue; /* handle 0-length buffers */ 673 db->in_count += slen; 674 } 676 slen--; 677 c = *rptr++; 678 fcode = BSD_KEY(ent,c); 679 hval = BSD_HASH(ent,c,hshift); 680 dictp = &db->dict[hval]; 682 /* Validate and then check the entry. */ 683 if (dictp->codem1 >= max_ent) 684 goto nomatch; 685 if (dictp->f.fcode == fcode) { 686 ent = dictp->codem1+1; 687 continue; /* found (prefix,suffix) */ 688 } 690 /* continue probing until a match or invalid entry */ 691 disp = (hval == 0) ? 1 : hval; 692 do { 693 hval += disp; 694 if (hval >= db->hsize) 695 hval -= db->hsize; 696 dictp = &db->dict[hval]; 697 if (dictp->codem1 >= max_ent) 698 goto nomatch; 699 } while (dictp->f.fcode != fcode); 700 ent = dictp->codem1+1; /* finally found (prefix,suffix) */ 701 continue; 703 nomatch: 704 OUTPUT(ent); /* output the prefix */ 706 /* code -> hashtable */ 707 if (max_ent < db->maxmaxcode) { 708 struct bsd_dict *dictp2; 709 /* expand code size if needed */ 710 if (max_ent >= MAXCODE(n_bits)) 711 db->n_bits = ++n_bits; 713 /* Invalidate old hash table entry using 714 * this code, and then take it over. 715 */ 716 dictp2 = &db->dict[max_ent+1]; 717 if (db->dict[dictp2->cptr].codem1 == max_ent) 718 db->dict[dictp2->cptr].codem1 = BADCODEM1; 719 dictp2->cptr = hval; 720 dictp->codem1 = max_ent; 721 dictp->f.fcode = fcode; 723 db->max_ent = ++max_ent; 724 } 725 ent = c; 726 } 727 OUTPUT(ent); /* output the last code */ 728 db->bytes_out += (wptr-&cp_buf[2] /* count complete bytes */ 729 + (32-bitno+7)/8); 731 if (pf_bsd_check(db)) 732 OUTPUT(CLEAR); /* do not count the CLEAR */ 734 /* Pad dribble bits of last code with ones. 735 * Do not emit a completely useless byte of ones. 736 */ 737 if (bitno != 32) 738 *wptr++ = (accum | (0xff << (bitno-8))) >> 24; 740 /* Increase code size if we would have without the packet 741 * boundary and as the decompressor will. 742 */ 743 if (max_ent >= MAXCODE(n_bits) 744 && max_ent < db->maxmaxcode) 745 db->n_bits++; 747 return (wptr - cp_buf); 748 #undef OUTPUT 749 } 751 /* Update the "BSD Compress" dictionary on the receiver for 752 * incompressible data by pretending to compress the incoming data. 753 */ 754 void 755 pf_bsd_incomp(struct bsd_db *db, 756 mblk_t *dmsg, 757 u_int ent) /* start with the protocol byte */ 758 { 759 register u_int hshift = db->hshift; 760 register u_int max_ent = db->max_ent; 761 register u_int n_bits = db->n_bits; 762 register struct bsd_dict *dictp; 763 register __uint32_t fcode; 764 register u_char c; 765 register int hval, disp; 766 register int slen; 767 register u_int bitno = 7; 768 register u_char *rptr; 770 db->incomp_count++; 772 db->in_count++; /* count the protocol as 1 byte */ 773 db->seqno++; 774 rptr = dmsg->b_rptr+PPP_BUF_HEAD_INFO; 775 for (;;) { 776 slen = dmsg->b_wptr - rptr; 777 if (slen == 0) { 778 dmsg = dmsg->b_cont; 779 if (!dmsg) 780 break; 781 rptr = dmsg->b_rptr; 782 continue; /* skip zero-length buffers */ 783 } 784 db->in_count += slen; 786 do { 787 c = *rptr++; 788 fcode = BSD_KEY(ent,c); 789 hval = BSD_HASH(ent,c,hshift); 790 dictp = &db->dict[hval]; 792 /* validate and then check the entry */ 793 if (dictp->codem1 >= max_ent) 794 goto nomatch; 795 if (dictp->f.fcode == fcode) { 796 ent = dictp->codem1+1; 797 continue; /* found (prefix,suffix) */ 798 } 800 /* continue probing until a match or invalid entry */ 801 disp = (hval == 0) ? 1 : hval; 802 do { 803 hval += disp; 804 if (hval >= db->hsize) 805 hval -= db->hsize; 806 dictp = &db->dict[hval]; 807 if (dictp->codem1 >= max_ent) 808 goto nomatch; 809 } while (dictp->f.fcode != fcode); 810 ent = dictp->codem1+1; 811 continue; /* finally found (prefix,suffix) */ 813 nomatch: /* output (count) the prefix */ 814 bitno += n_bits; 816 /* code -> hashtable */ 817 if (max_ent < db->maxmaxcode) { 818 struct bsd_dict *dictp2; 819 /* expand code size if needed */ 820 if (max_ent >= MAXCODE(n_bits)) 821 db->n_bits = ++n_bits; 822 /* Invalidate previous hash table entry 823 * assigned this code, and then take it over. 824 */ 825 dictp2 = &db->dict[max_ent+1]; 826 if (db->dict[dictp2->cptr].codem1 == max_ent) 827 db->dict[dictp2->cptr].codem1 = BADCODEM1; 828 dictp2->cptr = hval; 829 dictp->codem1 = max_ent; 830 dictp->f.fcode = fcode; 832 db->max_ent = ++max_ent; 833 db->lens[max_ent] = db->lens[ent]+1; 834 } 835 ent = c; 836 } while (--slen != 0); 837 } 838 bitno += n_bits; /* output (count) the last code */ 839 db->bytes_out += bitno/8; 841 (void)pf_bsd_check(db); 843 /* Increase code size if we would have without the packet 844 * boundary and as the decompressor will. 845 */ 846 if (max_ent >= MAXCODE(n_bits) 847 && max_ent < db->maxmaxcode) 848 db->n_bits++; 849 } 851 /* Decompress "BSD Compress" 852 */ 853 mblk_t* /* 0=failed, so zap CCP */ 854 pf_bsd_decomp(struct bsd_db *db, 855 mblk_t *cmsg) 856 { 857 register u_int max_ent = db->max_ent; 858 register __uint32_t accum = 0; 859 register u_int bitno = 32; /* 1st valid bit in accum */ 860 register u_int n_bits = db->n_bits; 861 register u_int tgtbitno = 32-n_bits; /* bitno when accum full */ 862 register struct bsd_dict *dictp; 863 register int explen, i; 864 register u_int incode, oldcode, finchar; 865 register u_char *p, *rptr, *rptr9, *wptr0, *wptr; 866 mblk_t *dmsg, *dmsg1, *bp; 868 db->decomp_count++; 869 rptr = cmsg->b_rptr; 870 ASSERT(cmsg->b_wptr >= rptr+PPP_BUF_MIN); 871 ASSERT(PPP_BUF_ALIGN(rptr)); 872 rptr += PPP_BUF_MIN; 874 /* get the sequence number */ 875 i = 0; 876 explen = 2; 877 do { 878 while (rptr >= cmsg->b_wptr) { 879 bp = cmsg; 880 cmsg = cmsg->b_cont; 881 freeb(bp); 882 if (!cmsg) { 883 if (db->debug) 884 printf("bsd_decomp%d: missing" 885 " %d header bytes\n", 886 db->unit, explen); 887 return 0; 888 } 889 rptr = cmsg->b_rptr; 890 } 891 i = (i << 8) + *rptr++; 892 } while (--explen != 0); 893 if (i != db->seqno++) { 894 freemsg(cmsg); 895 if (db->debug) 896 printf("bsd_decomp%d: bad sequence number 0x%x" 897 " instead of 0x%x\n", 898 db->unit, i, db->seqno-1); 899 return 0; 900 } 902 /* Guess how much memory we will need. Assume this packet was 903 * compressed by at least 1.5X regardless of the recent ratio. 904 */ 905 if (db->ratio > (RATIO_SCALE*3)/2) 906 explen = (msgdsize(cmsg)*db->ratio)/RATIO_SCALE; 907 else 908 explen = (msgdsize(cmsg)*3)/2; 909 if (explen > db->mru) 910 explen = db->mru; 912 dmsg = dmsg1 = allocb(explen+PPP_BUF_HEAD_INFO, BPRI_HI); 913 if (!dmsg1) { 914 freemsg(cmsg); 915 return 0; 916 } 917 wptr = dmsg1->b_wptr; 919 ((struct ppp_buf*)wptr)->type = BEEP_FRAME; 920 /* the protocol field must be compressed */ 921 ((struct ppp_buf*)wptr)->proto = 0; 922 wptr += PPP_BUF_HEAD_PROTO+1; 924 rptr9 = cmsg->b_wptr; 925 db->bytes_out += rptr9-rptr; 926 wptr0 = wptr; 927 explen = dmsg1->b_datap->db_lim - wptr; 928 oldcode = CLEAR; 929 for (;;) { 930 if (rptr >= rptr9) { 931 bp = cmsg; 932 cmsg = cmsg->b_cont; 933 freeb(bp); 934 if (!cmsg) /* quit at end of message */ 935 break; 936 rptr = cmsg->b_rptr; 937 rptr9 = cmsg->b_wptr; 938 db->bytes_out += rptr9-rptr; 939 continue; /* handle 0-length buffers */ 940 } 942 /* Accumulate bytes until we have a complete code. 943 * Then get the next code, relying on the 32-bit, 944 * unsigned accum to mask the result. 945 */ 946 bitno -= 8; 947 accum |= *rptr++ << bitno; 948 if (tgtbitno < bitno) 949 continue; 950 incode = accum >> tgtbitno; 951 accum <<= n_bits; 952 bitno += n_bits; 954 if (incode == CLEAR) { 955 /* The dictionary must only be cleared at 956 * the end of a packet. But there could be an 957 * empty message block at the end. 958 */ 959 if (rptr != rptr9 960 || cmsg->b_cont != 0) { 961 cmsg->b_rptr = rptr; 962 i = msgdsize(cmsg); 963 if (i != 0) { 964 freemsg(dmsg); 965 freemsg(cmsg); 966 if (db->debug) 967 printf("bsd_decomp%d: " 968 "bad CLEAR\n", 969 db->unit); 970 return 0; 971 } 972 } 973 pf_bsd_clear(db); 974 freemsg(cmsg); 975 wptr0 = wptr; 976 break; 977 } 979 /* Special case for KwKwK string. */ 980 if (incode > max_ent) { 981 if (incode > max_ent+2 982 || incode > db->maxmaxcode 983 || oldcode == CLEAR) { 984 freemsg(dmsg); 985 freemsg(cmsg); 986 if (db->debug) 987 printf("bsd_decomp%d: bad code %x\n", 988 db->unit, incode); 989 return 0; 990 } 991 i = db->lens[oldcode]; 992 /* do not write past end of buf */ 993 explen -= i+1; 994 if (explen < 0) { 995 db->undershoot -= explen; 996 db->in_count += wptr-wptr0; 997 dmsg1->b_wptr = wptr; 998 CK_WPTR(dmsg1); 999 explen = MAX(64,i+1); 1000 bp = allocb(explen, BPRI_HI); 1001 if (!bp) { 1002 freemsg(cmsg); 1003 freemsg(dmsg); 1004 return 0; 1005 } 1006 dmsg1->b_cont = bp; 1007 dmsg1 = bp; 1008 wptr0 = wptr = dmsg1->b_wptr; 1009 explen = dmsg1->b_datap->db_lim-wptr - (i+1); 1010 } 1011 p = (wptr += i); 1012 *wptr++ = finchar; 1013 finchar = oldcode; 1014 } else { 1015 i = db->lens[finchar = incode]; 1016 explen -= i; 1017 if (explen < 0) { 1018 db->undershoot -= explen; 1019 db->in_count += wptr-wptr0; 1020 dmsg1->b_wptr = wptr; 1021 CK_WPTR(dmsg1); 1022 explen = MAX(64,i); 1023 bp = allocb(explen, BPRI_HI); 1024 if (!bp) { 1025 freemsg(dmsg); 1026 freemsg(cmsg); 1027 return 0; 1028 } 1029 dmsg1->b_cont = bp; 1030 dmsg1 = bp; 1031 wptr0 = wptr = dmsg1->b_wptr; 1032 explen = dmsg1->b_datap->db_lim-wptr - i; 1033 } 1034 p = (wptr += i); 1035 } 1037 /* decode code and install in decompressed buffer */ 1038 while (finchar > LAST) { 1039 dictp = &db->dict[db->dict[finchar].cptr]; 1040 *--p = dictp->f.hs.suffix; 1041 finchar = dictp->f.hs.prefix; 1042 } 1043 *--p = finchar; 1045 /* If not first code in a packet, and 1046 * if not out of code space, then allocate a new code. 1047 * 1048 * Keep the hash table correct so it can be used 1049 * with uncompressed packets. 1050 */ 1051 if (oldcode != CLEAR 1052 && max_ent < db->maxmaxcode) { 1053 struct bsd_dict *dictp2; 1054 __uint32_t fcode; 1055 int hval, disp; 1057 fcode = BSD_KEY(oldcode,finchar); 1058 hval = BSD_HASH(oldcode,finchar,db->hshift); 1059 dictp = &db->dict[hval]; 1060 /* look for a free hash table entry */ 1061 if (dictp->codem1 < max_ent) { 1062 disp = (hval == 0) ? 1 : hval; 1063 do { 1064 hval += disp; 1065 if (hval >= db->hsize) 1066 hval -= db->hsize; 1067 dictp = &db->dict[hval]; 1068 } while (dictp->codem1 < max_ent); 1069 } 1071 /* Invalidate previous hash table entry 1072 * assigned this code, and then take it over 1073 */ 1074 dictp2 = &db->dict[max_ent+1]; 1075 if (db->dict[dictp2->cptr].codem1 == max_ent) { 1076 db->dict[dictp2->cptr].codem1 = BADCODEM1; 1077 } 1078 dictp2->cptr = hval; 1079 dictp->codem1 = max_ent; 1080 dictp->f.fcode = fcode; 1082 db->max_ent = ++max_ent; 1083 db->lens[max_ent] = db->lens[oldcode]+1; 1085 /* Expand code size if needed. 1086 */ 1087 if (max_ent >= MAXCODE(n_bits) 1088 && max_ent < db->maxmaxcode) { 1089 db->n_bits = ++n_bits; 1090 tgtbitno = 32-n_bits; 1091 } 1092 } 1093 oldcode = incode; 1094 } 1096 db->in_count += wptr-wptr0; 1097 dmsg1->b_wptr = wptr; 1098 CK_WPTR(dmsg1); 1100 db->overshoot += explen; 1102 /* Keep the checkpoint right so that incompressible packets 1103 * clear the dictionary at the right times. 1104 */ 1105 if (pf_bsd_check(db) 1106 && db->debug) { 1107 printf("bsd_decomp%d: peer should have " 1108 "cleared dictionary\n", db->unit); 1109 } 1111 return dmsg; 1112 } 1113 Security Considerations 1115 Security issues are not discussed in this memo. 1117 References 1119 [1] Simpson, W.A., "The Point-to-Point Protocol (PPP)", work in 1120 progress. 1122 [2] Rand, D., "The PPP Compression Control Protocol (CCP)", work in 1123 progress. 1125 [3] Simpson, W.A., "PPP LCP Extensions", work in progress. 1127 [4] Simpson, W.A., "PPP in HDLC Framing", work in progress. 1129 Acknowledgments 1131 William Simpson provided and supported the very valuable idea of not 1132 using any additional header bytes for incompressible packets. 1134 Chair's Address 1136 The working group can be contacted via the current chair: 1138 Fred Baker 1139 Advanced Computer Communications 1140 315 Bollay Drive 1141 Santa Barbara, California 93117 1143 EMail: fbaker@acc.com 1145 Author's Address 1147 Questions about this memo can also be directed to: 1149 Vernon Schryver 1150 2482 Lee Hill Drive 1151 Boulder, Colorado 80302 1153 EMail: vjs@rhyolite.com 1154 Table of Contents 1156 1. Introduction .......................................... 1 1157 1.1 Licensing ....................................... 1 1159 2. BSD Compress Packets .................................. 2 1160 2.1 Packet Format ................................... 4 1162 3. Configuration Option Format ........................... 6 1164 APPENDICES ................................................... 8 1166 A. BSD Compress Algorithm ................................ 8 1168 SECURITY CONSIDERATIONS ...................................... 26 1170 REFERENCES ................................................... 26 1172 ACKNOWLEDGEMENTS ............................................. 26 1174 CHAIR'S ADDRESS .............................................. 27 1176 AUTHOR'S ADDRESS ............................................. 27