idnits 2.17.1 

draft-ietf-rohc-sigcomp-user-guide-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 15.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 1891.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1868.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1875.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1881.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Line 1492 has weird spacing: '...ocation    pad...'

  == Line 1637 has weird spacing: '...ocation   pad ...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (October 24, 2005) is 6759 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: '0' is mentioned on line 413, but not defined

  == Unused Reference: '1' is defined on line 1725, but no explicit reference
     was found in the text

  == Unused Reference: '2' is defined on line 1729, but no explicit reference
     was found in the text

  ** Obsolete normative reference: RFC 3667 (ref. '2') (Obsoleted by RFC 3978)

  ** Obsolete normative reference: RFC 2234 (ref. '3') (Obsoleted by RFC 4234)

  -- Possible downref: Non-RFC (?) normative reference: ref. '7'

  -- Possible downref: Non-RFC (?) normative reference: ref. '8'

  -- Possible downref: Non-RFC (?) normative reference: ref. '9'

  ** Downref: Normative reference to an Informational RFC: RFC 1951 (ref.
     '10')

  -- Possible downref: Non-RFC (?) normative reference: ref. '11'


     Summary: 8 errors (**), 0 flaws (~~), 7 warnings (==), 11 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Robust Header Compression                                     A. Surtees
3	Internet-Draft                                                   M. West
4	Expires: April 27, 2006                      Siemens/Roke Manor Research
5	                                                        October 24, 2005

7	                          SigComp Users' Guide
8	               draft-ietf-rohc-sigcomp-user-guide-04.txt

10	Status of this Memo

12	   By submitting this Internet-Draft, each author represents that any
13	   applicable patent or other IPR claims of which he or she is aware
14	   have been or will be disclosed, and any of which he or she becomes
15	   aware will be disclosed, in accordance with Section 6 of BCP 79.

17	   Internet-Drafts are working documents of the Internet Engineering
18	   Task Force (IETF), its areas, and its working groups.  Note that
19	   other groups may also distribute working documents as Internet-
20	   Drafts.

22	   Internet-Drafts are draft documents valid for a maximum of six months
23	   and may be updated, replaced, or obsoleted by other documents at any
24	   time.  It is inappropriate to use Internet-Drafts as reference
25	   material or to cite them other than as "work in progress."

27	   The list of current Internet-Drafts can be accessed at
28	   http://www.ietf.org/ietf/1id-abstracts.txt.

30	   The list of Internet-Draft Shadow Directories can be accessed at
31	   http://www.ietf.org/shadow.html.

33	   This Internet-Draft will expire on April 27, 2006.

35	Copyright Notice

37	   Copyright (C) The Internet Society (2005).

39	Abstract

41	   This document provides an informational guide for users of the
42	   SigComp protocol.  The aim of the document is to assist users when
43	   making SigComp implementation decisions; for example the choice of
44	   compression algorithm and the level of robustness against lost or
45	   misordered packets.

47	Table of Contents

49	   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
50	   2.  Overview of the User Guide . . . . . . . . . . . . . . . . . .  3
51	   3.  UDVM assembly language . . . . . . . . . . . . . . . . . . . .  4
52	     3.1   Lexical level  . . . . . . . . . . . . . . . . . . . . . .  4
53	     3.2   Syntactic level  . . . . . . . . . . . . . . . . . . . . .  5
54	       3.2.1   Expressions  . . . . . . . . . . . . . . . . . . . . .  7
55	       3.2.2   Instructions . . . . . . . . . . . . . . . . . . . . .  8
56	       3.2.3   Directives . . . . . . . . . . . . . . . . . . . . . .  9
57	       3.2.4   Labels . . . . . . . . . . . . . . . . . . . . . . . . 10
58	     3.3   Uploading the bytecode to the UDVM . . . . . . . . . . . . 10
59	   4.  Compression algorithms . . . . . . . . . . . . . . . . . . . . 12
60	     4.1   Well-known Compression Algorithms  . . . . . . . . . . . . 12
61	       4.1.1   Simplified LZ77  . . . . . . . . . . . . . . . . . . . 12
62	       4.1.2   LZSS . . . . . . . . . . . . . . . . . . . . . . . . . 16
63	       4.1.3   LZW  . . . . . . . . . . . . . . . . . . . . . . . . . 18
64	       4.1.4   DEFLATE  . . . . . . . . . . . . . . . . . . . . . . . 21
65	       4.1.5   LZJH . . . . . . . . . . . . . . . . . . . . . . . . . 24
66	     4.2   Adapted Algorithms . . . . . . . . . . . . . . . . . . . . 28
67	       4.2.1   Modified DEFLATE . . . . . . . . . . . . . . . . . . . 28
68	   5.  Additional SigComp mechanisms  . . . . . . . . . . . . . . . . 31
69	     5.1   Acknowledging a state item . . . . . . . . . . . . . . . . 32
70	     5.2   Static dictionary  . . . . . . . . . . . . . . . . . . . . 33
71	     5.3   CRC checksum . . . . . . . . . . . . . . . . . . . . . . . 34
72	     5.4   Announcing additional resources  . . . . . . . . . . . . . 35
73	     5.5   Shared compression . . . . . . . . . . . . . . . . . . . . 37
74	   6.  Security considerations  . . . . . . . . . . . . . . . . . . . 38
75	   7.  Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 38
76	   8.  Intellectual Property Right Considerations . . . . . . . . . . 38
77	   9.  References . . . . . . . . . . . . . . . . . . . . . . . . . . 38
78	       Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 39
79	   A.  UDVM bytecode for the compression algorithms . . . . . . . . . 39
80	     A.1   Well-known Algorithms  . . . . . . . . . . . . . . . . . . 40
81	       A.1.1   Simplified LZ77  . . . . . . . . . . . . . . . . . . . 40
82	       A.1.2   LZSS . . . . . . . . . . . . . . . . . . . . . . . . . 40
83	       A.1.3   LZW  . . . . . . . . . . . . . . . . . . . . . . . . . 40
84	       A.1.4   DEFLATE  . . . . . . . . . . . . . . . . . . . . . . . 40
85	       A.1.5   LZJH . . . . . . . . . . . . . . . . . . . . . . . . . 40
86	     A.2   Adapted Algorithms . . . . . . . . . . . . . . . . . . . . 41
87	       A.2.1   Modified DEFLATE . . . . . . . . . . . . . . . . . . . 41
88	       Intellectual Property and Copyright Statements . . . . . . . . 42

90	1.  Introduction

92	   This document provides an informational guide for users of the
93	   SigComp protocol RFC-3320 [4].  The idea behind SigComp is to
94	   standardize a Universal Decompressor Virtual Machine (UDVM) that can
95	   be programmed to understand the output of many well-known compressors
96	   including DEFLATE [10] and LZW [9].  The bytecode for the choice of
97	   compression algorithm is uploaded to the UDVM as part of the
98	   compressed data.

100	   The basic SigComp RFC describes the actions that an endpoint must
101	   take upon receiving a SigComp message.  However the entity
102	   responsible for generating new SigComp messages (the SigComp
103	   compressor) is left as an implementation decision; any compressor can
104	   be used provided that it generates SigComp messages that can be
105	   successfully decompressed by the receiving endpoint.

107	   This document gives examples of a number of different compressors
108	   that can be used by the SigComp protocol.  It also gives examples of
109	   how to use some of the mechanisms (such as acknowledgements)
110	   described in RFC 3321 [5].

112	2.  Overview of the User Guide

114	   When implementing a SigComp compressor the first step is to choose a
115	   compression algorithm that can encode the application messages into a
116	   (hopefully) smaller form.  Since SigComp can upload bytecode for new
117	   algorithms to the receiving endpoint, arbitrary compression
118	   algorithms can be supported provided that bytecode has been written
119	   for the corresponding decompressor.

121	   This document provides example bytecode for the following algorithms:
122	   1.  Simplified LZ77
123	   2.  LZSS
124	   3.  LZW
125	   4.  DEFLATE
126	   5.  LZJH

128	   Any of the above algorithms may be useful depending on the desired
129	   compression ratio, processing and memory requirements, code size,
130	   implementation complexity and Intellectual Property (IPR)
131	   considerations.

133	   As well as encoding the application messages using the chosen
134	   algorithm, the SigComp compressor is responsible for ensuring that
135	   messages can be correctly decompressed even if packets are lost or
136	   misordered during transmission.  The SigComp feedback mechanism can
137	   be used to acknowledge successful decompression at the remote
138	   endpoint.

140	   The following robustness techniques and other mechanisms specific to
141	   the SigComp environment are covered in this document:
142	   1.  Acknowledgements using the SigComp feedback mechanism
143	   2.  Static dictionary
144	   3.  CRC checksum
145	   4.  Announcing additional resources
146	   5.  Shared compression

148	   Any or all of the above mechanisms can be implemented in conjunction
149	   with the chosen compression algorithm.  An example subroutine of UDVM
150	   bytecode is provided for each of the mechanisms; these subroutines
151	   can be added to the bytecode for one of the basic compression
152	   algorithms.  (NB. The subroutine or the basic algorithm may require
153	   minor modification to ensure they work together correctly.)

155	3.  UDVM assembly language

157	   Writing UDVM programs directly in bytecode would be a daunting task,
158	   so a simple assembly language is provided to facilitate the creation
159	   of new decompression algorithms.  The assembly language includes
160	   mnemonic codes for each of the UDVM instructions, as well as simple
161	   directives for evaluating integer expressions, padding the bytecode
162	   and so forth.

164	   The syntax of the UDVM assembly language uses the customary two-level
165	   description technique, partitioning the grammar into a lexical and a
166	   syntactical level.

168	3.1  Lexical level

170	   On a lexical level, a string of assembly consists of zero or more
171	   tokens optionally separated by whitespace.  Each token can be a text
172	   name, an instruction opcode, a delimiter, or an integer (specified as
173	   decimal, binary or hex).

175	   The following ABNF description RFC-2234 [3] specifies the syntax of a
176	   token:

178	   token            =     (name / opcode / delimiter / dec / bin / hex)
179	   name             =     (lowercase / "_") 0*(lowercase / digit / "_")
180	   opcode           =     uppercase *(uppercase / digit / "-")
181	   delimiter        =     "." / "," / "!" / "$" / ":" / "(" / ")" /
182	                          operator
183	   dec              =     1*(digit)
184	   bin              =     "0b" 1*("0" / "1")
185	   hex              =     "0x" 1*(hex_digit)
186	   hex_digit        =     digit / %x41-46 / %x61-66
187	   digit            =     %x30-39
188	   uppercase        =     %x41-5a
189	   lowercase        =     %x61-7a
190	   operator         =     "+" / "-" / "*" / "/" / "%" / "&" / "|" /
191	                          "^" / "~" / "<<" / ">>"

193	   When parsing for tokens a longest match is applied, i.e. a token is
194	   the longest string that matches the <token> rule specified above.

196	   The syntax of whitespace and comments is specified by the following
197	   ABNF:

199	   ws               =     *(%x09 / %x0a / %x0d / %x20 / comment)
200	   comment          =     ";" *(%x00-09 / %x0b-0c / %x0e-ff)
201	                          (%x0a / %x0d)

203	   Whitespace that matches <ws> is skipped between tokens, but serves to
204	   terminate the longest match for a token.

206	   Comments are specified by the symbol ";" and are terminated by the
207	   end of the line, for example:

209	   LOAD (temp, 1)             ; This is a comment.

211	   Any other input is a syntax error.

213	   When parsing on the lexical level the string of assembly should be
214	   divided up into a list of successive tokens.  The whitespace and
215	   comments should also be deleted.  The assembly should then be parsed
216	   on the syntactic level as explained in Section 3.2.

218	3.2  Syntactic level

220	   Once the string of assembly has been divided into tokens as per
221	   Section 3.1, the next step is to convert the assembly into a string
222	   of UDVM bytecode.

224	   On a syntactic level, a string of assembly consists of zero or more
225	   instructions, directives or labels, each of which is itself built up
226	   from one or more lexical tokens.

228	   The following ABNF description specifies the syntax of the assembly
229	   language.  Note that the lexical parsing step is assumed to have been
230	   carried out, so in particular the boundaries between tokens are
231	   already known and the comments and whitespace have been deleted:

233	   assembly          =    *(instruction / directive / label)
234	   instruction       =    opcode ["(" operand *("," operand) ")"]
235	   operand           =    [["$"] expression]
236	                              ; Operands can be left blank if they can
237	                              ; be automatically inferred by the
238	                              ; compiler, e.g. a literal operand
239	                              ; that specifies the total number of
240	                              ; operands for the instruction.
241	                              ; When "$" is prepended to an operand,
242	                              ; the corresponding integer is an
243	                              ; address rather than the actual operand
244	                              ; value. This symbol is mandatory for
245	                              ; reference operands, optional for
246	                              ; multitypes and addresses, and
247	                              ; disallowed for literals.
248	   label             =    ":" name
249	   directive         =    padding / data / set / readonly /
250	                          unknown_directive
251	   unknown_directive =    name ["(" expression *("," expression) ")"]
252	                              ; The parser can ignore unknown
253	                              ; directives.  The resulting bytecode
254	                              ; may or may not generate the expected
255	                              ; results.
256	   padding           =    ("pad" / "align" / "at") "(" expression ")"
257	   data              =    ("byte" / "word") "(" expression *(","
258	                          expression) ")"
259	   readonly          =    "readonly" "(" "0" / "1" ")"
260	   set               =    "set" "(" name "," expression ")"
261	   expression        =    value / "(" expression operator expression ")"
262	   value             =    dec / bin / hex / name / "." / "!"
263	                              ; "." is the location of this
264	                              ; instruction/directive, whereas "!" is
265	                              ; the location of the closest
266	                              ; DECOMPRESSION-FAILURE

268	   The following sections define how to convert the instructions, labels
269	   and directives into UDVM bytecode:

271	3.2.1  Expressions

273	   The operand values needed by particular instructions or directives
274	   can be given in the form of expressions.  An expression can include
275	   one or more values specified as decimal, binary or hex (binary values
276	   are preceded by "0b" and hex values are preceded by "0x").  The
277	   expression may also include one or more of the following operators:

279	          "+"    Addition
280	          "-"    Subtraction
281	          "*"    Multiplication
282	          "/"    Integer division
283	          "%"    Modulo arithmetic (a%b := a modulo b)
284	          "&"    Binary AND
285	          "|"    Binary OR
286	          "^"    Binary XOR
287	          "~"    Binary XNOR
288	          "<<"   Binary LSHIFT
289	          ">>"   Binary RSHIFT

291	   The operands for each operator must always be surrounded by
292	   parentheses so that the order in which the operators should be
293	   evaluated is clear.  For example:

295	   ((1 + (2 * 3)) & (0xabcd - 0b00101010)) gives the result 3.

297	   Expressions can also include the special values "." and "!".  When
298	   the symbol "." is encountered, it is replaced by the location in the
299	   bytecode of the current instruction/directive.  When the symbol "!"
300	   is encountered it is replaced by the location in the bytecode of the
301	   closest DECOMPRESSION-FAILURE instruction (i.e. the closest zero
302	   byte).  This can be useful when writing UDVM instructions that call a
303	   decompression failure, for example:

305	   INPUT-BYTES (1, temp, !)

307	   The above instruction causes a decompression failure to occur if it
308	   tries to input data from beyond the end of the compressed message.

310	   N.B. When using "!" in the assembly language to generate bytecode,
311	   care must be taken to ensure that the address of the zero used at
312	   bytecode generation time will still contain zero when the bytecode is
313	   run.  The readonly directive (see Section 3.2.3) can be used to do
314	   this.

316	   It is also possible to assign integer values to text names: when a
317	   text name is encountered in an expression it is replaced by the
318	   integer value assigned to it.  Section 3.2.3 explains how to assign
319	   integer values to text names.

321	3.2.2  Instructions

323	   A UDVM instruction is specified by the instruction opcode followed by
324	   zero or more operands.  The instruction operands are enclosed in
325	   parentheses and separated by commas, for example:

327	   ADD (3, 4)

329	   When generating the bytecode the parser should replace the
330	   instruction opcode with the corresponding 1-byte value as per Figure
331	   11 of SigComp [4].

333	   Each operand consists of an expression which evaluates to an integer,
334	   optionally preceded by the symbol "$".  This symbol indicates that
335	   the supplied integer value must be interpreted as the memory address
336	   at which the operand value can be found, rather than the actual
337	   operand value itself.

339	   When converting each instruction operand to bytecode, the parser
340	   first determines whether the instruction expects the operand to be a
341	   literal, a reference, a multitype or an address.  If the operand is a
342	   literal then as per Figure 8 of SigComp, the parser inserts bytecode
343	   (usually the shortest) capable of encoding the supplied operand
344	   value.

346	   Since literal operands are used to indicate the total number of
347	   operands for an instruction, it is possible to leave a literal
348	   operand blank and allow its value to be inferred automatically by the
349	   assembler.  For example:

351	   MULTILOAD (64, , 1, 2, 3, 4)

353	   The missing operand should be given the value 4 because it is
354	   followed by a total of 4 operands.

356	   If the operand is a reference then, as per Figure 9 of SigComp, the
357	   parser inserts bytecode (usually the shortest) capable of encoding
358	   the supplied memory address.  Note that reference operands will
359	   always be preceded by the symbol "$" in assembly because they always
360	   encode memory addresses rather than actual operand values.

362	   If the operand is a multitype then the parser first checks whether
363	   the symbol "$" is present.  If so then, as per Figure 10 of SigComp,
364	   it inserts bytecode (usually the shortest) capable of encoding the
365	   supplied integer as a memory address.  If not then, as per Figure 10
366	   of SigComp, it inserts bytecode (usually the shortest) that encodes
367	   the supplied integer as an operand value.

369	   If the operand is an address then the parser checks whether the
370	   symbol "$" is present.  If so then the supplied integer is encoded as
371	   a memory address, just as for the multitype instruction above.  If
372	   not then the byte position of the opcode is subtracted from the
373	   supplied integer modulo 16, and the result is encoded as an operand
374	   value as per Figure 10 of SigComp.

376	   The length of the resulting bytecode is dependent on the parser in
377	   use.  There can be several correct and useable representations of the
378	   same instruction.

380	3.2.3  Directives

382	   The assembly language provides a number of directives for evaluating
383	   expressions, moving instructions to a particular memory address etc.

385	   The directives "pad", "align" and "at" can be used to add padding to
386	   the bytecode.

388	   The directive "pad (n)" appends n consecutive padding bytes to the
389	   bytecode.  The actual value of the padding bytes is unimportant, so
390	   when the bytecode is uploaded to the UDVM the padding bytes can be
391	   set to the initial values contained in the UDVM memory (this helps to
392	   reduce the size of a SigComp message).

394	   The directive "align (n)" appends the minimum number of padding bytes
395	   to the bytecode such that the total bytecode generated so far is
396	   aligned to a multiple of n bytes.  If the bytecode is already aligned
397	   to a multiple of n bytes then no padding bytes are added.

399	   The directive "at (n)" appends enough padding bytes to the bytecode
400	   such that the total bytecode generated so far is exactly n bytes.  If
401	   more than n bytes have already been generated before the "at"
402	   directive is encountered then the assembly code contains an error.

404	   The directives "byte" and "word" can be used to add specific data
405	   strings to the bytecode.

407	   The directive "byte (n[0],..., n[k-1])" appends k consecutive bytes
408	   to the bytecode.  The byte string is supplied as expressions which
409	   evaluate to give integers n[0],..., n[k-1] from 0 to 255.

411	   The directive "word (n[0],..., n[k-1])" appends k consecutive 2-byte
412	   words to the bytecode.  The word string is supplied as expressions
413	   which evaluate to give integers n[0],..., n[k-1] from 0 to 65535.

415	   The directive "set (name, n)" assigns an integer value n to a
416	   specified text name.  The integer value can be supplied in the form
417	   of an expression.

419	   The directive "readonly (n)" where n is 0 or 1 can be used to
420	   indicate that an area of memory could be changed (0) or will not be
421	   changed (1) during the execution of the UDVM.  This directive could
422	   be used, for example, in conjunction with "!" to ensure that the
423	   address of the zero used will still contain zero when the bytecode is
424	   executed.  If no readonly directive is used, then any address
425	   containing zero can be used by "!" (i.e. by default there is assumed
426	   to be readonly(1) directive at address 0) and it is up to the author
427	   of the assembly code to ensure that the address in question will
428	   still contain zero when the bytecode is executed.  If the readonly
429	   directive is used, then bytes between a readonly (0) and readonly (1)
430	   pair are NOT to be used by "!".  When a readonly directive has been
431	   used, the bytes obey that directive from that address to either
432	   another readonly directive or the end of UDVM memory, whichever comes
433	   first.

435	3.2.4  Labels

437	   A label is a special directive used to assign memory addresses to
438	   text names.

440	   Labels are specified by giving a single colon followed by the text
441	   name to be defined.  The (absolute) position of the byte immediately
442	   following the label is evaluated and assigned to the text name.  For
443	   example:

445	   :start

447	   LOAD (temp, 1)

449	   Since the label "start" occurs at the beginning of the bytecode, it
450	   is assigned the integer value 0.

452	   Note that writing the label ":name" has exactly the same behavior as
453	   writing the directive "set (name, .)".

455	3.3  Uploading the bytecode to the UDVM

457	   Once the parser has converted a string of assembly into the
458	   corresponding bytecode, it must be copied to the UDVM memory
459	   beginning at Address 0 and then executed beginning from the first
460	   UDVM instruction in the bytecode.

462	   SigComp provides the following message format for uploading bytecode
463	   to the UDVM:

465	     0   1   2   3   4   5   6   7
466	   +---+---+---+---+---+---+---+---+
467	   | 1   1   1   1   1 | T |   0   |
468	   +---+---+---+---+---+---+---+---+
469	   |                               |
470	   :    returned feedback item     :  if T = 1
471	   |                               |
472	   +---+---+---+---+---+---+---+---+
473	   |           code_len            |
474	   +---+---+---+---+---+---+---+---+
475	   |   code_len    |  destination  |
476	   +---+---+---+---+---+---+---+---+
477	   |                               |
478	   :    uploaded UDVM bytecode     :
479	   |                               |
480	   +---+---+---+---+---+---+---+---+
481	   |                               |
482	   :   remaining SigComp message   :
483	   |                               |
484	   +---+---+---+---+---+---+---+---+

486	   The destination field should be set to the memory address of the
487	   first UDVM instruction.  Note that if this address cannot be
488	   represented by the destination field then the bytecode cannot be
489	   uploaded to the UDVM using the standard SigComp header.  In
490	   particular, the memory address of the first UDVM instruction must
491	   always be a multiple of 64 bytes or the standard SigComp header
492	   cannot be used.  Of course, there may be other ways to upload the
493	   bytecode to the UDVM, such as retrieving the bytecode directly via
494	   the INPUT-BYTES instruction.

496	   Additionally, all memory addresses between Address 0 and Address 31
497	   inclusive are initialized to endpoint-specific values by the UDVM, so
498	   they must be specified as padding in the bytecode or the standard
499	   SigComp header cannot be used.  Memory addresses from Address 32 to
500	   Address (destination - 1) inclusive are initialized to 0, so they
501	   must be specified either as padding or as 0s if the bytecode is to be
502	   successfully uploaded using the standard SigComp header.

504	   The code_len field should be set to the smallest value such that all
505	   memory addresses beginning at Address (destination + code_len) are
506	   either padding, set to 0 by the bytecode, or are beyond the total
507	   length of the bytecode.

509	   The "uploaded UDVM bytecode" should be set to contain the segment of
510	   bytecode that lies between Address (destination) and Address
511	   (destination + code_len - 1) inclusive.

513	4.  Compression algorithms

515	   This chapter describes a number of compression algorithms that can be
516	   used by a SigComp compressor.  In each case the document provides
517	   UDVM bytecode for the corresponding decompression algorithm, which
518	   can be uploaded to the receiving endpoint as part of a SigComp
519	   message.  Each algorithm (as written in this chapter) assumes that
520	   there is 16K decompression memory size, 16 cycles per bit and 8K
521	   state memory size.  Decompression will succeed with a smaller value
522	   for state memory size, however, the full state will not be created.

524	   Section 4.1.1 covers a simple algorithm in some detail, including the
525	   steps required to compress and decompress a SigComp message.  The
526	   remaining sections cover well-known compression algorithms that can
527	   be adapted for use in SigComp with minimal modification.

529	4.1  Well-known Compression Algorithms

531	4.1.1  Simplified LZ77

533	   This section describes how to implement a very simple compression
534	   algorithm based on LZ77 [7].

536	   A compressed message generated by the simplified LZ77 scheme consists
537	   of a sequence of 4-byte characters, where each character contains a
538	   2-byte position value followed by a 2-byte length value.  Each pair
539	   of integers identifies a byte string in the UDVM memory; when
540	   concatenated these byte strings form the decompressed message.

542	   When implementing a bytecode decompressor for the simplified LZ77
543	   scheme, the UDVM memory is partitioned into five distinct areas as
544	   shown below:

546	   0             64          128        256          512
547	   | scratch-pad | variables | bytecode | dictionary | circular buffer |
548	   +-------------+-----------+----------+------------+-----------------+
549	    <-----------> <---------> <--------> <----------> <--------------->
550	       64 bytes     64 bytes   128 bytes   256 bytes      512+ bytes

552	   The first 128 bytes are used to hold the 2-byte variables needed by
553	   the LZ77 decompressor.  Within this memory the first 64 bytes is used
554	   as a scratch-pad, holding the 2-byte variables that can be discarded
555	   between SigComp messages.  In contrast the next 64 bytes (and in fact
556	   all of the UDVM memory starting from Address 64) should be saved
557	   after decompressing a SigComp message to improve the compression
558	   ratio of subsequent messages.

560	   The bytecode for the LZ77 decompressor is stored beginning at Address
561	   128.  A total of 128 bytes are reserved for the bytecode although the
562	   LZ77 decompressor requires less; this allows room for adding
563	   additional features to the decompressor at a later stage.

565	   The next 256 bytes are initialized by the bytecode to contain the
566	   integers 0 to 255 inclusive.  The purpose of this memory area is to
567	   provide a dictionary of all possible uncompressed characters; this is
568	   important to ensure that the compressor can always generate a
569	   sequence of position/length pairs that encode a given message.  For
570	   example, a byte with value 0x41 (corresponding to the ASCII character
571	   "A") can be found at Address 0x0141 of the UDVM memory, so the
572	   compressed character 0x0141 0001 will decompress to give this ASCII
573	   character.  Note that encoding each byte in the application message
574	   as a separate 4-byte compressed character is not recommended,
575	   however, as the resulting "compressed" message is four times as large
576	   as the original uncompressed message.

578	   The compression ratio of LZ77 is improved by the remaining UDVM
579	   memory, which is used to store a history buffer containing the
580	   previously decompressed messages.  Compressed characters can point to
581	   strings that have previously been decompressed and stored in the
582	   buffer; so the overall compression ratio of the LZ77 algorithm
583	   improves as the decompressor "learns" more text strings and is able
584	   to encode longer strings using a single compressed character.  The
585	   buffer is circular, so older messages are overwritten by new data
586	   when the buffer becomes full.

588	   The steps required to implement an LZ77 compressor and decompressor
589	   are similar, although compression is more processor-intensive as it
590	   requires a searching operation to be performed.  Assembly for the
591	   simplified LZ77 decompressor is given below:

593	   ; Variables that do not need to be stored after decompressing each
594	   ; SigComp message are stored here:

596	   at (32)
597	   :index                          pad (2)
598	   :length_value                   pad (2)

600	   at (42)

602	   set (requested_feedback_location, 0)

604	   ; The UDVM registers must be stored beginning at Address 64:

606	   at (64)

608	   ; Variables that should be stored after decompressing a message are
609	   ; stored here. These variables will form part of the SigComp state
610	   ; item created by the bytecode:

612	   :byte_copy_left                 pad (2)
613	   :byte_copy_right                pad (2)
614	   :decompressed_pointer           pad (2)

616	   set (returned_parameters_location, 0)

618	   align (64)

620	   :initialize_memory

622	   set (udvm_memory_size, 8192)
623	   set (state_length, (udvm_memory_size - 64))

625	   ; The UDVM registers byte_copy_left and byte_copy_right are set to
626	   ; indicate the bounds of the circular buffer in the UDVM memory. A
627	   ; variable decompressed_pointer is also created and set pointing to
628	   ; the start of the circular buffer:

630	   MULTILOAD (64, 3, circular_buffer, udvm_memory_size, circular_buffer)

632	   ; The "dictionary" area of the UDVM memory is initialized to contain
633	   ; the values 0 to 255 inclusive:

635	   MEMSET (static_dictionary, 256, 0, 1)

637	   :decompress_sigcomp_message

639	   :next_character

641	   ; The next character in the compressed message is read by the UDVM
642	   ; and the position and length integers are stored in the variables
643	   ; position_value and length_value respectively. If no more
644	   ; compressed data is available the decompressor jumps to the
645	   ; "end_of_message" subroutine:

647	   INPUT-BYTES (4, index, end_of_message)

649	   ; The position_value and length_value point to a byte string in the
650	   ; UDVM memory, which is copied into the circular buffer at the
651	   ; position specified by decompressed_pointer. This allows the string
652	   ; to be referenced by later characters in the compressed message:

654	   COPY-LITERAL ($index, $length_value, $decompressed_pointer)

656	   ; The byte string is also outputted onto the end of the decompressed
657	   ; message:

659	   OUTPUT ($index, $length_value)

661	   ; The decompressor jumps back to consider the next character in the
662	   ; compressed message:

664	   JUMP (next_character)

666	   :end_of_message

668	   ; The decompressor saves the UDVM memory and halts:

670	   END-MESSAGE (requested_feedback_location,
671	   returned_parameters_location, state_length, 64,
672	   decompress_sigcomp_message, 6, 0)

674	   at (256)

676	   ; Memory for the dictionary and the circular buffer are reserved by
677	   ; the following statements:

679	   :static_dictionary              pad (256)
680	   :circular_buffer

682	   The task of an LZ77 compressor is simply to discover a sequence of 4-
683	   byte compressed characters which the above bytecode will decompress
684	   to give the desired application message.  As an example, a message
685	   compressed using the simplified LZ77 algorithm is given below:

687	   0x0154 0001 0168 0001 0165 0001 0120 0001 0152 0001 0165 0001 0173
688	   0x0002 0161 0001 0175 0001 0172 0001 0161 0001 016e 0001 0174 0001
689	   0x0120 0001 0161 0001 020d 0002 0174 0001 0201 0003 0145 0001 016e
690	   0x0001 0164 0001 0120 0001 016f 0001 0166 0001 0211 0005 0155 0001
691	   0x016e 0001 0169 0001 0176 0001 0165 0001 0172 0002 0165 0001 010a
692	   0x0001

694	   The uncompressed message is "The Restaurant at the End of the
695	   Universe\n".

697	   The bytecode for the LZ77 decompressor can be uploaded as part of the
698	   compressed message as specified in Section 3.3.  However, in order to
699	   improve the overall compression ratio it is important to avoid
700	   uploading bytecode in every compressed message.  For this reason
701	   SigComp allows the UDVM to save an area of its memory as a state item
702	   between compressed messages.  Once a state item has been created it
703	   can be retrieved by sending the corresponding state identifier using
704	   the following SigComp message format:

706	     0   1   2   3   4   5   6   7
707	   +---+---+---+---+---+---+---+---+
708	   | 1   1   1   1   1 | T |   1   |
709	   +---+---+---+---+---+---+---+---+
710	   |                               |
711	   :    returned feedback item     :  if T = 1
712	   |                               |
713	   +---+---+---+---+---+---+---+---+
714	   |                               |
715	   :   partial state identifier    :
716	   |                               |
717	   +---+---+---+---+---+---+---+---+
718	   |                               |
719	   :   remaining SigComp message   :
720	   |                               |
721	   +---+---+---+---+---+---+---+---+

723	   The partial_state_identifier field must contain the first 6 bytes of
724	   the state identifier for the state item to be accessed (see [4] for
725	   details of how state identifiers are derived).

727	4.1.2  LZSS

729	   This section provides UDVM bytecode for the simple but effective LZSS
730	   compression algorithm [8].

732	   The principal improvement offered by LZSS over LZ77 is that each
733	   compressed character begins with a 1-bit indicator flag to specify
734	   whether the character is a literal or an offset/length pair.  A
735	   literal value is simply a single uncompressed byte that is appended
736	   directly to the decompressed message.

738	   An offset/length pair contains a 12-bit offset value from 1 to 4096
739	   inclusive, followed by a 4-bit length value from 3 to 18 inclusive.
740	   Taken together these values specify one of the previously received
741	   text strings in the circular buffer, which is then appended to the
742	   end of the decompressed message.

744	   Assembly for an LZSS decompressor is given below:

746	   at (32)
747	   readonly (0)

749	   :index                          pad (2)
750	   :length_value                   pad (2)
751	   :old_pointer                    pad (2)

753	   at (42)

755	   set (requested_feedback_location, 0)

757	   at (64)

759	   :byte_copy_left                 pad (2)
760	   :byte_copy_right                pad (2)
761	   :input_bit_order                pad (2)
762	   :decompressed_pointer           pad (2)

764	   set (returned_parameters_location, 0)

766	   align (64)
767	   readonly (1)

769	   :initialize_memory

771	   set (udvm_memory_size, 8192)
772	   set (state_length, (udvm_memory_size - 64))

774	   MULTILOAD (64, 4, circular_buffer, udvm_memory_size, 0,
775	   circular_buffer)

777	   :decompress_sigcomp_message

779	   :next_character

781	   INPUT-HUFFMAN (index, end_of_message, 2, 9, 0, 255, 16384, 4, 4096,
782	   8191, 1)
783	   COMPARE ($index, 8192, length, end_of_message, literal)

785	   :literal

787	   set (index_lsb, (index + 1))

789	   OUTPUT (index_lsb, 1)
790	   COPY-LITERAL (index_lsb, 1, $decompressed_pointer)
791	   JUMP (next_character)

793	   :length

795	   INPUT-BITS (4, length_value, !)
796	   ADD ($length_value, 3)
797	   LOAD (old_pointer, $decompressed_pointer)
798	   COPY-OFFSET ($index, $length_value, $decompressed_pointer)
799	   OUTPUT ($old_pointer, $length_value)
800	   JUMP (next_character)

802	   :end_of_message

804	   END-MESSAGE (requested_feedback_location,
805	   returned_parameters_location, state_length, 64,
806	   decompress_sigcomp_message, 6, 0)

808	   readonly (0)
809	   :circular_buffer

811	   An example message compressed using the LZSS algorithm is given
812	   below:

814	   0x279a 0406 e378 b200 6074 1018 4ce6 1349 b842

816	   The uncompressed message is "Oh no, not again!".

818	4.1.3  LZW

820	   This section provides UDVM bytecode for the well-known LZW
821	   compression algorithm LZW [9].  This algorithm is used in a number of
822	   standards including the GIF image format.

824	   LZW compression operates in a similar manner to LZ77 in that it
825	   maintains a circular buffer of previously received decompressed data,
826	   and each compressed character references exactly one byte string from
827	   the circular buffer.  However, LZW also maintains a "codebook"
828	   containing 1024 position/length pairs that point to byte strings
829	   which LZW believes are most likely to occur in the uncompressed data.

831	   The byte strings stored in the LZW codebook can be referenced by
832	   sending a single 10-bit value from 0 to 1023 inclusive.  The UDVM
833	   extracts the corresponding text string from the codebook and appends
834	   it to the end of the decompressed message.  It then creates a new
835	   codebook entry containing the current text string plus the next
836	   character to occur in the decompressed message.

838	   Assembly for an LZW decompressor is given below:

840	   at (32)

842	   :length_value                   pad (2)
843	   :position_value                 pad (2)
844	   :index                          pad (2)

846	   at (42)

848	   set (requested_feedback_location, 0)

850	   at (64)

852	   :byte_copy_left                 pad (2)
853	   :byte_copy_right                pad (2)
854	   :input_bit_order                pad (2)

856	   :codebook_next                  pad (2)
857	   :current_length                 pad (2)
858	   :decompressed_pointer           pad (2)

860	   set (returned_parameters_location, 0)

862	   align (64)

864	   :initialize_memory

866	   set (udvm_memory_size, 8192)
867	   set (state_length, (udvm_memory_size - 64))

869	   MULTILOAD (64, 6, circular_buffer, udvm_memory_size, 0, codebook, 1,
870	   static_dictionary)

872	   :initialize_codebook

874	   ; The following instructions are used to initialize the first 256
875	   ; entries in the LZW codebook with single ASCII characters:

877	   set (index_lsb, (index + 1))
878	   set (current_length_lsb, (current_length + 1))

880	   COPY-LITERAL (current_length_lsb, 3, $codebook_next)
881	   COPY-LITERAL (index_lsb, 1, $decompressed_pointer)
882	   ADD ($index, 1)
883	   COMPARE ($index, 256, initialize_codebook, next_character, 0)

885	   :decompress_sigcomp_message

887	   :next_character

889	   ; The following INPUT-BITS instruction extracts 10 bits from the
890	   ; compressed message:

892	   INPUT-BITS (10, index, end_of_message)

894	   ; The following instructions interpret the received bits as an index
895	   ; into the LZW codebook, and extract the corresponding
896	   ; position/length pair:

898	   set (length_value_lsb, (length_value + 1))

900	   MULTIPLY ($index, 3)
901	   ADD ($index, codebook)
902	   COPY ($index, 3, length_value_lsb)

904	   ; The following instructions append the selected text string to the
905	   ; circular buffer and create a new codebook entry pointing to this
906	   ; text string:

908	   LOAD (current_length, 1)
909	   ADD ($current_length, $length_value)
910	   COPY-LITERAL (current_length_lsb, 3, $codebook_next)
911	   COPY-LITERAL ($position_value, $length_value, $decompressed_pointer)

913	   ; The following instruction outputs the text string specified by the
914	   ; position/length pair:

916	   OUTPUT ($position_value, $length_value)
917	   JUMP (next_character)

919	   :end_of_message

921	   END-MESSAGE (requested_feedback_location,
922	   returned_parameters_location, state_length, 64,
923	   decompress_sigcomp_message, 6, 0)

925	   :static_dictionary              pad (256)
926	   :circular_buffer

928	   at (4492)

930	   :codebook

932	   An example message compressed using the LZW algorithm is given below:
933	   0x14c6 f080 6c1b c6e1 9c20 1846 e190 201d 0684 206b 1cc2 0198 6f1c
934	   0x9071 b06c 42c6 8195 111a 4731 a021 02bf f0

936	   The uncompressed message is "So long and thanks for all the fish!\n".

938	4.1.4  DEFLATE

940	   This section provides UDVM bytecode for the DEFLATE compression
941	   algorithm.  DEFLATE is the algorithm used in the well-known "gzip"
942	   file format.

944	   The following bytecode will decompress the DEFLATE compressed data
945	   format [10] with the following modifications:
946	   1.  The DEFLATE compressed data format separates blocks of compressed
947	       data by transmitting 7 consecutive zero bits.  Each SigComp
948	       message is assumed to contain a separate block of compressed
949	       data, so the end-of-block bits are implicit and do not need to be
950	       transmitted at the end of a SigComp message.
951	   2.  This bytecode supports only DEFLATE block type 01 (data
952	       compressed with fixed Huffman codes).

954	   Assembly for the DEFLATE decompressor is given below:

956	   at (32)
957	   readonly (0)

959	   :index                          pad (2)
960	   :extra_length_bits              pad (2)
961	   :length_value                   pad (2)
962	   :extra_distance_bits            pad (2)
963	   :distance_value                 pad (2)

965	   at (42)

967	   set (requested_feedback_location, 0)

969	   at (64)

971	   :byte_copy_left                 pad (2)
972	   :byte_copy_right                pad (2)
973	   :input_bit_order                pad (2)
974	   :decompressed_pointer           pad (2)

976	   :length_table                   pad (116)
977	   :distance_table                 pad (120)

979	   set (returned_parameters_location, 0)

981	   align (64)

983	   readonly (1)
984	   :initialize_memory

986	   set (udvm_memory_size, 8192)
987	   set (state_length, (udvm_memory_size - 64))
988	   set (length_table_start, (((length_table - 4) + 65536) / 4))
989	   set (length_table_mid, (length_table_start + 24))
990	   set (distance_table_start, (distance_table / 4))

992	   MULTILOAD (64, 122, circular_buffer, udvm_memory_size, 5,
993	   circular_buffer,

995	   0,       3,       0,       4,       0,       5,
996	   0,       6,       0,       7,       0,       8,
997	   0,       9,       0,       10,      1,       11,
998	   1,       13,      1,       15,      1,       17,
999	   2,       19,      2,       23,      2,       27,
1000	   2,       31,      3,       35,      3,       43,
1001	   3,       51,      3,       59,      4,       67,
1002	   4,       83,      4,       99,      4,       115,
1003	   5,       131,     5,       163,     5,       195,
1004	   5,       227,     0,       258,

1006	   0,       1,       0,       2,       0,       3,
1007	   0,       4,       1,       5,       1,       7,
1008	   2,       9,       2,       13,      3,       17,
1009	   3,       25,      4,       33,      4,       49,
1010	   5,       65,      5,       97,      6,       129,
1011	   6,       193,     7,       257,     7,       385,
1012	   8,       513,     8,       769,     9,       1025,
1013	   9,       1537,    10,      2049,    10,      3073,
1014	   11,      4097,    11,      6145,    12,      8193,
1015	   12,      12289,   13,      16385,   13,      24577)

1017	   :decompress_sigcomp_message

1019	   INPUT-BITS (3, extra_length_bits, !)
1020	   :next_character

1022	   INPUT-HUFFMAN (index, end_of_message, 4,
1023	       7, 0, 23, length_table_start,
1024	       1, 48, 191, 0,
1025	       0, 192, 199, length_table_mid,
1026	       1, 400, 511, 144)
1027	   COMPARE ($index, length_table_start, literal, end_of_message,
1028	   length_distance)

1030	   :literal

1032	   set (index_lsb, (index + 1))

1034	   OUTPUT (index_lsb, 1)
1035	   COPY-LITERAL (index_lsb, 1, $decompressed_pointer)
1036	   JUMP (next_character)

1038	   :length_distance

1040	   ; this is the length part

1042	   MULTIPLY ($index, 4)
1043	   COPY ($index, 4, extra_length_bits)
1044	   INPUT-BITS ($extra_length_bits, extra_length_bits, !)
1045	   ADD ($length_value, $extra_length_bits)

1047	   ; this is the distance part

1049	   INPUT-HUFFMAN (index, !, 1, 5, 0, 31, distance_table_start)
1050	   MULTIPLY ($index, 4)
1051	   COPY ($index, 4, extra_distance_bits)

1053	   INPUT-BITS ($extra_distance_bits, extra_distance_bits, !)
1054	   ADD ($distance_value, $extra_distance_bits)
1055	   LOAD (index, $decompressed_pointer)
1056	   COPY-OFFSET ($distance_value, $length_value, $decompressed_pointer)
1057	   OUTPUT ($index, $length_value)
1058	   JUMP (next_character)

1060	   :end_of_message

1062	   END-MESSAGE (requested_feedback_location,
1063	   returned_parameters_location, state_length, 64,
1064	   decompress_sigcomp_message, 6, 0)

1066	   readonly (0)
1067	   :circular_buffer
1068	   An example message compressed using the DEFLATE algorithm is given
1069	   below:

1071	   0xf3c9 4c4b d551 28c9 4855 08cd cb2c 4b2d 2a4e 5548 cc4b 5170 0532
1072	   0x2b4b 3232 f3d2 b900

1074	   The uncompressed message is "Life, the Universe and Everything\n".

1076	4.1.5  LZJH

1078	   This section provides UDVM bytecode for the LZJH compression
1079	   algorithm.  LZJH is the algorithm adopted by the International
1080	   Telecommunication Union (ITU-T) Recommendation V.44 LZJH [11].

1082	   Assembly for the LZJH decompressor is given below:

1084	   at (32)
1085	   readonly (0)

1087	   ; The following 2-byte variables are stored in the scratch-pad memory
1088	   ; area because they do not need to be saved after decompressing a
1089	   ; SigComp message:

1091	   :length_value                   pad (2)
1092	   :position_value                 pad (2)
1093	   :index                          pad (2)
1094	   :extra_extension_bits           pad (2)
1095	   :codebook_old                   pad (2)

1097	   at (42)

1099	   set (requested_feedback_location, 0)

1101	   at (64)

1103	   ; UDVM_registers

1105	   :byte_copy_left                 pad (2)
1106	   :byte_copy_right                pad (2)

1108	   :input_bit_order                pad (2)

1110	   ; The following 2-byte variables are saved as state after
1111	   ; decompressing a SigComp message:

1113	   :current_length                 pad (2)
1114	   :decompressed_pointer           pad (2)
1115	   :ordinal_length                 pad (2)
1116	   :codeword_length                pad (2)
1117	   :codebook_next                  pad (2)

1119	   set (returned_parameters_location, 0)

1121	   align (64)
1122	   readonly (1)

1124	   :initialize_memory

1126	   ; The following constants can be adjusted to configure the LZJH
1127	   ; decompressor. The current settings are as recommended in the V.44
1128	   ; specification (given that a total of 8K UDVM memory is available):

1130	   set (udvm_memory_size, 8192)   ; sets the total memory for LZJH
1131	   set (max_extension_length, 8)  ; sets the maximum string extension
1132	   set (min_ordinal_length, 7)    ; sets the minimum ordinal length
1133	   set (min_codeword_length, 6)   ; sets the minimum codeword length

1135	   set (codebook_start, 4492)
1136	   set (first_codeword, (codebook_start - 12))
1137	   set (state_length, (udvm_memory_size - 64))

1139	   MULTILOAD (64, 8, circular_buffer, udvm_memory_size, 7, 0,
1140	   circular_buffer, min_ordinal_length, min_codeword_length,
1141	   codebook_start)

1143	   :decompress_sigcomp_message

1145	   :standard_prefix

1147	   ; The following code decompresses the standard 1-bit LZJH prefix
1148	   ; which specifies whether the next character is an ordinal or a
1149	   ; codeword/control value:

1151	   INPUT-BITS (1, index, end_of_message)
1152	   COMPARE ($index, 1, ordinal, codeword_control, codeword_control)

1154	   :prefix_after_codeword

1156	   ; The following code decompresses the special LZJH prefix that only
1157	   ; occurs after a codeword. It specifies whether the next character is
1158	   ; an ordinal, a codeword/control value, or a string extension:

1160	   INPUT-HUFFMAN (index, end_of_message, 2, 1, 1, 1, 2, 1, 0, 1, 0)
1161	   COMPARE ($index, 1, ordinal, string_extension, codeword_control)
1162	   :ordinal

1164	   ; The following code decompresses an ordinal character, and creates
1165	   ; a new codebook entry consisting of the ordinal character plus the
1166	   ; next character to be decompressed:

1168	   set (index_lsb, (index + 1))
1169	   set (current_length_lsb, (current_length + 1))

1171	   INPUT-BITS ($ordinal_length, index, !)
1172	   OUTPUT (index_lsb, 1)
1173	   LOAD (current_length, 2)
1174	   COPY-LITERAL (current_length_lsb, 3, $codebook_next)
1175	   COPY-LITERAL (index_lsb, 1, $decompressed_pointer)
1176	   JUMP (standard_prefix)

1178	   :codeword_control

1180	   ; The following code decompresses a codeword/control value:

1182	   INPUT-BITS ($codeword_length, index, !)
1183	   COMPARE ($index, 3, control_code, initialize_memory, codeword)

1185	   :codeword

1187	   ; The following code interprets a codeword as an index into the LZJH
1188	   ; codebook. It extracts the position/length pair from the specified
1189	   ; codebook entry; the position/length pair points to a byte string
1190	   ; in the circular buffer which is then copied to the end of the
1191	   ; decompressed message. The code also creates a new codebook entry
1192	   ; consisting of the byte string plus the next character to be
1193	   ; decompressed:

1195	   set (length_value_lsb, (length_value + 1))

1197	   MULTIPLY ($index, 3)
1198	   ADD ($index, first_codeword)
1199	   COPY ($index, 3, length_value_lsb)
1200	   LOAD (current_length, 1)
1201	   ADD ($current_length, $length_value)
1202	   LOAD (codebook_old, $codebook_next)
1203	   COPY-LITERAL (current_length_lsb, 3, $codebook_next)
1204	   COPY-LITERAL ($position_value, $length_value, $decompressed_pointer)
1205	   OUTPUT ($position_value, $length_value)
1206	   JUMP (prefix_after_codeword)

1208	   :string_extension
1209	   ; The following code decompresses a Huffman-encoded string extension:

1211	   INPUT-HUFFMAN (index, !, 4, 1, 1, 1, 1, 2, 1, 3, 2, 1, 1, 1, 13, 3,
1212	   0, 7, 5)
1213	   COMPARE ($index, 13, continue, extra_bits, extra_bits)

1215	   :extra_bits

1217	   INPUT-BITS (max_extension_length, extra_extension_bits, !)
1218	   ADD ($index, $extra_extension_bits)

1220	   :continue

1222	   ; The following code extends the most recently created codebook entry
1223	   ; by the number of bits specified in the string extension:

1225	   COPY-LITERAL ($position_value, $length_value, $position_value)
1226	   COPY-LITERAL ($position_value, $index, $decompressed_pointer)
1227	   OUTPUT ($position_value, $index)
1228	   ADD ($index, $length_value)
1229	   COPY (index_lsb, 1, $codebook_old)
1230	   JUMP (standard_prefix)

1232	   :control_code

1234	   ; The code can handle all of the control characters in V.44 except
1235	   ; for ETM (Enter Transparent Mode), which is not required for
1236	   ; message-based protocols such as SigComp.

1238	   COMPARE ($index, 1, !, flush, stepup)

1240	   :flush

1242	   ; The FLUSH control character jumps to the beginning of the next
1243	   ; complete byte in the compressed message:

1245	   INPUT-BYTES (0, 0, 0)
1246	   JUMP (standard_prefix)

1248	   :stepup

1250	   ; The STEPUP control character increases the number of bits used to
1251	   ; encode an ordinal value or a codeword:

1253	   INPUT-BITS (1, index, !)
1254	   COMPARE ($index, 1, stepup_ordinal, stepup_codeword, 0)

1256	   :stepup_ordinal
1257	   ADD ($ordinal_length, 1)
1258	   JUMP (ordinal)

1260	   :stepup_codeword

1262	   ADD ($codeword_length, 1)
1263	   JUMP (codeword_control)

1265	   :end_of_message

1267	   END-MESSAGE (requested_feedback_location,
1268	   returned_parameters_location, state_length, 64,
1269	   decompress_sigcomp_message, 6, 0)

1271	   readonly (0)
1272	   :circular_buffer

1274	   An example message compressed using the LZJH algorithm is given
1275	   below:

1277	   0x5c09 e6e0 cadc c8d2 dcce 40c2 40f2 cac2 e440 c825 c840 ccde 29e8
1278	   0xc2f0 40e0 eae4 e0de e6ca e65c 1403

1280	   The uncompressed message is "...spending a year dead for tax
1281	   purposes.\n".

1283	4.2  Adapted Algorithms

1285	4.2.1  Modified DEFLATE

1287	   Alternative algorithms can also be used with SigComp.  This section
1288	   shows a modified version of the DEFLATE [10] algorithm.  The two-
1289	   stage encoding of DEFLATE is replaced by a single step with a
1290	   discrete Huffman code for each symbol.  The literal/length symbol
1291	   probabilities are dependent upon whether the previously symbol was a
1292	   literal or a match.  Bit-handling is also simpler, in that all bits
1293	   are input using the INPUT-HUFFMAN instruction and the value of the H
1294	   bit does not change so all bits are input, read and interpreted in
1295	   the same order.

1297	   Assembly for the algorithm is given below.  String matching rules are
1298	   the same as for the other LZ-based algorithms, with the alternative
1299	   encoding of the literals and length/distance pairs.

1301	   at (32)
1302	   readonly (0)

1304	   :index                          pad (2)
1305	   :distance_value                 pad (2)
1306	   :old_pointer                    pad (2)

1308	   at (42)

1310	   set (requested_feedback_location, 0)

1312	   at (64)

1314	   :byte_copy_left                 pad (2)
1315	   :byte_copy_right                pad (2)
1316	   :input_bit_order                pad (2)
1317	   :decompressed_pointer           pad (2)

1319	   set (returned_parameters_location, 0)

1321	   at (128)
1322	   readonly (1)

1324	   :initialize_memory

1326	   set (udvm_memory_size, 8192)
1327	   set (state_length, (udvm_memory_size - 64))

1329	   MULTILOAD (64, 4, circular_buffer, udvm_memory_size, 0,
1330	   circular_buffer)

1332	   :decompress_sigcomp_message

1334	   :character_after_literal

1336	   INPUT-HUFFMAN (index, end_of_message, 16,
1337	       5, 0, 11, 46,
1338	       0, 12, 12, 256,
1339	       1, 26, 32, 257,
1340	       1, 66, 68, 32,
1341	       0, 69, 94, 97,
1342	       0, 95, 102, 264,
1343	       0, 103, 103, 511,
1344	       2, 416, 426, 35,
1345	       0, 427, 465, 58,
1346	       0, 466, 481, 272,
1347	       1, 964, 995, 288,
1348	       3, 7968, 7988, 123,
1349	       0, 7989, 8115, 384,
1350	       1, 16232, 16263, 0,
1351	       0, 16264, 16327, 320,
1352	       1, 32656, 32767, 144)

1354	   COMPARE ($index, 256, literal, distance, distance)

1356	   :character_after_match

1358	   INPUT-HUFFMAN (index, end_of_message, 16,
1359	       4, 0, 0, 511,
1360	       1, 2, 9, 256,
1361	       1, 20, 22, 32,
1362	       0, 23, 30, 264,
1363	       1, 62, 73, 46,
1364	       0, 74, 89, 272,
1365	       2, 360, 385, 97,
1366	       0, 386, 417, 288,
1367	       1, 836, 874, 58,
1368	       0, 875, 938, 320,
1369	       1, 1878, 1888, 35,
1370	       0, 1889, 2015, 384,
1371	       1, 4032, 4052, 123,
1372	       1, 8106, 8137, 0,
1373	       1, 16276, 16379, 144,
1374	       1, 32760, 32767, 248)

1376	   COMPARE ($index, 256, literal, distance, distance)

1378	   :literal

1380	   set (index_lsb, (index + 1))

1382	   OUTPUT (index_lsb, 1)
1383	   COPY-LITERAL (index_lsb, 1, $decompressed_pointer)
1384	   JUMP (character_after_literal)

1386	   :distance

1388	   SUBTRACT ($index, 253)
1389	   INPUT-HUFFMAN (distance_value, !, 9,
1390	       9, 0, 7, 9,
1391	       0, 8, 63, 129,
1392	       1, 128, 135, 1,
1393	       0, 136, 247, 17,
1394	       0, 248, 319, 185,
1395	       1, 640, 1407, 257,
1396	       2, 5632, 6655, 1025,
1397	       1, 13312, 15359, 2049,
1398	       2, 61440, 65535, 4097)

1400	   LOAD (old_pointer, $decompressed_pointer)
1401	   COPY-OFFSET ($distance_value, $index, $decompressed_pointer)
1402	   OUTPUT ($old_pointer, $index)
1403	   JUMP (character_after_match)

1405	   :end_of_message

1407	   END-MESSAGE (requested_feedback_location,
1408	   returned_parameters_location, state_length, 64,
1409	   decompress_sigcomp_message, 6, 0)

1411	   readonly (0)
1412	   :circular_buffer

1414	   An example message compressed using the modified DEFLATE algorithm is
1415	   given below:

1417	   0xd956 b132 cd68 5424 c5a9 6215 8a70 a64d af0a 5499 3621 509b 3e4c
1418	   0x28b4 a145 b362 653a d0a6 498b 5a6d 2970 ac4c 930a a4ca 74a4 c268
1419	   0x0c

1421	   The uncompressed message is "Arthur leapt to his feet like an author
1422	   hearing the phone ring".

1424	5.  Additional SigComp mechanisms

1426	   The following chapter covers the additional mechanisms that can be
1427	   employed by SigComp to improve the overall compression ratio;
1428	   including the use of acknowledgements, dictionaries and sharing state
1429	   between two directions of a compressed message flow.

1431	   Example assembly code is provided for these mechanisms.  Depending on
1432	   the mechanism and basic algorithm in use, the assembly code for
1433	   either the mechanism or the basic algorithm may require modification
1434	   (e.g. if the algorithm uses 'no more input' to jump to
1435	   end_of_message, following end_of_message with an input instruction
1436	   for CRC will not work).  In any case, these are examples and there
1437	   may be alternative ways to make use of the mechanisms.

1439	   When each of the compression algorithms described in Section 4 has
1440	   successfully decompressed the current SigComp message, the contents
1441	   of the UDVM memory are saved as a SigComp state item.  Subsequent
1442	   messages can access this state item by uploading the correct state
1443	   identifier to the receiving endpoint, which avoids the need to upload
1444	   the bytecode for the compression algorithm on a per-message basis.

1446	   However, before a state item can be accessed the compressor must
1447	   first ensure that it is available at the receiving endpoint.

1449	   For each SigComp compartment, the receiving endpoint maintains a list
1450	   of currently available states (where the total amount of state saved
1451	   does not exceed the state_memory_size for the compartment).  The
1452	   SigComp compressor should maintain a similar list containing the
1453	   states that it has instructed the receiving endpoint to save.

1455	   As well as tracking the list of state items that it has saved at the
1456	   remote endpoint, the compressor also maintains a flag for each state
1457	   item indicating whether the state can safely be accessed or not.
1458	   State items should not be accessed until they have been acknowledged
1459	   (e.g. by using the SigComp feedback mechanism as per Section 5.1).

1461	   State items are deleted from the list when the total
1462	   state_memory_size for the compartment is used up by states of a
1463	   higher priority.  The SigComp compressor should not attempt to access
1464	   any state items that have been deleted in this manner, as they may no
1465	   longer be available at the receiving endpoint.

1467	5.1  Acknowledging a state item

1469	   SigComp [4] defines a feedback mechanism to allow the compressor to
1470	   request feedback from the decompressor, to give the compressor
1471	   indication that a message has been received, correctly decompressed
1472	   and state storage attempted.  (Note, this mechanism cannot convey the
1473	   success or failure of individual state creation requests.)  In order
1474	   to invoke the feedback mechanism the following fields must be
1475	   reserved in the UDVM memory:

1477	     0   1   2   3   4   5   6   7
1478	   +---+---+---+---+---+---+---+---+
1479	   |     reserved      | Q | S | I |  requested_feedback_location
1480	   +---+---+---+---+---+---+---+---+
1481	   | 1 | requested_feedback_length |  if Q = 1
1482	   +---+---+---+---+---+---+---+---+
1483	   |                               |
1484	   :   requested_feedback_field    :  if Q = 1
1485	   |                               |
1486	   +---+---+---+---+---+---+---+---+

1488	   These fields can be reserved in any of the algorithms of Chapter 4 by
1489	   replacing the line "set (requested_feedback_location, 0)" with the
1490	   following assembly:

1492	   :requested_feedback_location    pad (1)
1493	   :requested_feedback_length      pad (1)
1494	   :requested_feedback_field       pad (12)
1495	   :hash_start                     pad (8)

1497	   When a SigComp message is successfully decompressed and saved as
1498	   state, the following bytecode instructs the receiving endpoint to
1499	   return the first 6 bytes of the corresponding state identifier.  The
1500	   bytecode can be added to any of the compression algorithms of Chapter
1501	   4 immediately following the ":end_of_message" label:

1503	   :end_of_message

1505	   set (hash_length, (state_length + 8))

1507	   LOAD (requested_feedback_location, 1158)
1508	   MULTILOAD (hash_start, 4, state_length, 64,
1509	   decompress_sigcomp_message, 6)
1510	   SHA-1 (hash_start, hash_length, requested_feedback_field)

1512	   The receiving endpoint then returns the state identifier in the
1513	   "returned feedback field" of the next SigComp message to be
1514	   transmitted in the reverse direction.

1516	   When the state identifier is returned, the compressor can set the
1517	   availability flag for the corresponding state to 1.

1519	5.2  Static dictionary

1521	   Certain protocols that can be compressed using SigComp offer a fixed,
1522	   mandatory state item known as a static dictionary.  This dictionary
1523	   contains a number of text strings that commonly occur in messages
1524	   generated by the protocol in question.  The overall compression ratio
1525	   can often be improved by accessing the text phrases from this static
1526	   dictionary rather than by uploading them as part of the compressed
1527	   message.

1529	   As an example, a static dictionary is provided for the protocols SIP
1530	   and SDP RFC-3485 [6].  This dictionary is designed for use by a wide
1531	   range of compression algorithms including all of the ones covered in
1532	   Chapter 4.

1534	   In any of the compression algorithms of Chapter 4, the static
1535	   dictionary can be accessed by inserting the following instruction
1536	   immediately after the ":initialize_memory" label:

1538	   STATE-ACCESS (dictionary_id, 6, 0, 0, 1024, 0)

1540	   The parameters of STATE-ACCESS instruction will depend on the
1541	   compression algorithm in use.

1543	   The following lines should also be inserted immediately after the
1544	   END-MESSAGE instruction:

1546	   :dictionary_id

1548	   byte (0xfb, 0xe5, 0x07, 0xdf, 0xe5, 0xe6)

1550	   The text strings contained in the static dictionary can then be
1551	   accessed in exactly the same manner as the text strings from
1552	   previously decompressed messages (see Section 5.1 for further
1553	   details).

1555	   Note that in some cases it is sufficient to only load part of the
1556	   static dictionary into the UDVM memory.  Further information on the
1557	   contents of the SIP and SDP static dictionary can be found in the
1558	   relevant document RFC-3485 [6].

1560	5.3  CRC checksum

1562	   The acknowledgement scheme of Section 5.1 is designed to indicate the
1563	   successful decompression of a message.  However, it does not
1564	   guarantee that the decompressed message is identical to the original
1565	   message, since decompression of a corrupted message could succeed but
1566	   with some characters being incorrect.  This could lead to an
1567	   incorrect message being passed to the application or unexpected
1568	   contents of state to be stored.  In order to prevent this happening a
1569	   CRC check could be used.

1571	   If an additional CRC check is required then the following bytecode
1572	   can be inserted after the ":end_of_message" label:

1574	   INPUT-BYTES (2, index, !)
1575	   CRC ($index, 64, state_length, !)

1577	   The bytecode extracts a 2-byte CRC from the end of the SigComp
1578	   message and compares it with a CRC calculated over the UDVM memory.
1579	   Decompression failure occurs if the two CRC values do not match.

1581	   A definition of the CRC polynomial used by the CRC instruction can be
1582	   found in SigComp [4].

1584	5.4  Announcing additional resources

1586	   If a particular endpoint is able to offer more processing or memory
1587	   resources than the mandatory minimum, the SigComp feedback mechanism
1588	   can be used to announce that these resources are available to the
1589	   remote endpoint.  This may help to improve the overall compression
1590	   ratio between the two endpoints.

1592	   Additionally, if an endpoint has any pieces of state that may be
1593	   useful for the remote endpoint to reference, it can advertise the
1594	   identifiers for the states.  The remote endpoint can then make use of
1595	   any that it also knows about (i.e. knows the contents of) e.g. a
1596	   dictionary or shared mode state (see Section 5.5).

1598	   The values of the following SigComp parameters can be announced using
1599	   the SigComp advertisement mechanism:

1601	      cycles_per_bit
1602	      decompression_memory_size
1603	      state_memory_size
1604	      SigComp_version
1605	      state identifiers

1607	   As explained in SigComp, in order to announce the values of these
1608	   parameters the following fields must be reserved in the UDVM memory:

1610	     0   1   2   3   4   5   6   7
1611	   +---+---+---+---+---+---+---+---+
1612	   |  cpb  |    dms    |    sms    |  returned_parameters_location
1613	   +---+---+---+---+---+---+---+---+
1614	   |        SigComp_version        |
1615	   +---+---+---+---+---+---+---+---+
1616	   | length_of_partial_state_ID_1  |
1617	   +---+---+---+---+---+---+---+---+
1618	   |                               |
1619	   :  partial_state_identifier_1   :
1620	   |                               |
1621	   +---+---+---+---+---+---+---+---+
1622	           :               :
1623	   +---+---+---+---+---+---+---+---+
1624	   | length_of_partial_state_ID_n  |
1625	   +---+---+---+---+---+---+---+---+
1626	   |                               |
1627	   :  partial_state_identifier_n   :
1628	   |                               |
1629	   +---+---+---+---+---+---+---+---+

1631	   These fields can be reserved in any of the algorithms of Chapter 4 by
1632	   replacing the line "set (returned_parameters_location, 0)" with the
1633	   following piece of assembly:

1635	   :adverts_len                    pad (1)
1636	   :adverts_len_lsb                pad (1)
1637	   :returned_parameters_location   pad (1)
1638	   :returned_sigcomp_version       pad (1)
1639	   :state_ids                      pad (x)

1641	   where x is enough space for the number state identifiers that the
1642	   endpoint wishes to advertise.

1644	   When a SigComp message is successfully decompressed and saved as
1645	   state, the following bytecode announces to the receiving endpoint
1646	   that additional resources and pieces of state are available at the
1647	   sending endpoint:

1649	   :end_of_message

1651	   LOAD (returned_parameters_location, N)
1652	   INPUT-BYTES (1, adverts_len_lsb, done)
1653	   INPUT-BYTES ($adverts_len, state_ids, done)
1654	   :done

1656	   Note that the integer value "N" should be set equal to the amount of
1657	   resources available at the sending endpoint.  N should be expressed
1658	   as a 2-byte integer with the most significant bits corresponding to
1659	   the cycles_per_bit parameter and the least significant bits
1660	   corresponding to the SigComp_version parameter.

1662	   The length of the state identifiers, followed by the state
1663	   identifiers in the format shown are appended to the end of the
1664	   compressed message.

1666	5.5  Shared compression

1668	   This section provides bytecode for implementing the SigComp shared
1669	   compression mechanism RFC-3321 [5].  If two endpoints A and B are
1670	   communicating via SigComp, shared compression allows the messages
1671	   sent from Endpoint A to Endpoint B to be compressed relative to the
1672	   messages sent from Endpoint B to Endpoint A (and vice versa).  This
1673	   may improve the overall compression ratio by reducing the need to
1674	   transmit the same information in both directions.

1676	   As described in RFC-3321 [5], two steps must be taken to implement
1677	   shared compression at an endpoint.

1679	   Firstly, it is necessary to announce to the remote endpoint that
1680	   shared compression is available.  This is done by announcing the
1681	   state identifier as an available piece of state.  This can be done
1682	   using the returned_parameters_location announcement as in
1683	   Section 5.4.

1685	   Secondly, assuming that such an announcement is received from the
1686	   remote endpoint, then the state created by shared compression needs
1687	   to be accessed by the message sent in the opposite direction.  This
1688	   can be done in a similar way to accessing the static dictionary (see
1689	   Section 5.2), but using the appropriate state identifier, for
1690	   example, by using the INPUT-BYTES instruction as below:

1692	   :shared_state_id        pad (6)

1694	   :access_shared_state

1696	   INPUT-BYTES (6, shared_state_id, !)
1697	   STATE-ACCESS (shared_state_id, 6, 0, 0, $decompressed_start, 0)

1699	6.  Security considerations

1701	   This document describes implementation options for the SigComp
1702	   protocol [4].  Consequently the security considerations for this
1703	   document match those of SigComp.

1705	7.  Acknowledgements

1707	   Thanks to Richard Price, Carsten Bormann, Adam Roach, Lawrence
1708	   Conroy, Christian Schmidt, Max Riegel, Lars-Erik Jonsson, Jonathan
1709	   Rosenberg, Stefan Forsgren, Krister Svanbro, Miguel Garcia,
1710	   Christopher Clanton, Khiem Le, Ka Cheong Leung, Zoltan Barczikay for
1711	   valuable input and review.

1713	   Special thanks to Pekka Pessi and Cristian Constantin who served as
1714	   committed working group document reviewers.

1716	8.  Intellectual Property Right Considerations

1718	   The IETF has been notified of intellectual property rights claimed in
1719	   regard to some or all of the specification contained in this
1720	   document.  For more information consult the online list of claimed
1721	   rights.

1723	9.  References

1725	   [1]   Johnston, A., Donovan, S., Sparks, R., Cunningham, C., and K.
1726	         Summers, "Session Initiation Protocol (SIP) Basic Call Flow
1727	         Examples", RFC 3665, December 2003.

1729	   [2]   Bradner, S., "The Internet Standards Process -- Revision 3",
1730	         RFC 3667, February 2004.

1732	   [3]   Crocker, D. and P. Overell, "Augmented BNF for Syntax
1733	         Specifications: ABNF", RFC 2234, November 1997.

1735	   [4]   Price, R., Borman, C., Christoffersson, J., Hannu, H., Liu, Z.,
1736	         and J. Rosenberg, "Signaling Compression (SigComp)", RFC 3320,
1737	         January 2003.

1739	   [5]   Hannu, H., Christoffersson, J., Forsgren, S., Leung, K., Liu,
1740	         Z., and R. Price, "Signaling Compression (SigComp) - Extended
1741	         Operations", RFC 3321, January 2003.

1743	   [6]   Garcia-Martin, M., Borman, C., Ott, J., Price, R., and A.
1744	         Roach, "The Session Initiation Protocol (SIP) and Session
1745	         Description Protocol (SDP) Static Dictionary for Signaling
1746	         Compression (SigComp)", RFC 3485, February 2003.

1748	   [7]   Ziv, J. and A. Lempel, "A universal algorithm for sequential
1749	         data compression", IEEE 23:337-343, 1977.

1751	   [8]   Storer, J., "Data Compression: Methods and Theory", Computer
1752	         Science Press ISBN 0-88175-161-8, 1998.

1754	   [9]   Nelson, M., "LZW Data Compression", Dr Dobb's Journal,
1755	         October 1989.

1757	   [10]  Deutsch, P., "DEFLATE Compressed Data Format Specification
1758	         version 1.3", RFC 1951, May 1996.

1760	   [11]  "Data Compression Procedures", ITU-T Recommendation V.44,
1761	         November 2000.

1763	Authors' Addresses

1765	   Abigail Surtees
1766	   Siemens/Roke Manor Research
1767	   Roke Manor Research Ltd.
1768	   Romsey, Hants  SO51 0ZN
1769	   UK

1771	   Phone: +44 (0)1794 833131
1772	   Email: abigail.surtees@roke.co.uk
1773	   URI:   http://www.roke.co.uk

1775	   Mark A. West
1776	   Siemens/Roke Manor Research
1777	   Roke Manor Research Ltd.
1778	   Romsey, Hants  SO51 0ZN
1779	   UK

1781	   Phone: +44 (0)1794 833311
1782	   Email: mark.a.west@roke.co.uk
1783	   URI:   http://www.roke.co.uk

1785	Appendix A.  UDVM bytecode for the compression algorithms

1787	   The following sections list the UDVM bytecode generated for each
1788	   compression algorithm of Section 4.

1790	   Note that the different assemblers can output different bytecode for
1791	   the same piece of assembly code, so a valid assembler can produce
1792	   results different from those presented below.  However, the following
1793	   bytecode should always generate the same decompressed messages on any
1794	   UDVM.

1796	A.1  Well-known Algorithms

1798	A.1.1  Simplified LZ77

1800	   0x0f86 0389 8d89 1588 8800 011c 0420 0d13 5051 2222 5051 16f5 2300
1801	   0x00bf c086 a08b 06

1803	A.1.2  LZSS

1805	   0x0f86 04a0 c48d 00a0 c41e 2031 0209 00a0 ff8e 048c bfff 0117 508d
1806	   0x0f23 0622 2101 1321 0123 16e5 1d04 22e8 0611 030e 2463 1450 5123
1807	   0x2252 5116 9fd2 2300 00bf c086 a089 06

1809	A.1.3  LZW

1811	   0x0f86 06a1 ce8d 00b1 8f01 a0ce 13a0 4903 2313 2501 2506 1201 1752
1812	   0x88f4 079f 681d 0a24 2508 1203 0612 b18f 1252 0321 0ea0 4801 0624
1813	   0x5013 a049 0323 1351 5025 2251 5016 9fde 2300 00bf c086 a09f 06

1815	A.1.4  DEFLATE

1817	   0x0f86 7aa2 528d 05a2 5200 0300 0400 0500 0600 0700 0800 0900 0a01
1818	   0x0b01 0d01 0f01 1102 1302 1702 1b02 1f03 2303 2b03 3303 3b04 a043
1819	   0x04a0 5304 a063 04a0 7305 a083 05a0 a305 a0c3 05a0 e300 a102 0001
1820	   0x0002 0003 0004 0105 0107 0209 020d 0311 0319 0421 0431 05a0 4105
1821	   0xa061 06a0 8106 a0c1 07a1 0107 a181 08a2 0108 a301 09a4 0109 a601
1822	   0x0aa8 010a ac01 0bb0 010b b801 0c80 2001 0c80 3001 0d80 4001 0d80
1823	   0x6001 1d03 229f b41e 20a0 6504 0700 1780 4011 0130 a0bf 0000 a0c0
1824	   0xa0c7 8040 2901 a190 a1ff a090 1750 8040 1109 a046 1322 2101 1321
1825	   0x0123 169f d108 1004 1250 0422 1d51 229f d706 1251 1e20 9fcf 0105
1826	   0x001f 2f08 1004 1250 0426 1d53 26f6 0614 530e 2063 1454 5223 2250
1827	   0x5216 9f9e 2300 00bf c086 a1de 06

1829	A.1.5  LZJH

1831	   0x0f86 08a1 5b8d 0700 a15b 0706 b18f 1d01 24a0 c317 5201 1a31 311e
1832	   0x24a0 b802 0101 0102 0100 0100 1752 0107 a04e 1e1d 6524 f822 2501
1833	   0x0ea0 4602 13a0 4703 2713 2501 2416 9fcd 1d66 24e1 1752 03a0 639f
1834	   0xb808 0812 0306 12b1 8312 5203 210e a046 0106 2350 0e28 6713 a047
1835	   0x0327 1351 5024 2251 5016 9fa8 1e24 9fb1 0401 0101 0102 0103 0201
1836	   0x0101 0d03 0007 0517 520d 0d06 061d 0826 f706 1253 1351 5011 1351
1837	   0x5224 2251 5206 1250 1225 0154 169f 6617 5201 9fdb 070f 1c00 009e
1838	   0xce16 9f57 1d01 24fa 1752 0107 0d9e c206 2501 169f 6506 2601 169f
1839	   0x7623 0000 bfc0 86a0 8e06

1841	A.2  Adapted Algorithms

1843	A.2.1  Modified DEFLATE

1845	   0x0f86 04a1 d38d 00a1 d31e 20a1 4010 0500 0b2e 000c 0c88 011a 20a1
1846	   0x0101 a042 a044 2000 a045 a05e a061 00a0 5fa0 66a1 0800 a067 a067
1847	   0xa1ff 02a1 a0a1 aa23 00a1 aba1 d13a 00a1 d2a1 e1a1 1001 a3c4 a3e3
1848	   0xa120 03bf 20bf 34a0 7b00 bf35 bfb3 a180 0180 3f68 803f 8700 0080
1849	   0x3f88 803f c7a1 4001 807f 9080 7fff a090 1750 88a0 79a0 83a0 831e
1850	   0x20a0 c810 0400 00a1 ff01 0209 8801 1416 2000 171e a108 013e a049
1851	   0x2e00 a04a a059 a110 02a1 68a1 81a0 6100 a182 a1a1 a120 01a3 44a3
1852	   0x6a3a 00a3 6ba3 aaa1 4001 a756 a760 2300 a761 a7df a180 01af c0af
1853	   0xd4a0 7b01 bfaa bfc9 0001 803f 9480 3ffb a090 0180 7ff8 807f ffa0
1854	   0xf817 5088 0610 1022 2101 1321 0123 169f 1107 10a0 fd1e 229f d909
1855	   0x0900 0709 0008 3fa0 8101 87a0 8701 00a0 88a0 f711 00a0 f8a1 3fa0
1856	   0xb901 a280 a57f a101 02b6 00b9 ffa4 0101 8034 0080 3bff a801 0290
1857	   0x00ff b001 0e24 6314 5150 2322 5250 169f 3b23 0000 bfc0 86a0 8906

1859	Intellectual Property Statement

1861	   The IETF takes no position regarding the validity or scope of any
1862	   Intellectual Property Rights or other rights that might be claimed to
1863	   pertain to the implementation or use of the technology described in
1864	   this document or the extent to which any license under such rights
1865	   might or might not be available; nor does it represent that it has
1866	   made any independent effort to identify any such rights.  Information
1867	   on the procedures with respect to rights in RFC documents can be
1868	   found in BCP 78 and BCP 79.

1870	   Copies of IPR disclosures made to the IETF Secretariat and any
1871	   assurances of licenses to be made available, or the result of an
1872	   attempt made to obtain a general license or permission for the use of
1873	   such proprietary rights by implementers or users of this
1874	   specification can be obtained from the IETF on-line IPR repository at
1875	   http://www.ietf.org/ipr.

1877	   The IETF invites any interested party to bring to its attention any
1878	   copyrights, patents or patent applications, or other proprietary
1879	   rights that may cover technology that may be required to implement
1880	   this standard.  Please address the information to the IETF at
1881	   ietf-ipr@ietf.org.

1883	Disclaimer of Validity

1885	   This document and the information contained herein are provided on an
1886	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1887	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
1888	   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
1889	   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1890	   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1891	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1893	Copyright Statement

1895	   Copyright (C) The Internet Society (2005).  This document is subject
1896	   to the rights, licenses and restrictions contained in BCP 78, and
1897	   except as set forth therein, the authors retain all their rights.

1899	Acknowledgment

1901	   Funding for the RFC Editor function is currently provided by the
1902	   Internet Society.

1904	This Internet-Draft will expire on April 27, 2006.