Network Working Group Richard Price, Siemens/Roke Manor INTERNET-DRAFT Abigail Surtees, Siemens/Roke Manor Expires: September 2003 Mark A West, Siemens/Roke Manor March 3, 2003 Formal Notation for Robust Header Compression (ROHC-FN) Status of this memo This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet- Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or cite them other than as "work in progress". The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html This document is a submission of the IETF ROHC WG. Comments should be directed to its mailing list, rohc@ietf.org. Abstract This document defines ROHC-FN: a formal notation for specifying how to compress and decompress fields from an arbitrary protocol stack. ROHC-FN is intended to simplify the creation of new compression profiles to fit within the ROHC [RFC-3095] framework. Price et al. [Page 1] INTERNET-DRAFT ROHC-FN March 3, 2003 Table of contents 1. Introduction..................................................3 2. Terminology...................................................3 3. Overview of ROHC-FN...........................................4 4. Normative definition of ROHC-FN...............................8 5. Basic libraries...............................................11 6. Extended library of encoding methods..........................16 7. Security considerations.......................................27 8. Acknowledgements..............................................27 9. Authors' addresses............................................28 10. References....................................................28 1. Introduction ROHC-FN is a simple notation designed to help with the creation of new ROHC [RFC-3095] header compression profiles. ROHC-FN offers a library of "encoding methods" that are often used in ROHC profiles, so new profiles can be defined without needing to redefine this library from scratch. Informally, an encoding method is just a function that converts uncompressed data into compressed data. The simplest encoding methods only have one input and output: the input is an uncompressed field and the output is the compressed version of the field. More complex encoding methods can compress multiple fields at the same time, e.g. "list" encoding from [RFC-3095], which is designed to compress an ordered list of fields. ROHC-FN also allows new encoding methods to be created from existing ones, which is useful when defining encoding methods that are not included in the basic ROHC-FN library (e.g. because they only apply to a specific protocol). The features required for ROHC-FN are already offered as standard by a number of programming languages, including Prolog, Haskell and Mercury. An off-the-shelf compiler for any of these languages can compile ROHC-FN encoding methods into executable code. [Editor's Note: The syntax of ROHC-FN is currently very close to that of Haskell, but not completely identical. At some stage we should fix this - either by using the exact Haskell syntax, or by picking a different language and using the syntax from that instead.] Note that since ROHC-FN only needs a small percentage of the features offered by the above languages, this draft contains a standalone definition of the notation (i.e. there's no need to understand Prolog/Haskell/Mercury in order to understand ROHC-FN). Price et al. [Page 2] INTERNET-DRAFT ROHC-FN March 3, 2003 2. Terminology The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC-2119 [RFC-2119]. Control field Control fields are transmitted from a ROHC compressor to a ROHC decompressor, but are not part of the uncompressed protocol header itself. An example is a checksum field over the header to ensure robustness against bit errors and dropped packets. Encoding method Encoding methods are functions that can be applied to compress fields in a protocol header. ROHC-FN includes a library of commonly used encoding methods. Field ROHC-FN divides the protocol to be compressed into a set of contiguous bit patterns known as fields. Library of encoding methods The library of encoding methods contains a number of commonly used encoding methods for compressing header fields. Profile A ROHC [RFC-3095] profile is a description of how to compress a certain protocol stack over a certain type of link. Each profile includes packet formats to compress the headers and a state machine to control the actions of each endpoint. 3. Overview of ROHC-FN This section gives an overview of ROHC-FN and explains how it can be used to compress header fields as part of the ROHC framework. 3.1. Scope of ROHC-FN The following section describes the scope of ROHC-FN, and explains how it relates to the overall ROHC framework and to specific ROHC profiles. The ROHC framework is common to all profiles, and provides generic features such as padding and segmentation of the compressed packets (useful if the link layer can only handle a limited range of packet sizes). Price et al. [Page 3] INTERNET-DRAFT ROHC-FN March 3, 2003 A ROHC profile is a description of how to compress a certain protocol stack over a certain type of link. For example, ROHC profiles are available for RTP/UDP/IP and many other protocol stacks. Each ROHC profile can be further subdivided into the following two components: a) Packet formats for compressing and decompressing headers b) State machine The job of the packet formats is to define how to compress and decompress headers. The packet formats must define the compressed version of each uncompressed header (and vice versa). The packet formats will typically compress headers relative to a "context" of field values from previous headers in a flow. This improves the overall compression ratio, due to taking into account redundancies between successive headers. The job of the state machine is to ensure that the profile is robust against bit errors and dropped packets. The state machine manages the context, providing feedback and other mechanisms to ensure that the compressor and decompressor contexts are kept in sync. ROHC-FN is designed to help provide the packet formats for use in new ROHC profiles. It offers a library of encoding methods for compressing fields, and a mechanism for combining these encoding methods to create new packet formats tailored to a specific protocol stack. Note however that the state machine for the new profiles is beyond the scope of ROHC-FN, and must be provided separately. 3.2. Worked example using IPv6 Rather than immediately diving in with a formal definition of ROHC- FN, the following section will give an overview of how the notation is used by means of a worked example. The example will develop an encoding method capable of compressing a single, well-known header. A suitable choice for the header is IPv6, which has fields that display a good range of behaviours to demonstrate ROHC-FN in action. The "ipv6_header" encoding method will be created from well-known encoding methods in the ROHC-FN library. The first step is to break the IPv6 header down into its component fields, which is accomplished in ROHC-FN as follows: ipv6_header ::= version traffic_class ect_flag ce_flag Price et al. [Page 4] INTERNET-DRAFT ROHC-FN March 3, 2003 flow_label payload_length next_header hop_limit source_address destination_address For the purpose of the example we'll need some simple encoding methods from the ROHC-FN library. The "static" encoding method can compress any field whose length and value is the same as for the previous header in the flow. No compressed bits need to be sent because the static field can be reconstructed using only the context. We can apply "static" encoding to the following fields in the IPv6 header: traffic_class ::= static ect_flag ::= static flow_label ::= static hop_limit ::= static source_address ::= static destination_address ::= static Secondly, the "value (n, k)" encoding method handles n-bit fields that take a fixed integer value k. The parameters n and k can be filled in with whatever values we require. For example, the IPv6 Version field is a 4-bit field that always takes the value 6: version ::= value (4, 6) Finally, the "self-describing values" encoding method, denoted by the symbol "/", allows a field to be compressed in two or more different ways. For example, if we wished to compress the Version field from an arbitrary IP header (including both IPv4 and IPv6) then we could encode it as "value (4, 4) / value (4, 6)". The cost of using "self-describing values" encoding is that a 1-bit indicator flag needs to be sent in the compressed header, so that the decompressor knows which of the two choices has been used to compress the field. So, when encoding a field there is always a trade-off between handling a larger set of field behaviors and getting a better compression ratio. In the case of IPv6, we will assume that the only field with more than one possible behavior is the CE Flag: ce_flag ::= value (1, 0) / value (1, 1) Price et al. [Page 5] INTERNET-DRAFT ROHC-FN March 3, 2003 The remaining two fields in the IPv6 header are the Payload Length and the Next Header fields. These fields are more difficult to compress because their behavior is related to other fields in the protocol stack: the Next Header field depends on the next header after IPv6, whilst the Payload Length can be inferred from the total packet size. In order to compress these fields we'll need to use ROHC-FN's ability to store temporary variables. ROHC-FN allows a field value (or any other useful information) to be stored in a variable, so that it can be retrieved and used by other encoding methods. See Section 4.2.3 for further details on temporary variables. The "label" encoding method takes an n-bit field from the uncompressed header and puts it into a temporary variable, so that it can be accessed when compressing later fields in the header. We can apply this method to the Next Header field as follows: next_header ::= label (8, Next_Header) The above encoding method extracts 8 bits from the uncompressed IPv6 header and stores them in a temporary variable called "Next_Header". Note that we haven't actually compressed the Next Header field yet, but rather put it into a variable so that it can be handled later on once we know the next header in the chain. See Section 6.2.2 for an example of how to retrieve and compress this field. The only field left in the IPv6 header is the Payload Length. Fortunately this field is easy to compress, because ROHC-FN has a well-known variable called "Remaining_Data" that stores the number of bits left to compress in the uncompressed packet. So the value of the Payload Length field is just ((Remaining_Data - 304) / 8). ROHC- FN allows simple formulae of this nature to be used when defining new encoding methods: payload_length ::= value (16, (Remaining_Data - 304) / 8) We've now finished defining how to compress an entire IPv6 header, using a selection of encoding methods from the ROHC-FN library. The encoding is rather efficient: the only field that requires data to be sent between the compressor and the decompressor is the CE Flag (and even for this field only 1 bit is needed). All other fields can be reconstructed automatically at the decompressor. 3.3. Adding robustness ROHC profiles are designed to be "robust" against packet loss and residual bit errors on the link over which header compression takes place. A well-designed profile can cope with these errors without losing additional packets or introducing additional bit errors in the decompressed headers. Price et al. [Page 6] INTERNET-DRAFT ROHC-FN March 3, 2003 ROHC-FN offers two techniques to help ensure that a ROHC profile is robust. Firstly, the encoding methods in the ROHC-FN library are designed to tolerate a certain number of dropped or misordered packets between the compressor and decompressor. For example, Least- Significant Bit (LSB) encoding can robustly compress fields that change by a small value between successive headers. Secondly, the "checksum" encoding method can provide a CRC over the original uncompressed header to detect faulty decompressed headers and prevent them from mistakenly being used to update the context. This situation is illustrated in Figure 1: CRC failure +--------------+ +--------------+ ================ +--------------+ -| Valid header |-| Valid header |-|Invalid header|-| Valid header |- +--------------+ +--------------+ ================ +--------------+ | | | | | | >------+ +------>------+ +--------------->--------------+ +------> context context Figure 1: Preventing accidental corruption of the context 4. Normative definition of ROHC-FN This section gives the normative definition of ROHC-FN, including its syntax and any data structures that it requires. 4.1. ROHC-FN syntax One of the key features of ROHC-FN is the ability to define new encoding methods in terms of existing encoding methods. Using this technique it is possible to start with a handful of simple encoding methods and create more complex methods offering more advanced compression techniques. Creating a new encoding method in ROHC-FN is extremely simple. All that needs to be provided is the following: a) A name for the new encoding method. b) Names for each of the parameters needed by the encoding method. c) An expression that defines how the new encoding method behaves. As an example, recall that we defined a new encoding method for handling the Payload Length in an IPv6 header as follows: payload_length ::= value (16, (Remaining_Data - 304) / 8) The above definition creates a new encoding method called "payload_length" (with no parameters) that infers the value of the field from the total number of uncompressed bits remaining in the packet. Price et al. [Page 7] INTERNET-DRAFT ROHC-FN March 3, 2003 The disadvantage of the above encoding method is that it's designed specifically to compress the Payload Length field in an IPv6 header. If other fields are encountered that behave in a similar manner then new encoding methods will need to be defined especially for these fields. The alternative is to create a more generic encoding method called "inferred_size", which can handle any Note that the "inferred_size" encoding method takes two parameters, both of which are integers. inferred_size (#n,#p) ::= value (n, (Remaining_Data - n - p) / 8) The names of each parameter needed by an encoding method are enclosed in parentheses and separated by commas. A complete list of the available parameter types can be found in Section 4.1.1. This gives us another way of compressing the Payload Length, namely: payload_length ::= inferred_size (16, 288) Note that the name of a new encoding method or parameter can be any combination of letters, digits or the symbol "_", provided that it doesn't coincide with an existing name. Names are case-sensitive. 4.1.1. Parameter types As explained in Section 4.1, encoding methods defined by ROHC-FN can have parameters that affect how the method behaves. The following parameter types are available: a) Integer (#) b) Variable name ($) c) Encoding method d) List ([]) Integers can be positive or negative, and can have arbitrary precision. Variables are explained in more detail in Section 4.2.3. An encoding method can also be defined that takes one or more existing encoding methods as parameters. This feature is very useful for creating new, more complex encoding methods from simpler ones. Lists can contain entries of any type (so lists of integers, variable names or even encoding methods are all acceptable). However, a single list MUST NOT contain entries of different types. 4.1.2. Local definitions To improve the readability of ROHC-FN, the description of an encoding method can use one or more "local definitions". For example: Price et al. [Page 8] INTERNET-DRAFT ROHC-FN March 3, 2003 uncompressible ($name,#d,#m,#p) ::= irregular (n) where n is (floor (name, d) * m + p) The keyword "where" indicates that a list of local definitions follows the expression for the new encoding method. Each local definition is given a name (using the same namespace as the parameters described in Section 4.1), followed by an arbitrary expression. When the new encoding method is used, any instances of a local definition should be replaced by the corresponding expression. 4.2. Implementation structures The following section gives some information on the data structures required by an implementation of ROHC-FN. Note that the exact way in which the data structures are stored is up to the implementation itself. The following three "global" data structures should be made available to all encoding methods: a) Uncompressed and compressed packets b) The context c) Temporary variables Each of these data structures is explained in more detail below: 4.2.1. Uncompressed and compressed packets An encoding method may need to read bits from the uncompressed packet, and may wish to append bits to the compressed version of the packet. Note that there's no need to divide the uncompressed packet into header and payload a-priori. The ROHC-FN encoding methods always specify the location of the field(s) that they need from the uncompressed packet. 4.2.2. The context The behavior of one header is often related to the behavior of previous headers in a flow. For example, the RTP Sequence Number field increases by 1 for each consecutive header in an RTP stream. ROHC profiles take into account the dependency between successive headers by storing a "context" of field values from a header. This context can then be used when compressing the next header in the flow. Price et al. [Page 9] INTERNET-DRAFT ROHC-FN March 3, 2003 ROHC-FN stores the context as a list of fields, where each field has a length and an integer value. Note that storing the length of a field is important in case the field changes length between successive headers. Encoding methods should be able to read the old field values from the context, and should be able to update the context with the new field values from the current header. 4.2.3. Temporary variables Certain ROHC-FN encoding methods may need some temporary storage space to perform calculations, or to buffer field values that cannot immediately be compressed. For this reason, ROHC-FN provides a set of temporary variables that can be used to store and retrieve data. Each temporary variable holds a single integer value. Any variables that are not in use should be set to the special value "Null", denoting that the variable is undefined. The ROHC-FN library includes various encoding methods that can read from and write to temporary variables. Temporary variable names are just arbitrary text strings. Note that the variable namespace is disjoint from the other namespaces used by ROHC-FN, so for example it is ok to give a variable the same name as an encoding method. However, to prevent confusion between encoding methods and variables it is RECOMMENDED that variable names begin with an uppercase letter. 5. Basic libraries The following section defines some core libraries for ROHC-FN: a maths library, a library for handling lists, and a basic library of encoding methods. 5.1. Maths library This section provides a library of maths functions for operating on integer values. The following functions are currently available: Integers Integers can be expressed as decimal values, binary values prefixed by 0b, or hex values prefixed by 0x. Negative integers are prefixed by a "-" sign. Null The keyword "Null" can be used to specify an undefined integer value (useful for deleting temporary variables that are no longer needed). Price et al. [Page 10] INTERNET-DRAFT ROHC-FN March 3, 2003 Inequalities Inequalities (<, <=, ==, !=, >=, >) return 1 if they are true and 0 if they are false. Arithmetic The functions +, -, *, / and ^ are available. Note that k / v is undefined if it does not evaluate to an integer. Percentages A value can be expressed as a percentage with up to 2 decimal places (i.e. abc.xy%). This is multiplied by 100 to give an integer from 0 to 10000 inclusive. floor (#k, #v) Returns k / v rounded down to the nearest integer (undefined for v == 0). mod (#k, #v) Returns k - v * floor (k, v). log2 (#v) Returns the smallest integer k where v <= 2^k. min(#k[]) Returns j such that k[j] is minimal (taking the smallest value of j in the event of a tie). checksum16 (#n) Returns the integer value of an IP checksum over the last n bits in the uncompressed packet. crc (#n) Returns an n-bit CRC over the entire uncompressed header. [Editor's Note: CRC needs to be defined precisely.] switch(#j,#k[]) Returns k[j]. byte_swap (#n, #k) Returns the byte-swapped value of k when treated as an n-bit field (undefined if n is not a multiple of 8). permutation(#n,#k,#x) Checks if the integer x is a concatenation of the integers 0 to n-1 (each interpreted as a bit pattern of length k) in any order. Returns 1 if this is the case and 0 if not. context_length (#n) Returns the length of the nth field in the context. context_value (#n) Returns the value of the nth field in the context. Variable names Returns the integer value of the specified variable. Price et al. [Page 11] INTERNET-DRAFT ROHC-FN March 3, 2003 Note that if any of the parameters are undefined, the value returned by the function is also undefined. The precedence for each of the infix functions is given below (higher precedence first): ^ *, / +, - <, <=, ==, !=, >=, > 5.2. Library for handling lists As mentioned in Section 4.1.1, one of the parameter types offered by ROHC-FN is the list. When writing lists, the individual entries should be separated by commas. For example: "0, 1, 2, 3, 4, 5" "Tom, Dick, Harry, Sally" The list handling library includes two further techniques for simplifying the creation of lists: 5.2.1. Lists of consecutive integers A list of consecutive integers can be specified using the expression "lower_bound..upper_bound". For example: "0..5" is equivalent to "0, 1, 2, 3, 4" If lower_bound > upper_bound then the resulting list is empty. 5.2.2. List comprehensions List comprehensions are a powerful way of creating new lists from existing ones. The overall expression for a list comprehension is enclosed in square braces, and is separated into two parts using the keyword "for". The left-hand side gives an expression for each element of the list in terms of one or more indices. The right-hand side declares each of these indices together with the list of values that they take, for example: [2 * j for j in 0..5] gives "0, 2, 4, 6, 8" [k - j for j in 0..3, k in 0..3] gives "0, 1, 2, -1, 0, 1, -2, -1, 0" [k for j in 0..4, k in 0..j] gives "0, 0, 1, 0, 1, 2" Price et al. [Page 12] INTERNET-DRAFT ROHC-FN March 3, 2003 Note that if more than one index is provided, the rightmost index is varied the most quickly whilst the leftmost index is varied the least quickly. 5.3. Basic library of encoding methods The following sections describe the core library of encoding methods for ROHC-FN. 5.3.1. Field The "field" encoding method extracts an n-bit field from the uncompressed packet and stores it in a temporary variable. It also updates the context with the new field value. The format of the "field" encoding method is given below: field (n) Note the single parameter n, which specifies the number of bits to be extracted from the uncompressed packet. The position of the field in the packet is read from a temporary variable called "Remaining_Data". This variable specifies the start of the field as a bit offset relative to the end of the packet, as illustrated in Figure 2. <-------------- Remaining_Data ---------------> +---------------------+-------------+-------------------------------+ | | n-bit field | | +---------------------+-------------+-------------------------------+ <----------------------- Header and payload ------------------------> Figure 2: The location of the n-bit field in the uncompressed packet Let F denote the integer value of the n-bit field (taking MSBs to precede LSBs). If n == 0 then take F to be 0. The following steps must then be taken: a) If n == Null or Remaining_Data == Null or Remaining_Data < n then fail. b) If Field_Value == Null then set Field_Value:= F and set Remaining_Data := Remaining_Data - n. c) If No_Update != 1 then set context_length (Index) := n and set context_value (Index) := F. Price et al. [Page 13] INTERNET-DRAFT ROHC-FN March 3, 2003 5.3.2. Distribution The "distribution" encoding method appends a field to the compressed packet. distribution (list) The log2 of the number of items in the list specifies the length of the compressed field. The integer value of the field is taken from the temporary variable "Field_Value". 5.3.3. Map The "map" encoding method updates the value of a temporary variable. The format of this encoding method is given below: map ($name, #old_value, #new_value) The encoding method has a total of three parameters. The first is the name of the variable to be updated. The second parameter is the old value of the variable before the encoding method has been applied, and the third parameter is the new value for the variable. When compressing a header, the "map" encoding method MUST check that the variable is equal to old_value. If this is the case then the variable is set to new_value. If not then the encoding method fails. When decompressing a header, the encoding method MUST check that the variable is equal to new_value. If this is the case then the variable is set to old_value. If not then the encoding method fails. The expressions for old_value and new_value can include any of the temporary variables except for the variable currently being updated. The "map" encoding method does not update the context or the compressed packet. 5.3.4. Concatenation The "concatenation" encoding method applies a list of encoding methods to a packet. The format of this encoding method is given below: concatenation (methods[]) The "concatenation" method requires a single parameter, which is a list of encoding methods to apply to the packet. For example: simple_header ::= concatenation irregular (16), static, lsb (4, 2) Price et al. [Page 14] INTERNET-DRAFT ROHC-FN March 3, 2003 The encoding method also has the following shorthand form: methods[0] methods[1] ... methods[n-1] For example: simple_header ::= irregular (16) static lsb (4, 2) When compressing a header, the encoding methods are applied sequentially in the same order as they appear in the list. If any method in the list fails, then the "concatenation" method also fails. 6. Extended library of encoding methods The ROHC [RFC-3095] standard contains a number of different techniques for compressing header fields (LSB encoding, scaled timestamp encoding, list-based compression etc.). Each of these techniques can be added to the ROHC-FN library so that they can be reused when creating new ROHC profiles. The following encoding methods are all defined using the ROHC-FN language of Section 4 and the basic libraries from Section 5. 6.1. Numeric encoding methods This section defines the simplest set of encoding methods that are typically used to handle numeric fields. 6.1.1. Range The "range" encoding method compresses an n-bit field that takes a value between v and v+k-1 inclusive. The definition of the encoding method is given below: range (#n,#v,#k) ::= field (n) map (Tmp, Null, Field_Value) map (Field_Value, Tmp, mod (Tmp, k)) map (Tmp, mod (mod (Field_Value - v, k) + v, 2^n), Null) distribution ([1 for j in 0..k]) The remaining encoding methods in this section are all defined in terms of the "range" encoding method. 6.1.2. Value The "value" encoding method compresses an n-bit field that takes a known value v. The definition of the encoding method is given below: Price et al. [Page 15] INTERNET-DRAFT ROHC-FN March 3, 2003 value (#n,#v) ::= range (n, v, 1) So the "value" encoding method is just a special case of "range", where the field is known to take exactly one value. In this case the field will be compressed down to 0 bits (i.e. no compressed data is needed to reconstruct the field at the decompressor). 6.1.3. Irregular The "irregular" encoding method compresses an n-bit field that behaves randomly (i.e. each of the 2^n possible field values occurs with equal probability). The definition of the encoding method is given below: irregular (#n) ::= range (n, 0, 2^n) 6.1.4. Irregular_padded The "irregular_padded" encoding method compresses an n-bit field with k least significant bits that behave randomly and with (n - k) most significant bits that are always zero. The definition of the encoding method is given below: irregular_padded (#n,#k) ::= range (n, 0, 2^k) 6.1.5. Static The "static" encoding method compresses a field whose length and value is the same as for the previous header in the flow. The definition of the encoding method is given below: static ::= range (context_length (Index), context_value (Index), 1) Note that "static" is defined using the functions context_length and context_value, which respectively return the length and value of the current field stored in the context. 6.1.6. Lsb The "lsb" encoding method compresses a field whose value differs by a small amount from the value stored in the context. The definition of the encoding method is given below: lsb (#k,#p) ::= range (context_length (Index), context_value (Index) - p, 2^k) The "lsb" encoding method can compress a field whose value lies between (context_value - p) and (context_value - p + 2^k - 1) inclusive. In particular, if p = 0 then the field value can only stay the same or increase relative to the previous header in the Price et al. [Page 16] INTERNET-DRAFT ROHC-FN March 3, 2003 flow. If p = -1 then it can only increase, whereas if p = 2^k then it can only decrease. 6.2. Encoding methods for storing fields This section provides some encoding methods that can be used to store and retrieve fields, so that their values can be used when compressing fields that occur elsewhere in the header. 6.2.1. Label The "label" encoding method extracts an n-bit field from the uncompressed header and puts it into a temporary variable: label (#n,$name) ::= field (n) map (name, Null, Field_Value) map (Field_Value, name, Null) The "label" encoding method can be used in conjunction with the "next_field" encoding method specified below. This is particularly important for capturing dependencies between fields that occur in different parts of the header. 6.2.2. Next_field The "next_field" encoding method indicates that the next field to be compressed is not the next field in the header, but instead is a certain previously labeled field (see the "label" encoding method above). The definition of the encoding method is given below: next_field ($name) ::= map (Field_Value, Null, name) map (name, Field_Value, Null) The "next_field" encoding method sets Field_Value to be the value of the named field. As an example, suppose that the IPv6 Next Header field has been stored in a temporary variable called "Next Header", ready to be compressed later on when the header following IPv6 is known. tcp_header ::= ... next_field (Next_Header) value (4, 6) 6.2.3. Local The "local" encoding method can be useful when defining a new encoding method in terms of an existing method, both of which use the same temporary variable. It prevents the case whereby two values are assigned to the same variable at the same time (which results in compression failure). Price et al. [Page 17] INTERNET-DRAFT ROHC-FN March 3, 2003 The definition of the encoding method is given below: local ($name, method) ::= call (name, name, method) The "local" encoding method takes as its parameters the name of a variable and an existing method: the named variable is set to "Null" and then the existing method is invoked. The original value is then restored to the variable. Note that the definition of the "local" encoding method is given in terms of another method "call" which is defined as follows: call (#k, $name, method) ::= map (name, k, Null) method map (name, Null, k) 6.3. Inferred encoding methods The "inferred" encoding methods describe fields whose values can be inferred from other fields in the header. The following inferred encoding methods are available: 6.3.1. Inferred_translate The "inferred_translate" encoding method translates a field value under a certain mapping. The definition of the encoding method is given below: inferred_translate (#n,#a[],#b[]) ::= field (n) map (Tmp, Null, Field_Value) map (Field_Value, Tmp, b[min([Tmp != a[j] for j in 0..end])]) map (Tmp, a[min([Field_Value != b[j] for j in 0..end])], Null) inferred_translate n list_a list_b = field n, assign "Tmp" Nil (val "Current Field"), assign "Current Field" (val "Tmp") (list_b !! min_index [val "Tmp" != (list_a !! j) | j <- [0..length list_a - 1]]), assign "Tmp" (list_a !! min_index [val "Current Field" != (list_b !! j) | j <- [0..length list_b - 1]]) Nil The first parameter specifies the length of the field before the translation. This is followed by two lists of integers, respectively giving the value of the field before and after it is translated. For example: gre_protocol ::= inferred_translate (16, 0x86dd, 0x0800, 41, 4) The GRE Protocol field behaves in the same manner as the Next Header field in other extension headers, except that it indicates that the subsequent header is IPv6 or IPv4 using the values 0x86dd and 0x0800 Price et al. [Page 18] INTERNET-DRAFT ROHC-FN March 3, 2003 instead of 41 and 4. The "inferred_translate" encoding method can translate the values required by the GRE Protocol field into the standard values, whose behavior can then be described by another encoding method such as "list". 6.3.2. Inferred_size The "inferred_size" encoding method infers the value of a field from the total amount of remaining data. The definition of the encoding method is given below: inferred_size (#n,#p) ::= value (n, (Remaining_Data - n - p) / 8) The first parameter specifies the length of the field in bits, and the second parameter specifies an offset that will be subtracted from the total data length when deriving the value of the field. 6.3.3. Inferred_offset The "inferred_offset" encoding method compresses a field that takes a known offset relative to a certain base value. The definition of the encoding method is given below: inferred_offset (#n,$base) ::= field (n) map (Tmp, Null, Field_Value) map (Field_Value, Tmp, mod (Field_Value - base, 2^n)) map (Tmp, mod (Field_Value + base, 2^n), Null) The first parameter defines the length of the field. The second parameter gives the name of a variable where the base value can be found. The value of this variable is defined by another encoding method (e.g. "label"). The "inferred_offset" encoding method subtracts the base value from the current field value. The resulting offset is then placed in the well-known variable "Current Field" so that it can be compressed by the next encoding method. For example, a typical sequence number can be compressed as follows: sequence_number ::= inferred_offset (32, MSN) static / lsb (8, -1) / lsb (16, -1) / irregular (32) In this case the offset value is expected to change rarely and only by small amounts, and hence it is defined using the "static" and "lsb" encoding methods. 6.3.4. Inferred_sequence The "inferred_sequence" encoding method compresses a field that takes a known offset relative to a base value, in the same way as inferred_offset. However, inferred_sequence additionally defines a Price et al. [Page 19] INTERNET-DRAFT ROHC-FN March 3, 2003 new base value equal to the value of the current field. This is useful for describing sequences of incrementing values. The definition of the encoding method is given below: inferred_sequence (#n,$base) ::= field (n) map (Tmp, Null, Field_Value) map (Field_Value, Tmp, mod (Tmp - base, 2^n)) map (base, mod (Tmp - Field_Value, 2^n), Tmp) map (Tmp, base, Null) The first parameter defines the length of the field. The second parameter gives the name of a variable where the base value can be found. The value of this variable is defined by another encoding method (e.g. "label"). The inferred_sequence encoding method subtracts the base value from the field value. The resulting offset is then placed in the well- known variable "Current Field" so that it can be described by the next encoding method. Additionally, the base variable is updated to contain the original value of the field being described. For example, a sequence of values such as 10, 20, 30 can be described as follows: incrementing_sequence ::= label (16, Base_Field) inferred_sequence (16, Base_Field) static / irregular (16) ; describes (2nd - 1st) entry inferred_sequence (16, Base_Field) static / irregular (16) ; describes (3rd - 2nd) entry next_field (Base_Field) static / irregular (16) ; describes 3rd entry 6.3.5. Inferred_ip_checksum The "inferred_ip_checksum" encoding method is used to compress the IP checksum field. The definition of the encoding method is given below: inferred_ip_checksum ::= map (Tmp, Null, Remaining_Data) map (Remaining_Data, Tmp, Tmp + 80) field (16) map (Remaining_Data, Tmp + 96, Tmp) map (Field_Value, checksum16(Tmp), Null) map (Tmp, Remaining_Data, Null) Price et al. [Page 20] INTERNET-DRAFT ROHC-FN March 3, 2003 The encoding method matches whenever the 2-byte field at an offset of 10 bytes from the current position in the packet contains a 16-bit one's complement checksum over the remaining data. 6.4. Control field encoding methods This section provides a selection of encoding method for handling control fields. 6.4.1. Choice The "choice" encoding method stores an implementation-specific value in a temporary variable. The definition of the encoding method is given below: choice ($name) ::= map (name, Null, Choice) map (Choice, name, Null) 6.4.2. Self-describing values The "self-describing values" encoding method offers a choice between two existing encoding methods. The definition of this encoding method is given below: sdv (methods[2]) ::= choice (Field_Value) local (Field_Value, methods [Field_Value]) distribution ([1 for j in 0..2]) The "self-describing values" encoding method also has the following infix form using the symbol "/": method[0] / method[1] The compressor may use either method[0] or method[1] when compressing a header. It then appends a 1-bit indicator flag to the compressed packet, which is set to "0" if method[0] is used and is set to "1" if method[1] is used. This tells the decompressor which method must be applied in order to successfully decompress the header. 6.4.3. Pick The "pick" encoding method allows an implementation to pick one encoding method from a list of possible choices. The definition of the encoding method is given below: pick (methods[], #p[]) ::= choice (Field_Value) local (Field_Value, methods [Field_Value]) distribution ([p[j] for j in 0..end]) The parameters for the "pick" encoding method include a list of existing methods, followed by a list of integers specifying the Price et al. [Page 21] INTERNET-DRAFT ROHC-FN March 3, 2003 relative probability with which each method will be applied in a typical header flow. The "pick" encoding method also has the following infix form using the symbol "|": method[0] p[0] | ... | method[n-1] p[n-1] The precedence for this infix form is higher than that for the "concatenation" encoding method, but lower than the precedence of any maths functions. 6.4.4. Format The "format" encoding method compresses a field using the same encoding method as used to compress the field in the previous header. The definition of the encoding method is given below: format (methods[]) ::= choice (Field_Value) local (Field_Value, methods [Field_Value]) static The "format" encoding method is followed by a list of existing method, each of which offers a different compression technique for the field. The encoding method that is chosen is the same as used for the preceding header. For example: ip_id ::= format (sequential_ip_id, random_ip_id) Two encoding methods are provided for the IP-ID: one for an IP-ID that increases sequentially, and one for a randomly behaving IP-ID. 6.4.5. Nbo The "nbo" encoding method compresses a field that may be in reverse byte order from the rest of the data, as is sometimes seen with the IP-ID. The definition of the encoding method is given below: nbo (#n) ::= label (n, Base_Field) choice (NBO_Flag) map (NBO_Field, Null, switch (NBO_Flag, byte_swap (n, Base_Field), Base_Field)) map (Base_Field, switch (NBO_Flag, byte_swap (n, NBO_Field), NBO_Field), Null) The encoding method examines the n-bit field (where n is typically 16 or 32) and generates a variable "NBO Flag" indicating whether or not Price et al. [Page 22] INTERNET-DRAFT ROHC-FN March 3, 2003 the field is in network byte order. Determining whether a field is in network byte order is a complex task (typically requiring the compressor to examine a history of field values) so the value of "NBO Flag" is left as a choice for the implementation. If the field is in network byte order then the NBO Flag is set to 1 and a variable "NBO Field" is created containing the original field value. If the field is not in network byte order then the NBO Flag is set to 0 and NBO Field is given the byte-swapped value of the original field. For example: ip_id ::= nbo (16) ; Checks byte order next_field (NBO_Flag) ; Compresses NBO Flag static / irregular (1) next_field (NBO_Field) ; Compresses NBO Field inferred_offset (16, MSN) static / lsb (5, -1) / irregular (16) 6.4.6. Scale The "scale" encoding method compresses a field which changes by a regular (usually large) amount in consecutive headers. The definition of the encoding method is given below: scale (#n) ::= label (n, Base_Field) choice (Scale) map (Scaled_Field, Null, floor (Base_Field, Scale)) map (Remainder_Field, Null, mod (Base_Field, Scale)) map (Base_Field, Scale*Scaled_Field+Remainder_Field, Null) The number of bits specified are examined and a suitable scale factor is chosen (typically using a history of field values). The original field value is then mapped to two values: the scaled value obtained by dividing the field by the chosen scale factor, and the remainder value from this division. These two values plus the scale factor itself are sufficient to reconstruct the original field value. 6.4.7. Checksum The "checksum" encoding method provides a CRC calculated across the original uncompressed header. The size of the CRC can be altered depending on the characteristics of the link over which the protocol is to be transmitted. A sufficiently long CRC should be provided to ensure the probability that an unexpected error will not be spotted is negligible. The definition of the encoding method is given below: checksum (#n) ::= map (Field_Value, Null, crc (n)) irregular (n) Price et al. [Page 23] INTERNET-DRAFT ROHC-FN March 3, 2003 Note that it is possible to offer different CRC lengths, so extra CRC bits can be allocated to protect important headers. This is illustrated in the example below: crc_field ::= checksum (3) / (checksum (7) 6.5. List encoding methods Many protocol fields and headers can be compressed by a single encoding method built up from lists of simpler encoding methods. This section defines a set of encoding methods that can be used to compress lists. 6.5.1. List The "list" encoding method compresses a list of items that do not necessarily occur in the same order for every header. Example applications for "list" encoding include TCP options and TCP SACK blocks. The definition of the encoding method is given below: list ($name,#d,#m,#p,methods[]) ::= map (Start, Null, Remaining_Data) map (Order, Null, 0) map (Presence, Null, 0) concatenation ([list_item for j in 0..end]) map (Start, switch (permutation (end, k, Order), -1, Remaining_Data + floor (name, d)*m+p), Null) where k is log2 (end) list_item is choice (Next_Method) choice (Optional) method [Next_Method] map (Tmp, Null, Order) map (Order, Tmp, Tmp*2^k+Next_Method) map (Next_Method, mod (Order, 2^k), Null) map (Tmp, floor (Order, 2^k), Null) map (Tmp, Null, Presence) map (Presence, Tmp, Tmp*2+Optional) map (Optional, mod (Presence, 2), Null) map (Tmp, floor (Presence, 2), Null) The first parameter gives the name of a variable, whilst the next three parameters specify a divisor, multiplier and offset which scale the variable so that it specifies the exact size of the list in bits. These parameters are followed by a set of n existing encoding methods that can be used to compress individual items in the list. Price et al. [Page 24] INTERNET-DRAFT ROHC-FN March 3, 2003 Once the total list size is reached, the "list" encoding method outputs the variables "Order" and "Presence". The order consists of a string of n * k-bit entries (where k is the least number of bits required to index each of the n entries). The order list contains the indices in the order in which they occurred. The presence data is a list a 1-bit flags, one per entry in the list, which are set to "1" to indicate that the list item is present or "0" if not. These flags occur in the order in which the entries appear in the list encoding. It is expected that the "optional" encoding method (see Section 6.6.1) might be used within list encoding. The variable "Optional" is thus set automatically within list encoding, so that it can be used in conjunction with the "optional" encoding method as illustrated below: tcp_options ::= list (TCP_Data_Offset, 1, 32, 0, optional (tcp_sack), optional (tcp_timestamp), optional (tcp_end), optional (tcp_generic)) next_field (Order) static / irregular (8) ; TCP Options order next_field (Presence) static / irregular (4) ; TCP Options presence The "list" encoding method automatically compresses the variable that is specified in the first parameter, so there is no need to explicitly compress this variable later on. 6.5.2. List_n The "list_n" encoding method is similar to "list" encoding, with the addition that each of the encoding methods comprising the list can be applied multiple times. The definition of the encoding method is given below: list_n ($name,#d,#m,#p,methods[],#number[]) ::= list (name,d,m,p,[methods[j] for j in 0..end, k in 0..number[j]]) The "list_n" method is a useful shorthand in the case that a method in the list may occur a large number of times. This encoding method is chosen over a truly open list (i.e. one allowing an unbounded number of repeats) since it is clear that it Price et al. [Page 25] INTERNET-DRAFT ROHC-FN March 3, 2003 should always be possible to set an upper bound, and moreover this information improves the overall compression ratio. For example: tcp_options ::= list_n (TCP_Data_Offset, 1, 32, 0, optional (tcp_sack), optional (tcp_timestamp), optional (tcp_end), optional (tcp_pad), optional (tcp_generic), 1, 1, 1, 32, 3) 6.6. Miscellaneous encoding methods This section introduces some miscellaneous encoding methods that can be used to compress fields in a protocol header. 6.6.1. Optional The "optional" encoding method is used to describe fields that are optionally present in the header. The definition of the encoding method is given below: optional (method) ::= concatenation ([method for j in 0..Optional]) The "optional" encoding method uses the well-known variable "Optional" to specify whether or not the field is present in the header. The variable can be set by another encoding method such as "label" or "list". For example: gre_key_flag ::= label (1, Optional) gre_key ::= optional (key_present) In this case the "key_present" encoding method is called to compress the GRE_Key field, but only if the GRE_Key Flag field is set to 1. 6.6.2. Uncompressible The "uncompressible" encoding method is a special case of "irregular" encoding, where the length of the field can be derived from the value contained in a variable. The definition of the encoding method is given below: uncompressible ($name,#d,#m,#p) ::= irregular (floor (name, d)*m+p) The first parameter gives the name of the variable, whilst the next three parameters specify a divisor, multiplier and offset which scale the variable so that it specifies the size of the uncompressible field in bits. Price et al. [Page 26] INTERNET-DRAFT ROHC-FN March 3, 2003 The "uncompressible" encoding method is typically used in conjunction with the "label" method. 6.6.3. Skip The "skip" encoding method skips past a certain field in the protocol header without compressing it. The definition of the encoding method is given below: skip (#n) ::= map (Tmp, Null, Remaining_Data) map (Remaining_Data, Tmp, Tmp - n) map (Tmp, Remaining_Data + n, Null) 6.6.4. No update The "no update" encoding method behaves just like the encoding method specified as its parameter, with the exception that it does not update the context. This is useful when a field exhibits an unusual behavior for one header and then reverts back to its original behavior in subsequent headers. The definition of the encoding method is given below: no_update (method) ::= map (No_Update, Null, 1) method map (No_Update, 1, Null) An example of the "no_update" encoding method in use is given below: tcp_window ::= static / no_update (lsb (11, 2048)) / irregular (16) In the above example the "no_update" encoding method is applied to the TCP Window field. When the field is compressed using "lsb" encoding, it does not update the context. 7. Security considerations This draft describes a formal notation similar to ABNF [RFC-2234], and hence is not believed to raise any security issues. 8. Acknowledgements A number of important concepts and ideas have been borrowed from ROHC [RFC-3095]. Updates to the LIST encoding methods owe much to discussions with Qian Zhang and Hongbin Liao. Thanks to Paul Ollis for field labeling; and to Rob Hancock and Stephen McCann for putting up with the authors' arguments and making helpful suggestions, frequently against the tide! Price et al. [Page 27] INTERNET-DRAFT ROHC-FN March 3, 2003 The authors would also like to thank Carsten Bormann, Christian Schmidt, Max Riegel and Lars-Erik Jonsson for their comments and encouragement. We haven't always agreed, but the arguments have been fun! 9. Authors' addresses Richard Price Tel: +44 1794 833681 Email: richard.price@roke.co.uk Abigail Surtees Tel: +44 1794 833131 Email: abigail.surtees@roke.co.uk Mark A West Tel: +44 1794 833311 Email: mark.a.west@roke.co.uk Roke Manor Research Ltd Romsey, Hants, SO51 0ZN United Kingdom http://www.roke.co.uk 10. References [HASKELL] "Report on the Programming Language Haskell 98", S. Peyton Jones et al., February 1999 [RFC-2026] "The Internet Standards Process - Revision 3", Scott Bradner, RFC 2026, Internet Engineering Task Force, October 1996 [RFC-2119] "Key words for use in RFCs to Indicate Requirement Levels", Scott Bradner, RFC 2119, Internet Engineering Task Force, March 1997 [RFC-2234] "Augmented BNF for Syntax Specifications: ABNF", D. Crocker and P. Overell, RFC 2234, Internet Engineering Task Force, November 1997 [RFC-3095] "RObust Header Compression (ROHC)", Carsten Bormann et al, RFC3095, Internet Engineering Task Force, July 2001 Price et al. [Page 28]