Robust Header Compression (ROHC) WG                             Khiem Le
INTERNET-DRAFT                                       Christopher Clanton
Date: 24 May 2000                                            Zhigang Liu
Expires: 24 November 2000                                  Haihong Zheng
                                                   Nokia Research Center

      Adaptive Header ComprEssion (ACE) for Real-Time Multimedia

Status of This Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This document is a submission of the IETF ROHC WG. Comments should be directed to its mailing list rohc@cdt.luth.se

Abstract

When Real-Time Multimedia over IP is applied to cellular systems, it is critical to minimize the overhead of the IP/UDP/RTP header, as spectral efficiency is a top requirement. Robustness to errors and error bursts is also a must. Existing IP/UDP/RTP header compression schemes, such as that presented in IETF RFC 2508 [CRTP], do not provide sufficient performance in such environments.

This report describes a new scheme (ACE, or Adaptive header ComprEssion), which, like RFC 2508, is based on the idea that most of the time IP/UDP/RTP fields are either constant or can be extrapolated in a linear fashion.
However, ACE incorporates several additional concepts which enable it to provide excellent compression efficiency (exceeding the performance of [CRTP]) along with a high degree of error-resiliency. Some of the concepts employed, such as Variable Length Encoding (VLE), enable ACE to adapt to changing behavior in the IP/UDP/RTP header fields, such that good efficiency and robustness characteristics are maintained over a wide range of operating conditions.

ACE is a general framework that can be parameterized to account for the existence/non-existence and performance characteristics of the feedback channel. Thus, ACE is applicable over both bi-directional and unidirectional links.

ACE is also able to perform a seamless handoff, i.e. the scheme can resume efficient compression operation immediately after handoff.

Table of Contents

Status of This Memo
Abstract
1. Introduction
2. Basic Framework of ACE
   2.1. Terminology
   2.2. ACE Assumptions
   2.3. ACE Modes of Operation
   2.4. Compression States
      2.4.1. Initialization/Refresh (IR) State
      2.4.2. First Order (FO) State
      2.4.3. Second Order (SO) State
   2.5. Profiles and Context
   2.6. ACE Feedback
      2.6.1. ACE Acknowledgements
         2.6.1.1. Flexibility of ACE ACKnowledgements
      2.6.2. Refresh Requests
   2.7. Bandwidth and Efficiency Constraints
      2.7.1. Sparse IR and Sparse FO
      2.7.2. Sparse ACK
   2.8. Checksum
   2.9. Decompression Strategy
3. Encoding
   3.1. VLE - Variable Length Encoding
      3.1.1. VLE Basics
      3.1.2. One-Sided Variable Length Coding (OVLE)
   3.2. Timer-Based Compression of RTP Timestamp
4. Protocol definition
   4.1. Profiles
   4.2. Compressor/decompressor Logic
      4.2.1. Starting point
      4.2.2. NRP Mode
         4.2.2.1. Compressor logic
         4.2.2.2. Decompressor logic
      4.2.3. RPP Mode
         4.2.3.1. Compressor logic
         4.2.3.2. Decompressor logic
   4.3. Compressed Packet Formats
5. Robustness/Efficiency Issues and Tradeoffs
   5.1. RPP Mode - Rationale for ACK-based Robustness
   5.2. NRP Mode - Rationale for Operation
   5.3. Checksum - Rationale for Use (Instead of CRC)
6. Conclusions
7. Intellectual Property Considerations
8. References
9. Authors' Addresses
Appendix A - Header Field Classification
Appendix B - Implementation Hints
   B.1. Considerations for Feedback Channel Realization (RPP mode only)
   B.2. ACK Transmission Frequency (RPP mode only)
   B.3. Transmission of Checksum Information (RPP mode only)
   B.4. Suggested Parameter Values
Appendix C - Experimental Results
   C.1. Test Configuration
   C.2. Assumptions in implementing RFC 2508
   C.3. Test Results
Appendix D - Handoff Operation

1. Introduction

This Internet draft describes a general header compression framework, and specifies in detail a protocol to compress IP/UDP/RTP headers, based on the framework.

For conciseness, the general background information on header compression, and the requirements specific to error-prone and low bandwidth environments, are not elaborated on in this document. Much of the information can be found in RFC2508 and [REQ].

The remainder of this document is structured as follows. Section 2 is a conceptual introduction, which describes the basic ACE framework, as well as its application to IP/UDP/RTP header compression. Section 3 describes the different encoding techniques. Section 4 is the protocol definition. It provides a detailed description of the profile attributes and the resulting profiles, the compressor/decompressor logic, packet header formats, and bit field definitions. Section 5 gives the rationale for algorithm choices. Some informative appendices are also included at the end.

2. Basic Framework of ACE

2.1. Terminology

- Static field: A header field that does not change for the duration of the session. An example is the UDP port number.
- Random field: A field that is inherently unpredictable, e.g. the UDP checksum.
- Compression context: The set of data used by the compressor to compress the current header. The compression context is a dynamic quantity that may change on a header-by-header basis.
- Decompression context: The set of data used by the decompressor to decompress the current header. The decompression context is a dynamic quantity that may change on a header-by-header basis; to enable correct decompression, the decompression context must be in synchronization with the compression context.
- Numerical field: A header field that can be interpreted as numerical information, e.g. the RTP Sequence Number. An example of a non-numerical field is the RTP CSRC list.
- String: A sequence of headers whose 1) non-random numerical fields change by a delta value which is constant within the string (a particular case is delta equal to zero, if the field does not change); and 2) other fields do not change.
- String pattern: The set of information needed to decompress a header belonging to a string. For non-random numerical fields, it is the delta value. For the other fields, it is the field itself.
- Error propagation: Happens when the current compressed header, even though it is not corrupted by transmission error, has to be discarded by the decompressor, because the context synchronization has been lost between the compressor and decompressor. Error propagation lasts until the contexts have been resynchronized.
- Error cumulation: Happens when the context synchronization is lost due to undetected transmission errors, and subsequent headers are likely to be incorrectly decompressed.
- Window-based field: A header field that is encoded according to VLE or Timer techniques.

2.2. ACE Assumptions

ACE only requires a minimal set of assumptions, which are listed here.
- The communication channel between the compressor and the decompressor, referred to as 'CD-CC', could be a link or a concatenation of links; in particular, it could include multiple IP networks and/or multiple low-bandwidth links.
- Packet Ordering: Packets transferred between compressor and decompressor may be lost or corrupted, but their order should be maintained by the CD-CC (i.e., FIFO pipe).
- Error Detection: The scheme should include a mechanism to detect errors in the compressed header; this mechanism may be provided by the CD-CC. If CD-CC does not provide adequate error detection capability, the scheme may be extended in a straightforward fashion by adding an error detection code at the compressor-decompressor level.
- Packet Length: If this information is provided by the link layer, it may be excluded from the compressed header. Otherwise, packet length must be carried in the compressed header.
- Link-Layer Fragmentation: CD-CC may fragment packets in an arbitrary manner, as long as they are reconstructed at the decompressor.

2.3. ACE Modes of Operation

ACE is a general framework for robust header compression, that can be parameterized to account for the existence/non-existence and performance characteristics of a return path to carry feedback. Specifically, environments where the compressor/decompressor operate can be classified in one of the following cases:

- Return Path Present (RPP): This is the case where there is a return path, which can be used to carry feedback information from the decompressor to the compressor. ACE does not make any assumptions on the performance of the return path. The return path may experience a wide variance in delay, latency and/or error performance.
  One example of a return path with large latency fluctuation is when feedback is piggybacked on the forward data sent in the opposite direction, and the forward data is sent only intermittently (e.g. speech that employs silence suppression, where data is only transmitted when actual speech is detected at the transmitter).
- No Return Path (NRP): This mode of operation is used when there is no reverse channel of any kind. In this case, no feedback can be obtained from the decompressor.

ACE has two modes of operation, RPP and NRP. RPP mode is used when the compressor knows it is in the RPP case. NRP mode is used when the compressor knows it is in the NRP case. NRP mode is also used to start the process if the compressor does not know which case is in effect. If the compressor subsequently receives feedback from the decompressor, it will switch to the RPP mode and continue in that mode.

2.4. Compression States

The compressor starts in the lowest compression state and gradually transitions to higher compression states. The general principle is that the compressor will always operate in the highest possible compression state, under the constraint that the compressor has sufficient confidence that the decompressor has the information necessary to decompress a header compressed according to that state. In the RPP case, that confidence comes from receipt of ACKs from the decompressor. In the NRP case, that confidence comes from sending the information a certain number of times. The compressor may also transition back to a lower compression state when necessary.

Thus ACE's primary approach to robustness is proactive, i.e. it aims at avoiding loss of context synchronization between the compressor and decompressor, rather than reacting to it.

For IP/UDP/RTP compression, the three compressor states are the Initialization/Refresh, First Order, and Second Order states.
A brief description of each is given in the subsections below.

2.4.1. Initialization/Refresh (IR) State

In this state, the compressor essentially sends IR headers. The information sent in a refresh may be static and non-static fields in uncompressed form (full refresh), or just non-static fields in uncompressed form (non-static refresh). The compressor enters this state at initialization, upon request from the decompressor, or upon Refresh Time-out. The compressor leaves the IR state when it is confident that the decompressor has correctly received the refresh information.

2.4.2. First Order (FO) State

After the IR state, the compressor operates in the FO state when the header stream does not conform to a string, or when the compressor is not confident that the decompressor has acquired the string pattern. In this state, the compressor essentially sends FO headers.

In the case of speech with silence suppression turned on, a new talk spurt following a silence interval will result in the RTP TS incrementing by more than TS_stride, the regular TS increment. Consequently, the header stream does not conform to the string pattern, and the compressor is in the FO state.

The compressor will leave this state and transition to the SO state when the current header conforms to a string, and the compressor is confident the decompressor has acquired the string pattern.

2.4.3. Second Order (SO) State

The compressor enters this state when the header to be compressed belongs to a string, and the compressor is sufficiently confident that the decompressor has also acquired the string pattern. In the SO state, the compressor sends SO headers, which essentially only consist of a sequence number.
While the compressor is in the SO state, the decompressor regenerates the uncompressed header by a simple extrapolation, based on what it knows about the pattern of change of the header fields and on the sequence number contained in the SO header.

The compressor leaves this state to go back to the FO state when the header no longer conforms to the string.

2.5. Profiles and Context

To be able to decompress, the decompressor must share a common knowledge of some information with the compressor. The information can be short term (i.e. it may change frequently, e.g. from packet to packet), or long term (it remains constant for the duration of the session, or seldom changes).

A profile is a repository of the long term information. A profile is determined by the flow to be compressed (e.g. the IP version; whether it is UDP, TCP, or RTP; and, if it is RTP, the codec behavior), the system operating environment (e.g. RPP or NRP), the CD-CC performance behavior and capabilities (e.g. error distributions, whether the CD-CC provides packet boundary framing), and by the user-specified required level of performance. A profile can be characterized by a set of attributes.

A context is a repository of the short term information. For a given profile, the compressor uses a compression context to compress the current packet, while the decompressor uses a corresponding decompression context. An example of information contained in the context is the field values of the last decompressed header.

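Before turning to the profile attributes, the compressor state transitions of section 2.4 can be summarized in a short sketch. This is only an illustration under stated assumptions: the names State and next_state and the four boolean inputs are hypothetical and do not appear in ACE. How the compressor obtains its confidence (ACKs in RPP mode, repeated transmission in NRP mode) is abstracted into the boolean arguments.

```python
from enum import Enum


class State(Enum):
    IR = "Initialization/Refresh"
    FO = "First Order"
    SO = "Second Order"


def next_state(state, *, refresh_needed, decompressor_has_refresh,
               header_in_string, decompressor_has_pattern):
    """Return the compressor state to use for the next header.

    refresh_needed: a refresh request or Refresh Time-out occurred.
    decompressor_has_refresh: compressor is confident the refresh
        information was received.
    header_in_string: the current header conforms to a string.
    decompressor_has_pattern: compressor is confident the decompressor
        has acquired the string pattern.
    """
    if refresh_needed:
        return State.IR
    if state is State.IR:
        # Leave IR only once the refresh information is known to be received.
        return State.FO if decompressor_has_refresh else State.IR
    # From FO or SO: stay in (or move to) SO only while the header fits
    # the string and the decompressor is believed to know the pattern.
    if header_in_string and decompressor_has_pattern:
        return State.SO
    return State.FO
```

For instance, a compressor in the SO state that sees a header outside the current string (such as the first header of a new talk spurt) falls back to FO under this sketch.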
An IP/UDP/RTP compression or decompression profile is determined by the following considerations:

- Stream to be compressed:
    IP version: v4 or v6
    UDP behavior: whether UDP checksum is used or not
    RTP codec behavior: Value of TS_stride, Linearity of RTP TS with respect to wall-clock
- CD-CC:
    Distribution of header losses due to detected transmission errors
    Capability to sustain transmission of consecutive large size headers
    Residual error rate, seen by decompressor
- User requirements:
    Maximum acceptable error propagation (in NRP case only)

Each of the above considerations corresponds to some profile attribute.

Stream-related profile attributes are derived by the compressor from observing the flow of headers. Some attributes (e.g. IP version) can be instantaneously derived from the current header. Others may require observation over multiple headers (e.g. Value of TS_stride).

CD-CC-related profile attributes are determined by the upfront knowledge of the CD-CC. In the absence of such knowledge, default attributes can be used.

The decompression profile must share some attribute values (e.g. IP version, TS_stride value) with the compression profile, for the decompression to succeed. Some values are acquired by the decompressor through observing the flow of full headers sent at initialization (e.g. IP version). Others may have to be explicitly sent by the compressor, through in-band signaling, or some other mechanism.

In-band signaling information is sent by the compressor in the in-band signaling field of the compressed header. In-band signaling can be used to update the decompressor on some information not found explicitly in the non-static compressed header fields, and that typically seldom changes.
An example is the information that the compressor is using Timer encoding technique for the RTP TS of the current header, (and of subsequent headers, until further notice) along with the value of TS_stride. Other uses are possible. See section 3 for details on encoding techniques.

Some attributes do not need to be known by the decompressor.

See section 4 for a detailed specification of the profiles along with their attributes.

2.6. ACE Feedback

This section is relevant only to the RPP mode. ACE has different kinds of feedback:

- Ack
- Refresh_Request (full or non-static)

2.6.1. ACE Acknowledgements

An ACK packet contains a sequence number that uniquely identifies the compressed header packet that is being ACKed.

ACE ACKnowledgements have four main functions:

- To inform the compressor that Refresh information has been received. In that case, the compressor knows that the decompressor has acquired the information necessary to decompress FO headers. This means the compressor can reliably transition to the next higher compression state, the FO state. This kind of ACK is referred to as an IR-ACK.
- To inform the compressor that FO information has been received; in that case, the compressor knows that the decompressor has acquired the information necessary to decompress SO headers. This means the compressor can reliably transition to the next higher compression state, SO; this kind of ACK is referred to as an FO-ACK.
- To inform the compressor that a header with a specific sequence number n has been received; in that case, the compressor knows that the decompressor can determine the sequence number without any ambiguity (caused, e.g., by counter wrap-around) up to header number n + seq_cycle, where seq_cycle is the counter cycle (determined by the number of bits, k, in the sequence number). This kind of ACK is referred to as an SO-ACK.
- When information is sent as in-band signaling, to confirm that the in-band signaling information has been received.

The control of transition from IR to FO to SO states by ACKs ensures that there is no context desynchronization, and therefore no error propagation. That is, a compressed header that is not received in error can always be correctly decompressed, because synchronization is never lost.

Reception of ACKs by the compressor can also be used to increase compressor header field encoding efficiency. Compression is more efficient because the compressor just has to send the necessary information (but no more) to ensure correct decompression of the current header. In general, the minimal information that the compressor needs to send depends on what information the decompressor already knows. The information known at the decompressor is indicated to the compressor in the decompressor's ACK transmission.

2.6.1.1. Flexibility of ACE ACKnowledgements

There is a lot of flexibility with respect to when and how often the decompressor sends an ACK. ACE is also extremely resilient to ACKs being lost or delayed. The compressor constantly adapts its compression strategy based on the current information to be compressed and the ACKs received. Loss or delay of an FO-ACK may result in the compressor staying longer in the FO state. Loss or delay of an SO-ACK may result in the compressor sending more bits for the sequence number, to prevent any incorrect decompression at the decompressor caused by counter wrap-around.

An advantage of this flexibility is that the feedback channel utilized to transmit the ACKs can have very loose requirements. This is because ACKs only have an effect on the compression efficiency, NOT the correctness. Delay or loss of ACKs might cause the size of compressed headers to increase, but even in such cases the increase is minor (logarithmic).

2.6.2. Refresh Requests

Refresh Requests are sent when the decompressor determines that it needs refresh information. Requests can be for a Full Refresh or Non-static Refresh.

2.7. Bandwidth and Efficiency Constraints

2.7.1. Sparse IR and Sparse FO

Larger size headers such as IR or some FO headers may cause a problem if too many of them are sent consecutively. This might occur, e.g., if the compressor waits for an ACK to transition to a higher compression state, but the round-trip delay between compressor and decompressor is large, and in the meantime the CD-CC cannot accommodate on a sustained basis the surge in bandwidth demand caused by the larger size headers. There is also a negative impact on overall efficiency.

The sparse transmission technique can be used to alleviate the problem. In that technique, the larger size headers are interspersed with smaller size headers. The smaller size headers are the ones that would be sent in the higher compression state, to which the compressor would transition if the larger size headers were ACKed. Sparse transmission can be applied to IR headers interspersed with FO headers, or FO headers interspersed with SO headers.

2.7.2. Sparse ACK

An enhancement to the acknowledgement procedure can be used to reduce FO ACK traffic on the feedback channel; this traffic can be quite high if there is significant round trip delay between compressor and decompressor. In this case, several FO headers would be sent before the compressor can receive an ACK, and normally, one ACK would be sent by the decompressor for each FO header received.

The basic idea is that whenever the decompressor receives a packet and needs to send an ACK to the compressor, it just sends the ACK once (or twice if there is no default 'pattern' agreed on by the compressor and decompressor) and waits for some round trip time (as opposed to sending an ACK on the feedback channel in response to each, e.g., FO packet). After the round trip time, if the decompressor can confirm that the compressor received the ACK (evidenced by receipt of an SO packet at the decompressor), it continues normal decompression. Otherwise, it will send the ACK again and the process repeats.

The only potential drawback of this approach arises if the ACK sent by the decompressor is lost. In that case, progression to the next higher compression state by the compressor is delayed until the next ACK is correctly received (at least one round trip time).

2.8. Checksum

In the RPP mode, a checksum (CS) is used as a secondary means of robustness, in complement to the ACK, which is the primary means of robustness. The CS need only be attached to a small percentage of headers.

In the NRP mode, a checksum is used in conjunction with refresh for robustness. The CS is attached to all headers.

The CS is calculated as an 8-bit one's complement checksum, calculated over the whole uncompressed header. Refer to section 5 for a more detailed discussion.

2.9. Decompression Strategy

The decompressor uses as reference for decompression only those headers whose correct decompression it is sufficiently confident of (secure reference). A secure reference must be chosen from the headers received with an OK CS. Until a new secure reference is chosen, all subsequent headers are decompressed with respect to the current secure reference.

A major advantage of this approach is that an undetected error which affects correct decompression of header m will not affect decompression of subsequent headers.
For example, if header #3 is an FO used as a secure reference, and header #5 is an SO with an undetected error, the decompression of header #6 will be based solely on header #3 and not affected by header #5. In other words, an undetected error will affect only the current header, just like when headers are not compressed.

3. Encoding

This section describes techniques for minimizing the overhead associated with transmitting compressed header information in FO and SO packet headers, as well as ACK packets.

3.1. VLE - Variable Length Encoding

An alternative approach to encoding irregular changes in header fields is to send the 'k' least significant bits (LSBs) of the original header field value. Clearly, it is desirable for the compressor to minimize this number of bits. Due to the possible loss of packets on the channel between compressor and decompressor (CD-CC), the compressor does not know which packet will be used as the reference by the decompressor, and hence, does not know how many LSBs need to be sent.

Variable Length Encoding (VLE) solves this problem. The basic algorithm employs a 'sliding window', maintained by the compressor, which is advanced when the compressor has sufficient confidence that the decompressor has certain information. The confidence may be obtained by various means, e.g., an ACK from the decompressor if operating in RPP. In the case of NRP, a sliding window of fixed size, e.g. M (described later), may be used. In either case, the value of k determined depends on the current values in the sliding window. Details of the operation follow below.

3.1.1. VLE Basics

Basic concepts of VLE are:

* The decompressor uses one of the decompressed header values as a reference value, v_ref.
  The reference may be chosen by various means; one approach might be to select only headers whose correct reconstruction is verified by inclusion of a checksum with the compressed header ("secure" reference).
* The compressor maintains a sliding window of the values (VSW) which may be chosen as a reference by the decompressor. It also maintains the maximum value (v_max) and the minimum value (v_min) of VSW.
* When the compressor has to compress a value v, it calculates the range r = max(|v - v_max|, |v - v_min|). The value of k needed is k = ceiling(log2(2 * r + 1)), i.e., the compressor sends the ceiling(log2(2 * r + 1)) LSBs of v as the encoded value.
* The compressor adds v into the VSW and updates v_min and v_max IF the value v could potentially be used as a reference by the decompressor.
* The decompressor chooses as the decompressed value the one that is closest to v_ref and whose k LSBs equal the compressed value that has been received.

Clearly, the sliding window must be moved forward (or shrunk) to prevent k from growing too large. To do that, the compressor only needs to know which values in VSW have been received by the decompressor. In the case of RPP, that information is carried in the ACKs. In the case of NRP, the VSW is moved without ACK, if there are a maximum number of entries, 'M', already present in VSW. M is defined in the compressor logic section and further elaborated upon in the "Implementation Hints" appendix.

The VLE concept can be applied to RTP Timestamp, RTP Sequence Number, IP-ID header fields, etc.

The examples below illustrate the operation of VLE under various scenarios. The field values used in the examples could correspond to any fields that we wish to compress. The examples illustrate the scenario where the compressed field has resolution of one bit.

Example 1: Normal operation (no packet loss prior to compressor, no reordering prior to compressor).

Suppose packets with header fields 279, 280, 281, 282, and 283 have been sent, and 279 and 283 are fields of potential reference packets. The current VLE window is {279, 283}. If a packet with field value = 284 is received next, VLE computes the following values:

   New Value   VMax   VMin   r                                # LSBs
   284         283    279    max[|284-279|,|284-283|] = 5     4

The window is unmodified if we assume the new packet {284} is not a potential reference. The field is encoded using 4 bits in this case, and the actual encoded value is the 4 least significant bits of 284 (100011100), which = 1100.

Example 2: Packet loss prior to compressor.

Suppose packets with header fields 279, 280, 281, 282, and 283 have been sent, and 279 and 283 are fields of potential reference packets such that the VSW is again {279, 283}. If a packet with field value = 290 is received next, VLE computes the following values:

   New Value   VMax   VMin   r                                # LSBs
   290         283    279    max[|290-283|,|290-279|] = 11    5

So the field is encoded using 5 bits. The actual encoded value is the 5 LSBs of 290 (100100010), which = 00010. If we assume the new value is a potential reference, the new VSW is {279, 283, 290}.

Example 3: Packet misordering prior to compressor.

Suppose packets with header fields 279, 280, 281, 282, and 283 have been sent, and 279 and 283 are fields of potential reference packets such that the VSW is again {279, 283}. If a packet with field value = 278 is received next, VLE computes the following values:

   New Value   VMax   VMin   r                                # LSBs
   278         283    279    max[|278-283|,|278-279|] = 5     4

So the field is encoded using 4 bits. The actual encoded value is the 4 LSBs of 278 (100010110), which = 0110. If we assume the new value is a potential reference, the new VSW is {279, 283, 278}.

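The three examples can be reproduced with a short sketch of the rules in section 3.1.1. The function names vle_bits, vle_encode and vle_decode are illustrative only and are not part of ACE. The window is passed as a plain list of candidate reference values, and the decoder applies the "closest to v_ref" rule directly.

```python
def vle_bits(v, window):
    """Number of LSBs k needed to encode v against the window (VSW)."""
    r = max(abs(v - max(window)), abs(v - min(window)))
    # k = ceiling(log2(2*r + 1)). Since 2*r + 1 is odd, this equals the
    # bit length of 2*r, which avoids floating-point rounding.
    return (2 * r).bit_length()


def vle_encode(v, window):
    """Return (k, encoded) where encoded holds the k LSBs of v."""
    k = vle_bits(v, window)
    return k, v & ((1 << k) - 1)


def vle_decode(encoded, k, v_ref):
    """Return the value closest to v_ref whose k LSBs equal `encoded`."""
    modulus = 1 << k
    # Candidate in the same 2**k-aligned block as v_ref, plus its neighbours.
    base = v_ref - (v_ref % modulus) + encoded
    candidates = (base - modulus, base, base + modulus)
    return min(candidates, key=lambda c: abs(c - v_ref))


if __name__ == "__main__":
    window = [279, 283]
    for v in (284, 290, 278):
        k, enc = vle_encode(v, window)
        # Any reference taken from the window must decode to the same value.
        assert all(vle_decode(enc, k, ref) == v for ref in window)
        print(v, k, format(enc, "0{}b".format(k)))
```

Because k is chosen from the extreme values of the window, the decoder recovers the same value whichever window entry it happens to hold as its reference.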
In any case, the VLE encoded fields must be accompanied by some bits in order to identify the different possible encoded field sizes. The size of this bit field can vary depending on the number of different sizes one wishes to allow. The approach in ACE is to allow only a few different sizes for byte-aligned header formats. Huffman coding of the length is used to achieve some additional efficiency, based on the expected frequency of needing the different sizes. One or two additional bits are actually sent in the ACE compressed header.

The decompressor behavior in all the example cases is the same: it uses as a reference a specific decompressed header field value. The header to use might be indicated by the presence of a checksum in the compressed header packet, or by other means. The reference must by definition be one of the values in the compressor's window.

For example, let's assume that the last correctly decompressed packet which qualifies as a reference was the packet with header field = 291. Now suppose the field value 303 (100101111) is sent encoded with 5 bits, so that the received encoded value is 01111. The two values closest to 291 which have LSBs = 01111 are 271 and 303. 303 is closest, therefore it is correctly selected as the uncompressed field value.

3.1.2. One-Sided Variable Length Coding (OVLE)

The VLE encoding scheme is very general and flexible, as it can accommodate arbitrary changes (positive, negative) from one value to the next. When VLE is applied to a field that is monotonic (e.g. RTP SN), there is a loss in efficiency, because k, the number of bits, is defined by the condition (2p + 1) < 2^k, where p = |current value - reference value|. On the other hand, if the variation is known to be monotonic, the required k is smaller, as it has to satisfy only p < 2^k.

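The difference between the two conditions can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, vle_bits uses k = ceiling(log2(2p + 1)) from section 3.1.1, and the one-sided decoder assumes an increasing field, so that it searches upward from the reference. That decoding rule is an assumption made for the example; it is not spelled out in the text above.

```python
def vle_bits(p):
    """Two-sided: k = ceiling(log2(2*p + 1)), as in section 3.1.1."""
    return (2 * p).bit_length()


def ovle_bits(p):
    """One-sided: smallest k such that p < 2**k."""
    return p.bit_length()


def ovle_decode_increasing(encoded, k, v_ref):
    """Smallest value >= v_ref whose k LSBs equal `encoded`.

    Assumes the field never decreases relative to the reference.
    """
    modulus = 1 << k
    return v_ref + ((encoded - v_ref) % modulus)


if __name__ == "__main__":
    v_ref, v = 291, 303
    p = v - v_ref
    k = ovle_bits(p)
    encoded = v & ((1 << k) - 1)
    print(vle_bits(p), k, ovle_decode_increasing(encoded, k, v_ref))
```

For the reference 291 and value 303 used above, p = 12, so the two-sided scheme needs 5 bits while the one-sided scheme needs 4.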
One-Sided Variable Length Encoding (OVLE) is based on the idea of using a k that satisfies only the one-sided condition p < 2^k when the field to be compressed is monotonic (increasing or decreasing). When the field is almost always monotonic (quasi-monotonic), OVLE compression can be used when the field is behaving monotonically, and 'regular' VLE used when it is not. The saving over VLE is 1 bit, and since that saving is achieved most of the time, it translates into a 1 bit saving in the average overhead. Alternatively, the number of bits can be kept the same, but the frequency of ACKs can be reduced by a factor of 2.

3.2. Timer-Based Compression of RTP Timestamp

A useful observation is that the RTP timestamps, when generated at the source, closely follow a linear pattern as a function of the time of day clock, particularly for the case when speech is being carried in the RTP payload. For example, if the time interval between consecutive speech samples is 20 ms, then the RTP time stamp of header n (generated at time n * 20 ms) = RTP time stamp of header 0 (generated at time 0) + TS_stride * n, where TS_stride is a constant dependent on the voice codec. In what follows, n is referred to as the 'packed' RTP TS.

Consequently, the RTP TS in headers coming into the decompressor also follow a linear pattern as a function of time, but less closely, due to the delay jitter between the source and the decompressor. In normal operation (no crashes or failures), the delay jitter is bounded, to meet the requirements of conversational real-time traffic.

Thus, it is possible for the decompressor to obtain an approximation of the packed RTP TS of the current header (the one to be decompressed), by adding the time elapsed since the previous header to the packed RTP TS of that previous header. The decompressor then refines this approximation with the additional information received in the compressed header.
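The approximate-then-refine step described in the preceding paragraph can be sketched as follows. This is an illustration only: the function names are hypothetical, all times are assumed to be expressed in units of TS_stride, and the refinement is assumed to be "the value nearest the estimate whose k least significant bits match the received bits".

```python
def timer_decode(lsbs, k, t_current, t_ref, p_ts_ref):
    """Recover the packed RTP TS from k received LSBs and the local timer."""
    # Approximation: packed TS of the reference plus the elapsed time.
    estimate = p_ts_ref + (t_current - t_ref)
    modulus = 1 << k
    half = modulus // 2
    # Refinement: the value in [estimate - half, estimate + half)
    # whose k least significant bits equal the received bits.
    return estimate + ((lsbs - estimate + half) % modulus) - half


def unpack_ts(p_ts, ts_header0, ts_stride):
    # Convert the packed value back to the original RTP TS.
    return ts_header0 + p_ts * ts_stride


# Illustrative values: reference with timer value 7 and packed TS 4,
# current timer value 359, received bits 11111 (k = 5).
p_ts = timer_decode(0b11111, 5, t_current=359, t_ref=7, p_ts_ref=4)
print(p_ts, unpack_ts(p_ts, ts_header0=0, ts_stride=160))
```

The sketch makes the robustness property visible: the result depends only on the reference, the local timer and the bits in the current header, not on any intermediate compressed headers.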
The compressed header carries the k least significant bits of the packed RTP TS. The required value of k to ensure correct decompression is a function of the jitter between the source and decompressor. The compressor can estimate the jitter and determine k, or alternatively it can have a fixed k, and filter out the packets with excessive jitter.

Once the decompressor has the packed RTP TS, it can convert it to the original RTP TS.

The advantages of this approach are many:

* The size of the compressed RTP TS is constant and small. In particular, it does NOT depend on the length of the silence interval. This is in contrast to other RTP TS compression techniques, which require a variable number of bits dependent on the duration of the preceding silence interval. It is very important to be able to efficiently compress the RTP TS, as it is one of the essential changing fields (see Appendix A).

* No synchronization is required between the timer process and the decompressor process.

* Robustness to errors: the partial RTP TS information in the compressed header is self contained and only needs to be combined with the decompressor timer to yield the full RTP TS value. Loss or corruption of a header will not invalidate subsequent compressed headers.

As an example, consider the scenario in which a long silence interval has just ended, and the header compression scheme is preparing to send an FO header to the decompressor to adjust for the unexpected change in RTP timestamp. The compressor knows that the packet which has just arrived is the first packet of a new talkspurt, as opposed to following a lost packet, because the RTP SN increments by only one. Note that we need not assume any special behavior of the input to the compressor (i.e. the scheme tolerates reordering, or more generally, non-increasing RTP timestamp behavior observed prior to the compressor).
At the end of the silence interval, the compressor sends in the FO compressed header the k least significant bits of

   p_TS_current = (current RTP time stamp - TS0) / TS_stride

p_TS_current is the "packed" representation of the current time; it has a granularity of TS_stride, which is the RTP timestamp increment observed during e.g. a VoIP session (e.g. 160 for a 20 ms voice codec). TS0 is an arbitrary timestamp offset.

The compressor runs the following algorithm to determine k.

STEP 1: calculate Network_Jitter(Current_header, j) as

   | (T_current - T_j) - (p_TS_current - p_TS_j) |

for all packets in a sliding window, TSW. TSW contains several pairs (T_j, p_TS_j) of values corresponding to the packets sent that may be used as a reference, including the last packet which was ACKed. In the case of RPP, TSW is moved when an ACK with some indication (e.g., checksum) is received from the decompressor. In the case of NRP mode, the TSW is moved without ACK if there are a maximum number of entries, 'M', present in TSW. I.e., the sliding window is managed just like for the case of VLE.

T_current is the current wall clock time at the compressor, and T_j is the wall clock time at which the packet j in the sliding window was received by the compressor. Both T_current and T_j are in units of TS_stride. p_TS_current and p_TS_j are the packed RTP timestamps of the packets, determined from the actual RTP header.

STEP 2: compute Max_Network_Jitter, where

   Max_Network_Jitter = Max{Network_Jitter(current, j)}, for all headers j in TSW

Note that Max_Network_Jitter is non-negative.

STEP 3: k is then calculated as

   k = ceiling(log2(2 * J + 1)),

where J = Max_Network_Jitter + Max_CD_CC_Jitter + 2. Max_CD_CC_Jitter is the maximum possible CD-CC jitter expected on the CD-CC.
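Steps 1 to 3 above can be sketched in a few lines. The function name, the representation of TSW as a list of (T_j, p_TS_j) pairs and the sample numbers are illustrative assumptions; the window is assumed to be non-empty and all times are in units of TS_stride.

```python
def timer_ts_bits(t_current, p_ts_current, tsw, max_cd_cc_jitter):
    """Return (Max_Network_Jitter, k) for the timer-based RTP TS encoding."""
    # STEP 1 and STEP 2: largest jitter against any entry of the window.
    max_network_jitter = max(
        abs((t_current - t_j) - (p_ts_current - p_ts_j)) for t_j, p_ts_j in tsw
    )
    # STEP 3: the "+ 2" term is part of the definition of J in the text.
    j = max_network_jitter + max_cd_cc_jitter + 2
    # ceiling(log2(2J + 1)) computed with integers: for n >= 1 this is
    # (n - 1).bit_length(), and here n = 2J + 1.
    k = (2 * j).bit_length()
    return max_network_jitter, k


tsw = [(9, 7), (8, 6), (7, 4), (3, 1)]          # (T_j, p_TS_j) pairs
jitter, k = timer_ts_bits(357, 351, tsw, max_cd_cc_jitter=2)
print(jitter, k, format(351 & ((1 << k) - 1), f"0{k}b"))
```

Integer arithmetic is used for the logarithm to avoid floating-point rounding at exact powers of two.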
The maximum CD-CC jitter depends only on the characteristics of the CD-CC, and is expected to be reasonably small in order to have good quality for real-time services. The term + 2 accounts for the quantization error caused by the timers at the compressor and decompressor, which can be +/- 1.

As an example of operation, consider the case of a voice codec (20 ms), such that TS_stride = 160. Assume T_current and p_TS_current are 357 and 351, respectively, and that we have a sliding window TSW which contains the following 4 entries:

   j    T_j    p_TS_j
   1     9       7
   2     8       6
   3     7       4
   4     3       1

j above is the packet number.

In this case we have

   Network_jitter(1)=|(357-9)-(351-7)|=4 (80 ms Network Jitter)
   Network_jitter(2)=|(357-8)-(351-6)|=4 (80 ms Network Jitter)
   Network_jitter(3)=|(357-7)-(351-4)|=3 (60 ms Network Jitter)
   Network_jitter(4)=|(357-3)-(351-1)|=4 (80 ms Network Jitter)

So Max_Network_Jitter = 4. We assume a maximum CD-CC jitter of 2 (40 ms); the total jitter to be handled in this case is then

   J = 4 + 2 + 2 = 8 packets (160 ms)

and k = 5 bits (since 2^4 < 2 * 8 + 1 = 17 <= 2^5). The compressor sends the 5 LSBs of p_TS_current to the decompressor (351 = 101011111, so the encoded TS value = 11111).

When the decompressor receives this value, it first attempts to estimate the timestamp by computing the time difference between the last reference established and the current packet, T_current - T_ref, where T_ref is the timer value (in units of TS_stride) associated with the reference used by the decompressor. That difference is added to p_TS_ref, the packed TS of the reference, to get the estimate. Assume that at the decompressor packet #3 is used as the reference:

   - T_current = 359
   - T_ref = 7
   - p_TS_ref = 4

Note: T_current is picked here as any value; the difference between it and T_ref represents the length of the silence interval as observed at the decompressor.
Then:

   T_current - T_ref = 359 - 7 = 352
   p_TS_current(estimate) = 352 + 4 = 356

The decompressor searches for the closest value to 356 which has, in this case, LSBs = 11111. The value in this case is 351, the original p_TS.

If instead the compressor were to send the timestamp jump as simply the difference in consecutive packed RTP Timestamps, that value would be

   p_TS_current - p_TS_ref = 351 - 4 = 347 = 101011011

So 9 bits rather than 5 would be sent for a silence interval of 347 * 20 ms = 6.94 seconds.

Due to basic conversational real-time requirements, the cumulative jitter in normal operation is expected to be at most only a few times TS_stride for voice. For this reason, the FO payload formats in section 4.3 are optimized (in terms of representing different k-length encoded TS values) for the case of k=4 (handles up to 16 discrepancies in the timestamp). The remaining formats allow a wide range of jitter conditions (outside of just voice) to be handled as well.

4. Protocol definition

4.1. Profiles

For IP/UDP/RTP compression, a profile has the following attributes:

   - Mode: NRP or RPP; derived as described below
   - IP version: v4 or v6; derived from IR header
   - UDP checksum: yes or no; derived from IR header
   - RTP TS encoding technique: VLE or Timer; encoding derived from negotiation

This set of profile attributes can be configured to support either audio or video.

In addition, the value of TS_stride is a numerical parameter that is predefined, sent by in-band signaling, or determined by some other means.

In-band signaling can be carried using the FO_EXT header, with the S bit set to 1. In the RPP mode, an ACK of the FO_EXT confirms that the decompressor has received the signal and thus allows the compressor to stop sending the in-band signal (which means smaller headers afterward). Note that the in-band signaling is optional and can be replaced by another signaling channel if one is provided by the links.
A default profile will be used when there is no prior knowledge on some attributes: Mode, or UDP Checksum or RTP TS behavior. In that case,

   - Mode is set to NRP
   - UDP Checksum is yes
   - RTP TS encoding is VLE

This profile will work with both audio and video.

In addition, depending on the specific CD-CC, the following may have to be added:

   - CRC at the compressor/decompressor level to detect transmission errors; this is needed to reduce the probability of undetected transmission error from the CD-CC to an acceptable level if it is not sufficiently low
   - Packet length information: this is needed if the CD-CC does not provide the information
   - CID: this is needed if the CD-CC does not provide a means to discriminate between flows

When CRC, Packet length and/or CID are needed, the exact formats and lengths depend on the CD-CC technology, and therefore would have to be defined on a case-by-case basis, as an implementation issue. For simplicity, we assume no compressor level CRC is needed, packet length information is provided, as well as a means to discriminate between flows.

4.2. Compressor/decompressor Logic

4.2.1. Starting point

The compressor will start in the Initialization state and assume NRP mode. If no ACK is ever received, it stays in NRP mode. Otherwise, after receiving the first ACK, the compressor will switch to and stay in RPP mode.

However, if by some other means (e.g. profile, or interface to link layer, etc) the compressor knows there is a feedback channel, the compressor can start in Initialization state of RPP mode directly.

The following sections will describe the logic for each operation mode. Note that the logic is designed such that the switch from NRP to RPP mode is seamless.

4.2.2. NRP Mode

4.2.2.1. Compressor logic

Below is the state machine for the compressor in NRP mode.
Details of each state and the transitions between states follow the diagram.

        Lr refresh sent            end of string
     +----->------->-------+  +------<-------<-------+
     |                     |  |                      |
     |                     v  v                      |
+----------+            +----------+            +----------+
| IR state |            | FO state |            | SO state |
+----------+            +----------+            +----------+
   ^    ^                 |      |                ^    |
   |    | refresh time-out|      |   Lf FO sent   |    |
   |    +-----<-------<---+      +----->------->--+    |
   |                                                   |
   +----------------<---------------<------------------+
                     refresh time-out

* IR state

  The compressor starts in IR state and sends full refresh headers using the FH header format (see header format section). The compressor leaves the IR state and transitions to FO state when it has sent Lr Refresh headers, where Lr is a parameter.

  Note that the compressor need not send the Lr refresh headers consecutively. A sparse IR scheme can be applied to alleviate the bandwidth surge. The compressor can send IR_IR1 refresh packets, followed by IR_FO1 FO packets, then IR_IR2 refresh packets, then IR_FO2 FO packets, and so on. Note that each FO packet sent in IR state MUST carry a checksum.

  Dynamic refresh is carried using the FO_EXT header format with ST = 11 and all the Mi bits set to 1. However, the compressor may want to send a full refresh sometime during the session to fully update the context of the decompressor. One simple implementation could be that a full refresh is sent after Cfr packets have elapsed since the last full refresh.

* FO state

  FO headers are sent in this state. Every FO header MUST carry a checksum.

  The compressor leaves this state and transitions to the SO state when the current header conforms to a string, and the compressor has sent Lf FO headers since the last string pattern change. Lf is a parameter.

  Similarly to the IR state, a sparse FO scheme can be used here.
  Basically, the compressor can send FO_FO1 FO packets, then FO_SO1 SO packets, then FO_FO2 FO packets, then FO_SO2 SO packets, etc. Note that each SO packet MUST carry a checksum.

  The compressor will also go back to IR state after a Refresh time-out. A Refresh time-out occurs when the maximum time or number of packets allowed since the last refresh has elapsed. In a simple implementation, a parameter Cr (number of packets) can be used for the time-out purpose.

* SO state

  SO headers are sent in this state. Every SO header MUST carry a checksum.

  Fields in SO headers are encoded using as reference only those headers in the sliding window (see below).

  The compressor will transition back to FO state if the current string terminates, or back to IR state after a refresh time-out (see Cr above).

* Sliding window management

  Each window-based field value of a transmitted IR header or FO header MUST be added to the sliding window.

  If the window reaches the size of M, it will slide (i.e. the oldest header will be removed to make room for the new one). M is an implementation parameter.

4.2.2.2. Decompressor logic

* Only a full refresh (FH) packet can create a new context. Any headers (other than FH) belonging to an invalid (i.e. non-existing) context will be discarded until the corresponding context is created.

* Both full and dynamic refresh will update the context without decompression.

* If a checksum is present in a compressed header, the decompressor MUST use it to verify a correct decompression. If the checksum test fails, the decompressor MUST discard the packet.

* The decompressor MUST use only the following headers as reference for decompression of subsequent headers:

  1) A full or dynamic refresh header with a correct checksum
  2) An FO header with a correct checksum

4.2.3. RPP Mode

4.2.3.1. Compressor logic

Below is the state machine for the compressor in RPP mode.
Details of each state and the transitions between states follow the diagram.

           receive 1 ACK                     end of string
     +----->------->-------------+  +------<-------<-------------+
     |                           |  |                            |
     |                           v  v                            |
+----------+                  +----------+                  +----------+
| IR state |                  | FO state |                  | SO state |
+----------+                  +----------+                  +----------+
   ^    ^                       |      |                      ^    |
   |    | receive REFRESH_REQ   |      | receive 1 or 2 ACKs  |    |
   |    +-----<----------<------+      +------->------->------+    |
   |                                                               |
   +----------------<---------------<------------------------------+
                        receive REFRESH_REQ

* IR state

  At the beginning of a session, the compressor starts in IR state and sends full refresh headers using the FH header format (see header format section).

  During the session, the compressor MUST go to IR state upon receiving a REFRESH_REQ packet. It MUST send either a full refresh or a dynamic refresh, depending on the F-bit in the REFRESH_REQ packet.

  Each window-based field value of a transmitted IR header MUST be added to the sliding window.

  The compressor MUST leave the IR state and transition to FO state when it receives an ACK confirming that the decompressor has correctly received the refresh information. The ACK also triggers the deletion of older headers in the sliding window.

  Note that sparse IR as described in section 4.2.2.1 can also be applied here.

* FO state

  FO headers are sent in this state. The shortest FO header format that can carry enough information MUST be used.

  The compressor MAY choose to send a checksum in an FO header. In the case of audio, the compressor SHOULD attach the checksum in every FO header.

  Each window-based field value of an FO header with checksum MUST be added to the sliding window.

  The compressor leaves this state and transitions to the SO state when the current header conforms to a string, and 2 ACKs (1 ACK if default FO information has been established between the compressor and decompressor) older than the current string have been received.
  Upon receiving an ACK, the compressor SHOULD shrink the sliding window by deleting any value that is older than the one in the ACKed header.

  FO_EXT headers SHOULD be sent in FO state when the FO header format is unable to handle the change in the current header (e.g., non-essential fields changed, see FO_EXT format).

  Note that sparse FO as described in section 4.2.2.1 can also be applied here.

* SO state

  SO headers are sent in this state. A checksum need not be transmitted in an SO header unless the compressor wants the decompressor to ACK that header.

  Each window-based field value of an SO header with checksum MUST be added to the sliding window.

  The compressor may also need to send an SO_EXT header if 1) the compressed RTP SN needs more bits than allowed in the SO header, or 2) there is a misordering of packets before the compressor which prevents the use of OVLE (as defined in SO packet). Note that not every misordering of packets will trigger an SO_EXT packet. An SO_EXT header is needed only in a severe misordering event, in which the RTP SN in a misordered packet is smaller than all the RTP SN values in the sliding window.

  The compressor will transition back to FO state if the current string terminates. In addition, it MUST go to IR state upon receiving a REFRESH_REQ packet from the decompressor.

* Sliding window management

  The compressor controls the size of the sliding window by sending more or fewer headers with checksum. The compressor will shrink the sliding window after receiving an ACK.

  In an implementation, it may also set a maximum window size M and slide the window even if no ACK is received. M can depend on the round trip time between the compressor and decompressor and/or the memory available to the compressor. However, the value of M SHOULD be large enough to avoid triggering the decompressor to send a REFRESH_REQ (see decompressor logic).

4.2.3.2. Decompressor logic

* The decompressor MUST use only a secure reference to decompress. A secure reference is a compressed header with a checksum (calculated over the uncompressed header) for which the checksum test succeeds.

* The decompression of an FO header is achieved using the last secure reference and applying different decoding rules (e.g. VLE or OVLE for RTP SN and the timer-based scheme for RTP TS) to different fields.

* The decompression of an SO header is achieved in two steps:

  1) Decompress the RTP SN in the current header by using the decompressed RTP SN in the last secure reference and applying the OVLE (if SO header) or VLE (if SO_EXT header) decoding algorithm.
  2) Decompress RTP TS and IP ID using linear extrapolation.

* If a header with checksum is received, the decompressor MUST verify the checksum after decompression. If the checksum test succeeds, the decompressor MUST send an ACK for that header. If it fails, the decompressor MUST discard the packet.

* The decompressor MUST send a REFRESH_REQ after receiving Lcs consecutive headers with incorrect checksum, where Lcs is an implementation parameter.

4.3. Compressed Packet Formats

Here we describe the different header formats which are used by ACE. The compressed RTP SN is used as the sequence number of the packet. A compressed packet may include an 8-bit checksum, defined as the one's complement of the one's complement sum of the uncompressed (original) IP/UDP/RTP header.

Note that for simplicity, the RTP payload is not shown in the following descriptions, since it always follows the compressed headers. In addition, an extra CID field will be added if the link layer does not provide mux/demux of different flows.

The following header formats are for IPv4. IPv6 formats will be provided later.

SO: 1 byte without checksum or 2 bytes with checksum.
Consists only of the Packet Type (PT), used to distinguish the different types of header formats at the decompressor; the C-bit, used to indicate the presence of a checksum; the compressed RTP sequence number, C_RTP_SN; and the payload. The checksum is present if C = 1. SO packets are the most efficient packets to send, due to their small size.

   +----+---+----------+::::::::::::+
   | PT | C | C_RTP_SN |  checksum  |
   +----+---+----------+::::::::::::+

   PT:       1 bit (value = "0")
   C:        1 bit (value = "1" indicates the presence of checksum)
   C_RTP_SN: 6 bits (LSBs of RTP SN, OVLE encoded)
   checksum: 8 bits (present if C = 1; not present if C = 0)

FO: 2 to 6 bytes.

FO packets are sent as a result of observing unrepresentable irregularities in the expected behavior pattern of the header fields. One purpose of this structure is to optimize coding (usually requiring only 2 bytes) for the case of an irregular change in RTP TS, which is usually the most frequent case of irregularity. Note that the C_RTP_TS carries LSBs of the packed RTP TS value. The coding rules can be either VLE or timer-based.
   +----+---+---+-----+-----+----------+::::::::::+:::::::::+
   | PT | C | M | TI  | FMT | C_RTP_SN | C_RTP_TS | C_IP_ID |
   +----+---+---+-----+-----+----------+::::::::::+:::::::::+

   ::::::::::+
    checksum |
   ::::::::::+

   PT:       2 bits (value = "10")
   C:        1 bit (indicates if checksum is present in this header)
   M:        1 bit (the Marker bit in the original RTP header)
   TI:       = 0 (1 bit), if only C_RTP_TS is present
             = 10 (2 bits), if only C_IP_ID is present
             = 11 (2 bits), if both fields are present
   FMT:      1 or 2 bits, combined with TI-bit, indicating the exact structure of FO header (see table below)
   C_RTP_SN: 6 or 8 bits (LSBs of RTP SN, VLE encoded)
   C_RTP_TS: variable length (LSBs of the packed RTP TS, VLE or timer-based encoding)
   C_IP_ID:  variable length (LSBs of IP ID, VLE encoded)
   checksum: 8 bits (present if C = 1; not present if C = 0)

   +-------+-------+----------+----------+---------+------------+
   | TI    | FMT   | C_RTP_SN | C_RTP_TS | C_IP_ID | Length* of |
   | Value | Value | Length   | Length   | Length  | FO Header  |
   |       |       | (bits)   | (bits)   | (bits)  | (bytes)    |
   +-------+-------+----------+----------+---------+------------+
   |   0   |   0   |    6     |    4     |  N.P.   |     2      |
   |       |  10   |    6     |   11     |  N.P.   |     3      |
   |       |  11   |    8     |    9     |  N.P.   |     3      |
   +-------+-------+----------+----------+---------+------------+
   |  10   |   0   |    6     |  N.P.    |   11    |     3      |
   |       |   1   |    8     |  N.P.    |   16    |     4      |
   +-------+-------+----------+----------+---------+------------+
   |  11   |  00   |    6     |    4     |    6    |     3      |
   |       |  01   |    7     |    8     |    9    |     4      |
   |       |  10   |    8     |   12     |   12    |     5      |
   |       |  11   |    8     |    8     |   16    |     5      |
   +-------+-------+----------+----------+---------+------------+

   Note: N.P. -- Not Present
   * length not including checksum

ACK: 2 bytes.

   +----+----------+
   | PT | C_RTP_SN |
   +----+----------+

   PT:       3 bits (value = "110")
   C_RTP_SN: 13 bits (LSBs of the acked RTP SN)

SO_EXT: 2 bytes without checksum or 3 bytes with checksum.
   +----+---+----------+::::::::::+
   | PT | C | C_RTP_SN | checksum |
   +----+---+----------+::::::::::+

   PT:       4 bits (value = "1110")
   C:        1 bit (indicates the presence of checksum)
   C_RTP_SN: 11 bits (LSBs of RTP SN, VLE encoded)
   checksum: 8 bits (present if C = 1; not present if C = 0)

SO_EXT is sent if the compressor is in SO state, but the 6-bit C_RTP_SN (OVLE encoded) is not long enough (for example, due to a large amount of packet loss before the compressor, or because no ACK has been received after sending 64 SO packets). Another reason for the compressor to send an SO_EXT packet is to handle the misordering case, since C_RTP_SN is VLE encoded in this header.

FO_EXT: variable length.

General structure:

   +----+----+---+---+-----------------------------+
   | PT | ST | C | M |    depending on ST value    |
   +----+----+---+---+-----------------------------+

   PT: 5 bits (value = "11110")
   ST: Sub-type, 1 or 2 bits (see notes below)
   C:  1 bit (indicating the presence of the checksum)
   M:  1 bit (Marker bit in RTP header)

Two reasons could trigger the transmission of an FO_EXT header:

   1) Changes in the RTP-SN, RTP-TS, or IP-ID cannot be carried using the FO format, i.e. the number of bits is not enough.
   2) Some non-essential header fields changed.

Therefore, we have three cases:

Case 1 (ST = 0): reason 1 only.

   +----+----+---+---+--------+--------+-------+::::::::::+
   | PT | ST | C | M | RTP_SN | RTP_TS | IP_ID | checksum |
   +----+----+---+---+--------+--------+-------+::::::::::+

   - RTP_SN, RTP_TS, and IP_ID are uncompressed.
   - checksum is 8 bits and present if C = 1.
   - Total length: 9 bytes without checksum or 10 bytes with checksum

Case 2 (ST = 10): reason 2 only.
   +----+----+---+---+---+----+-----+----------+::::::::::+
   | PT | ST | S | C | M | TI | FMT | C_RTP_SN | C_RTP_TS |
   +----+----+---+---+---+----+-----+----------+::::::::::+

   :::::::::+::::::::::+::::::::::::::+::::::::::+::::::::::+
    C_IP_ID | Bit Mask | Field Values |  signal  | checksum |
   :::::::::+::::::::::+::::::::::::::+::::::::::+::::::::::+

   - S: in-band signal flag (1 means signal field is present)
   - Everything from C bit up to and including C_IP_ID is the same as in FO header
   - Bit Mask (1 byte): indicates which fields are present
        M1 - Type of Service (in IPv4 header)
        M2 - Don't Fragment Flag (in IPv4 header)
        M3 - Time to Live (in IPv4 header)
        M4 - Padding-bit (in RTP header)
        M5 - Extension-bit (in RTP header, note that RTP header extension will be carried as RTP payload)
        M6 - Payload Type (in RTP header)
        M7 - CSRC Count (in RTP header)
        M8 - CSRC List (in RTP header)
   - Field Values: For simplicity, uncompressed values. Further optimizations are possible.
   - Signal: Used for in-band signaling
   - checksum: at least 8 bits, but may be longer to fit the byte boundary.

Case 3 (ST = 11): reasons 1 and 2.

   +----+----+---+---+---+--------+--------+-------+
   | PT | ST | S | C | M | RTP_SN | RTP_TS | IP_ID |
   +----+----+---+---+---+--------+--------+-------+

   ::::::::::+::::::::::::::+::::::::::+::::::::::+
    Bit Mask | Field Values |  signal  | checksum |
   ::::::::::+::::::::::::::+::::::::::+::::::::::+

   - S: in-band signal flag (1 means in-band signal is present)
   - RTP_SN, RTP_TS, and IP_ID are uncompressed
   - Bit Mask, Field Values, Signal and checksum are the same as in case 2

FH: Consists of the PT ("111110", 6 bits), followed by full headers from RTP, UDP, and IP, checksum, plus the payload. Some of the fields in the UDP and IP header could be inferred from the link layer (e.g. packet length), and may be replaced with compressor / decompressor level information (e.g., CID).
This type of packet header is normally sent only at session initiation, or in response to a REFRESH_REQ packet requesting a full refresh (F = 1), described below. The extra bits can be padding or used for other purposes.

   +----+------------+-------------+-------------+----------+
   | PT | IP header* | UDP header* | RTP header* | checksum |
   +----+------------+-------------+-------------+----------+

   * some field(s) may be modified (see above)

REFRESH_REQ: These packets are sent only in extremely rare circumstances, e.g., memory loss or CPU crash. A REFRESH_REQ may also be sent by the decompressor when an undetected transmission error is caught by the header compression level checksum.

   +----+---+
   | PT | F |
   +----+---+

   PT: 7 bits (value = "1111110")
   F:  = 1, requesting full refresh
       = 0, requesting only dynamic (non-static) refresh

5. Robustness/Efficiency Issues and Tradeoffs

This section provides the rationale behind the choices in ACE to achieve robustness and efficiency.

5.1. RPP Mode - Rationale for ACK-based Robustness

The primary means for providing robustness in the RPP mode is to send ACKs. When a return path is available, transitioning from a lower compression state to a higher compression state in a manner controlled by ACKs provides a proactive way to prevent loss of context synchronization, and consequently error propagation. Context loss is prevented because the compressor always uses as compression reference some header that is known, through an ACK, to have been correctly decompressed. The alternative is to have a reactive approach, i.e. detect loss of context synchronization and react to it by requesting the compressor to send information to resynchronize the contexts.

An issue that remains to be addressed is the residual transmission errors undetected by the CD-CC. When such an undetected error occurs, without any additional mechanism, the decompressor will decompress incorrectly without knowing it.
Depending on the technology of the CD-CC, the residual error rate may or may not be large enough to be a problem. If the residual error rate is excessive, there must be a sufficiently strong CRC added at the compressor/decompressor level to minimize the probability of undetected errors.

However, even with a strong CRC, there is a non-negligible chance that during the lifetime of an RTP session, at least one header may have an undetected error. When that happens, it is likely that decompression of the affected header will be incorrect. If the incorrectly decompressed header happens to be used as reference for decompressing subsequent headers, it is likely that the subsequent headers will be incorrectly decompressed, until another correctly decompressed header is used as reference. This problem is referred to as error accumulation.

To address this problem, a checksum (CS) is attached to every header that is a candidate to be a reference, to allow the decompressor to verify the correctness. The CS is calculated as an 8-bit one's complement checksum over the whole uncompressed header. If the decompressor detects an incorrect CS, it will simply discard the header.

Therefore, a secondary means of providing robustness is to send compressed headers with a checksum. Reasons why the checksum is used only as a secondary means of providing robustness are:

   - The extremely low probability of error accumulation when the link error detection is reasonably good.

   - The checksum is not meant to be a substitute for the link error detection CRC, and can dramatically reduce compression efficiency if used in this way. Indeed, an incorrect checksum will trigger a resynchronization process that is costly in bandwidth because the information sent consists of uncompressed "full" or large-sized headers. The sudden surge in bandwidth caused by the large-sized headers may not be handled well by some radio technologies.
The resynchronization mechanism is sensitive to errors and delays. Due to the round trip delay, there can be a significant time period during which the decompressor has to discard all incoming headers, while waiting for the requested resynchronization information from the compressor. This means that always or even frequently sending the checksum when error detection is sufficient and ACKs are already being used is NOT efficient use of transmission bandwidth. In ACE, a checksum need only be sent in headers which may be used as decompression reference. In the ACE decompression logic, these headers are usually a very small percentage of the headers (FH, or FO). Since the checksum need not be sent in every header, especially in 1 byte SO headers, the same bits that would be used for the checksum can be used to send a longer compressed sequence number and tolerate more packet loss. The loss range tolerated grows exponentially with the number of bits allocated to the sequence number. For example, when 2 more bits are used (6 bits instead of 4), the range is multiplied by 4 (can tolerate 64 packet loss rather than 16). Another use of checksum is during sparse FO or sparse IR techniques. If during the Sparse FO procedure, the decompressor receives an SO without receiving any of the preceding FO headers, it will decompress incorrectly without even knowing it. To alleviate this problem, a checksum (CS) is appended to all the SO headers during the Sparse FO procedure. Similarly, all FO during the Sparse IR procedure have a CS. Le, Clanton, Liu, Zheng [Page 32] INTERNET-DRAFT Robust Header Compression 24 May 2000 5.2. NRP Mode - Rationale for Operation The primary means for providing robustness in NRP mode is to send a checksum in EVERY compressed header packet, along with occasional refresh information. ACKs cannot be used, by definition of NRP mode. 
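
As an illustration of the checksum carried in these headers, the 8-bit one's complement checksum described in Section 5.1 might be computed as in the sketch below. This is not a normative definition: the byte-wise summation with end-around carry, the choice to transmit the complement of the sum, and the function names are all assumptions made for the example.

```python
def ones_complement_checksum8(header: bytes) -> int:
    """Sketch of an 8-bit one's complement checksum over an uncompressed header.

    Assumption: bytes are summed with end-around carry and the complement
    of the sum is the value carried in the compressed header.
    """
    total = 0
    for byte in header:
        total += byte
        # Fold any carry out of bit 7 back into the low 8 bits.
        total = (total & 0xFF) + (total >> 8)
    return ~total & 0xFF


def header_is_consistent(header: bytes, received_cs: int) -> bool:
    """Hypothetical decompressor-side check: recompute and compare."""
    return ones_complement_checksum8(header) == received_cs


if __name__ == "__main__":
    reconstructed = bytes([0x45, 0x00, 0xFF, 0x02])
    cs = ones_complement_checksum8(reconstructed)
    print(hex(cs), header_is_consistent(reconstructed, cs))
```

A decompressor following the text above would discard any header for which such a check fails.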

The main reason for inclusion of the checksum in every compressed header packet is that it provides a means of verifying that the decompressor is functioning properly, i.e., a means to check synchronization of contexts at compressor and decompressor. The link layer error protection, alone or in combination with additional protection at the compressor level, should result in very few undetected errors reaching the decompressor. If they do, the checksum can likely detect them, so the affected headers can be discarded before causing any harm to the decompressor. The same behavior is observed at the decompressor if the checksum fails due to other conditions, e.g., excessive loss on the CD-CC.

However, unlike with RPP, there is by definition no way to let the compressor know that such events have occurred. The compressor must therefore periodically send some refresh information to the decompressor 'just in case' something has gone wrong.

5.3. Checksum - Rationale for Use (Instead of CRC)

It is well-known that a CRC typically provides better error detection capability than a checksum of the same length. The advantages of a CRC are evident when isolated bit errors occur, or when the errors are in a small burst. Such error patterns are often observed on many types of transmission channels, including cellular channels.

However, the bit errors caused by loss of context synchronization, unlike transmission errors, will tend to be widespread. This means that a simple error detecting mechanism like a checksum would likely suffice.

Some other considerations in favor of using a checksum over a CRC include:

- Complexity: The improved performance of a CRC over a checksum does not come without cost: CRC implementation is more complex than implementation of a checksum. In practice, the checksum calculation can be provided in a very straightforward way, while CRC computation likely requires additional CPU cycles due to the bitwise operations that are often involved. The additional CPU cycles could be significant if the header compressor needs to process several flows in parallel, as would be the case for the header compression entity that would reside in the cellular infrastructure equipment. Processing even a small number of flows might be an issue for today's miniature wireless devices, which already aim to stretch battery performance to new limits.

- Flexibility: It is likely much easier to implement checksums of various lengths: one only needs to truncate the result of the one's complement sum. This flexibility might be desirable if, e.g., one wants to modify compressed header formats in the future in such a way that more/fewer bits are allocated for the checksum. By contrast, changing the length of a CRC will, at minimum, require definition of a new CRC polynomial. Further, if the CRC is implemented in hardware, this kind of modification can be quite significant.

6. Conclusions

Efficient IP/UDP/RTP Header Compression is a must for transmission of real-time multimedia over bandwidth-limited channels. Cellular systems in particular present significant challenges, not only in terms of bandwidth limitations, but also issues such as bursty error characteristics and long round trip delays.

A new header compression scheme which is both efficient and robust to a broad range of error and delay conditions has been presented. The scheme overcomes the faults associated with the baseline IP/UDP/RTP header compression technique [CRTP], using several new techniques. These include the use of a controlled transition from lower compression states to higher compression states, based on the compressor's confidence that the decompressor has acquired the information needed to decompress compressed headers sent in the higher compression state.
When a feedback channel is available, confidence is achieved by proactive feedback in the form of ACKs from the decompressor. In addition to that state transition strategy, various encoding schemes such as VLE and timer-based RTP timestamp compression are designed to minimize header overhead. ACE is also able to continue the compression/decompression process in a seamless fashion across handoffs, including those which entail physical relocation of the network-based compression/decompression function.

These features make this scheme ideal for VoIP transmission in cellular environments, or in any system which is subject to non-trivial error rates and delays.

The basic ideas and principles of ACE scale to handle a large number of packet losses between the compressor and decompressor, and a large degree of packet loss and misordering before the compressor.

Preliminary results indicate that the new scheme's performance thoroughly exceeds that of [CRTP], particularly at high random packet error rates.

7. Intellectual Property Considerations

Nokia has filed patent applications that might possibly have technical relation to this contribution.

8. References

[REQ]    M. Degermark, "Requirements for IP/UDP/RTP robust header compression", IETF Draft, May 2000.

[CRTP]   S. Casner, V. Jacobson, "Compressing IP/UDP/RTP Headers for Low-Speed Serial Links", Internet Engineering Task Force (IETF) RFC 2508, February 1999.

[GSMSIG] ETSI Digital cellular telecommunications system (Phase 2+); "Mobile radio interface layer 3 specification", (GSM 04.08 version 7.0.1 Release 1998).

9. Authors' Addresses

Khiem Le
Nokia Research Center
6000 Connection Drive
Irving, TX 75039
USA
Phone: +1 972 894-4882
Fax: +1 972 894-4589
E-mail: khiem.le@nokia.com

Christopher Clanton
Nokia Research Center
6000 Connection Drive
Irving, TX 75039
USA
Phone: +1 972 894-4886
Fax: +1 972 894-4589
E-mail: chris.clanton@nokia.com

Zhigang Liu
Nokia Research Center
6000 Connection Drive
Irving, TX 75039
USA
Phone: +1 972 894-5935
Fax: +1 972 894-4589
E-mail: zhigang.liu@nokia.com

Haihong Zheng
Nokia Research Center
6000 Connection Drive
Irving, TX 75039
USA
Phone: +1 972 894-4232
Fax: +1 972 894-4589
E-mail: haihong.zheng@nokia.com

Appendix A - Header Field Classification

In this section we describe a classification of the different IP/UDP/RTP header fields. Three types of header fields are identified:

* Static: a field which is expected to be constant during the lifetime of the compressed packet flow. Examples of static fields are the source and destination IP addresses, and the source and destination UDP port numbers.

* Changing Essential: a field whose value will change with some frequency. Examples of changing essential fields are the RTP timestamp, RTP sequence number, and IP-ID. Each has a tendency to change from one packet to the next. The term "essential" is used here only for convenience, and is not meant to imply that these fields are more essential than others.

* Changing Non-Essential: a field whose value can possibly change during a session, but seldom does. Examples are the RTP payload type, and the IP Time-To-Live (TTL) and Type of Service (TOS) fields. This class of header fields also includes "inferred" fields whose value may be provided by the link layer. An example is the IP total length field.

The table below summarizes classification of the various IP/UDP/RTP header fields.

+-----------+----------------+-------------------------------------+
|           |     Static     |             Non-static              |
|           |                +-----------------+-------------------+
|           |                |    Essential    |   Non-essential   |
+-----------+----------------+-----------------+-------------------+
| IPv4      | version        | IP-ID           | type of service   |
| Header    | header length  |                 | total length      |
| Fields    | protocol       |                 | don't fragment    |
|           | source IP addr |                 | more fragment     |
|           | dest IP addr   |                 | fragment offset   |
|           |                |                 | time to live      |
|           |                |                 | header checksum   |
+-----------+----------------+-----------------+-------------------+
| IPv6      | version        |                 | traffic class     |
| Header    | flow label     |                 | next header       |
| Fields    | source IP addr |                 | hop limit         |
|           | dest IP addr   |                 |                   |
+-----------+----------------+-----------------+-------------------+
| UDP       | source port    |                 | length            |
| Header    | dest port      |                 | checksum          |
| Fields    |                |                 |                   |
+-----------+----------------+-----------------+-------------------+
| RTP       | version        | marker-bit      | padding-bit       |
| Header    | SSRC           | sequence number | extension-bit     |
| Fields    |                | timestamp       | payload type      |
|           |                |                 | CSRC count        |
|           |                |                 | CSRC list         |
+-----------+----------------+-----------------+-------------------+

Appendix B - Implementation Hints

This appendix is informative. It is meant to provide some recommendations on how to best utilize the concepts discussed in this draft. The descriptions below are not meant to restrict implementations in any way; the primary objective is to provide suggestions and loose guidelines in the following areas:

- realization of channels for transmission of ACK information

- transmission frequency of ACK

- transmission frequency of compressed header with checksum

- suggestions for setting ACE parameters (Lf, Lr, etc.)

B.1. Considerations for Feedback Channel Realization (RPP mode only)

ACK transmission flexibility means the implementor can utilize any of several feedback channel options, depending on what facilities are available in the target system. Some options (and associated tradeoffs/considerations) follow below.

- Shared ACK channel: In this case, N channels are shared between M decompressors, where typically N << M. The tradeoff here is bandwidth used (low compared to the dedicated case) vs. response time (slower than the dedicated case).

- Dedicated ACK channel: In this case, each decompressor has its own dedicated feedback channel. The tradeoff here is again bandwidth used (high compared to the shared case) vs. response time (faster than the shared case).

- Piggybacked ACK: In this case, ACKs in the reverse direction are `piggybacked' on data packets already being sent in the reverse direction. Bandwidth usage in this case could be quite low, since the transmission overhead (e.g., L1/L2 overhead) of the ACK can be shared with the packet already being sent. But response time could suffer, since transmission of an ACK can occur only when other packets already need to be sent. In the case of speech, the ACK might be piggybacked, e.g., on actual speech from the remote endpoint. During silence intervals, some cellular speech codecs send comfort noise frames that might also be used to piggyback the ACK data.

- Hybrids: Combinations of piggybacking and the first two options are also possible, e.g., use a shared channel if piggybacking is not possible.

Clearly, it is desirable to have low delay ACK transmission using as few system resources as possible. As a general rule, it is recommended that one should make use of ACK piggybacking if possible. But as described below, ACK transmission frequency is quite low, so that some inefficiency in transmitting ACKs has only a minor effect on the effectiveness of the compression.

B.2. ACK Transmission Frequency (RPP mode only)

Although the overhead due to the ACKs is quite low, it is still desirable to minimize ACK transmission costs. This means reducing transmission of ACKs to those times when they are absolutely needed. By carefully managing ACK transmission, it is possible to take full advantage of the robustness/compression efficiency improvements they can provide, without burdening the system with large amounts of traffic on the feedback channels.

The items below provide some general guidelines for transmitting ACKs:

- In SO state, it is desirable that the compressor receive an ACK early enough to avoid increasing the size of the sequence number sent in the SO header. For example, if the compressor is sending a 6-bit sequence number while in SO state, an ACK is needed at least once every 2^6 packets to avoid the need to send a larger sequence number (e.g., using the SO_EXT header). Note: to account for potential loss of ACKs and system round-trip delays, one may want to send them at a slightly higher rate.

- If an estimate of the round trip time between compressor and decompressor is known by the decompressor, the sparse ACK concept should be used whenever possible to avoid transmission of many consecutive ACKs during compressor state transitions.

If these guidelines are followed, one can expect very little overhead contribution due to ACKs (normally << 1 bit per compressed header).

B.3. Transmission of Checksum Information (RPP mode only)

This section applies to RPP mode only, since the checksum appears in all packets in NRP mode. As described in section 5, checksums provide only a secondary level of robustness when a feedback channel is present. However, we recommend that:

- Checksum is sent in EVERY FO header if the media is audio, particularly when sparse IR/FO mechanisms are used, to avoid the possibility of an undetected error corrupting the decompression context during state transitions.

- Checksum is sent in the SO header frequently enough that an ACK transmission is triggered at the decompressor in accordance with the guidelines for ACK transmission frequency described in section B.2. I.e., the sequence number wrap-around condition should be avoided.

B.4. Suggested Parameter Values

This section provides some considerations which should be taken into account when selecting actual values for the ACE parameters described throughout the document. In practice, it is good to fine-tune these parameters in the actual operating environment to ensure that the best performance is obtained.

IR_IRx/IR_FOx (x = 1, 2, 3, etc): These parameters define the behavior of the compressor during sparse IR operation. An arbitrary spacing between IR and FO headers can be achieved by setting the values appropriately; alternatively, sparse IR can be 'disabled' by choosing IR_FOx = 0 for all x. Selection of the parameters depends on the round trip time between compressor and decompressor, the robustness of the CD-CC, and the bandwidth available for the header. For the cellular case, we can expect non-negligible round-trip times, limited transmission robustness, and very limited bandwidth. In such a scenario, it probably makes sense to send an IR header followed by several FO headers (e.g., IR_IR1=1, IR_FO1=N, IR_IR2=1, IR_FO2=N, etc, where N corresponds to the number of FO packets that can be sent in one round trip time observed on the CD-CC). Also, note that sending IR headers could mean degradation or muting of speech, due to their high bandwidth requirement. Also, very low loss on the CD-CC might mean very small values of x.

FO_FOx/FO_SOx (x = 1, 2, 3, etc): The same recommendations as were given in the case of the IR_IRx/IR_FOx parameters still apply. Since an FO header is typically smaller than a refresh header (and closer to the size of an SO header), estimates of round trip time would need to be adjusted (compared to the IR_IRx/IR_FOx case).

Lr: This parameter indicates the number of refresh headers sent before exiting the IR state, when the compressor operates in NRP mode. The value of Lr should be selected such that there is a high probability of receiving at least one of the refresh headers at the decompressor without errors. Factors which affect selection of this parameter value are:

- loss characteristics of the CD-CC; e.g., if the CD-CC is error-prone, Lr may be high

- values of IR_IRx and IR_FOx (x = 1, 2, 3, etc) if the sparse IR technique is used

Lf: This parameter indicates the number of FO headers that should be sent (since the last string pattern change) before exiting the FO state, when the compressor operates in NRP mode. The value of Lf is chosen using criteria similar to those used to pick Lr.

Lcs: This parameter indicates the maximum number of consecutive packets which can have an incorrect checksum value before some action is taken by the decompressor in RPP mode. With ACE, an incorrect checksum computation occurs when

- Link layer error detection mechanism has failed to detect an error

- FO received before IR in IR state (when sparse IR is used)

- SO received before FO in FO state (when sparse FO is used)

- Sliding window is managed in such a way that the decompression is incorrect (i.e., value of M, described below, is too small)

Factors to consider when selecting Lcs are:

- Value of 'M' for sliding window

- Amount of degradation that is allowable in the application; in case of RPP, refresh may be requested sooner if the application cannot tolerate much packet loss. This implies a smaller Lcs may be desired.

- Effect of refresh packets on compressor/decompressor performance; consideration should be given to the transmission medium's ability to tolerate the bandwidth surge caused by the refresh; refresh also negatively impacts compression efficiency. This may imply need for a larger Lcs value.

Refresh Time Out: This parameter defines the amount of time between transmission of refresh packets in NRP mode. There are several factors to consider:

- Frequent refresh is inefficient in terms of compression performance.

- Less frequent refresh can result in long periods where no data is provided to the application; an application left 'starving' for data for a considerable time may have problems or even terminate.

- Frequent refresh causes more frequent bandwidth surges; some channels may be able to tolerate the larger refresh packets better than others; a related consideration is that sending refresh info may require `stealing' from the application data flow.

- Probability that refresh is lost due to transmission conditions; since refresh packets are larger than other packets, probability of corruption is higher.

It is recommended that one start with a long refresh timeout and then lower the value to take into account the needs of the application, the bandwidth available, and the robustness of the CD-CC.

M: M is a parameter that defines the maximum size of the sliding window (VLE and Timer-Based encoding schemes). Considerations when selecting M include:

- Memory (cost) limitations; larger M means more memory/cost

- Round-Trip Time between compressor and decompressor

- Too small a value of M could result in transmission of FH_REQ when the sliding window is moved without an ACK (see Compressor Logic).

Appendix C - Experimental Results

The new header compression scheme was implemented (along with RFC 2508) on a testbed in order to validate its operation and performance.
This section describes the testbed and some of the results obtained.

C.1. Test Configuration

+------------+     +-----+      +---------+
| Endpoint 1 | --- | Hub | ---- | Linux 1 |---+        "Mobile Terminal"
|    (PC)    |     |  1  |      |  (UPA)  |   |
+------------+     +-----+      +---------+   |
                                              |
                                          +-------+
                                          | Hub 2 |    "Air Interface"
                                          +-------+
                                              |
+------------+     .........    +---------+   |
| Endpoint 2 | --- :  LAN  : -- | Linux 2 |---+        "network"
|    (PC)    |     :       :    |  (UPA)  |
+------------+     :.......:    +---------+

The testbed employs a User Plane Adaptation (UPA) function on both the "network" side (Linux 2) and the "mobile" side (Linux 1). The adaptation function performs several functions:

* Header Compression: either IETF RFC 2508, or ACE.

* Channel Delay Simulation: a fixed delay (set at run time) is applied to each packet at the transmitting side. The same delay model was applied on both forward and reverse channels.

* Channel Error Model: packet loss is simulated at the decompressor. The testbed can simulate channel conditions as either packet loss or bit errors, according to random or template-based error distributions. The same error model was applied on both forward and feedback channels.

* H.225/H.245 Message Parser: the adaptation function on the mobile side (in Linux 1) includes a control message parser which can derive RTP port number information at call setup. This information is signalled to the network adaptation function entity as well. The port number data allows each adaptation function to easily determine which information flows should be routed to the compressor.

* Handoff Manager: the adaptation function on the network side (Linux 2) includes software for simulating handoff effects due to the header compression (e.g., transfer of state information). Linux 2 can host two complete UPA entities simultaneously for this purpose.

* Network Jitter Simulation: a model of the network jitter can be included; this provides a controllable means to test the behavior of the timer-based RTP timestamp compression scheme.

* Packet Reordering Simulation: various amounts of packet reordering prior to the compressor can be simulated.

The endpoints are standard Windows PCs running Microsoft Netmeeting, a commonly used H.323 based audio/video conferencing application. Netmeeting provides the IP/UDP/RTP data flows.

C.2. Assumptions in implementing RFC 2508

In the results which follow, it is important to note that the following assumptions were made in implementing RFC 2508:

1. The UDP checksum can be omitted. This reduces the minimal compressed header size to 2 bytes, compared to 4 with the checksum. The same assumption is made regarding ACE.

2. The compressor sends a large header upon receipt of an explicit context state request from the decompressor, rather than periodically sending a full header every 'r' packets, where r is the 'refresh rate'.

3. The decompressor sends another context state request if the next packet received after the previous request is NOT a full header.

4. The decompressor drops any packets received until the full header is received.

5. The compressor responds to all context state requests, i.e., a full header may be sent unnecessarily due to the timing of the request and the delay of full header delivery to the decompressor.

6. The delta encoding rule given in the specification is used.

7. The TWICE mechanism is not employed.

8. To be fair to RFC 2508, the size of the "full header" is counted as 17 bytes (COMPRESSED_NON_TCP), instead of 40 bytes (FULL_HEADER).

9. CID is not used, e.g., the link layer provides enough information to discriminate multiple flows (assumed for both ACE and RFC 2508, for fairness).

10. Length information need not be sent in the compressed header, e.g., the link layer provides the necessary data.

C.3. Test Results

This section summarizes performance when different channel characteristics are simulated. The testbed is capable of simulating random packet loss, random bit errors, and bit errors according to a provided template. A short description of the meaning of these error models follows.

* Random Packet Loss: Entire VoIP packets (each packet contains one speech frame, representing a 20 or 30 ms speech sample) are dropped at the decompressor according to a provided numerical input. The error simulation module at the decompressor runs a clock, and updates an error flag for each timeslot (20 ms or 30 ms, depending on codec). If a compressed packet arrives at a time when the flag is set to true, the entire packet is dropped. Therefore, a value of 2% means that on average, 2 out of every 100 opportunities to transmit a packet would result in the packet being lost (i.e., the error simulation is NOT driven by reception of RTP packets).

* Random Bit Errors: In this case, the location of the bit error determines whether or not the packet is dropped. If one or more errors occur in the header of the packet (link layer header and/or compressed header), then the packet is dropped. Errors in the payload do not result in loss of the entire packet. A value of 2% will result in on average 2 out of every 100 bits being corrupted. As with the above, the simulation is not driven by reception of bits. Note again that the bits may carry nothing if there are no compressed packets being transmitted (e.g., during a silence period).

* Errors According to Template: In this case, the error behavior is determined by an error trace or template which may have been obtained from a cellular channel simulator, or alternatively, an actual cellular system with the capability to produce such traces. The same rules for processing the packet apply as with the Random Bit Error model.

Test Configuration Details and Assumptions:

* Microsoft Netmeeting G.723.1 6.4 kbps speech codec used; one frame per VoIP packet (every 30 ms in this case)

* 60 ms one way delay

* Packet Loss rates (PER) of 1%, 2%, 5%, 10% and 20%

* Packet Loss rate of both payload channel and feedback channel is the same.

Results for the case of random packet loss, with VLE plus Timer-Based coding of the RTP timestamp employed (no dynamic switching was used in this case), are shown in the table below. Efficiency is measured in terms of average packet header size (in bytes) with the overhead associated with ACKs included. Link layer overhead for the ACKs is NOT included, as it depends on the nature of the feedback channel.

Conversational Speech Sample

+----------+-----------------------------------+
| PER (%)  |  1.0    2.0    5.0   10.0   20.0  |
+----------+-----------------------------------+
| RFC 2508 |  1.86   2.64   3.93   5.10   5.83 |
+----------+-----------------------------------+
| ACE      |  1.42   1.42   1.42   1.48   1.52 |
+----------+-----------------------------------+

To guarantee fairness, the same voice samples were used as the audio source for each test. Silence suppression is used by Netmeeting to induce RTP timestamp irregularities corresponding to 'real' talkspurts and silence intervals. The adjustable threshold within that application is set to be the same during the runs for each HC scheme. As mentioned above, it is assumed that CID related information can be derived from the link layer in the case of both ACE and [CRTP]. Therefore, it is excluded from compressed headers for both schemes.

Half of the voice sample is silence, which is typical for real-life conversation. The silence periods range from 0.3 to 8.5 seconds. The talk spurts range from 1 to 10 seconds. Besides silence periods, irregular changes of IP-ID trigger half of the FO packets.

For [CRTP], the actual packet loss measured is at least double the PER over the simulated channel. This is expected since at least one extra packet is invalidated by the decompressor and thus dropped, if the preceding packet is detected as out of synchronization (error propagation, see Appendix B for description). In general, with [CRTP], the packet loss rate increases with the round trip delay. But for ACE, error propagation has been completely eliminated. Also of note is the near constant performance of ACE, even at high packet loss rates.

A simple subjective evaluation proves consistent with the obtained objective measurements. The voice quality in the case of ACE is clearly better than that using plain RFC 2508. Audio clips are recorded into files for evaluation by interested parties. Audio clips will be provided on the ACE homepage.

Appendix D - Handoff Operation

Handoff presents a problem for the header compression process because it normally results in the loss of multiple packets between compressor and decompressor. Most of the loss is the result of sending/receiving necessary signalling and synchronization information, when the mobile station would otherwise be sending/receiving user traffic. For example, an upper bound in GSM systems is 320 ms [GSMSIG], but more typically, when handoff occurs, communication is disrupted for 100 ms, which translates into multiple 20 ms speech packets.
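
To make the handoff figures above concrete, the following sketch converts an outage duration into a number of lost speech packets and, using the rule of thumb from Section 5.1 that a k-bit sequence number tolerates a loss of 2^k packets, estimates how many sequence number bits would cover that loss. The function names are hypothetical, and the assumption that one packet is lost per 20 ms interval of outage is a simplification made for illustration.

```python
import math


def packets_lost(outage_ms: float, packet_interval_ms: float) -> int:
    """Packets falling inside an outage, assuming one packet per interval."""
    return math.ceil(outage_ms / packet_interval_ms)


def min_sequence_bits(lost: int) -> int:
    """Smallest k with 2**k >= lost (Section 5.1 rule: k bits tolerate 2**k)."""
    bits = 0
    while (1 << bits) < lost:
        bits += 1
    return bits


if __name__ == "__main__":
    for outage_ms in (100, 320):  # typical and upper-bound figures quoted above
        lost = packets_lost(outage_ms, 20)
        print(outage_ms, "ms:", lost, "packets,", min_sequence_bits(lost), "bits")
```

Under these assumptions the typical 100 ms disruption corresponds to 5 packets and the 320 ms bound to 16 packets, both within the 64-packet range quoted in Section 5.1 for a 6-bit sequence number.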

Schemes such as [CRTP] will necessarily have to reinitialize the compression/decompression process by sending large amounts of synchronization information when handoff occurs. But ideally, the process would not have to do any reinitialization after completion of the handoff, if it can handle the loss of multiple packets. Efficiency of this operation is important, since in some types of wireless systems (e.g., PCS), cells can be small (several hundred to a few thousand feet), such that handoff can occur quite frequently at mobile speeds.

Another issue with handoff is the network compressor/decompressor relocation. In cellular systems, it is expected that the network part of header compression/decompression is done by some network entity that we refer to as the Access Network Infrastructure Adapter (ANI_AD). As the MS moves farther and farther away from the location where the call initially started, routing efficiency considerations may require that the header compression/decompression function be relocated from the initial ANI_AD to another ANI_AD. Such relocation of functions already exists in third generation UMTS systems, for the purpose of routing optimization when soft handover is done. The impact of ANI_AD relocation is that, to avoid reinitialization between the MS and the new ANI_AD, some context information must be transferred from the original ANI_AD to the new ANI_AD. The header compression scheme must be such that it can handle that context information transfer without being disrupted.

It is a must for the header compression scheme to handle cellular handoff operations in an efficient manner.

When header compression is applied to a cellular environment, there is an adaptation function in the MS and another one in the network which is responsible for the header compression. The MS adapter (MS_AD) acts as a compressor for the uplink and a decompressor for the downlink. Conversely, the access network infrastructure adapter (ANI_AD) acts as a decompressor for the uplink and a compressor for the downlink.

With ACE, it is possible to do radio handoff as well as ANI_AD relocation in a seamless manner, i.e., without requiring the compression/decompression process to go through reinitialization or resynchronization. Key to the seamlessness are:

- ACE's ability to withstand a very large number of packet losses between the compressor and decompressor, caused by the radio handoff

- ACE's ability to tolerate context information transfer from the old ANI_AD to the new ANI_AD, without disrupting the continuing compression and decompression of user packets.