Robust Header Compression (ROHC) WG                             Khiem Le
INTERNET-DRAFT                                       Christopher Clanton
Date: 24 May 2000                                            Zhigang Liu
Expires: 24 November 2000                                  Haihong Zheng

                                                   Nokia Research Center

       Adaptive Header ComprEssion (ACE) for Real-Time Multimedia
                     draft-ietf-rohc-rtp-ace-00.txt

Status of This Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This document is a submission of the IETF ROHC WG. Comments should be
   directed to its mailing list, rohc@cdt.luth.se.

Abstract

   When Real-Time Multimedia over IP is applied to cellular systems, it
   is critical to minimize the overhead of the IP/UDP/RTP header, as
   spectral efficiency is a top requirement. Robustness to errors and
   error bursts is also a must.

   Existing IP/UDP/RTP header compression schemes, such as that
   presented in IETF RFC 2508 [CRTP], do not provide sufficient
   performance in such environments. This report describes a new scheme
   (ACE, or Adaptive header ComprEssion), which, like RFC 2508, is
   based on the idea that most of the time IP/UDP/RTP fields are either
   constant or can be extrapolated in a linear fashion.
   However, ACE incorporates several additional concepts which enable
   it to provide compression efficiency exceeding that of [CRTP], along
   with a high degree of error resiliency. Some of the concepts
   employed, such as Variable Length Encoding (VLE), enable ACE to adapt
   to changing behavior in the IP/UDP/RTP header fields, such that good
   efficiency and robustness characteristics are maintained over a wide
   range of operating conditions.

   ACE is a general framework that can be parameterized to account for
   the existence/non-existence and performance characteristics of the
   feedback channel. Thus, ACE is applicable over both bi-directional
   and unidirectional links.

   ACE is also able to perform a seamless handoff, i.e. the scheme can
   resume efficient compression operation immediately after handoff.

Table of Contents

   Status of This Memo

   Abstract

   1. Introduction

   2. Basic Framework of ACE
      2.1. Terminology
      2.2. ACE Assumptions
      2.3. ACE Modes of Operation
      2.4. Compression States
         2.4.1. Initialization/Refresh (IR) State
         2.4.2. First Order (FO) State
         2.4.3. Second Order (SO) State
      2.5. Profiles and Context
      2.6. ACE Feedback
         2.6.1. ACE Acknowledgements
            2.6.1.1. Flexibility of ACE ACKnowledgements
         2.6.2. Refresh Requests
      2.7. Bandwidth and Efficiency Constraints
         2.7.1. Sparse IR and Sparse FO
         2.7.2. Sparse ACK
      2.8. Checksum
      2.9. Decompression Strategy

   3. Encoding
      3.1. VLE - Variable Length Encoding
         3.1.1. VLE Basics
         3.1.2. One-Sided Variable Length Coding (OVLE)
      3.2. Timer-Based Compression of RTP Timestamp

   4. Protocol definition
      4.1. Profiles
      4.2. Compressor/decompressor Logic
         4.2.1. Starting point
         4.2.2. NRP Mode
            4.2.2.1. Compressor logic
            4.2.2.2. Decompressor logic
         4.2.3. RPP Mode
            4.2.3.1. Compressor logic
            4.2.3.2. Decompressor logic
      4.3. Compressed Packet Formats

   5. Robustness/Efficiency Issues and Tradeoffs
      5.1. RPP Mode - Rationale for ACK-based Robustness
      5.2. NRP Mode - Rationale for Operation
      5.3. Checksum - Rationale for Use (Instead of CRC)

   6. Conclusions

   7. Intellectual Property Considerations

   8. References

   9. Authors' Addresses

   Appendix A - Header Field Classification

   Appendix B - Implementation Hints
      B.1. Considerations for Feedback Channel Realization
           (RPP mode only)
      B.2. ACK Transmission Frequency (RPP mode only)
      B.3. Transmission of Checksum Information (RPP mode only)
      B.4. Suggested Parameter Values

   Appendix C - Experimental Results
      C.1. Test Configuration
      C.2. Assumptions in implementing RFC 2508
      C.3. Test Results

   Appendix D - Handoff Operation

1. Introduction

   This Internet-Draft describes a general header compression
   framework, and specifies in detail a protocol to compress IP/UDP/RTP
   headers, based on the framework. For conciseness, general background
   information on header compression and the requirements specific to
   error-prone, low-bandwidth environments are not elaborated on in
   this document. Much of that information can be found in RFC 2508 and
   [REQ].

   The remainder of this document is structured as follows. Section 2
   is a conceptual introduction, which describes the basic ACE
   framework, as well as its application to IP/UDP/RTP header
   compression. Section 3 describes the different encoding techniques.
   Section 4 is the protocol definition. It provides a detailed
   description of the profile attributes and the resulting profiles, the
   compressor/decompressor logic, packet header formats, and bit field
   definitions. Section 5 gives the rationale for algorithm choices.
   Some informative appendices are also included at the end.

2. Basic Framework of ACE

2.1. Terminology

   - Static field: A header field that does not change for the duration
     of the session.
     An example is the UDP port number.

   - Random field: A field that is inherently unpredictable, e.g. the
     UDP checksum.

   - Compression context: The set of data used by the compressor to
     compress the current header. The compression context is a dynamic
     quantity that may change on a header-by-header basis.

   - Decompression context: The set of data used by the decompressor to
     decompress the current header. The decompression context is a
     dynamic quantity that may change on a header-by-header basis; to
     enable correct decompression, the decompression context must be in
     synchronization with the compression context.

   - Numerical field: A header field that can be interpreted as
     numerical information, e.g. the RTP Sequence Number. An example of
     a non-numerical field is the RTP CSRC list.

   - String: A sequence of headers whose 1) non-random numerical fields
     change by a delta value which is constant within the string (a
     particular case is delta equal to zero, if the field does not
     change); and 2) other fields do not change.

   - String pattern: The set of information needed to decompress a
     header belonging to a string. For non-random numerical fields, it
     is the delta value. For the other fields, it is the field itself.

   - Error propagation: Happens when the current compressed header,
     even though it is not corrupted by transmission error, has to be
     discarded by the decompressor because context synchronization
     has been lost between the compressor and decompressor. Error
     propagation lasts until the contexts have been resynchronized.

   - Error cumulation: Happens when context synchronization is lost
     due to undetected transmission errors, and subsequent headers are
     likely to be incorrectly decompressed.

   - Window-based field: A header field that is encoded according to
     the VLE or Timer techniques.

2.2. ACE Assumptions

   ACE requires only a minimal set of assumptions, which are listed
   here.

   - The communication channel between the compressor and the
     decompressor, referred to as 'CD-CC', could be a link or a
     concatenation of links; in particular, it could include multiple
     IP networks and/or multiple low-bandwidth links.

   - Packet Ordering: Packets transferred between compressor and
     decompressor may be lost or corrupted, but their order should be
     maintained by the CD-CC (i.e., FIFO pipe).

   - Error Detection: The scheme should include a mechanism to detect
     errors in the compressed header; this mechanism may be provided by
     the CD-CC. If the CD-CC does not provide adequate error detection
     capability, the scheme may be extended in a straightforward
     fashion by adding an error detection code at the
     compressor-decompressor level.

   - Packet Length: If this information is provided by the link layer,
     it may be excluded from the compressed header. Otherwise, packet
     length must be carried in the compressed header.

   - Link-Layer Fragmentation: The CD-CC may fragment packets in an
     arbitrary manner, as long as they are reconstructed at the
     decompressor.

2.3. ACE Modes of Operation

   ACE is a general framework for robust header compression that can be
   parameterized to account for the existence/non-existence and
   performance characteristics of a return path to carry feedback.
   Specifically, environments where the compressor/decompressor operate
   can be classified into one of the following cases:

   - Return Path Present (RPP): This is the case where there is a
     return path, which can be used to carry feedback information from
     the decompressor to the compressor. ACE does not make any
     assumptions on the performance of the return path. The return path
     may experience a wide variance in delay, latency and/or error
     performance.
     One example of a return path with large latency
     fluctuation is when feedback is piggybacked on the forward data
     sent in the opposite direction, and the forward data is sent only
     intermittently (e.g. speech that employs silence suppression,
     where data is only transmitted when there is actual speech
     detected at the transmitter).

   - No Return Path (NRP): This mode of operation is used when there is
     no reverse channel of any kind. In this case, no feedback can be
     obtained from the decompressor.

   ACE has two modes of operation, RPP and NRP. RPP mode is used when
   the compressor knows it is in the RPP case. NRP mode is used when
   the compressor knows it is in the NRP case. NRP mode is also used to
   start the process if the compressor does not know which case is in
   effect. If the compressor subsequently receives feedback from the
   decompressor, it will switch to the RPP mode and continue in that
   mode.

2.4. Compression States

   The compressor starts in the lowest compression state and gradually
   transitions to higher compression states. The general principle is
   that the compressor always operates in the highest possible
   compression state, under the constraint that the compressor has
   sufficient confidence that the decompressor has the information
   necessary to decompress a header compressed according to that state.
   In the RPP case, that confidence comes from receipt of ACKs from the
   decompressor. In the NRP case, that confidence comes from sending
   the information a certain number of times.

   The compressor may also transition back to a lower compression state
   when necessary.

   Thus ACE's primary approach to robustness is proactive, i.e. it aims
   at avoiding loss of context synchronization between the compressor
   and decompressor, rather than reacting to it.
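
   The mode selection and the confidence rule above can be illustrated
   with a small sketch. It is not part of the protocol definition: the
   class name, the method names and the repeat_count parameter (how many
   transmissions count as "a certain number of times" in NRP mode) are
   assumptions made for illustration only.

```python
class ConfidenceTracker:
    """Illustrative model of when the compressor may move to a higher state."""

    def __init__(self, repeat_count=3, return_path_known=False):
        # NRP mode is also the starting mode when the case is unknown.
        self.mode = "RPP" if return_path_known else "NRP"
        self.repeat_count = repeat_count  # hypothetical NRP parameter
        self.times_sent = 0
        self.acked = False

    def on_sent(self):
        """Record one transmission of the information in question."""
        self.times_sent += 1

    def on_feedback(self, is_ack):
        """Any feedback shows a return path exists: switch to RPP and stay."""
        self.mode = "RPP"
        if is_ack:
            self.acked = True

    def is_confident(self):
        # RPP: confidence comes from ACKs.
        # NRP: confidence comes from repeated transmission.
        if self.mode == "RPP":
            return self.acked
        return self.times_sent >= self.repeat_count
```

   In this sketch a compressor that switches to RPP mode after feedback
   other than an ACK is no longer considered confident until an ACK
   arrives, which is one possible reading of the rule above.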

   For IP/UDP/RTP compression, the three compressor states are
   Initialization/Refresh, First Order, and Second Order. A brief
   description of each is given in the subsections below.

2.4.1. Initialization/Refresh (IR) State

   In this state, the compressor essentially sends IR headers. The
   information sent in a refresh may be static and non-static fields in
   uncompressed form (full refresh), or just non-static fields in
   uncompressed form (non-static refresh). The compressor enters this
   state at initialization, upon request from the decompressor, or upon
   Refresh Time-out. The compressor leaves the IR state when it is
   confident that the decompressor has correctly received the refresh
   information.

2.4.2. First Order (FO) State

   Subsequent to the IR state, the compressor operates in the FO state
   when the header stream does not conform to a string, or when the
   compressor is not confident that the decompressor has acquired the
   string pattern. In this state, the compressor essentially sends FO
   headers. In the case of speech with silence suppression turned on,
   a new talk spurt following a silence interval will result in the RTP
   TS incrementing by more than TS_stride, the regular TS increment.
   Consequently, the header stream does not conform to the string
   pattern, and the compressor is in the FO state. The compressor will
   leave this state and transition to the SO state when the current
   header conforms to a string, and the compressor is confident the
   decompressor has acquired the string pattern.

2.4.3. Second Order (SO) State

   The compressor enters this state when the header to be compressed
   belongs to a string, and the compressor is sufficiently confident
   that the decompressor has also acquired the string pattern. In the SO
   state, the compressor sends SO headers, which essentially only
   consist of a sequence number.
   While in the SO state, the
   decompressor does a simple extrapolation based on information it
   knows about the pattern of change of the header fields and the
   sequence number contained in the SO header in order to regenerate the
   uncompressed header. The compressor leaves this state to go back to
   the FO state when the header no longer conforms to the string.

2.5. Profiles and Context

   To be able to decompress, the decompressor must share a common
   knowledge of some information with the compressor. The information
   can be short term (i.e. may change frequently, e.g. from packet to
   packet), or long term (remains constant for the duration of the
   session, or seldom changes).

   A profile is a repository of the long term information. A profile is
   determined by the flow to be compressed (e.g. IP version; whether it
   is UDP, TCP, or RTP; if it is RTP, the codec behavior), the system
   operating environment (e.g. RPP or NRP), the CD-CC performance
   behavior and capabilities (e.g. error distributions, whether the
   CD-CC provides packet boundary framing), and by the user-specified
   required level of performance. A profile can be characterized by a
   set of attributes.

   A context is a repository of the short term information. For a given
   profile, the compressor uses a compression context to compress the
   current packet, while the decompressor uses a corresponding
   decompression context. An example of information contained in the
   context is the field values of the last decompressed header.
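
   As an illustration of how a decompression context supports the
   SO-state extrapolation of Section 2.4.3, the sketch below keeps the
   numerical field values of a reference header together with the
   per-header deltas of the string pattern, and regenerates the fields
   for a given sequence number. The Context class, the field names and
   the sample values are hypothetical, and the sequence number is
   assumed to be already fully resolved (wrap-around is ignored).

```python
from dataclasses import dataclass


@dataclass
class Context:
    """Hypothetical short-term state held by the decompressor."""
    ref_sn: int        # sequence number of the reference header
    ref_fields: dict   # numerical field values of the reference header
    deltas: dict       # string pattern: per-header delta of each field


def extrapolate(ctx, sn):
    """Regenerate numerical fields for header number sn from the context."""
    steps = sn - ctx.ref_sn
    return {
        name: value + steps * ctx.deltas.get(name, 0)
        for name, value in ctx.ref_fields.items()
    }


# Example with made-up values: RTP TS advances by 160 per header,
# IP-ID advances by 1 per header.
ctx = Context(
    ref_sn=10,
    ref_fields={"rtp_ts": 1600, "ip_id": 500},
    deltas={"rtp_ts": 160, "ip_id": 1},
)
fields = extrapolate(ctx, 13)  # {'rtp_ts': 2080, 'ip_id': 503}
```

   Fields with no entry in deltas are treated as unchanged, which
   corresponds to the "delta equal to zero" case of a string.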

   An IP/UDP/RTP compression or decompression profile is determined by
   the following considerations:

   - Stream to be compressed:

       IP version: v4 or v6
       UDP behavior: whether UDP checksum is used or not
       RTP codec behavior: value of TS_stride, linearity of RTP TS with
       respect to wall-clock

   - CD-CC:

       Distribution of header losses due to detected transmission errors
       Capability to sustain transmission of consecutive large size
       headers
       Residual error rate, seen by decompressor

   - User requirements:

       Maximum acceptable error propagation (in NRP case only)

   Each of the above considerations corresponds to some profile
   attribute.

   Stream-related profile attributes are derived by the compressor from
   observing the flow of headers. Some attributes (e.g. IP version) can
   be instantaneously derived from the current header. Others may
   require observation over multiple headers (e.g. value of TS_stride).
   CD-CC-related profile attributes are determined by upfront
   knowledge of the CD-CC. In the absence of such knowledge, default
   attributes can be used.

   The decompression profile must share some attribute values (e.g. IP
   version, TS_stride value) with the compression profile for the
   decompression to succeed. Some values are acquired by the
   decompressor through observing the flow of full headers sent at
   initialization (e.g. IP version). Others may have to be explicitly
   sent by the compressor, through in-band signaling or some other
   mechanism.

   In-band signaling information is sent by the compressor in the
   in-band signaling field of the compressed header. In-band signaling
   can be used to update the decompressor on some information not found
   explicitly in the non-static compressed header fields, and that
   typically seldom changes.
   An example is the information that the
   compressor is using the Timer encoding technique for the RTP TS of
   the current header (and of subsequent headers, until further notice),
   along with the value of TS_stride. Other uses are possible. See
   section 3 for details on encoding techniques.

   Some attributes do not need to be known by the decompressor.

   See section 4 for a detailed specification of the profiles along with
   their attributes.

2.6. ACE Feedback

   This section is relevant only to the RPP mode. ACE has different
   kinds of feedback:

   - Ack

   - Refresh_Request (full or non-static)

2.6.1. ACE Acknowledgements

   An ACK packet contains a sequence number that uniquely identifies the
   compressed header packet that is being ACKed. ACE ACKnowledgements
   have four main functions:

   - To inform the compressor that Refresh information has been
     received. In that case, the compressor knows that the decompressor
     has acquired the information necessary to decompress FO headers.
     This means the compressor can reliably transition to the next
     higher compression state, the FO state. This kind of ACK is
     referred to as an IR-ACK.

   - To inform the compressor that FO information has been received. In
     that case, the compressor knows that the decompressor has acquired
     the information necessary to decompress SO headers. This means the
     compressor can reliably transition to the next higher compression
     state, SO. This kind of ACK is referred to as an FO-ACK.

   - To inform the compressor that a header with a specific sequence
     number n has been received. In that case, the compressor knows that
     the decompressor can determine the sequence number without any
     ambiguity (caused, e.g., by counter wrap-around) up to header
     number n + seq_cycle, where seq_cycle is the counter cycle
     (determined by the number of bits, k, in the sequence number).
     This kind of ACK is referred to as an SO-ACK.

   - When information is sent as in-band signaling, to confirm that the
     in-band signaling information has been received.

   The control of transition from IR to FO to SO states by ACKs ensures
   that there is no context desynchronization, and therefore no error
   propagation. That is, a compressed header that is not received in
   error can always be correctly decompressed, because synchronization
   is never lost.

   Reception of ACKs by the compressor can also be used to increase
   compressor header field encoding efficiency. Compression is more
   efficient because the compressor just has to send the necessary
   information (but no more) to ensure correct decompression of the
   current header. In general, the minimal information that the
   compressor needs to send depends on what information the decompressor
   already knows. The information known at the decompressor is
   indicated to the compressor in the decompressor's ACK transmission.

2.6.1.1. Flexibility of ACE ACKnowledgements

   There is considerable flexibility with respect to when and how often
   the decompressor sends an ACK. ACE is also resilient to ACKs
   being lost or delayed. The compressor constantly adapts its
   compression strategy based on the current information to be
   compressed and the ACKs received.

   Loss or delay of an FO-ACK may result in the compressor staying
   longer in the FO state. Loss or delay of an SO-ACK may result in
   the compressor sending more bits for the sequence number, to prevent
   any incorrect decompression at the decompressor caused by counter
   wrap-around.

   An advantage of this flexibility is that the feedback channel
   utilized to transmit the ACKs can have very loose requirements. This
   is because ACKs only have an effect on the compression efficiency,
   NOT the correctness.
   Delay or loss of ACKs might cause the size of
   compressed headers to increase, but even in such cases the increase
   is minor (logarithmic).

2.6.2. Refresh Requests

   Refresh Requests are sent when the decompressor determines that it
   needs refresh information. Requests can be for a Full Refresh or a
   Non-static Refresh.

2.7. Bandwidth and Efficiency Constraints

2.7.1. Sparse IR and Sparse FO

   Larger size headers such as IR or some FO may cause a problem if too
   many of them are sent consecutively. This might occur, e.g., if the
   compressor waits for an ACK to transition to a higher compression
   state, but the round-trip delay between compressor and decompressor
   is large, and in the meantime the CD-CC cannot accommodate on a
   sustained basis the surge in bandwidth demand caused by the larger
   size headers. There is also a negative impact on overall efficiency.

   The sparse transmission technique can be used to alleviate the
   problem. In that technique, the larger size headers are interspersed
   with smaller size headers. The smaller size headers are the ones
   that would be sent in the higher compression state, to which the
   compressor would transition if the larger size headers were ACKed.
   Sparse transmission can be applied to IR headers interspersed with
   FO headers, or FO headers interspersed with SO headers.

2.7.2. Sparse ACK

   An enhancement to the acknowledgement procedure can be used to reduce
   FO-ACK traffic on the feedback channel; this traffic can be quite
   high if there is significant round trip delay between compressor and
   decompressor. In this case, several FO headers would be sent before
   the compressor can receive an ACK, and normally, one ACK would be
   sent by the decompressor for each FO header received.

   The basic idea is that whenever the decompressor receives a packet
   and needs to send an ACK to the compressor, it just sends the ACK
   once (or twice if there is no default 'pattern' agreed on by the
   compressor and decompressor) and waits for some round trip time (as
   opposed to sending ACKs in response to each, e.g., FO packet on the
   feedback channel). After the round trip time, if the decompressor
   can confirm that the compressor received the ACK (evidenced by
   receipt of an SO packet at the decompressor), it continues normal
   decompression. Otherwise, it will send the ACK again and the process
   repeats.

   The only potential drawback of this approach arises if the ACK sent
   by the decompressor is lost. In that case, progression to the next
   higher compression state by the compressor is delayed until the next
   ACK is correctly received (at least one round trip time).

2.8. Checksum

   In the RPP mode, a checksum (CS) is used as a secondary means of
   robustness, complementing the ACK, which is the primary means of
   robustness. The CS need only be attached to a small percentage of
   headers.

   In the NRP mode, a checksum is used in conjunction with refresh for
   robustness. The CS is attached to all headers.

   The CS is calculated as an 8-bit one's complement checksum,
   calculated over the whole uncompressed header. Refer to section 5 for
   a more detailed discussion.

2.9. Decompression Strategy

   The decompressor uses as reference for decompression only those
   headers of whose correct decompression it is sufficiently confident
   (secure references). A secure reference must be chosen
   from the headers received with an OK CS. Until a new secure reference
   is chosen, all subsequent headers are decompressed with respect to
   the current secure reference.
   A major advantage of this approach is
   that an undetected error which affects correct decompression of
   header m will not affect decompression of subsequent headers. For
   example, if header # 3 is an FO used as a secure reference, and
   header # 5 is an SO with an undetected error, the decompression of
   header # 6 will be based solely on header # 3 and not affected by
   header # 5. In other words, an undetected error will affect only the
   current header, just like when headers are not compressed.

3. Encoding

   This section describes techniques for minimizing the overhead
   associated with transmitting compressed header information in FO and
   SO packet headers, as well as in ACK packets.

3.1. VLE - Variable Length Encoding

   An alternative approach to encoding irregular changes in header
   fields is to send the 'k' least significant bits of the original
   header field value.

   Clearly, it is desirable for the compressor to minimize this number
   of bits. Due to the possible loss of packets on the channel between
   compressor and decompressor (CD-CC), the compressor does not know
   which packet will be used as the reference by the decompressor, and
   hence does not know how many LSBs need to be sent.

   Variable Length Encoding (VLE) solves this problem. The basic
   algorithm employs a 'sliding window', maintained by the compressor,
   which is advanced when the compressor has sufficient confidence that
   the decompressor has certain information. The confidence may be
   obtained by various means, e.g., an ACK from the decompressor if
   operating in RPP. In the case of NRP, a sliding window of fixed
   size, e.g. M (described later), may be used. In either case, the
   value of k depends on the current values in the sliding window.
   Details of the operation follow below.

3.1.1. VLE Basics

   Basic concepts of VLE are:

   * The decompressor uses one of the decompressed header values as a
     reference value, v_ref. The reference may be chosen by various
     means; one approach might be to select only headers whose correct
     reconstruction is verified by inclusion of a checksum with the
     compressed header ("secure" reference).

   * The compressor maintains a sliding window of the values (VSW) which
     may be chosen as a reference by the decompressor. It also
     maintains the maximum value (v_max) and the minimum value (v_min)
     of VSW.

   * When the compressor has to compress a value v, it calculates the
     range r = max(|v - v_max|, |v - v_min|). The value of k needed is
     k = ceiling(log2(2 * r + 1)), i.e., the compressor sends the
     ceiling(log2(2 * r + 1)) LSBs of v as the encoded value.

   * The compressor adds v into the VSW and updates v_min and v_max
     IF the value v could potentially be used as a reference by the
     decompressor.

   * The decompressor chooses as the decompressed value the one that is
     closest to v_ref and whose k LSBs equal the compressed value that
     has been received.

   Clearly, the sliding window must be moved forward (or shrunk) to
   prevent k from increasing too much. To do that, the
   compressor only needs to know which values in VSW have been received
   by the decompressor. In the case of RPP, that information is carried
   in the ACKs. In the case of NRP, the VSW is moved without ACK once
   the maximum number of entries, 'M', is already present in VSW.
   M is defined in the compressor logic section and further elaborated
   upon in the "Implementation Hints" appendix.

   The VLE concept can be applied to RTP Timestamp, RTP Sequence Number,
   IP-ID header fields, etc.

   The examples below illustrate the operation of VLE under various
   scenarios.
The field values used in the examples could correspond to
any fields that we wish to compress. The examples illustrate the
scenario where the compressed field has a resolution of one bit.

Example 1: Normal operation (no packet loss prior to compressor, no
reordering prior to compressor).

Suppose packets with header fields 279, 280, 281, 282, and 283 have
been sent, and 279 and 283 are fields of potential reference packets.

The current VLE window is {279, 283}.

If a packet with field value = 284 is received next, VLE computes
the following values

   New Value   VMax   VMin   r                             # LSBs
   284         283    279    max[|284-279|,|284-283|]=5    4

The window is unmodified if we assume the new packet {284} is not a
potential reference. The field is encoded using 4 bits in this case,
and the actual encoded value is the 4 least significant bits of 284
(100011100), which = 1100.

Example 2: Packet loss prior to compressor.

Suppose packets with header fields 279, 280, 281, 282, and 283 have
been sent, and 279 and 283 are fields of potential reference packets
such that the VSW is again {279, 283}.

If a packet with field value = 290 is received next, VLE computes the
following values

   New Value   VMax   VMin   r                             # LSBs
   290         283    279    max[|290-283|,|290-279|]=11   5

So the field is encoded using 5 bits. Actual encoded value is the 5
LSBs of 290 (100100010), which = 00010.

If we assume the new value is a potential reference, the new VSW is
{279, 283, 290}.

Example 3: Packet misordering prior to compressor.

Suppose packets with header fields 279, 280, 281, 282, and 283 have
been sent, and 279 and 283 are fields of potential reference packets
such that the VSW is again {279, 283}.

If a packet with field value = 278 is received next, VLE computes the
following values

   New Value   VMax   VMin   r                             # LSBs
   278         283    279    max[|278-283|,|278-279|]=5    4

So the field is encoded using 4 bits.
Actual encoded value is the 4
LSBs of 278 (100010110), which = 0110.

If we assume the new value is a potential reference, the new VSW is
{279, 283, 278}.

In any case, the VLE encoded fields must be accompanied by some bits
that identify which of the possible encoded field sizes is in use.
The size of this bit field can vary depending on the number of
different sizes one wishes to allow. The approach in ACE is to allow
only a few different sizes for byte-aligned header formats. Huffman
coding of the length is used to achieve some additional efficiency,
based on the expected frequency of needing the different sizes. 1 or
2 additional bits are actually sent in the ACE compressed header.

The decompressor behavior in all the example cases is the same: it
uses as a reference a specific decompressed header field value. The
header to use might be indicated by the presence of a checksum in the
compressed header packet, or by other means. The reference must by
definition be one of the values in the compressor's window.

For example, let's assume that the last correctly decompressed packet
which qualifies as a reference was the packet with header field =
291. Now suppose the field value 303 (100101111) is sent encoded with
5 bits, so that the received value is 01111. The two values closest
to 291 which have LSBs = 01111 are 271 and 303. 303 is closest,
therefore it is correctly selected as the uncompressed field value.

3.1.2. One-Sided Variable Length Coding (OVLE)

The VLE encoding scheme is very general and flexible, as it can
accommodate arbitrary changes (positive, negative) from one value to
the next. When VLE is applied to a field that is monotonic (e.g.
RTP SN), there is a loss in efficiency, because k, the number of
bits, is defined by the condition

   (2p + 1) < 2^k, where p = |current value - reference value|.
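
The VLE rules of section 3.1.1, together with the VLE condition on k
stated above, can be sketched in a few lines of Python. This is an
illustration only and not part of the ACE definition: the function
names vle_bits, vle_encode and vle_decode are hypothetical, field
values are assumed to be non-negative integers, and the integer
expression (2 * r).bit_length() is used as an exact equivalent of
ceiling(log2(2 * r + 1)).

```python
def vle_bits(v, window):
    """Number of LSBs k the compressor must send for value v.

    window holds the values (VSW) that the decompressor may be
    using as its reference.
    """
    r = max(abs(v - max(window)), abs(v - min(window)))
    # (2 * r).bit_length() equals ceiling(log2(2 * r + 1)) for r >= 0.
    return (2 * r).bit_length()


def vle_encode(v, window):
    """Return (k, the k least significant bits of v)."""
    k = vle_bits(v, window)
    return k, v & ((1 << k) - 1)


def vle_decode(lsbs, k, v_ref):
    """Return the value closest to v_ref whose k LSBs equal lsbs."""
    step = 1 << k
    base = (v_ref & ~(step - 1)) | lsbs
    return min((base - step, base, base + step),
               key=lambda c: abs(c - v_ref))


window = [279, 283]
for v in (284, 290, 278):             # Examples 1, 2 and 3
    k, lsbs = vle_encode(v, window)
    print(v, k, format(lsbs, "0{}b".format(k)))
    # Any window entry may be the decompressor's reference.
    assert all(vle_decode(lsbs, k, ref) == v for ref in window)
```

The final loop checks the property the window is meant to guarantee:
whichever window entry the decompressor holds as v_ref, the k bits
chosen by the compressor are enough to recover v.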

On the other hand, if the variation is known to be monotonic, the
required k is smaller, as it has to satisfy only

   p < 2^k.

One-Sided Variable Length Encoding (OVLE) is based on the idea of
using a k that satisfies the latter condition when the field to be
compressed is monotonic (increasing or decreasing). When the field
is almost always monotonic (quasi-monotonic), OVLE compression can be
used when the field is behaving monotonically, and 'regular' VLE used
when it is not.

The saving over VLE is 1 bit, and since that saving is achieved most
of the time, it translates into a 1 bit saving in the average
overhead. Alternatively, the number of bits can be kept the same,
but the frequency of ACKs can be reduced by a factor of 2.

3.2. Timer-Based Compression of RTP Timestamp

A useful observation is that the RTP timestamps, when generated at
the source, closely follow a linear pattern as a function of the time
of day clock, particularly when speech is being carried in the RTP
payload.

For example, if the time interval between consecutive speech samples
is 20 ms, then the RTP time stamp of header n (generated at time
n * 20 ms) = RTP time stamp of header 0 (generated at time 0) +
TS_stride * n, where TS_stride is a constant dependent on the voice
codec. In what follows, n is referred to as the 'packed' RTP TS.

Consequently, the RTP TS in headers coming into the decompressor also
follows a linear pattern as a function of time, but less closely, due
to the delay jitter between the source and the decompressor. In
normal operation (no crashes or failures), the delay jitter is
bounded, to meet the requirements of conversational real-time
traffic.
Thus, it is possible for the decompressor to obtain an
approximation of the packed RTP TS of the current header (the one to
be decompressed) by adding the time elapsed since the previous
header to the packed RTP TS of that previous header. The decompressor
then refines this approximation with the additional information
received in the compressed header. The compressed header carries the
k least significant bits of the packed RTP TS. The value of k
required to ensure correct decompression is a function of the jitter
between the source and decompressor. The compressor can estimate the
jitter and determine k, or alternatively it can use a fixed k and
filter out the packets with excessive jitter. Once the decompressor
has the packed RTP TS, it can convert it to the original RTP TS.

This approach has several advantages:

* The size of the compressed RTP TS is constant and small. In
  particular, it does NOT depend on the length of the silence
  interval. This is in contrast to other RTP TS compression
  techniques, which require a variable number of bits dependent on
  the duration of the preceding silence interval. It is very
  important to be able to efficiently compress the RTP TS, as it is
  one of the essential changing fields (see Appendix A).

* No synchronization is required between the timer process and the
  decompressor process.

* Robustness to errors: the partial RTP TS information in the
  compressed header is self-contained and only needs to be combined
  with the decompressor timer to yield the full RTP TS value. Loss
  or corruption of a header will not invalidate subsequent compressed
  headers.

As an example, consider the scenario in which a long silence interval
has just ended, and the header compression scheme is preparing to
send an FO header to the decompressor to adjust for the unexpected
change in RTP timestamp.
The compressor knows that the packet which has just
arrived is the first packet of a new talkspurt, as opposed to a
packet following a lost packet, because the RTP SN increments by only
one. Note that we need not assume any special behavior of the input
to the compressor (i.e. the scheme tolerates reordering, or more
generally, non-increasing RTP timestamp behavior observed prior to
the compressor).

At the end of the silence interval, the compressor sends in the FO
compressed header the k least significant bits of

   p_TS_current = (current RTP time stamp - TS0) / TS_stride.

p_TS_current is the "packed" representation of the current time; it
has a granularity of TS_stride, which is the RTP timestamp increment
observed during, e.g., a VoIP session (160 for a 20 ms voice codec).

TS0 is an arbitrary timestamp offset.

The compressor runs the following algorithm to determine k.

STEP 1: calculate Network_Jitter(Current_header, j) as

   | (T_current - T_j) - (p_TS_current - p_TS_j) |

for all packets in a sliding window, TSW. TSW contains several pairs
(T_j, p_TS_j) of values corresponding to the packets sent that may be
used as a reference, including the last packet which was ACKed. In
the case of RPP, TSW is moved when an ACK with some indication (e.g.,
checksum) is received from the decompressor. In the case of NRP mode,
the TSW is moved without an ACK once the maximum number of entries,
'M', is present in TSW. I.e., the sliding window is managed just as
in the case of VLE.

T_current is the current wall clock time at the compressor, and T_j
is the wall clock time at which the packet j in the sliding window
was received by the compressor. Both T_current and T_j are in units
of TS_stride.

p_TS_current and p_TS_j are the packed RTP timestamps of the
packets, determined from the actual RTP header.

STEP 2: compute Max_Network_Jitter, where

   Max_Network_Jitter = Max{Network_Jitter(current, j)},
                        for all headers j in TSW

Note that Max_Network_Jitter is non-negative.

STEP 3: k is then calculated as

   k = ceiling(log2(2 * J + 1)), where

   J = Max_Network_Jitter + Max_CD_CC_Jitter + 2.

Max_CD_CC_Jitter is the maximum possible jitter expected on the
CD-CC. Such a maximum depends only on the characteristics of the
CD-CC, and is expected to be reasonably small in order to have good
quality for real-time services.

The term + 2 accounts for the quantization error caused by the
timers at the compressor and decompressor, which can be +/- 1.

As an example of operation, consider the case of a voice codec with
20 ms frames, such that TS_stride = 160. Assume T_current and
p_TS_current are 357 and 351, respectively, and that we have a
sliding window TSW which contains the following 4 entries:

   j   T_j   p_TS_j

   1   9     7
   2   8     6
   3   7     4
   4   3     1

j above is the packet number.

In this case we have

   Network_jitter(1)=|(357-9)-(351-7)|=4 (80 ms Network Jitter)
   Network_jitter(2)=|(357-8)-(351-6)|=4 (80 ms Network Jitter)
   Network_jitter(3)=|(357-7)-(351-4)|=3 (60 ms Network Jitter)
   Network_jitter(4)=|(357-3)-(351-1)|=4 (80 ms Network Jitter)

So Max_Network_Jitter = 4.

We assume a maximum CD-CC jitter of 2 (40 ms); the total jitter to be
handled in this case is then

   J = 4 + 2 + 2 = 8 packets (160 ms)

and k = 5 bits (since 2 * 8 + 1 = 17 < 2^5). The compressor sends the
5 LSBs of p_TS_current to the decompressor (351 = 101011111, so the
encoded TS value = 11111).
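
The three steps above can be sketched as follows, using the values of
the worked example. This is a minimal illustration under stated
assumptions, not a normative procedure: the function name timer_k and
the representation of TSW as a list of (T_j, p_TS_j) pairs are
hypothetical, all times are assumed to be integers in units of
TS_stride, and (2 * J).bit_length() is used as an integer equivalent
of ceiling(log2(2 * J + 1)).

```python
def timer_k(t_current, p_ts_current, tsw, max_cd_cc_jitter):
    """Return (J, k) for the current header.

    tsw is the sliding window TSW as a list of (T_j, p_TS_j) pairs;
    every quantity is an integer in units of TS_stride.
    """
    # STEP 1 and STEP 2: largest network jitter over the window.
    max_network_jitter = max(
        abs((t_current - t_j) - (p_ts_current - p_ts_j))
        for t_j, p_ts_j in tsw
    )
    # STEP 3: the "+ 2" covers timer quantization at both ends.
    j = max_network_jitter + max_cd_cc_jitter + 2
    return j, (2 * j).bit_length()


tsw = [(9, 7), (8, 6), (7, 4), (3, 1)]
j, k = timer_k(357, 351, tsw, max_cd_cc_jitter=2)
encoded = 351 & ((1 << k) - 1)        # k LSBs of p_TS_current
print(j, k, format(encoded, "b"))
```

Note that k depends only on the jitter terms and not on how far
p_TS_current has advanced, which is why the encoded size stays small
after a long silence interval.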

When the decompressor receives this value, it first attempts to
estimate the timestamp by computing the time difference between the
last reference established and the current packet

   T_current - T_ref, where T_ref is the timer value (in units of
   TS_stride) at which the reference used by the decompressor was
   received.

That value is added to p_TS_ref to get the estimate.

Assume that at the decompressor packet #3 is used as the reference:

   - T_current = 359
   - T_ref = 7
   - p_TS_ref = 4

Note: T_current is chosen arbitrarily here; the difference between it
and T_ref represents the length of the silence interval as observed
at the decompressor. Then:

   T_current - T_ref = 359 - 7 = 352
   p_TS_current(estimate) = 352 + 4 = 356

The decompressor searches for the closest value to 356 which has, in
this case, LSBs = 11111. The value in this case is 351, the original
p_TS.

If instead the compressor were to send the timestamp jump as simply
the difference in consecutive packed RTP Timestamps, that value would
be

   p_TS_current - p_TS_ref = 351 - 4 = 347 = 101011011

So nearly twice as many bits (9 instead of 5) would be sent for a
silence interval of

   347 * 20 ms = 6.94 seconds

Due to basic conversational real-time requirements, the cumulative
jitter in normal operation is expected to be at most only a few times
TS_stride for voice. For this reason, the FO payload formats in
section 4.3 are optimized (in terms of representing different
k-length encoded TS values) for the case of k=4 (handles up to 16
discrepancies in the timestamp). The remaining formats allow a wide
range of jitter conditions (outside of just voice) to be handled as
well.

4. Protocol definition

4.1. Profiles

For IP/UDP/RTP compression, a profile has the following attributes:

- Mode: NRP or RPP; derived as described below
- IP version: v4 or v6; derived from IR header
- UDP checksum: yes or no; derived from IR header
- RTP TS encoding technique: VLE or Timer; encoding derived from
  negotiation

This set of profile attributes can be configured to support either
audio or video. In addition, the value of TS_stride is a numerical
parameter that is predefined, sent by in-band signaling, or
determined by some other means.

In-band signaling can be carried using the FO_EXT header, with the S
bit set to 1. In the RPP mode, an ACK of the FO_EXT confirms that the
decompressor has received the signal and thus allows the compressor
to stop sending the in-band signal (which means smaller headers
afterward). Note that in-band signaling is optional and can be
replaced by another signaling channel if the link provides one.

A default profile will be used when there is no prior knowledge of
some attributes: Mode, UDP Checksum or RTP TS behavior. In that
case,

- Mode is set to NRP
- UDP Checksum is yes
- RTP TS encoding is VLE

This profile will work with both audio and video.

In addition, depending on the specific CD-CC, the following may have
to be added:

- CRC at the compressor/decompressor level to detect transmission
  errors; this is needed to reduce the probability of undetected
  transmission error from the CD-CC to an acceptable level if it is
  not sufficiently low
- Packet length information: this is needed if the CD-CC does not
  provide the information
- CID: this is needed if the CD-CC does not provide a means to
  discriminate between flows

When CRC, Packet length and/or CID are needed, the exact formats and
lengths depend on the CD-CC technology, and therefore would have to
be defined on a case-by-case basis, as an implementation issue.

For simplicity, we assume no compressor level CRC is needed, packet
length information is provided, as well as a means to discriminate
between flows.

4.2. Compressor/decompressor Logic

4.2.1. Starting point

The compressor will start in the Initialization state and assume NRP
mode. If no ACK is ever received, it stays in NRP mode. Otherwise,
after receiving the first ACK, the compressor will switch to and stay
in RPP mode.

However, if by some other means (e.g. profile, or interface to link
layer, etc.) the compressor knows there is a feedback channel, the
compressor can start in the Initialization state of RPP mode
directly.

The following sections describe the logic for each operation mode.
Note that the logic is designed such that the switch from NRP to RPP
mode is seamless.

4.2.2. NRP Mode

4.2.2.1. Compressor logic

Below is the state machine for the compressor in NRP mode. Details of
each state and the transitions between states follow the diagram.

           Lr refresh sent              end of string
      +----->------->----------+  +------<-------<--------+
      |                        |  |                       |
      |                        v  v                       |
 +----------+              +----------+            +----------+
 | IR state |              | FO state |            | SO state |
 +----------+              +----------+            +----------+
    ^ ^                        |  |                    ^   |
    | |    refresh time-out    |  |     Lf FO sent     |   |
    | +-----<----------<-------+  +------->------->----+   |
    |                                                      |
    +----------------<---------------<---------------------+
                        refresh time-out

* IR state

  The compressor starts in IR state and sends full refresh headers
  using FH header format (see header format section).

  The compressor leaves the IR state and transitions to FO state
  when it has sent Lr Refresh headers, where Lr is a parameter.

  Note that the compressor need not send the Lr refresh headers
  consecutively. A sparse IR scheme can be applied to alleviate the
  surge of bandwidth problem.
  The compressor can send IR_IR1
  refresh packets, followed by IR_FO1 FO packets, then IR_IR2
  refresh packets, then IR_FO2 FO packets, and so on. Note that
  each FO packet sent in IR state MUST carry a checksum.

  Dynamic refresh is carried using FO_EXT header format with ST =
  11 and all the Mi bits set to 1. However, the compressor may
  want to send a full refresh sometime during the session to fully
  update the context of the decompressor. One simple
  implementation could be that a full refresh is sent after
  Cfr packets have elapsed since the last full refresh.

* FO state

  FO headers are sent in this state. Every FO header MUST carry a
  checksum.

  The compressor leaves this state and transitions to the SO state
  when the current header conforms to a string, and the compressor
  has sent Lf FO headers since the last string pattern change. Lf
  is a parameter.

  Similarly to the IR state, a sparse FO scheme can be used here.
  Basically, the compressor can send FO_FO1 FO packets, then FO_SO1
  SO packets, then FO_FO2 FO packets, then FO_SO2 SO packets, etc.
  Note that each SO packet MUST carry a checksum.

  The compressor will also go back to IR state after a Refresh
  time-out. A Refresh time-out occurs when the maximum time or
  number of packets allowed since the last refresh has elapsed. In
  a simple implementation, a parameter Cr (number of packets) can
  be used for the time-out purpose.

* SO state

  SO headers are sent in this state. Every SO header MUST carry a
  checksum.

  Fields in SO headers are encoded using as reference only those
  headers in the sliding window (see below).

  The compressor will transition back to FO state if the current
  string terminates, or back to IR state after a refresh time-out
  (see Cr above).

* Sliding window management

  Each window-based field value of a transmitted IR header or FO
  header MUST be added to the sliding window. If the window reaches
  the size of M, it will slide (i.e. the oldest header will be
  removed to make room for the new one). M is an implementation
  parameter.

4.2.2.2. Decompressor logic

* Only a full refresh (FH) packet can create a new context. Any
  headers (other than FH) belonging to an invalid (i.e.
  non-existing) context will be discarded until the corresponding
  context is created.

* Both full and dynamic refresh will update the context without
  decompression.

* If a checksum is present in a compressed header, the
  decompressor MUST use it to verify a correct decompression. If
  the checksum test fails, the decompressor MUST discard the
  packet.

* The decompressor MUST use only the following headers as reference
  for decompression of subsequent headers:

  1) A full or dynamic refresh header with a correct checksum
  2) An FO header with a correct checksum

4.2.3. RPP Mode

4.2.3.1. Compressor logic

Below is the state machine for the compressor in RPP mode. Details of
each state and the transitions between states follow the diagram.

            receive 1 ACK               end of string
      +----->------->----------+  +------<-------<--------+
      |                        |  |                       |
      |                        v  v                       |
 +----------+              +----------+            +----------+
 | IR state |              | FO state |            | SO state |
 +----------+              +----------+            +----------+
    ^ ^                        |  |                     ^   |
    | |  receive REFRESH_REQ   |  | receive 1 or 2 ACKs |   |
    | +-----<----------<-------+  +------->------->-----+   |
    |                                                       |
    +----------------<---------------<----------------------+
                       receive REFRESH_REQ

* IR state

  At the beginning of a session, the compressor starts in IR state
  and sends full refresh headers using FH header format (see header
  format section).

  During the session, the compressor MUST go to IR state upon
  receiving a REFRESH_REQ packet. It MUST send either full refresh
  or dynamic refresh, depending on the F-bit in the REFRESH_REQ
  packet.

  Each window-based field value of a transmitted IR header MUST be
  added to the sliding window.

  The compressor MUST leave the IR state and transition to FO state
  when it receives an ACK confirming that the decompressor has
  correctly received the refresh information. The ACK also triggers
  the deletion of older headers in the sliding window.

  Note that sparse IR as described in section 4.2.2.1 can also be
  applied here.

* FO state

  FO headers are sent in this state. The shortest FO header format
  that can carry enough information MUST be used.

  The compressor MAY choose to send a checksum in an FO header. In
  the case of audio, the compressor SHOULD attach the checksum to
  every FO header.

  Each window-based field value of an FO header with checksum MUST
  be added to the sliding window.

  The compressor leaves this state and transitions to the SO state
  when the current header conforms to a string, and 2 ACKs (1 ACK
  if default FO information has been established between the
  compressor and decompressor) older than the current string have
  been received.

  Upon receiving an ACK, the compressor SHOULD shrink the sliding
  window by deleting any value that is older than the one in the
  ACKed header.

  FO_EXT headers SHOULD be sent in FO state when the FO header
  format cannot carry the changes in the current header (e.g.,
  non-essential fields changed, see FO_EXT format).

  Note that sparse FO as described in section 4.2.2.1 can also be
  applied here.

* SO state

  SO headers are sent in this state. A checksum need not be
  transmitted in an SO header unless the compressor wants the
  decompressor to ACK that header.

  Each window-based field value of an SO header with checksum MUST
  be added to the sliding window.

  The compressor may also need to send an SO_EXT header if 1) the
  compressed RTP SN needs more bits than allowed in the SO header,
  or 2) there is a misordering of packets before the compressor
  which prevents the use of OVLE (as defined in SO packet). Note
  that not every misordering of packets will trigger an SO_EXT
  packet. An SO_EXT header is needed only in a severe misordering
  event, in which the RTP SN in a misordered packet is smaller than
  all the RTP SN values in the sliding window.

  The compressor will transition back to FO state if the current
  string terminates. In addition, it MUST go to IR state upon
  receiving a REFRESH_REQ packet from the decompressor.

* Sliding window management

  The compressor controls the size of the sliding window by sending
  more or fewer headers with checksum. The compressor will shrink
  the sliding window after receiving an ACK.

  An implementation may also set a maximum window size M and slide
  the sliding window even if no ACK is received. M can depend on
  the round trip time between the compressor and decompressor
  and/or the memory available to the compressor. However, the value
  of M SHOULD be large enough to avoid triggering the decompressor
  to send a REFRESH_REQ (see decompressor logic).

4.2.3.2. Decompressor logic

* The decompressor MUST use only a secure reference to decompress.
  A secure reference is a compressed header with a checksum
  (calculated over the uncompressed header) AND for which the
  checksum test succeeds.

* The decompression of an FO header is achieved using the last
  secure reference and applying different decoding rules (e.g. VLE
  or OVLE for RTP SN and Timer-based scheme for RTP TS) to
  different fields.

* The decompression of an SO header is achieved in two steps:

  1) Decompress the RTP SN in the current header by using the
     decompressed RTP SN in the last secure reference and applying
     the OVLE (if SO header) or VLE (if SO_EXT header) decoding
     algorithm.

  2) Decompress RTP TS and IP ID using linear extrapolation.

* If a header with checksum is received, the decompressor MUST
  verify the checksum after decompression. If the checksum test
  succeeds, the decompressor MUST send an ACK for that header. If
  it fails, the decompressor MUST discard the packet.

* The decompressor MUST send REFRESH_REQ after receiving Lcs
  consecutive headers with incorrect checksum, where Lcs is an
  implementation parameter.

4.3. Compressed Packet Formats

Here we describe the different header formats which are used by ACE.
The compressed RTP SN is used as the sequence number of the packet.

A compressed packet may include an 8-bit checksum, defined as the
one's complement of the one's complement sum of the uncompressed
(original) IP/UDP/RTP header.

Note that for simplicity, the RTP payload is not shown in the
following descriptions, since it always follows the compressed
headers. In addition, an extra CID field will be added if the link
layer does not provide mux/demux of different flows.

The following header formats are for IPv4. IPv6 formats will be
provided later.

SO: 1 byte without checksum or 2 bytes with checksum.

  Consists only of Packet Type (PT), used to distinguish the
  different types of header formats at the decompressor, C-bit,
  used to indicate the presence of checksum, Compressed RTP
  sequence number, C_RTP_SN, and the payload. Checksum is present
  if C = 1. SO packets are the most efficient packets to send, due
  to their small size.

  +----+---+----------+::::::::::::+
  | PT | C | C_RTP_SN |  checksum  |
  +----+---+----------+::::::::::::+

  PT:       1 bit (value = "0")
  C:        1 bit (value = "1" indicates the presence of checksum)
  C_RTP_SN: 6 bits (LSBs of RTP SN, OVLE encoded)
  checksum: 8 bits (present if C = 1; not present if C = 0)

FO: 2 to 6 bytes.

  FO packets are sent as a result of observing unrepresentable
  irregularities in the expected behavior pattern of the header
  fields. One purpose of this structure is to optimize coding
  (usually requires only 2 bytes) for the case of irregular change
  in RTP TS, which is usually the most frequent case of
  irregularity.

  Note that the C_RTP_TS carries LSBs of the packed RTP TS value.
  The coding rules can be either VLE or timer-based.

  +----+---+---+-----+-----+----------+::::::::::+:::::::::+
  | PT | C | M | TI  | FMT | C_RTP_SN | C_RTP_TS | C_IP_ID |
  +----+---+---+-----+-----+----------+::::::::::+:::::::::+
  ::::::::::+
   checksum |
  ::::::::::+

  PT:       2 bits (value = "10")
  C:        1 bit (indicates if checksum is present in this header)
  M:        1 bit (the Marker bit in the original RTP header)
  TI:       = 0  (1 bit), if only C_RTP_TS is present
            = 10 (2 bits), if only C_IP_ID is present
            = 11 (2 bits), if both fields are present
  FMT:      1 or 2 bits, combined with TI-bit, indicating the exact
            structure of FO header (see table below)
  C_RTP_SN: 6 or 8 bits (LSBs of RTP SN, VLE encoded)
  C_RTP_TS: variable length (LSBs of the packed RTP TS,
            VLE or timer-based encoding)
  C_IP_ID:  variable length (LSBs of IP ID, VLE encoded)
  checksum: 8 bits (present if C = 1; not present if C = 0)

  +------------------------------------------------------------+
  | TI    | FMT   | C_RTP_SN | C_RTP_TS | C_IP_ID | Length* of |
  | Value | Value | Length   | Length   | Length  | FO Header  |
  |       |       | (bits)   | (bits)   | (bits)  | (bytes)    |
  +-------+-------+----------+----------+---------+------------+
  | 0     | 0     | 6        | 4        | N.P.    | 2          |
  |       | 10    | 6        | 11       | N.P.    | 3          |
  |       | 11    | 8        | 9        | N.P.    | 3          |
  +-------+-------+----------+----------+---------+------------+
  | 10    | 0     | 6        | N.P.     | 11      | 3          |
  |       | 1     | 8        | N.P.     | 16      | 4          |
  +-------+-------+----------+----------+---------+------------+
  | 11    | 00    | 6        | 4        | 6       | 3          |
  |       | 01    | 7        | 8        | 9       | 4          |
  |       | 10    | 8        | 12       | 12      | 5          |
  |       | 11    | 8        | 8        | 16      | 5          |
  +-------+-------+----------+----------+---------+------------+
  Note: N.P. -- Not Present
        * length not including checksum

ACK: 2 bytes.

  +----+----------+
  | PT | C_RTP_SN |
  +----+----------+

  PT:       3 bits (value = "110")
  C_RTP_SN: 13 bits (LSBs of the acked RTP SN)

SO_EXT: 2 bytes without checksum or 3 bytes with checksum.

  +----+---+----------+::::::::::+
  | PT | C | C_RTP_SN | checksum |
  +----+---+----------+::::::::::+

  PT:       4 bits (value = "1110")
  C:        1 bit (indicates the presence of checksum)
  C_RTP_SN: 11 bits (LSBs of RTP SN, VLE encoded)
  checksum: 8 bits (present if C = 1; not present if C = 0)

  SO_EXT is sent if the compressor is in SO state, but the
  6-bit C_RTP_SN (OVLE encoded) is not long enough (for
  example, due to a large amount of packet loss before the
  compressor, or because no ACK has been received after sending
  64 SO packets). Another reason for the compressor to send an
  SO_EXT packet is to handle the misordering case, since C_RTP_SN
  is VLE encoded in this header.

FO_EXT: variable length.
      general structure

      +----+----+---+---+-----------------------------+
      | PT | ST | C | M |    depending on ST value    |
      +----+----+---+---+-----------------------------+
      PT: 5 bits (value = "11110")
      ST: Sub-type, 1 or 2 bits (see notes below)
      C:  1 bit (indicating the presence of the checksum)
      M:  1 bit (Marker bit in RTP header)

      Two reasons could trigger the transmission of an FO_EXT header:

      1) Changes in the RTP-SN, RTP-TS, or IP-ID cannot be carried
         using the FO format, i.e., the number of bits is not enough.
      2) Some non-essential header fields changed.

      Therefore, we have three cases:

      Case 1 (ST = 0): reason 1 only.

      +----+----+---+---+--------+--------+-------+::::::::::+
      | PT | ST | C | M | RTP_SN | RTP_TS | IP_ID | checksum |
      +----+----+---+---+--------+--------+-------+::::::::::+

      - RTP_SN, RTP_TS, and IP_ID are uncompressed.
      - checksum is 8 bits and present if C = 1.
      - Total length: 9 bytes without checksum or 10 bytes
        with checksum

      Case 2 (ST = 10): reason 2 only.
      +----+----+---+---+---+----+-----+----------+::::::::::+
      | PT | ST | S | C | M | TI | FMT | C_RTP_SN | C_RTP_TS |
      +----+----+---+---+---+----+-----+----------+::::::::::+

      :::::::::+::::::::::+::::::::::::::+::::::::::+::::::::::+
       C_IP_ID | Bit Mask | Field Values |  signal  | checksum |
      :::::::::+::::::::::+::::::::::::::+::::::::::+::::::::::+

      - S: in-band signal flag (1 means signal field is present)
      - Everything from C bit up to and including C_IP_ID is the same
        as in FO header
      - Bit Mask (1 byte): indicates which fields are present

           M1 - Type of Service (in IPv4 header)
           M2 - Don't Fragment Flag (in IPv4 header)
           M3 - Time to Live (in IPv4 header)
           M4 - Padding-bit (in RTP header)
           M5 - Extension-bit (in RTP header; note that the RTP header
                extension will be carried as RTP payload)
           M6 - Payload Type (in RTP header)
           M7 - CSRC Count (in RTP header)
           M8 - CSRC List (in RTP header)

      - Field Values: For simplicity, uncompressed values. Further
        optimizations are possible.
      - Signal: Used for in-band signaling
      - checksum: at least 8 bits, but may be longer to fit the
        byte boundary.

      Case 3 (ST = 11): reasons 1 and 2.

      +----+----+---+---+---+--------+--------+-------+
      | PT | ST | S | C | M | RTP_SN | RTP_TS | IP_ID |
      +----+----+---+---+---+--------+--------+-------+

      ::::::::::+::::::::::::::+::::::::::+::::::::::+
       Bit Mask | Field Values |  signal  | checksum |
      ::::::::::+::::::::::::::+::::::::::+::::::::::+

      - S: in-band signal flag (1 means in-band signal is present)
      - RTP_SN, RTP_TS, and IP_ID are uncompressed
      - Bit Mask, Field Values, Signal and checksum are the same as
        in case 2

   FH: Consists of the PT ("111110", 6 bits), followed by full headers
      from RTP, UDP, and IP, checksum, plus the payload. Some of the
      fields in the UDP and IP header could be inferred from the link
      layer (e.g.
      packet length), and may be replaced with compressor /
      decompressor level information (e.g., CID). This type of packet
      header is normally sent only at session initiation, or in
      response to an FH_REQ Packet, described below. The extra bits can
      be padding or used for other purposes.

      +----+------------+-------------+-------------+---------+
      | PT | IP header* | UDP header* | RTP header* |checksum |
      +----+------------+-------------+-------------+---------+

      * some field(s) may be modified (see above)

   REFRESH_REQ:

      These packets are sent only in extremely rare instances, e.g.,
      memory loss or CPU crash. They may also be sent by the
      decompressor when an undetected transmission error is caught by
      the header compression level checksum.

      +----+---+
      | PT | F |
      +----+---+

      PT: 7 bits (value = "1111110")
      F:  = 1, requesting full refresh
          = 0, requesting only dynamic (non-static) refresh

5. Robustness/Efficiency Issues and Tradeoffs

   This section provides the rationale behind the choices in ACE to
   achieve robustness and efficiency.

5.1. RPP Mode - Rationale for ACK-based Robustness

   The primary means for providing robustness in the RPP mode is to send
   ACKs. When a return path is available, transitioning from a lower
   compression state to a higher compression state in a manner
   controlled by ACKs provides a proactive way to prevent loss of
   context synchronization, and consequently error propagation. Context
   loss is prevented because the compressor always uses as compression
   reference some header that is known, through an ACK, to have been
   correctly decompressed. The alternative is a reactive approach, i.e.,
   to detect loss of context synchronization and react to it by
   requesting the compressor to send information to resynchronize the
   contexts.
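
   The ACK-driven choice of reference described above can be sketched
   as follows. This is only an illustration, not the ACE compressor
   logic: it assumes the 2-byte ACK layout from the packet formats (PT
   "110" followed by a 13-bit C_RTP_SN), most-significant-bit-first
   packing, and a policy of matching the ACK against the most recently
   sent header with the same 13 LSBs. The names parse_ack and
   ReferenceTracker are hypothetical.

```python
ACK_PT = 0b110                      # 3-bit packet type of the ACK header
SN_LSB_BITS = 13                    # C_RTP_SN width in the ACK header
SN_LSB_MASK = (1 << SN_LSB_BITS) - 1


def parse_ack(packet: bytes) -> int:
    """Return the 13 LSBs of the acked RTP SN (MSB-first packing assumed)."""
    if len(packet) != 2:
        raise ValueError("ACK header is 2 bytes")
    word = int.from_bytes(packet, "big")
    if word >> SN_LSB_BITS != ACK_PT:
        raise ValueError("not an ACK header")
    return word & SN_LSB_MASK


class ReferenceTracker:
    """Hypothetical helper: remembers sent SNs, promotes acked ones."""

    def __init__(self):
        self.sent = []          # full 16-bit RTP SNs, oldest first
        self.reference = None   # SN of the last header known to be decompressed

    def on_send(self, rtp_sn: int) -> None:
        self.sent.append(rtp_sn & 0xFFFF)

    def on_ack(self, packet: bytes):
        lsbs = parse_ack(packet)
        # Assumption: the ACK refers to the most recent matching header.
        for i in range(len(self.sent) - 1, -1, -1):
            if self.sent[i] & SN_LSB_MASK == lsbs:
                self.reference = self.sent[i]
                self.sent = self.sent[i + 1:]   # older headers are obsolete
                return self.reference
        return None             # unknown SN: keep the current reference
```

   Only headers confirmed in this way would be used as compression
   references, which is what makes the approach proactive rather than
   reactive.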

   An issue that remains to be addressed is the residual transmission
   errors undetected by the CD-CC. When such an undetected error occurs,
   without any additional mechanism, the decompressor will decompress
   incorrectly without knowing it. Depending on the technology of the
   CD-CC, the residual error may or may not be large enough to be a
   problem. If the residual error is excessive, there must be a
   sufficiently strong CRC added at the compressor/decompressor level to
   minimize the probability of undetected errors. However, even with a
   strong CRC, there is a non-negligible chance that during the lifetime
   of an RTP session, at least one header may have an undetected error.
   When that happens, it is likely that decompression of the affected
   header will be incorrect. If the incorrectly decompressed header
   happens to be used as reference for decompressing subsequent
   headers, it is likely that the subsequent headers will be incorrectly
   decompressed, until another correctly decompressed header is used as
   reference. This problem is referred to as error accumulation. To
   address this problem, a checksum (CS) is attached to every header
   that is a candidate reference, to allow the decompressor to verify
   its correctness. The CS is an 8-bit one's complement checksum,
   calculated over the whole uncompressed header. If the decompressor
   detects an incorrect CS, it will simply discard the header.
   Therefore, a secondary means of providing robustness is to send
   compressed headers with a checksum.

   Reasons why the checksum is used only as a secondary means of
   providing robustness are:

   - The extremely low probability of error accumulation when the link
     error detection is reasonably good.

   - The checksum is not meant to be a substitute for link error
     detection CRC, and can dramatically reduce compression efficiency
     if used in this way.
     Indeed, an incorrect checksum will trigger a resynchronization
     process that is bandwidth costly because the information sent
     consists of uncompressed "full" or large-sized headers. The sudden
     surge in bandwidth caused by the large-sized headers may not be
     handled well by some radio technologies. The resynchronization
     mechanism is sensitive to errors and delays. Due to the round trip
     delay, there can be a significant time period during which the
     decompressor has to discard all incoming headers, while waiting
     for the requested resynchronization information from the
     compressor.

   This means that always or even frequently sending the checksum when
   error detection is sufficient and ACKs are already being used is NOT
   an efficient use of transmission bandwidth. In ACE, a checksum need
   only be sent in headers which may be used as decompression reference.
   In the ACE decompression logic, these headers are usually a very
   small percentage of the headers (FH, or FO). Since the checksum need
   not be sent in every header, especially in 1 byte SO headers, the
   same bits that would be used for the checksum can be used to send a
   longer compressed sequence number and tolerate more packet loss. The
   loss range tolerated grows exponentially with the number of bits
   allocated to the sequence number. For example, when 2 more bits are
   used (6 bits instead of 4), the range is multiplied by 4 (a loss of
   64 packets can be tolerated rather than 16).

   Another use of the checksum is with the sparse FO or sparse IR
   techniques. If during the Sparse FO procedure, the decompressor
   receives an SO without receiving any of the preceding FO headers, it
   will decompress incorrectly without even knowing it. To alleviate
   this problem, a checksum (CS) is appended to all the SO headers
   during the Sparse FO procedure. Similarly, all FO headers during the
   Sparse IR procedure have a CS.

5.2. NRP Mode - Rationale for Operation

   The primary means for providing robustness in NRP mode is to send a
   checksum in EVERY compressed header packet, along with occasional
   refresh information. ACKs cannot be used, by definition of NRP mode.

   The main reason for inclusion of the checksum in every compressed
   header packet is that it provides a means of verifying that the
   decompressor is functioning properly, i.e., a means to check
   synchronization of contexts at compressor and decompressor.

   The link layer error protection, alone or in combination with
   additional protection at the compressor level, should result in very
   few undetected errors reaching the decompressor. If they do, the
   checksum can likely detect them, so the affected headers can be
   discarded before causing any harm to the decompressor. The same
   behavior is observed at the decompressor if the checksum fails due to
   other conditions, e.g., excessive loss on the CD-CC.

   However, unlike with RPP, there is by definition no way to let the
   compressor know that such events have occurred. The compressor must
   therefore periodically send some refresh information to the
   decompressor 'just in case' something has gone wrong.

5.3. Checksum - Rationale for Use (Instead of CRC)

   It is well-known that a CRC typically provides better error detection
   capability than a checksum of the same length. The advantages of a
   CRC are evident when isolated bit errors occur, or when the errors
   are in a small burst. Such error patterns are often observed on many
   types of transmission channels, including cellular channels.

   However, the bit errors caused by loss of context synchronization,
   unlike transmission errors, will tend to be widespread. This means
   that a simple error detecting mechanism like a checksum would likely
   suffice.
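
   As a rough illustration of how simple such a checksum is, the sketch
   below computes an 8-bit one's complement checksum over a byte
   string. Section 5.1 specifies only "8-bit one's complement checksum,
   calculated over the whole uncompressed header"; the end-around carry
   and final complement used here follow the usual Internet checksum
   convention and are an assumption, as are the function names.

```python
def ones_complement_sum8(data: bytes) -> int:
    """8-bit one's complement sum: add bytes, folding carries back in."""
    total = 0
    for byte in data:
        total += byte
        total = (total & 0xFF) + (total >> 8)   # end-around carry
    return total


def checksum8(header: bytes) -> int:
    """Assumed CS: complement of the 8-bit one's complement sum."""
    return ~ones_complement_sum8(header) & 0xFF


def verify(header: bytes, cs: int) -> bool:
    """Decompressor-side check of a reconstructed header against CS."""
    return checksum8(header) == cs
```

   A decompressor would run verify() on the reconstructed header and
   discard it on a mismatch, as described in section 5.1.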
   Some other considerations in favor of using a checksum over
   a CRC include:

   - Complexity: The improved performance of CRC over checksum does not
     come without cost: CRC implementation is more complex than
     implementation of a checksum. In practice, the checksum
     calculation can be provided in a very straightforward way, while
     CRC computation likely requires additional CPU cycles due to the
     bitwise operations that are often involved. The additional CPU
     cycles could be significant if the header compressor needs to
     process several flows in parallel, as would be the case for the
     header compression entity that would reside in the cellular
     infrastructure equipment. Processing even a small number of flows
     might be an issue for today's miniature wireless devices, which
     already aim to stretch battery performance to new limits.

   - Flexibility: It is likely much easier to implement checksums of
     various lengths: one only needs to truncate the result of the
     one's complement sum. This flexibility might be desirable if,
     e.g., one wants to modify compressed header formats in the future
     in such a way that more/fewer bits are allocated for the checksum.
     In contrast, changing the length of a CRC will, at minimum,
     require definition of a new CRC polynomial. Further, if the CRC is
     implemented in hardware, this kind of modification can be quite
     significant.

6. Conclusions

   Efficient IP/UDP/RTP Header Compression is a must for transmission of
   real-time multimedia over bandwidth-limited channels. Cellular
   systems in particular present significant challenges, not only in
   terms of bandwidth limitations, but also issues such as bursty error
   characteristics, and long round trip delays.

   A new header compression scheme which is both efficient and robust to
   a broad range of error and delay conditions has been presented.
   The scheme overcomes the faults associated with the baseline
   IP/UDP/RTP header compression technique [CRTP], using several new
   techniques. These include the use of a controlled transition from
   lower compression states to higher compression states, based on the
   compressor's confidence that the decompressor has acquired the
   information needed to decompress compressed headers sent in the
   higher compression state. When a feedback channel is available,
   confidence is achieved by proactive feedback in the form of ACKs from
   the decompressor. In addition to that state transition strategy,
   various encoding schemes such as VLE and timer-based RTP timestamp
   compression are designed to minimize header overhead. ACE is also
   able to continue the compression/decompression process in a seamless
   fashion across handoffs, including those which entail physical
   relocation of the network-based compression/decompression function.
   These features make this scheme ideal for VoIP transmission in
   cellular environments, or in any system which is subject to
   non-trivial error rates and delays.

   The basic ideas and principles of ACE scale to handle a large amount
   of packet loss between the compressor and decompressor, and a large
   degree of packet loss and misordering before the compressor.

   Preliminary results indicate that the new scheme's performance
   thoroughly exceeds that of [CRTP], particularly at high random packet
   error rates.

7. Intellectual Property Considerations

   Nokia has filed patent applications that might possibly have
   technical relation to this contribution.

8. References

   [REQ]    M. Degermark, "Requirements for IP/UDP/RTP robust header
            compression", IETF Draft, May 2000.

   [CRTP]   S. Casner, V. Jacobson, "Compressing IP/UDP/RTP Headers for
            Low-Speed Serial Links", Internet Engineering Task Force
            (IETF) RFC 2508, February 1999.

   [GSMSIG] ETSI Digital cellular telecommunications system (Phase 2+);
            "Mobile radio interface layer 3 specification", (GSM 04.08
            version 7.0.1 Release 1998).

9. Authors' Addresses

   Khiem Le
   Nokia Research Center
   6000 Connection Drive
   Irving, TX 75039
   USA

   Phone:  +1 972 894-4882
   Fax:    +1 972 894-4589
   E-mail: khiem.le@nokia.com

   Christopher Clanton
   Nokia Research Center
   6000 Connection Drive
   Irving, TX 75039
   USA

   Phone:  +1 972 894-4886
   Fax:    +1 972 894-4589
   E-mail: chris.clanton@nokia.com

   Zhigang Liu
   Nokia Research Center
   6000 Connection Drive
   Irving, TX 75039
   USA

   Phone:  +1 972 894-5935
   Fax:    +1 972 894-4589
   E-mail: zhigang.liu@nokia.com

   Haihong Zheng
   Nokia Research Center
   6000 Connection Drive
   Irving, TX 75039
   USA

   Phone:  +1 972 894-4232
   Fax:    +1 972 894-4589
   E-mail: haihong.zheng@nokia.com

Appendix A - Header Field Classification

   In this section we describe a classification of the different
   IP/UDP/RTP header fields. Three types of header fields are
   identified:

   * Static: a field which is expected to be constant during the
     lifetime of the compressed packet flow. Examples of static fields
     are the source and destination IP addresses, and the source and
     destination UDP port numbers.

   * Changing Essential: a field whose value will change with some
     frequency. Examples of changing essential fields are the RTP
     timestamp, RTP sequence number, and IP-ID. Each has a tendency to
     change from one packet to the next. The term "essential" is used
     here only for convenience, and does not mean to imply that these
     fields are more essential than others.

   * Changing Non-Essential: a field whose value can possibly change
     during a session, but seldom does. Examples are the RTP payload
     type, and the IP Time-To-Live (TTL) and Type of Service (TOS)
     fields. This class of header fields also includes "inferred"
     fields whose value may be provided by the link layer. An example
     is the IP total length field.

   The table below summarizes classification of the various IP/UDP/RTP
   header fields.

   +-----------+----------------+-------------------------------------+
   |           |     Static     |             Non-static              |
   |           |                +-----------------+-------------------+
   |           |                |    Essential    |   Non-essential   |
   +-----------+----------------+-----------------+-------------------+
   | IPv4      | version        | IP-ID           | type of service   |
   | Header    | header length  |                 | total length      |
   | Fields    | protocol       |                 | don't fragment    |
   |           | source IP addr |                 | more fragment     |
   |           | dest IP addr   |                 | fragment offset   |
   |           |                |                 | time to live      |
   |           |                |                 | header checksum   |
   +-----------+----------------+-----------------+-------------------+
   | IPv6      | version        |                 | traffic class     |
   | Header    | flow label     |                 | next header       |
   | Fields    | source IP addr |                 | hop limit         |
   |           | dest IP addr   |                 |                   |
   +-----------+----------------+-----------------+-------------------+
   | UDP       | source port    |                 | length            |
   | Header    | dest port      |                 | checksum          |
   | Fields    |                |                 |                   |
   +-----------+----------------+-----------------+-------------------+
   | RTP       | version        | marker-bit      | padding-bit       |
   | Header    | SSRC           | sequence number | extension-bit     |
   | Fields    |                | timestamp       | payload type      |
   |           |                |                 | CSRC count        |
   |           |                |                 | CSRC list         |
   +-----------+----------------+-----------------+-------------------+

Appendix B - Implementation Hints

   This appendix is informative. It is meant to provide some
   recommendations on how to best utilize the concepts discussed in this
   draft.
   The descriptions below are not meant to restrict
   implementations in any way; the primary objective is to provide
   suggestions and loose guidelines in the following areas:

   - realization of channels for transmission of ACK information

   - transmission frequency of ACK

   - transmission frequency of compressed header with checksum

   - suggestions for setting ACE parameters (Lf, Lr, etc.)

B.1. Considerations for Feedback Channel Realization (RPP mode only)

   ACK transmission flexibility means the implementor can utilize any of
   several feedback channel options, depending on what facilities are
   available in the target system. Some options (and associated
   tradeoffs/considerations) follow below.

   - Shared ACK channel: In this case, N channels are shared between M
     decompressors, where typically N << M. The tradeoff here is
     bandwidth used (low compared to dedicated case) vs. response time
     (slower than dedicated case).

   - Dedicated ACK channel: In this case, each decompressor has its
     own dedicated feedback channel. The tradeoff here is again
     bandwidth used (high compared to shared case) vs. response time
     (faster than shared case).

   - Piggybacked ACK: In this case, ACKs in the reverse direction are
     'piggybacked' on data packets already being sent in the reverse
     direction. Bandwidth usage in this case could be quite low, since
     transmission overhead (e.g., L1/L2 overhead) of the ACK can be
     shared with the packet already being sent. But response time could
     suffer, since transmission of an ACK can occur only when other
     packets already need to be sent. In the case of speech, the ACK
     might be piggybacked, e.g., on actual speech from the remote
     endpoint. During silence intervals, some cellular speech codecs
     send comfort noise frames that might also be used to piggyback the
     ACK data.

   - Hybrids: Combinations of the previous and the first two options
     are also possible, e.g., use a shared channel if piggybacking is
     not possible.

   Clearly, it is desirable to have low delay ACK transmission using as
   few system resources as possible. As a general rule, it is
   recommended that one make use of ACK piggybacking if possible.
   But as described below, ACK transmission frequency is quite low, so
   that some inefficiency in transmitting them has minor effects on the
   effectiveness of the compression.

B.2. ACK Transmission Frequency (RPP mode only)

   Although the overhead due to the ACKs is quite low, it is still
   desirable to minimize ACK transmission costs. This means reducing
   transmission of ACKs to those times when they are absolutely needed.
   By carefully managing ACK transmission, it is possible to take full
   advantage of the robustness/compression efficiency improvements they
   can provide, without burdening the system with large amounts of
   traffic on the feedback channels. The items below provide some
   general guidelines for transmitting ACKs:

   - In SO state, it is desirable that the compressor receive an ACK
     early enough that an increase in the size of the sequence number
     sent in the SO header is avoided. For example, if the compressor is
     sending a 6-bit sequence number while in SO state, an ACK is needed
     at least once every 2^6 packets to avoid the need to send a larger
     sequence number (e.g., using the SO_EXT header). Note: to account
     for potential loss of ACKs and system round-trip delays, one may
     want to send them at a slightly higher rate.

   - If an estimate of the round trip time between compressor and
     decompressor is known by the decompressor, the sparse ACK concept
     should be used whenever possible to avoid transmission of many
     consecutive ACKs during compressor state transitions.
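
   The first guideline can be turned into a quick calculation. The
   sketch below is illustrative only: it assumes the 2-byte ACK header
   from the packet formats, ignores any link-layer overhead of carrying
   the ACK, and uses a hypothetical safety_factor parameter to model
   sending ACKs somewhat more often than strictly required.

```python
ACK_SIZE_BITS = 16   # 2-byte ACK header (link-layer overhead ignored)


def max_ack_interval(sn_bits: int) -> int:
    """Largest number of packets between ACKs before the SN wraps."""
    return 1 << sn_bits


def ack_overhead_bits_per_header(sn_bits: int,
                                 safety_factor: float = 1.0) -> float:
    """Average ACK bits attributable to each compressed header.

    safety_factor > 1 means ACKs are sent that many times more often
    than the bare minimum, to cover ACK loss and round-trip delay.
    """
    interval = max_ack_interval(sn_bits) / safety_factor
    return ACK_SIZE_BITS / interval


# 6-bit SN: one ACK per 64 packets, i.e. 0.25 bit per header.
print(max_ack_interval(6), ack_overhead_bits_per_header(6))
# Sending ACKs twice as often still costs only 0.5 bit per header.
print(ack_overhead_bits_per_header(6, safety_factor=2.0))
```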

   If these guidelines are followed, one can expect very little overhead
   contribution due to ACKs (normally << 1 bit per compressed header).

B.3. Transmission of Checksum Information (RPP mode only)

   This section applies to RPP mode only, since the checksum appears in
   all packets for NRP mode.

   As described in section 5, checksums provide only a secondary level
   of robustness when a feedback channel is present. However, we
   recommend that:

   - Checksum is sent in EVERY FO header if the media is audio,
     particularly when sparse IR/FO mechanisms are used, to avoid the
     possibility of an undetected error corrupting the decompression
     context during state transitions.

   - Checksum is sent in the SO header frequently enough that an ACK
     transmission is triggered at the decompressor in accordance with
     the guidelines for ACK transmission frequency described in section
     B.2, i.e., the sequence number wrap-around condition should be
     avoided.

B.4. Suggested Parameter Values

   This section provides some considerations which should be taken into
   account when selecting actual values for the ACE parameters described
   throughout the document. In practice, it is good to fine-tune these
   parameters in the actual operating environment to ensure that the
   best performance is obtained.

   IR_IRx/IR_FOx (x = 1, 2, 3, etc): These parameters define the
   behavior of the compressor during sparse IR operation. An arbitrary
   spacing between IR and FO headers can be achieved by setting the
   values appropriately; alternatively, sparse IR can be 'disabled' by
   choosing IR_FOx = 0 for all x. Selection of the parameters depends
   on round trip time between compressor and decompressor, robustness of
   the CD-CC, and bandwidth available for the header. For the cellular
   case, we can expect non-negligible round-trip times, limited
   transmission robustness, and very limited bandwidth.
   In such a
   scenario, it probably makes sense to send an IR header followed by
   several FO headers (e.g., IR_IR1=1, IR_FO1=N, IR_IR2=1, IR_FO2=N,
   etc, where N corresponds to the number of FO packets that can be sent
   in one round trip time observed on the CD-CC).

   Also, note that sending IR headers could mean degradation or muting
   of speech, due to their high bandwidth requirement. Also, very low
   loss on the CD-CC might mean very small values of x.

   FO_FOx/FO_SOx (x = 1, 2, 3, etc): The same recommendations as were
   given in the case of IR_IRx/IR_FOx parameters still apply. Since an
   FO header is typically smaller than a refresh header (and closer to
   the size of an SO header), estimates of round trip time would need to
   be adjusted (compared to the IR_IRx/IR_FOx case).

   Lr: This parameter indicates the number of refresh headers sent
   before exiting the IR state, when the compressor operates in NRP
   mode. The value of Lr should be selected such that there is high
   probability of receiving at least one of the refresh headers at the
   decompressor without errors. Factors which affect selection of this
   parameter value are:

   - loss characteristics of the CD-CC; e.g., if the CD-CC is
     error-prone, Lr may be high

   - values of IR_IRx and IR_FOx (x = 1, 2, 3, etc) if the sparse IR
     technique is used

   Lf: This parameter indicates the number of FO headers that should be
   sent (since the last string pattern change) before exiting the FO
   state, when the compressor operates in NRP mode. The value of Lf is
   chosen using criteria similar to those used to pick Lr.

   Lcs: This parameter indicates the maximum number of consecutive
   packets which can have an incorrect checksum value before some action
   is taken by the decompressor in RPP mode.
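
   One way to picture the role of Lcs is the counter sketched below.
   It is an interpretation, not the ACE decompressor logic: it assumes
   that the "action" is a REFRESH_REQ (PT "1111110" followed by the F
   bit, packed most-significant-bit first into one byte), that the
   request fires on the Lcs-th consecutive failure, and that a
   dynamic-only refresh (F = 0) is requested by default. The names
   build_refresh_req and ChecksumMonitor are hypothetical.

```python
REFRESH_REQ_PT = 0b1111110   # 7-bit packet type of REFRESH_REQ


def build_refresh_req(full: bool) -> bytes:
    """One-byte REFRESH_REQ: 7-bit PT followed by the F bit."""
    return bytes([(REFRESH_REQ_PT << 1) | int(full)])


class ChecksumMonitor:
    """Counts consecutive checksum failures against the Lcs limit."""

    def __init__(self, lcs: int, full_refresh: bool = False):
        if lcs < 1:
            raise ValueError("Lcs must be at least 1")
        self.lcs = lcs
        self.full_refresh = full_refresh   # assumed default: dynamic only
        self.failures = 0

    def on_header(self, checksum_ok: bool):
        """Return a REFRESH_REQ to send, or None if no action is due."""
        if checksum_ok:
            self.failures = 0              # any good header resets the run
            return None
        self.failures += 1
        if self.failures < self.lcs:
            return None
        self.failures = 0                  # start counting afresh
        return build_refresh_req(self.full_refresh)
```

   A smaller Lcs makes the decompressor ask for a refresh sooner, at
   the cost of more frequent large headers; the factors below bear on
   that choice.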

   With ACE, an incorrect checksum computation occurs when:

   - Link layer error detection mechanism has failed to detect an error

   - FO received before IR in IR state (when sparse IR is used)

   - SO received before FO in FO state (when sparse FO is used)

   - Sliding window is managed in such a way that the decompression is
     incorrect (i.e., value of M, described below, is too small)

   Factors to consider when selecting Lcs are:

   - Value of 'M' for sliding window

   - Amount of degradation that is allowable in the application; in
     case of RPP, refresh may be requested more quickly if the
     application cannot tolerate much packet loss. This implies a
     smaller Lcs may be desired.

   - Effect of refresh packets on compressor/decompressor performance;
     consideration should be given to the transmission medium's ability
     to tolerate the bandwidth surge caused by the refresh; refresh also
     negatively impacts compression efficiency. This may imply need for
     a larger Lcs value.

   Refresh Time Out: this parameter defines the amount of time between
   transmission of refresh packets in NRP mode. There are several
   factors to consider:

   - Frequent refresh is inefficient in terms of compression
     performance.

   - Less frequent refresh can result in long periods where no data is
     provided to the application; an application left 'starving' for
     data for a considerable time may have problems or even terminate.

   - Frequent refresh causes more frequent bandwidth surges; some
     channels may be able to tolerate the larger refresh packets better
     than others; a related consideration is that sending refresh info
     may require 'stealing' from the application data flow.

   - Probability that refresh is lost due to transmission conditions;
     since refresh packets are larger than other packets, probability of
     corruption is higher.

   It is recommended that one start with a long refresh timeout and then
   lower the value to take into account the needs of the application,
   the bandwidth available, and the robustness of the CD-CC.

   M: M is a parameter that defines the max size of the sliding
   window (VLE and Timer-Based encoding schemes). Considerations when
   selecting M include:

   - Memory (cost) limitations; larger M means more memory/cost

   - Round-Trip Time between compressor and decompressor

   - Too small an M could result in transmission of FH_REQ when the
     sliding window is moved without an ACK (see Compressor Logic).

Appendix C - Experimental Results

   The new header compression scheme was implemented (along with RFC
   2508) on a testbed in order to validate its operation and
   performance. This section describes the testbed and some of the
   results obtained.

C.1. Test Configuration

                     +------------+    +-----+    +---------+
   "Mobile Terminal" | Endpoint 1 | -- | Hub | -- | Linux 1 |---+
                     |    (PC)    |    |  1  |    |  (UPA)  |   |
                     +------------+    +-----+    +---------+   |
                                                                |
                                                            +-------+
   "Air Interface"                                          | Hub 2 |
                                                            +-------+
                                                                |
   "network"                                                    |
                     +------------+    .......    +---------+   |
                     | Endpoint 2 | -- : LAN : -- | Linux 2 |---+
                     |    (PC)    |    :.....:    |  (UPA)  |
                     +------------+               +---------+

   The testbed employs a User Plane Adaptation (UPA) function on both
   the "network" side (Linux 2) and the "mobile" side (Linux 1). The
   adaptation function performs several functions:

   * Header Compression: either RFC 2508, or ACE.

   * Channel Delay Simulation: a fixed delay (set at run time) is
     applied to each packet at the transmitter side. The same delay
     model was applied on both forward and reverse channels.

   * Channel Error Model: packet loss is simulated at decompressor.
     The testbed can simulate channel conditions as either packet loss
     or bit errors, according to random or template-based error
     distributions. The same error model was applied on both forward
     and feedback channels.

   * H.225/H.245 Message Parser: the adaptation function on the mobile
     side (in Linux 1) includes a control message parser which can
     derive RTP port number information at call setup. This
     information is signalled to the network adaptation function
     entity as well. The port number data allows each adaptation
     function to easily determine which information flows should be
     routed to the compressor.

   * Handoff Manager: the adaptation function on the network side
     (Linux 2) includes software for simulating handoff effects due to
     the header compression (e.g., transfer of state information).
     Linux 2 can host two complete UPA entities simultaneously for this
     purpose.

   * Network Jitter Simulation: a model of the network jitter can be
     included; this provides a controllable means to test the behavior
     of the timer-based RTP timestamp compression scheme.

   * Packet Reordering Simulation: various amounts of packet reordering
     prior to the compressor can be simulated.

   The endpoints are standard Windows PCs running Microsoft
   NetMeeting, a commonly used H.323 based audio/video conferencing
   application. NetMeeting provides the IP/UDP/RTP data flows.

C.2. Assumptions in implementing RFC 2508

   In the results which follow, it is important to note that the
   following assumptions were made in implementing RFC 2508:

   1. Assumes that the UDP Checksum can be omitted. This reduces the
      minimal compressed header size to 2 bytes, compared to 4 with the
      checksum. The same assumption is made regarding ACE.

   2. Assumes that the compressor sends a large header upon receipt
      of an explicit context state request from the decompressor,
      rather than periodically sending a full header every 'r'
      packets, where r is the 'refresh rate'.

   3. The decompressor sends another context state request if the
      next packet received after the previous request is NOT a full
      header.

   4. The decompressor drops any packets received until the full
      header is received.

   5. Assumes the compressor responds to all context state requests,
      i.e., a full header may be sent unnecessarily due to timing of
      the request and delay of full header delivery to the
      decompressor.

   6. Assumes that the delta encoding rule given in the specification
      is used.

   7. The TWICE mechanism is not employed.

   8. To be fair to RFC 2508, the size of "full header" is counted
      as 17 bytes (COMPRESSED_NON_TCP), instead of 40 bytes
      (FULL_HEADER).

   9. CID is not used, e.g., the link layer provides enough
      information to discriminate multiple flows (assumed for both ACE
      and RFC 2508, for fairness).

   10. Length information need not be sent in the compressed header,
       e.g., the link layer provides the necessary data.

C.3. Test Results

   This section summarizes performance when different channel
   characteristics are simulated. The testbed is capable of simulating
   random packet loss, random bit errors, and bit errors according to a
   provided template. A short description of the meaning of these error
   models follows.

   * Random Packet Loss: Entire VoIP packets (each packet contains
     one speech frame, representing a 20 or 30 ms speech sample) are
     dropped at the decompressor according to a provided numerical
     input.

     The error simulation module at the decompressor runs a clock, and
     updates an error flag for each timeslot (20 ms or 30 ms, depending
     on codec).
     If a compressed packet arrives at a time when the flag is set to
     true, the entire packet is dropped. Therefore, a value of 2%
     means that on average, 2 out of every 100 opportunities to
     transmit a packet would result in the packet being lost (i.e., the
     error simulation is NOT driven by reception of RTP packets).

   * Random Bit Errors: In this case, the location of the bit error
     determines whether or not the packet is dropped. If one or more
     errors occur in the header of the packet (link layer header and/or
     compressed header), then the packet is dropped. Errors in the
     payload do not result in loss of the entire packet. A value of 2%
     will result in on average 2 out of every 100 bits being corrupted.
     As with the above, the simulation is not driven by reception of
     bits.

     Note again that the bits may carry nothing if no compressed
     packets are being transmitted (e.g., during a silence period).

   * Errors According to Template: In this case, the error behavior
     is determined by an error trace or template which may have been
     obtained from a cellular channel simulator, or alternatively, an
     actual cellular system with the capability to produce such traces.
     The same rules for processing the packet apply as with the Random
     Bit Error model.

   Test Configuration Details and Assumptions:

   * Microsoft Netmeeting G.723.1 6.4 kbps speech codec used; one frame
     per VoIP packet (every 30 ms in this case)

   * 60 ms one-way delay

   * Packet loss rates (PER) of 1%, 2%, 5%, 10% and 20%

   * The packet loss rate of the payload channel and the feedback
     channel is the same.

   Results for the case of random packet loss, with VLE plus Timer-Based
   coding of the RTP timestamp employed (no dynamic switching was used
   in this case), are shown in the table below.
   Efficiency is measured in terms of average packet header size (in
   bytes) with the overhead associated with ACKs included. Link layer
   overhead for the ACKs is NOT included, as it depends on the nature
   of the feedback channel.

      Conversational Speech Sample (average header size, bytes)
      +----------+-----------------------------------+
      | PER (%)  |   1.0    2.0    5.0   10.0   20.0 |
      +----------+-----------------------------------+
      | RFC 2508 |  1.86   2.64   3.93   5.10   5.83 |
      +----------+-----------------------------------+
      | ACE      |  1.42   1.42   1.42   1.48   1.52 |
      +----------+-----------------------------------+

   To guarantee fairness, the same voice samples were used as the audio
   source for each test. Silence suppression is used by Netmeeting to
   induce RTP timestamp irregularities corresponding to 'real'
   talkspurts and silence intervals. The adjustable threshold within
   that application is set to be the same during the runs for each HC
   scheme. As mentioned above, it is assumed that CID-related
   information can be derived from the link layer in the case of both
   ACE and [CRTP]. Therefore, it is excluded from compressed headers
   for both schemes.

   Half of the voice sample is silence, which is typical for real-life
   conversation. The silence periods range from 0.3 to 8.5 seconds. The
   talkspurts range from 1 to 10 seconds. Besides silence periods,
   irregular changes of IP-ID trigger half of the FO packets.

   For [CRTP], the actual packet loss measured is at least double the
   PER over the simulated channel. This is expected since at least one
   extra packet is invalidated by the decompressor and thus dropped, if
   the preceding packet is detected as out of synchronization (error
   propagation, see Appendix B for description). In general, with
   [CRTP], the packet loss rate increases with the round-trip delay.
   But for ACE, error propagation has been completely eliminated.
   Also of note is the near-constant performance of ACE, even at high
   packet loss rates.

   A simple subjective evaluation proves consistent with the obtained
   objective measurements. The voice quality in the case of ACE is
   clearly better than that using plain RFC 2508. Audio clips have been
   recorded into files for evaluation by interested parties and will be
   provided on the ACE homepage.

Appendix D - Handoff Operation

   Handoff presents a problem for the header compression process because
   it normally results in the loss of multiple packets between
   compressor and decompressor. Most of the loss is the result of
   sending/receiving necessary signalling and synchronization
   information, when the mobile station would otherwise be
   sending/receiving user traffic. For example, an upper bound in GSM
   systems is 320 ms [GSMSIG], but more typically, when handoff occurs,
   communication is disrupted for 100 ms, which translates into
   several 20 ms speech packets.

   Schemes such as [CRTP] will necessarily have to reinitialize the
   compression/decompression process by sending large amounts of
   synchronization information when handoff occurs. But ideally, the
   process would not have to do any reinitialization after completion of
   the handoff, if it can handle multiple packet loss. Efficiency of
   this operation is important, since in some types of wireless systems
   (e.g., PCS), cells can be small (several hundred to a few thousand
   feet), such that handoff can occur quite frequently at mobile speeds.

   Another issue with handoff is the relocation of the network
   compressor/decompressor. In cellular systems, it is expected that
   the network part of header compression/decompression is done by some
   network entity that we refer to as the Access Network Infrastructure
   Adapter (ANI_AD).
   As the MS moves farther and farther away from the location where the
   call initially started, routing efficiency considerations may
   require that the header compression/decompression function be
   relocated from the initial ANI_AD to another ANI_AD. Such relocation
   of functions already exists in third generation UMTS systems, for the
   purpose of routing optimization when soft handover is done. The
   impact of ANI_AD relocation is that, to avoid reinitialization
   between the MS and the new ANI_AD, some context information must be
   transferred from the original ANI_AD to the new ANI_AD. The header
   compression scheme must be such that it can handle that context
   information transfer without being disrupted.

   The header compression scheme therefore has to handle cellular
   handoff operations in an efficient manner.

   When header compression is applied to a cellular environment, there
   is an adaptation function in the MS and another one in the network,
   each responsible for header compression. The MS adapter (MS_AD)
   acts as a compressor for the uplink and a decompressor for the
   downlink. Conversely, the access network infrastructure adapter
   (ANI_AD) acts as a decompressor for the uplink and a compressor for
   the downlink.

   With ACE, it is possible to do radio handoff as well as ANI_AD
   relocation in a seamless manner, i.e., without requiring the
   compression/decompression process to go through reinitialization or
   resynchronization.

   Key to the seamlessness are:

   - ACE's ability to withstand a very large number of packet losses
     between the compressor and decompressor, caused by the radio
     handoff

   - ACE's ability to tolerate context information transfer from the
     old ANI_AD to the new ANI_AD, without disrupting the continuing
     compression and decompression of user packets.
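
   As a rough illustration of the handoff interruption figures quoted
   earlier in this appendix (100 ms typical, 320 ms upper bound in GSM,
   20 ms speech packets), the following Python sketch estimates how many
   consecutive speech packets fall inside a handoff gap. The function
   name and the loss model (one packet per packetization interval, every
   packet due during the gap is lost) are assumptions made for
   illustration only and are not part of ACE.

```python
import math


def packets_lost_during_handoff(disruption_ms: float, frame_ms: float) -> int:
    """Estimate consecutive speech packets falling inside a handoff gap.

    Assumption (illustrative, not from the ACE text): one packet is
    generated every frame_ms, and every packet due during the gap is lost.
    """
    if disruption_ms < 0 or frame_ms <= 0:
        raise ValueError("disruption_ms must be >= 0 and frame_ms > 0")
    # Round up: a partially overlapped packet interval still loses a packet.
    return math.ceil(disruption_ms / frame_ms)


if __name__ == "__main__":
    for disruption_ms in (100, 320):
        lost = packets_lost_during_handoff(disruption_ms, 20)
        print(f"{disruption_ms} ms gap, 20 ms packets: about {lost} lost")
```

   Under these assumptions, a 100 ms interruption corresponds to 5
   consecutive 20 ms packets and the 320 ms upper bound to 16, which
   gives a sense of the burst loss a compression scheme has to ride out
   if it is to avoid reinitialization after handoff.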