Internet Engineering Task Force    Audio/Video Transport Working Group
INTERNET-DRAFT                            S. Casner / Precept Software
draft-ietf-avt-crtp-04.txt                          V. Jacobson / LBNL
                                                     November 21, 1997
                                                     Expires: May 1998

      Compressing IP/UDP/RTP Headers for Low-Speed Serial Links

Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

To learn the current status of any Internet-Draft, please check the "1id-abstracts.txt" listing contained in the Internet-Drafts Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or ftp.isi.edu (US West Coast).

Distribution of this document is unlimited.

Abstract

This document describes a method for compressing the headers of IP/UDP/RTP datagrams to reduce overhead on low-speed serial links. In many cases, all three headers can be compressed to 2-4 bytes.

Comments are solicited and should be addressed to the working group mailing list rem-conf@es.net and/or the author(s).

1. Introduction

Since the Real-time Transport Protocol was published as an RFC [1], there has been growing interest in using RTP as one step to achieve interoperability among different implementations of network audio/video applications.
However, there is also concern that the 12-byte RTP header is too large an overhead for 20-byte payloads when operating over low-speed lines such as dial-up modems at 14.4 or 28.8 kb/s. (Some existing applications operating in this environment use an application-specific protocol with a header of a few bytes that has reduced functionality relative to RTP.)

Header size may be reduced through compression techniques, as has been done with great success for TCP [2]. In this case, compression might be applied to the RTP header alone, on an end-to-end basis, or to the combination of IP, UDP and RTP headers on a link-by-link basis. Compressing the 40 bytes of combined headers together provides substantially more gain than compressing 12 bytes of RTP header alone because the resulting size is approximately the same (2-4 bytes) in either case. Compressing on a link-by-link basis also provides better performance because the delay and loss rate are lower. Therefore, the method defined here is for combined compression of IP, UDP and RTP headers on a link-by-link basis.

This document defines a compression scheme that may be used with IPv4, IPv6 or packets encapsulated with more than one IP header, though the initial focus is on IPv4. The IP/UDP/RTP compression defined here is intended to fit within the more general compression framework specified in [3] for use with both IPv6 and IPv4. That framework defines TCP and non-TCP as two classes of transport above IP. This specification creates IP/UDP/RTP as a third class extracted from the non-TCP class.

2. Assumptions and Tradeoffs

The goal of this compression scheme is to reduce the IP/UDP/RTP headers to two bytes for most packets in the case where no UDP checksums are being sent, or four bytes with checksums. It is motivated primarily by the specific problem of sending audio and video over 14.4 and 28.8 kb/s dial-up modems.
These links tend to provide full-duplex communication, so the protocol takes advantage of that fact, though the protocol may also be used with reduced performance on simplex links.

This specification does not address segmentation and preemption of large packets to reduce the delay across the slow link experienced by small real-time packets, except to identify in Section 4 some interactions between segmentation and compression that may occur. Segmentation schemes may be defined separately and used in conjunction with the compression defined here.

It should be noted that implementation simplicity is an important factor to consider in evaluating a compression scheme. Communications servers may need to support compression over perhaps as many as 100 dial-up modem lines using a single processor. Therefore, it may be appropriate to make some simplifications in the design at the expense of generality, or to produce a flexible design that is general but can be subsetted for simplicity. The next sections discuss some of the tradeoffs listed here.

2.1. Simplex vs. Full Duplex

In the absence of other constraints, a compression scheme that worked over simplex links would be preferred over one that did not. However, operation over a simplex link requires periodic refreshes with an uncompressed packet header to restore compression state in case of error. If an explicit error signal can be returned instead, the delay to recovery may be shortened substantially. The overhead in the no-error case is also reduced. To gain these performance improvements, this specification includes an explicit error indication sent on the reverse path.

On a simplex link, it would be possible to use a periodic refresh instead.
Whenever the decompressor detected an error in a particular packet stream, it would simply discard all packets in that stream until an uncompressed header was received for that stream, and then resume decompression. The penalty would be the potentially large number of packets discarded. The periodic refresh method described in Section 3.3 of [3] applies to IP/UDP/RTP compression on simplex links as well as to other non-TCP packet streams.

2.2. Segmentation and Layering

Delay induced by the time required to send a large packet over the slow link is not a problem for one-way audio, for example, because the receiver can adapt to the variance in delay. However, for interactive conversations, minimizing the end-to-end delay is critical. Segmentation of large, non-real-time packets to allow small real-time packets to be transmitted between segments can reduce the delay.

This specification deals only with compression and assumes segmentation, if included, will be handled as a separate layer. It would be inappropriate to integrate segmentation and compression in such a way that the compression could not be used by itself in situations where segmentation was deemed unnecessary or impractical. Similarly, one would like to avoid any requirements for a reservation protocol. The compression scheme can be applied locally on the two ends of a link independent of any other mechanisms except for the requirements that the link layer provide some packet type codes, a packet length indication, and good error detection.

Conversely, separately compressing the IP/UDP and RTP layers loses too much of the compression gain that is possible by treating them together. Crossing these protocol layer boundaries is appropriate because the same function is being applied across all layers.

3. The Compression Algorithm

The compression algorithm defined in this document draws heavily upon the design of TCP/IP header compression as described in RFC 1144 [2]. Readers are referred to that RFC for more information on the underlying motivations and general principles of header compression.

3.1. The basic idea

In TCP header compression, the first factor-of-two reduction in data rate comes from the observation that half of the bytes in the IP and TCP headers remain constant over the life of the connection. After sending the uncompressed header once, these fields may be elided from the compressed headers that follow. The remaining compression comes from differential coding on the changing fields to reduce their size, and from eliminating the changing fields entirely for common cases by calculating the changes from the length of the packet. This length is indicated by the link-level protocol.

For RTP header compression, some of the same techniques may be applied. However, the big gain comes from the observation that although several fields change in every packet, the difference from packet to packet is often constant and therefore the second-order difference is zero. By maintaining both the uncompressed header and the first-order differences in the session state shared between the compressor and decompressor, all that must be communicated is an indication that the second-order difference was zero. In that case, the decompressor can reconstruct the original header without any loss of information simply by adding the first-order differences to the saved uncompressed header as each compressed packet is received.

Just as TCP/IP header compression maintains shared state for multiple simultaneous TCP connections, this IP/UDP/RTP compression must maintain state for multiple session contexts.
A session context is defined by the combination of the IP source and destination addresses, the UDP source and destination ports, and the RTP SSRC field. A compressor implementation might use a hash function on these fields to index a table of stored session contexts. The compressed packet carries a small integer, called the session context identifier or CID, to indicate in which session context that packet should be interpreted. The decompressor can use the CID to index its table of stored session contexts directly.

Because the RTP compression is lossless, it may be applied to any UDP traffic that benefits from it. Most likely, the only packets that will benefit are RTP packets, but it is acceptable to use heuristics to determine whether or not the packet is an RTP packet because no harm is done if the heuristic gives the wrong answer. This does require executing the compression algorithm for all UDP packets, or at least those with even port numbers (see section 3.4).

Most compressor implementations will need to maintain a "negative cache" of packet streams that have failed to compress as RTP packets for some number of attempts in order to avoid further attempts. Failing to compress means that some fields in the potential RTP header that are expected to remain constant most of the time, such as the payload type field, keep changing. Even if the other such fields remain constant, a packet stream with a constantly changing SSRC field must be entered in the negative cache to avoid consuming all of the available session contexts. The negative cache is indexed by the source and destination IP address and UDP port pairs but not the RTP SSRC field since the latter may be changing. When RTP compression fails, the IP and UDP headers may still be compressed.
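
As an illustration only, the context lookup and negative cache described above might be sketched as follows. The names FlowKey, ContextTable and MAX_FAILURES are hypothetical, the failure threshold of 3 is an arbitrary stand-in for "some number of attempts", and context eviction is deliberately left out; none of this is specified by this document.

```python
from dataclasses import dataclass, field

# Hypothetical threshold: the text only says "some number of attempts".
MAX_FAILURES = 3


@dataclass(frozen=True)
class FlowKey:
    """Addresses and ports only; this is the negative-cache index (no SSRC)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


@dataclass
class ContextTable:
    max_contexts: int = 16                        # negotiated in practice
    contexts: dict = field(default_factory=dict)  # (FlowKey, ssrc) -> CID
    failures: dict = field(default_factory=dict)  # FlowKey -> failed attempts

    def record_failure(self, flow):
        """Count one failed attempt to compress this flow as RTP."""
        self.failures[flow] = self.failures.get(flow, 0) + 1

    def is_negative(self, flow):
        return self.failures.get(flow, 0) >= MAX_FAILURES

    def lookup(self, flow, ssrc):
        """Return the CID for (flow, ssrc), or None if RTP compression
        should not be attempted (negative cache hit or table full)."""
        if self.is_negative(flow):
            return None
        key = (flow, ssrc)                 # dict lookup plays the hash role
        if key not in self.contexts:
            if len(self.contexts) >= self.max_contexts:
                return None                # simplification: no eviction
            self.contexts[key] = len(self.contexts)
        return self.contexts[key]
```

Keying the failure count on FlowKey rather than on (FlowKey, ssrc) is what keeps a stream with a constantly changing SSRC from allocating a new context on every packet.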

Fragmented IP packets that are not initial fragments and packets that are not long enough to contain a complete UDP header must not be sent as FULL_HEADER packets. Furthermore, packets that do not additionally contain at least 12 bytes of UDP data cannot be used to establish RTP context. If such a packet is sent as a FULL_HEADER packet, it may be followed by COMPRESSED_UDP packets but not by COMPRESSED_RTP packets.

3.2. Header Compression for RTP Data Packets

In the IPv4 header, only the total length, packet ID, and header checksum fields will normally change. The total length is redundant with the length provided by the link layer, and since this compression scheme must depend upon the link layer to provide good error detection (e.g., PPP's CRC), the header checksum may also be elided. This leaves only the packet ID, which, assuming no IP fragmentation, would not need to be communicated. However, in order to maintain lossless compression, changes in the packet ID will be transmitted. The packet ID usually increments by one or a small number for each packet. In the IPv6 base header, there is neither a packet ID nor a header checksum, and only the payload length field changes.

In the UDP header, the length field is redundant with the IP total length field and the length indicated by the link layer. The UDP checksum field will be a constant zero if the source elects not to generate UDP checksums. Otherwise, the checksum must be communicated intact in order to preserve the lossless compression. Maintaining end-to-end error detection for applications that require it is an important principle.

In the RTP header, the SSRC identifier is constant in a given context since that is part of what identifies the particular context. For most packets, only the sequence number and the timestamp will change from packet to packet.
If packets are not lost or misordered, the sequence number will increment by one for each packet. For audio packets of constant duration, the timestamp will increment by the number of sample periods conveyed in each packet. For video, the timestamp will change on the first packet of each frame, but then stay constant for any additional packets in the frame. If each video frame occupies only one packet, but the video frames are generated at a constant rate, then again the change in the timestamp from frame to frame is constant. Note that in each of these cases the second-order difference of the sequence number and timestamp fields is zero, so the next packet header can be constructed from the previous packet header by adding the first-order differences for these fields that are stored in the session context along with the previous uncompressed header. When the second-order difference is not zero, the magnitude of the change is usually much smaller than the full number of bits in the field, so the size can be reduced by encoding the new first-order difference and transmitting it rather than the absolute value.

The M bit will be set on the first packet of a talkspurt and the last packet of a video frame. If it were treated as a constant field such that each change required sending the full RTP header, this would reduce the compression significantly. Therefore, one bit in the compressed header will carry the M bit explicitly.

If the packets are flowing through an RTP mixer, most commonly for audio, then the CSRC list and CC count will also change. However, the CSRC list will typically remain constant during a talkspurt or longer, so it need be sent only when it changes.

3.3. The protocol

The compression protocol must maintain a collection of shared information in a consistent state between the compressor and decompressor.
There is a separate session context for each IP/UDP/RTP packet stream, as defined by a particular combination of the IP source and destination addresses, UDP source and destination ports, and the RTP SSRC field. The number of session contexts to be maintained may be negotiated between the compressor and decompressor. Each context is identified by an 8- or 16-bit identifier, depending upon the number of contexts negotiated, so the maximum number is 65536. Both uncompressed and compressed packets must carry the context ID and a 4-bit sequence number used to detect packet loss between the compressor and decompressor. Each context has its own separate sequence number space so that a single packet loss need only invalidate one context.

The shared information in each context consists of the following items:

o  The full IP, UDP and RTP headers, possibly including a CSRC list, for the last packet sent by the compressor or reconstructed by the decompressor.

o  The first-order difference for the IPv4 ID field, initialized to 1 whenever an uncompressed IP header for this context is received and updated each time a delta IPv4 ID field is received in a compressed packet.

o  The first-order difference for the RTP timestamp field, initialized to 0 whenever an uncompressed packet for this context is received and updated each time a delta RTP timestamp field is received in a compressed packet.

o  The last value of the 4-bit sequence number, which is used to detect packet loss between the compressor and decompressor.

o  The current generation number for non-differential coding of UDP packets with IPv6 (see [3]). For IPv4, the generation number may be set to zero if the COMPRESSED_NON_TCP packet type, defined below, is never used.

o  A context-specific delta encoding table (see section 3.3.4) may optionally be negotiated for each context.
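
The per-context state listed above can be pictured with a minimal sketch such as the following. The class and method names are hypothetical, the optional delta encoding table is omitted, and the loss check assumes that the 4-bit sequence number advances by one (modulo 16) per packet within a context, which the text above implies but does not state.

```python
from dataclasses import dataclass


@dataclass
class SessionContext:
    headers: bytes = b""    # last full IP/UDP/RTP headers, incl. any CSRC list
    delta_ip_id: int = 1    # first-order difference of the IPv4 ID field
    delta_rtp_ts: int = 0   # first-order difference of the RTP timestamp
    last_seq: int = 0       # last 4-bit sequence number seen
    generation: int = 0     # generation number (zero if unused with IPv4)
    valid: bool = False     # False until a FULL_HEADER establishes state

    def on_full_header(self, headers, seq, generation=0):
        """Store uncompressed headers and reset the first-order differences."""
        self.headers = headers
        self.delta_ip_id = 1
        self.delta_rtp_ts = 0
        self.last_seq = seq & 0xF
        self.generation = generation
        self.valid = True

    def check_seq(self, seq):
        """Return True if seq is the expected next value; otherwise mark
        this context (and only this context) as out of sync."""
        expected = (self.last_seq + 1) & 0xF   # assumed +1 modulo 16
        if not self.valid or (seq & 0xF) != expected:
            self.valid = False
            return False
        self.last_seq = expected
        return True
```

Because each context carries its own last_seq, a gap detected by check_seq invalidates that one context and leaves the others usable.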

In order to communicate packets in the various uncompressed and compressed forms, this protocol depends upon the link layer being able to provide an indication of four new packet formats in addition to the normal IPv4 and IPv6 packet formats:

   FULL_HEADER - communicates the uncompressed IP header plus any following headers and data to establish the uncompressed header state in the decompressor for a particular context. The FULL_HEADER packet also carries the 8- or 16-bit session context identifier and the 4-bit sequence number to establish synchronization between the compressor and decompressor. The format is shown in section 3.3.1.

   COMPRESSED_UDP - communicates the IP and UDP headers compressed to 6 or fewer bytes (often 2 if UDP checksums are disabled), followed by any subsequent headers (possibly RTP) in uncompressed form, plus data. This packet type is used when there are differences in the usually constant fields of the (potential) RTP header. The RTP header includes a potentially changed value of the SSRC field, so this packet may redefine the session context. The format is shown in section 3.3.3.

   COMPRESSED_RTP - indicates that the RTP header is compressed along with the IP and UDP headers. The size of this header may still be just two bytes, or more if differences must be communicated. This packet type is used when the second-order difference (at least in the usually constant fields) is zero. It includes delta encodings for those fields that have changed by other than the expected amount to establish the first-order differences after an uncompressed RTP header is sent and whenever they change. The format is shown in section 3.3.2.

   CONTEXT_STATE - indicates a special packet sent from the decompressor to the compressor to communicate a list of context IDs for which synchronization has or may have been lost.
   This packet is only sent across the point-to-point link so it requires no IP header. The format is shown in section 3.3.5.

When this compression scheme is used with IPv6 as part of the general header compression framework specified in [3], another packet type may be used:

   COMPRESSED_NON_TCP - communicates the compressed IP and UDP headers as defined in [3] without differential encoding. If it were used for IPv4, it would require one or two bytes more than the COMPRESSED_UDP form listed above in order to carry the IPv4 ID field. For IPv6, there is no ID field and this non-differential compression is more resilient to packet loss.

Assignments of numeric codes for these packet formats in the Point-to-Point Protocol [4] are to be made by the Internet Assigned Numbers Authority.

3.3.1. FULL_HEADER (uncompressed) packet format

The definition of the FULL_HEADER packet given here is intended to be consistent with the definition given in [3]. Full details on design choices are given there.

The format of the FULL_HEADER packet is the same as that of the original packet. In the IPv4 case, this is usually an IP header, followed by a UDP header and UDP payload that may be an RTP header and its payload. However, the FULL_HEADER packet may also carry IP encapsulated packets, in which case there would be two IP headers followed by UDP and possibly RTP. Or in the case of IPv6, the packet may be built of some combination of IPv6 and IPv4 headers. Each successive header is indicated by the type field of the previous header, as usual.

The FULL_HEADER packet differs from the corresponding normal IPv4 or IPv6 packet in that it must also carry the compression context ID and the 4-bit sequence number.
In order to avoid expanding the size of the header, these values are inserted into length fields in the IP and UDP headers since the actual length may be inferred from the length provided by the link layer. Two 16-bit length fields are needed; these are taken from the first two available headers in the packet. That is, for an IPv4/UDP packet, the first length field is the total length field of the IPv4 header, and the second is the length field of the UDP header. For an IPv4 encapsulated packet, the first length field would come from the total length field of the first IP header, and the second length field would come from the total length field of the second IP header.

As specified in Section 5.3.2 of [3], the position of the context ID (CID) and 4-bit sequence number varies depending upon whether 8- or 16-bit context IDs have been selected, as shown in the following diagram (16 bits wide, with the most-significant bit to the left):

For 8-bit context ID:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|1| Generation|      CID      |  First length field
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |            0          |  seq  |  Second length field
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

For 16-bit context ID:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|1| Generation|   0   |  seq  |  First length field
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              CID              |  Second length field
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The first bit in the first length field indicates the length of the CID. The length of the CID must either be constant for all contexts or two additional distinct packet types must be provided to separately indicate COMPRESSED_UDP and COMPRESSED_RTP packet formats with 8- and 16-bit CIDs.
The second bit in the first length field is 1 to indicate that the 4-bit sequence number is present, as is always the case for this IP/UDP/RTP compression scheme.

The generation field is used with IPv6 for COMPRESSED_NON_TCP packets as described in [3]. For IPv4-only implementations that do not use COMPRESSED_NON_TCP packets, the compressor may set the generation value to zero. For consistent operation between IPv4 and IPv6, the generation value is stored in the context when it is received by the decompressor, and the most recent value is returned in the CONTEXT_STATE packet.

When a FULL_HEADER packet is received, the complete set of headers is stored into the context selected by the context ID. The 4-bit sequence number is also stored in the context, thereby resynchronizing the decompressor to the compressor.

When COMPRESSED_NON_TCP packets are used, the 4-bit sequence number is inserted into the "Data Field" of that packet and the D bit is set as described in Section 6 of [3]. When a COMPRESSED_NON_TCP packet is received, the generation number must be compared to the value stored in the context. If they are not the same, the context is not up to date and must be refreshed by a FULL_HEADER packet. If the generation does match, then the compressed IP and UDP header information, the 4-bit sequence number, and the (potential) RTP header are all stored into the saved context.

The amount of memory required to store the context will vary depending upon how many encapsulating headers are included in the FULL_HEADER packet. The compressor and decompressor may negotiate a maximum header size.

3.3.2. COMPRESSED_RTP packet format

When the second-order difference of the RTP header from packet to packet is zero, the decompressor can reconstruct a packet simply by adding the stored first-order differences to the stored uncompressed header representing the previous packet. All that need be communicated is the session context identifier and a small sequence number to maintain synchronization and detect packet loss between the compressor and decompressor.

If the second-order difference of the RTP header is not zero for some fields, the new first-order difference for just those fields is communicated using a compact encoding. The new first-order difference values are added to the corresponding fields in the uncompressed header in the decompressor's session context, and are also stored explicitly in the context to be added to the corresponding fields again on each subsequent packet in which the second-order difference is zero. Each time the first-order difference changes, it is transmitted and stored in the context.

In practice, the only fields for which it is useful to store the first-order difference are the IPv4 ID field and the RTP timestamp. For the RTP sequence number field, the usual increment is 1. If the sequence number changes by other than 1, the difference must be communicated but does not set the expected difference for the next packet. Instead, the expected first-order difference remains fixed at 1 so that the difference need not be explicitly communicated on the next packet assuming it is in order.

For the RTP timestamp, when a FULL_HEADER, COMPRESSED_NON_TCP or COMPRESSED_UDP packet is sent to refresh the RTP state, the stored first-order difference is initialized to zero. If the timestamp is the same on the next packet (e.g., same video frame), then the second-order difference is zero.
Otherwise, the difference between the timestamps of the two packets is transmitted as the new first-order difference to be added to the timestamp in the uncompressed header stored in the decompressor's context and also stored as the first-order difference in that context. Each time the first-order difference changes on subsequent packets, that difference is again transmitted and used to update the context.

Similarly, since the IPv4 ID field frequently increments by one, the first-order difference for that field is initialized to one when the state is refreshed by a FULL_HEADER packet, or when a COMPRESSED_NON_TCP packet is sent since it carries the ID field in uncompressed form. Thereafter, whenever the first-order difference changes, it is transmitted and stored in the context.

A bit mask will be used to indicate which fields have changed by other than the expected difference. In addition to the small link sequence number, the list of items to be conditionally communicated in the compressed IP/UDP/RTP header is as follows:

   I = IPv4 packet ID (always 0 if no IPv4 header)
   U = UDP checksum
   M = RTP marker bit
   S = RTP sequence number
   T = RTP timestamp
   L = RTP CSRC count and list

If 4 bits are needed for the link sequence number to get a reasonable probability of loss detection, there are too few bits remaining to assign one bit to each of these items and still fit them all into a single byte to go along with the context ID.

It is not necessary to explicitly indicate the presence of the UDP checksum because a source will typically include checksums on all packets of a session or none of them.
When the session state is initialized with an uncompressed header, if
there is a nonzero checksum present, an unencoded 16-bit checksum will
be inserted into the compressed header in all subsequent packets until
this setting is changed by sending another uncompressed packet.

Of the remaining items, the CSRC list may be the one least frequently
used. Rather than dedicating a bit to indicate CSRC change, an unusual
combination of the other bits may be used instead. This bit combination
is denoted MSTI. If all four of the bits for the IP packet ID, RTP
marker bit, RTP sequence number and RTP timestamp are set, this is a
special case indicating that an extended form of the compressed RTP
header will follow. That header will include an additional byte
containing the real values of the four bits plus the CC count. The CSRC
list, of length indicated by the CC count, will be included just as it
appears in the uncompressed RTP header.

The other fields of the RTP header (version, P bit, X bit, payload type
and SSRC identifier) are assumed to remain relatively constant. In
particular, the SSRC identifier is defined to be constant for a given
context because it is one of the factors selecting the context. If any
of the other fields change, the uncompressed RTP header must be sent as
described in Section 3.3.3.

The following diagram shows the compressed IP/UDP/RTP header with dotted
lines indicating fields that are conditionally present. The most
significant bit is numbered 0. Variable-length fields are sent in
network byte order (most significant byte first).

     0   1   2   3   4   5   6   7
   +...............................+
   :   msb of session context ID   :  (if 16-bit CID)
   +-------------------------------+
   |   lsb of session context ID   |
   +---+---+---+---+---+---+---+---+
   | M | S | T | I |   sequence    |
   +---+---+---+---+---+---+---+---+
   :                               :
   +         UDP checksum          +  (if nonzero in context)
   :                               :
   +...............................+
   :                               :
   +        "RANDOM" fields        +  (if encapsulated)
   :                               :
   +...............................+
   : M'| S'| T'| I'|      CC       :  (if MSTI = 1111)
   +...............................+
   :         delta IPv4 ID         :  (if I or I' = 1)
   +...............................+
   :      delta RTP sequence       :  (if S or S' = 1)
   +...............................+
   :      delta RTP timestamp      :  (if T or T' = 1)
   +...............................+
   :                               :
   :           CSRC list           :  (if MSTI = 1111)
   :                               :
   :                               :
   +...............................+
   :                               :
   :      RTP header extension     :  (if X set in context)
   :                               :
   :                               :
   +-------------------------------+
   |                               |
   |           RTP data            |
   /                               /
   /                               /
   |                               |
   +-------------------------------+
   :            padding            :  (if P set in context)
   +...............................+

When more than one IPv4 header is present in the context as initialized
by the FULL_HEADER packet, then the IP ID fields of encapsulating
headers must be sent as absolute values as described in [3]. These
fields are identified as "RANDOM" fields. They are inserted into the
COMPRESSED_RTP packet in the same order as they appear in the original
headers, immediately following the UDP checksum if present or the MSTI
byte if not, as shown in the diagram. Only if an IPv4 header
immediately precedes the UDP header will the IP ID of that header be
sent differentially, i.e., potentially with no bits if the second-order
difference is zero, or as a delta IPv4 ID field if not.
If there is not an IPv4 header immediately preceding the UDP header,
then the I bit will be 0 and no delta IPv4 ID field will be present.

3.3.3. COMPRESSED_UDP packet format

If there is a change in any of the fields of the RTP header that are
normally constant (such as the payload type field), then an uncompressed
RTP header must be sent. If the IP and UDP headers do not also require
updating, this RTP header may be carried in a COMPRESSED_UDP packet
rather than a FULL_HEADER packet. The COMPRESSED_UDP packet has the
same format as the COMPRESSED_RTP packet except that the M, S and T bits
are always 0 and the corresponding delta fields are never included:

     0   1   2   3   4   5   6   7
   +...............................+
   :   msb of session context ID   :  (if 16-bit CID)
   +-------------------------------+
   |   lsb of session context ID   |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 | 0 | I |   sequence    |
   +---+---+---+---+---+---+---+---+
   :                               :
   +         UDP checksum          +  (if nonzero in context)
   :                               :
   +...............................+
   :                               :
   +        "RANDOM" fields        +  (if encapsulated)
   :                               :
   +...............................+
   :         delta IPv4 ID         :  (if I = 1)
   +-------------------------------+
   |           UDP data            |
   :   (uncompressed RTP header)   :

Note that this constitutes a form of IP/UDP header compression different
from the COMPRESSED_NON_TCP packet type defined in [3]. The motivation
is to allow reaching the target of two bytes when UDP checksums are
disabled, as IPv4 allows. The protocol in [3] does not use differential
coding for UDP packets, so in the IPv4 case, two bytes of IP ID, and two
bytes of UDP checksum if nonzero, would always be transmitted in
addition to two bytes of compression prefix. For IPv6, the
COMPRESSED_NON_TCP packet type may be used instead.

3.3.4. Encoding of differences

The delta fields in the COMPRESSED_RTP and COMPRESSED_UDP packets are
encoded with a variable-length mapping for compactness of the more
commonly-used values. A default encoding is specified below, but it is
recommended that implementations use a table-driven delta encoder and
decoder to allow negotiation of a table specific for each session if
appropriate, possibly even an optimal Huffman encoding. Encodings based
on sequential interpretation of the bit stream, of which this default
table and Huffman encoding are examples, allow a reasonable table size
and may result in an execution speed faster than a non-table-driven
implementation with explicit tests for ranges of values.

The default delta encoding is specified in the following table. This
encoding was designed to efficiently encode the small changes that may
occur in the IP ID and in RTP sequence number when packets are lost
upstream from the compressor, yet still handling most audio and video
deltas in two bytes. The column on the left is the decimal value to be
encoded, and the column on the right is the resulting sequence of bytes
shown in hexadecimal and in the order in which they are transmitted
(network byte order). The first and last values in each contiguous
range are shown, with ellipses in between:

   Decimal  Hex

    -16384  C0 00 00
         :  :
      -129  C0 3F 7F
      -128  80 00
         :  :
        -1  80 7F
         0  00
         :  :
       127  7F
       128  80 80
         :  :
     16383  BF FF
     16384  C0 40 00
         :  :
   4194303  FF FF FF

For positive values, a change of zero through 127 is represented
directly in one byte. If the most significant two bits of the byte are
10 or 11, this signals an extension to a two- or three-byte value,
respectively. The least significant six bits of the first byte are
combined, in decreasing order of significance, with the next one or two
bytes to form a 14- or 22-bit value.
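
As an illustration only, the default table above can be sketched in
Python as follows. The function names encode_delta and decode_delta are
hypothetical and are not part of this specification. The sketch uses
explicit range tests rather than the table-driven approach recommended
above, and it assumes that the three-byte code points between C0 3F 80
and C0 3F FF, which the table does not list, are to be rejected.

```python
from typing import Tuple


def encode_delta(value: int) -> bytes:
    """Encode one delta using the default table (illustrative sketch)."""
    if 0 <= value <= 127:
        # One byte: 00 .. 7F
        return bytes([value])
    if -128 <= value <= -1:
        # Two bytes, reusing the otherwise redundant codes 80 00 .. 80 7F
        return bytes([0x80, value + 128])
    if 128 <= value <= 16383:
        # Two bytes: prefix bits 10, then a 14-bit value
        return bytes([0x80 | (value >> 8), value & 0xFF])
    if -16384 <= value <= -129:
        # Three bytes, reusing the redundant codes C0 00 00 .. C0 3F 7F
        code = value + 16384
    elif 16384 <= value <= 4194303:
        # Three bytes: prefix bits 11, then a 22-bit value
        code = value
    else:
        raise ValueError("delta outside the range of the default table")
    return bytes([0xC0 | (code >> 16), (code >> 8) & 0xFF, code & 0xFF])


def decode_delta(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one delta at buf[pos]; return (value, bytes consumed)."""
    first = buf[pos]
    if first < 0x80:
        return first, 1
    if first < 0xC0:
        code = ((first & 0x3F) << 8) | buf[pos + 1]
        # 14-bit codes below 128 carry the negative values -128 .. -1
        return (code if code >= 128 else code - 128), 2
    code = ((first & 0x3F) << 16) | (buf[pos + 1] << 8) | buf[pos + 2]
    if code >= 16384:
        return code, 3
    if code <= 16255:
        # 22-bit codes 0 .. 16255 carry the negative values -16384 .. -129
        return code - 16384, 3
    # Assumption: code points not listed in the table are treated as errors
    raise ValueError("code point not assigned in the default table")
```

For example, encode_delta(-1) yields the bytes 80 7F and
encode_delta(16384) yields C0 40 00, matching the corresponding rows of
the table. A compressor would fall back to an uncompressed header when
encode_delta raises ValueError.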

Negative deltas may occur when packets are misordered or in the
intentionally out-of-order RTP timestamps on MPEG video. These events
are less likely, so a smaller range of negative values is encoded using
otherwise redundant portions of the positive part of the table.

A change in the RTP timestamp value less than -16384 or greater than
4194303 forces the RTP header to be sent uncompressed using a
FULL_HEADER, COMPRESSED_NON_TCP or COMPRESSED_UDP packet type. The IP
ID and RTP sequence number fields are only 16 bits, so negative deltas
for those fields should be masked to 16 bits and then encoded (as large
positive 16-bit numbers).

3.3.5. Error Recovery

Whenever the 4-bit sequence number for a particular context increments
by other than 1, except when set by a FULL_HEADER or COMPRESSED_NON_TCP
packet, the decompressor must invalidate that context and send a
CONTEXT_STATE packet back to the compressor indicating that the context
has been invalidated. All packets for the invalid context must be
discarded until a FULL_HEADER or COMPRESSED_NON_TCP packet is received
for that context to re-establish consistent state. Since multiple
compressed packets may arrive in the interim, the decompressor should
not retransmit the CONTEXT_STATE packet for every compressed packet
received, but instead should limit the rate of retransmission to avoid
flooding the reverse channel.

When an error occurs on the link, the link layer will usually discard
the packet that was damaged (if any), but may provide an indication of
the error. Some time may elapse before another packet is delivered for
the same context, and then that packet would have to be discarded by the
decompressor when it is observed to be out of sequence, resulting in at
least two packets lost.
To allow faster recovery if the link does provide an explicit error
indication, the decompressor may optionally send an advisory
CONTEXT_STATE packet listing the last valid sequence number and
generation number for one or more recently active contexts. For a given
context, if the compressor has sent no compressed packet with a higher
sequence number, and if the generation number matches the current
generation, no corrective action is required. Otherwise, the compressor
may choose to mark the context invalid so that the next packet is sent
in FULL_HEADER or COMPRESSED_NON_TCP mode (FULL_HEADER is required if
the generation doesn't match). However, note that if the link
round-trip time is large compared to the inter-packet spacing, there may
be several packets in flight across the link, increasing the probability
that the sequence number will already have advanced when the
CONTEXT_STATE packet is received by the compressor. The result could be
that some contexts are invalidated unnecessarily, causing extra
bandwidth to be consumed.

The format of the CONTEXT_STATE packet is shown in the following
diagrams. The first byte is a type code to allow the CONTEXT_STATE
packet type to be shared by multiple compression schemes within the
general compression framework specified in [3]. The contents of the
remainder of the packet depend upon the compression scheme. For the
IP/UDP/RTP compression scheme specified here, the remainder of the
CONTEXT_STATE packet is structured as a list of blocks to allow the
state for multiple contexts to be indicated, preceded by a one-byte
count of the number of blocks.

Two type code values are used for the IP/UDP/RTP compression scheme.
The value 1 indicates that 8-bit session context IDs are being used:

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 1 = IP/UDP/RTP with 8-bit CID |
   +---+---+---+---+---+---+---+---+
   |         context count         |
   +---+---+---+---+---+---+---+---+
   +---+---+---+---+---+---+---+---+
   |       session context ID      |
   +---+---+---+---+---+---+---+---+
   | I | 0 | 0 | 0 |   sequence    |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |       generation      |
   +---+---+---+---+---+---+---+---+
                  ...
   +---+---+---+---+---+---+---+---+
   |       session context ID      |
   +---+---+---+---+---+---+---+---+
   | I | 0 | 0 | 0 |   sequence    |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |       generation      |
   +---+---+---+---+---+---+---+---+

The value 2 indicates that 16-bit session context IDs are being used.
The session context ID is sent in network byte order (most significant
byte first):

     0   1   2   3   4   5   6   7
   +---+---+---+---+---+---+---+---+
   | 2 = IP/UDP/RTP with 16-bit CID|
   +---+---+---+---+---+---+---+---+
   |         context count         |
   +---+---+---+---+---+---+---+---+
   +---+---+---+---+---+---+---+---+
   |                               |
   +       session context ID      +
   |                               |
   +---+---+---+---+---+---+---+---+
   | I | 0 | 0 | 0 |   sequence    |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |       generation      |
   +---+---+---+---+---+---+---+---+
                  ...
   +---+---+---+---+---+---+---+---+
   |                               |
   +       session context ID      +
   |                               |
   +---+---+---+---+---+---+---+---+
   | I | 0 | 0 | 0 |   sequence    |
   +---+---+---+---+---+---+---+---+
   | 0 | 0 |       generation      |
   +---+---+---+---+---+---+---+---+

The bit labeled "I" is set to one for contexts that have been marked
invalid and require a FULL_HEADER or COMPRESSED_NON_TCP packet to be
transmitted. If the I bit is zero, the context state is advisory.

Since the CONTEXT_STATE packet itself may be lost, retransmission of one
or more blocks is allowed.
It is expected that retransmission will be triggered only by receipt of
another packet, but if the line is near idle, retransmission might be
triggered by a relatively long timer (on the order of 1 second).

If a CONTEXT_STATE block for a given context is retransmitted, it may
cross paths with the FULL_HEADER or COMPRESSED_NON_TCP packet intended
to refresh that context. In that case, the compressor may choose to
ignore the error indication.

In the case where UDP checksums are being transmitted, the decompressor
could attempt to use the "twice" algorithm described in section 10.1 of
[3]. In this algorithm, the delta is applied more than once on the
assumption that the delta may have been the same on the missing
packet(s) and the one subsequently received. For the scheme defined
here, the difference in the 4-bit sequence number tells the number of
times the delta must be applied. Note, however, that there is a
nontrivial risk of an incorrect positive indication. It may be
advisable to request a FULL_HEADER or COMPRESSED_NON_TCP packet even if
the "twice" algorithm succeeds.

Some errors may not be detected, for example if 16 packets are lost in a
row and the link level does not provide an error indication. In that
case, the decompressor will generate packets that are not valid. If UDP
checksums are being transmitted, the receiver will probably detect the
invalid packets and discard them, but the receiver does not have any
means to signal the decompressor. Therefore, it is recommended that the
decompressor verify the UDP checksum periodically, perhaps one out of 16
packets. If an error is detected, the decompressor would invalidate the
context and signal the compressor with a CONTEXT_STATE packet.

3.4. Compression of RTCP Control Packets

By relying on the RTP convention that data is carried on an even port
number and the corresponding RTCP packets are carried on the next higher
(odd) port number, one could tailor separate compression schemes to be
applied to RTP and RTCP packets. For RTCP, the compression could apply
not only to the header but also the "data", that is, the contents of the
different packet types. The numbers in Sender Report (SR) and Receiver
Report (RR) RTCP packets would not compress well, but the text
information in the Source Description (SDES) packets could be compressed
down to a bit mask indicating each item that was present but compressed
out (for timing purposes on the SDES NOTE item and to allow the end
system to measure the average RTCP packet size for the interval
calculation).

However, in the compression scheme defined here, no compression will be
done on the RTCP headers and "data" for several reasons (though
compression should still be applied to the IP and UDP headers). Since
the RTP protocol specification suggests that the RTCP packet interval be
scaled so that the aggregate RTCP bandwidth used by all participants in
a session will be no more than 5% of the session bandwidth, there is not
much to be gained from RTCP compression. Compressing out the SDES items
would require a significant increase in the shared state that must be
stored for each context ID. And, in order to allow compression when
SDES information for several sources was sent through an RTP "mixer", it
would be necessary to maintain a separate RTCP session context for each
SSRC identifier. In a session with more than 255 participants, this
would cause perfect thrashing of the context cache even when only one
participant was sending data.

Even though RTCP is not compressed, the fraction of the total bandwidth
occupied by RTCP packets on the compressed link remains no more than 5%
in most cases, assuming that the RTCP packets are sent as COMPRESSED_UDP
packets. Given that the uncompressed RTCP traffic consumes no more than
5% of the total session bandwidth, then for a typical RTCP packet length
of 90 bytes, the portion of the compressed bandwidth used by RTCP will
be no more than 5% if the size of the payload in RTP data packets is at
least 108 bytes. If the size of the RTP data payload is smaller, the
fraction will increase, but is still less than 7% for a payload size of
37 bytes. For large data payloads, the compressed RTCP fraction is less
than the uncompressed RTCP fraction (for example, 4% at 1000 bytes).

3.5. Compression of non-RTP UDP Packets

As described earlier, the COMPRESSED_UDP packet may be used to compress
UDP packets that don't carry RTP. Whatever data follows the UDP header
is unlikely to have constant values in the bits that correspond to
usually constant fields in the RTP header. In particular, the SSRC
field would likely change. Therefore, it is necessary to keep track of
the non-RTP UDP packet streams to avoid using up all the context slots
as the "SSRC field" changes (since that field is part of what identifies
a particular RTP context). Those streams may each be given a context,
but the encoder would set a flag in the context to indicate that the
changing SSRC field should be ignored and COMPRESSED_UDP packets should
always be sent instead of COMPRESSED_RTP packets.

4. Interaction With Segmentation

A segmentation scheme may be used in conjunction with RTP header
compression to allow small, real-time packets to interrupt large,
presumably non-real-time packets in order to reduce delay.
It is assumed that the large packets bypass the compressor and
decompressor since the interleaving would modify the sequencing of
packets at the decompressor and cause the appearance of errors. Header
compression should be less important for large packets since the
overhead ratio is smaller.

If some packets from an RTP session context are selected for
segmentation (perhaps based on size) and some are not, there is a
possibility of re-ordering. This would reduce the compression
efficiency because the large packets would appear as lost packets in the
sequence space. However, this should not cause more serious problems
because the RTP sequence numbers should be reconstructed correctly and
will allow the application to correct the ordering.

Link errors detected by the segmentation scheme using its own sequencing
information may be indicated to the compressor with an advisory
CONTEXT_STATE message just as for link errors detected by the link layer
itself.

The context ID byte is placed first in the COMPRESSED_RTP header so that
this byte may be shared with the segmentation layer if such sharing is
feasible and has been negotiated. Since the context ID may have any
value, it can be set to match context information from the segmentation
layer.

5. Negotiating Compression

The use of IP/UDP/RTP compression over a particular link is a function
of the link-layer protocol. It is expected that such negotiation will
be defined separately for PPP [4], for example. The following items may
be negotiated:

   o The size of the context ID.
   o The maximum size of the stack of headers in the context.
   o A context-specific table for decoding of delta values.

6. Acknowledgments

Several people have contributed to the design of this compression scheme
and related problems.
Scott Petrack initiated discussion of RTP header compression in the AVT
working group at Los Angeles in March, 1996. Carsten Bormann has
developed an overall architecture for compression in combination with
traffic control across a low-speed link, and made several specific
contributions to the scheme described here. David Oran independently
developed a note based on similar ideas, and suggested the use of PPP
Multilink protocol for segmentation. Mikael Degermark has contributed
advice on integration of this compression scheme with the IPv6
compression framework.

7. References

[1] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: A
    Transport Protocol for Real-Time Applications", RFC 1889.

[2] V. Jacobson, "Compressing TCP/IP Headers for Low-Speed Serial
    Links", RFC 1144.

[3] M. Degermark, B. Nordgren, and S. Pink, "Header Compression for
    IPv6", work in progress.

[4] W. Simpson, "The Point-to-Point Protocol (PPP)", RFC 1548.

8. Security Considerations

Because encryption eliminates the redundancy that this compression
scheme tries to exploit, there is some inducement to forgo encryption in
order to achieve operation over a low-bandwidth link. However, for
those cases where encryption of data and not headers is satisfactory,
RTP does specify an alternative encryption method in which only the RTP
payload is encrypted and the headers are left in the clear. That would
allow compression to still be applied.

A malfunctioning or malicious compressor could cause the decompressor to
reconstitute packets that do not match the original packets but still
have valid IP, UDP and RTP headers and possibly even valid UDP
checksums. Such corruption may be detected with end-to-end
authentication and integrity mechanisms which will not be affected by
the compression. Constant portions of authentication headers will be
compressed as described in [3].

No authentication is performed on the CONTEXT_STATE control packet sent
by this protocol. An attacker with access to the link between the
decompressor and compressor could inject false CONTEXT_STATE packets and
cause compression efficiency to be reduced, probably resulting in
congestion on the link. However, an attacker with access to the link
could also disrupt the traffic in many other ways.

A potential denial-of-service threat exists when using compression
techniques that have non-uniform receiver-end computational load. The
attacker can inject pathological datagrams into the stream which are
complex to decompress and cause the receiver to be overloaded, degrading
processing of other streams. However, this compression does not exhibit
any significant non-uniformity.

A security review of this protocol found no additional security
considerations.

9. Authors' Addresses

   Stephen L. Casner
   Precept Software, Inc.
   1072 Arastradero Road
   Palo Alto, CA 94304
   United States
   EMail: casner@precept.com

   Van Jacobson
   MS 46a-1121
   Lawrence Berkeley National Laboratory
   Berkeley, CA 94720
   United States
   EMail: van@ee.lbl.gov