Internet Engineering Task Force     Audio/Video Transport Working Group
INTERNET-DRAFT                             S. Casner / Precept Software
draft-ietf-avt-crtp-02.txt                           V. Jacobson / LBNL
                                                         March 26, 1997
                                                          Expires: 9/97


       Compressing IP/UDP/RTP Headers for Low-Speed Serial Links


Status of this Memo

This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, and
its working groups. Note that other groups may also distribute working
documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference material
or to cite them other than as "work in progress."

To learn the current status of any Internet-Draft, please check the
"1id-abstracts.txt" listing contained in the Internet-Drafts Shadow
Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe),
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or
ftp.isi.edu (US West Coast).

Distribution of this document is unlimited.

Abstract

This document describes a method for compressing the headers of
IP/UDP/RTP datagrams to reduce overhead on low-speed serial links.
In many cases, all three headers can be compressed to 2-4 bytes.

Comments are solicited and should be addressed to the working group
mailing list rem-conf@es.net and/or the author(s).

1. Introduction

Since the Real-time Transport Protocol was published as an RFC [1],
there has been growing interest in using RTP as one step to achieve
interoperability among different implementations of network audio/video
applications.
However, there is also concern that the 12-byte RTP header is too large
an overhead for 20-byte payloads when operating over low-speed lines
such as dial-up modems at 14.4 or 28.8 kb/s. (Existing applications
operating in this environment may use an application-specific protocol
with a header of a few bytes that has reduced functionality relative to
RTP.)

Header size may be reduced through compression techniques as has been
done with great success for TCP [2]. In this case, compression might be
applied to the RTP header alone, on an end-to-end basis, or to the
combination of IP, UDP and RTP headers on a link-by-link basis.
Compressing the 40 bytes of combined headers together provides
substantially more gain than compressing 12 bytes of RTP header alone
because the resulting size is approximately the same (2-4 bytes) in
either case. Compressing on a link-by-link basis also provides better
performance because the delay and loss rate are lower. Therefore, the
method defined here is for combined compression of IP, UDP and RTP
headers on a link-by-link basis.

This document defines a compression scheme that may be used with IPv4,
IPv6 or packets encapsulated with more than one IP header, though the
initial focus is on IPv4. The IP/UDP/RTP compression defined here is
intended to fit within the more general compression framework [3]
specified by Mikael Degermark, et al., for both IPv6 and IPv4. That
framework defines TCP and non-TCP as two classes of transport above IP.
This specification creates IP/UDP/RTP as a third class extracted from
the non-TCP class.

2. Assumptions and Tradeoffs

The goal of this compression scheme is to reduce the IP/UDP/RTP headers
to two bytes for most packets in the case where no UDP checksums are
being sent, or four bytes with checksums.
It is motivated primarily by the specific problem of sending audio and
video over 14.4 and 28.8 kb/s dial-up modems. These links tend to
provide full-duplex communication, so the protocol takes advantage of
that fact, though this constraint could be removed.

This specification does not address segmentation and preemption of large
packets to reduce the delay across the slow link experienced by small
real-time packets, except to identify in Section 4 some interactions
between segmentation and compression that may occur. Segmentation
schemes may be defined separately and used in conjunction with the
compression defined here.

It should be noted that implementation simplicity is an important factor
to consider in evaluating a compression scheme. Communications servers
may need to support compression over perhaps as many as 100 dial-up
modem lines using a single processor. Therefore, it may be appropriate
to make some simplifications in the design at the expense of generality,
or to produce a flexible design that is general but can be subsetted for
simplicity. The next sections discuss some of the tradeoffs listed here.

2.1. Simplex vs. Full Duplex

In the absence of other constraints, a compression scheme that worked
over simplex links would be preferred over one that did not. However,
operation over a simplex link requires periodic refreshes with an
uncompressed packet header to restore compression state in case of
error. If an explicit error signal can be returned instead, the delay
to recovery may be shortened substantially. The overhead in the
no-error case is also reduced. Some UDP applications may require only
simplex communication, but RTP applications will frequently require full
duplex communication.
The application may be 2-way, as in a telephone conversation, but even
if data flows in only one direction there is a need for a return path to
carry reception feedback in RTCP packets.

This specification includes an error indication on the reverse path;
however, it would be possible to use a periodic refresh instead.
Whenever the decompressor detected an error in a particular packet
stream, it would simply discard all packets in that stream until an
uncompressed header was received for that stream, and then resume
decompression. The penalty would be the potentially large number of
packets discarded.

2.2. Segmentation and Layering

Delay induced by the time required to send a large packet over the slow
link is not a problem for one-way audio, for example, because the
receiver can adapt to the variance in delay. However, for interactive
conversations, minimizing the end-to-end delay is critical.
Segmentation of large, non-real-time packets to allow small real-time
packets to be transmitted between segments can reduce the delay.

This specification deals only with compression and assumes segmentation,
if included, will be handled as a separate layer. It would be
inappropriate to integrate segmentation and compression in such a way
that the compression could not be used by itself in situations where
segmentation was deemed unnecessary or impractical. Similarly, one
would like to avoid any requirements for a reservation protocol. The
compression scheme can be applied locally on the two ends of a link
independent of any other mechanisms except for the requirements that the
link layer provide some packet type codes, a packet length indication,
and good error detection.

Conversely, separately compressing the IP/UDP and RTP layers loses too
much of the compression gain that is possible by treating them together.
Crossing these protocol layer boundaries is appropriate because the same
function is being applied across all layers.

3. The Compression Algorithm

The compression algorithm defined in this document draws heavily upon
the design of TCP/IP header compression as described in RFC 1144 [2].
Readers are referred to that RFC for more information on the underlying
motivations and general principles of header compression.

3.1. The basic idea

In TCP header compression, the first factor of two comes from the
observation that half of the bytes in the IP and TCP headers remain
constant over the life of the connection. After sending the
uncompressed header once, these fields may be elided from the compressed
headers that follow. The remaining compression comes from differential
coding on the changing fields to reduce their size, and from eliminating
the changing fields entirely for common cases by calculating the changes
from the length of the packet. This length is indicated by the
link-level protocol.

For RTP header compression, some of the same techniques may be applied.
However, the big gain comes from the observation that although several
fields change in every packet, the difference from packet to packet is
often constant and therefore the second-order difference is zero. By
maintaining both the uncompressed header and the first-order differences
in the session state shared between the compressor and decompressor, all
that must be communicated is an indication that the second-order
difference was zero. In that case, the decompressor can reconstruct the
original header without any loss of information simply by adding the
first-order differences to the saved uncompressed header.
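
The second-order-difference idea can be illustrated for a single header
field with the following Python sketch. It is not part of the protocol
definition: the names FieldState, compress_field and decompress_field
are hypothetical, the field is treated as an unsigned 32-bit counter,
and the sketch ignores packet loss, context IDs and the special handling
that individual fields receive later in this document.

```python
from dataclasses import dataclass


@dataclass
class FieldState:
    """Shared per-field state (hypothetical): last value, last first-order difference."""
    value: int
    delta: int = 0
    bits: int = 32  # assumed field width; arithmetic is modulo 2**bits


def compress_field(state, new_value):
    """Return None if the second-order difference is zero, else the new delta to send."""
    mask = (1 << state.bits) - 1
    new_delta = (new_value - state.value) & mask
    changed = new_delta != state.delta
    state.delta = new_delta
    state.value = new_value & mask
    return new_delta if changed else None


def decompress_field(state, sent_delta):
    """Rebuild the field by adding the stored (or newly received) first-order difference."""
    mask = (1 << state.bits) - 1
    if sent_delta is not None:
        state.delta = sent_delta
    state.value = (state.value + state.delta) & mask
    return state.value


# Audio-like timestamps: a constant step of 160, then one larger jump.
tx, rx = FieldState(value=1000), FieldState(value=1000)
timestamps = [1160, 1320, 1480, 1800, 2120]
sent = [compress_field(tx, t) for t in timestamps]
rebuilt = [decompress_field(rx, d) for d in sent]
```

In this run a delta is transmitted only when the step changes; for the
other packets the decompressor rebuilds the timestamp from its stored
state alone.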

Just as TCP/IP header compression maintains shared state for multiple
simultaneous TCP connections, this IP/UDP/RTP compression must maintain
state for multiple session contexts. A session context is defined by
the combination of the IP source and destination addresses, the UDP
source and destination ports, and the RTP SSRC field. The compressed
packet carries a small integer to indicate in which session context that
packet should be interpreted.

Because the RTP compression is lossless, it may be applied to any UDP
traffic that benefits from it. Most likely, the only packets that will
benefit are RTP packets, but it is acceptable to use heuristics to
determine whether or not the packet is an RTP packet because no harm is
done if the heuristic gives the wrong answer. This does require
executing the compression algorithm for all UDP packets. Most
implementations will need to maintain a negative cache of packet streams
(identified by IP address and UDP port pairs but not the RTP SSRC field)
that have failed to compress as RTP packets for some number of attempts.
Failing to compress means that the fields in the potential RTP header
that are expected to remain constant most of the time, such as the
payload type field, keep changing. Even if the other fields remain
constant, a packet stream with a constantly changing SSRC field must be
entered in the negative cache to avoid consuming all of the available
session contexts. When RTP compression fails, the IP and UDP headers
may still be compressed.

3.2. Header Compression for RTP Data Packets

In the IPv4 header, only the total length, packet ID, and header
checksum fields will normally change.
The total length is redundant with the length provided by the link
layer, and since this compression scheme must depend upon the link layer
to provide good error detection (e.g., PPP's CRC), the header checksum
may also be elided. This leaves only the packet ID, which, assuming no
IP fragmentation, would not need to be communicated. However, in order
to maintain lossless compression, changes in the packet ID will be
transmitted. The packet ID usually increments by one or a small number
for each packet. In the IPv6 base header, there is no packet ID nor
header checksum and only the payload length field changes.

In the UDP header, the length field is redundant with the IP total
length field and the length indicated by the link layer. The UDP
checksum field will be a constant zero if the source elects not to
generate UDP checksums. Otherwise, the checksum must be communicated
intact in order to preserve the lossless compression. Maintaining
end-to-end error detection for applications that require it is an
important principle.

In the RTP header, the SSRC identifier is constant in a given context
since that is part of what identifies the particular context. For most
packets, only the sequence number and the timestamp will change from
packet to packet. If packets are not lost or misordered, the sequence
number will increment by one for each packet. For audio packets of
constant duration, the timestamp will increment by the number of sample
periods conveyed in each packet. For video, the timestamp will change
on the first packet of each frame, but then stay constant for any
additional packets in the frame. If each video frame occupies only one
packet, but the video frames are generated at a constant rate, then
again the change in the timestamp from frame to frame is constant.
Note that in each of these cases the second-order difference of the
sequence number and timestamp fields is zero, so the next packet header
can be constructed from the previous packet header by adding the
first-order differences for these fields that are stored in the session
context along with the previous uncompressed header. When the
second-order difference is not zero, the magnitude of the change is
usually much smaller than the full number of bits in the field, so the
size can be reduced by encoding the new first-order difference and
transmitting it rather than the absolute value.

The M bit will be set on the first packet of a talkspurt and the last
packet of a video frame. If it were treated as a constant field such
that each change required sending the full RTP header, this would reduce
the compression significantly. Therefore, one bit in the compressed
header will carry the M bit explicitly.

If the packets are flowing through an RTP mixer, most commonly for
audio, then the CSRC list and CC count will also change. However, the
CSRC list will typically remain constant during a talkspurt or longer,
so it need be sent only when it changes.

3.3. The protocol

The compression protocol must maintain a collection of shared
information in a consistent state between the compressor and
decompressor. There is a separate session context for each IP/UDP/RTP
packet stream, as defined by a particular combination of the IP source
and destination addresses, UDP source and destination ports, and the RTP
SSRC field. The number of session contexts to be maintained may be
negotiated between the compressor and decompressor. Each context is
identified by an 8- or 16-bit identifier, depending upon the number of
contexts negotiated, so the maximum number is 65536.
Both uncompressed and compressed packets must carry the context ID and a
4-bit sequence number used to detect packet loss between the compressor
and decompressor. Each context has its own separate sequence number
space so that a single packet loss need only invalidate one context.

The shared information in each context consists of the following items:

o The full IP, UDP and RTP headers for the last packet sent by the
  compressor or reconstructed by the decompressor.
o The first difference for the IPv4 ID field, initialized to 1.
o The first difference for the RTP timestamp field, initialized to 0.
o The last value of the 4-bit sequence number used to detect packet
  loss between the compressor and decompressor.
o The current generation number for non-differential coding of UDP
  packets with IPv6 (see [3]). For IPv4, the generation number may be
  set to zero.

In order to communicate packets in the various uncompressed and
compressed forms, this protocol depends upon the link layer being able
to provide an indication of four packet types in addition to the packet
types that indicate normal IPv4 and IPv6 packets:

   FULL_HEADER - communicates the uncompressed IP header plus any
   following headers and data to establish the uncompressed header state
   in the decompressor for a particular context. The FULL_HEADER
   packet also carries the 8- or 16-bit session context identifier and
   the 4-bit sequence number to establish synchronization between the
   compressor and decompressor. The format is shown in section 3.3.1.

   COMPRESSED_UDP - communicates the IP and UDP headers compressed to
   6 or fewer bytes (often 2 if UDP checksums are disabled), followed
   by any subsequent headers (possibly RTP) in uncompressed form, plus
   data. This packet type is used when there are differences in the
   usually constant fields of the (potential) RTP header.
   The RTP header includes a potentially changed value of the SSRC
   field, so this packet may redefine the session context. The format
   is shown in section 3.3.3.

   COMPRESSED_RTP - indicates that the RTP header is compressed along
   with the IP and UDP headers. The size of this packet may still be
   just two bytes, or more if differences must be communicated. This
   packet type is used when the second-order difference (at least in
   the usually constant fields) is zero. It includes delta encodings
   for those fields that have changed, and also establishes the
   first-order differences after an uncompressed RTP header is sent.
   The format is shown in section 3.3.2.

   CONTEXT_STATE - indicates a special packet sent from the
   decompressor to the compressor to communicate a list of context IDs
   for which synchronization has or may have been lost. This packet is
   only sent across the point-to-point link so it requires no IP
   header. The format is shown in section 3.3.5.

When this compression scheme is used with IPv6 as part of the general
header compression framework specified in [3], another packet type may
be used:

   COMPRESSED_NON_TCP - communicates the compressed IP and UDP headers
   as defined in [3] without differential encoding. If it were used
   for IPv4, it would require one or two bytes more than the
   COMPRESSED_UDP form listed above in order to carry the IPv4 ID
   field. For IPv6, there is no ID field and this non-differential
   compression is more resilient to packet loss.

Assignment of numeric codes for these packet types in the Point-to-Point
Protocol [4] will be made by the Internet Assigned Numbers Authority.

3.3.1. FULL_HEADER (uncompressed) packet format

The definition of the FULL_HEADER packet given here is intended to be
consistent with the definition given in [3]. Full details on design
choices are given there.

The format of the FULL_HEADER packet is the same as that of the original
packet. In the IPv4 case, this is usually an IP header, followed by a
UDP header and UDP payload that may be an RTP header and its payload.
However, the FULL_HEADER packet may also carry IP encapsulated packets,
in which case there would be two IP headers followed by UDP and possibly
RTP. Or in the case of IPv6, the packet may be built of some
combination of IPv6 and IPv4 headers. Each successive header is
indicated by the type field of the previous header, as usual.

The FULL_HEADER packet differs from the corresponding normal IPv4 or
IPv6 packet in that it must also carry the compression context ID and
the 4-bit sequence number. In order to avoid expanding the size of the
header, these values are inserted into length fields in the IP and UDP
headers since the actual length may be inferred from the length provided
by the link layer. Two 16-bit length fields are needed; these are taken
from the first two available headers in the packet. That is, for an
IPv4/UDP packet, the first length field is the total length field of the
IPv4 header, and the second is the length field of the UDP header. For
an IPv4 encapsulated packet, the first length field would come from the
total length field of the first IP header, and the second length field
would come from the total length field of the second IP header.

As specified in Section 5.3.2 of [3], the position of the context ID
(CID) and 4-bit sequence number varies depending upon whether 8- or
16-bit context IDs have been selected, as shown in the following diagram
(16 bits wide, with the most-significant bit to the left):

   For 8-bit context ID:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|1| Generation|      CID      |  First length field
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           0           |  seq  |  Second length field
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   For 16-bit context ID:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|1| Generation|   0   |  seq  |  First length field
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              CID              |  Second length field
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The first bit in the first length field indicates the length of the CID.
The second bit in the first length field is 1 to indicate that the 4-bit
sequence number is present, as is always the case for this IP/UDP/RTP
compression scheme. The generation field is used with IPv6 for
COMPRESSED_NON_TCP packets as described in [3]. For IPv4-only
implementations, the compressor may set the generation value to zero.
For consistent operation between IPv4 and IPv6, the generation value is
stored in the context when it is received by the decompressor, and the
most recent value is returned in the CONTEXT_STATE packet.

When a FULL_HEADER packet is received, the complete set of headers is
stored into the context selected by the context ID. The 4-bit sequence
number is also stored in the context, thereby resynchronizing the
decompressor to the compressor.

When COMPRESSED_NON_TCP packets are used, the 4-bit sequence number is
inserted into the "Data Field" of that packet and the D bit is set as
described in Section 6 of [3].
When a COMPRESSED_NON_TCP packet is received, the generation number must
be compared to the value stored in the context. If they are not the
same, the context is not up to date and must be refreshed by a
FULL_HEADER packet. If the generation does match, then the compressed
IP and UDP header information, the 4-bit sequence number, and the
(potential) RTP header are all stored into the saved context.

The amount of memory required to store the context will vary depending
upon how many encapsulating headers are included in the FULL_HEADER
packet. The compressor and decompressor may negotiate a maximum header
size.

3.3.2. COMPRESSED_RTP packet format

When the second-order difference of the RTP header from packet to packet
is zero, the decompressor can reconstruct a packet simply by adding the
stored first-order differences to the stored uncompressed header
representing the previous packet. All that need be communicated is a
small sequence number to maintain synchronization and detect packet loss
between the compressor and decompressor.

When the second-order difference of the RTP header is not zero for some
fields, the new first-order difference for just those fields is
communicated using a compact encoding. The new first-order difference
values are used to update the uncompressed header in the decompressor's
session context, and are also stored explicitly in the context to be
used for updating the fields again on subsequent packets in which the
second-order difference is zero.

In practice, the only fields for which it is useful to store the
first-order difference are the IPv4 ID field and the RTP timestamp. For
the RTP sequence number field, the usual increment is 1. If the
sequence number changes by other than 1, the difference must be
communicated but does not set the expected difference for the next
packet.
Instead, the expected first-order difference remains fixed at 1 so that
the difference need not be explicitly communicated on the next packet
assuming it is in order.

For the RTP timestamp, when a FULL_HEADER, COMPRESSED_NON_TCP or
COMPRESSED_UDP packet is sent to refresh the RTP state, the stored
first-order difference is initialized to zero. If the timestamp is the
same on the next packet (e.g., same video frame), then the second-order
difference is zero. Otherwise, the difference between the timestamps of
the two packets is transmitted as the new first-order difference.

Similarly, since the IPv4 ID field frequently increments by one, the
first-order difference for that field is initialized to one when the
state is refreshed by a FULL_HEADER packet, or when a COMPRESSED_NON_TCP
packet is sent since it carries the ID field in uncompressed form.
Thereafter, whenever the first-order difference changes, it is
transmitted and stored in the context.

A bit mask will be used to indicate which fields have changed by other
than the expected difference. In addition to the small link sequence
number, the list of items to be conditionally communicated in the
compressed IP/UDP/RTP header is as follows:

   I = IPv4 packet ID (always 0 if no IPv4 header)
   U = UDP checksum
   M = RTP marker bit
   S = RTP sequence number
   T = RTP timestamp
   L = RTP CSRC count and list

If 4 bits are needed for the link sequence number to get a reasonable
probability of loss detection, there are too few bits remaining to
assign one bit to each of these items and still fit them all into a
single byte to go along with the context ID.

It is not necessary to explicitly indicate the presence of the UDP
checksum because a source will typically include checksums on all
packets of a session or none of them.
When the session state is initialized with an uncompressed header, if
there is a nonzero checksum present, an unencoded 16-bit checksum will
be appended to the compressed header in all subsequent packets until
this setting is changed by sending another uncompressed packet.

Of the remaining items, the CSRC list may be the one least frequently
used. Rather than dedicating a bit to indicate CSRC change, an unusual
combination of the other bits may be used instead. This bit combination
is denoted MSTI. If all four of the bits for the IP packet ID, RTP
marker bit, RTP sequence number and RTP timestamp are set, this is a
special case indicating that an extended form of the compressed RTP
header will follow. That header will include an additional byte
containing the real values of the four bits plus the CC count. The CSRC
list, of length indicated by the CC count, will be included just as it
appears in the uncompressed RTP header.

The following diagram shows the compressed IP/UDP/RTP header with dotted
lines indicating fields that are conditionally present. The most
significant bit is numbered 0.

     0   1   2   3   4   5   6   7
   +-------------------------------+
   |        session context        |
   +---+---+---+---+---+---+---+---+
   | M | S | T | I |   sequence    |
   +---+---+---+---+---+---+---+---+
   :                               :
   +        "RANDOM" fields        +  (only if encapsulated)
   :                               :
   +...............................+
   :                               :
   +         UDP checksum          +  (implicit)
   :                               :
   +...............................+
   : M'| S'| T'| I'|      CC       :  (if MSTI)
   +...............................+
   :         delta IPv4 ID         :  (if I or I')
   +...............................+
   :      delta RTP sequence       :  (if S or S')
   +...............................+
   :      delta RTP timestamp      :  (if T or T')
   +...............................+
   :                               :
   :           CSRC list           :  (if MSTI)
   :                               :
   :                               :
   +...............................+
   :                               :
   :     RTP header extension      :  (if X set in context)
   :                               :
   :                               :
   +---+---+---+---+---+---+---+---+
   |           RTP data            |
   :                               :

When more than one IP header is present in the context as initialized
by the FULL_HEADER packet, then the IP ID fields of encapsulating
headers must be sent as absolute values as described in [3]. These
fields are identified as "RANDOM" fields. They are inserted into the
COMPRESSED_RTP packet in the same order as they appear in the original
headers, immediately following the MSTI byte as shown. Only if an
IPv4 header immediately precedes the UDP header will the IP ID of that
header be sent differentially, i.e., potentially with no bits if the
second difference is zero, or as a delta IPv4 ID field if not. If
there is not an IPv4 header immediately preceding the UDP header,
then the I bit will be 0 and no delta IPv4 ID field will be present.

3.3.3. COMPRESSED_UDP packet format

If there is a change in any of the fields of the RTP header that are
normally constant (such as the payload type field), then an uncompressed
RTP header must be sent.
If the IP and UDP headers do not also require
updating, this RTP header may be carried in a COMPRESSED_UDP packet
rather than a FULL_HEADER packet. The COMPRESSED_UDP packet has the
same format as the COMPRESSED_RTP packet except that the M, S and T bits
are always 0 and the corresponding delta fields are never included:

         0   1   2   3   4   5   6   7
       +-------------------------------+
       |        session context        |
       +---+---+---+---+---+---+---+---+
       | 0 | 0 | 0 | I |   sequence    |
       +---+---+---+---+---+---+---+---+
       :                               :
       +        "RANDOM" fields        +  (only if encapsulated)
       :                               :
       +...............................+
       :                               :
       +         UDP checksum          +  (implicit)
       :                               :
       +...............................+
       :         delta IPv4 ID         :  (if I)
       +---+---+---+---+---+---+---+---+
       |           UDP data            |
       :   (uncompressed RTP header)   :

Note that this constitutes a form of IP/UDP header compression different
from the COMPRESSED_NON_TCP packet type defined in [3]. The motivation
is to allow reaching the target of two bytes when UDP checksums are
disabled, as IPv4 allows. The protocol in [3] does not use differential
coding for UDP packets, so in the IPv4 case, two bytes of IP ID, and two
bytes of UDP checksum if nonzero, would always be transmitted in
addition to two bytes of compression prefix. For IPv6, the
COMPRESSED_NON_TCP packet type may be used instead.

3.3.4. Encoding of differences

The delta fields in the COMPRESSED_RTP and COMPRESSED_UDP packets are
encoded with a variable-length mapping for compactness of the more
commonly-used values. A default encoding is specified below, but it is
recommended that implementations use a table-driven delta encoder and
decoder to allow negotiation of a table specific for each session if
appropriate, possibly even an optimal Huffman encoding.
Encodings based
on sequential interpretation of the bit stream, of which the default
table and Huffman encoding are examples, allow a reasonable table size
and may result in an execution speed faster than a non-table-driven
implementation with explicit tests for ranges of values.

The default delta encoding is specified in the following table. This
encoding was designed to efficiently encode the small changes that may
occur in the IP ID and in the RTP sequence number when packets are lost
upstream from the compressor, while still handling most audio and video
deltas in two bytes. The column on the left is the decimal value to be
encoded, and the column on the right is the resulting sequence of bytes
shown in hexadecimal and in network byte order. The first and last
values in each contiguous range are shown, with ellipses in between:

       Decimal    Hex

        -16384    C0 00 00
             :    :
          -129    C0 3F 7F
          -128    80 00
             :    :
            -1    80 7F
             0    00
             :    :
           127    7F
           128    80 80
             :    :
         16383    BF FF
         16384    C0 40 00
             :    :
       4194303    FF FF FF

For positive values, a change of zero through 127 is represented
directly in one byte. If the most significant two bits of the byte are
10 or 11, this signals an extension to a two- or three-byte value,
respectively. The least significant six bits of the first byte are
combined, in decreasing order of significance, with the next one or two
bytes to form a 14- or 22-bit value.

Negative deltas may occur when packets are misordered or in the
intentionally out-of-order RTP timestamps on MPEG video. These events
are less likely, so a smaller range of negative values is encoded using
otherwise redundant portions of the positive part of the table.

A change in the RTP timestamp value less than -16384 or greater than
4194303 forces the RTP header to be sent uncompressed using a
FULL_HEADER, COMPRESSED_NON_TCP or COMPRESSED_UDP packet type.
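
As a rough illustration only, the default table above might be
implemented along the following lines. This is a sketch with explicit
range tests rather than the table-driven design recommended earlier, and
the function names encode_delta and decode_delta are hypothetical, not
part of this specification.

```python
from typing import Tuple

# Hypothetical helper names; a sketch of the default delta table only.

def encode_delta(delta: int) -> bytes:
    """Encode one delta with the default table (one to three bytes)."""
    if 0 <= delta <= 127:
        return bytes([delta])                    # one byte, top bit 0
    if -128 <= delta <= 16383:
        # Two-byte form, prefix bits 10. The 14-bit codes 0..127 would be
        # redundant for positive values, so they carry -128..-1 instead.
        v = delta + 128 if delta < 0 else delta
        return bytes([0x80 | (v >> 8), v & 0xFF])
    if -16384 <= delta <= 4194303:
        # Three-byte form, prefix bits 11. The 22-bit codes 0..16255
        # carry -16384..-129.
        v = delta + 16384 if delta < 0 else delta
        return bytes([0xC0 | (v >> 16), (v >> 8) & 0xFF, v & 0xFF])
    raise ValueError("delta out of range; an uncompressed header is needed")


def decode_delta(buf: bytes) -> Tuple[int, int]:
    """Decode one delta; return (value, number of bytes consumed)."""
    b0 = buf[0]
    if b0 < 0x80:
        return b0, 1
    if b0 < 0xC0:
        v = ((b0 & 0x3F) << 8) | buf[1]
        return (v - 128 if v < 128 else v), 2
    v = ((b0 & 0x3F) << 16) | (buf[1] << 8) | buf[2]
    return (v - 16384 if v < 16384 else v), 3
```

Each boundary row of the table can be checked against encode_delta, and
decode_delta is intended to invert it for every value in range.
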
The IP
ID and RTP sequence number fields are only 16 bits, so negative deltas
for those fields should be masked to 16 bits and then encoded (as large
positive 16-bit numbers).

3.3.5. Error Recovery

Whenever the 4-bit sequence number for a particular context increments
by other than 1, except when set by a FULL_HEADER or COMPRESSED_NON_TCP
packet, the decompressor must invalidate that context and send a
CONTEXT_STATE packet back to the compressor indicating that the context
has been invalidated. All packets for the invalid context must be
discarded until a FULL_HEADER or COMPRESSED_NON_TCP packet is received
for that context to re-establish consistent state. Since multiple
compressed packets may arrive in the interim, the decompressor should
not retransmit the CONTEXT_STATE packet for every compressed packet
received, but instead should limit the rate of retransmission to avoid
flooding the reverse channel.

When an error occurs on the link, the link layer will usually discard
the packet that was damaged (if any), but may provide an indication of
the error. Some time may elapse before another packet is delivered for
the same context, and then that packet would have to be discarded by the
decompressor when it is observed to be out of sequence, resulting in at
least two packets lost. To allow faster recovery if the link does
provide an explicit error indication, the decompressor may optionally
send a CONTEXT_STATE packet listing the last valid sequence number and
generation number for one or more recently active contexts. For a given
context, if the compressor has sent no compressed packet with a higher
sequence number, no corrective action is required. Otherwise, the
compressor may mark the context invalid so that the next packet is sent
in FULL_HEADER or COMPRESSED_NON_TCP mode.
If the generation number
does not match the current generation of the COMPRESSED_NON_TCP packet,
then a FULL_HEADER must be sent.

The format of the CONTEXT_STATE packet is shown in the following
diagram. The first byte is a type code to allow the CONTEXT_STATE
packet type to be shared for compression schemes for other protocols
that may be defined in parallel with this one. For this IP/UDP/RTP
compression scheme that type code has the value 1, and the remainder of
the CONTEXT_STATE packet is structured as a list of blocks to allow the
state for multiple contexts to be indicated, preceded by a one-byte
count of the number of blocks.

         0   1   2   3   4   5   6   7
       +---+---+---+---+---+---+---+---+
       |  IP/UDP/RTP compression = 1   |
       +---+---+---+---+---+---+---+---+
       |         context count         |
       +---+---+---+---+---+---+---+---+
       +---+---+---+---+---+---+---+---+
       |        session context        |
       +---+---+---+---+---+---+---+---+
       | I | 0 | 0 | 0 |   sequence    |
       +---+---+---+---+---+---+---+---+
       | 0 | 0 |      generation       |
       +---+---+---+---+---+---+---+---+
                      ...
       +---+---+---+---+---+---+---+---+
       |        session context        |
       +---+---+---+---+---+---+---+---+
       | I | 0 | 0 | 0 |   sequence    |
       +---+---+---+---+---+---+---+---+
       | 0 | 0 |      generation       |
       +---+---+---+---+---+---+---+---+

The bit labeled "I" is set to one for contexts that have been marked
invalid and require a FULL_HEADER or COMPRESSED_NON_TCP packet to be
transmitted. If the I bit is zero, the context state is advisory.

Since the CONTEXT_STATE packet itself may be lost, retransmission of one
or more blocks is allowed. It is expected that retransmission will be
triggered only by receipt of another packet, but if the line is near
idle, retransmission might be triggered by a relatively long timer (on
the order of 1 second).
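
The CONTEXT_STATE layout described above could be serialized and parsed
roughly as follows. This is a minimal sketch that assumes a one-byte
session context ID (the size of the context ID is negotiable) and takes
bit 0 of each byte as the most significant bit. The names ContextBlock,
build_context_state and parse_context_state are hypothetical.

```python
from typing import List, NamedTuple

TYPE_IP_UDP_RTP = 1   # type code given in the text for this scheme


class ContextBlock(NamedTuple):
    """One three-byte block of a CONTEXT_STATE packet (illustrative)."""
    context_id: int   # assumed to fit in one byte
    invalid: bool     # the "I" bit
    sequence: int     # 4-bit sequence number
    generation: int   # 6-bit generation number


def build_context_state(blocks: List[ContextBlock]) -> bytes:
    """Serialize the type code, the block count and then each block."""
    if len(blocks) > 255:
        raise ValueError("block count must fit in one byte")
    out = bytearray([TYPE_IP_UDP_RTP, len(blocks)])
    for b in blocks:
        if not (0 <= b.context_id <= 0xFF and 0 <= b.sequence <= 0x0F
                and 0 <= b.generation <= 0x3F):
            raise ValueError("field out of range")
        out.append(b.context_id)
        out.append((0x80 if b.invalid else 0x00) | b.sequence)
        out.append(b.generation)
    return bytes(out)


def parse_context_state(data: bytes) -> List[ContextBlock]:
    """Parse a packet produced by build_context_state."""
    if len(data) < 2 or data[0] != TYPE_IP_UDP_RTP:
        raise ValueError("not an IP/UDP/RTP CONTEXT_STATE packet")
    count = data[1]
    if len(data) != 2 + 3 * count:
        raise ValueError("length does not match the context count")
    blocks = []
    for i in range(count):
        cid, flags, gen = data[2 + 3 * i: 5 + 3 * i]
        blocks.append(ContextBlock(cid, bool(flags & 0x80),
                                   flags & 0x0F, gen & 0x3F))
    return blocks
```

A decompressor might, for example, report context 5 as invalid at
sequence 9 and context 200 as advisory in a single packet by passing two
ContextBlock entries to build_context_state.
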

If a CONTEXT_STATE block for a given context is retransmitted, it may
cross paths with the FULL_HEADER or COMPRESSED_NON_TCP packet intended
to refresh that context. In that case, the compressor may choose to
ignore the error indication.

In the case where UDP checksums are being transmitted, the decompressor
could attempt to use the "twice" algorithm described in section 10.1 of
[3]. In this algorithm, the delta is applied more than once on the
assumption that the delta may have been the same on the missing
packet(s) and the one subsequently received. For the scheme defined
here, the difference in the 4-bit sequence number tells the number of
times the delta must be applied. Note, however, that there is a
nontrivial risk of an incorrect positive indication. It may be
advisable to request a FULL_HEADER or COMPRESSED_NON_TCP packet even if
the "twice" algorithm succeeds.

Some errors may not be detected, for example if 16 packets are lost in a
row and the link level does not provide an error indication. In that
case, the decompressor will generate packets that are not valid. If UDP
checksums are being transmitted, the receiver will probably detect the
invalid packets and discard them, but the receiver does not have any
means to signal the decompressor. Therefore, it is recommended that the
decompressor verify the UDP checksum periodically, perhaps one out of 16
packets. If an error is detected, the decompressor would invalidate the
context and signal the compressor with a CONTEXT_STATE packet.

3.4. Compression of RTCP Control Packets

By relying on the RTP convention that data is carried on an even port
number and the corresponding RTCP packets are carried on the next higher
(odd) port number, one could tailor separate compression schemes to be
applied to RTP and RTCP packets.
For RTCP, the compression could apply
not only to the header but also the "data", that is, the contents of the
different packet types. The numbers in Sender Report (SR) and Receiver
Report (RR) RTCP packets would not compress well, but the text
information in the Source Description (SDES) packets could be compressed
down to a bit mask indicating each item that was present but compressed
out (for timing purposes on the SDES NOTE item and to allow the end
system to measure the average RTCP packet size for the interval
calculation).

However, in the compression scheme defined here, no compression will be
done on RTCP packets for several reasons. Since the RTP protocol
specification suggests that the RTCP packet interval be scaled so that
the aggregate RTCP bandwidth used by all participants in a session will
be no more than 5% of the session bandwidth, there is not much to be
gained from RTCP compression. Compressing out the SDES items would
require a significant increase in the shared state that must be stored
for each context ID. And, in order to allow compression when SDES
information for several sources was sent through an RTP "mixer", it
would be necessary to maintain a separate RTCP session context for each
SSRC identifier. In a session with more than 255 participants, this
would cause perfect thrashing of the context cache even when only one
participant was sending data.

3.5. Compression of non-RTP UDP Packets

As described earlier, the COMPRESSED_UDP packet may be used to compress
UDP packets that don't carry RTP. Whatever data follows the UDP header
is unlikely to have some constant values in the bits that correspond to
usually constant fields in the RTP header. In particular, the SSRC
field would likely change.
Therefore, it is necessary to keep track of
the non-RTP UDP packet streams to avoid using up all the context slots
as the "SSRC field" changes (since that field is part of what identifies
a particular RTP context). Those streams may each be given a context,
but the encoder would set a flag in the context to indicate that the
changing SSRC field should be ignored and COMPRESSED_UDP packets should
always be sent instead of COMPRESSED_RTP packets.

4. Interaction With Segmentation

A segmentation scheme may be used in conjunction with RTP header
compression to allow small, real-time packets to interrupt large,
presumably non-real-time packets in order to reduce delay. It is
assumed that the large packets bypass the compressor and decompressor
since the interleaving would modify the sequencing of packets at the
decompressor and cause the appearance of errors. Header compression
should be less important for large packets since the overhead ratio is
smaller.

If some packets from an RTP session context are selected for
segmentation (perhaps based on size) and some are not, there is a
possibility of re-ordering. This would reduce the compression
efficiency because the large packets would appear as lost packets in the
sequence space. However, this should not cause more serious problems
because the RTP sequence numbers should be reconstructed correctly and
will allow the application to correct the ordering.

Link errors detected by the segmentation scheme using its own sequencing
information may be indicated to the compressor with an advisory
CONTEXT_STATE message just as for link errors detected by the link layer
itself.

The context ID byte is placed first in the COMPRESSED_RTP header so that
this byte may be shared with the segmentation layer if such sharing is
feasible and has been negotiated.
Since the context ID may have any
value, it can be set to match context information from the segmentation
layer.

5. Negotiating Compression

The use of IP/UDP/RTP compression over a particular link is a function
of the link-layer protocol. It is expected that such negotiation will
be defined separately for PPP [4], for example. The following items may
be negotiated:

   o  The size of the context ID.
   o  The maximum size of the stack of headers in the context.
   o  A context-specific table for decoding of delta values.

6. Acknowledgments

Several people have contributed to the design of this compression scheme
and related problems. Scott Petrack initiated discussion of RTP header
compression in the AVT working group at Los Angeles in March, 1996.
Carsten Bormann has developed an overall architecture for compression in
combination with traffic control across a low-speed link, and made
several specific contributions to the scheme described here. David Oran
independently developed a note based on similar ideas, and suggested the
use of PPP Multilink protocol for segmentation. Mikael Degermark has
contributed advice on integration of this compression scheme with the
IPv6 compression framework.

7. References

[1] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP:
    A Transport Protocol for Real-Time Applications," RFC 1889.

[2] V. Jacobson, "Compressing TCP/IP Headers for Low-Speed Serial
    Links," RFC 1144.

[3] M. Degermark, B. Nordgren, and S. Pink, "Header Compression for
    IPv6," work in progress.

[4] W. Simpson, "The Point-to-Point Protocol (PPP)," RFC 1548.

8. Security Considerations

Because encryption eliminates the redundancy that this compression
scheme tries to exploit, there is some inducement to forgo encryption in
order to achieve operation over a low-bandwidth link.
However, for
those cases where encryption of data and not headers is satisfactory,
RTP does specify an alternative encryption method in which only the RTP
payload is encrypted and the headers are left in the clear. That would
allow compression to still be applied.

9. Authors' Addresses

   Stephen L. Casner
   Precept Software, Inc.
   1072 Arastradero Road
   Palo Alto, CA 94304
   United States
   EMail: casner@precept.com

   Van Jacobson
   MS 46a-1121
   Lawrence Berkeley National Laboratory
   Berkeley, CA 94720
   United States
   EMail: van@ee.lbl.gov