Internet Engineering Task Force                                   AVT WG
Internet Draft                                        Schulzrinne/Casner
draft-ietf-avt-profile-new-12.txt              Columbia U./Packet Design
November 20, 2001
Expires: May 2002

    RTP Profile for Audio and Video Conferences with Minimal Control

STATUS OF THIS MEMO

This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups.
Note that
other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress".

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt

To view the list of Internet-Draft Shadow Directories, see
http://www.ietf.org/shadow.html.

Abstract

This memorandum is a revision of RFC 1890 in preparation for
advancement from Proposed Standard to Draft Standard status.

This document describes a profile called "RTP/AVP" for the use of the
real-time transport protocol (RTP), version 2, and the associated
control protocol, RTCP, within audio and video multiparticipant
conferences with minimal control. It provides interpretations of
generic fields within the RTP specification suitable for audio and
video conferences. In particular, this document defines a set of
default mappings from payload type numbers to encodings.

This document also describes how audio and video data may be carried
within RTP. It defines a set of standard encodings and their names
when used within RTP. The descriptions provide pointers to reference
implementations and the detailed standards. This document is meant as
an aid for implementors of audio, video and other real-time
multimedia applications.

Contents

1        Introduction
1.1      Terminology
2        RTP and RTCP Packet Forms and Protocol Behavior
3        IANA Considerations
3.1      Registering Additional Encodings
4        Audio
4.1      Encoding-Independent Rules
4.2      Operating Recommendations
4.3      Guidelines for Sample-Based Audio Encodings
4.4      Guidelines for Frame-Based Audio Encodings
4.5      Audio Encodings
4.5.1    DVI4
4.5.2    G722
4.5.3    G723
4.5.4    G726-40, G726-32, G726-24, and G726-16
4.5.5    G728
4.5.6    G729
4.5.7    G729D and G729E
4.5.8    GSM
4.5.8.1  General Packaging Issues
4.5.8.2  GSM variable names and numbers
4.5.9    GSM-EFR
4.5.10   L8
4.5.11   L16
4.5.12   LPC
4.5.13   MPA
4.5.14   PCMA and PCMU
4.5.15   QCELP
4.5.16   RED
4.5.17   VDVI
5        Video
5.1      CelB
5.2      JPEG
5.3      H261
5.4      H263
5.5      H263-1998
5.6      MPV
5.7      MP2T
5.8      nv
6        Payload Type Definitions
7        RTP over TCP and Similar Byte Stream Protocols
8        Port Assignment
9        Changes from RFC 1890
10       Security Considerations
11       Full Copyright Statement
12       Acknowledgments
13       Addresses of Authors

1 Introduction

[Note to the RFC Editor: This paragraph and the first paragraph of
the Abstract are to be deleted when this draft is published as an
RFC. All RFC XXXX should be filled in with the number of the RTP
specification RFC submitted for Draft Standard status, and all RFC
YYYY should be filled in with the number of the draft specifying MIME
registration of RTP payload types as it is submitted for Proposed
Standard status. These latter references are intended to be
non-normative as this Profile may be used independently of the MIME
registrations.]

This profile defines aspects of RTP left unspecified in the RTP
Version 2 protocol definition (RFC XXXX) [2]. This profile is
intended for use within audio and video conferences with minimal
session control. In particular, no support for the negotiation of
parameters or membership control is provided. The profile is expected
to be useful in sessions where no negotiation or membership control
is used (e.g., using the static payload types and the membership
indications provided by RTCP), but this profile may also be useful in
conjunction with a higher-level control protocol.

Use of this profile may be implicit in the use of the appropriate
applications; there may be no explicit indication by port number,
protocol identifier or the like. Applications such as session
directories may use the name for this profile specified in Section 3.

Other profiles may make different choices for the items specified
here.

This document also defines a set of encodings and payload formats for
audio and video. These payload format descriptions are included here
only as a matter of convenience since they are too small to warrant
separate documents. Use of these payload formats is NOT REQUIRED to
use this profile. Only the binding of some of the payload formats to
static payload type numbers in Tables 4 and 5 is normative.

1.1 Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [1] and
indicate requirement levels for implementations compliant with this
RTP profile.

This draft defines the term media type as dividing encodings of audio
and video content into three classes: audio, video and audio/video
(interleaved).

2 RTP and RTCP Packet Forms and Protocol Behavior

The section "RTP Profiles and Payload Format Specification" of RFC
XXXX enumerates a number of items that can be specified or modified
in a profile. This section addresses these items. Generally, this
profile follows the default and/or recommended aspects of the RTP
specification.

RTP data header: The standard format of the fixed RTP data
     header is used (one marker bit).

Payload types: Static payload types are defined in Section 6.

RTP data header additions: No additional fixed fields are
     appended to the RTP data header.

RTP data header extensions: No RTP header extensions are
     defined, but applications operating under this profile MAY
     use such extensions. Thus, applications SHOULD NOT assume
     that the RTP header X bit is always zero and SHOULD be
     prepared to ignore the header extension. If a header
     extension is defined in the future, that definition MUST
     specify the contents of the first 16 bits in such a way
     that multiple different extensions can be identified.

RTCP packet types: No additional RTCP packet types are defined
     by this profile specification.

RTCP report interval: The suggested constants are to be used for
     the RTCP report interval calculation. Sessions operating
     under this profile MAY specify a separate parameter for the
     RTCP traffic bandwidth rather than using the default
     fraction of the session bandwidth. The RTCP traffic
     bandwidth MAY be divided into two separate session
     parameters for those participants which are active data
     senders and those which are not. Following the
     recommendation in the RTP specification [2] that 1/4 of the
     RTCP bandwidth be dedicated to data senders, the
     RECOMMENDED default values for these two parameters would
     be 1.25% and 3.75%, respectively. For a particular session,
     the RTCP bandwidth for non-data-senders MAY be set to zero
     when operating on unidirectional links or for sessions that
     don't require feedback on the quality of reception. The
     RTCP bandwidth for data senders SHOULD be kept non-zero so
     that sender reports can still be sent for inter-media
     synchronization and to identify the source by CNAME. The
     means by which the one or two session parameters for RTCP
     bandwidth are specified is beyond the scope of this memo.

SR/RR extension: No extension section is defined for the RTCP SR
     or RR packet.

SDES use: Applications MAY use any of the SDES items described
     in the RTP specification.
     While CNAME information MUST be
     sent every reporting interval, other items SHOULD only be
     sent every third reporting interval, with NAME sent seven
     out of eight times within that slot and the remaining SDES
     items cyclically taking up the eighth slot, as defined in
     Section 6.2.2 of the RTP specification. In other words,
     NAME is sent in RTCP packets 1, 4, 7, 10, 13, 16, 19,
     while, say, EMAIL is used in RTCP packet 22.

Security: The RTP default security services are also the default
     under this profile.

String-to-key mapping: No mapping is specified by this profile.

Congestion: RTP and this profile may be used in the context of
     enhanced network service, for example, through Integrated
     Services (RFC 1633) [4] or Differentiated Services (RFC
     2475) [5], or they may be used with best effort service.

     If enhanced service is being used, RTP receivers SHOULD
     monitor packet loss to ensure that the service that was
     requested is actually being delivered. If it is not, then
     they SHOULD assume that they are receiving best-effort
     service and behave accordingly.

     If best-effort service is being used, RTP receivers SHOULD
     monitor packet loss to ensure that the packet loss rate is
     within acceptable parameters. Packet loss is considered
     acceptable if a TCP flow across the same network path and
     experiencing the same network conditions would achieve an
     average throughput, measured on a reasonable timescale,
     that is not less than the RTP flow is achieving. This
     condition can be satisfied by implementing congestion
     control mechanisms to adapt the transmission rate (or the
     number of layers subscribed for a layered multicast
     session), or by arranging for a receiver to leave the
     session if the loss rate is unacceptably high.

     The comparison to TCP cannot be specified exactly, but is
     intended as an "order-of-magnitude" comparison in timescale
     and throughput.
     The timescale on which TCP throughput is
     measured is the round-trip time of the connection. In
     essence, this requirement states that it is not acceptable
     to deploy an application (using RTP or any other transport
     protocol) on the best-effort Internet which consumes
     bandwidth arbitrarily and does not compete fairly with TCP
     within an order of magnitude.

Underlying protocol: The profile specifies the use of RTP over
     unicast and multicast UDP as well as TCP. (This does not
     preclude the use of these definitions when RTP is carried
     by other lower-layer protocols.)

Transport mapping: The standard mapping of RTP and RTCP to
     transport-level addresses is used.

Encapsulation: This profile leaves to applications the
     specification of RTP encapsulation in protocols other than
     UDP.

3 IANA Considerations

The RTP specification establishes a registry of profile names for use
by higher-level control protocols, such as the Session Description
Protocol (SDP), RFC 2327 [6], to refer to transport methods. This
profile registers the name "RTP/AVP".

3.1 Registering Additional Encodings

This profile lists a set of encodings, each of which is comprised of
a particular media data compression or representation plus a payload
format for encapsulation within RTP. Some of those payload formats
are specified here, while others are specified in separate RFCs. It
is expected that additional encodings beyond the set listed here will
be created in the future and specified in additional payload format
RFCs.

This profile also assigns to each encoding a short name which MAY be
used by higher-level control protocols, such as the Session
Description Protocol (SDP), RFC 2327 [6], to identify encodings
selected for a particular RTP session.

In some contexts it may be useful to refer to these encodings in the
form of a MIME content-type.
To facilitate this, RFC YYYY [7]
provides registrations for all of the encoding names listed here as
MIME subtype names under the "audio" and "video" MIME types through
the MIME registration procedure as specified in RFC 2048 [8].

Any additional encodings specified for use under this profile (or
others) may also be assigned names registered as MIME subtypes with
the Internet Assigned Numbers Authority (IANA). This registry
provides a means to ensure that the names assigned to the additional
encodings are kept unique. RFC YYYY specifies the information that is
required for the registration of RTP encodings.

In addition to assigning names to encodings, this profile also
assigns static RTP payload type numbers to some of them. However, the
payload type number space is relatively small and cannot accommodate
assignments for all existing and future encodings. During the early
stages of RTP development, it was necessary to use statically
assigned payload types because no other mechanism had been specified
to bind encodings to payload types. It was anticipated that non-RTP
means beyond the scope of this memo (such as directory services or
invitation protocols) would be specified to establish a dynamic
mapping between a payload type and an encoding. Now, mechanisms for
defining dynamic payload type bindings have been specified in the
Session Description Protocol (SDP) and in other protocols such as
ITU-T recommendation H.323/H.245. These mechanisms associate the
registered name of the encoding/payload format, along with any
additional required parameters such as the RTP timestamp clock rate
and number of channels, to a payload type number. This association
is effective only for the duration of the RTP session in which the
dynamic payload type binding is made.
This association applies only
to the RTP session for which it is made, so the numbers can be
reused for different encodings in different sessions and the number
space limitation is avoided.

This profile reserves payload type numbers in the range 96-127
exclusively for dynamic assignment. Applications SHOULD first use
values in this range for dynamic payload types. Those applications
which need to define more than 32 dynamic payload types MAY bind
codes below 96, in which case it is RECOMMENDED that unassigned
payload type numbers be used first. However, the statically assigned
payload types are default bindings and MAY be dynamically bound to
new encodings if needed. Redefining payload types below 96 may cause
incorrect operation if an attempt is made to join a session without
obtaining session description information that defines the dynamic
payload types.

Dynamic payload types SHOULD NOT be used without a well-defined
mechanism to indicate the mapping. Systems that expect to
interoperate with others operating under this profile SHOULD NOT make
their own assignments of proprietary encodings to particular, fixed
payload types.

This specification establishes the policy that no additional static
payload types will be assigned beyond the ones defined in this
document. Establishing this policy avoids the problem of trying to
create a set of criteria for accepting static assignments and
encourages the implementation and deployment of the dynamic payload
type mechanisms.

4 Audio

4.1 Encoding-Independent Rules

For applications which send either no packets or occasional
comfort-noise packets during silence, the first packet of a talkspurt,
that is, the first packet after a silence period during which packets
have not been transmitted contiguously, SHOULD be distinguished by
setting the marker bit in the RTP data header to one.
The marker bit in all
other packets is zero. The beginning of a talkspurt MAY be used to
adjust the playout delay to reflect changing network delays.
Applications without silence suppression MUST set the marker bit to
zero.

The RTP clock rate used for generating the RTP timestamp is
independent of the number of channels and the encoding; it usually
equals the number of sampling periods per second. For N-channel
encodings, each sampling period (say, 1/8000 of a second) generates N
samples. (This terminology is standard, but somewhat confusing, as
the total number of samples generated per second is then the sampling
rate times the channel count.)

If multiple audio channels are used, channels are numbered
left-to-right, starting at one. In RTP audio packets, information from
lower-numbered channels precedes that from higher-numbered channels.
For more than two channels, the convention followed by the AIFF-C
audio interchange format SHOULD be followed [3], using the following
notation, unless some other convention is specified for a particular
encoding or payload format:

   l    left
   r    right
   c    center
   S    surround
   F    front
   R    rear

   channels  description    channel
                            1     2     3     4     5     6
   ________________________________________________________
   2         stereo         l     r
   3                        l     r     c
   4         quadrophonic   Fl    Fr    Rl    Rr
   4                        l     c     r     S
   5                        Fl    Fr    Fc    Sl    Sr
   6                        l     lc    c     r     rc    S

Samples for all channels belonging to a single sampling instant MUST
be within the same packet. The interleaving of samples from different
channels depends on the encoding. General guidelines are given in
Sections 4.3 and 4.4.

The sampling frequency SHOULD be drawn from the set: 8000, 11025,
16000, 22050, 24000, 32000, 44100 and 48000 Hz. (Older Apple
Macintosh computers had a native sample rate of 22254.54 Hz, which
can be converted to 22050 with acceptable quality by dropping 4
samples in a 20 ms frame.)
However, most audio encodings are defined
for a more restricted set of sampling frequencies. Receivers SHOULD
be prepared to accept multi-channel audio, but MAY choose to only
play a single channel.

4.2 Operating Recommendations

The following recommendations are default operating parameters.
Applications SHOULD be prepared to handle other values. The ranges
given are meant to give guidance to application writers, allowing a
set of applications conforming to these guidelines to interoperate
without additional negotiation. These guidelines are not intended to
restrict operating parameters for applications that can negotiate a
set of interoperable parameters, e.g., through a conference control
protocol.

For packetized audio, the default packetization interval SHOULD have
a duration of 20 ms or one frame, whichever is longer, unless
otherwise noted in Table 1 (column "ms/packet"). The packetization
interval determines the minimum end-to-end delay; longer packets
introduce less header overhead but higher delay and make packet loss
more noticeable. For non-interactive applications such as lectures or
for links with severe bandwidth constraints, a higher packetization
delay MAY be used. A receiver SHOULD accept packets representing
between 0 and 200 ms of audio data. (For framed audio encodings, a
receiver SHOULD accept packets with a number of frames equal to 200
ms divided by the frame duration, rounded up.) This restriction
allows reasonable buffer sizing for the receiver.

4.3 Guidelines for Sample-Based Audio Encodings

In sample-based encodings, each audio sample is represented by a
fixed number of bits. Within the compressed audio data, codes for
individual samples may span octet boundaries.
An RTP audio packet may
contain any number of audio samples, subject to the constraint that
the number of bits per sample times the number of samples per packet
yields an integral octet count. Fractional encodings produce less
than one octet per sample.

The duration of an audio packet is determined by the number of
samples in the packet.

For sample-based encodings producing one or more octets per sample,
samples from different channels sampled at the same sampling instant
SHOULD be packed in consecutive octets. For example, for a
two-channel encoding, the octet sequence is (left channel, first
sample), (right channel, first sample), (left channel, second sample),
(right channel, second sample), .... For multi-octet encodings, octets
SHOULD be transmitted in network byte order (i.e., most significant
octet first).

The packing of sample-based encodings producing less than one octet
per sample is encoding-specific.

The RTP timestamp reflects the instant at which the first sample in
the packet was sampled, that is, the oldest information in the
packet.

4.4 Guidelines for Frame-Based Audio Encodings

Frame-based encodings encode a fixed-length block of audio into
another block of compressed data, typically also of fixed length. For
frame-based encodings, the sender MAY choose to combine several such
frames into a single RTP packet. The receiver can tell the number of
frames contained in an RTP packet, if all the frames have the same
length, by dividing the RTP payload length by the audio frame size
which is defined as part of the encoding. This does not work when
carrying frames of different sizes unless the frame sizes are
relatively prime. If not, the frames MUST indicate their size.

For frame-based codecs, the channel order is defined for the whole
block.
That is, for two-channel audio, right and left samples SHOULD
be coded independently, with the encoded frame for the left channel
preceding that for the right channel.

All frame-oriented audio codecs SHOULD be able to encode and decode
several consecutive frames within a single packet. Since the frame
size for the frame-oriented codecs is given, there is no need to use
a separate designation for the same encoding, but with different
number of frames per packet.

RTP packets SHALL contain a whole number of frames, with frames
inserted according to age within a packet, so that the oldest frame
(to be played first) occurs immediately after the RTP packet header.
The RTP timestamp reflects the instant at which the first sample in
the first frame was sampled, that is, the oldest information in the
packet.

4.5 Audio Encodings

   name of                               sampling            default
   encoding  sample/frame  bits/sample   rate      ms/frame  ms/packet
   ___________________________________________________________________
   DVI4      sample        4             var.                20
   G722      sample        8             16,000              20
   G723      frame         N/A           8,000     30        30
   G726-40   sample        5             8,000               20
   G726-32   sample        4             8,000               20
   G726-24   sample        3             8,000               20
   G726-16   sample        2             8,000               20
   G728      frame         N/A           8,000     2.5       20
   G729      frame         N/A           8,000     10        20
   G729D     frame         N/A           8,000     10        20
   G729E     frame         N/A           8,000     10        20
   GSM       frame         N/A           8,000     20        20
   GSM-EFR   frame         N/A           8,000     20        20
   L8        sample        8             var.                20
   L16       sample        16            var.                20
   LPC       frame         N/A           8,000     20        20
   MPA       frame         N/A           var.      var.
   PCMA      sample        8             var.                20
   PCMU      sample        8             var.                20
   QCELP     frame         N/A           8,000     20        20
   VDVI      sample        var.          var.                20

   Table 1: Properties of Audio Encodings (N/A: not applicable; var.:
   variable)

The characteristics of the audio encodings described in this document
are shown in Table 1; they are listed in order of their payload type
in Table 4.
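
As a rough illustration of how the ms/frame and default ms/packet
columns of Table 1 relate to the 200 ms receiver guideline of Section
4.2, the following C sketch computes the number of frames in a default
packet and the largest frame count a receiver is advised to accept.
The function names are hypothetical, and durations are expressed in
microseconds (an assumption made here so that the 2.5 ms G728 frame
stays an integer).

```c
#include <assert.h>

/* Number of whole frames in a packet of the default duration.
 * Both arguments are in microseconds (illustrative convention). */
unsigned frames_per_default_packet(unsigned frame_us, unsigned packet_us)
{
    /* RTP packets carry a whole number of frames. */
    assert(frame_us > 0 && packet_us % frame_us == 0);
    return packet_us / frame_us;
}

/* 200 ms divided by the frame duration, rounded up: the number of
 * frames per packet a receiver SHOULD be prepared to accept. */
unsigned max_frames_to_accept(unsigned frame_us)
{
    assert(frame_us > 0);
    return (200000u + frame_us - 1u) / frame_us;
}
```

For example, the G728 row (2.5 ms frames, 20 ms default packet) gives
frames_per_default_packet(2500, 20000) == 8, and 30 ms G723 frames give
max_frames_to_accept(30000) == 7.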
While most audio codecs are only specified for a fixed
sampling rate, some sample-based algorithms (indicated by an entry of
"var." in the sampling rate column of Table 1) may be used with
different sampling rates, resulting in different coded bit rates.
When used with a sampling rate other than that for which a static
payload type is defined, non-RTP means beyond the scope of this memo
MUST be used to define a dynamic payload type and MUST indicate the
selected RTP timestamp clock rate, which is usually the same as the
sampling rate for audio.

4.5.1 DVI4

DVI4 is specified, with pseudo-code, in [9] as the IMA ADPCM wave
type.

However, the encoding defined here as DVI4 differs in three respects
from this recommendation:

   o The RTP DVI4 header contains the predicted value rather than
     the first sample value contained in the IMA ADPCM block header.

   o IMA ADPCM blocks contain an odd number of samples, since the
     first sample of a block is contained just in the header
     (uncompressed), followed by an even number of compressed
     samples. DVI4 has an even number of compressed samples only,
     using the `predict' word from the header to decode the first
     sample.

   o For DVI4, the 4-bit samples are packed with the first sample
     in the four most significant bits and the second sample in the
     four least significant bits. In the IMA ADPCM codec, the
     samples are packed in the opposite order.

Each packet contains a single DVI block. This profile only defines
the 4-bit-per-sample version, while IMA also specifies a
3-bit-per-sample encoding.
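
The DVI4 nibble order described above can be sketched in C as follows.
This is a minimal illustration under stated assumptions, not code from
the IMA reference: the function names dvi4_pack_nibbles and
dvi4_unpack_nibbles are hypothetical, and the ADPCM step that produces
the 4-bit codes is assumed to happen elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack 4-bit DVI4 codes (one code per element of `codes`, values 0..15)
 * into octets: first sample in the four most significant bits, second
 * sample in the four least significant bits. `n` must be even, since
 * DVI4 carries an even number of compressed samples.
 * Returns the number of octets written to `out`. */
size_t dvi4_pack_nibbles(const uint8_t *codes, size_t n, uint8_t *out)
{
    assert(n % 2 == 0);
    for (size_t i = 0; i < n; i += 2) {
        out[i / 2] = (uint8_t)(((codes[i] & 0x0F) << 4) |
                               (codes[i + 1] & 0x0F));
    }
    return n / 2;
}

/* Inverse operation: split each octet back into two 4-bit codes.
 * Returns the number of codes written to `codes`. */
size_t dvi4_unpack_nibbles(const uint8_t *in, size_t octets, uint8_t *codes)
{
    for (size_t i = 0; i < octets; i++) {
        codes[2 * i]     = (uint8_t)(in[i] >> 4);   /* first sample  */
        codes[2 * i + 1] = (uint8_t)(in[i] & 0x0F); /* second sample */
    }
    return octets * 2;
}
```

An IMA ADPCM implementation would place the two codes in the opposite
nibbles, so data taken from such a codec needs its nibbles swapped
before being sent as DVI4.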
556 The "header" word for each channel has the following structure: 558 int16 predict; /* predicted value of first sample 559 from the previous block (L16 format) */ 560 u_int8 index; /* current index into stepsize table */ 561 u_int8 reserved; /* set to zero by sender, ignored by receiver */ 563 Each octet following the header contains two 4-bit samples, thus the 564 number of samples per packet MUST be even because there is no means 565 to indicate a partially filled last octet. 567 Packing of samples for multiple channels is for further study. 569 The document IMA Recommended Practices for Enhancing Digital Audio 570 Compatibility in Multimedia Systems (version 3.0) contains the 571 algorithm description. It is available from 573 Interactive Multimedia Association 574 48 Maryland Avenue, Suite 202 575 Annapolis, MD 21401-8011 576 USA 577 phone: +1 410 626-1380 579 4.5.2 G722 581 G722 is specified in ITU-T Recommendation G.722, "7 kHz audio-coding 582 within 64 kbit/s". The G.722 encoder produces a stream of octets, 583 each of which SHALL be octet-aligned in an RTP packet. The first bit 584 transmitted in the G.722 octet, which is the most significant bit of 585 the higher sub-band sample, SHALL correspond to the most significant 586 bit of the octet in the RTP packet. 588 Even though the actual sampling rate for G.722 audio is 16000 Hz, the 589 RTP clock rate for the G722 payload format is 8000 Hz because that 590 value was erroneously assigned in RFC 1890 and must remain unchanged 591 for backward compatibility. The octet rate or sample-pair rate is 592 8000 Hz. 594 4.5.3 G723 596 G723 is specified in ITU Recommendation G.723.1, "Dual-rate speech 597 coder for multimedia communications transmitting at 5.3 and 6.3 598 kbit/s". The G.723.1 5.3/6.3 kbit/s codec was defined by the ITU-T as 599 a mandatory codec for ITU-T H.324 GSTN videophone terminal 600 applications. 
The algorithm has a floating point specification in 601 Annex B to G.723.1, a silence compression algorithm in Annex A to 602 G.723.1 and an encoded signal bit-error sensitivity specification in 603 G.723.1 Annex C. 605 This Recommendation specifies a coded representation that can be used 606 for compressing the speech signal component of multi-media services 607 at a very low bit rate. Audio is encoded in 30 ms frames, with an 608 additional delay of 7.5 ms due to look-ahead. A G.723.1 frame can be 609 one of three sizes: 24 octets (6.3 kb/s frame), 20 octets (5.3 kb/s 610 frame), or 4 octets. These 4-octet frames are called SID frames 611 (Silence Insertion Descriptor) and are used to specify comfort noise 612 parameters. There is no restriction on how 4, 20, and 24 octet frames 613 are intermixed. The least significant two bits of the first octet in 614 the frame determine the frame size and codec type: 616 bits content octets/frame 617 00 high-rate speech (6.3 kb/s) 24 618 01 low-rate speech (5.3 kb/s) 20 619 10 SID frame 4 620 11 reserved 622 It is possible to switch between the two rates at any 30 ms frame 623 boundary. Both (5.3 kb/s and 6.3 kb/s) rates are a mandatory part of 624 the encoder and decoder. The MIME registration for G723 in RFC YYYY 625 [7] specifies parameters that MAY be used with MIME or SDP to 626 restrict to a single data rate or to restrict the use of SID frames. 627 This coder was optimized to represent speech with near-toll quality 628 at the above rates using a limited amount of complexity. 630 The packing of the encoded bit stream into octets and the 631 transmission order of the octets is specified in Rec. G.723.1 and is 632 the same as that produced by the G.723 C code reference 633 implementation. For the 6.3 kb/s data rate, this packing is 634 illustrated as follows, where the header (HDR) bits are always "0 0" 635 as shown in Fig. 1 to indicate operation at 6.3 kb/s, and the Z bit 636 is always set to zero. 
The diagrams show the bit packing in "network 637 byte order," also known as big-endian order. The bits of each 32-bit 638 word are numbered 0 to 31, with the most significant bit on the left 639 and numbered 0. The octets (bytes) of each word are transmitted most 640 significant octet first. The bits of each data field are numbered in 641 the order of the bit stream representation of the encoding (least 642 significant bit first). The vertical bars indicate the boundaries 643 between field fragments. 645 0 1 2 3 646 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 647 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 648 | LPC |HDR| LPC | LPC | ACL0 |LPC| 649 | | | | | | | 650 |0 0 0 0 0 0|0 0|1 1 1 1 0 0 0 0|2 2 1 1 1 1 1 1|0 0 0 0 0 0|2 2| 651 |5 4 3 2 1 0| |3 2 1 0 9 8 7 6|1 0 9 8 7 6 5 4|5 4 3 2 1 0|3 2| 652 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 653 | ACL2 |ACL|A| GAIN0 |ACL|ACL| GAIN0 | GAIN1 | 654 | | 1 |C| | 3 | 2 | | | 655 |0 0 0 0 0|0 0|0|0 0 0 0|0 0|0 0|1 1 0 0 0 0 0 0|0 0 0 0 0 0 0 0| 656 |4 3 2 1 0|1 0|6|3 2 1 0|1 0|6 5|1 0 9 8 7 6 5 4|7 6 5 4 3 2 1 0| 657 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 658 | GAIN2 | GAIN1 | GAIN2 | GAIN3 | GRID | GAIN3 | 659 | | | | | | | 660 |0 0 0 0|1 1 0 0|1 1 0 0 0 0 0 0|0 0 0 0 0 0 0 0|0 0 0 0|1 1 0 0| 661 |3 2 1 0|1 0 9 8|1 0 9 8 7 6 5 4|7 6 5 4 3 2 1 0|3 2 1 0|1 0 9 8| 662 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 663 | MSBPOS |Z|POS| MSBPOS | POS0 |POS| POS0 | 664 | | | 0 | | | 1 | | 665 |0 0 0 0 0 0 0|0|0 0|1 1 1 0 0 0|0 0 0 0 0 0 0 0|0 0|1 1 1 1 1 1| 666 |6 5 4 3 2 1 0| |1 0|2 1 0 9 8 7|9 8 7 6 5 4 3 2|1 0|5 4 3 2 1 0| 667 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 668 | POS1 | POS2 | POS1 | POS2 | POS3 | POS2 | 669 | | | | | | | 670 |0 0 0 0 0 0 0 0|0 0 0 0|1 1 1 1|1 1 0 0 0 0 0 0|0 0 0 0|1 1 1 1| 671 |9 8 7 6 5 4 3 2|3 2 1 0|3 2 1 0|1 0 9 8 7 6 5 4|3 2 1 0|5 4 3 2| 672 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 673 | POS3 | PSIG0 |POS|PSIG2| PSIG1 | PSIG3 |PSIG2| 674 | | | 3 | | | | | 675 |1 1 0 0 0 0 0 0|0 0 0 0 0 0|1 1|0 0 0|0 0 0 0 0|0 0 0 0 0|0 0 0| 676 |1 0 9 8 7 6 5 4|5 4 3 2 1 0|3 2|2 1 0|4 3 2 1 0|4 3 2 1 0|5 4 3| 677 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 679 Figure 1: G.723 (6.3 kb/s) bit packing 681 For the 5.3 kb/s data rate, the header (HDR) bits are always "0 1", 682 as shown in Fig. 2, to indicate operation at 5.3 kb/s. 684 0 1 2 3 685 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 686 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 687 | LPC |HDR| LPC | LPC | ACL0 |LPC| 688 | | | | | | | 689 |0 0 0 0 0 0|0 1|1 1 1 1 0 0 0 0|2 2 1 1 1 1 1 1|0 0 0 0 0 0|2 2| 690 |5 4 3 2 1 0| |3 2 1 0 9 8 7 6|1 0 9 8 7 6 5 4|5 4 3 2 1 0|3 2| 691 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 692 | ACL2 |ACL|A| GAIN0 |ACL|ACL| GAIN0 | GAIN1 | 693 | | 1 |C| | 3 | 2 | | | 694 |0 0 0 0 0|0 0|0|0 0 0 0|0 0|0 0|1 1 0 0 0 0 0 0|0 0 0 0 0 0 0 0| 695 |4 3 2 1 0|1 0|6|3 2 1 0|1 0|6 5|1 0 9 8 7 6 5 4|7 6 5 4 3 2 1 0| 696 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 697 | GAIN2 | GAIN1 | GAIN2 | GAIN3 | GRID | GAIN3 | 698 | | | | | | | 699 |0 0 0 0|1 1 0 0|1 1 0 0 0 0 0 0|0 0 0 0 0 0 0 0|0 0 0 0|1 1 0 0| 700 |3 2 1 0|1 0 9 8|1 0 9 8 7 6 5 4|7 6 5 4 3 2 1 0|4 3 2 1|1 0 9 8| 701 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 702 | POS0 | POS1 | POS0 | POS1 | POS2 | 703 | | | | | | 704 |0 0 0 0 0 0 0 0|0 0 0 0|1 1 0 0|1 1 0 0 0 0 0 0|0 0 0 0 0 0 0 0| 705 |7 6 5 4 3 2 1 0|3 2 1 0|1 0 9 8|1 0 9 8 7 6 5 4|7 6 5 4 3 2 1 0| 706 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 707 | POS3 | POS2 | POS3 | PSIG1 | PSIG0 | PSIG3 | PSIG2 | 708 | | | | | | | | 709 |0 0 0 0|1 1 0 0|1 1 0 0 0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0|0 0 0 0| 710 |3 2 1 0|1 0 9 8|1 0 9 8 7 6 5 4|3 2 1 0|3 2 1 0|3 2 1 0|3 2 1 0| 711 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 713 Figure 2: G.723 (5.3 kb/s) bit packing 715 The packing of G.723.1 SID (silence) frames, which are indicated by 716 the header (HDR) bits having the pattern "1 0", is depicted in Fig. 717 3. 719 0 1 2 3 720 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 721 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 722 | LPC |HDR| LPC | LPC | GAIN |LPC| 723 | | | | | | | 724 |0 0 0 0 0 0|1 0|1 1 1 1 0 0 0 0|2 2 1 1 1 1 1 1|0 0 0 0 0 0|2 2| 725 |5 4 3 2 1 0| |3 2 1 0 9 8 7 6|1 0 9 8 7 6 5 4|5 4 3 2 1 0|3 2| 726 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 728 Figure 3: G.723 SID mode bit packing 730 4.5.4 G726-40, G726-32, G726-24, and G726-16 732 ITU-T Recommendation G.726 describes, among others, the algorithm 733 recommended for conversion of a single 64 kbit/s A-law or mu-law PCM 734 channel encoded at 8000 samples/sec to and from a 40, 32, 24, or 16 735 kbit/s channel. The conversion is applied to the PCM stream using an 736 Adaptive Differential Pulse Code Modulation (ADPCM) transcoding 737 technique. The ADPCM representation consists of a series of codewords 738 with a one-to-one correspondence to the samples in the PCM stream. 739 The G726 data rates of 40, 32, 24, and 16 kbit/s have codewords of 5, 740 4, 3, and 2 bits respectively. 742 The 16 and 24 kbit/s encodings do not provide toll quality speech. 743 They are designed for use in overloaded Digital Circuit 744 Multiplication Equipment (DCME). ITU-T G.726 recommends that the 16 745 and 24 kbit/s encodings should be alternated with higher data rate 746 encodings to provide an average sample size of between 3.5 and 3.7 747 bits per sample. 749 The encodings of G.726 are here denoted as G726-40, G726-32, G726-24, 750 and G726-16. Prior to 1990, G721 described the 32 kbit/s ADPCM 751 encoding, and G723 described the 40, 32, and 16 kbit/s encodings.
752 Thus, G726-32 designates the same algorithm as G721 in RFC 1890. 754 A stream of G726 codewords contains no information on the encoding 755 being used; therefore, transitions between G726 encoding types are not 756 permitted within a sequence of packed codewords. Applications MUST 757 determine the encoding type of packed codewords from the RTP payload 758 identifier. 760 No payload-specific header information SHALL be included as part of 761 the audio data. A stream of G726 codewords MUST be packed into octets 762 as follows: the first codeword is placed into the first octet such 763 that the least significant bit of the codeword aligns with the least 764 significant bit in the octet, the second codeword is then packed so 765 that its least significant bit coincides with the least significant 766 unoccupied bit in the octet. When a complete codeword cannot be 767 placed into an octet, the bits overlapping the octet boundary are 768 placed into the least significant bits of the next octet. Packing 769 MUST end with a completely packed final octet. The number of 770 codewords packed will therefore be a multiple of 8, 2, 8, and 4 for 771 G726-40, G726-32, G726-24, and G726-16 respectively. An example of 772 the packing scheme for G726-32 codewords is as shown: 774 0 1 775 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 776 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- 777 |B B B B|A A A A|D D D D|C C C C| ... 778 |0 1 2 3|0 1 2 3|0 1 2 3|0 1 2 3| 779 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- 781 An example of the packing scheme for G726-24 codewords is: 783 0 1 2 784 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 785 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- 786 |C C|B B B|A A A|F|E E E|D D D|C|H H H|G G G|F F| ... 787 |1 2|0 1 2|0 1 2|2|0 1 2|0 1 2|0|0 1 2|0 1 2|0 1| 788 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+- 790 4.5.5 G728 792 G728 is specified in ITU-T Recommendation G.728, "Coding of speech at 793 16 kbit/s using low-delay code excited linear prediction".
795 A G.728 encoder translates 5 consecutive audio samples into a 10-bit 796 codebook index, resulting in a bit rate of 16 kb/s for audio sampled 797 at 8,000 samples per second. The group of five consecutive samples is 798 called a vector. Four consecutive vectors, labeled V1 to V4 (where V1 799 is to be played first by the receiver), build one G.728 frame. The 800 four vectors of 40 bits are packed into 5 octets, labeled B1 through 801 B5. B1 SHALL be placed first in the RTP packet. 803 Referring to the figure below, the principle for bit order is 804 "maintenance of bit significance". Bits from an older vector are more 805 significant than bits from newer vectors. The MSB of the frame goes 806 to the MSB of B1 and the LSB of the frame goes to LSB of B5. 808 1 2 3 3 809 0 0 0 0 9 810 ++++++++++++++++++++++++++++++++++++++++ 811 <---V1---><---V2---><---V3---><---V4---> vectors 812 <--B1--><--B2--><--B3--><--B4--><--B5--> octets 813 <------------- frame 1 ----------------> 815 In particular, B1 contains the eight most significant bits of V1, 816 with the MSB of V1 being the MSB of B1. B2 contains the two least 817 significant bits of V1, the more significant of the two in its MSB, 818 and the six most significant bits of V2. B1 SHALL be placed first in 819 the RTP packet and B5 last. 821 4.5.6 G729 823 G729 is specified in ITU-T Recommendation G.729, "Coding of speech at 824 8 kbit/s using conjugate structure-algebraic code excited linear 825 prediction (CS-ACELP)". A reduced-complexity version of the G.729 826 algorithm is specified in Annex A to Rec. G.729. The speech coding 827 algorithms in the main body of G.729 and in G.729 Annex A are fully 828 interoperable with each other, so there is no need to further 829 distinguish between them. The G.729 and G.729 Annex A codecs were 830 optimized to represent speech with high quality, where G.729 Annex A 831 trades some speech quality for an approximate 50% complexity 832 reduction [10].
See the next Section (4.5.7) for other data rates 833 added in later G.729 Annexes. For all data rates, the sampling 834 frequency (and RTP timestamp clock rate) is 8000 Hz. 836 A voice activity detector (VAD) and comfort noise generator (CNG) 837 algorithm in Annex B of G.729 is RECOMMENDED for digital simultaneous 838 voice and data applications and can be used in conjunction with G.729 839 or G.729 Annex A. A G.729 or G.729 Annex A frame contains 10 octets, 840 while the G.729 Annex B comfort noise frame occupies 2 octets. The 841 MIME registration for G729 in RFC YYYY [7] specifies a parameter that 842 MAY be used with MIME or SDP to restrict the use of comfort noise 843 frames. 845 A G729 RTP packet may consist of zero or more G.729 or G.729 Annex A 846 frames, followed by zero or one G.729 Annex B frame. The presence of 847 a comfort noise frame can be deduced from the length of the RTP 848 payload. The default packetization interval is 20 ms (two frames), 849 but in some situations it may be desirable to send 10 ms packets. An 850 example would be a transition from speech to comfort noise in the 851 first 10 ms of the packet. For some applications, a longer 852 packetization interval may be required to reduce the packet rate. 854 The transmitted parameters of a G.729/G.729A 10-ms frame, consisting 855 of 80 bits, are defined in Recommendation G.729, Table 8/G.729. The 856 mapping of these parameters is given below in Fig. 4. The 857 diagrams show the bit packing in "network byte order," also known as 858 big-endian order. The bits of each 32-bit word are numbered 0 to 31, 859 with the most significant bit on the left and numbered 0. The octets 860 (bytes) of each word are transmitted most significant octet first. 861 The bits of each data field are numbered in the order as produced by 862 the G.729 C code reference implementation.
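The payload-length rule stated earlier in this section (zero or more 10-octet G.729 or G.729 Annex A frames, followed by at most one 2-octet Annex B comfort noise frame) might be applied by a receiver roughly as sketched below. The function and constant names are hypothetical, and rejecting any other payload length with an error is an assumption of this sketch rather than a requirement of the profile.

```python
G729_FRAME_OCTETS = 10      # G.729 or G.729 Annex A speech frame
G729B_SID_OCTETS = 2        # G.729 Annex B comfort noise frame


def split_g729_payload(payload):
    """Return (speech_frames, sid_frame_or_None) for a G729 RTP payload."""
    speech_count, remainder = divmod(len(payload), G729_FRAME_OCTETS)
    if remainder not in (0, G729B_SID_OCTETS):
        raise ValueError("payload length does not match G729 framing")
    frames = [
        payload[i * G729_FRAME_OCTETS:(i + 1) * G729_FRAME_OCTETS]
        for i in range(speech_count)
    ]
    # A trailing 2-octet remainder is the comfort noise frame, if present.
    sid = payload[speech_count * G729_FRAME_OCTETS:] if remainder else None
    return frames, sid
```

A 22-octet payload, for example, is read as two speech frames followed by one comfort noise frame, while a 20-octet payload carries speech only.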
864 0 1 2 3 865 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 866 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 867 |L| L1 | L2 | L3 | P1 |P| C1 | 868 |0| | | | |0| | 869 | |0 1 2 3 4 5 6|0 1 2 3 4|0 1 2 3 4|0 1 2 3 4 5 6 7| |0 1 2 3 4| 870 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 871 | C1 | S1 | GA1 | GB1 | P2 | C2 | 872 | 1 1 1| | | | | | 873 |5 6 7 8 9 0 1 2|0 1 2 3|0 1 2|0 1 2 3|0 1 2 3 4|0 1 2 3 4 5 6 7| 874 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 875 | C2 | S2 | GA2 | GB2 | 876 | 1 1 1| | | | 877 |8 9 0 1 2|0 1 2 3|0 1 2|0 1 2 3| 878 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 880 Figure 4: G.729 and G.729A bit packing 882 The packing of the G.729 Annex B comfort noise frame is shown in Fig. 883 5. 885 0 1 886 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 887 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 888 |L| LSF1 | LSF2 | GAIN |R| 889 |S| | | |E| 890 |F| | | |S| 891 |0|0 1 2 3 4|0 1 2 3|0 1 2 3 4|V| RESV = Reserved (zero) 892 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 894 Figure 5: G.729 Annex B bit packing 896 4.5.7 G729D and G729E 898 Annexes D and E to ITU-T Recommendation G.729 provide additional data 899 rates. Because the data rate is not signaled in the bitstream, the 900 different data rates are given distinct RTP encoding names which are 901 mapped to distinct payload type numbers. G729D indicates a 6.4 kbit/s 902 coding mode (G.729 Annex D, for momentary reduction in channel 903 capacity), while G729E indicates an 11.8 kbit/s mode (G.729 Annex E, 904 for improved performance with a wide range of narrow-band input 905 signals, e.g. music and background noise). Annex E has two operating 906 modes, backward adaptive and forward adaptive, which are signaled by 907 the first two bits in each frame (the most significant two bits of 908 the first octet). 
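As a rough illustration of the Annex E mode signaling just described, the sketch below inspects the two most significant bits of the first octet of a G729E frame. The bit patterns "0 0" (forward adaptive) and "1 1" (backward adaptive) are taken from Figures 7 and 8 of this document; the function name and the decision to raise an error for other patterns are assumptions of this sketch.

```python
def g729e_mode(frame):
    """Classify a G.729 Annex E frame by the top two bits of its first octet."""
    if len(frame) == 0:
        raise ValueError("empty frame")
    mode_bits = frame[0] >> 6          # two most significant bits of octet 0
    if mode_bits == 0b00:
        return "forward"
    if mode_bits == 0b11:
        return "backward"
    raise ValueError("unexpected Annex E mode bits: {:02b}".format(mode_bits))
```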
910 The voice activity detector (VAD) and comfort noise generator (CNG) 911 algorithm specified in Annex B of G.729 may be used with Annex D and 912 Annex E frames in addition to G.729 and G.729 Annex A frames. The 913 algorithm details for the operation of Annexes D and E with the Annex 914 B CNG are specified in G.729 Annexes F and G. Note that Annexes F and 915 G do not introduce any new encodings. The MIME registrations for 916 G729D and G729E in RFC YYYY [7] specify a parameter that MAY be used 917 with MIME or SDP to restrict the use of comfort noise frames. 919 For G729D, an RTP packet may consist of zero or more G.729 Annex D 920 frames, followed by zero or one G.729 Annex B frame. Similarly, for 921 G729E, an RTP packet may consist of zero or more G.729 Annex E 922 frames, followed by zero or one G.729 Annex B frame. The presence of 923 a comfort noise frame can be deduced from the length of the RTP 924 payload. 926 A single RTP packet must contain frames of only one data rate, 927 optionally followed by one comfort noise frame. The data rate may be 928 changed from packet to packet by changing the payload type number. 929 G.729 Annexes D, E and H describe what the encoding and decoding 930 algorithms must do to accommodate a change in data rate. 932 For G729D, the bits of a G.729 Annex D frame are formatted as shown 933 below in Fig. 6 (cf. Table D.1/G.729). The frame length is 64 bits. 
935 0 1 2 3 936 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 937 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 938 |L| L1 | L2 | L3 | P1 | C1 | 939 |0| | | | | | 940 | |0 1 2 3 4 5 6|0 1 2 3 4|0 1 2 3 4|0 1 2 3 4 5 6 7|0 1 2 3 4 5| 941 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 942 | C1 |S1 | GA1 | GB1 | P2 | C2 |S2 | GA2 | GB2 | 943 | | | | | | | | | | 944 |6 7 8|0 1|0 1 2|0 1 2|0 1 2 3|0 1 2 3 4 5 6 7 8|0 1|0 1 2|0 1 2| 945 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 947 Figure 6: G.729 Annex D bit packing 949 The net bit rate for the G.729 Annex E algorithm is 11.8 kbit/s and a 950 total of 118 bits are used. Two bits are appended as "don't care" 951 bits to complete an integer number of octets for the frame. For 952 G729E, the bits of a data frame are formatted as shown in the next 953 two diagrams (cf. Table E.1/G.729). The fields for the G729E forward 954 adaptive mode are packed as shown in Fig. 7. 
956 0 1 2 3 957 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 958 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 959 |0 0|L| L1 | L2 | L3 | P1 |P| C0_1| 960 | |0| | | | |0| | 961 | | |0 1 2 3 4 5 6|0 1 2 3 4|0 1 2 3 4|0 1 2 3 4 5 6 7| |0 1 2| 962 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 963 | | C1_1 | C2_1 | C3_1 | C4_1 | 964 | | | | | | 965 |3 4 5 6|0 1 2 3 4 5 6|0 1 2 3 4 5 6|0 1 2 3 4 5 6|0 1 2 3 4 5 6| 966 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 967 | GA1 | GB1 | P2 | C0_2 | C1_2 | C2_2 | 968 | | | | | | | 969 |0 1 2|0 1 2 3|0 1 2 3 4|0 1 2 3 4 5 6|0 1 2 3 4 5 6|0 1 2 3 4 5| 970 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 971 | | C3_2 | C4_2 | GA2 | GB2 |DC | 972 | | | | | | | 973 |6|0 1 2 3 4 5 6|0 1 2 3 4 5 6|0 1 2|0 1 2 3|0 1| 974 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 976 Figure 7: G.729 Annex E (forward adaptive mode) bit packing 978 The fields for the G729E backward adaptive mode are packed as shown 979 in Fig. 8. 
981 0 1 2 3 982 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 983 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 984 |1 1| P1 |P| C0_1 | C1_1 | 985 | | |0| 1 1 1| | 986 | |0 1 2 3 4 5 6 7|0|0 1 2 3 4 5 6 7 8 9 0 1 2|0 1 2 3 4 5 6 7| 987 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 988 | | C2_1 | C3_1 | C4_1 |GA1 | GB1 |P2 | 989 | | | | | | | | 990 |8 9|0 1 2 3 4 5 6|0 1 2 3 4 5 6|0 1 2 3 4 5 6|0 1 2|0 1 2 3|0 1| 991 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 992 | | C0_2 | C1_2 | C2_2 | 993 | | 1 1 1| | | 994 |2 3 4|0 1 2 3 4 5 6 7 8 9 0 1 2|0 1 2 3 4 5 6 7 8 9|0 1 2 3 4 5| 995 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 996 | | C3_2 | C4_2 | GA2 | GB2 |DC | 997 | | | | | | | 998 |6|0 1 2 3 4 5 6|0 1 2 3 4 5 6|0 1 2|0 1 2 3|0 1| 999 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1001 Figure 8: G.729 Annex E (backward adaptive mode) bit packing 1003 4.5.8 GSM 1005 GSM (group speciale mobile) denotes the European GSM 06.10 standard 1006 for full-rate speech transcoding, ETS 300 961, which is based on 1007 RPE/LTP (residual pulse excitation/long term prediction) coding at a 1008 rate of 13 kb/s [11,12,13]. The text of the standard can be obtained 1009 from 1011 ETSI (European Telecommunications Standards Institute) 1012 ETSI Secretariat: B.P.152 1013 F-06561 Valbonne Cedex 1014 France 1015 Phone: +33 92 94 42 00 1016 Fax: +33 93 65 47 16 1018 Blocks of 160 audio samples are compressed into 33 octets, for an 1019 effective data rate of 13,200 b/s. 1021 4.5.8.1 General Packaging Issues 1023 The GSM standard (ETS 300 961) specifies the bit stream produced by 1024 the codec, but does not specify how these bits should be packed for 1025 transmission. The packetization specified here has subsequently been 1026 adopted in ETSI Technical Specification TS 101 318. Some software 1027 implementations of the GSM codec use a different packing than that 1028 specified here. 
1030 In the GSM packing used by RTP, the bits SHALL be packed beginning 1031 from the most significant bit. Every 160-sample GSM frame is coded 1032 into one 33-octet (264-bit) buffer. Every such buffer begins with a 4 1033 bit signature (0xD), followed by the MSB encoding of the fields of 1034 the frame. The first octet thus contains 1101 in the 4 most 1035 significant bits (0-3) and the 4 most significant bits of F1 (0-3) in 1036 the 4 least significant bits (4-7). The second octet contains the 2 1037 least significant bits of F1 in bits 0-1, and F2 in bits 2-7, and so 1038 on. The order of the fields in the frame is described in Table 2. 1040 4.5.8.2 GSM variable names and numbers 1042 In the RTP encoding we have the bit pattern described in Table 3, 1043 where F.i signifies the ith bit of the field F, bit 0 is the most 1044 significant bit, and the bits of every octet are numbered from 0 to 7 1045 from most to least significant. 1047 4.5.9 GSM-EFR 1049 GSM-EFR denotes GSM 06.60 enhanced full rate speech transcoding, 1050 specified in ETS 300 969, which is available from ETSI at the address 1051 given in Section 4.5.8. This codec has a frame length of 244 bits. 1052 For transmission in RTP, each codec frame is packed into a 31-octet 1053 (248-bit) buffer beginning with a 4-bit signature 0xC in a manner 1054 similar to that specified here for the original GSM 06.10 codec. The 1055 packing is specified in ETSI Technical Specification TS 101 318. 1057 4.5.10 L8 1059 L8 denotes linear audio data samples, using 8 bits of precision with 1060 an offset of 128, that is, the most negative signal is encoded as 1061 zero. 1063 4.5.11 L16 1065 L16 denotes uncompressed audio data samples, using 16-bit signed 1066 representation with 65535 equally divided steps between minimum and 1067 maximum signal level, ranging from -32768 to 32767. The value is 1068 represented in two's complement notation and transmitted in network 1069 byte order (most significant byte first).
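The L16 sample representation can be illustrated with a short sketch that serializes signed 16-bit samples in network byte order. It assumes a single channel and uses hypothetical function names; interleaving of multiple channels is outside the scope of this sketch.

```python
import struct


def encode_l16(samples):
    """Serialize signed 16-bit samples, most significant byte first."""
    # ">h" is big-endian two's complement; out-of-range values raise struct.error.
    return struct.pack(">%dh" % len(samples), *samples)


def decode_l16(payload):
    """Parse an L16 payload back into a list of signed 16-bit samples."""
    if len(payload) % 2 != 0:
        raise ValueError("L16 payload must contain whole 16-bit samples")
    return list(struct.unpack(">%dh" % (len(payload) // 2), payload))
```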
1071 The MIME registration for L16 in RFC YYYY [7] specifies parameters 1072 field field name bits field field name bits 1073 ________________________________________________ 1074 1 LARc[0] 6 39 xmc[22] 3 1075 2 LARc[1] 6 40 xmc[23] 3 1076 3 LARc[2] 5 41 xmc[24] 3 1077 4 LARc[3] 5 42 xmc[25] 3 1078 5 LARc[4] 4 43 Nc[2] 7 1079 6 LARc[5] 4 44 bc[2] 2 1080 7 LARc[6] 3 45 Mc[2] 2 1081 8 LARc[7] 3 46 xmaxc[2] 6 1082 9 Nc[0] 7 47 xmc[26] 3 1083 10 bc[0] 2 48 xmc[27] 3 1084 11 Mc[0] 2 49 xmc[28] 3 1085 12 xmaxc[0] 6 50 xmc[29] 3 1086 13 xmc[0] 3 51 xmc[30] 3 1087 14 xmc[1] 3 52 xmc[31] 3 1088 15 xmc[2] 3 53 xmc[32] 3 1089 16 xmc[3] 3 54 xmc[33] 3 1090 17 xmc[4] 3 55 xmc[34] 3 1091 18 xmc[5] 3 56 xmc[35] 3 1092 19 xmc[6] 3 57 xmc[36] 3 1093 20 xmc[7] 3 58 xmc[37] 3 1094 21 xmc[8] 3 59 xmc[38] 3 1095 22 xmc[9] 3 60 Nc[3] 7 1096 23 xmc[10] 3 61 bc[3] 2 1097 24 xmc[11] 3 62 Mc[3] 2 1098 25 xmc[12] 3 63 xmaxc[3] 6 1099 26 Nc[1] 7 64 xmc[39] 3 1100 27 bc[1] 2 65 xmc[40] 3 1101 28 Mc[1] 2 66 xmc[41] 3 1102 29 xmaxc[1] 6 67 xmc[42] 3 1103 30 xmc[13] 3 68 xmc[43] 3 1104 31 xmc[14] 3 69 xmc[44] 3 1105 32 xmc[15] 3 70 xmc[45] 3 1106 33 xmc[16] 3 71 xmc[46] 3 1107 34 xmc[17] 3 72 xmc[47] 3 1108 35 xmc[18] 3 73 xmc[48] 3 1109 36 xmc[19] 3 74 xmc[49] 3 1110 37 xmc[20] 3 75 xmc[50] 3 1111 38 xmc[21] 3 76 xmc[51] 3 1113 Table 2: Ordering of GSM variables 1115 that MAY be used with MIME or SDP to indicate that analog preemphasis 1116 was applied to the signal before quantization or to indicate that a 1117 multiple-channel audio stream follows a different channel ordering 1118 convention than is specified in Section 4.1. 
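To make the GSM packing of Section 4.5.8.1 more concrete, the following sketch packs the 76 fields of Table 2 into a 33-octet buffer, most significant bit first, behind the 4-bit signature 0xD. The field widths are copied from Table 2; the function and constant names are hypothetical, and the caller is assumed to supply the field values in Table 2 order.

```python
GSM_LAR_BITS = [6, 6, 5, 5, 4, 4, 3, 3]           # LARc[0..7]
GSM_SUBFRAME_BITS = [7, 2, 2, 6] + [3] * 13       # Nc, bc, Mc, xmaxc, 13 x xmc
GSM_FIELD_BITS = GSM_LAR_BITS + GSM_SUBFRAME_BITS * 4   # 76 fields, 260 bits
GSM_SIGNATURE = 0xD


def pack_gsm_frame(fields):
    """Pack GSM 06.10 field values (Table 2 order) into 33 octets, MSB first."""
    if len(fields) != len(GSM_FIELD_BITS):
        raise ValueError("expected 76 field values")
    acc = GSM_SIGNATURE                 # 4-bit signature leads the buffer
    for value, width in zip(fields, GSM_FIELD_BITS):
        if not 0 <= value < (1 << width):
            raise ValueError("field value does not fit its width")
        acc = (acc << width) | value    # append each field below the previous one
    return acc.to_bytes(33, "big")      # 4 + 260 bits = 264 bits = 33 octets
```

With LARc[0] set to all ones and every other field zero, the first two octets come out as 0xDF and 0xC0, matching the description of the first and second octets in Section 4.5.8.1.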
1120 Octet Bit 0 Bit 1 Bit 2 Bit 3 Bit 4 Bit 5 Bit 6 Bit 7 1121 _____________________________________________________________________ 1122 0 1 1 0 1 LARc0.0 LARc0.1 LARc0.2 LARc0.3 1123 1 LARc0.4 LARc0.5 LARc1.0 LARc1.1 LARc1.2 LARc1.3 LARc1.4 LARc1.5 1124 2 LARc2.0 LARc2.1 LARc2.2 LARc2.3 LARc2.4 LARc3.0 LARc3.1 LARc3.2 1125 3 LARc3.3 LARc3.4 LARc4.0 LARc4.1 LARc4.2 LARc4.3 LARc5.0 LARc5.1 1126 4 LARc5.2 LARc5.3 LARc6.0 LARc6.1 LARc6.2 LARc7.0 LARc7.1 LARc7.2 1127 5 Nc0.0 Nc0.1 Nc0.2 Nc0.3 Nc0.4 Nc0.5 Nc0.6 bc0.0 1128 6 bc0.1 Mc0.0 Mc0.1 xmaxc00 xmaxc01 xmaxc02 xmaxc03 xmaxc04 1129 7 xmaxc05 xmc0.0 xmc0.1 xmc0.2 xmc1.0 xmc1.1 xmc1.2 xmc2.0 1130 8 xmc2.1 xmc2.2 xmc3.0 xmc3.1 xmc3.2 xmc4.0 xmc4.1 xmc4.2 1131 9 xmc5.0 xmc5.1 xmc5.2 xmc6.0 xmc6.1 xmc6.2 xmc7.0 xmc7.1 1132 10 xmc7.2 xmc8.0 xmc8.1 xmc8.2 xmc9.0 xmc9.1 xmc9.2 xmc10.0 1133 11 xmc10.1 xmc10.2 xmc11.0 xmc11.1 xmc11.2 xmc12.0 xmc12.1 xmc12.2 1134 12 Nc1.0 Nc1.1 Nc1.2 Nc1.3 Nc1.4 Nc1.5 Nc1.6 bc1.0 1135 13 bc1.1 Mc1.0 Mc1.1 xmaxc10 xmaxc11 xmaxc12 xmaxc13 xmaxc14 1136 14 xmaxc15 xmc13.0 xmc13.1 xmc13.2 xmc14.0 xmc14.1 xmc14.2 xmc15.0 1137 15 xmc15.1 xmc15.2 xmc16.0 xmc16.1 xmc16.2 xmc17.0 xmc17.1 xmc17.2 1138 16 xmc18.0 xmc18.1 xmc18.2 xmc19.0 xmc19.1 xmc19.2 xmc20.0 xmc20.1 1139 17 xmc20.2 xmc21.0 xmc21.1 xmc21.2 xmc22.0 xmc22.1 xmc22.2 xmc23.0 1140 18 xmc23.1 xmc23.2 xmc24.0 xmc24.1 xmc24.2 xmc25.0 xmc25.1 xmc25.2 1141 19 Nc2.0 Nc2.1 Nc2.2 Nc2.3 Nc2.4 Nc2.5 Nc2.6 bc2.0 1142 20 bc2.1 Mc2.0 Mc2.1 xmaxc20 xmaxc21 xmaxc22 xmaxc23 xmaxc24 1143 21 xmaxc25 xmc26.0 xmc26.1 xmc26.2 xmc27.0 xmc27.1 xmc27.2 xmc28.0 1144 22 xmc28.1 xmc28.2 xmc29.0 xmc29.1 xmc29.2 xmc30.0 xmc30.1 xmc30.2 1145 23 xmc31.0 xmc31.1 xmc31.2 xmc32.0 xmc32.1 xmc32.2 xmc33.0 xmc33.1 1146 24 xmc33.2 xmc34.0 xmc34.1 xmc34.2 xmc35.0 xmc35.1 xmc35.2 xmc36.0 1147 25 xmc36.1 xmc36.2 xmc37.0 xmc37.1 xmc37.2 xmc38.0 xmc38.1 xmc38.2 1148 26 Nc3.0 Nc3.1 Nc3.2 Nc3.3 Nc3.4 Nc3.5 Nc3.6 bc3.0 1149 27 bc3.1 Mc3.0 Mc3.1 xmaxc30 xmaxc31 xmaxc32 xmaxc33 xmaxc34
1150 28 xmaxc35 xmc39.0 xmc39.1 xmc39.2 xmc40.0 xmc40.1 xmc40.2 xmc41.0 1151 29 xmc41.1 xmc41.2 xmc42.0 xmc42.1 xmc42.2 xmc43.0 xmc43.1 xmc43.2 1152 30 xmc44.0 xmc44.1 xmc44.2 xmc45.0 xmc45.1 xmc45.2 xmc46.0 xmc46.1 1153 31 xmc46.2 xmc47.0 xmc47.1 xmc47.2 xmc48.0 xmc48.1 xmc48.2 xmc49.0 1154 32 xmc49.1 xmc49.2 xmc50.0 xmc50.1 xmc50.2 xmc51.0 xmc51.1 xmc51.2 1156 Table 3: GSM payload format 1158 4.5.12 LPC 1160 LPC designates an experimental linear predictive encoding contributed 1161 by Ron Frederick, which is based on an implementation written by Ron 1162 Zuckerman posted to the Usenet group comp.dsp on June 26, 1992. The 1163 codec generates 14 octets for every frame. The framesize is set to 20 1164 ms, resulting in a bit rate of 5,600 b/s. 1166 4.5.13 MPA 1168 MPA denotes MPEG-1 or MPEG-2 audio encapsulated as elementary 1169 streams. The encoding is defined in ISO standards ISO/IEC 11172-3 1170 and 13818-3. The encapsulation is specified in RFC 2250 [14]. 1172 The encoding may be at any of three levels of complexity, called 1173 Layer I, II and III. The selected layer as well as the sampling rate 1174 and channel count are indicated in the payload. The RTP timestamp 1175 clock rate is always 90000, independent of the sampling rate. MPEG-1 1176 audio supports sampling rates of 32, 44.1, and 48 kHz (ISO/IEC 1177 11172-3, section 1.1; "Scope"). MPEG-2 supports sampling rates of 16, 1178 22.05 and 24 kHz. The number of samples per frame is fixed, but the 1179 frame size will vary with the sampling rate and bit rate. 1181 The MIME registration for MPA in RFC YYYY [7] specifies parameters 1182 that MAY be used with MIME or SDP to restrict the selection of layer, 1183 channel count, sampling rate, and bit rate. 1185 4.5.14 PCMA and PCMU 1187 PCMA and PCMU are specified in ITU-T Recommendation G.711. Audio data 1188 is encoded as eight bits per sample, after logarithmic scaling. PCMU 1189 denotes mu-law scaling, PCMA A-law scaling. 
A detailed description is 1190 given by Jayant and Noll [15]. Each G.711 octet SHALL be octet- 1191 aligned in an RTP packet. The sign bit of each G.711 octet SHALL 1192 correspond to the most significant bit of the octet in the RTP packet 1193 (i.e., assuming the G.711 samples are handled as octets on the host 1194 machine, the sign bit SHALL be the most significant bit of the octet 1195 as defined by the host machine format). The 56 kb/s and 48 kb/s modes 1196 of G.711 are not applicable to RTP, since PCMA and PCMU MUST always 1197 be transmitted as 8-bit samples. 1199 4.5.15 QCELP 1201 The Electronic Industries Association (EIA) & Telecommunications 1202 Industry Association (TIA) standard IS-733, "TR45: High Rate Speech 1203 Service Option for Wideband Spread Spectrum Communications Systems," 1204 defines the QCELP audio compression algorithm for use in wireless 1205 CDMA applications. The QCELP CODEC compresses each 20 milliseconds of 1206 8000 Hz, 16-bit sampled input speech into one of four different size 1207 output frames: Rate 1 (266 bits), Rate 1/2 (124 bits), Rate 1/4 (54 1208 bits) or Rate 1/8 (20 bits). For typical speech patterns, this 1209 results in an average output of 6.8 kbit/s for normal mode and 1210 4.7 kbit/s for reduced rate mode. The packetization of the QCELP 1211 audio codec is described in [16]. 1213 4.5.16 RED 1214 The redundant audio payload format "RED" is specified by RFC 2198 1215 [17]. It defines a means by which multiple redundant copies of an 1216 audio packet may be transmitted in a single RTP stream. Each packet 1217 in such a stream contains, in addition to the audio data for that 1218 packetization interval, a (more heavily compressed) copy of the data 1219 from a previous packetization interval. This allows an approximation 1220 of the data from lost packets to be recovered upon decoding of a 1221 subsequent packet, giving much improved sound quality when compared 1222 with silence substitution for lost packets.
1224 4.5.17 VDVI

1226   VDVI is a variable-rate version of DVI4, yielding speech bit rates of
1227   between 10 and 25 kb/s. It is specified for single-channel operation
1228   only. Samples are packed into octets starting at the
1229   most-significant bit. The last octet is padded with 1 bits if the last
1230   sample does not fill the last octet. This padding is distinct from
1231   the valid codewords. The receiver needs to detect the padding
1232   because there is no explicit count of samples in the packet.

1234   It uses the following encoding:

1236        DVI4 codeword    VDVI bit pattern
1237        _______________________________
1238                    0    00
1239                    1    010
1240                    2    1100
1241                    3    11100
1242                    4    111100
1243                    5    1111100
1244                    6    11111100
1245                    7    11111110
1246                    8    10
1247                    9    011
1248                   10    1101
1249                   11    11101
1250                   12    111101
1251                   13    1111101
1252                   14    11111101
1253                   15    11111111

1255 5 Video

1257   The following sections describe the video encodings that are defined
1258   in this memo and give their abbreviated names used for
1259   identification. These video encodings and their payload types are
1260   listed in Table 5.

1262   All of these video encodings use an RTP timestamp frequency of 90,000
1263   Hz, the same as the MPEG presentation time stamp frequency. This
1264   frequency yields exact integer timestamp increments for the typical
1265   24 (HDTV), 25 (PAL), 29.97 (NTSC) and 30 Hz (HDTV) frame rates
1266   and 50, 59.94 and 60 Hz field rates. While 90 kHz is the RECOMMENDED
1267   rate for future video encodings used within this profile, other rates
1268   MAY be used. However, it is not sufficient to use the video frame
1269   rate (typically between 15 and 30 Hz) because that does not provide
1270   adequate resolution for typical synchronization requirements when
1271   calculating the RTP timestamp corresponding to the NTP timestamp in
1272   an RTCP SR packet. The timestamp resolution MUST also be sufficient
1273   for the jitter estimate contained in the receiver reports.
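The arithmetic behind the 90 kHz choice can be checked with a short calculation for the frame rates listed above. This is a minimal sketch: the function name timestamp_increment is hypothetical, and the nominal 29.97 Hz NTSC rate is assumed to be exactly 30000/1001 Hz.

```python
from fractions import Fraction

CLOCK_RATE = 90000  # RTP timestamp frequency for video in this profile (Hz)


def timestamp_increment(frame_rate: Fraction) -> Fraction:
    """Timestamp ticks between consecutive frames at the given frame rate."""
    return Fraction(CLOCK_RATE) / frame_rate


# Assumption: "29.97 Hz" means exactly 30000/1001 Hz.
frame_rates = {
    "24 Hz (HDTV)": Fraction(24),
    "25 Hz (PAL)": Fraction(25),
    "29.97 Hz (NTSC)": Fraction(30000, 1001),
    "30 Hz (HDTV)": Fraction(30),
}

increments = {name: timestamp_increment(rate) for name, rate in frame_rates.items()}

for name, ticks in increments.items():
    # A denominator of 1 means the increment is an exact integer.
    print(name, ticks, ticks.denominator == 1)
```

Using exact rational arithmetic avoids the rounding that a floating-point 29.97 would introduce, which would otherwise hide whether the increment is truly an integer.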
1275 For most of these video encodings, the RTP timestamp encodes the 1276 sampling instant of the video image contained in the RTP data packet. 1277 If a video image occupies more than one packet, the timestamp is the 1278 same on all of those packets. Packets from different video images are 1279 distinguished by their different timestamps. 1281 Most of these video encodings also specify that the marker bit of the 1282 RTP header SHOULD be set to one in the last packet of a video frame 1283 and otherwise set to zero. Thus, it is not necessary to wait for a 1284 following packet with a different timestamp to detect that a new 1285 frame should be displayed. 1287 5.1 CelB 1289 The CELL-B encoding is a proprietary encoding proposed by Sun 1290 Microsystems. The byte stream format is described in RFC 2029 [18]. 1292 5.2 JPEG 1294 The encoding is specified in ISO Standards 10918-1 and 10918-2. The 1295 RTP payload format is as specified in RFC 2435 [19]. 1297 5.3 H261 1299 The encoding is specified in ITU-T Recommendation H.261, "Video codec 1300 for audiovisual services at p x 64 kbit/s". The packetization and 1301 RTP-specific properties are described in RFC 2032 [20]. 1303 5.4 H263 1305 The encoding is specified in the 1996 version of ITU-T Recommendation 1306 H.263, "Video coding for low bit rate communication". The 1307 packetization and RTP-specific properties are described in RFC 2190 1308 [21]. The H263-1998 payload format is RECOMMENDED over this one for 1309 use by new implementations. 1311 5.5 H263-1998 1313 The encoding is specified in the 1998 version of ITU-T Recommendation 1314 H.263, "Video coding for low bit rate communication". The 1315 packetization and RTP-specific properties are described in RFC 2429 1316 [22]. Because the 1998 version of H.263 is a superset of the 1996 1317 syntax, this payload format can also be used with the 1996 version of 1318 H.263, and is RECOMMENDED for this use by new implementations. 
This 1319 payload format does not replace RFC 2190, which continues to be used 1320 by existing implementations, and may be required for backward 1321 compatibility in new implementations. Implementations using the new 1322 features of the 1998 version of H.263 MUST use the payload format 1323 described in RFC 2429. 1325 5.6 MPV 1327 MPV designates the use of MPEG-1 and MPEG-2 video encoding elementary 1328 streams as specified in ISO Standards ISO/IEC 11172 and 13818-2, 1329 respectively. The RTP payload format is as specified in RFC 2250 1330 [14], Section 3. 1332 The MIME registration for MPV in RFC YYYY [7] specifies a parameter 1333 that MAY be used with MIME or SDP to restrict the selection of the 1334 type of MPEG video. 1336 5.7 MP2T 1338 MP2T designates the use of MPEG-2 transport streams, for either audio 1339 or video. The RTP payload format is described in RFC 2250 [14], 1340 Section 2. 1342 5.8 nv 1344 The encoding is implemented in the program `nv', version 4, developed 1345 at Xerox PARC by Ron Frederick. Further information is available from 1346 the author: 1348 Ron Frederick 1349 Cacheflow Inc. 1350 650 Almanor Avenue 1351 Sunnyvale, CA 94085 1352 United States 1353 electronic mail: ronf@cacheflow.com 1355 6 Payload Type Definitions 1357 Tables 4 and 5 define this profile's static payload type values for 1358 the PT field of the RTP data header. In addition, payload type 1359 values in the range 96-127 MAY be defined dynamically through a 1360 conference control protocol, which is beyond the scope of this 1361 document. For example, a session directory could specify that for a 1362 given session, payload type 96 indicates PCMU encoding, 8,000 Hz 1363 sampling rate, 2 channels. Entries in Tables 4 and 5 with payload 1364 type "dyn" have no static payload type assigned and are only used 1365 with a dynamic payload type. Payload type 13 is reserved for a 1366 comfort noise payload format to be specified in a separate RFC. 
1367 Payload type 19 is also marked "reserved" because some draft versions 1368 of this specification assigned that number to a comfort noise payload 1369 format. The payload type range 72-76 is marked "reserved" so that 1370 RTCP and RTP packets can be reliably distinguished (see Section 1371 "Summary of Protocol Constants" of the RTP protocol specification). 1373 The payload types currently defined in this profile are assigned to 1374 exactly one of three categories or media types : audio only, video 1375 only and those combining audio and video. The media types are marked 1376 in Tables 4 and 5 as "A", "V" and "AV", respectively. Payload types 1377 of different media types SHALL NOT be interleaved or multiplexed 1378 within a single RTP session, but multiple RTP sessions MAY be used in 1379 parallel to send multiple media types. An RTP source MAY change 1380 payload types within the same media type during a session. See the 1381 section "Multiplexing RTP Sessions" of RFC XXXX for additional 1382 explanation. 1384 Session participants agree through mechanisms beyond the scope of 1385 this specification on the set of payload types allowed in a given 1386 session. This set MAY, for example, be defined by the capabilities 1387 of the applications used, negotiated by a conference control protocol 1388 or established by agreement between the human participants. 1390 Audio applications operating under this profile SHOULD, at a minimum, 1391 be able to send and/or receive payload types 0 (PCMU) and 5 (DVI4). 1392 This allows interoperability without format negotiation and ensures 1393 successful negotiation with a conference control protocol. 1395 7 RTP over TCP and Similar Byte Stream Protocols 1397 Under special circumstances, it may be necessary to carry RTP in 1398 protocols offering a byte stream abstraction, such as TCP, possibly 1399 multiplexed with other data. 
The application MUST define its own 1400 method of delineating RTP and RTCP packets (RTSP [23] provides an 1401 example of such an encapsulation specification).

1403 8 Port Assignment

1404   PT   encoding    media type  clock rate   channels
1405        name                    (Hz)
1406   ___________________________________________________
1407   0    PCMU        A            8000        1
1408   1    reserved    A
1409   2    G726-32     A            8000        1
1410   3    GSM         A            8000        1
1411   4    G723        A            8000        1
1412   5    DVI4        A            8000        1
1413   6    DVI4        A           16000        1
1414   7    LPC         A            8000        1
1415   8    PCMA        A            8000        1
1416   9    G722        A            8000        1
1417   10   L16         A           44100        2
1418   11   L16         A           44100        1
1419   12   QCELP       A            8000        1
1420   13   reserved    A
1421   14   MPA         A           90000        (see text)
1422   15   G728        A            8000        1
1423   16   DVI4        A           11025        1
1424   17   DVI4        A           22050        1
1425   18   G729        A            8000        1
1426   19   reserved    A
1427   20   unassigned  A
1428   21   unassigned  A
1429   22   unassigned  A
1430   23   unassigned  A
1431   dyn  G726-40     A            8000        1
1432   dyn  G726-24     A            8000        1
1433   dyn  G726-16     A            8000        1
1434   dyn  G729D       A            8000        1
1435   dyn  G729E       A            8000        1
1436   dyn  GSM-EFR     A            8000        1
1437   dyn  L8          A           var.         var.
1438   dyn  RED         A           (see text)
1439   dyn  VDVI        A           var.         1

1441   Table 4: Payload types (PT) for audio encodings

1443   As specified in the RTP protocol definition, RTP data SHOULD be
1444   carried on an even UDP port number and the corresponding RTCP packets
1445   SHOULD be carried on the next higher (odd) port number.

1447   Applications operating under this profile MAY use any such UDP port
1448   pair. For example, the port pair MAY be allocated randomly by a
1449   session management program. A single fixed port number pair cannot be
1450   required because multiple applications using this profile are likely

1451   PT      encoding    media type  clock rate
1452           name                    (Hz)
1453   ____________________________________________
1454   24      unassigned  V
1455   25      CelB        V           90000
1456   26      JPEG        V           90000
1457   27      unassigned  V
1458   28      nv          V           90000
1459   29      unassigned  V
1460   30      unassigned  V
1461   31      H261        V           90000
1462   32      MPV         V           90000
1463   33      MP2T        AV          90000
1464   34      H263        V           90000
1465   35-71   unassigned  ?
1466   72-76   reserved    N/A         N/A
1467   77-95   unassigned  ?
1468   96-127  dynamic     ?
1469   dyn     H263-1998   V           90000

1471   Table 5: Payload types (PT) for video and combined encodings

1473   to run on the same host, and there are some operating systems that do
1474   not allow multiple processes to use the same UDP port with different
1475   multicast addresses.

1477   However, port numbers 5004 and 5005 have been registered for use with
1478   this profile for those applications that choose to use them as the
1479   default pair. Applications that operate under multiple profiles MAY
1480   use this port pair as an indication to select this profile if they
1481   are not subject to the constraint of the previous paragraph.
1482   Applications need not have a default and MAY require that the port
1483   pair be explicitly specified. The particular port numbers were chosen
1484   to lie in the range above 5000 to accommodate port number allocation
1485   practice within some versions of the Unix operating system, where
1486   port numbers below 1024 can only be used by privileged processes and
1487   port numbers between 1024 and 5000 are automatically assigned by the
1488   operating system.

1490 9 Changes from RFC 1890

1492   This RFC revises RFC 1890. It is mostly backwards-compatible with RFC
1493   1890 except for functions removed because two interoperable
1494   implementations were not found. The additions to RFC 1890 codify
1495   existing practice in the use of payload formats under this profile.
1496   Since this profile may be used without using any of the payload
1497   formats listed here, the addition of new payload formats in this
1498   revision does not affect backwards compatibility. The changes are
1499   listed below, categorized into functional and non-functional changes.
1501 Functional changes: 1503 o A new Section "IANA Considerations" was added to specify the 1504 registration of the name for this profile and to establish a 1505 new policy that no additional registration of static payload 1506 types for this profile will be made beyond those added in this 1507 revision and included in Tables 4 and 5. Instead, additional 1508 encoding names may be registered as MIME subtypes for binding 1509 to dynamic payload types. Non-normative references were added 1510 to RFC YYYY [7] where MIME subtypes for all the listed payload 1511 formats are registered, some with optional parameters for use 1512 of the payload formats. 1514 o Static payload types 4, 16, 17 and 34 were added to 1515 incorporate IANA registrations made since the publication of 1516 RFC 1890, along with the corresponding payload format 1517 descriptions for G723 and H263. 1519 o Following working group discussion, static payload types 12 1520 and 18 were added along with the corresponding payload format 1521 descriptions for QCELP and G729. Static payload type 13 was 1522 reserved for a comfort noise payload format to be defined in a 1523 separate RFC. Payload type 19 was marked reserved because it 1524 had been temporarily allocated in some draft revisions of this 1525 document. 1527 o The payload format for G721 was renamed to G726-32 following 1528 the ITU-T renumbering. 1530 o The payload format description for G726 was expanded to 1531 include the -16, -24 and -40 data rates. Payload formats G729D 1532 and G729E were added following the ITU-T addition of Annexes D 1533 and E to Recommendation G.729. Listings were added for payload 1534 formats GSM-EFR, RED, and H263-1998 published in other 1535 documents subsequent to RFC 1890. These additional payload 1536 formats are referenced only by dynamic payload type numbers. 1538 o The descriptions of the payload formats for G722, G728, GSM, 1539 VDVI were expanded. 
1541 o The payload format for 1016 audio was removed and its static 1542 payload type assignment 1 was marked "reserved" because two 1543 interoperable implementations were not found. 1545 o Requirements for congestion control were added in Section 2. 1547 o This profile follows the suggestion in the revised RTP spec 1548 that RTCP bandwidth may be specified separately from the 1549 session bandwidth and separately for active senders and 1550 passive receivers. 1552 o The mapping of a user pass-phrase string into an encryption 1553 key was deleted from Section 2 because two interoperable 1554 implementations were not found. 1556 Non-functional changes: 1558 o In Section 4.1, the requirement level for setting of the 1559 marker bit on the first packet after silence for audio was 1560 changed from "is" to "SHOULD be", and clarified that the 1561 marker bit is set only when packets are intentionally not 1562 sent. 1564 o Similarly, text was added to specify that the marker bit 1565 SHOULD be set to one on the last packet of a video frame, and 1566 that video frames are distinguished by their timestamps. 1568 o RFC references are added for payload formats published after 1569 RFC 1890. 1571 o The security considerations and full copyright sections were 1572 added. 1574 o According to Peter Hoddie of Apple, only pre-1994 Macintosh 1575 used the 22254.54 rate and none the 11127.27 rate, so the 1576 latter was dropped from the discussion of suggested sampling 1577 frequencies. 1579 o Table 1 was corrected to move some values from the "ms/packet" 1580 column to the "default ms/packet" column where they belonged. 1582 o A note has been added for G722 to clarify a discrepancy 1583 between the actual sampling rate and the RTP timestamp clock 1584 rate. 1586 o Small clarifications of the text have been made in several 1587 places, some in response to questions from readers. 
In 1588 particular: 1590 - A definition for "media type" is given in Section 1.1 to 1591 allow the explanation of multiplexing RTP sessions in 1592 Section 6 to be more clear regarding the multiplexing of 1593 multiple media. 1595 - The explanation of how to determine the number of audio 1596 frames in a packet from the length was expanded. 1598 - More description of the allocation of bandwidth to SDES 1599 items is given. 1601 - A note was added that the convention for the order of 1602 channels specified in Section 4.1 may be overridden by a 1603 particular encoding or payload format specification. 1605 - The terms MUST, SHOULD, MAY, etc. are used as defined in RFC 1606 2119. 1608 o A second author for this document was added. 1610 10 Security Considerations 1612 Implementations using the profile defined in this specification are 1613 subject to the security considerations discussed in the RTP 1614 specification [2]. This profile does not specify any different 1615 security services. The primary function of this profile is to list a 1616 set of data compression encodings for audio and video media. 1618 Confidentiality of the media streams is achieved by encryption. 1619 Because the data compression used with the payload formats described 1620 in this profile is applied end-to-end, encryption may be performed 1621 after compression so there is no conflict between the two operations. 1623 A potential denial-of-service threat exists for data encodings using 1624 compression techniques that have non-uniform receiver-end 1625 computational load. The attacker can inject pathological datagrams 1626 into the stream which are complex to decode and cause the receiver to 1627 be overloaded. However, the encodings described in this profile do 1628 not exhibit any significant non-uniformity. 1630 As with any IP-based protocol, in some circumstances a receiver may 1631 be overloaded simply by the receipt of too many packets, either 1632 desired or undesired. 
Network-layer authentication MAY be used to 1633 discard packets from undesired sources, but the processing cost of 1634 the authentication itself may be too high. In a multicast 1635 environment, pruning of specific sources may be implemented in future 1636 versions of IGMP [24] and in multicast routing protocols to allow a 1637 receiver to select which sources are allowed to reach it. 1639 11 Full Copyright Statement 1641 Copyright (C) The Internet Society (2001). All Rights Reserved. 1643 This document and translations of it may be copied and furnished to 1644 others, and derivative works that comment on or otherwise explain it 1645 or assist in its implementation may be prepared, copied, published 1646 and distributed, in whole or in part, without restriction of any 1647 kind, provided that the above copyright notice and this paragraph are 1648 included on all such copies and derivative works. However, this 1649 document itself may not be modified in any way, such as by removing 1650 the copyright notice or references to the Internet Society or other 1651 Internet organizations, except as needed for the purpose of 1652 developing Internet standards in which case the procedures for 1653 copyrights defined in the Internet Standards process must be 1654 followed, or as required to translate it into languages other than 1655 English. 1657 The limited permissions granted above are perpetual and will not be 1658 revoked by the Internet Society or its successors or assigns. 1660 This document and the information contained herein is provided on an 1661 "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING 1662 TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING 1663 BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION 1664 HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF 1665 MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
1667 12 Acknowledgments 1669 The comments and careful review of Simao Campos, Richard Cox and AVT 1670 Working Group participants are gratefully acknowledged. The GSM 1671 description was adopted from the IMTC Voice over IP Forum Service 1672 Interoperability Implementation Agreement (January 1997). Fred Burg 1673 and Terry Lyons helped with the G.729 description. 1675 13 Addresses of Authors 1677 Henning Schulzrinne 1678 Dept. of Computer Science 1679 Columbia University 1680 1214 Amsterdam Avenue 1681 New York, NY 10027 1682 USA 1683 electronic mail: schulzrinne@cs.columbia.edu 1685 Stephen L. Casner 1686 Packet Design 1687 2465 Latham Street 1688 Mountain View, CA 94040 1689 United States 1690 electronic mail: casner@acm.org 1691 References 1693 Normative References 1695 [1] S. Bradner, "Key words for use in RFCs to Indicate Requirement 1696 Levels," RFC 2119, Internet Engineering Task Force, Mar. 1997. 1698 [2] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: A 1699 transport protocol for real-time applications," Internet Draft, 1700 Internet Engineering Task Force, Feb. 1999 Work in progress, revision 1701 to RFC 1889. 1703 [3] Apple Computer, "Audio interchange file format AIFF-C," Aug. 1704 1991. (also ftp://ftp.sgi.com/sgi/aiff-c.9.26.91.ps.Z). 1706 Non-Normative References 1708 [4] R. Braden, D. Clark, S. Shenker, "Integrated Services in the 1709 Internet Architecture: an Overview," Request for Comments 1710 (Informational) RFC 1633, Internet Engineering Task Force, June 1994. 1712 [5] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, W. Weiss, "An 1713 Architecture for Differentiated Service," Request for Comments 1714 (Proposed Standard) RFC 2475, Internet Engineering Task Force, Dec. 1715 1998. 1717 [6] M. Handley and V. Jacobson, "SDP: Session Description Protocol," 1718 Request for Comments (Proposed Standard) RFC 2327, Internet 1719 Engineering Task Force, Apr. 1998. 1721 [7] S. Casner and P. 
Hoschka, "MIME Type Registration of RTP Payload 1722 Types," Internet Draft, Internet Engineering Task Force, July 2001. 1723 Work in progress. 1725 [8] N. Freed, J. Klensin, and J. Postel, "Multipurpose Internet Mail 1726 Extensions (MIME) Part Four: Registration Procedures," RFC 2048, 1727 Internet Engineering Task Force, Nov. 1996. 1729 [9] IMA Digital Audio Focus and Technical Working Groups, 1730 "Recommended practices for enhancing digital audio compatibility in 1731 multimedia systems (version 3.00)," tech. rep., Interactive 1732 Multimedia Association, Annapolis, Maryland, Oct. 1992. 1734 [10] D. Deleam and J.-P. Petit, "Real-time implementations of the 1735 recent ITU-T low bit rate speech coders on the TI TMS320C54X DSP: 1736 results, methodology, and applications," in Proc. of International 1737 Conference on Signal Processing, Technology, and Applications 1738 (ICSPAT) , (Boston, Massachusetts), pp. 1656--1660, Oct. 1996. 1740 [11] M. Mouly and M.-B. Pautet, The GSM system for mobile 1741 communications Lassay-les-Chateaux, France: Europe Media Duplication, 1742 1993. 1744 [12] J. Degener, "Digital speech compression," Dr. Dobb's Journal , 1745 Dec. 1994. 1747 [13] S. M. Redl, M. K. Weber, and M. W. Oliphant, An Introduction to 1748 GSM Boston: Artech House, 1995. 1750 [14] D. Hoffman, G. Fernando, V. Goyal, and M. Civanlar, "RTP payload 1751 format for MPEG1/MPEG2 video," Request for Comments (Proposed 1752 Standard) RFC 2250, Internet Engineering Task Force, Jan. 1998. 1754 [15] N. S. Jayant and P. Noll, Digital Coding of Waveforms-- 1755 Principles and Applications to Speech and Video Englewood Cliffs, New 1756 Jersey: Prentice-Hall, 1984. 1758 [16] K. McKay, "RTP Payload Format for PureVoice(tm) Audio", Request 1759 for Comments (Proposed Standard) RFC 2658, Internet Engineering Task 1760 Force, Aug. 1999. 1762 [17] C. Perkins, I. Kouvelas, O. Hodson, V. Hardman, M. Handley, J.C. 1763 Bolot, A. Vega-Garcia, and S. 
Fosse-Parisis, "RTP Payload for 1764 Redundant Audio Data," Request for Comments (Proposed Standard) RFC 1765 2198, Internet Engineering Task Force, Sep. 1997.

1767   [18] M. Speer and D. Hoffman, "RTP payload format of Sun's CellB
1768   video encoding," Request for Comments (Proposed Standard) RFC 2029,
1769   Internet Engineering Task Force, Oct. 1996.

1771   [19] L. Berc, W. Fenner, R. Frederick, and S. McCanne, "RTP payload
1772   format for JPEG-compressed video," Request for Comments (Proposed
1773   Standard) RFC 2435, Internet Engineering Task Force, Oct. 1998.

1775   [20] T. Turletti and C. Huitema, "RTP payload format for H.261 video
1776   streams," Request for Comments (Proposed Standard) RFC 2032, Internet
1777   Engineering Task Force, Oct. 1996.

1779   [21] C. Zhu, "RTP payload format for H.263 video streams," Request
1780   for Comments (Proposed Standard) RFC 2190, Internet Engineering Task
1781   Force, Sep. 1997.

1783   [22] C. Bormann, L. Cline, G. Deisher, T. Gardos, C. Maciocco, D.
1784   Newell, J. Ott, G. Sullivan, S. Wenger, C. Zhu, "RTP Payload Format
1785   for the 1998 Version of ITU-T Rec. H.263 Video (H.263+)," Request for
1786   Comments (Proposed Standard) RFC 2429, Internet Engineering Task
1787   Force, Oct. 1998.

1789   [23] H. Schulzrinne, A. Rao, and R. Lanphier, "Real time streaming
1790   protocol (RTSP)," Request for Comments (Proposed Standard) RFC 2326,
1791   Internet Engineering Task Force, Apr. 1998.

1793   [24] S. Deering, "Host Extensions for IP Multicasting," Request for
1794   Comments RFC 1112, STD 5, Internet Engineering Task Force, Aug. 1989.

1796 Current Locations of Related Resources

1798   Note: Several sections below refer to the ITU-T Software Tool Library
1799   (STL). It is available from the ITU Sales Service, Place des Nations,
1800   CH-1211 Geneve 20, Switzerland (also check http://www.itu.int). The
1801   ITU-T STL is covered by a license defined in ITU-T Recommendation
1802   G.191, "Software tools for speech and audio coding standardization".
1804 UTF-8

1806   Information on the UCS Transformation Format 8 (UTF-8) is available
1807   at

1809      http://www.stonehand.com/unicode/standard/utf8.html

1811 DVI4

1813   An implementation is available from Jack Jansen at

1815      ftp://ftp.cwi.nl/local/pub/audio/adpcm.shar

1817 G722

1819   An implementation of the G.722 algorithm is available as part of the
1820   ITU-T STL, described above.

1822 G723

1824   The reference C code implementation defining the G.723.1 algorithm
1825   and its Annexes A, B, and C is available as an integral part of
1826   Recommendation G.723.1 from the ITU Sales Service, address listed
1827   above. Both the algorithm and C code are covered by a specific
1828   license. The ITU-T Secretariat should be contacted to obtain such
1829   licensing information.

1831 G726

1833   G726 is specified in the ITU-T Recommendation G.726, "40, 32, 24, and
1834   16 kb/s Adaptive Differential Pulse Code Modulation (ADPCM)". An
1835   implementation of the G.726 algorithm is available as part of the
1836   ITU-T STL, described above.

1838 G729

1840   The reference C code implementation defining the G.729 algorithm and
1841   its Annexes A through I is available as an integral part of
1842   Recommendation G.729 from the ITU Sales Service, listed above. Annex
1843   I contains the integrated C source code for all G.729 operating
1844   modes. The G.729 algorithm and associated C code are covered by a
1845   specific license. The contact information for obtaining the license
1846   is available from the ITU-T Secretariat.

1848 GSM

1850   A reference implementation was written by Carsten Bormann and Jutta
1851   Degener (TU Berlin, Germany). It is available at

1853      ftp://ftp.cs.tu-berlin.de/pub/local/kbs/tubmik/gsm/

1855   Although the RPE-LTP algorithm is not an ITU-T standard, there is a C
1856   code implementation of the RPE-LTP algorithm available as part of the
1857   ITU-T STL. The STL implementation is an adaptation of the TU Berlin
1858   version.
1860 LPC

1862   An implementation is available at

1864      ftp://parcftp.xerox.com/pub/net-research/lpc.tar.Z

1866 PCMU, PCMA

1868   An implementation of these algorithms is available as part of the
1869   ITU-T STL, described above. Code to convert between linear and mu-law
1870   companded data is also available in [9].
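For orientation only, the conversion between 16-bit linear samples and mu-law octets can be sketched as below. This is not the code from the ITU-T STL or from [9]; it follows a widely circulated bias-and-segment formulation, and the constants BIAS and CLIP as well as the function names are assumptions of this sketch. It should not be taken as a bit-exact substitute for the tables in Recommendation G.711.

```python
BIAS = 0x84    # 132, added so that segment boundaries fall on powers of two (assumption)
CLIP = 32635   # largest magnitude that still fits in 15 bits after adding BIAS


def linear_to_ulaw(sample: int) -> int:
    """Compress one 16-bit signed linear sample into an 8-bit mu-law octet."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # The biased magnitude always has its top set bit in position 7..14.
    exponent = magnitude.bit_length() - 8
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    # All bits are inverted in the transmitted octet.
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def ulaw_to_linear(code: int) -> int:
    """Expand one mu-law octet back into a linear sample (segment midpoint)."""
    code = ~code & 0xFF
    exponent = (code >> 4) & 0x07
    mantissa = code & 0x0F
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if code & 0x80 else magnitude


# Largest round-trip error over the unclipped input range.
worst_error = max(
    abs(ulaw_to_linear(linear_to_ulaw(x)) - x) for x in range(-CLIP, CLIP + 1)
)
```

Because the octet is inverted before transmission, the sign information sits in the most significant bit of each octet, which matches the octet layout required in Section 4.5.14.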