Network Working Group                                        A. Sollaud
Internet-Draft                                           France Telecom
Expires: September 22, 2005                              March 21, 2005

  RTP payload format for the future scalable and wideband extension of
                           G.729 audio codec
               draft-sollaud-avt-rtp-g729-scal-wb-ext-00

Status of this Memo

   This document is an Internet-Draft and is subject to all provisions
   of Section 3 of RFC 3978. By submitting this Internet-Draft, each
   author represents that any applicable patent or other IPR claims of
   which he or she is aware have been or will be disclosed, and any of
   which he or she becomes aware will be disclosed, in accordance with
   Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 22, 2005.

Copyright Notice

   Copyright (C) The Internet Society (2005).

Abstract

   This document specifies a real-time transport protocol (RTP) payload
   format to be used for the future scalable and wideband extension of
   the International Telecommunication Union (ITU-T) G.729 audio codec.
   A media type registration is included for this payload format.

Table of Contents

   1.  Introduction
   2.  Background
   3.  RTP Payload format
     3.1  RTP header usage
     3.2  Payload format
       3.2.1  Payload structure
       3.2.2  Payload Header
       3.2.3  Table of contents
       3.2.4  Audio data
       3.2.5  Payload example
     3.3  MBS operations
       3.3.1  MBS decreasing
       3.3.2  MBS increasing
   4.  Payload format parameters
     4.1  Media type registration
     4.2  Mapping to SDP parameters
     4.3  Offer-answer model considerations
   5.  Security considerations
   6.  IANA considerations
   7.  References
     7.1  Normative references
     7.2  Informative references
       Author's Address
       Intellectual Property and Copyright Statements

1.  Introduction

   The International Telecommunication Union (ITU-T) is working on a
   scalable and wideband extension of its Recommendation G.729 [7].
   This future audio codec is called G.729X in the following text.
   This document specifies the payload format for packetization of
   G.729X encoded audio signals into the real-time transport protocol
   (RTP).

   The payload format itself and the handling of variable bit rate are
   described in Section 3. The details for the use of G.729X with MIME
   and SDP are given in Section 4.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [1].

2.  Background

   G.729X is mainly designed to be used as a speech codec, but it can be
   used for music at the highest bit rates. The sampling frequency is
   16000 Hz and the frame size is 20 ms.

   This G.729-based codec produces an embedded bitstream providing an
   improved narrowband quality [300, 3400 Hz] at 12 kbps, and an
   enhanced and gracefully improving wideband quality [50, 7000 Hz] from
   14 kbps to 32 kbps, in steps of 2 kbps. At 8 kbps it generates a
   G.729 bitstream (with Annex B, that is, supporting silence
   suppression).
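
   As an informal illustration of the figures above (not part of the
   codec specification), the sketch below lists the bit rates named in
   this section and derives the size of one 20 ms frame at each rate.
   The names BIT_RATES_KBPS, frame_octets and bandwidth are hypothetical
   and chosen only for this example; treating the 8 kbps G.729 core as
   narrowband is an assumption based on G.729 being a narrowband codec.

```python
FRAME_MS = 20  # frame duration stated in this section

# 8 kbps (G.729 core), 12 kbps (narrowband), then 14..32 kbps in 2 kbps steps.
BIT_RATES_KBPS = [8, 12] + list(range(14, 33, 2))


def frame_octets(rate_kbps: int) -> int:
    """Number of octets in one 20 ms frame at the given bit rate."""
    bits = rate_kbps * FRAME_MS  # kbit/s times ms gives bits
    if bits % 8:
        raise ValueError("frame is not a whole number of octets")
    return bits // 8


def bandwidth(rate_kbps: int) -> str:
    """Audio bandwidth class; assumes 8 and 12 kbps are narrowband."""
    return "narrowband" if rate_kbps <= 12 else "wideband"


if __name__ == "__main__":
    for rate in BIT_RATES_KBPS:
        print(rate, "kbps:", frame_octets(rate), "octets,", bandwidth(rate))
```

   Under these assumptions every rate yields a whole number of octets
   per frame, from 20 octets at 8 kbps to 80 octets at 32 kbps.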

   It has been mainly designed for packetized wideband voice
   applications (Voice over IP or ATM, Telephony over IP, private
   networks...) and particularly for those requiring scalable bandwidth,
   enhanced quality above G.729, and easy integration into existing
   infrastructures.

   G.729X is also designed to cope with other services like high quality
   audio/video conferencing, archival, messaging, etc.

   For all those applications, the scalability feature makes it possible
   to tune the trade-off between bit rate and quality, possibly
   dynamically during a session, taking into account service
   requirements and network transport constraints.

   G.729X produces frames that are said to be embedded because they are
   composed of embedded layers. The first layer is called the core
   layer and is bitstream compatible with the ITU-T G.729 with Annex B
   coder. Upper layers are added as the bit rate increases, to improve
   quality and enlarge the audio bandwidth from narrowband to wideband.
   As a result, a received frame can be decoded at its original bit rate
   or at any lower bit rate corresponding to the lower layers embedded
   in it. Only the core layer is mandatory to decode understandable
   speech; upper layers provide quality enhancement and wideband
   enlargement.

   G.729X can be used in cases where the bit rate can change during the
   communication session, for example because bandwidth is temporarily
   shared. In that case the sender will decrease its own sending rate.
   But if its receiving bandwidth has decreased, it needs to tell the
   receiver to decrease its sending rate. To do so, the sender will
   send an in-band request to the receiver: the maximum bit rate
   supported (MBS). The receiver must acknowledge receipt of the MBS
   and modify its sending rate accordingly, if possible. Note that,
   thanks to the embedded property of the coding scheme, it can send at
   the MBS rate or at any lower rate.
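
   The rate selection implied by the MBS can be sketched as follows.
   This is only an illustration under stated assumptions: the function
   name choose_sending_rate and the idea of a locally "desired" rate
   are hypothetical and are not defined by this document.

```python
# Bit rates named in this section: 8, 12, then 14..32 kbps in 2 kbps steps.
BIT_RATES_KBPS = [8, 12] + list(range(14, 33, 2))


def choose_sending_rate(desired_kbps: int, peer_mbs_kbps: int) -> int:
    """Highest codec rate that exceeds neither the locally desired rate
    nor the maximum bit rate supported (MBS) announced by the peer."""
    limit = min(desired_kbps, peer_mbs_kbps)
    candidates = [rate for rate in BIT_RATES_KBPS if rate <= limit]
    if not candidates:
        raise ValueError("no codec rate at or below %d kbps" % limit)
    return max(candidates)


if __name__ == "__main__":
    # Peer announced an MBS of 12 kbps while we would like to send 32 kbps.
    print(choose_sending_rate(32, 12))
```

   Because the bitstream is embedded, any rate at or below the limit is
   acceptable; picking the highest one is just one possible policy.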

   G.729X supports voice activity detection (VAD) and comfort noise
   generation (CNG). During silence periods, the coder may
   significantly decrease the transmitted bit rate by sending only
   comfort noise parameters in special small frames called silence
   insertion descriptors (SID). The receiver's decoder will generate
   comfort noise according to the SID information. This operation of
   sending low bit rate comfort noise parameters during silence periods
   is usually called discontinuous transmission (DTX).

3.  RTP Payload format

3.1  RTP header usage

   The format of the RTP header is specified in [2]. This payload
   format uses the fields of the header in a manner consistent with that
   specification.

   The RTP timestamp clock frequency is the same as the sampling
   frequency, that is, 16 kHz. So the timestamp unit is in samples.

   The duration of one frame is 20 ms, corresponding to 320 samples per
   frame. Thus the timestamp is increased by 320 for each consecutive
   frame.

   The M bit should be set as specified in the applicable RTP profile,
   for example, [3].

   The assignment of an RTP payload type for this packet format is
   outside the scope of the document, and will not be specified here.
   It is expected that the RTP profile under which this payload format
   is being used will assign a payload type for this codec or specify
   that the payload type is to be bound dynamically (see Section 4.2).

3.2  Payload format

3.2.1  Payload structure

   The complete payload consists of an OPTIONAL payload header, a
   payload table of contents, and audio data representing one or more
   frames. The following diagram shows the general format layout:

   +------------------+-------------------+-----------------+
   | (Payload header) | Table of contents | Speech data ... |
   +------------------+-------------------+-----------------+

3.2.2  Payload Header

   The payload header is OPTIONAL and is used to indicate to the remote
   host an MBS value or to acknowledge an MBS request received from the
   remote host. When no MBS operation is needed, this payload header
   SHOULD NOT be present.

   The payload header is one octet, as follows:

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |1|A|R|R|  MBS  |
   +-+-+-+-+-+-+-+-+

   The first bit is always set to 1; this is how the presence of the
   header is detected.

   A (1 bit): MBS Acknowledge. Set to 1 to acknowledge an MBS request.
   Otherwise set to 0. It is used to provide reliability to the MBS
   transmission.

   R (1 bit): Reserved. MUST be set to zero and SHOULD be ignored by
   the receiver.

   MBS (4 bits): maximum bit rate supported. Indicates a bit rate
   request to the encoder at the site of the receiver of this payload
   (see end of Section 2 and Section 3.3). The request is a maximum bit
   rate, as per the following table:

   +-------+--------------------+
   |  MBS  | max rate supported |
   +-------+--------------------+
   |   0   |       8 kbps       |
   |   1   |      12 kbps       |
   |   2   |      14 kbps       |
   |   3   |      16 kbps       |
   |   4   |      18 kbps       |
   |   5   |      20 kbps       |
   |   6   |      22 kbps       |
   |   7   |      24 kbps       |
   |   8   |      26 kbps       |
   |   9   |      28 kbps       |
   |  10   |      30 kbps       |
   |  11   |      32 kbps       |
   | 12-14 |     (reserved)     |
   |  15   |       NO_MBS       |
   +-------+--------------------+

   The value 15 (NO_MBS) indicates that no rate request is present. It
   is used either to send an MBS ACK without an MBS, or when the sender
   cannot remove the payload header and has no MBS to send.

3.2.3  Table of contents

   Two types of Table of Contents (ToC) are described below. In a
   Standard ToC, each ToC entry describes one frame. When all frames in
   a packet are at the same bit rate, the sender MAY use a Compact ToC,
   with only one ToC entry to describe the whole packet.
   Both types of ToC MUST be supported by the receiver.

3.2.3.1  Standard ToC

   The standard table of contents (ToC) consists of a list of ToC
   entries. Each ToC entry describes one frame.

   A ToC entry is one octet, as follows:

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |0|F|R|R|  FT   |
   +-+-+-+-+-+-+-+-+

   The first bit is always set to 0.

   F (1 bit): Set to 1 to indicate that this ToC entry is followed by
   another one. Set to 0 to indicate that this ToC entry is the last
   one in this payload.

   R (1 bit): Reserved. MUST be set to zero and SHOULD be ignored by
   the receiver.

   FT (4 bits): Frame type, as per the following table:

   +-------+---------------+------------+
   |  FT   | encoding rate | frame size |
   +-------+---------------+------------+
   |   0   |     8 kbps    | 20 octets  |
   |   1   |    12 kbps    | 30 octets  |
   |   2   |    14 kbps    | 35 octets  |
   |   3   |    16 kbps    | 40 octets  |
   |   4   |    18 kbps    | 45 octets  |
   |   5   |    20 kbps    | 50 octets  |
   |   6   |    22 kbps    | 55 octets  |
   |   7   |    24 kbps    | 60 octets  |
   |   8   |    26 kbps    | 65 octets  |
   |   9   |    28 kbps    | 70 octets  |
   |  10   |    30 kbps    | 75 octets  |
   |  11   |    32 kbps    | 80 octets  |
   | 12-13 |  (reserved)   |            |
   |  14   |      SID      |  2 octets  |
   |  15   |    NO_DATA    |     0      |
   +-------+---------------+------------+

   The FT value 15 (NO_DATA) indicates that a frame is either lost or
   not transmitted.

3.2.3.2  Compact ToC

   In most cases, the bit rate will not change very often, thus all
   frames in a payload are likely to be at the same bit rate. When this
   occurs, the sender MAY put only one ToC entry to indicate the bit
   rate of all frames in the packet. The receiver will easily detect
   that there is only one ToC entry (bit F=0) and that the size of the
   audio data part of the payload is a multiple of the size of one frame
   at the considered bit rate.
   So the actual number of frames is easy to infer from the size of the
   audio data part:

      nb_frames = (size_of_audio_data) / (size_of_one_frame)

   This ToC simplification is compatible with DTX, with the restriction
   that the SID frame MUST be at the end of the payload (this is
   consistent with the payload format of G.729 described in Section
   4.5.6 of [3]). Since the SID frame is much smaller than any other
   frame, it will not hinder the calculation of the number of frames at
   the receiver side and can be easily detected without the need to add
   a ToC entry with FT=14. In practice, the presence of a SID frame is
   inferred from the result of the above division not being an integer.

   The receiver MUST support this compact ToC format.

   Note well that this simplification of the ToC is acceptable only if
   ALL frames are at the same bit rate. It cannot be used if several
   sequential frames are at the same bit rate R1 and the following group
   of frames is at another bit rate R2, because the receiver will not be
   able to determine how many frames of each bit rate there are. If the
   compact ToC is used, there MUST be only ONE ToC entry.

   Below are short examples illustrating the compact ToC and the
   calculation of the number of frames. See Section 3.2.5.3 for a
   complete payload.

   Example 1: one ToC entry with FT=4 and 135 octets of audio data
   following. The receiver knows that for FT=4, a frame is 45 octets
   long. So there are 135/45 = 3 frames in the payload.

   Example 2: one ToC entry with FT=9 and 142 octets of audio data
   following. The receiver knows that for FT=9, a frame is 70 octets
   long. Since 142 = 2 x 70 + 2, there are 2 frames in the payload plus
   a 2-octet remainder, which is a SID frame.

3.2.4  Audio data

   Audio data of a payload contains one or more audio frames as
   described in the table of contents of the payload.
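
   As a rough sketch of how a receiver might split the audio data when a
   Compact ToC (Section 3.2.3.2) is used, the following code applies the
   frame-count calculation and the trailing-SID rule. The names
   FRAME_OCTETS and split_compact are hypothetical, and the handling of
   unexpected lengths (raising an error) is an assumption, since this
   document does not specify it. A payload carrying only a SID frame is
   not covered by this sketch.

```python
# Frame sizes in octets per frame type (FT), from the table in 3.2.3.1.
FRAME_OCTETS = {0: 20, 1: 30, 2: 35, 3: 40, 4: 45, 5: 50,
                6: 55, 7: 60, 8: 65, 9: 70, 10: 75, 11: 80}
SID_OCTETS = 2  # size of a SID frame


def split_compact(ft: int, audio: bytes):
    """Split audio data described by a single ToC entry.

    Returns (frames, sid): a list of equally sized frames, oldest first,
    and the trailing SID frame, or None when there is no SID.
    """
    size = FRAME_OCTETS[ft]  # KeyError for reserved, SID and NO_DATA types
    count, rest = divmod(len(audio), size)
    if rest not in (0, SID_OCTETS):
        raise ValueError("audio data length does not match the frame type")
    frames = [audio[i * size:(i + 1) * size] for i in range(count)]
    sid = audio[count * size:] if rest else None
    return frames, sid


if __name__ == "__main__":
    frames, sid = split_compact(9, bytes(142))  # Example 2 of 3.2.3.2
    print(len(frames), "frames, SID present:", sid is not None)
```

   With the figures of Example 1 (FT=4, 135 octets) this yields 3 frames
   and no SID; with those of Example 2 (FT=9, 142 octets) it yields 2
   frames followed by a 2-octet SID.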
   The audio frames are packed in the same order as their corresponding
   ToC entries are arranged in the table of contents. If the Compact
   ToC is used, the audio frames are packed in order of time, that is,
   oldest first.

   Note that for ToC entries with FT=15, there will be no corresponding
   audio frame in the payload.

3.2.5  Payload example

3.2.5.1  Payload carrying a single frame and no MBS

   The following diagram shows a G.729X payload that contains a single
   speech frame at 24 kbps (FT=7; size=60 octets) and no MBS.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|0|R|R| FT=7  |    f(1/60)    |    f(2/60)    |    f(3/60)    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    f(4/60)    |    f(5/60)    |    f(6/60)    |    f(7/60)    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                              ...                              :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   f(60/60)    |
   +-+-+-+-+-+-+-+-+

3.2.5.2  Payload carrying multiple frames at various bit rates and an
         MBS

   The following diagram shows a G.729X payload that contains 3 frames,
   two at 20 kbps (FT=5; size=50 octets) and one at 32 kbps (FT=11;
   size=80 octets), and an MBS of 32 kbps (MBS=11). The first octet is
   the payload header (MBS) and is inserted before the ToC.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|0|R|R| MBS=11|0|1|R|R| FT=5  |0|1|R|R| FT=5  |0|0|R|R| FT=11 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   f1(1/50)    |   f1(2/50)    |   f1(3/50)    |   f1(4/50)    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                              ...                              :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   f1(49/50)   |   f1(50/50)   |   f2(1/50)    |   f2(2/50)    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                              ...                              :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   f2(47/50)   |   f2(48/50)   |   f2(49/50)   |   f2(50/50)   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   f3(1/80)    |   f3(2/80)    |   f3(3/80)    |   f3(4/80)    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                              ...                              :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   f3(77/80)   |   f3(78/80)   |   f3(79/80)   |   f3(80/80)   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.2.5.3  Payload carrying multiple frames at the same bit rate and no
         MBS

   The following diagram shows a G.729X payload that contains 2 frames
   at 14 kbps (FT=2; size=35 octets) and no MBS. There is only one ToC
   entry; the number of frames is inferred from the payload size.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|0|R|R| FT=2  |   f1(1/35)    |   f1(2/35)    |   f1(3/35)    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                              ...                              :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   f1(32/35)   |   f1(33/35)   |   f1(34/35)   |   f1(35/35)   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   f2(1/35)    |   f2(2/35)    |   f2(3/35)    |   f2(4/35)    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   :                              ...                              :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   f2(33/35)   |   f2(34/35)   |   f2(35/35)   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.3  MBS operations

   The MBS value located in the optional payload header is used to tell
   the other party that the maximum bit rate one can receive has
   changed.
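
   For illustration only, the one-octet payload header of Section 3.2.2
   that carries the MBS could be built and parsed as sketched below.
   The names MBS_KBPS, NO_MBS, pack_header and parse_header are
   hypothetical. The sketch assumes the usual convention of RFC
   diagrams, namely that bit 0 is the most significant bit of the octet.

```python
# MBS index to maximum bit rate in kbps, from the table in Section 3.2.2.
MBS_KBPS = dict(enumerate([8, 12] + list(range(14, 33, 2))))
NO_MBS = 15  # header present but no rate request


def pack_header(ack: bool, mbs: int = NO_MBS) -> int:
    """Build the header octet: first bit 1, A bit, two zero R bits, MBS."""
    if mbs != NO_MBS and mbs not in MBS_KBPS:
        raise ValueError("reserved or out-of-range MBS value")
    return 0x80 | (0x40 if ack else 0x00) | mbs


def parse_header(octet: int):
    """Return (ack, mbs), or None when the first bit is 0, which means
    the octet is a ToC entry and no payload header is present."""
    if not octet & 0x80:
        return None
    return bool(octet & 0x40), octet & 0x0F  # reserved bits are ignored


if __name__ == "__main__":
    request = pack_header(ack=False, mbs=1)  # written {A=0; MBS=1}
    print(hex(request), parse_header(request))
```

   Under this bit-order assumption, a request for 12 kbps {A=0; MBS=1}
   is the octet 0x81 and a plain acknowledgement {A=1; MBS=15} is 0xCF.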
   Examples of MBS usage follow. The content of the optional payload
   header is written {A=x; MBS=y}.

3.3.1  MBS decreasing

   Say Alice and Bob are in a voice call using G.729X operating at the
   maximum bit rate of 32 kbps. Alice and Bob both have an MBS of 32
   kbps.

   At some point, for whatever reason, Alice needs to free bandwidth.
   She decreases her sending rate, to 12 kbps for example, and inserts a
   payload header {A=0; MBS=1} to request Bob to decrease his sending
   rate. She inserts this payload header in all subsequent payloads
   while waiting for the acknowledgement. Bob receives the MBS
   information, decreases his sending rate accordingly, and inserts a
   payload header {A=1; MBS=15} to acknowledge. Bob inserts this header
   in all subsequent payloads as long as he receives the MBS from Alice.
   When Alice receives the acknowledgement, she removes the payload
   header with the MBS. Then Bob removes his payload header.

   This example shows the acknowledgement procedure. Note that Bob can
   now send at 12 kbps (the value of the MBS received) but also at any
   lower bit rate, in this case 8 kbps. Bob does not send an MBS to
   Alice because Bob's receiving conditions did not change; that is why
   Bob replies with a simple MBS ACK, without specifying an MBS
   (MBS=15).

3.3.2  MBS increasing

   We start from the previous situation. Alice and Bob are now
   operating at 12 kbps, but Alice's MBS is 12 kbps whereas Bob's MBS is
   still 32 kbps.

   At some point, Alice has more bandwidth and switches back to 32 kbps.
   She can do this immediately because Bob's MBS is 32 kbps. To request
   Bob to increase his sending rate, she sends an MBS {A=0; MBS=11} to
   Bob. Bob receives the MBS and acknowledges it with an MBS ACK {A=1;
   MBS=15}. Now Bob knows he can send up to 32 kbps to Alice. For
   whatever reason, Bob's sending bandwidth is currently limited and Bob
   chooses to keep sending at 12 kbps.
   At this point, Alice is sending 32 kbps to Bob and Bob is sending 12
   kbps to Alice. Each endpoint knows the MBS of the other is 32 kbps.
   So later, Bob may increase his sending bit rate without any prior
   notice.

   This case illustrates the fact that the proper receipt and
   acknowledgement of an MBS does not imply an immediate bit rate
   increase.

4.  Payload format parameters

   This section defines the parameters that may be used to configure
   optional features in the G.729X RTP transmission.

   The parameters are defined here as part of the MIME subtype
   registration for the G.729X codec. A mapping of the parameters into
   the Session Description Protocol (SDP) [4] is also provided for those
   applications that use SDP. In control protocols that do not use MIME
   or SDP, the media type parameters must be mapped to the appropriate
   format used with that control protocol.

4.1  Media type registration

   MIME media type name: audio

   MIME subtype name: G729X [to be replaced by the actual annex letter]

   Required parameters: none

   Optional parameters:

      dtx: indicates that discontinuous transmission (DTX) is used or
      preferred. DTX means voice activity detection and
      non-transmission of silent frames. Permissible values are 0 and
      1. 0 means no DTX. 0 is implied if this parameter is omitted.

      init-MBS: indicates an initial value of MBS. Permissible values
      are legal values of MBS, that is, between 0 and 11 (see table in
      Section 3.2.2 of RFC XXXX). The maximum MBS, that is, 11, is
      implied if this parameter is omitted.

      ptime: the recommended length of time in milliseconds represented
      by the media in a packet. See RFC 2327 [4].

      maxptime: the maximum length of time in milliseconds which can be
      encapsulated in a packet.

   Encoding considerations: This type is only defined for transfer via
   RTP [2].

   Security considerations: See Section 5 of RFC XXXX

   Interoperability considerations: none

   Published specification: RFC XXXX

   Applications which use this media type: Audio and video conferencing
   tools.

   Additional information: none

   Person & email address to contact for further information: Aurelien
   Sollaud, aurelien.sollaud@francetelecom.com

   Intended usage: COMMON

   Author/Change controller: IETF Audio/Video Transport working group
   delegated from the IESG

4.2  Mapping to SDP parameters

   The information carried in the MIME media type specification has a
   specific mapping to fields in the Session Description Protocol (SDP)
   [4], which is commonly used to describe RTP sessions. When SDP is
   used to specify sessions employing the G.729X codec, the mapping is
   as follows:

   o  The media type ("audio") goes in SDP "m=" as the media name.

   o  The media subtype ("G729X") goes in SDP "a=rtpmap" as the encoding
      name. The RTP clock rate in "a=rtpmap" MUST be 16000 for G.729X.

   o  The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and
      "a=maxptime" attributes, respectively.

   o  Any remaining parameters go in the SDP "a=fmtp" attribute by
      copying them directly from the MIME media type string as a
      semicolon separated list of parameter=value pairs.

   Some example SDP session descriptions utilizing G.729X encodings
   follow.

   Example 1: default parameters

      m=audio 53146 RTP/AVP 98
      a=rtpmap:98 G729X/16000

   Example 2: recommended packet duration of 40 ms (=2 frames), no DTX
   and initial MBS of 26 kbps

      m=audio 51258 RTP/AVP 99
      a=rtpmap:99 G729X/16000
      a=ptime:40
      a=fmtp:99 dtx=0; init-MBS=8

4.3  Offer-answer model considerations

   The following considerations apply when using SDP offer-answer
   procedures to negotiate the use of G.729X payload in RTP:

   o  Since G.729X is an extension of G.729, the offerer SHOULD announce
      G.729 support in its "m=audio" line, with G.729X preferred. This
      will allow interoperability with both G.729X and G.729-only
      capable parties.

      Below is an example of such an offer:

         m=audio 55954 RTP/AVP 98 18
         a=rtpmap:98 G729X/16000
         a=rtpmap:18 G729/8000

      If the answerer supports G.729X, it will keep the payload type 98
      in its answer and the conversation will be done using G.729X.
      Otherwise, if the answerer supports only G.729, it will leave only
      the payload type 18 in its answer and the conversation will be
      done using G.729.

   o  The "dtx" parameter concerns both sending and receiving, so both
      sides of a bi-directional session MUST use the same "dtx" value.
      If one party indicates it does not support DTX, DTX must be
      deactivated both ways.

   o  The "init-MBS" parameter is not symmetric. Values in the offer
      and the answer are independent and take into account local
      bandwidth constraints. In any case, a party MUST NOT start
      sending frames at a bit rate higher than the "init-MBS" of the
      other party.

   o  The parameters "maxptime" and "ptime" will in most cases not
      affect interoperability. The SDP offer-answer handling of the
      "ptime" parameter is described in [5]. The "maxptime" parameter
      MUST be handled in the same way.

5.  Security considerations

   RTP packets using the payload format defined in this specification
   are subject to the general security considerations discussed in the
   RTP specification [2] and any appropriate profile (for example, [3]).

   As this format transports encoded speech/audio, the main security
   issues include confidentiality and authentication of the speech/audio
   itself. The payload format itself does not have any built-in
   security mechanisms. Confidentiality of the media streams is
   achieved by encryption; therefore external mechanisms, such as SRTP
   [6], MAY be used for that purpose. The data compression used with
   this payload format is applied end-to-end; hence encryption may be
   performed after compression with no conflict between the two
   operations.

   This payload format and the G.729X encoding do not exhibit any
   significant non-uniformity in the receiver-end computational load and
   thus are unlikely to pose a denial-of-service threat due to the
   receipt of pathological datagrams.

6.  IANA considerations

   It is requested that one new MIME subtype (audio/G729X) be registered
   by IANA; see Section 4.1.

7.  References

7.1  Normative references

   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

   [2]  Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson,
        "RTP: A Transport Protocol for Real-Time Applications", STD 64,
        RFC 3550, July 2003.

   [3]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video
        Conferences with Minimal Control", STD 65, RFC 3551, July 2003.

   [4]  Handley, M. and V. Jacobson, "SDP: Session Description
        Protocol", RFC 2327, April 1998.

   [5]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with
        Session Description Protocol (SDP)", RFC 3264, June 2002.

   [6]  Baugher, M., McGrew, D., Naslund, M., Carrara, E. and K.
        Norrman, "The Secure Real-time Transport Protocol (SRTP)",
        RFC 3711, March 2004.

7.2  Informative references

   [7]  International Telecommunications Union, "Coding of speech at 8
        kbit/s using conjugate-structure algebraic-code-excited
        linear-prediction (CS-ACELP)", ITU-T Recommendation G.729, March
        1996.

Author's Address

   Aurelien Sollaud
   France Telecom
   2 avenue Pierre Marzin
   Lannion Cedex  22307
   France

   Phone: +33 2 96 05 15 06
   Email: aurelien.sollaud@francetelecom.com

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2005). This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.