idnits 2.17.1

draft-ietf-avt-rtp-g729-scal-wb-ext-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 15.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 537.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 514.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 521.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 527.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of
     the newer disclaimer which includes the IETF Trust according to RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords -- however, there's a paragraph with
     a matching beginning. Boilerplate error?

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (January 18, 2006) is 6671 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 2327 (ref. '4') (Obsoleted by RFC 4566)

  -- Obsolete informational reference (is this intentional?): RFC 4288 (ref.
     '7') (Obsoleted by RFC 6838)

  -- Obsolete informational reference (is this intentional?): RFC 3555 (ref.
     '8') (Obsoleted by RFC 4855, RFC 4856)

     Summary: 4 errors (**), 0 flaws (~~), 3 warnings (==), 9 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

  2  Network Working Group                                         A. Sollaud
  3  Internet-Draft                                            France Telecom
  4  Expires: July 22, 2006                                  January 18, 2006

  6    RTP payload format for the future scalable and wideband extension of
  7                             G.729 audio codec
  8                   draft-ietf-avt-rtp-g729-scal-wb-ext-01

 10  Status of this Memo

 12     By submitting this Internet-Draft, each author represents that any
 13     applicable patent or other IPR claims of which he or she is aware
 14     have been or will be disclosed, and any of which he or she becomes
 15     aware will be disclosed, in accordance with Section 6 of BCP 79.

 17     Internet-Drafts are working documents of the Internet Engineering
 18     Task Force (IETF), its areas, and its working groups.
        Note that
 19     other groups may also distribute working documents as
 20     Internet-Drafts.

 22     Internet-Drafts are draft documents valid for a maximum of six months
 23     and may be updated, replaced, or obsoleted by other documents at any
 24     time.  It is inappropriate to use Internet-Drafts as reference
 25     material or to cite them other than as "work in progress."

 27     The list of current Internet-Drafts can be accessed at
 28     http://www.ietf.org/ietf/1id-abstracts.txt.

 30     The list of Internet-Draft Shadow Directories can be accessed at
 31     http://www.ietf.org/shadow.html.

 33     This Internet-Draft will expire on July 22, 2006.

 35  Copyright Notice

 37     Copyright (C) The Internet Society (2006).

 39  Abstract

 41     This document specifies a real-time transport protocol (RTP) payload
 42     format to be used for the future scalable and wideband extension of
 43     the International Telecommunication Union (ITU-T) G.729 audio codec.
 44     A media type registration is included for this payload format.

 46  Table of Contents

 48     1.  Introduction
 49     2.  Background
 50     3.  RTP header usage
 51     4.  Payload format
 52       4.1.  Payload structure
 53       4.2.  Payload Header: MBS field
 54       4.3.  Payload Header: FT field
 55       4.4.  Audio data
 56     5.  Payload format parameters
 57       5.1.  Media type registration
 58       5.2.  Mapping to SDP parameters
 59       5.3.  Offer-answer model considerations
 60     6.  Security considerations
 61     7.  IANA considerations
 62     8.  References
 63       8.1.  Normative references
 64       8.2.  Informative references
 65     Author's Address
 66     Intellectual Property and Copyright Statements

 68  1.  Introduction

 70     The International Telecommunication Union (ITU-T) is working on a
 71     scalable and wideband extension of its recommendation G.729 [6].
 72     This future audio codec will be called G.729EV in the following text.
 73     This document specifies the payload format for packetization of
 74     G.729EV encoded audio signals into the real-time transport protocol
 75     (RTP).

 77     The payload format itself and the handling of variable bit rate are
 78     described in Section 4.  A media type registration and the details
 79     for the use of G.729EV with SDP are given in Section 5.

 81     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
 82     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
 83     document are to be interpreted as described in RFC 2119 [1].

 85  2.  Background

 87     G.729EV is mainly designed to be used as a speech codec, but it can
 88     be used for music at the highest bit rates.  The sampling frequency
 89     is 16000 Hz and the frame size is 20 ms.

 91     This G.729-based codec produces an embedded bitstream providing an
 92     improved narrow band quality [300, 3400 Hz] at 12 kbps, and an
 93     enhanced and gracefully improving wideband quality [50, 7000 Hz] from
 94     14 kbps to 32 kbps, by steps of 2 kbps.  At 8 kbps it generates a
 95     G.729 bitstream.

 97     It has been mainly designed for packetized wideband voice
 98     applications (Voice over IP or ATM, Telephony over IP, private
 99     networks...) and particularly for those requiring scalable bandwidth,
100     enhanced quality above G.729, and easy integration into existing
101     infrastructures.
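
As an illustration of the figures given earlier in this section, the
following Python sketch derives the size of one encoded frame from the
codec bit rate and the 20 ms frame duration, and the number of samples
per frame from the 16000 Hz sampling frequency.  The names used
(BIT_RATES_KBPS, frame_size_octets, samples_per_frame) are hypothetical
and are not defined by this specification.

```python
# Illustrative sketch only; none of these names come from the draft.
FRAME_DURATION_MS = 20      # frame size stated in Section 2
SAMPLING_RATE_HZ = 16000    # sampling frequency stated in Section 2

# 8 kbps (G.729 bitstream), 12 kbps (narrow band),
# then 14 kbps to 32 kbps by steps of 2 kbps (wideband).
BIT_RATES_KBPS = [8, 12] + list(range(14, 33, 2))


def frame_size_octets(rate_kbps: int) -> int:
    """Octets needed for one 20 ms frame at the given codec bit rate."""
    bits = rate_kbps * FRAME_DURATION_MS   # (kbit/s) * ms = bits
    if bits % 8 != 0:
        raise ValueError("rate does not give a whole number of octets")
    return bits // 8


def samples_per_frame() -> int:
    """Number of audio samples covered by one frame."""
    return SAMPLING_RATE_HZ * FRAME_DURATION_MS // 1000
```

For example, frame_size_octets(8) gives 20 octets for the G.729 rate
and frame_size_octets(32) gives 80 octets for the highest wideband
rate; samples_per_frame() gives 320.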

103     G.729EV is also designed to cope with other services like high
104     quality audio/video conferencing, archival, messaging, etc.

106     For all those applications, the scalability feature allows tuning
107     the bit rate versus quality trade-off, possibly in a dynamic way
108     during a session, taking into account service requirements and
109     network transport constraints.

111     G.729EV produces frames that are said to be embedded because they
112     are composed of embedded layers.  The first layer is called the core
113     layer and is bitstream compatible with the ITU-T G.729 with annex B
114     coder.  Upper layers are added while bit rate increases, to improve
115     quality and enlarge audio bandwidth from narrowband to wideband.  As
116     a result, a received frame can be decoded at its original bit rate or
117     at any lower bit rate corresponding to lower layers which are
118     embedded.  Only the core layer is mandatory to decode understandable
119     speech; upper layers provide quality enhancement and wideband
120     enlargement.

122     Audio codecs often support voice activity detection (VAD) and comfort
123     noise generation (CNG).  During silence periods, the coder may
124     significantly decrease the transmitted bit rate by sending only
125     comfort noise parameters in special small frames called silence
126     insertion descriptors (SID).  The receiver's decoder will generate
127     comfort noise according to the SID information.  This operation of
128     sending low bit rate comfort noise parameters during silence periods
129     is usually called discontinuous transmission (DTX).

131     G.729EV will be first released without support for DTX.  However,
132     this functionality is planned and will be defined in a separate annex
133     later.  Thus this specification provides DTX signalling, even if the
134     size of a SID frame is not yet standardized.

136  3.  RTP header usage

138     The format of the RTP header is specified in RFC 3550 [2].
        This
139     payload format uses the fields of the header in a manner consistent
140     with that specification.

142     The RTP timestamp clock frequency is the same as the sampling
143     frequency, that is 16 kHz.  So the timestamp unit is in samples.

145     The duration of one frame is 20 ms, corresponding to 320 samples per
146     frame.  Thus the timestamp is increased by 320 for each consecutive
147     frame.

149     The M bit should be set as specified in the applicable RTP profile,
150     for example, RFC 3551 [3].

152     The assignment of an RTP payload type for this packet format is
153     outside the scope of the document, and will not be specified here.
154     It is expected that the RTP profile under which this payload format
155     is being used will assign a payload type for this codec or specify
156     that the payload type is to be bound dynamically (see Section 5.2).

158  4.  Payload format
159  4.1.  Payload structure

161     The complete payload consists of a payload header of 1 octet,
162     followed by audio data representing one or more consecutive frames at
163     the same bit rate.

165      0                   1                   2                   3
166      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
167     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
168     |  MBS  |  FT   |                                               |
169     +-+-+-+-+-+-+-+-+                                               +
170     :            one or more frames at the same bit rate            :
171     :                                                               :
172     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

174  4.2.  Payload Header: MBS field

176     MBS (4 bits): maximum bit rate supported.  Indicates a maximum bit
177     rate to the encoder at the site of the receiver of this payload.
        The
178     value of the MBS field is set according to the following table:

180                        +-------+--------------+
181                        |  MBS  | max bit rate |
182                        +-------+--------------+
183                        |   0   |    8 kbps    |
184                        |   1   |   12 kbps    |
185                        |   2   |   14 kbps    |
186                        |   3   |   16 kbps    |
187                        |   4   |   18 kbps    |
188                        |   5   |   20 kbps    |
189                        |   6   |   22 kbps    |
190                        |   7   |   24 kbps    |
191                        |   8   |   26 kbps    |
192                        |   9   |   28 kbps    |
193                        |  10   |   30 kbps    |
194                        |  11   |   32 kbps    |
195                        | 12-14 |  (reserved)  |
196                        |  15   |    NO_MBS    |
197                        +-------+--------------+

199     The MBS is used to tell the other party the maximum bit rate one can
200     receive.  The encoder MUST follow the received MBS.  It MUST NOT send
201     frames at a bit rate higher than the received MBS.  Thanks to the
202     embedded property of the coding scheme, note that it can send frames
203     at the MBS rate or any lower rate.  As long as it does not exceed the
204     MBS, it can change its bit rate at any time without previous notice.

206     The MBS received is valid until the next MBS is received, i.e. a
207     newly received MBS value overrides the previous one.

209     If a payload with an invalid MBS value is received, the MBS MUST be
210     ignored.

212     Note that the MBS is a codec bit rate; the actual network bit rate is
213     higher and depends on the overhead of the underlying protocols.

215     The MBS field MUST be set to 15 for packets sent to a multicast
216     group.

218     The MBS field MUST be set to 15 in all packets when the actual MBS
219     value is sent through non-RTP means.  This is out of the scope of
220     this specification.

222  4.3.
           Payload Header: FT field

224     FT (4 bits): Frame type of the frame(s) in this packet, as per the
225     following table:

227                 +-------+---------------+------------+
228                 |  FT   | encoding rate | frame size |
229                 +-------+---------------+------------+
230                 |   0   |     8 kbps    |  20 octets |
231                 |   1   |    12 kbps    |  30 octets |
232                 |   2   |    14 kbps    |  35 octets |
233                 |   3   |    16 kbps    |  40 octets |
234                 |   4   |    18 kbps    |  45 octets |
235                 |   5   |    20 kbps    |  50 octets |
236                 |   6   |    22 kbps    |  55 octets |
237                 |   7   |    24 kbps    |  60 octets |
238                 |   8   |    26 kbps    |  65 octets |
239                 |   9   |    28 kbps    |  70 octets |
240                 |  10   |    30 kbps    |  75 octets |
241                 |  11   |    32 kbps    |  80 octets |
242                 | 12-14 |   (reserved)  |            |
243                 |  15   |    NO_DATA    |      0     |
244                 +-------+---------------+------------+

246     The FT value 15 (NO_DATA) indicates that there is no audio data in
247     the payload.  This MAY be used to update the MBS value when there is
248     no audio frame to transmit.  The payload will then be reduced to the
249     payload header.

251     If a payload with an invalid FT value is received, the whole payload
252     MUST be ignored.

254  4.4.  Audio data

256     Audio data of a payload contains one or more consecutive audio frames
257     at the same bit rate.  The audio frames are packed in order of time,
258     that is, oldest first.

260     The actual number of frames is easy to infer from the size of the
261     audio data part:

263        nb_frames = (size_of_audio_data) / (size_of_one_frame).

265     This is compatible with DTX, with the restriction that the SID frame
266     MUST be at the end of the payload (it is consistent with the payload
267     format of G.729 described in section 4.5.6 of RFC 3551 [3]).  Since
268     the SID frame is much smaller than any other frame, it will not
269     hinder the calculation of the number of frames at the receiver side
270     and can be easily detected.  Actually the presence of a SID frame
271     will be inferred by the result of the above division not being an
272     integer.

274     Note that if FT=15, there will be no audio frame in the payload.

276  5.
         Payload format parameters

278     This section defines the parameters that may be used to configure
279     optional features in the G.729EV RTP transmission.

281     The parameters are defined here as part of the media subtype
282     registration for the G.729EV codec.  A mapping of the parameters into
283     the Session Description Protocol (SDP) [4] is also provided for those
284     applications that use SDP.  In control protocols that do not use MIME
285     or SDP, the media type parameters must be mapped to the appropriate
286     format used with that control protocol.

288  5.1.  Media type registration

290     This registration is done using the template defined in RFC 4288 [7]
291     and following RFC 3555 [8].

293     Type name: audio

295     Subtype name: G729EV

297     Required parameters: none

299     Optional parameters:

301     dtx: indicates that discontinuous transmission (DTX) is used or
302        preferred.  DTX means voice activity detection and
303        non-transmission of silent frames.  Permissible values are 0 and
304        1; 0 means no DTX and is implied if this parameter is omitted.
305        The first version of G.729EV will not support DTX.

307     maxbitrate: the absolute maximum codec bit rate for the session.
308        Permissible values are between 0 and 11 (see table in Section 4.2
309        of RFC XXXX). 11 is implied if this parameter is omitted.  The
310        maxbitrate restricts the range of bit rates which can be used.
311        The frame bit rate (FT) and the MBS MUST NOT exceed this value.

313     mbs: the initial value of MBS, that is the current maximum codec bit
314        rate supported as a receiver.  Permissible values are between 0
315        and maxbitrate (see table in Section 4.2 of RFC XXXX).  The
316        maximum MBS value is implied if this parameter is omitted.  Note
317        that this parameter will be dynamically updated by the MBS field
318        of the RTP packets sent; it is not an absolute value for the
319        session.
           The goal is to announce this value, prior to the sending
320        of any packet, so that the remote sender does not exceed the MBS
321        at the beginning of the session.

323     ptime: the recommended length of time in milliseconds represented by
324        the media in a packet.  See RFC 2327 [4].

326     maxptime: the maximum length of time in milliseconds which can be
327        encapsulated in a packet.

329     Encoding considerations: This media type is framed and contains
330        binary data.

332     Security considerations: See Section 6 of RFC XXXX

334     Interoperability considerations: none

336     Published specification: RFC XXXX

338     Applications which use this media type: Audio and video conferencing
339        tools.

341     Additional information: none

343     Person & email address to contact for further information: Aurelien
344        Sollaud, aurelien.sollaud@francetelecom.com

346     Intended usage: COMMON

348     Restrictions on usage: This media type depends on RTP framing, and
349        hence is only defined for transfer via RTP [2].

351     Author/Change controller: IETF Audio/Video Transport working group
352        delegated from the IESG

354  5.2.  Mapping to SDP parameters

356     The information carried in the media type specification has a
357     specific mapping to fields in the Session Description Protocol (SDP)
358     [4], which is commonly used to describe RTP sessions.  When SDP is
359     used to specify sessions employing the G.729EV codec, the mapping is
360     as follows:

362     o  The media type ("audio") goes in SDP "m=" as the media name.

364     o  The media subtype ("G729EV") goes in SDP "a=rtpmap" as the
365        encoding name.  The RTP clock rate in "a=rtpmap" MUST be 16000 for
366        G.729EV.

368     o  The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and
369        "a=maxptime" attributes, respectively.

371     o  Any remaining parameters go in the SDP "a=fmtp" attribute by
372        copying them directly from the media type string as a semicolon
373        separated list of parameter=value pairs.
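
The mapping rules above can be sketched in Python as follows.  This is
a minimal illustration, not a normative procedure: the function name
g729ev_sdp_lines and its arguments are hypothetical, and the order in
which the attribute lines are emitted is a choice made for this sketch.

```python
# Illustrative sketch only; the function name and arguments are
# assumptions made for this example, not part of the specification.
def g729ev_sdp_lines(port, payload_type, params=None):
    """Build SDP media-level lines for G.729EV from media type parameters."""
    params = dict(params or {})
    lines = [
        # media type "audio" goes in the "m=" line
        f"m=audio {port} RTP/AVP {payload_type}",
        # subtype "G729EV" goes in "a=rtpmap"; clock rate MUST be 16000
        f"a=rtpmap:{payload_type} G729EV/16000",
    ]
    # "ptime" and "maxptime" map to their own SDP attributes
    own_attributes = []
    for name in ("ptime", "maxptime"):
        if name in params:
            own_attributes.append(f"a={name}:{params.pop(name)}")
    # any remaining parameters go in "a=fmtp" as parameter=value pairs
    if params:
        pairs = "; ".join(f"{key}={value}" for key, value in params.items())
        lines.append(f"a=fmtp:{payload_type} {pairs}")
    lines.extend(own_attributes)
    return lines
```

Called with port 51258, payload type 99 and the parameters dtx=0, mbs=8
and ptime=40, the sketch returns an "m=audio" line, an "a=rtpmap" line,
the line "a=fmtp:99 dtx=0; mbs=8" and the line "a=ptime:40".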

375     Some example SDP session descriptions utilizing G.729EV encodings
376     follow.

378     Example 1: default parameters

380        m=audio 53146 RTP/AVP 98
381        a=rtpmap:98 G729EV/16000

383     Example 2: recommended packet duration of 40 ms (=2 frames), DTX off,
384     and initial MBS set to 26 kbps

386        m=audio 51258 RTP/AVP 99
387        a=rtpmap:99 G729EV/16000
388        a=fmtp:99 dtx=0; mbs=8
389        a=ptime:40

391  5.3.  Offer-answer model considerations

393     The following considerations apply when using SDP offer-answer
394     procedures to negotiate the use of G.729EV payload in RTP:

396     o  Since G.729EV is an extension of G.729, the offerer SHOULD
397        announce G.729 support in its "m=audio" line, with G.729EV
398        preferred.  This will allow interoperability with both G.729EV and
399        G.729-only capable parties.

401        Below is an example of such an offer:

403           m=audio 55954 RTP/AVP 98 18
404           a=rtpmap:98 G729EV/16000
405           a=rtpmap:18 G729/8000

407        If the answerer supports G.729EV, it will keep the payload type 98
408        in its answer and the conversation will be done using G.729EV.
409        Otherwise, if the answerer supports only G.729, it will leave only
410        the payload type 18 in its answer and the conversation will be
411        done using G.729 (the payload format for G.729 is defined in RFC
412        3551 [3]).

414     o  The "dtx" parameter concerns both sending and receiving, so both
415        sides of a bi-directional session MUST use the same "dtx" value.
416        If one party indicates it does not support DTX, DTX must be
417        deactivated both ways.

419     o  The "maxbitrate" parameter is bi-directional.  If the offerer sets
420        a maxbitrate value, the answerer MUST reply with a smaller or
421        equal value.  The actual maximum bit rate for the session will be
422        the minimum.

424     o  The "mbs" parameter is not symmetric.  Values in the offer and the
425        answer are independent and take into account local constraints.
426        In any case, one party MUST NOT start sending frames at a bit rate
427        higher than the "mbs" of the other party.

429     o  The parameters "ptime" and "maxptime" will in most cases not
430        affect interoperability.  The SDP offer-answer handling of the
431        "ptime" parameter is described in RFC 3264 [5].  The "maxptime"
432        parameter MUST be handled in the same way.

434  6.  Security considerations

436     RTP packets using the payload format defined in this specification
437     are subject to the general security considerations discussed in the
438     RTP specification [2] and any appropriate profile (for example, RFC
439     3551 [3]).

441     As this format transports encoded speech/audio, the main security
442     issues include confidentiality and authentication of the speech/audio
443     itself.  The payload format itself does not have any built-in
444     security mechanisms.  Confidentiality of the media streams is
445     achieved by encryption; therefore external mechanisms, such as SRTP
446     [9], MAY be used for that purpose.

448     This payload format and the G.729EV encoding do not exhibit any
449     significant non-uniformity in the receiver-end computational load and
450     thus are unlikely to pose a denial-of-service threat due to the
451     receipt of pathological datagrams.

453  7.  IANA considerations

455     It is requested that one new media subtype (audio/G729EV) be
456     registered by IANA; see Section 5.1.

458  8.  References

460  8.1.  Normative references

462     [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
463          Levels", BCP 14, RFC 2119, March 1997.

465     [2]  Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson,
466          "RTP: A Transport Protocol for Real-Time Applications", STD 64,
467          RFC 3550, July 2003.

469     [3]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video
470          Conferences with Minimal Control", STD 65, RFC 3551, July 2003.

472     [4]  Handley, M. and V. Jacobson, "SDP: Session Description
473          Protocol", RFC 2327, April 1998.

475     [5]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with
476          Session Description Protocol (SDP)", RFC 3264, June 2002.

478  8.2.
           Informative references

480     [6]  International Telecommunications Union, "Coding of speech at 8
481          kbit/s using conjugate-structure algebraic-code-excited
482          linear-prediction (CS-ACELP)", ITU-T Recommendation G.729,
             March 1996.

484     [7]  Freed, N. and J. Klensin, "Media Type Specifications and
485          Registration Procedures", BCP 13, RFC 4288, December 2005.

487     [8]  Casner, S. and P. Hoschka, "MIME Type Registration of RTP
488          Payload Formats", RFC 3555, July 2003.

490     [9]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
491          Norrman, "The Secure Real-time Transport Protocol (SRTP)",
492          RFC 3711, March 2004.

494  Author's Address

496     Aurelien Sollaud
497     France Telecom
498     2 avenue Pierre Marzin
499     Lannion Cedex  22307
500     France

502     Phone: +33 2 96 05 15 06
503     Email: aurelien.sollaud@francetelecom.com

505  Intellectual Property Statement

507     The IETF takes no position regarding the validity or scope of any
508     Intellectual Property Rights or other rights that might be claimed to
509     pertain to the implementation or use of the technology described in
510     this document or the extent to which any license under such rights
511     might or might not be available; nor does it represent that it has
512     made any independent effort to identify any such rights.  Information
513     on the procedures with respect to rights in RFC documents can be
514     found in BCP 78 and BCP 79.

516     Copies of IPR disclosures made to the IETF Secretariat and any
517     assurances of licenses to be made available, or the result of an
518     attempt made to obtain a general license or permission for the use of
519     such proprietary rights by implementers or users of this
520     specification can be obtained from the IETF on-line IPR repository at
521     http://www.ietf.org/ipr.

523     The IETF invites any interested party to bring to its attention any
524     copyrights, patents or patent applications, or other proprietary
525     rights that may cover technology that may be required to implement
526     this standard.
        Please address the information to the IETF at
527     ietf-ipr@ietf.org.

529  Disclaimer of Validity

531     This document and the information contained herein are provided on an
532     "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
533     OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
534     ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
535     INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
536     INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
537     WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

539  Copyright Statement

541     Copyright (C) The Internet Society (2006).  This document is subject
542     to the rights, licenses and restrictions contained in BCP 78, and
543     except as set forth therein, the authors retain all their rights.

545  Acknowledgment

547     Funding for the RFC Editor function is currently provided by the
548     Internet Society.