Network Working Group                                         A. Sollaud
Internet-Draft                                            France Telecom
Expires: May 21, 2006                                  November 17, 2005


  RTP payload format for the future scalable and wideband extension of
                           G.729 audio codec
               draft-sollaud-avt-rtp-g729-scal-wb-ext-02

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 21, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2005).

Abstract

   This document specifies a real-time transport protocol (RTP) payload
   format to be used for the future scalable and wideband extension of
   the International Telecommunication Union (ITU-T) G.729 audio codec.
   A media type registration is included for this payload format.

Table of Contents

   1. Introduction
   2. Background
   3. RTP header usage
   4. Payload format
     4.1. Payload structure
     4.2. Payload Header: MBS field
     4.3. Payload Header: FT field
     4.4. Audio data
   5. Payload format parameters
     5.1. Media type registration
     5.2. Mapping to SDP parameters
     5.3. Offer-answer model considerations
   6. Security considerations
   7. IANA considerations
   8. References
     8.1. Normative references
     8.2. Informative references
   Author's Address
   Intellectual Property and Copyright Statements

1. Introduction

   The International Telecommunication Union (ITU-T) is working on a
   scalable and wideband extension of its recommendation G.729 [7].
   This future audio codec will be called G.729EV in the following text.
   This document specifies the payload format for packetization of
   G.729EV encoded audio signals into the real-time transport protocol
   (RTP).

   The payload format itself and the handling of variable bit rate are
   described in Section 4. A media type registration and the details
   for the use of G.729EV with SDP are given in Section 5.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [1].

2. Background

   G.729EV is mainly designed to be used as a speech codec, but it can
   be used for music at the highest bit rates. The sampling frequency
   is 16000 Hz and the frame size is 20 ms.

   This G.729-based codec produces an embedded bitstream providing an
   improved narrow band quality [300, 3400 Hz] at 12 kbps, and an
   enhanced and gracefully improving wideband quality [50, 7000 Hz] from
   14 kbps to 32 kbps, by steps of 2 kbps. At 8 kbps it generates a
   G.729 bitstream.

   It has been mainly designed for packetized wideband voice
   applications (Voice over IP or ATM, Telephony over IP, private
   networks...) and particularly for those requiring scalable bandwidth,
   enhanced quality above G.729, and easy integration into existing
   infrastructures.

   G.729EV is also designed to cope with other services like high
   quality audio/video conferencing, archival, messaging, etc.

   For all those applications, the scalability feature allows the bit
   rate versus quality trade-off to be tuned, possibly dynamically
   during a session, taking into account service requirements and
   network transport constraints.

   G.729EV produces frames that are said to be embedded because they
   are composed of embedded layers. The first layer is called the core
   layer and is bitstream compatible with the ITU-T G.729 with annex B
   coder. Upper layers are added as the bit rate increases, to improve
   quality and enlarge the audio bandwidth from narrowband to wideband.
   As a result, a received frame can be decoded at its original bit
   rate or at any lower bit rate corresponding to the lower layers
   embedded in it. Only the core layer is mandatory to decode
   understandable speech; upper layers provide quality enhancement and
   wideband enlargement.

   Audio codecs often support voice activity detection (VAD) and comfort
   noise generation (CNG). During silence periods, the coder may
   significantly decrease the transmitted bit rate by sending only
   comfort noise parameters in special small frames called silence
   insertion descriptors (SID). The receiver's decoder will generate
   comfort noise according to the SID information. This operation of
   sending low bit rate comfort noise parameters during silence periods
   is usually called discontinuous transmission (DTX).

   G.729EV will be first released without support for DTX. However,
   this functionality is planned and will be defined in a separate
   annex later. Thus this specification provides DTX signalling, even
   though the size of a SID frame is not yet standardized.

3. RTP header usage

   The format of the RTP header is specified in RFC 3550 [2]. This
   payload format uses the fields of the header in a manner consistent
   with that specification.

   The RTP timestamp clock frequency is the same as the sampling
   frequency, that is 16 kHz.
   So the timestamp unit is in samples.

   The duration of one frame is 20 ms, corresponding to 320 samples per
   frame. Thus the timestamp is increased by 320 for each consecutive
   frame.

   The M bit should be set as specified in the applicable RTP profile,
   for example, RFC 3551 [3].

   The assignment of an RTP payload type for this packet format is
   outside the scope of this document, and will not be specified here.
   It is expected that the RTP profile under which this payload format
   is being used will assign a payload type for this codec or specify
   that the payload type is to be bound dynamically (see Section 5.2).

4. Payload format

4.1. Payload structure

   The complete payload consists of a payload header of 1 octet,
   followed by audio data representing one or more consecutive frames at
   the same bit rate.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  MBS  |  FT   |                                               |
   +-+-+-+-+-+-+-+-+                                               +
   :            one or more frames at the same bit rate            :
   :                                                               :
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

4.2. Payload Header: MBS field

   MBS (4 bits): maximum bit rate supported. Indicates a maximum bit
   rate to the encoder at the site of the receiver of this payload. The
   value of the MBS field is set according to the following table:

   +-------+--------------+
   |  MBS  | max bit rate |
   +-------+--------------+
   |   0   |    8 kbps    |
   |   1   |   12 kbps    |
   |   2   |   14 kbps    |
   |   3   |   16 kbps    |
   |   4   |   18 kbps    |
   |   5   |   20 kbps    |
   |   6   |   22 kbps    |
   |   7   |   24 kbps    |
   |   8   |   26 kbps    |
   |   9   |   28 kbps    |
   |  10   |   30 kbps    |
   |  11   |   32 kbps    |
   | 12-14 |  (reserved)  |
   |  15   |    NO_MBS    |
   +-------+--------------+

   The MBS is used to tell the other party the maximum bit rate one can
   receive. The encoder MUST follow the received MBS.
   It MUST NOT send frames at a bit rate higher than the received MBS.
   Note that, thanks to the embedded property of the coding scheme, it
   can send frames at the MBS rate or at any lower rate. As long as it
   does not exceed the MBS, it can change its bit rate at any time
   without previous notice.

   The MBS received is valid until the next MBS is received, i.e. a
   newly received MBS value overrides the previous one.

   If a payload with an invalid MBS value is received, the MBS MUST be
   ignored.

   Note that the MBS is a codec bit rate; the actual network bit rate is
   higher and depends on the overhead of the underlying protocols.

   The MBS field MUST be set to 15 for packets sent to a multicast
   group.

   The MBS field MUST be set to 15 in all packets when the actual MBS
   value is sent through non-RTP means. This is out of the scope of
   this specification.

4.3. Payload Header: FT field

   FT (4 bits): Frame type of the frame(s) in this packet, as per the
   following table:

   +-------+---------------+------------+
   |  FT   | encoding rate | frame size |
   +-------+---------------+------------+
   |   0   |    8 kbps     | 20 octets  |
   |   1   |   12 kbps     | 30 octets  |
   |   2   |   14 kbps     | 35 octets  |
   |   3   |   16 kbps     | 40 octets  |
   |   4   |   18 kbps     | 45 octets  |
   |   5   |   20 kbps     | 50 octets  |
   |   6   |   22 kbps     | 55 octets  |
   |   7   |   24 kbps     | 60 octets  |
   |   8   |   26 kbps     | 65 octets  |
   |   9   |   28 kbps     | 70 octets  |
   |  10   |   30 kbps     | 75 octets  |
   |  11   |   32 kbps     | 80 octets  |
   | 12-14 |  (reserved)   |            |
   |  15   |    NO_DATA    |     0      |
   +-------+---------------+------------+

   The FT value 15 (NO_DATA) indicates that there is no audio data in
   the payload. This MAY be used to update the MBS value when there is
   no audio frame to transmit. The payload will then be reduced to the
   payload header.

   If a payload with an invalid FT value is received, the whole payload
   MUST be ignored.

4.4. Audio data

   Audio data of a payload contains one or more consecutive audio frames
   at the same bit rate. The audio frames are packed in order of time,
   that is, the oldest first.

   The actual number of frames is easy to infer from the size of the
   audio data part:

   nb_frames = (size_of_audio_data) / (size_of_one_frame).

   This is compatible with DTX, with the restriction that the SID frame
   MUST be at the end of the payload (this is consistent with the
   payload format of G.729 described in section 4.5.6 of RFC 3551 [3]).
   Since the SID frame is much smaller than any other frame, it will not
   hinder the calculation of the number of frames at the receiver side
   and can be easily detected. In practice, the presence of a SID frame
   is inferred from the result of the above division not being an
   integer.

   Note that if FT=15, there will be no audio frame in the payload.

5. Payload format parameters

   This section defines the parameters that may be used to configure
   optional features in the G.729EV RTP transmission.

   The parameters are defined here as part of the media subtype
   registration for the G.729EV codec. A mapping of the parameters into
   the Session Description Protocol (SDP) [4] is also provided for those
   applications that use SDP. In control protocols that do not use MIME
   or SDP, the media type parameters must be mapped to the appropriate
   format used with that control protocol.

5.1. Media type registration

   This registration is done using the template defined in [8] and
   following RFC 3555 [9].

   Type name: audio

   Subtype name: G729EV

   Required parameters: none

   Optional parameters:

   dtx: indicates that discontinuous transmission (DTX) is used or
      preferred. DTX means voice activity detection and
      non-transmission of silent frames. Permissible values are 0 and
      1. 0 means no DTX. 0 is implied if this parameter is omitted.
      The first version of G.729EV will not support DTX.

   mbs: indicates an initial value of MBS, that is the current maximum
      codec bit rate supported as a receiver. Permissible values are
      between 0 and 11 (see table in Section 4.2 of RFC XXXX). The
      maximum MBS, that is 11, is implied if this parameter is omitted.
      Note that this parameter will be dynamically updated by the MBS
      field of the RTP packets sent; it is not an absolute value for the
      session. The goal is to announce this value prior to the sending
      of any packet, so that the remote sender does not exceed the MBS
      at the beginning of the session.

   ptime: the recommended length of time in milliseconds represented by
      the media in a packet. See RFC 2327 [4].

   maxptime: the maximum length of time in milliseconds which can be
      encapsulated in a packet.

   Encoding considerations: This media type is framed and contains
      binary data.

   Security considerations: See Section 6 of RFC XXXX

   Interoperability considerations: none

   Published specification: RFC XXXX

   Applications which use this media type: Audio and video conferencing
      tools.

   Additional information: none

   Person & email address to contact for further information: Aurelien
      Sollaud, aurelien.sollaud@francetelecom.com

   Intended usage: COMMON

   Restrictions on usage: This media type depends on RTP framing, and
      hence is only defined for transfer via RTP [2].

   Author/Change controller: IETF Audio/Video Transport working group
      delegated from the IESG

5.2. Mapping to SDP parameters

   The information carried in the media type specification has a
   specific mapping to fields in the Session Description Protocol (SDP)
   [4], which is commonly used to describe RTP sessions. When SDP is
   used to specify sessions employing the G.729EV codec, the mapping is
   as follows:

   o  The media type ("audio") goes in SDP "m=" as the media name.

   o  The media subtype ("G729EV") goes in SDP "a=rtpmap" as the
      encoding name. The RTP clock rate in "a=rtpmap" MUST be 16000 for
      G.729EV.

   o  The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and
      "a=maxptime" attributes, respectively.

   o  Any remaining parameters go in the SDP "a=fmtp" attribute by
      copying them directly from the media type string as a
      semicolon-separated list of parameter=value pairs.

   Some example SDP session descriptions utilizing G.729EV encodings
   follow.

   Example 1: default parameters

      m=audio 53146 RTP/AVP 98
      a=rtpmap:98 G729EV/16000

   Example 2: recommended packet duration of 40 ms (=2 frames), DTX off,
   and initial MBS set to 26 kbps

      m=audio 51258 RTP/AVP 99
      a=rtpmap:99 G729EV/16000
      a=fmtp:99 dtx=0; mbs=8
      a=ptime:40

5.3. Offer-answer model considerations

   The following considerations apply when using SDP offer-answer
   procedures to negotiate the use of G.729EV payload in RTP:

   o  Since G.729EV is an extension of G.729, the offerer SHOULD
      announce G.729 support in its "m=audio" line, with G.729EV
      preferred. This will allow interoperability with both G.729EV and
      G.729-only capable parties.

      Below is an example of such an offer:

         m=audio 55954 RTP/AVP 98 18
         a=rtpmap:98 G729EV/16000
         a=rtpmap:18 G729/8000

      If the answerer supports G.729EV, it will keep the payload type 98
      in its answer and the conversation will be done using G.729EV.
      Otherwise, if the answerer supports only G.729, it will leave only
      the payload type 18 in its answer and the conversation will be
      done using G.729 (the payload format for G.729 is defined in RFC
      3551 [3]).

   o  The "dtx" parameter concerns both sending and receiving, so both
      sides of a bi-directional session MUST use the same "dtx" value.
      If one party indicates it does not support DTX, DTX must be
      deactivated both ways.

   o  The "mbs" parameter is not symmetric.
      Values in the offer and the answer are independent and take into
      account local constraints. However, one party MUST NOT start
      sending frames at a bit rate higher than the "mbs" of the other
      party.

   o  The parameters "ptime" and "maxptime" will in most cases not
      affect interoperability. The SDP offer-answer handling of the
      "ptime" parameter is described in RFC 3264 [5]. The "maxptime"
      parameter MUST be handled in the same way.

6. Security considerations

   RTP packets using the payload format defined in this specification
   are subject to the general security considerations discussed in the
   RTP specification [2] and any appropriate profile (for example, RFC
   3551 [3]).

   As this format transports encoded speech/audio, the main security
   issues include confidentiality and authentication of the speech/audio
   itself. The payload format itself does not have any built-in
   security mechanisms. Confidentiality of the media streams is
   achieved by encryption; therefore, external mechanisms, such as SRTP
   [6], MAY be used for that purpose.

   This payload format and the G.729EV encoding do not exhibit any
   significant non-uniformity in the receiver-end computational load and
   thus are unlikely to pose a denial-of-service threat due to the
   receipt of pathological datagrams.

7. IANA considerations

   It is requested that one new media subtype (audio/G729EV) be
   registered by IANA, see Section 5.1.

8. References

8.1. Normative references

   [1] Bradner, S., "Key words for use in RFCs to Indicate Requirement
       Levels", BCP 14, RFC 2119, March 1997.

   [2] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson,
       "RTP: A Transport Protocol for Real-Time Applications", STD 64,
       RFC 3550, July 2003.

   [3] Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video
       Conferences with Minimal Control", STD 65, RFC 3551, July 2003.

   [4] Handley, M. and V. Jacobson, "SDP: Session Description
       Protocol", RFC 2327, April 1998.

   [5] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with
       Session Description Protocol (SDP)", RFC 3264, June 2002.

   [6] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
       Norrman, "The Secure Real-time Transport Protocol (SRTP)",
       RFC 3711, March 2004.

8.2. Informative references

   [7] International Telecommunications Union, "Coding of speech at 8
       kbit/s using conjugate-structure algebraic-code-excited
       linear-prediction (CS-ACELP)", ITU-T Recommendation G.729, March
       1996.

   [8] Freed, N. and J. Klensin, "Media Type Specifications and
       Registration Procedures", draft-freed-media-type-reg-05 (work in
       progress), August 2005.

   [9] Casner, S. and P. Hoschka, "MIME Type Registration of RTP
       Payload Formats", RFC 3555, July 2003.

Author's Address

   Aurelien Sollaud
   France Telecom
   2 avenue Pierre Marzin
   Lannion Cedex 22307
   France

   Phone: +33 2 96 05 15 06
   Email: aurelien.sollaud@francetelecom.com

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2005). This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.
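
   The following Python sketch is an informal illustration of how a
   receiver might apply the rules of Section 4 to a single RTP payload;
   it is not part of this specification. It assumes that bit 0 in the
   diagram of Section 4.1 is the most significant bit of the header
   octet, as is usual in RTP diagrams, so that MBS is the high nibble
   and FT the low nibble. The names parse_payload and ParsedPayload are
   hypothetical. Because the size of a SID frame is not yet
   standardized, leftover octets are only reported, not decoded.

```python
from dataclasses import dataclass
from typing import Optional

# Frame size in octets for each valid FT value (table in Section 4.3).
FRAME_SIZE_OCTETS = dict(enumerate(
    [20, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80]))
NO_DATA = 15        # FT value: the payload carries no audio data
MAX_VALID_MBS = 11  # MBS 12-14 are reserved, 15 is NO_MBS


@dataclass
class ParsedPayload:
    mbs: Optional[int]    # new MBS to apply, or None (NO_MBS or invalid)
    ft: int               # frame type of every frame in the payload
    nb_frames: int        # number of whole audio frames
    trailing_octets: int  # octets left over after the whole frames


def parse_payload(payload: bytes) -> Optional[ParsedPayload]:
    """Split one payload into header fields and a frame count.

    Returns None when the whole payload has to be ignored.
    """
    if len(payload) < 1:
        return None
    header = payload[0]
    mbs_field = header >> 4   # assumption: MBS is the high nibble
    ft = header & 0x0F
    audio = payload[1:]

    # An invalid MBS is ignored; NO_MBS carries no update either.
    mbs = mbs_field if mbs_field <= MAX_VALID_MBS else None

    if ft == NO_DATA:
        return ParsedPayload(mbs, ft, 0, len(audio))
    if ft not in FRAME_SIZE_OCTETS:
        return None           # invalid FT: ignore the whole payload

    # nb_frames = size_of_audio_data / size_of_one_frame (Section 4.4);
    # a non-zero remainder would indicate a SID frame at the end.
    nb_frames, trailing = divmod(len(audio), FRAME_SIZE_OCTETS[ft])
    return ParsedPayload(mbs, ft, nb_frames, trailing)
```

   For example, a payload made of the header octet 0x80 followed by 40
   octets of audio data is read as MBS 8 (26 kbps), FT 0 (8 kbps) and
   two frames with no trailing octets.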