AVT                                                             P. Luthi
Internet-Draft                                                  Tandberg
Expires: July 29, 2006                                      R. Even, Ed.
                                                                 Polycom
                                                        January 25, 2006

          RTP Payload Format for ITU-T Recommendation G.722.1
                   draft-ietf-avt-rfc3047-bis-01.txt

Status of this Memo

By submitting this Internet-Draft, each author represents that any
applicable patent or other IPR claims of which he or she is aware
have been or will be disclosed, and any of which he or she becomes
aware will be disclosed, in accordance with Section 6 of BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups.  Note that
other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time.  It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.

This Internet-Draft will expire on July 29, 2006.

Copyright Notice

Copyright (C) The Internet Society (2006).

Abstract

International Telecommunication Union (ITU-T) Recommendation G.722.1
is a wide-band audio codec.  This document describes the payload
format for including G.722.1 generated bit streams within an RTP
packet.  The document also describes the syntax and semantics of the
SDP parameters needed to support the G.722.1 audio codec.

Luthi & Even             Expires July 29, 2006                  [Page 1]
Internet-Draft                    G7221                     January 2006

Table of Contents

1.  Introduction
2.  Terminology
3.  RTP payload format for G.722.1
  3.1.  Multiple G.722.1 frames in an RTP packet
  3.2.  Computing the number of G.722.1 frames
4. 
IANA Considerations
  4.1.  Media Type Registration
    4.1.1.  Registration of MIME media type audio/G7221
5.  SDP Parameters
  5.1.  Usage with the SDP Offer Answer Model
6.  Security Considerations
7.  Changes from RFC 3047
8.  References
  8.1.  Normative References
  8.2.  Informative References
Authors' Addresses
Intellectual Property and Copyright Statements

1.  Introduction

ITU-T G.722.1 [ITU.G7221] is a low-complexity coder that compresses
50 Hz - 14 kHz audio signals into one of the following bit rates:
24 kbit/s, 32 kbit/s or 48 kbit/s.  The coder may be used for speech,
music and other types of audio.

Some of the applications for which this coder is suitable are:

o  Real-time communications such as videoconferencing and telephony

o  Streaming audio

o  Archival and messaging

A fixed frame size of 20 ms is used, and for any given bit rate the
number of bits in a frame is constant.

2.  Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC2119 [RFC2119] and
indicate requirement levels for compliant RTP implementations.

3. 
RTP payload format for G.722.1

ITU-T G.722.1 [ITU.G7221] uses 20 ms frames and a sampling rate clock
of 16 kHz or 32 kHz, so the RTP [RFC3550] timestamp MUST be in units
of 1/16000 or 1/32000 of a second.  The RTP payload for G.722.1 has
the format shown in Figure 1.  No additional header specific to this
payload format is required.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     RTP Header [RFC3550]                      |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                                                               |
+                 one or more frames of G.722.1                 |
|                             ....                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 1: RTP payload for G.722.1

The encoding and decoding algorithm can change the bit rate at any
20 ms frame boundary, but no bit rate change notification is provided
in-band with the bit stream.  Therefore, a separate out-of-band method
is REQUIRED to indicate the bit rate (see Section 5 for an example of
signaling bit rate information using SDP).  For the payload format
specified here, the bit rate MUST remain constant for a particular
payload type value.  An application MAY switch bit rates from packet
to packet by defining two payload type values and switching between
them.

The assignment of an RTP payload type for this new packet format is
outside the scope of this document, and will not be specified here.
It is expected that the RTP profile for a particular class of
applications will assign a payload type for this encoding, or if that
is not done then a payload type in the dynamic range shall be chosen.

The number of bits within a frame is fixed, and within this fixed
frame G.722.1 uses variable length coding (e.g., Huffman coding) to
represent most of the encoded parameters.  All variable length codes
are transmitted in order from the left most (most significant - MSB)
bit to the right most (least significant - LSB) bit; see [ITU.G7221]
for more details.
The use of Huffman coding means that it is not possible to identify
the various encoded parameters/fields contained within the bit stream
without first completely decoding the entire frame.  For the purposes
of packetizing the bit stream in RTP, it is only necessary to consider
the sequence of bits as output by the G.722.1 encoder, and present the
same sequence to the decoder.  The payload format described here
maintains this sequence.

When operating at 24 kbit/s, 480 bits (60 octets) are produced per
frame.  When operating at 32 kbit/s, 640 bits (80 octets) are produced
per frame.  When operating at 48 kbit/s, 960 bits (120 octets) are
produced per frame.  Thus, all three bit rates allow for octet
alignment without the need for padding bits.

Figure 2 illustrates how the G.722.1 bit stream MUST be mapped into an
octet aligned RTP payload.

first bit                                      last bit
transmitted                                 transmitted
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                     |
+ sequence of bits (480, 640 or 960) generated by the |
| G.722.1 encoder for transmission                    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|               |               |                     |
|               |               |        ...          |
|               |               |                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ... +-+-+-+-+-+-+-+-+
|MSB...      LSB|MSB...      LSB|     |MSB...      LSB|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ ... +-+-+-+-+-+-+-+-+
       RTP             RTP                   RTP
     octet 1         octet 2          octet 60, 80, 120

Figure 2: The G.722.1 encoder bit stream is split into a sequence of
octets (60, 80 or 120 depending on the bit rate), and each octet is in
turn mapped into an RTP octet.

The ITU-T standardized bit rates for G.722.1 are 24 kbit/s or
32 kbit/s at 16 kHz sample rate, and 24 kbit/s, 32 kbit/s or 48 kbit/s
at 32 kHz sample rate.  However, the coding algorithm itself has the
capability to run at any user specified bit rate (not just 24, 32 and
48 kbit/s) while maintaining an audio bandwidth of 50 Hz to 14 kHz.
This rate change is accomplished by a linear scaling of the codec
operation, resulting in frames with size in bits equal to 1/50 of the
corresponding bit rate.

When operating at non-standard rates the payload format MUST follow
the guidelines illustrated in Figure 2.  It is RECOMMENDED that values
in the range 16000 to 48000 be used.  Any value MUST be a multiple of
400; this maintains octet alignment, so that no (undefined) padding
bits are needed in each frame.  For example, a bit rate of 16.4 kbit/s
will result in a frame of size 328 bits or 41 octets, which are mapped
into RTP per Figure 2.

3.1.  Multiple G.722.1 frames in an RTP packet

More than one G.722.1 frame may be included in a single RTP packet by
a sender.  Senders have the following additional restrictions:

o  Senders SHOULD NOT include more G.722.1 frames in a single RTP
   packet than will fit in the MTU of the RTP transport protocol.

o  All frames contained in a single RTP packet MUST be of the same
   length, that is, they MUST have the same bit rate (octets per
   frame).

o  Frames MUST NOT be split between RTP packets.

It is RECOMMENDED that the number of frames contained within an RTP
packet be consistent with the application.  For example, in a
telephony application where delay is important, the fewer frames per
packet the lower the delay, whereas for a delay insensitive streaming
or messaging application, many frames per packet would be acceptable.

3.2.  Computing the number of G.722.1 frames

Information describing the number of frames contained in an RTP packet
is not transmitted as part of the RTP payload.  The only way to
determine the number of G.722.1 frames is to count the total number of
octets within the RTP payload, and divide the octet count by the
number of expected octets per frame.

4. 
IANA Considerations

This document updates the G7221 media type described in RFC3047.

4.1.  Media Type Registration

This section describes the media types and names associated with this
payload format.  The section registers the media types, as per RFC4288
[RFC4288].

4.1.1.  Registration of MIME media type audio/G7221

MIME media type name: audio

MIME subtype name: G7221

Required parameters:

   bitrate: the data rate for the audio bit stream.  This parameter is
   mandatory because the bit rate is not signaled within the G.722.1
   bit stream.  At the standard G.722.1 bit rates, the value MUST be
   either 24000 or 32000 at 16 kHz sample rate, and 24000, 32000 or
   48000 at 32 kHz sample rate.  If using the non-standard bit rates,
   then it is RECOMMENDED that values in the range 16000 to 48000 be
   used.  Any value MUST be a multiple of 400; this maintains octet
   alignment, so that no (undefined) padding bits are needed in each
   frame.

Optional parameters:

   ptime: RECOMMENDED duration of each packet in milliseconds.

Encoding considerations:

   This media type is framed and binary, see section 4.8 in [RFC4288].

Security considerations: See Section 6.

Interoperability considerations:

   Terminals SHOULD offer a media type at 16 kHz sample rate in order
   to interoperate with terminals that do not support the new 32 kHz
   sample rate.

Published specification: RFC yyy.

Applications which use this media type:

   Audio and Video streaming and conferencing applications.

Additional information: none

Person and email address to contact for further information:

   Roni Even: roni.even@polycom.co.il

Intended usage: COMMON

Restrictions on usage:

   This media type depends on RTP framing, and hence is only defined
   for transfer via RTP [RFC3550].  Transport within other framing
   protocols is not defined at this time.

Author: Roni Even

Change controller:

   IETF Audio/Video Transport working group delegated from the IESG.
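The bitrate rules in the registration above, together with the frame
counting procedure of Section 3.2, can be illustrated with a short
sketch.  This is a minimal, non-normative example; the function names
(octets_per_frame, frames_in_payload, timestamp_step) are hypothetical
and are not defined by this document.  It assumes the receiver already
knows the bit rate and clock rate from out-of-band signaling.

```python
FRAME_MS = 20  # G.722.1 frame duration in milliseconds (Section 1)


def octets_per_frame(bitrate: int) -> int:
    """Return the frame size in octets for a bit rate in bit/s.

    A frame holds bitrate/50 bits; requiring a multiple of 400
    guarantees that this is a whole number of octets.
    """
    if bitrate <= 0 or bitrate % 400 != 0:
        raise ValueError(f"bitrate {bitrate} is not a positive multiple of 400")
    bits_per_frame = bitrate // 50
    return bits_per_frame // 8


def frames_in_payload(payload_len: int, bitrate: int) -> int:
    """Count frames by dividing the payload length by the frame size."""
    size = octets_per_frame(bitrate)
    if payload_len <= 0 or payload_len % size != 0:
        # Frames are never split across packets, so a remainder
        # means the payload does not match the signaled bit rate.
        raise ValueError("payload length is not a whole number of frames")
    return payload_len // size


def timestamp_step(clock_rate: int) -> int:
    """RTP timestamp units covered by one 20 ms frame."""
    if clock_rate not in (16000, 32000):
        raise ValueError("clock rate must be 16000 or 32000")
    return clock_rate * FRAME_MS // 1000


if __name__ == "__main__":
    for rate in (24000, 32000, 48000, 16400):
        print(rate, "bit/s ->", octets_per_frame(rate), "octets per frame")
    print("frames in a 240-octet payload at 24000 bit/s:",
          frames_in_payload(240, 24000))
```

The sketch rejects only values that break octet alignment; the 16000
to 48000 range is a recommendation rather than a requirement, so it is
deliberately not enforced here.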
5.  SDP Parameters

The media type audio/G7221 is mapped to fields in the Session
Description Protocol (SDP) [RFC2327] as follows:

o  The media name in the "m=" line of SDP MUST be audio.

o  The encoding name in the "a=rtpmap" line of SDP MUST be G7221 (the
   MIME subtype).

o  The clock rate in the "a=rtpmap" line MUST be 16000 or 32000.

o  The required parameter "bitrate" MUST be included in the "a=fmtp"
   line of SDP.  One bitrate SHALL be defined for each payload type.

5.1.  Usage with the SDP Offer Answer Model

When offering G.722.1 over RTP using SDP in an Offer/Answer model
[RFC3264], the following considerations are necessary.

There are two clock rates defined for the updated G.722.1.  RFC3047
[RFC3047] supported only the 16 kHz clock rate.  Therefore, a system
that wants to use G.722.1 SHOULD offer a payload type with a clock
rate of 16000.

An example of an offer that includes both a 16000 and a 32000 clock
rate is:

   m=audio 49000 RTP/AVP 121 122
   a=rtpmap:121 G7221/16000
   a=fmtp:121 bitrate=24000
   a=rtpmap:122 G7221/32000
   a=fmtp:122 bitrate=48000

6.  Security Considerations

RTP packets using the payload format defined in this specification
are subject to the security considerations discussed in the RTP
specification [RFC3550], and any appropriate RTP profile.  As this
format transports encoded audio, the main security issues include
confidentiality, integrity protection, and data origin authentication
of the audio itself.  The payload format itself does not have any
built-in security mechanisms.  Any suitable external mechanisms, such
as SRTP [RFC3711], MAY be used.

Because the data compression used with this payload format is applied
end-to-end, encryption will be performed after compression so there is
no conflict between the two operations.
A potential denial-of-service threat exists for data encoding using
compression techniques that have non-uniform receiver-end
computational load.  The attacker can inject pathological datagrams
into the stream which are complex to decode and cause the receiver to
be overloaded.  However, this encoding does not exhibit any
significant non-uniformity and thus is unlikely to pose a
denial-of-service threat due to the receipt of pathological data.

Note that the appropriate mechanism to ensure confidentiality and
integrity of RTP packets and their payloads is very dependent on the
application and on the transport and signaling protocols employed.
Thus, although SRTP is given as an example above, other possible
choices exist.

7.  Changes from RFC 3047

This draft updates RFC3047 by adding support for the wideband audio
mode defined in the new revision of ITU-T G.722.1.

Other changes:

   The text has been updated to be in line with the current rules for
   RFCs.

8.  References

8.1.  Normative References

[ITU.G7221]  International Telecommunications Union, "Low-complexity
             coding at 24 and 32 kbit/s for hands-free operation in
             systems with low frame loss", ITU-T Recommendation
             G.722.1, 2005.

[RFC2119]    Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC2327]    Handley, M. and V. Jacobson, "SDP: Session Description
             Protocol", RFC 2327, April 1998.

[RFC3047]    Luthi, P., "RTP Payload Format for ITU-T Recommendation
             G.722.1", RFC 3047, January 2001.

[RFC3264]    Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
             with Session Description Protocol (SDP)", RFC 3264,
             June 2002.

[RFC3550]    Schulzrinne, H., Casner, S., Frederick, R., and V.
             Jacobson, "RTP: A Transport Protocol for Real-Time
             Applications", STD 64, RFC 3550, July 2003.

8.2. 
Informative References

[RFC3711]    Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
             Norrman, "The Secure Real-time Transport Protocol
             (SRTP)", RFC 3711, March 2004.

[RFC4288]    Freed, N. and J. Klensin, "Media Type Specifications and
             Registration Procedures", BCP 13, RFC 4288,
             December 2005.

Authors' Addresses

   Patrick Luthi
   Tandberg
   Philip Pedersens vei 22
   Lysaker  1366
   Norway

   Email: patrick.luthi@tandberg.no

   Roni Even (editor)
   Polycom
   94 Derech Em Hamoshavot
   Petach Tikva  49130
   Israel

   Email: roni.even@polycom.co.il

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; nor does it represent that it has
made any independent effort to identify any such rights.  Information
on the procedures with respect to rights in RFC documents can be
found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use of
such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository at
http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard.  Please address the information to the IETF at
ietf-ipr@ietf.org.
Disclaimer of Validity

This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

Copyright (C) The Internet Society (2006).  This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.

Acknowledgment

Funding for the RFC Editor function is currently provided by the
Internet Society.