idnits 2.17.1

draft-ietf-avt-rtp-isac-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (April 17, 2012) is 4391 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 4566 (ref. '3') (Obsoleted by RFC 8866)

     Summary: 1 error (**), 0 flaws (~~), 1 warning (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                        T. le Grand
Internet-Draft                                                    Google
Intended status: Standards Track                                P. Jones
Expires: October 19, 2012                                       P. Huart
                                                           Cisco Systems
                                                      H. Alvestrand, Ed.
                                                                  Google
                                                          April 17, 2012


                 RTP Payload Format for the iSAC Codec
                       draft-ietf-avt-rtp-isac-01

Abstract

   iSAC is a proprietary wideband speech and audio codec developed by
   Global IP Solutions, suitable for use in Voice over IP applications.
   This document describes the payload format for iSAC generated bit
   streams within a Real-time Transport Protocol (RTP) packet.
   Also included here are the necessary details for the use of iSAC with
   the Session Description Protocol (SDP).

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [1].

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 19, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  iSAC Codec Description
   3.  RTP Payload Format
     3.1.  iSAC Payload Format
     3.2.  Payload Header
     3.3.  Encoded Speech Data
     3.4.  Multiple iSAC frames in an RTP packet
   4.  IANA Considerations
     4.1.  Media Type registration of iSAC
   5.  Mapping to SDP Parameters
     5.1.  Example Initial Target Bit Rate
     5.2.  Example Max Bit Rate
   6.  Security Considerations
   7.  Acknowledgments
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.  Introduction

   This document gives a general description of the iSAC wideband speech
   codec and specifies the iSAC payload format for usage in RTP packets.
   Also included here are the necessary details for the use of iSAC with
   the Session Description Protocol (SDP).

2.  iSAC Codec Description

   The iSAC codec is an adaptive wideband speech and audio codec that
   operates with short delay, making it suitable for high quality real
   time communication.  It is specially designed to deliver wideband
   speech quality in both low and medium bit rate applications.  It also
   handles non-speech audio well, such as music and background noise
   [5].

   The iSAC codec compresses speech frames of 16 kHz, 16-bit sampled
   input speech, each frame containing 30 or 60 ms of speech.

   The codec runs in one of two different modes called channel-adaptive
   mode and channel-independent mode.
   In both modes iSAC is aiming at a target bit rate, which is neither
   the average nor the maximum bit rate that will be reached by iSAC,
   but corresponds to the average bit rate during peaks in speech
   activity.  The bit rate will sometimes exceed the target bit rate,
   but most of the time will be below.  The bit rate obtained is on
   average about a factor of 1.4 lower than the target bit rate.

   In channel-adaptive mode the target bit rate is adapted to give a bit
   rate corresponding to the available bandwidth on the channel.  The
   available bandwidth is constantly estimated at the receiving iSAC and
   signaled in-band in the iSAC bit stream.  Even at dial-up modem data
   rates (including IP, UDP, and RTP overhead) iSAC delivers high
   quality by automatically adjusting transmission rates to give the
   best possible listening experience over the available bandwidth.  The
   default initial target bit rate is 20000 bits per second in channel-
   adaptive mode.

   In channel-independent mode a target bit rate has to be provided to
   iSAC prior to encoding.

   After encoding the speech signal the iSAC coder uses lossless coding
   to further reduce the size of each packet, and hence the total bit
   rate used.

   The adaptation and the lossless coding described above both result in
   a variation of packet size, depending both on the nature of the
   speech and on the available bandwidth.  Therefore the iSAC codec
   operates at transmission rates from about 10 kbps to about 32 kbps.

   The main characteristics can be summarized as follows:

   o  Wideband, 16 kHz, speech and audio codec

   o  Variable bit rate, which depends on the input signal

   o  Adaptive rate with two modes: channel-adaptive or channel-
      independent mode

   o  Bit rate range from around 10 kbps to 32 kbps

   o  Operates on 30 or 60 ms of speech

3.  RTP Payload Format

   The iSAC codec uses a sampling rate clock of 16 kHz, so the RTP
   timestamp MUST be in units of 1/16000 of a second.

   The RTP payload for iSAC has the format shown in Figure 1.  No
   additional header fields specific to this payload format are
   required.  For RTP based transportation of iSAC encoded audio, the
   standard RTP header [2] is followed by one payload data block.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          RTP Header                           |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                                                               |
   +                      iSAC Payload Block                       +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 1: RTP packet format for iSAC

3.1.  iSAC Payload Format

   The iSAC payload block consists of a payload header and one or two
   encoded 30 ms speech frames.  The iSAC payload is generated in the
   following manner:

   o  Parameters representing one or two 30 ms frames of speech data are
      determined by the encoder.  The parameters are quantized to
      generate encoded data corresponding to the one or two speech
      frames.  The length of the encoded data is variable and depends on
      the signal characteristics and the target bit rate.

   o  The payload header is generated (described in Section 3.2) and
      added before the encoded parameter data for the speech frame(s).

   o  Lossless coding is applied to the complete iSAC payload block,
      including payload header, to generate a compressed payload.  The
      length depends on the length of the data generated to represent
      the speech and the effectiveness of the lossless coding.

   No part of the payload header or the encoded speech data can be
   retrieved without partly or fully decoding the packet.
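
   As an informal illustration of the 16 kHz RTP clock required in
   Section 3, the following Python sketch computes how far the RTP
   timestamp advances per packet.  It is not part of the payload format.
   The names timestamp_increment and next_timestamp are hypothetical,
   and the sketch assumes one payload block of 30 or 60 ms per packet
   and the 32-bit timestamp field of the standard RTP header [2].

```python
RTP_CLOCK_HZ = 16000  # iSAC RTP timestamp clock rate, from Section 3


def timestamp_increment(frame_ms: int) -> int:
    """Return the RTP timestamp ticks covered by one iSAC payload block."""
    if frame_ms not in (30, 60):
        raise ValueError("an iSAC payload block carries 30 or 60 ms of speech")
    return RTP_CLOCK_HZ * frame_ms // 1000


def next_timestamp(current: int, frame_ms: int) -> int:
    """Advance a 32-bit RTP timestamp by one payload block, with wraparound."""
    return (current + timestamp_increment(frame_ms)) % (1 << 32)
```

   Under these assumptions a 30 ms block advances the timestamp by 480
   ticks and a 60 ms block by 960 ticks, matching the sample counts the
   frame length field can take.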

   The following figure shows an iSAC payload block containing 60 ms of
   encoded speech data:

   +--------+--------+--------+--------+--------+--------+--------+
   |Payload |      30 ms Encoded       |      30 ms Encoded       |
   |Header  |       Speech Data        |       Speech Data        |
   +--------+--------+--------+--------+--------+--------+--------+

                   Figure 2: Payload format for iSAC

3.2.  Payload Header

   The payload header holds information for the receiver about the
   available bandwidth (BEI), and the length of the speech data in the
   current payload (FL).  The header has the format defined in Figure 3.
   Note that the size of the header can vary due to the lossless
   encoding described in Section 2 and in Section 3.1.  Also note that
   the BEI is always estimated and transmitted, even if iSAC runs in
   channel-independent mode.

   +-+-+-+-+-+-+
   | BEI  | FL |
   +-+-+-+-+-+-+

      Figure 3: Payload Header

   o  BEI: Bandwidth Estimation Index.  The bandwidth estimate is
      quantized into one out of 24 values.  Valid values are 0 to 23.

   o  FL: The length of the speech data (Frame Length) present in the
      payload, given in number of speech samples.  Valid frame lengths
      are 480 (30 ms) and 960 (60 ms) samples.

3.3.  Encoded Speech Data

   The iSAC encoded speech data consist of parameters representing one
   or two frames of 30 ms speech.  The length of the speech data is
   signaled in the header (in number of samples), and the length may
   change at any time during a session.  In channel-adaptive mode the
   length is changed to best utilize the available bandwidth.

   The iSAC payload is padded to whole octets, and has a variable length
   depending on the input source signal, number of 30 ms speech frames,
   and target bit rate.

   The number of octets used to describe one frame of 30 ms speech
   typically varies from around 50 to around 120 octets.
   For the case of 60 ms speech (two 30 ms speech frames), the number of
   octets varies from around 100 to around 240 octets.  The absolute
   maximum allowed payload length is 400 octets.  The user can choose to
   lower the maximum allowed payload length; the minimum value is 100
   octets.  It is possible for the user to choose a maximum bit rate
   instead of a maximum payload length.  The maximum payload length is
   then dependent on the length of the speech data represented in the
   payload (30 or 60 ms).  Possible maximum rates are in the range of
   32000 to 53400 bits per second.

   The sensitivity to bit errors is equal for all bits in the payload.

3.4.  Multiple iSAC frames in an RTP packet

   More than one iSAC payload block MUST NOT be included in an RTP
   packet by a sender.

   Further, iSAC payload blocks MUST NOT be split between RTP packets.

4.  IANA Considerations

   This document defines the iSAC media type.

   Media type name: audio

   Media subtype: isac

   Required parameters: None

   Optional parameters:

      *  ibitrate: The parameter indicates the upper bound of the
         initial target bit rate the device would like to receive.  For
         channel-adaptive mode, the target bit rate may vary with time;
         for channel-independent mode, the target bit rate will remain
         at that level unless instructed otherwise.  An acceptable value
         for ibitrate is in the range of 20000 to 32000 (bits per
         second).

      *  maxbitrate: The parameter indicates the maximum bit rate the
         endpoint expects to receive.  The recipient of this parameter
         SHOULD NOT transmit at a higher bit rate.

   Encoding considerations:
      This media format is framed and binary.

   Security considerations: See Section 6

   Interoperability considerations: None

   Published specification: RFC XXXX

   Applications which use this media type:
      This media type is suitable for use in numerous applications
      needing to transport encoded voice or other audio.  Some examples
      include Voice over IP, Streaming Media, Voice Messaging, and
      Conferencing.

   Additional information: None

   Intended usage: COMMON

   Other Information/General Comment:
      iSAC is a proprietary speech and audio codec owned by Global IP
      Solutions.  The codec operates on 30 or 60 ms speech frames at a
      sampling rate clock of 16 kHz.

   Person to contact for further information:
      Tina le Grand [tlegrand@google.com]

   Restrictions on usage:
      This media type depends on RTP framing, and hence is only defined
      for transfer via RTP [2].  Transport within other framing
      protocols is not defined at this time.

   Change controller:
      IETF Audio/Video Transport working group delegated from the IESG.

   Note to the RFC Editor / IANA: Please replace "RFC XXXX" above with
   the number of this RFC when published, and remove this note.

4.1.  Media Type registration of iSAC

5.  Mapping to SDP Parameters

   The information carried in the media type specification has a
   specific mapping to fields in the Session Description Protocol (SDP)
   [3], which is commonly used to describe RTP sessions.  When SDP is
   used to specify sessions employing the iSAC codec, the mapping is as
   follows:

   o  The media type ("audio") goes in SDP "m=" as the media name.

   o  The media subtype (payload format name) goes in SDP "a=rtpmap" as
      the encoding name.

   o  Any remaining parameters go in the SDP "a=fmtp" attribute by
      copying them directly from the media type string as a semicolon
      separated list of parameter=value pairs.

   The optional parameter ibitrate MUST NOT be higher than the parameter
   maxbitrate.
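
   The mapping rules above can be illustrated with a short Python
   sketch that assembles an "a=fmtp" line from the two optional
   parameters.  This is only a sketch under stated assumptions, not a
   reference implementation: the function name isac_fmtp and its
   arguments are hypothetical, the range check on ibitrate uses the
   20000 to 32000 range given in Section 4, and no range is enforced for
   maxbitrate because the registration does not state one.

```python
from typing import Optional

IBITRATE_RANGE = (20000, 32000)  # acceptable ibitrate values, Section 4


def isac_fmtp(payload_type: int,
              ibitrate: Optional[int] = None,
              maxbitrate: Optional[int] = None) -> Optional[str]:
    """Build an SDP a=fmtp line for iSAC, or None if no parameter is set."""
    if ibitrate is not None:
        low, high = IBITRATE_RANGE
        if not low <= ibitrate <= high:
            raise ValueError("ibitrate must be in the range 20000 to 32000")
    if ibitrate is not None and maxbitrate is not None:
        if ibitrate > maxbitrate:
            raise ValueError("ibitrate MUST NOT be higher than maxbitrate")
    # Parameters are copied as a semicolon separated parameter=value list.
    params = []
    if ibitrate is not None:
        params.append(f"ibitrate={ibitrate}")
    if maxbitrate is not None:
        params.append(f"maxbitrate={maxbitrate}")
    if not params:
        return None
    return f"a=fmtp:{payload_type} " + ";".join(params)
```

   For instance, isac_fmtp(98, 20000, 45000) yields the line
   "a=fmtp:98 ibitrate=20000;maxbitrate=45000", while an ibitrate above
   the given maxbitrate is rejected before any SDP text is produced.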

   The iSAC parameters in an SDP offer are completely independent from
   those in the SDP answer.  For both ibitrate and maxbitrate it is
   legal for the answer to contain a value that is different from what
   is provided in an offer.  The parameter may be present in the answer,
   even if absent in the offer.

   When conveying information by SDP, the encoding name SHALL be "isac"
   (the same as the media subtype).

5.1.  Example Initial Target Bit Rate

   The offerer indicates that it wishes to receive a bitstream with an
   initial target rate of 20000 bits per second.  The remote party MAY
   change its initial target rate to the requested value.

      m=audio 10000 RTP/AVP 98
      a=rtpmap:98 isac/16000
      a=fmtp:98 ibitrate=20000

5.2.  Example Max Bit Rate

   The offerer indicates that it wishes to receive a bitstream with an
   initial target rate of 20000 bits per second, and a maximum bit rate
   of 45000 bits per second.  The remote party MAY change its initial
   target rate and SHOULD NOT transmit at a higher rate than 45000.

      m=audio 10000 RTP/AVP 98
      a=rtpmap:98 isac/16000
      a=fmtp:98 ibitrate=20000;maxbitrate=45000

6.  Security Considerations

   RTP packets using the payload format defined in this specification
   are subject to the general security considerations discussed in RFC
   3550 8.1.

   As this format transports encoded speech, the main security issues
   include confidentiality and authentication of the speech itself.  The
   payload format itself does not have any built-in security mechanisms.
   External mechanisms, such as SRTP [4], MAY be used.

7.  Acknowledgments

   This document was originally prepared using 2-Word-v2.0.template.dot.

   The present version is prepared using xml2rfc and xxe-xml2rfc.

8.  References

8.1.  Normative References

   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

   [2]  Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson,
        "RTP: A Transport Protocol for Real-Time Applications", STD 64,
        RFC 3550, July 2003.

   [3]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
        Description Protocol", RFC 4566, July 2006.

   [4]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
        Norrman, "The Secure Real-time Transport Protocol (SRTP)",
        RFC 3711, March 2004.

8.2.  Informative References

   [5]  "iSAC datasheet at Global IP Solutions website
        http://www.gipscorp.com/files/english/datasheets/iSAC.pdf".

Authors' Addresses

   Tina le Grand
   Google
   Kungsbron 2
   Stockholm,   11122
   Sweden

   Paul E. Jones
   Cisco Systems
   7025 Kit Creek Rd.
   Research Triangle Park, NC  27709
   USA

   Phone: +1 919 476 2048
   Email: paulej@packetizer.com

   Pascal Huart
   Cisco Systems
   400, Avenue Roumanille, Batiment T3
   Biot - Sophia Antipolis,   06410
   France

   Phone: +33 4 9723 2643
   Email: phuart@cisco.com

   Harald Alvestrand (editor)
   Google
   Kungsbron 2
   Stockholm,   11122
   Sweden

   Email: hta@google.com