AVT                                                          T. le Grand
Internet-Draft                                       Global IP Solutions
Intended status: Standards Track                                P. Jones
Expires: October 2010                                              Cisco
                                                                P. Huart
                                                                   Cisco
                                                          April 29, 2010


                 RTP Payload Format for the iSAC Codec
                     draft-ietf-avt-rtp-isac-00.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on October 29, 2010.

Abstract

   iSAC is a proprietary wideband speech and audio codec developed by
   Global IP Solutions, suitable for use in Voice over IP applications.
   This document describes the payload format for carrying iSAC-
   generated bit streams within a Real-time Transport Protocol (RTP)
   packet. Also included here are the necessary details for the use of
   iSAC with the Session Description Protocol (SDP).

Conventions used in this document

   In examples, "C:" and "S:" indicate lines sent by the client and
   server respectively.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [1].

Table of Contents

   1. Introduction
   2. iSAC Codec Description
   3. RTP Payload Format
      3.1. iSAC Payload Format
      3.2. Payload Header
      3.3. Encoded Speech Data
      3.4. Multiple iSAC frames in an RTP packet
   4. IANA Considerations
      4.1. Media Type registration of iSAC
   5. Mapping to SDP Parameters
      5.1. Example Initial Target Bit Rate
      5.2. Example Max Bit Rate
   6. Security Considerations
   7. Acknowledgments
   8. References
      8.1. Normative References
      8.2. Informative References
   Author's Addresses

1. Introduction

   This document gives a general description of the iSAC wideband speech
   codec and specifies the iSAC payload format for usage in RTP packets.
   Also included here are the necessary details for the use of iSAC with
   the Session Description Protocol (SDP).

2. iSAC Codec Description

   The iSAC codec is an adaptive wideband speech and audio codec that
   operates with short delay, making it suitable for high quality real
   time communication.
   It is specifically designed to deliver wideband
   speech quality in both low and medium bit rate applications. It also
   handles non-speech audio, such as music and background noise, well
   [5].

   The iSAC codec compresses speech frames of 16 kHz, 16-bit sampled
   input speech, each frame containing 30 or 60 ms of speech.

   The codec runs in one of two modes, called channel-adaptive mode and
   channel-independent mode. In both modes iSAC aims at a target bit
   rate, which is neither the average nor the maximum bit rate that
   iSAC will reach, but corresponds to the average bit rate during
   peaks in speech activity. The bit rate will sometimes exceed the
   target bit rate, but most of the time it will be below it. The bit
   rate obtained is, on average, about a factor of 1.4 lower than the
   target bit rate.

   In channel-adaptive mode the target bit rate is adapted to give a bit
   rate corresponding to the available bandwidth on the channel. The
   available bandwidth is constantly estimated at the receiving iSAC and
   signaled in-band in the iSAC bit stream. Even at dial-up modem data
   rates (including IP, UDP, and RTP overhead) iSAC delivers high
   quality by automatically adjusting transmission rates to give the
   best possible listening experience over the available bandwidth. The
   default initial target bit rate is 20000 bits per second in channel-
   adaptive mode.

   In channel-independent mode a target bit rate has to be provided to
   iSAC prior to encoding.

   After encoding the speech signal, the iSAC coder uses lossless coding
   to further reduce the size of each packet, and hence the total bit
   rate used.

   The adaptation and the lossless coding described above both result in
   a variation of packet size, depending on both the nature of the
   speech and the available bandwidth. As a result, the iSAC codec
   operates at transmission rates from about 10 kbps to about 32 kbps.
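
   As a rough illustration of the relationship between target and
   average bit rate described above, the following sketch divides a
   target bit rate by the approximate factor of 1.4 given in this
   section. The function and constant names are hypothetical and are
   not part of any iSAC API; the actual average depends on the input
   signal and, in channel-adaptive mode, on the estimated bandwidth.

```python
# Illustrative only: these names are assumptions, not an iSAC API.
AVERAGE_FACTOR = 1.4  # "about a factor of 1.4" from the text; approximate


def estimated_average_bitrate(target_bps: int) -> float:
    """Estimate the long-term average bit rate for a target bit rate."""
    if target_bps <= 0:
        raise ValueError("target bit rate must be positive")
    return target_bps / AVERAGE_FACTOR


if __name__ == "__main__":
    # 20000 bit/s is the default initial target in channel-adaptive mode.
    print(round(estimated_average_bitrate(20000)))  # prints 14286
```

   Under this approximation, the default initial target of 20000 bits
   per second corresponds to an average of roughly 14 kbps.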

   The main characteristics can be summarized as follows:

   o  Wideband, 16 kHz, speech and audio codec

   o  Variable bit rate, which depends on the input signal

   o  Adaptive rate with two modes: channel-adaptive or channel-
      independent mode

   o  Bit rate range from around 10 kbps to 32 kbps

   o  Operates on 30 or 60 ms of speech

3. RTP Payload Format

   The iSAC codec uses a sampling rate clock of 16 kHz, so the RTP
   timestamp MUST be in units of 1/16000 of a second.

   The RTP payload for iSAC has the format shown in Figure 1. No
   additional header fields specific to this payload format are
   required. For RTP based transportation of iSAC encoded audio, the
   standard RTP header [2] is followed by one payload data block.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          RTP Header                           |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                                                               |
   +                      iSAC Payload Block                       +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 1: RTP packet format for iSAC

3.1. iSAC Payload Format

   The iSAC payload block consists of a payload header and one or two
   encoded 30 ms speech frames. The iSAC payload is generated in the
   following manner:

   o  Parameters representing one or two 30 ms frames of speech data are
      determined by the encoder. The parameters are quantized to
      generate encoded data corresponding to the one or two speech
      frames. The length of the encoded data is variable and depends on
      the signal characteristics and the target bit rate.

   o  The payload header is generated (described in Section 3.2) and
      added before the encoded parameter data for the speech frame(s).

   o  Lossless coding is applied to the complete iSAC payload block,
      including payload header, to generate a compressed payload.
      The resulting length depends on the amount of data generated to
      represent the speech and on the effectiveness of the lossless
      coding.

   No part of the payload header or the encoded speech data can be
   retrieved without partly or fully decoding the packet.

   The following figure shows an iSAC payload block containing 60 ms of
   encoded speech data:

   +--------+--------+--------+--------+--------+--------+--------+
   |Payload |      30 ms Encoded       |      30 ms Encoded       |
   |Header  |       Speech Data        |       Speech Data        |
   +--------+--------+--------+--------+--------+--------+--------+

                  Figure 2: Payload format for iSAC

3.2. Payload Header

   The payload header holds information for the receiver about the
   available bandwidth (BEI), and the length of the speech data in the
   current payload (FL). The header has the format defined in Figure 3.
   Note that the size of the header can vary due to the lossless
   encoding described in Section 2 and in Section 3.1. Also note that
   the BEI is always estimated and transmitted, even if iSAC runs in
   channel-independent mode.

   +-+-+-+-+-+-+
   | BEI | FL  |
   +-+-+-+-+-+-+

   Figure 3: Payload Header

   o  BEI: Bandwidth Estimation Index. The bandwidth estimate is
      quantized into one of 24 values. Valid values are 0 to 23.

   o  FL: The length of the speech data (Frame Length) present in the
      payload, given in number of speech samples. Valid frame lengths
      are 480 (30 ms) and 960 (60 ms) samples.

3.3. Encoded Speech Data

   The iSAC encoded speech data consist of parameters representing one
   or two frames of 30 ms speech. The length of the speech data is
   signaled in the header (in number of samples), and the length may
   change at any time during a session. In channel-adaptive mode the
   length is changed to best utilize the available bandwidth.
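
   The header fields and their valid ranges can be sketched as follows.
   This is a minimal illustration that assumes the BEI and FL values
   have already been recovered by the iSAC decoder; the lossless
   decoding itself is not shown, and the class and function names are
   hypothetical. Because the RTP clock runs at 16 kHz, the RTP
   timestamp advances by FL (the number of samples) from one packet to
   the next.

```python
# Illustrative sketch; class and function names are assumptions.
from dataclasses import dataclass

SAMPLE_RATE_HZ = 16000            # iSAC sampling clock and RTP clock rate
VALID_FRAME_LENGTHS = (480, 960)  # samples: 30 ms and 60 ms
BEI_VALUES = 24                   # BEI is quantized to 24 values, 0..23


@dataclass(frozen=True)
class PayloadHeader:
    bei: int  # Bandwidth Estimation Index
    fl: int   # Frame Length in samples

    def __post_init__(self) -> None:
        # Reject values outside the ranges given in Section 3.2.
        if not 0 <= self.bei < BEI_VALUES:
            raise ValueError(f"BEI out of range: {self.bei}")
        if self.fl not in VALID_FRAME_LENGTHS:
            raise ValueError(f"invalid frame length: {self.fl}")

    @property
    def duration_ms(self) -> int:
        """Duration of the speech carried in the payload."""
        return self.fl * 1000 // SAMPLE_RATE_HZ


def next_rtp_timestamp(current: int, header: PayloadHeader) -> int:
    """Advance a 32-bit RTP timestamp by the samples in this packet."""
    return (current + header.fl) % (1 << 32)
```

   For example, a header with FL = 960 describes 60 ms of speech, and
   the timestamp of the following packet is 960 units later, modulo
   2^32.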

   The iSAC payload is padded to whole octets, and has a variable length
   depending on the input source signal, number of 30 ms speech frames,
   and target bit rate.

   The number of octets used to describe one frame of 30 ms speech
   typically varies from around 50 to around 120 octets. For the case
   of 60 ms speech (two 30 ms speech frames), the number of octets
   varies from around 100 to around 240 octets. The absolute maximum
   allowed payload length is 400 octets. The user can choose to lower
   the maximum allowed payload length, down to a minimum of 100 octets.
   It is also possible for the user to choose a maximum bit rate instead
   of a maximum payload length. The maximum payload length is then
   dependent on the length of the speech data represented in the payload
   (30 or 60 ms). Possible maximum rates are in the range of 32000 to
   53400 bits per second.

   The sensitivity to bit errors is equal for all bits in the payload.

3.4. Multiple iSAC frames in an RTP packet

   More than one iSAC payload block MUST NOT be included in an RTP
   packet by a sender.

   Further, iSAC payload blocks MUST NOT be split between RTP packets.

4. IANA Considerations

   This document defines the iSAC media type.

4.1. Media Type registration of iSAC

   Media type name: audio

   Media subtype: isac

   Required parameters: None

   Optional parameters:

   o  ibitrate: The parameter indicates the upper bound of the initial
      target bit rate the device would like to receive. For channel-
      adaptive mode, the target bit rate may vary with time; for
      channel-independent mode, the target bit rate will remain at that
      level unless instructed otherwise. An acceptable value for
      ibitrate is in the range of 20000 to 32000 (bits per second).

   o  maxbitrate: The parameter indicates the maximum bit rate the
      endpoint expects to receive. The recipient of this parameter
      SHOULD NOT transmit at a higher bit rate.
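
   A receiver of these parameters might check them along the following
   lines. This is a minimal sketch, not a normative parser: the
   function name is hypothetical, rejecting unknown parameters is a
   design choice made here for brevity, and no range is enforced for
   maxbitrate because this registration does not state one. The check
   that ibitrate does not exceed maxbitrate reflects the requirement
   stated in Section 5.

```python
# Illustrative sketch; function name and error handling are assumptions.
IBITRATE_RANGE = (20000, 32000)  # acceptable ibitrate values, bit/s
KNOWN_PARAMS = ("ibitrate", "maxbitrate")


def parse_isac_params(params: str) -> dict:
    """Parse 'name=value;name=value' iSAC parameters and validate them."""
    result = {}
    for item in filter(None, (p.strip() for p in params.split(";"))):
        name, sep, value = item.partition("=")
        name = name.strip()
        if not sep or name not in KNOWN_PARAMS:
            raise ValueError(f"unexpected parameter: {item!r}")
        result[name] = int(value)  # raises ValueError if not an integer

    ibitrate = result.get("ibitrate")
    maxbitrate = result.get("maxbitrate")
    low, high = IBITRATE_RANGE
    if ibitrate is not None and not low <= ibitrate <= high:
        raise ValueError("ibitrate must be in the range 20000 to 32000")
    if ibitrate is not None and maxbitrate is not None:
        if ibitrate > maxbitrate:
            raise ValueError("ibitrate must not be higher than maxbitrate")
    return result
```

   For instance, the string "ibitrate=20000;maxbitrate=45000" yields
   both values as integers, while "ibitrate=10000" is rejected because
   it lies below the acceptable range.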

   Encoding considerations:

      This media format is framed and binary.

   Security considerations:

      See Section 6.

   Interoperability considerations: None

   Published specification:

   Applications which use this media type:

      This media type is suitable for use in numerous applications
      needing to transport encoded voice or other audio. Some examples
      include Voice over IP, Streaming Media, Voice Messaging, and
      Conferencing.

   Additional information: None

   Intended usage: COMMON

   Other Information/General Comment:

      iSAC is a proprietary speech and audio codec owned by Global IP
      Solutions. The codec operates on 30 or 60 ms speech frames at a
      sampling rate clock of 16 kHz.

   Person to contact for further information:

      Tina le Grand [tina.legrand@gipscorp.com]

   Restrictions on usage:

      This media type depends on RTP framing, and hence is only defined
      for transfer via RTP [2]. Transport within other framing
      protocols is not defined at this time.

   Change controller:

      IETF Audio/Video Transport working group delegated from the IESG.

5. Mapping to SDP Parameters

   The information carried in the media type specification has a
   specific mapping to fields in the Session Description Protocol (SDP)
   [4], which is commonly used to describe RTP sessions. When SDP is
   used to specify sessions employing the iSAC codec, the mapping is as
   follows:

   o  The media type ("audio") goes in SDP "m=" as the media name.

   o  The media subtype (payload format name) goes in SDP "a=rtpmap" as
      the encoding name.

   o  Any remaining parameters go in the SDP "a=fmtp" attribute by
      copying them directly from the media type string as a semicolon
      separated list of parameter=value pairs.

   The optional parameter ibitrate MUST NOT be higher than the parameter
   maxbitrate.

   The iSAC parameters in an SDP offer are completely independent from
   those in the SDP answer.
   For both ibitrate and maxbitrate it is
   legal for the answer to contain a value that is different from the
   one provided in the offer. Either parameter may be present in the
   answer, even if absent in the offer.

   When conveying information by SDP, the encoding name SHALL be "isac"
   (the same as the media subtype).

5.1. Example Initial Target Bit Rate

   The offerer indicates that it wishes to receive a bitstream with an
   initial target rate of 20000 bits per second. The remote party MAY
   change its initial target rate to the requested value.

      m=audio 10000 RTP/AVP 98
      a=rtpmap:98 isac/16000
      a=fmtp:98 ibitrate=20000

5.2. Example Max Bit Rate

   The offerer indicates that it wishes to receive a bitstream with an
   initial target rate of 20000 bits per second, and a maximum bit rate
   of 45000 bits per second. The remote party MAY change its initial
   target rate and SHOULD NOT transmit at a higher rate than 45000 bits
   per second.

      m=audio 10000 RTP/AVP 98
      a=rtpmap:98 isac/16000
      a=fmtp:98 ibitrate=20000;maxbitrate=45000

6. Security Considerations

   RTP packets using the payload format defined in this specification
   are subject to the general security considerations discussed in RFC
   3550 [2].

   As this format transports encoded speech, the main security issues
   include confidentiality and authentication of the speech itself. The
   payload format itself does not have any built-in security mechanisms.
   External mechanisms, such as SRTP [3], MAY be used.

7. Acknowledgments

   This document was prepared using 2-Word-v2.0.template.dot.

8. References

8.1. Normative References

   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

   [2]  Schulzrinne, H., Casner, S., Frederick, R., and Jacobson, V.,
        "RTP: A Transport Protocol for Real-Time Applications", STD 64,
        RFC 3550, July 2003.

   [3]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and Norrman,
        K., "The Secure Real-time Transport Protocol (SRTP)", RFC 3711,
        March 2004.

   [4]  Handley, M., Jacobson, V., and Perkins, C., "SDP: Session
        Description Protocol", RFC 4566, July 2006.

8.2. Informative References

   [5]  iSAC datasheet at Global IP Solutions website,
        http://www.gipscorp.com/files/english/datasheets/iSAC.pdf

Author's Addresses

   Tina le Grand
   Global IP Solutions
   Magnus Ladulasgatan 63B
   SE-118 27 Stockholm
   Sweden
   Email: tina.legrand@gipscorp.com

   Paul E. Jones
   Cisco Systems, Inc.
   7025 Kit Creek Rd.
   Research Triangle Park, NC 27709
   USA
   Tel: +1 919 476 2048
   Email: paulej@packetizer.com

   Pascal Huart
   Cisco Systems
   400, Avenue Roumanille
   Batiment T3
   06410 BIOT - SOPHIA ANTIPOLIS
   FRANCE
   Tel: +33 4 9723 2643
   Email: phuart@cisco.com