idnits 2.17.1

draft-legrand-rtp-isac-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** The document seems to lack a License Notice according IETF Trust
     Provisions of 28 Dec 2009, Section 6.b.ii or Provisions of 12 Sep 2009
     Section 6.b -- however, there's a paragraph with a matching beginning.
     Boilerplate error?

     (You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Feb 2009 rather than one of the newer Notices. See
     https://trustee.ietf.org/license-info/.)

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (October 15, 2009) is 5305 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 4566 (ref. '4') (Obsoleted by RFC 8866)

     Summary: 2 errors (**), 0 flaws (~~), 1 warning (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

AVT                                                          T. le Grand
Internet Draft                                       Global IP Solutions
Intended status: Standards Track                                P. Jones
Expires: April 2010                                                Cisco
                                                                P. Huart
                                                                   Cisco
                                                        October 15, 2009

                 RTP Payload Format for the iSAC Codec
                     draft-legrand-rtp-isac-02.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
        http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
        http://www.ietf.org/shadow.html

   This Internet-Draft will expire on April 15, 2010.

Abstract

   iSAC is a proprietary wideband speech and audio codec developed by
   Global IP Solutions, suitable for use in Voice over IP applications.
   This document describes the payload format for iSAC generated bit
   streams within a Real-time Transport Protocol (RTP) packet. Also
   included here are the necessary details for the use of iSAC with the
   Session Description Protocol (SDP).

Conventions used in this document

   In examples, "C:" and "S:" indicate lines sent by the client and
   server respectively.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [1].

Table of Contents

   1. Introduction
   2. iSAC Codec Description
   3. RTP Payload Format
      3.1. iSAC Payload Format
      3.2. Payload Header
      3.3. Encoded Speech Data
      3.4. Multiple iSAC frames in an RTP packet
   4. IANA Considerations
      4.1. Media Type registration of iSAC
   5. Mapping to SDP Parameters
      5.1. Example Initial Target Bit Rate
      5.2. Example Max Bit Rate
   6. Security Considerations
   7. Acknowledgments
   8. References
      8.1. Normative References
      8.2. Informative References
   Author's Addresses

1. Introduction

   This document gives a general description of the iSAC wideband speech
   codec and specifies the iSAC payload format for usage in RTP packets.
   Also included here are the necessary details for the use of iSAC with
   the Session Description Protocol (SDP).

2. iSAC Codec Description

   The iSAC codec is an adaptive wideband speech and audio codec that
   operates with short delay, making it suitable for high-quality
   real-time communication. It is specially designed to deliver wideband
   speech quality in both low and medium bit rate applications. It also
   handles non-speech audio well, such as music and background noise
   [5].

   The iSAC codec compresses speech frames of 16 kHz, 16-bit sampled
   input speech, each frame containing 30 or 60 ms of speech.

   The codec runs in one of two different modes called channel-adaptive
   mode and channel-independent mode. In both modes iSAC aims at a
   target bit rate, which is neither the average nor the maximum bit
   rate that will be reached by iSAC, but corresponds to the average bit
   rate during peaks in speech activity. The bit rate will sometimes
   exceed the target bit rate, but most of the time it will be below it.
   The resulting average bit rate is about a factor of 1.4 lower than
   the target bit rate.

   In channel-adaptive mode the target bit rate is adapted to give a bit
   rate corresponding to the available bandwidth on the channel. The
   available bandwidth is constantly estimated at the receiving iSAC and
   signaled in-band in the iSAC bit stream.
   Even at dial-up modem data rates (including IP, UDP, and RTP
   overhead) iSAC delivers high quality by automatically adjusting
   transmission rates to give the best possible listening experience
   over the available bandwidth. The default initial target bit rate is
   20000 bits per second in channel-adaptive mode.

   In channel-independent mode a target bit rate has to be provided to
   iSAC prior to encoding.

   After encoding the speech signal, the iSAC coder uses lossless coding
   to further reduce the size of each packet, and hence the total bit
   rate used.

   The adaptation and the lossless coding described above both result in
   a variation of packet size, depending on both the nature of the
   speech and the available bandwidth. Therefore the iSAC codec operates
   at transmission rates from about 10 kbps to about 32 kbps.

   The main characteristics can be summarized as follows:

   o Wideband, 16 kHz, speech and audio codec

   o Variable bit rate, which depends on the input signal

   o Adaptive rate with two modes: channel-adaptive or
      channel-independent mode

   o Bit rate range from around 10 kbps to 32 kbps

   o Operates on 30 or 60 ms of speech

3. RTP Payload Format

   The iSAC codec uses a sampling rate clock of 16 kHz, so the RTP
   timestamp MUST be in units of 1/16000 of a second.

   The RTP payload for iSAC has the format shown in Figure 1. No
   additional header fields specific to this payload format are
   required. For RTP-based transport of iSAC encoded audio, the
   standard RTP header [2] is followed by one payload data block.
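
   As an illustration of the clock rule above, the following Python
   sketch derives the RTP timestamp increment for a packet from the
   amount of speech it carries (30 or 60 ms, that is, 480 or 960
   samples at 16 kHz) and wraps the 32-bit RTP timestamp field. The
   function names are hypothetical and are not part of any iSAC API.

```python
ISAC_CLOCK_HZ = 16000  # RTP timestamp units per second for iSAC


def samples_per_packet(frame_ms: int) -> int:
    """Return the RTP timestamp increment for a packet of frame_ms speech."""
    if frame_ms not in (30, 60):
        raise ValueError("an iSAC payload carries 30 or 60 ms of speech")
    return ISAC_CLOCK_HZ * frame_ms // 1000


def next_timestamp(current: int, frame_ms: int) -> int:
    """Advance an RTP timestamp, wrapping modulo 2**32 (32-bit field)."""
    return (current + samples_per_packet(frame_ms)) % (1 << 32)
```

   Because the frame length may change during a session, a sender
   following this sketch would compute the increment per packet rather
   than assume a fixed step.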

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          RTP Header                           |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                                                               |
   +                      iSAC Payload Block                       +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 1: RTP packet format for iSAC

3.1. iSAC Payload Format

   The iSAC payload block consists of a payload header and one or two
   encoded 30 ms speech frames. The iSAC payload is generated in the
   following manner:

   o Parameters representing one or two 30 ms frames of speech data are
      determined by the encoder. The parameters are quantized to
      generate encoded data corresponding to the one or two speech
      frames. The length of the encoded data is variable and depends on
      the signal characteristics and the target bit rate.

   o The payload header is generated (described in Section 3.2) and
      added before the encoded parameter data for the speech frame(s).

   o Lossless coding is applied to the complete iSAC payload block,
      including payload header, to generate a compressed payload. The
      length depends on the length of the data generated to represent
      the speech and the effectiveness of the lossless coding.

   No part of the payload header or the encoded speech data can be
   retrieved without partly or fully decoding the packet.

   The following figure shows an iSAC payload block containing 60 ms of
   encoded speech data:

   +--------+--------+--------+--------+--------+--------+--------+
   |Payload |      30 ms Encoded       |      30 ms Encoded       |
   |Header  |       Speech Data        |       Speech Data        |
   +--------+--------+--------+--------+--------+--------+--------+

                  Figure 2: Payload format for iSAC

3.2. Payload Header

   The payload header holds information for the receiver about the
   available bandwidth (BEI), and the length of the speech data in the
   current payload (FL).
   The header has the format defined in Figure 3. Note that the size of
   the header can vary due to the lossless encoding described in
   Section 2 and in Section 3.1. Also note that the BEI is always
   estimated and transmitted, even if iSAC runs in channel-independent
   mode.

   +-+-+-+-+-+-+
   | BEI | FL  |
   +-+-+-+-+-+-+

   Figure 3: Payload Header

   o BEI: Bandwidth Estimation Index. The bandwidth estimate is
      quantized into one of 24 values. Valid values are 0 to 23.

   o FL: The length of the speech data (Frame Length) present in the
      payload, given in number of speech samples. Valid frame lengths
      are 480 (30 ms) and 960 (60 ms) samples.

3.3. Encoded Speech Data

   The iSAC encoded speech data consist of parameters representing one
   or two frames of 30 ms speech. The length of the speech data is
   signaled in the header (in number of samples), and the length may
   change at any time during a session. In channel-adaptive mode the
   length is changed to best utilize the available bandwidth.

   The iSAC payload is padded to whole octets, and has a variable length
   depending on the input source signal, number of 30 ms speech frames,
   and target bit rate.

   The number of octets used to describe one frame of 30 ms speech
   typically varies from around 50 to around 120 octets. For the case
   of 60 ms speech (two 30 ms speech frames), the number of octets
   varies from around 100 to around 240 octets. The absolute maximum
   allowed payload length is 400 octets. The user can choose to lower
   the maximum allowed payload length; the minimum value is 100 octets.
   Alternatively, the user can choose a maximum bit rate instead of a
   maximum payload length. The maximum payload length is then dependent
   on the length of the speech data represented in the payload (30 or 60
   ms). Possible maximum rates are in the range of 32000 to 53400 bits
   per second.
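
   To illustrate how a maximum bit rate could translate into a maximum
   payload length, the Python sketch below assumes a plain conversion
   (rate times frame duration, divided by 8 bits per octet, rounded
   down) that counts payload octets only. The conversion actually used
   by an iSAC implementation is not specified in this document and may
   differ; the function name is hypothetical.

```python
MAX_PAYLOAD_OCTETS = 400   # absolute maximum payload length (Section 3.3)
MIN_RATE_BPS = 32000       # lowest selectable maximum bit rate
MAX_RATE_BPS = 53400       # highest selectable maximum bit rate


def max_payload_octets(max_rate_bps: int, frame_ms: int) -> int:
    """Estimate the payload length cap implied by a maximum bit rate.

    Assumption (not taken from the codec): the cap is the number of whole
    octets that fit in one packet interval at max_rate_bps, counting
    payload octets only.
    """
    if frame_ms not in (30, 60):
        raise ValueError("an iSAC payload carries 30 or 60 ms of speech")
    if not MIN_RATE_BPS <= max_rate_bps <= MAX_RATE_BPS:
        raise ValueError("maximum bit rate must be 32000..53400 bit/s")
    # bits per second * milliseconds / (1000 ms/s * 8 bits/octet)
    octets = max_rate_bps * frame_ms // 8000
    return min(octets, MAX_PAYLOAD_OCTETS)
```

   Under this assumption, 32000 bits per second corresponds to 120
   octets for 30 ms and 240 octets for 60 ms of speech, and 53400 bits
   per second with 60 ms of speech reaches the 400-octet absolute
   maximum.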

   The sensitivity to bit errors is equal for all bits in the payload.

3.4. Multiple iSAC frames in an RTP packet

   A sender MUST NOT include more than one iSAC payload block in an RTP
   packet.

   Further, iSAC payload blocks MUST NOT be split between RTP packets.

4. IANA Considerations

   This document defines the iSAC media type.

4.1. Media Type registration of iSAC

   Media type name: audio

   Media subtype: isac

   Required parameters: None

   Optional parameters:

   o ibitrate: The parameter indicates the upper bound of the initial
      target bit rate the device would like to receive. For
      channel-adaptive mode, the target bit rate may vary with time; for
      channel-independent mode, the target bit rate will remain at that
      level unless instructed otherwise. An acceptable value for
      ibitrate is in the range of 20000 to 32000 (bits per second).

   o maxbitrate: The parameter indicates the maximum bit rate the
      endpoint expects to receive. The recipient of this parameter
      SHOULD NOT transmit at a higher bit rate.

   Encoding considerations:

      This media format is framed and binary.

   Security considerations:

      See Section 6.

   Interoperability considerations: None

   Published specification:

   Applications which use this media type:

      This media type is suitable for use in numerous applications
      needing to transport encoded voice or other audio. Some examples
      include Voice over IP, Streaming Media, Voice Messaging, and
      Conferencing.

   Additional information: None

   Intended usage: COMMON

   Other Information/General Comment:

      iSAC is a proprietary speech and audio codec owned by Global IP
      Solutions. The codec operates on 30 or 60 ms speech frames at a
      sampling rate clock of 16 kHz.

   Person to contact for further information:

      Tina le Grand [tina.legrand@gipscorp.com]

   Restrictions on usage:

      This media type depends on RTP framing, and hence is only defined
      for transfer via RTP [2]. Transport within other framing
      protocols is not defined at this time.

   Change controller:

      IETF Audio/Video Transport working group delegated from the IESG.

5. Mapping to SDP Parameters

   The information carried in the media type specification has a
   specific mapping to fields in the Session Description Protocol (SDP)
   [4], which is commonly used to describe RTP sessions. When SDP is
   used to specify sessions employing the iSAC codec, the mapping is as
   follows:

   o The media type ("audio") goes in SDP "m=" as the media name.

   o The media subtype (payload format name) goes in SDP "a=rtpmap" as
      the encoding name.

   o Any remaining parameters go in the SDP "a=fmtp" attribute by
      copying them directly from the media type string as a
      semicolon-separated list of parameter=value pairs.

   The optional parameter ibitrate MUST NOT be higher than the parameter
   maxbitrate.

   The iSAC parameters in an SDP offer are completely independent from
   those in the SDP answer. For both ibitrate and maxbitrate it is
   legal for the answer to contain a value that differs from the value
   provided in the offer. Either parameter may be present in the
   answer, even if absent in the offer.

   When conveying information by SDP, the encoding name SHALL be "isac"
   (the same as the media subtype).

5.1. Example Initial Target Bit Rate

   The offerer indicates that it wishes to receive a bitstream with an
   initial target rate of 20000 bits per second. The remote party MAY
   change its initial target rate to the requested value.

      m=audio 10000 RTP/AVP 98
      a=rtpmap:98 isac/16000
      a=fmtp:98 ibitrate=20000

5.2. Example Max Bit Rate

   The offerer indicates that it wishes to receive a bitstream with an
   initial target rate of 20000 bits per second, and a maximum bit rate
   of 45000 bits per second. The remote party MAY change its initial
   target rate and SHOULD NOT transmit at a rate higher than 45000 bits
   per second.

      m=audio 10000 RTP/AVP 98
      a=rtpmap:98 isac/16000
      a=fmtp:98 ibitrate=20000;maxbitrate=45000

6. Security Considerations

   RTP packets using the payload format defined in this specification
   are subject to the general security considerations discussed in RFC
   3550 [2].

   As this format transports encoded speech, the main security issues
   include confidentiality and authentication of the speech itself. The
   payload format itself does not have any built-in security mechanisms.
   External mechanisms, such as SRTP [3], MAY be used.

7. Acknowledgments

   This document was prepared using 2-Word-v2.0.template.dot.

8. References

8.1. Normative References

   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
        Levels", BCP 14, RFC 2119, March 1997.

   [2]  Schulzrinne, H., Casner, S., Frederick, R., and Jacobson, V.,
        "RTP: A Transport Protocol for Real-Time Applications", STD 64,
        RFC 3550, July 2003.

   [3]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and Norrman,
        K., "The Secure Real-time Transport Protocol (SRTP)", RFC 3711,
        March 2004.

   [4]  Handley, M., Jacobson, V., and Perkins, C., "SDP: Session
        Description Protocol", RFC 4566, July 2006.

8.2. Informative References

   [5]  iSAC datasheet at Global IP Solutions website,
        http://www.gipscorp.com/files/english/datasheets/iSAC.pdf

Author's Addresses

   Tina le Grand
   Global IP Solutions
   Magnus Ladulasgatan 63B
   SE-118 27 Stockholm
   Sweden
   Email: tina.legrand@gipscorp.com

   Paul E. Jones
   Cisco Systems, Inc.
   7025 Kit Creek Rd.
   Research Triangle Park, NC 27709
   USA
   Tel: +1 919 476 2048
   Email: paulej@packetizer.com

   Pascal Huart
   Cisco Systems
   400, Avenue Roumanille
   Batiment T3
   06410 BIOT - SOPHIA ANTIPOLIS
   FRANCE
   Tel: +33 4 9723 2643
   Email: phuart@cisco.com