idnits 2.17.1

draft-ietf-avt-rtp-isac-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 298 has weird spacing: '... Header spee...'

  -- The document date (February 8, 2013) is 4066 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866)

     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2  Network Working Group                                        T. le Grand
3  Internet-Draft                                                    Google
4  Intended status: Standards Track                                P. Jones
5  Expires: August 12, 2013                                        P. Huart
6                                                             Cisco Systems
7                                                             T. Shabestary
8                                                        H. Alvestrand, Ed.
9                                                                    Google
10                                                         February 8, 2013

12                  RTP Payload Format for the iSAC Codec
13                       draft-ietf-avt-rtp-isac-04

15  Abstract

17     iSAC is a proprietary wideband speech and audio codec developed by
18     Global IP Solutions (now part of Google), suitable for use in Voice
19     over IP applications.
       This document describes the payload format for
20     iSAC generated bit streams within a Real-time Transport Protocol (RTP)
21     packet.  Also included here are the necessary details for the use of
22     iSAC with the Session Description Protocol (SDP).

24  Requirements Language

26     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
27     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
28     document are to be interpreted as described in RFC 2119 [RFC2119].

30  Status of this Memo

32     This Internet-Draft is submitted in full conformance with the
33     provisions of BCP 78 and BCP 79.

35     Internet-Drafts are working documents of the Internet Engineering
36     Task Force (IETF).  Note that other groups may also distribute
37     working documents as Internet-Drafts.  The list of current Internet-
38     Drafts is at http://datatracker.ietf.org/drafts/current/.

40     Internet-Drafts are draft documents valid for a maximum of six months
41     and may be updated, replaced, or obsoleted by other documents at any
42     time.  It is inappropriate to use Internet-Drafts as reference
43     material or to cite them other than as "work in progress."

45     This Internet-Draft will expire on August 12, 2013.

47  Copyright Notice

48     Copyright (c) 2013 IETF Trust and the persons identified as the
49     document authors.  All rights reserved.

51     This document is subject to BCP 78 and the IETF Trust's Legal
52     Provisions Relating to IETF Documents
53     (http://trustee.ietf.org/license-info) in effect on the date of
54     publication of this document.  Please review these documents
55     carefully, as they describe your rights and restrictions with respect
56     to this document.  Code Components extracted from this document must
57     include Simplified BSD License text as described in Section 4.e of
58     the Trust Legal Provisions and are provided without warranty as
59     described in the Simplified BSD License.

61  Table of Contents

63     1.  Introduction
64     2.
       iSAC Codec Description
65     3.  RTP Payload Format
66       3.1.  Payload Header
67       3.2.  iSAC Wideband Payload Format
68         3.2.1.  Encoded Speech Data
69       3.3.  iSAC Superwideband Payload Format
70         3.3.1.  Encoded Upper-band Speech Data
71       3.4.  Padding
72       3.5.  Multiple iSAC frames in an RTP packet
73     4.  Congestion Control
74     5.  IANA Considerations
75     6.  Mapping to SDP Parameters
76       6.1.  Example Initial Target Bit Rate
77       6.2.  Example Max Bit Rate
78       6.3.  Example with both WB and SWB offered
79     7.  Security Considerations
80     8.  Acknowledgments
81     9.  References
82       9.1.  Normative References
83       9.2.  Informative References
84     Authors' Addresses

86  1.  Introduction

88     This document gives a general description of the iSAC wideband speech
89     codec and specifies the iSAC payload format for usage in RTP packets.
90     Also included here are the necessary details for the use of iSAC with
91     the Session Description Protocol (SDP).

93  2.  iSAC Codec Description

95     The iSAC codec is an adaptive wideband/superwideband speech and audio
96     codec that operates with short delay, making it suitable for high
97     quality real time communication.
       It is specially designed to deliver
98     wideband speech quality in both low and medium bit rate applications.
99     It also handles non-speech audio well, such as music and background
100     noise.  A freely available reference implementation exists [iSAC].

102     The iSAC codec compresses speech frames of 16 kHz, 16-bit sampled
103     input speech, each frame containing 30 or 60 ms of speech.  It also
104     has a superwideband mode which allows a 32 kHz sampling rate.  In
105     super-wideband mode the input signal is split into wideband (0-8 kHz)
106     and upper (8-16 kHz) signals.  Each sub-band is encoded independently,
107     and their associated payloads are concatenated, cf. Figure 4, to
108     construct the overall iSAC super-wideband RTP payload.  Note that the
109     same encoder/decoder is used for the wideband part for both wideband
110     and super-wideband modes.

112     The codec runs in one of two different modes called channel-adaptive
113     mode and channel-independent mode.  In both modes iSAC is aiming at a
114     target bit rate, which is neither the average nor the maximum bit
115     rate that will be reached by iSAC, but corresponds to the average bit
116     rate during peaks in speech activity.  The bit rate will sometimes
117     exceed the target bit rate, but most of the time will be below.  The
118     average bit rate obtained is about a factor of 1.2
119     lower than the target bit rate on continuous speech, and will be
120     lower still on speech with pauses.

122     In channel-adaptive mode the target bit rate is adapted to give a bit
123     rate corresponding to the available bandwidth on the channel.  Even
124     at dial-up modem data rates (including IP, UDP, and RTP overhead)
125     iSAC delivers high quality by automatically adjusting transmission
126     rates to give the best possible listening experience over the
127     available bandwidth.

129     In channel-independent mode a target bit rate has to be provided to
130     iSAC prior to encoding; the target bit rate can be changed over the
131     time of the call.
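
The relationship between target and average bit rate described above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming the factor of 1.2 quoted above holds exactly on continuous speech; the function names are hypothetical and are not part of the iSAC reference implementation.

```python
def expected_average_bitrate(target_bps: float, factor: float = 1.2) -> float:
    """Rough average bit rate on continuous speech for a given target rate.

    Assumption: the average is exactly `factor` times lower than the target.
    """
    if target_bps <= 0:
        raise ValueError("target bit rate must be positive")
    return target_bps / factor


def payload_octets_per_frame(bitrate_bps: float, frame_ms: int = 30) -> float:
    """Payload octets needed to carry `frame_ms` of audio at `bitrate_bps`."""
    return bitrate_bps * frame_ms / 1000 / 8


if __name__ == "__main__":
    # A 32 kbps target corresponds to roughly 26.7 kbps on continuous speech.
    print(round(expected_average_bitrate(32000)))
    # Octets per 30 ms frame when the instantaneous rate is 32 kbps.
    print(payload_octets_per_frame(32000, 30))
```

The second helper only converts a rate into a per-frame size; actual packet sizes vary from frame to frame because iSAC is a variable bit rate codec.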
133     After encoding the speech signal the iSAC coder uses lossless coding
134     to further reduce the size of each packet, and hence the total bit
135     rate used.

137     The adaptation and the lossless coding described above both result in
138     a variation of packet size, depending on both the nature of the speech
139     and the available bandwidth.  Therefore, the iSAC codec, in wideband
140     mode, operates at transmission rates from about 10 kbps to about 32
141     kbps.  In super-wideband mode, the transmission rate is in the range
142     of 10 kbps to 56 kbps.  If operating in super-wideband mode, the iSAC
143     codec automatically adjusts the effective encoded audio bandwidth for
144     the best experience.

146     Bit Rate  | 10 - 32   |  32 - 38   | 38 - 45 |  45 - 50   | 50 - 56
147     [kbps]    |           |            |         |            |
148     ----------+-----------+------------+---------+------------+---------
149     Effective |           |   0 - 8    | 0 - 12  |   0 - 12   | 0 - 16
150     Bandwidth | 0 - 8 kHz | operating  |         | operating  |
151     [kHz]     |           | at 32 kbps |         | at 45 kbps |

153     The main characteristics can be summarized as follows:

155     o  Wideband or superwideband, 16 kHz or 32 kHz respectively, speech
156        and audio codec

158     o  Variable bit rate, which depends on the input signal

160     o  Adaptive rate with two modes: channel-adaptive or channel-
161        independent mode

163     o  Bit rate range from around 10 kbps to 32 kbps when operating on
164        wideband input.  For input audio sampled at 32 kHz, the bit rate
165        range is 10 kbps to 56 kbps.

167     o  Operates on 30 or 60 ms of speech for wideband inputs, and only 30
168        ms for super-wideband inputs.

170     o  In super-wideband mode, depending on the target bit rate, the
171        effective bandwidth is adjusted for the optimal experience.

173  3.  RTP Payload Format

175     The iSAC codec in wideband mode uses a sampling rate clock of 16 kHz,
176     so the RTP timestamp MUST be in units of 1/16000 of a second.
       In
177     super-wideband mode, the iSAC codec uses a sampling rate clock of 32
178     kHz, so the RTP timestamp MUST be in units of 1/32000 of a second.

180     The RTP payload for iSAC has the format shown in Figure 1.  No
181     additional header fields specific to this payload format are
182     required.  For RTP based transportation of iSAC encoded audio, the
183     standard RTP header [RFC3550] is followed by one payload data block.

185     The assignment of an RTP payload type for the format defined in this
186     memo is outside the scope of this document.  The RTP profiles in use
187     currently mandate binding the payload type dynamically for this
188     payload format.

190      0                   1                   2                   3
191      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
192     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
193     |                          RTP Header                           |
194     +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
195     |                                                               |
196     +                      iSAC Payload Block                       +
197     |                                                               |
198     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
199                   Figure 1: RTP packet format for iSAC

201  3.1.  Payload Header

203     The payload header holds information for the receiver about the
204     available bandwidth, in the form of a Bandwidth Estimation Index
205     (BEI), and the length of the speech data in the current payload
206     (frame length, FL).  The header has the format defined in Figure 3.
207     Note that the size of the header can vary due to the lossless
208     encoding described in Section 2 and in Section 3.2.  Also note that
209     the BEI is always estimated and transmitted, even if iSAC runs in
210     channel-independent mode.

212     +-+-+-+-+-+-+
213     | BEI | FL  |
214     +-+-+-+-+-+-+
215     Figure 3: Payload Header

217     o  BEI: Bandwidth Estimation Index.  The sender's estimate of the
218        bandwidth available for the stream originated at the receiver.  It
219        is quantized into one of 24 values.  Valid values are 0 to 23;
220        consult the source code for details.
222     o  FL: The length of the speech data (Frame Length) present in the
223        payload, given in number of speech samples.  Valid frame lengths
224        are 480 (30 ms) and 960 (60 ms) samples.

226     The BEI and FL are encoded together with the data using a lossless
227     compressed encoding, which results in a variable number of bits used
228     to represent the fields.

230  3.2.  iSAC Wideband Payload Format

232     The iSAC payload block consists of a payload header and one or two
233     encoded 30 ms speech frames.  The iSAC payload is generated in the
234     following manner:

236     o  Parameters representing one or two 30 ms frames of speech data are
237        determined by the encoder.  The parameters are quantized to
238        generate encoded data corresponding to the one or two speech
239        frames.  The length of the encoded data is variable and depends on
240        the signal characteristics and the target bit rate.

242     o  The payload header is generated (described in Section 3.1) and
243        added before the encoded parameter data for the speech frame(s).

245     o  Lossless coding is applied to the complete iSAC payload block,
246        including payload header, to generate a compressed payload.  The
247        length depends on the length of the data generated to represent
248        the speech and the effectiveness of the lossless coding.

250     No part of the payload header or the encoded speech data can be
251     retrieved without partly or fully decoding the packet.

253     The following figure shows an iSAC payload block containing 60 ms of
254     encoded speech data.

256     +--------+--------+--------+--------+--------+--------+------+
257     |Payload |      30 ms Encoded       |     30 ms Encoded      |
258     |Header  |       Speech Data        |      Speech Data       |
259     +--------+--------+--------+--------+--------+--------+------+
260                    Figure 2: Payload format for iSAC

262  3.2.1.  Encoded Speech Data

264     The iSAC encoded speech data consist of parameters representing one
265     or two frames of 30 ms speech.
       The length of the speech data is
266     signaled in the header (in number of samples), and the length may
267     change at any time during a session.  In channel-adaptive mode the
268     length is changed to best utilize the available bandwidth, and extra
269     padding is added to some packets as a bandwidth probe.

271     The iSAC payload is padded to whole octets, and has a variable length
272     depending on the input source signal, number of 30 ms speech frames,
273     and target bit rate.

275     The number of octets used to describe one frame of 30 ms speech
276     typically varies from around 50 to around 120 octets.  For the case
277     of 60 ms speech (two 30 ms speech frames), the number of octets
278     varies from around 100 to around 240 octets.  The absolute maximum
279     allowed payload length is 400 octets.  The sender can choose to limit
280     the packet size further when transmitting.  The minimum useful limit
281     for the payload length is 100 octets.

283     The sensitivity to bit errors is equal for all bits in the payload.

285  3.3.  iSAC Superwideband Payload Format

287     In super-wideband mode, payloads associated with each sub-band
288     (wideband 0-8 kHz and upper-band 8-16 kHz) are constructed
289     independently and concatenated as depicted in Figure 4.  Note that in
290     super-wideband mode only one 30 ms frame is encoded in each payload.

292     The receiver will know from negotiation whether wideband or super-
293     wideband is sent; it can also confirm this for each packet by
294     checking the CRC checksum.
296     +--------------------------------+---+------------------------+-----+
297     | Payload +30 ms Encoded wideband|LEN|30 ms Encoded upper-band| CRC |
298     | Header        speech data      |   |speech data             |check|
299     +--------------------------------+---+------------------------+-----+
300                                          |<--- CRC checked data ->|

302                  Figure 4: Super-Wideband payload format

304     Payloads of wideband and upper-band are encoded independently,
305     allowing the encoder to simply concatenate two payloads to construct
306     one iSAC super-wideband payload.  The RTP payload of the iSAC super-
307     wideband codec starts with the payload of the wideband part, which is
308     padded to whole octets, followed by one byte (LEN in Figure 4)
309     representing the length of the remaining sequence: the LEN byte
310     itself, the upper-band payload, and 4 bytes for the CRC sequence.

312     If LEN_UB denotes the length of the upper-band payload, then LEN = 1
313     + LEN_UB + 4.  If this value would exceed 255 at encoding, the upper-
314     band payload is omitted.

316     The CRC check is added to distinguish between upper-band payload and
317     random bit-stream padding that can be added for probing available
318     network bandwidth.

320     At the receive side, a super-wideband payload is first given to the
321     wideband decoder.  The wideband decoder decodes as many parameters as
322     required to uniquely reproduce the encoded wideband audio.  The next
323     byte in the payload should hold the value of LEN.  This provides a
324     sanity check that the decoding process has not failed.  Thereafter,
325     the receiver runs a CRC check over the upper-band payload and
326     compares the results with the last 4 bytes in the packet.

328     If the computed CRC and the last four bytes of the payload don't
329     match, the remaining bits are assumed to be added for probing the
330     network.  Hence, the upper-band signal is replaced by zeros and
331     combined with the wideband signal to generate the super-wideband
332     signal.
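
The receive-side checks described in this section (the LEN sanity check followed by the CRC comparison) can be sketched as follows. This is an illustrative sketch only, under several assumptions that the text does not settle: the number of octets consumed by the wideband decoder (wb_len) is taken as an input, zlib's CRC-32 stands in for the codec's CRC, the CRC is assumed to be stored most significant byte first, and a failed LEN check is reported as an error. The function name split_swb_payload is hypothetical.

```python
import struct
import zlib

CRC_LEN = 4  # the payload ends with a 4-byte CRC


def split_swb_payload(payload: bytes, wb_len: int):
    """Split a super-wideband payload into (wideband, upper_band).

    upper_band is None when the upper band must be replaced by zeros,
    either because the wideband decoder used the whole payload or because
    the CRC does not match (the tail is then treated as probing bits).
    """
    wideband, rest = payload[:wb_len], payload[wb_len:]
    if not rest:
        # The wideband decoder consumed all octets: no upper band present.
        return wideband, None
    length = rest[0]
    # LEN counts itself, the upper-band octets and the CRC.
    if length != len(rest) or length < 1 + CRC_LEN:
        raise ValueError("LEN sanity check failed")
    checked = rest[1:-CRC_LEN]
    (received_crc,) = struct.unpack(">I", rest[-CRC_LEN:])  # assumed byte order
    if zlib.crc32(checked) != received_crc:  # assumed CRC algorithm
        return wideband, None
    return wideband, checked


if __name__ == "__main__":
    wb, ub = b"\x01\x02\x03", b"\xaa\xbb"
    tail = bytes([1 + len(ub) + CRC_LEN]) + ub + struct.pack(">I", zlib.crc32(ub))
    print(split_swb_payload(wb + tail, len(wb)))
```

A real decoder learns wb_len only by running the wideband decoder, since no field can be read without decoding; the sketch separates that step out so the framing logic is visible on its own.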
334     If the two CRCs match, then the upper-band payload is given to the
335     upper-band decoder.  The output of the upper-band decoder is then
336     combined with the wideband decoded audio to generate the super-
337     wideband signal.

339     It might be that for a given packet, the wideband decoder uses all
340     the given payload.  This can be the case when a super-wideband
341     encoder is operating at low rates and has adjusted the effective
342     bandwidth to wideband.  In this case, the decoder inserts zeros as
343     the reconstructed upper-band and combines both bands to reproduce the
344     super-wideband signal.

346  3.3.1.  Encoded Upper-band Speech Data

348     The iSAC encoded upper-band speech data consists of parameters
349     representing one frame of 30 ms speech.  Depending on the target rate
350     the upper-band encoder might choose to only encode the sub-band of 8
351     kHz to 12 kHz.

353  3.4.  Padding

355     Padding, which consists of randomly generated bits, may be added at
356     the end of the payload in both wideband and superwideband modes.  It
357     can be used by the sender for bandwidth probing, and is always
358     ignored by the receiver.

360     In wideband mode, padding simply follows the payload, preceded by a
361     length field.

363     +----------+---+--------+
364     | Wideband |LEN|Padding |
365     | payload  |   |        |
366     +----------+---+--------+

368     Figure 5: Wideband payload format with padding.

370     LEN is the length of the padding in bytes + 1: LEN = LEN_PAD + 1

371     In superwideband mode, the format of a packet with padding looks like
372     the following.

374     +----------+---+-------------+--+--------+-----+
375     | Wideband |LEN|Upper-band   |L2|Padding |CRC  |
376     | payload  |   |speech data  |  |        |check|
377     +----------+---+-------------+--+--------+-----+
378                    |<-- CRC checked data --->|

380     Figure 6: Super-Wideband payload format with padding

382     LEN is 1 + LEN_UB + 1 + LEN_PAD + 4, where LEN_UB is the length of
383     the upper-band speech data in bytes, and LEN_PAD is the length of the
384     padding in bytes.
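
The LEN arithmetic for the layouts in Figures 4, 5 and 6 can be checked with a small calculation. The sketch below is illustrative only: the function names are hypothetical, the Figure 4 layout is assumed whenever no padding is added, and the rule that a LEN above 255 cannot be encoded is applied to both super-wideband layouts, although the text states it only for the unpadded case.

```python
from typing import Optional

CRC_LEN = 4    # bytes of CRC at the end of a super-wideband payload
MAX_LEN = 255  # LEN is carried in a single byte


def wideband_len_field(len_pad: int) -> int:
    """LEN for Figure 5: the padding length plus the LEN byte itself."""
    return len_pad + 1


def swb_len_field(len_ub: int, len_pad: int = 0) -> Optional[int]:
    """LEN for Figure 4 (no padding) or Figure 6 (with padding).

    Returns None when the value does not fit in one byte, in which case
    the sender has to fall back, for example by omitting the upper band.
    """
    if len_pad > 0:
        # LEN byte + upper band + one-byte L2 field + padding + CRC
        length = 1 + len_ub + 1 + len_pad + CRC_LEN
    else:
        # LEN byte + upper band + CRC
        length = 1 + len_ub + CRC_LEN
    return length if length <= MAX_LEN else None


if __name__ == "__main__":
    print(wideband_len_field(10))  # 10 bytes of padding in wideband mode
    print(swb_len_field(60))       # 60-byte upper band, no padding
    print(swb_len_field(60, 10))   # 60-byte upper band, 10 bytes of padding
    print(swb_len_field(251))      # too long for a one-byte LEN
```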
386     L2 is LEN_PAD + 1.

388     The CRC check runs over the upper-band speech data, L2 and the
389     padding.

391  3.5.  Multiple iSAC frames in an RTP packet

393     More than one iSAC payload block MUST NOT be included in an RTP
394     packet by a sender.

396     Further, iSAC payload blocks MUST NOT be split between RTP packets.

398  4.  Congestion Control

400     When iSAC is used in an environment where congestion control is
401     useful, there are three properties of importance:

403     o  The iSAC format has the ability to pad packets.  This allows a
404        sender to probe a channel with more bits per second than is
405        strictly needed for the transmission of current data, so that it
406        can check for the possibility of sending bigger packets without
407        incurring increased packet loss.

409     o  The iSAC encoder (in channel-adaptive mode) can continuously tune
410        its encoding parameters so as to adapt the encoding to the
411        available bandwidth, without introducing switching artifacts into
412        the audio stream.

414     o  In the case where two parties have one audio channel in each
415        direction, they can use the BEI field of the A->B audio flow as a
416        feedback channel for the B->A audio flow.

418     Coupled with a feedback channel (which may be of any type), the
419     sender can send some packets of larger size than necessary; the
420     recipient can then figure out if this increased size led to increased
421     packet loss or delay, and can send back information about this to the
422     sender.

424     The sender can then change its encoding parameters to produce smaller
425     or larger packets; when in wideband mode, it can also switch between
426     30-ms and 60-ms mode.

428     In the particular case of one audio channel in each direction, both
429     using iSAC, iSAC defines the BEI field as a feedback channel.  The
430     available bandwidth is continuously estimated at the receiving iSAC;
431     the receiver will signal the sender in-band in the iSAC bit stream,
432     using the BEI field, what its estimate is.
       If the sending iSAC is
433     running in channel-adaptive mode, it will adjust its bitrate
434     accordingly.

436     This specification does not specify any particular feedback mechanism
437     for any other use case.

439     Note: This mechanism is only capable of reducing iSAC traffic to the
440     lowest available setting for iSAC.  If there is congestion that makes
441     even less bandwidth available, other mechanisms, such as dropping the
442     call, will have to be used to escape from the congestion situation.

444  5.  IANA Considerations

446     This RTP payload format is identified using the media type
447     audio/isac, which is registered in accordance with [RFC4855] and uses
448     the template of [RFC6838].

450     Type name: audio

452     Subtype name: isac

454     Required parameters: None

456     Optional parameters:

458     *  ibitrate: The parameter indicates the upper bound in bits per
459        second of the initial target bit rate (counting only payload
460        bits) the device would like to receive.  A sender SHOULD set its
461        initial target bitrate to a value less than or equal to this
462        parameter.  An acceptable value for ibitrate is in the range of
463        20000 to 32000 (bits per second).  In the absence of the
464        parameter, the sender can choose any value up to the maximum
465        bitrate possible.

467     *  maxbitrate: The parameter indicates the maximum bit rate the
468        endpoint expects to receive.  The recipient of this parameter
469        SHOULD NOT transmit at a higher bit rate.  The default maximum
470        value is 53400 bits per second, which is the maximum bitrate
471        possible for iSAC.

473     Encoding considerations:
474        This media format is framed and binary.

476     Security considerations: See Section 7

478     Interoperability considerations: None

480     Published specification: RFC XXXX

482     Applications which use this media type:
483        This media type is suitable for use in numerous applications
484        needing to transport encoded voice or other audio.
          Some examples
485        include Voice over IP, Streaming Media, Voice Messaging, and
486        Conferencing.

488     Fragment identifier considerations:  The meaning of fragment
489        identifiers is not defined by this specification.

491     Additional information: None

493     Person to contact for further information:
494        Tina le Grand [tlegrand@google.com]

496     Intended usage: COMMON

498     Other Information/General Comment:
499        iSAC is a speech and audio codec owned by Google.  The codec
500        operates on 30 or 60 ms speech frames at a sampling rate clock of
501        16 kHz or 32 kHz.

503     Restrictions on usage:
504        This media type depends on RTP framing, and hence is only defined
505        for transfer via RTP [RFC3550].  Transport within other framing
506        protocols is not defined at this time.

508     Author: Tina le Grand and the listed authors of RFC XXXX

510     Change controller: The IETF Payload working group delegated from the
511        IESG.

513     Provisional registration?  No

515     Note to the RFC Editor / IANA: Please replace "RFC XXXX" above with
516     the number of this RFC when published, and remove this note.

518  6.  Mapping to SDP Parameters

520     The information carried in the media type specification has a
521     specific mapping to fields in the Session Description Protocol (SDP)
522     [RFC4566], which is commonly used to describe RTP sessions.  When SDP
523     is used to specify sessions employing the iSAC codec, the mapping is
524     as follows:

526     o  The media type ("audio") goes in SDP "m=" as the media name.

528     o  The media subtype (payload format name) goes in SDP "a=rtpmap" as
529        the encoding name.

531     o  The clock rate is 16000 for wideband, and 32000 for superwideband.

533     o  Any remaining parameters go in the SDP "a=fmtp" attribute by
534        copying them directly from the media type string as a semicolon
535        separated list of parameter=value pairs.

537     The optional parameter ibitrate MUST NOT be higher than the parameter
538     maxbitrate.
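
The constraints on ibitrate and maxbitrate stated above can be checked mechanically when an "a=fmtp" line is received. The following is a minimal sketch, not a complete SDP parser: the function name parse_isac_fmtp is hypothetical, the range and default are the ones given in the media type registration in Section 5, and rejecting out-of-range values with an exception is an assumption about how an implementation might react.

```python
IBITRATE_RANGE = (20000, 32000)  # acceptable ibitrate values (Section 5)
DEFAULT_MAXBITRATE = 53400       # default maxbitrate (Section 5)
INT_PARAMS = ("ibitrate", "maxbitrate")


def parse_isac_fmtp(line: str) -> dict:
    """Parse 'a=fmtp:<pt> name=value;name=value' and validate iSAC parameters."""
    head, _, params = line.strip().partition(" ")
    if not head.startswith("a=fmtp:"):
        raise ValueError("not an a=fmtp line")
    parsed = {"payload_type": int(head[len("a=fmtp:"):])}
    for item in params.split(";"):
        name, sep, value = item.strip().partition("=")
        if not sep:
            continue  # skip empty or malformed items
        parsed[name] = int(value) if name in INT_PARAMS else value
    ibitrate = parsed.get("ibitrate")
    maxbitrate = parsed.get("maxbitrate", DEFAULT_MAXBITRATE)
    if ibitrate is not None:
        low, high = IBITRATE_RANGE
        if not low <= ibitrate <= high:
            raise ValueError("ibitrate outside 20000..32000")
        if ibitrate > maxbitrate:
            raise ValueError("ibitrate MUST NOT be higher than maxbitrate")
    return parsed


if __name__ == "__main__":
    print(parse_isac_fmtp("a=fmtp:98 ibitrate=20000;maxbitrate=45000"))
```

When maxbitrate is absent, the sketch compares ibitrate against the registered default rather than skipping the check.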
540     The iSAC parameters in an SDP offer are completely independent from
541     those in the SDP answer.  For both ibitrate and maxbitrate it is
542     legal for the answer to contain a value that is different from what
543     is provided in an offer.  The parameter may be present in the answer,
544     even if absent in the offer.

546     When conveying information by SDP, the encoding name SHALL be "isac"
547     (the same as the media subtype).

549  6.1.  Example Initial Target Bit Rate

551     The offer indicates that it wishes to receive a wideband bitstream
552     with an initial target rate of 20000 bits per second.  The remote
553     party should change its initial target rate to the requested value or
554     less.

556        m=audio 10000 RTP/AVP 98
557        a=rtpmap:98 isac/16000
558        a=fmtp:98 ibitrate=20000

560  6.2.  Example Max Bit Rate

562     The offer indicates that it wishes to receive a superwideband
563     bitstream with an initial target rate of 20000 bits per second, and a
564     maximum bit rate of 45000 bits per second.  The remote party should
565     change its initial target rate to 20000 bits per second or less, and
566     should not transmit at a higher rate than 45000.

568        m=audio 10000 RTP/AVP 98
569        a=rtpmap:98 isac/32000
570        a=fmtp:98 ibitrate=20000;maxbitrate=45000

572  6.3.  Example with both WB and SWB offered

574     This offer indicates willingness to receive both wideband and
575     superwideband iSAC encodings, with default values for ibitrate and
576     maxbitrate.  Superwideband is preferred.

578        m=audio 10000 RTP/AVP 98 99
579        a=rtpmap:98 isac/32000
580        a=rtpmap:99 isac/16000

582  7.  Security Considerations

584     RTP packets using the payload format defined in this specification
585     are subject to the general security considerations discussed in RFC
586     3550 section 8.1.

588     As this format transports encoded speech, the main security issues
589     include confidentiality and authentication of the speech itself.  The
590     payload format itself does not have any built-in security mechanisms.
591     External mechanisms, such as SRTP [RFC3711], MAY be used.

593     Since iSAC is a variable rate codec, the attack using the length of
594     encoded packets described in [RFC6562] is of interest.  When using
595     RTP for transport, the padding approach described in that document is
596     usable; when such padding is not available or not feasible, the iSAC
597     padding mechanism can be used to the same effect.

599  8.  Acknowledgments

601     Special thanks to Roni Even for his thorough review of the document,
602     and to Colin Perkins for additional review.

604     This document was originally prepared using 2-Word-v2.0.template.dot.

606     The present version is prepared using xml2rfc and xxe-xml2rfc.

608  9.  References

610  9.1.  Normative References

612     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
613                Requirement Levels", BCP 14, RFC 2119, March 1997.

615     [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
616                Jacobson, "RTP: A Transport Protocol for Real-Time
617                Applications", RFC 3550, July 2003.

619     [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
620                Norrman, "The Secure Real-time Transport Protocol (SRTP)",
621                RFC 3711, March 2004.

623     [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
624                Description Protocol", RFC 4566, July 2006.

626     [RFC4855]  Casner, S., "Media Type Registration of RTP Payload
627                Formats", RFC 4855, February 2007.

629     [RFC6838]  Freed, N., Klensin, J., and T. Hansen, "Media Type
630                Specifications and Registration Procedures", BCP 13,
631                RFC 6838, January 2013.

633  9.2.  Informative References

635     [RFC6562]  Perkins, C. and JM. Valin, "Guidelines for the Use of
636                Variable Bit Rate Audio with Secure RTP", RFC 6562,
637                March 2012.

639     [iSAC]     GIPS / Google, "iSAC reference implementation".
641                Available at http://code.google.com/p/webrtc/source -
642                directory src/modules/audio_coding/codecs/isac

644  Authors' Addresses

646     Tina le Grand
647     Google
648     Kungsbron 2
649     Stockholm,   11122
650     Sweden

652     Paul E. Jones
653     Cisco Systems
654     7025 Kit Creek Rd.
655     Research Triangle Park, NC  27709
656     USA

658     Phone: +1 919 476 2048
659     Fax:
660     Email: paulej@packetizer.com
661     URI:

663     Pascal Huart
664     Cisco Systems
665     400, Avenue Roumanille, Batiment T3
666     Biot - Sophia Antipolis,   06410
667     France

669     Phone: +33 4 9723 2643
670     Fax:
671     Email: phuart@cisco.com
672     URI:

674     Turaj Zakizadeh Shabestary
675     Google
676     1950 Charleston Road
677     Mountain View, CA  94043
678     USA

680     Phone:
681     Fax:
682     Email: turajs@google.com
683     URI:

685     Harald Alvestrand (editor)
686     Google
687     Kungsbron 2
688     Stockholm,   11122
689     Sweden

691     Phone:
692     Fax:
693     Email: hta@google.com
694     URI: