idnits 2.17.1

draft-ietf-avt-rtp-ilbc-01.txt:
-(372): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding
-(440): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate. This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == There are 5 instances of lines with non-ascii characters in the document.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.

  ** The abstract seems to contain references ([1]), which it shouldn't.
     Please replace those with straight textual mentions of the documents in
     question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.
  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: '3' is defined on line 412, but no explicit reference
     was found in the text

  == Unused Reference: '6' is defined on line 422, but no explicit reference
     was found in the text

  == Unused Reference: '8' is defined on line 428, but no explicit reference
     was found in the text

  == Unused Reference: '11' is defined on line 438, but no explicit reference
     was found in the text

  -- No information found for draft-ietf-avt-lbc-codec - is the name correct?

  -- Possible downref: Normative reference to a draft: ref. '1'

  ** Obsolete normative reference: RFC 1889 (ref. '4') (Obsoleted by RFC 3550)

  ** Obsolete normative reference: RFC 1890 (ref. '5') (Obsoleted by RFC 3551)

  ** Obsolete normative reference: RFC 2327 (ref. '7') (Obsoleted by RFC 4566)

  -- Possible downref: Non-RFC (?) normative reference: ref. '9'

  -- Possible downref: Non-RFC (?) normative reference: ref. '10'

  ** Obsolete normative reference: RFC 3267 (ref. '11') (Obsoleted by RFC 4867)

     Summary: 7 errors (**), 0 flaws (~~), 6 warnings (==), 6 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2      Alan Duric
3      Soren Vang Andersen
4      Internet Draft
5      draft-ietf-avt-rtp-ilbc-01.txt                       Global IP Sound
6      March 3rd, 2003
7      Expires: September 3rd, 2003

9                   RTP Payload Format for iLBC Speech

11  Status of this Memo

13     This document is an Internet-Draft and is in full conformance
14     with all provisions of Section 10 of RFC2026.

16     Internet-Drafts are working documents of the Internet Engineering
17     Task Force (IETF), its areas, and its working groups. Note that
18     other groups may also distribute working documents as Internet-
19     Drafts.
21     Internet-Drafts are draft documents valid for a maximum of six
22     months and may be updated, replaced, or obsoleted by other documents
23     at any time. It is inappropriate to use Internet-Drafts as
24     reference material or to cite them other than as "work in progress."

26     The list of current Internet-Drafts can be accessed at
27     http://www.ietf.org/ietf/1id-abstracts.txt
28     The list of Internet-Draft Shadow Directories can be accessed at
29     http://www.ietf.org/shadow.html.

31  Abstract

33     This document describes the RTP payload format for the internet Low
34     Bit Rate Coder (iLBC) Speech [1] developed by Global IP Sound
35     (GIPS). The document also includes the details necessary for the
36     use of iLBC with MIME and SDP.

38  Table of Contents

40     Status of this Memo
41     Abstract
42     Table of Contents
43     1. INTRODUCTION
44     2. BACKGROUND
45     3. RTP PAYLOAD FORMAT
46     3.1 Bitstream definition
47     3.2 Multiple iLBC frames in an RTP packet
48     4. IANA CONSIDERATIONS
49     4.1 Storage Mode
50     4.2 MIME registration of iLBC
51     6. SECURITY CONSIDERATIONS
52     7. REFERENCES
53     8. ACKNOWLEDGEMENTS
54     9. AUTHORS' ADDRESSES

56  1. INTRODUCTION

58     This document describes how compressed iLBC speech as produced by
59     the iLBC codec [1] may be formatted for use as an RTP payload type.
60     Methods are provided to packetize the codec data frames into RTP
61     packets. The sender may send one or more codec data frames per
62     packet, depending on the application scenario or based on the
63     transport network condition, bandwidth restriction, delay
64     requirements and packet-loss tolerance.

66     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
67     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
68     this document are to be interpreted as described in RFC 2119 [2].

70  2. BACKGROUND

72     Global IP Sound (GIPS) has developed and defined a freeware speech
73     compression algorithm for use in IP based communications [1]. The
74     iLBC codec enables graceful speech quality degradation in the case
75     of lost frames, which occur in connection with lost or delayed IP
76     packets.

78     Some of the applications for which this coder is suitable are: real
79     time communications such as telephony and videoconferencing,
80     streaming audio, archival and messaging.

82     The iLBC codec [1] is an algorithm that compresses each basic frame
83     (20 ms or 30 ms) of 8000 Hz, 16-bit sampled input speech into
84     output frames of 399 bits for the 30 ms basic frame size and
85     303 bits for the 20 ms basic frame size.

87     The codec supports two basic frame lengths – 30 ms at 13.33
88     kbit/s and 20 ms at 15.2 kbit/s, using a block independent linear-
89     predictive coding (LPC) algorithm. When the codec operates at block
90     lengths of 20 ms, it produces 303 bits per block which SHOULD be
91     packetized in 38 bytes. Similarly, for block lengths of 30 ms it
92     produces 399 bits per block which SHOULD be packetized in 50 bytes.
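The frame sizes and bit rates quoted above can be cross-checked with a little arithmetic. The following Python sketch is illustrative only and is not part of the payload format. The names FRAME_BITS, payload_bytes and padded_bit_rate_kbps are hypothetical, and the sketch assumes that each frame is padded up to a whole number of bytes, as the 38 and 50 byte figures imply.

```python
import math

# Encoded bits per frame for each basic frame length (ms), as stated above.
FRAME_BITS = {20: 303, 30: 399}


def payload_bytes(frame_ms: int) -> int:
    """Bytes needed to carry one frame, rounding up to a whole byte."""
    return math.ceil(FRAME_BITS[frame_ms] / 8)


def padded_bit_rate_kbps(frame_ms: int) -> float:
    """Bit rate of the byte-padded frame; bits per millisecond equals kbit/s."""
    return payload_bytes(frame_ms) * 8 / frame_ms


if __name__ == "__main__":
    for ms in (20, 30):
        print(ms, "ms:", payload_bytes(ms), "bytes,",
              round(padded_bit_rate_kbps(ms), 2), "kbit/s")
```

Under this assumption the quoted rates of 15.2 and 13.33 kbit/s correspond to the padded payloads of 304 and 400 bits; the unpadded 303 and 399 bits would give 15.15 and 13.3 kbit/s.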
93     The described algorithm results in a speech coding system with a
94     controlled response to packet losses similar to what is known from
95     pulse code modulation (PCM) with packet loss concealment (PLC),
96     such as ITU-T G.711 standard [10], which operates at a fixed bit rate
97     of 64 kbit/s. At the same time, the described algorithm enables
98     fixed bit rate coding with a quality-versus-bit rate tradeoff close
99     to what is known from code-excited linear prediction (CELP).

101 3. RTP PAYLOAD FORMAT

103    The iLBC codec uses 20 or 30 ms frames and a sampling rate clock of
104    8 kHz, so the RTP timestamp MUST be in units of 1/8000 of a second.
105    The RTP payload for iLBC has the format shown in the figure below.
106    No additional header specific to this payload format is required.

108    This format is intended for the situations where the sender and the
109    receiver send one or more codec data frames per packet. The RTP
110    packet looks as follows:

112     0                   1                   2                   3
113     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
114    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
115    |                        RTP Header [4]                         |
116    +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
117    |                                                               |
118    +                one or more frames of iLBC [1]                 |
119    |                                                               |
120    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

122    The RTP header of the packetized encoded iLBC speech has the
123    expected values as described in [4]. The usage of the M bit should be
124    as specified in the applicable RTP profile, for example, RFC 1890 [5],
125    where [5] specifies that if the sender does not suppress silence
126    (i.e., sends a frame on every frame interval), the M bit will always
127    be zero. When more than one codec data frame is present in a single
128    RTP packet, the timestamp is, as always, that of the oldest data
129    frame represented in the RTP packet.
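As a minimal sketch of the timestamp rule above, assuming a sender that transmits continuously without silence suppression: the timestamp of each packet is that of its oldest frame, so consecutive packets differ by the number of samples the previous packet covered. The function names below are hypothetical; the 32-bit wrap reflects the width of the RTP timestamp field.

```python
SAMPLE_RATE_HZ = 8000  # RTP timestamp clock for iLBC (units of 1/8000 s)


def samples_per_frame(frame_ms: int) -> int:
    """Number of 8 kHz samples covered by one 20 ms or 30 ms frame."""
    return SAMPLE_RATE_HZ * frame_ms // 1000


def next_timestamp(current_ts: int, frames_in_packet: int, frame_ms: int) -> int:
    """Timestamp of the packet that follows one carrying `frames_in_packet` frames.

    `current_ts` is the timestamp of the oldest frame in the current packet.
    The result wraps at 32 bits, the size of the RTP timestamp field.
    """
    advance = frames_in_packet * samples_per_frame(frame_ms)
    return (current_ts + advance) & 0xFFFFFFFF


if __name__ == "__main__":
    ts = 0
    for _ in range(3):
        print(ts)
        ts = next_timestamp(ts, frames_in_packet=2, frame_ms=30)
```

With two 30 ms frames per packet, for example, each packet advances the timestamp by 480 units.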
131    The assignment of an RTP payload type for this new packet format is
132    outside the scope of this document, and will not be specified here.
133    It is expected that the RTP profile for a particular class of
134    applications will assign a payload type for this encoding, or if
135    that is not done, then a payload type in the dynamic range shall be
136    chosen by the sender.

138 3.1 Bitstream definition

140    The total number of bits used to describe one frame of 20 ms speech
141    is 303, which fits in 38 bytes and results in a bit rate of 15.20
142    kbit/s. For the case with a frame length of 30 ms speech the total
143    number of bits used is 399, which fits in 50 bytes and results in a
144    bit rate of 13.33 kbit/s. In the bitstream definition the bits are
145    distributed into three classes according to their bit error or loss
146    sensitivity. The most sensitive bits (class 1) are placed first in
147    the bitstream for each frame. The less sensitive bits (class 2) are
148    placed after the class 1 bits. The least sensitive bits (class 3)
149    are placed at the end of the bitstream for each frame.

151    Looking at the 20/30 ms frame length cases for each class: The
152    class 1 bits occupy a total of 6/8 bytes (48/64 bits), the class 2
153    bits occupy 8/12 bytes (64/96 bits), and the class 3 bits occupy
154    24/30 bytes (191/239 bits). This distribution of the bits enables the
155    use of uneven level protection (ULP). The detailed bit allocation is
156    shown in the table below. When a quantization index is distributed
157    across several classes the more significant bits belong to the lowest
158    class.
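The last rule can be illustrated with a short Python sketch. It is an illustration only and not taken from the codec specification: the function name split_index is hypothetical, and "lowest class" is read here as the lowest-numbered class, so that class 1 receives the most significant bits. The widths argument is a class allocation in the <1,2,3> notation used by the table.

```python
from typing import List, Sequence


def split_index(value: int, widths: Sequence[int]) -> List[int]:
    """Split a quantization index into per-class chunks, most significant first.

    `widths` gives the number of bits assigned to class 1, 2 and 3.
    The first chunk holds the top widths[0] bits, and so on.
    """
    total_bits = sum(widths)
    if not 0 <= value < (1 << total_bits):
        raise ValueError("value does not fit in the given number of bits")
    chunks = []
    remaining = total_bits
    for width in widths:
        remaining -= width
        # Shift the wanted bits down to the bottom and mask off the rest.
        chunks.append((value >> remaining) & ((1 << width) - 1))
    return chunks


if __name__ == "__main__":
    # A 7-bit index with allocation <4,2,1>: 1011 | 01 | 1
    print(split_index(0b1011011, (4, 2, 1)))
```

A class with zero bits simply yields a chunk of 0, so the same helper covers allocations such as <0,1,2>.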
160    Bitstream structure:

162    ------------------------------------------------------------------+
163    Parameter                         |       Bits Class <1,2,3>      |
164                                      |  20 ms frame  |  30 ms frame  |
165    ----------------------------------+---------------+---------------+
166                             Split 1  |   6 <6,0,0>   |   6 <6,0,0>   |
167                    LSF 1    Split 2  |   7 <7,0,0>   |   7 <7,0,0>   |
168    LSF                      Split 3  |   7 <7,0,0>   |   7 <7,0,0>   |
169                    ------------------+---------------+---------------+
170                             Split 1  | NA (Not Appl.)|   6 <6,0,0>   |
171                    LSF 2    Split 2  |      NA       |   7 <7,0,0>   |
172                             Split 3  |      NA       |   7 <7,0,0>   |
173                    ------------------+---------------+---------------+
174                    Sum               |  20 <20,0,0>  |  40 <40,0,0>  |
175    ----------------------------------+---------------+---------------+
176    Block Class.                      |   2 <2,0,0>   |   3 <3,0,0>   |
177    ----------------------------------+---------------+---------------+
178    Position 22 sample segment        |   1 <1,0,0>   |   1 <1,0,0>   |
179    ----------------------------------+---------------+---------------+
180    Scale Factor State Coder          |   6 <6,0,0>   |   6 <6,0,0>   |
181    ----------------------------------+---------------+---------------+
182                    Sample 0          |   3 <0,1,2>   |   3 <0,1,2>   |
183    Quantized       Sample 1          |   3 <0,1,2>   |   3 <0,1,2>   |
184    Residual           :              |   :    :      |   :    :      |
185    State              :              |   :    :      |   :    :      |
186    Samples            :              |   :    :      |   :    :      |
187                    Sample 56         |   3 <0,1,2>   |   3 <0,1,2>   |
188                    Sample 57         |      NA       |   3 <0,1,2>   |
189                    ------------------+---------------+---------------+
190                    Sum               | 171 <0,57,114>| 174 <0,58,116>|
191    ----------------------------------+---------------+---------------+
192                             Stage 1  |   7 <6,0,1>   |   7 <4,2,1>   |
193    CB for 22/23             Stage 2  |   7 <0,0,7>   |   7 <0,0,7>   |
194    sample block             Stage 3  |   7 <0,0,7>   |   7 <0,0,7>   |
195                    ------------------+---------------+---------------+
196                    Sum               |  21 <6,0,15>  |  21 <4,2,15>  |
197    ----------------------------------+---------------+---------------+
198                             Stage 1  |   5 <2,0,3>   |   5 <1,1,3>   |
199    Gain for 22/23           Stage 2  |   4 <1,1,2>   |   4 <1,1,2>   |
200    sample block             Stage 3  |   3 <0,0,3>   |   3 <0,0,3>   |
201                    ------------------+---------------+---------------+
202                    Sum               |  12 <3,1,8>   |  12 <2,2,8>   |
203    ----------------------------------+---------------+---------------+
204                             Stage 1  |   8 <7,0,1>   |   8 <6,1,1>   |
205                sub-block 1  Stage 2  |   7 <0,0,7>   |   7 <0,0,7>   |
206                             Stage 3  |   7 <0,0,7>   |   7 <0,0,7>   |
207                    ------------------+---------------+---------------+
208                             Stage 1  |   8 <0,0,8>   |   8 <0,7,1>   |
209                sub-block 2  Stage 2  |   8 <0,0,8>   |   8 <0,0,8>   |
210    Indices                  Stage 3  |   8 <0,0,8>   |   8 <0,0,8>   |
211    for CB          ------------------+---------------+---------------+
212    sub-blocks               Stage 1  |      NA       |   8 <0,7,1>   |
213                sub-block 3  Stage 2  |      NA       |   8 <0,0,8>   |
214                             Stage 3  |      NA       |   8 <0,0,8>   |
215                    ------------------+---------------+---------------+
216                             Stage 1  |      NA       |   8 <0,7,1>   |
217                sub-block 4  Stage 2  |      NA       |   8 <0,0,8>   |
218                             Stage 3  |      NA       |   8 <0,0,8>   |
219                    ------------------+---------------+---------------+
220                    Sum               |  46 <7,0,39>  |  94 <6,22,66> |
221    ----------------------------------+---------------+---------------+
222                             Stage 1  |   5 <1,2,2>   |   5 <1,2,2>   |
223                sub-block 1  Stage 2  |   4 <1,1,2>   |   4 <1,2,1>   |
224                             Stage 3  |   3 <0,0,3>   |   3 <0,0,3>   |
225                    ------------------+---------------+---------------+
226                             Stage 1  |   5 <1,1,3>   |   5 <0,2,3>   |
227                sub-block 2  Stage 2  |   4 <0,2,2>   |   4 <0,2,2>   |
228                             Stage 3  |   3 <0,0,3>   |   3 <0,0,3>   |
229    Gains for       ------------------+---------------+---------------+
230    sub-blocks               Stage 1  |      NA       |   5 <0,1,4>   |
231                sub-block 3  Stage 2  |      NA       |   4 <0,1,3>   |
232                             Stage 3  |      NA       |   3 <0,0,3>   |
233                    ------------------+---------------+---------------+
234                             Stage 1  |      NA       |   5 <0,1,4>   |
235                sub-block 4  Stage 2  |      NA       |   4 <0,1,3>   |
236                             Stage 3  |      NA       |   3 <0,0,3>   |
237                    ------------------+---------------+---------------+
238                    Sum               |  24 <3,6,15>  |  48 <2,12,34> |
239    -------------------------------------------------------------------
240    SUM                                303 <48,64,191> 399 <64,96,239>

242    Table 3.1 The bitstream definition for iLBC.
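The totals in the last row of Table 3.1 can be re-derived from the per-section sums. The Python sketch below is a consistency check only, with hypothetical names (SECTIONS, class_totals, class_bytes); the figures are copied from the "Sum" rows and the single-row entries of the table. It also shows how the per-class byte counts of 6/8, 8/12 and 24/30 follow when each class is rounded up to whole bytes.

```python
import math

# (class 1, class 2, class 3) bit counts per section of Table 3.1,
# keyed by frame length in ms.
SECTIONS = {
    "LSF":                     {20: (20, 0, 0),   30: (40, 0, 0)},
    "Block class.":            {20: (2, 0, 0),    30: (3, 0, 0)},
    "Position 22 sample seg.": {20: (1, 0, 0),    30: (1, 0, 0)},
    "Scale factor state":      {20: (6, 0, 0),    30: (6, 0, 0)},
    "Quantized state samples": {20: (0, 57, 114), 30: (0, 58, 116)},
    "CB for 22/23 block":      {20: (6, 0, 15),   30: (4, 2, 15)},
    "Gain for 22/23 block":    {20: (3, 1, 8),    30: (2, 2, 8)},
    "CB indices, sub-blocks":  {20: (7, 0, 39),   30: (6, 22, 66)},
    "Gains, sub-blocks":       {20: (3, 6, 15),   30: (2, 12, 34)},
}


def class_totals(frame_ms: int) -> tuple:
    """Total bits in class 1, 2 and 3 for the given frame length."""
    rows = [section[frame_ms] for section in SECTIONS.values()]
    return tuple(sum(column) for column in zip(*rows))


def class_bytes(frame_ms: int) -> tuple:
    """Bytes occupied by each class, rounding each class up to a whole byte."""
    return tuple(math.ceil(bits / 8) for bits in class_totals(frame_ms))


if __name__ == "__main__":
    for ms in (20, 30):
        totals = class_totals(ms)
        print(ms, "ms:", totals, "sum", sum(totals), "bytes", class_bytes(ms))
```

Only class 3 is not a multiple of eight bits (191 and 239), which accounts for the single unused bit at the end of each payload.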
244    When packetized into the payload the bits MUST be sorted as: All the
245    class 1 bits in the order (from top to bottom) as they were specified
246    in the table, all the class 2 bits (from top to bottom) and finally
247    all the class 3 bits in the same sequential order.

249    The last unused bit of the payload (for both 20 ms and 30 ms frame
250    size) SHOULD be set to zero.

252 3.2 Multiple iLBC frames in an RTP packet

254    More than one iLBC frame may be included in a single RTP packet by a
255    sender.

257    It is important to observe that senders have the following
258    additional restrictions:

260    o SHOULD NOT include more iLBC frames in a single RTP packet than
261      will fit in the MTU of the RTP transport protocol.

263    o Frames MUST NOT be split between RTP packets.

265    It is RECOMMENDED that the number of frames contained within an RTP
266    packet is consistent with the application. For example, in
267    telephony and other real-time applications where delay is important,
268    the fewer frames per packet the lower the delay, whereas for
269    bandwidth-constrained links or delay-insensitive streaming and
270    messaging applications, more than one or many frames per packet
271    would be acceptable.

273    Information describing the number of frames contained in an RTP
274    packet is not transmitted as part of the RTP payload. The way to
275    determine the number of iLBC frames is to count the total number of
276    octets within the RTP payload, and divide the octet count by the
277    number of expected octets per frame (38/50 per frame).

279 4. IANA CONSIDERATIONS

281    One new MIME sub-type as described in this section is to be
282    registered.

284 4.1 Storage Mode

286    The storage mode is used for storing speech frames (e.g. as a file
287    or e-mail attachment).
289    +------------------+
290    | Header           |
291    +------------------+
292    | Speech frame 1   |
293    +------------------+
294    :                  :
295    +------------------+
296    | Speech frame n   |
297    +------------------+

299    The file begins with a header that includes only a magic number to
300    identify that it is an iLBC file. The magic number for an iLBC file
301    MUST correspond to the ASCII character string "#!iLBC\n", or "0x23
302    0x21 0x69 0x4C 0x42 0x43 0x0A" in hexadecimal form. The speech
303    frames follow the header in consecutive order.

305 4.2 MIME registration of iLBC

307    MIME media type name: audio
308    MIME subtype: iLBC

310    Optional parameters:

312       This parameter applies to RTP transfer only.

314       maxptime: The maximum amount of media which can be
315                 encapsulated in a payload packet, expressed
316                 as time in milliseconds. The time is
317                 calculated as the sum of the time the media
318                 present in the packet represents. The time SHOULD be
319                 a multiple of the frame size. If this parameter is
320                 not present, the sender MAY encapsulate any number of
321                 speech frames into one RTP packet.

323    Encoding considerations:
324       This type is defined for transfer via both RTP (RFC
325       1889) and stored-file methods as described in Section
326       4.1 of RFC XXXX. Audio data is binary data, and must
327       be encoded for non-binary transport; the Base64
328       encoding is suitable for Email.

330    Security considerations:
331       See Section 6 of RFC XXXX.

333    Public specification:
334       Please refer to RFC XXXX [1].

336    Additional information:
337       The following applies to stored-file transfer
338       methods:

340       Magic number:
341          ASCII character string "#!iLBC\n"
342          (or 0x23 0x21 0x69 0x4C 0x42 0x43 0x0A in
343          hexadecimal)

345       File extensions: lbc, LBC
346       Macintosh file type code: none
347       Object identifier or OID: none

349    Person & email address to contact for further information:
350       alan.duric@globalipsound.com

352    Intended usage: COMMON.
353       It is expected that many VoIP applications will use
354       this type.

356    Author/Change controller:
357       alan.duric@globalipsound.com
358       IETF Audio/Video transport working group

359 5. MAPPING TO SDP PARAMETERS

361    Parameters are mapped to SDP [7] in a standard way. When conveying
362    information by SDP, the encoding name SHALL be "iLBC" (the same as
363    the MIME subtype). An example of the media representation in SDP for
364    describing iLBC might be:

366       m=audio 49120 RTP/AVP 97
367       a=rtpmap:97 iLBC/8000

369    If 20 ms frame size mode is used, the remote iLBC encoder SHALL receive
370    the “mode” parameter in the SDP "a=fmtp" attribute, copied
371    directly from the MIME media type string as a semicolon-separated
372    list of parameter=value pairs, where parameter is “mode” and the value can be
373    0, 20 or 30 (where 0 stands for support of both frame size modes; 20
374    stands for preferred 20 ms frame size, etc.). An example of the
375    media representation in SDP for describing iLBC when 20 ms frame
376    size mode is used might be:

378       m=audio 49120 RTP/AVP 97
379       a=rtpmap:97 iLBC/8000
380       a=fmtp:97 mode=20

382 6. SECURITY CONSIDERATIONS

384    RTP packets using the payload format defined in this specification
385    are subject to the general security considerations discussed in [4]
386    and any appropriate profile (e.g. [5]).

388    As this format transports encoded speech, the main security issues
389    include confidentiality and authentication of the speech itself. The
390    payload format itself does not have any built-in security
391    mechanisms. Confidentiality of the media streams is achieved by
392    encryption; therefore external mechanisms, such as SRTP [9], MAY be
393    used for that purpose. The data compression used with this payload
394    format is applied end-to-end; hence encryption may be performed
395    after compression with no conflict between the two operations.
397    A potential denial-of-service threat exists for data encoding using
398    compression techniques that have non-uniform receiver-end
399    computational load. The attacker can inject pathological datagrams
400    into the stream which are complex to decode and cause the receiver
401    to become overloaded. However, the encodings covered in this
402    document do not exhibit any significant non-uniformity.

404 7. REFERENCES

406    [1] Andersen, et al., "Internet Low Bit Rate Codec (iLBC)", draft-
407    ietf-avt-lbc-codec-01.txt, March 2003.

409    [2] S. Bradner, "Key words for use in RFCs to Indicate Requirement
410    Levels", BCP 14, RFC 2119, March 1997.

412    [3] S. Bradner, "The Internet Standards Process -- Revision 3", BCP
413    9, RFC 2026, October 1996.

415    [4] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP:
416    A Transport Protocol for Real-Time Applications", IETF RFC 1889,
417    January 1996.

419    [5] H. Schulzrinne, "RTP Profile for Audio and Video Conferences
420    with Minimal Control", IETF RFC 1890, January 1996.

422    [6] Handley & Perkins, "Guidelines for Writers of RTP Payload
423    Formats", BCP 36, RFC 2736, December 1999.

425    [7] M. Handley and V. Jacobson, "SDP: Session Description Protocol",
426    IETF RFC 2327, April 1998.

428    [8] N. Freed and N. Borenstein, "Multipurpose Internet Mail
429    Extensions (MIME) Part One: Format of Internet Message Bodies",
430    IETF RFC 2045, November 1996.

432    [9] Baugher, et al., "The Secure Real Time Transport Protocol", IETF
433    Draft, June 2002.

435    [10] ITU-T Recommendation G.711, available online from the ITU
436    bookstore at http://www.itu.int.

438    [11] J. Sjoberg, M. Westerlund, A. Lakaniemi, Q. Xie, “RTP payload
439    format and file storage format for the Adaptive Multi-Rate (AMR)
440    and Adaptive Multi-Rate Wideband (AMR-WB) audio codecs”, IETF RFC
441    3267, June 2002.

443 8. ACKNOWLEDGEMENTS

445    The authors wish to thank Henry Sinnreich and Patrik Faltstrom for
446    their great support of the iLBC initiative and for their valuable
447    feedback and comments.

449 9. AUTHORS' ADDRESSES

451    Alan Duric
452    Global IP Sound AB
453    Rosenlundsgatan 54
454    Stockholm, S-11863
455    Sweden
456    Phone: +46 8 54553040
457    Email: alan.duric@globalipsound.com

459    Soren Vang Andersen
460    Department of Communication Technology
461    Aalborg University
462    Fredrik Bajers Vej 7A
463    9200 Aalborg
464    Denmark
465    Phone: ++45 9 6358627
466    Email: sva@kom.auc.dk