idnits 2.17.1 draft-ietf-avt-rtp-ilbc-00.txt: -(420): Line appears to be too long, but this could be caused by non-ascii characters in UTF-8 encoding Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == There are 2 instances of lines with non-ascii characters in the document. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** The abstract seems to contain references ([1]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- Couldn't find a document date in the document -- date freshness check skipped. 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: '3' is defined on line 392, but no explicit reference was found in the text == Unused Reference: '6' is defined on line 402, but no explicit reference was found in the text == Unused Reference: '8' is defined on line 408, but no explicit reference was found in the text == Unused Reference: '11' is defined on line 418, but no explicit reference was found in the text == Outdated reference: A later version (-05) exists of draft-ietf-avt-rtp-ilbc-00 ** Downref: Normative reference to an Experimental draft: draft-ietf-avt-rtp-ilbc (ref. '1') ** Obsolete normative reference: RFC 1889 (ref. '4') (Obsoleted by RFC 3550) ** Obsolete normative reference: RFC 1890 (ref. '5') (Obsoleted by RFC 3551) ** Obsolete normative reference: RFC 2327 (ref. '7') (Obsoleted by RFC 4566) -- Possible downref: Non-RFC (?) normative reference: ref. '9' -- Possible downref: Non-RFC (?) normative reference: ref. '10' ** Obsolete normative reference: RFC 3267 (ref. '11') (Obsoleted by RFC 4867) Summary: 8 errors (**), 0 flaws (~~), 7 warnings (==), 4 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Alan Duric 3 Soren Vang Andersen 4 Internet Draft 5 draft-ietf-avt-rtp-ilbc-00.txt Global IP Sound 6 October 28th, 2002 7 Expires: April, 28th, 2003 9 RTP Payload Format for iLBC Speech 11 Status of this Memo 13 This document is an Internet-Draft and is in full conformance 14 with all provisions of Section 10 of RFC2026. 16 Internet-Drafts are working documents of the Internet Engineering 17 Task Force (IETF), its areas, and its working groups. 
Note that 18 other groups may also distribute working documents as 19 Internet-Drafts. 21 Internet-Drafts are draft documents valid for a maximum of six 22 months and may be updated, replaced, or obsoleted by other documents 23 at any time. It is inappropriate to use Internet-Drafts as 24 reference material or to cite them other than as "work in progress." 26 The list of current Internet-Drafts can be accessed at 27 http://www.ietf.org/ietf/1id-abstracts.txt 28 The list of Internet-Draft Shadow Directories can be accessed at 29 http://www.ietf.org/shadow.html. 31 Abstract 33 This document describes the RTP payload format for the internet Low 34 Bit Rate Coder (iLBC) Speech [1] developed by Global IP Sound 35 (GIPS). It also includes the details necessary 36 for using iLBC with MIME and SDP. 38 Table of Contents 40 Status of this Memo................................................1 41 Abstract...........................................................1 42 Table of Contents..................................................1 43 1. INTRODUCTION....................................................2 44 2. BACKGROUND......................................................2 45 3. RTP PAYLOAD FORMAT..............................................3 46 3.1 Bitstream definition...........................................3 47 3.2 Multiple iLBC frames in a RTP packet...........................5 48 4. IANA CONSIDERATIONS.............................................6 49 4.1 Storage Mode...................................................6 50 4.2 MIME registration of iLBC......................................6 51 6. SECURITY CONSIDERATIONS.........................................7 52 7. REFERENCES......................................................8 53 8. ACKNOWLEDGEMENTS................................................9 54 9. AUTHOR'S ADDRESSES..............................................9 56 1. 
INTRODUCTION 58 This document describes how compressed iLBC speech as produced by 59 the iLBC codec [1] may be formatted for use as an RTP payload type. 60 Methods are provided to packetize the codec data frames into RTP 61 packets. The sender may send one or more codec data frames per 62 packet, depending on the application scenario or on the 63 transport network conditions, bandwidth restrictions, delay 64 requirements, and packet-loss tolerance. 66 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 67 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in 68 this document are to be interpreted as described in RFC 2119 [2]. 70 2. BACKGROUND 72 Global IP Sound (GIPS) has developed and defines a freeware speech 73 compression algorithm for use in IP-based communications [1]. The 74 iLBC codec enables graceful speech quality degradation in the case 75 of lost frames, which occur in connection with lost or delayed IP 76 packets. 78 Some of the applications for which this coder is suitable are 79 real-time communications such as telephony and videoconferencing, 80 streaming audio, archival, and messaging. 82 The iLBC codec [1] is an algorithm that compresses each 30 ms of 83 8000 Hz, 16-bit sampled input speech into output frames 84 of 399 bits. 86 The codec has a bit rate of 13.33 kbit/s using a block-independent 87 linear-predictive coding (LPC) algorithm. The codec operates at 88 block lengths of 30 ms and produces 399 bits per block, which can be 89 packetized in 50 bytes. The described algorithm results in a speech 90 coding system with a controlled response to packet losses similar to 91 what is known from pulse code modulation (PCM) with packet loss 92 concealment (PLC), such as the ITU-T G.711 standard [10], which operates 93 at a fixed bit rate of 64 kbit/s. 
At the same time, the described 94 algorithm enables fixed bit rate coding with a quality-versus-bit 95 rate tradeoff close to what is known from code-excited linear 96 prediction (CELP). 98 3. RTP PAYLOAD FORMAT 100 The iLBC codec uses 30 ms frames and a sampling rate clock of 8 kHz, 101 so the RTP timestamp MUST be in units of 1/8000 of a second. The RTP 102 payload for iLBC has the format shown in the figure below. No 103 additional header specific to this payload format is required. 105 This format is intended for the situations where the sender and the 106 receiver send one or more codec data frames per packet. The RTP 107 packet looks as follows:
109  0                   1                   2                   3
110  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
111 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
112 |                      RTP Header [4]                           |
113 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
114 |                                                               |
115 +                 one or more frames of iLBC [1]                |
116 |                                                               |
117 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
119 The RTP header of the packetized encoded iLBC speech has the 120 expected values as described in [4]. The usage of the M bit should be as 121 specified in the applicable RTP profile, for example, RFC 1890 [5], 122 where [5] specifies that if the sender does not suppress silence 123 (i.e., sends a frame on every 30 millisecond interval), the M bit 124 will always be zero. When more than one codec data frame is present 125 in a single RTP packet, the timestamp is, as always, that of the 126 oldest data frame represented in the RTP packet. 128 The assignment of an RTP payload type for this new packet format is 129 outside the scope of this document, and will not be specified here. 130 It is expected that the RTP profile for a particular class of 131 applications will assign a payload type for this encoding, or if 132 that is not done, then a payload type in the dynamic range shall be 133 chosen by the sender. 
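As an informal illustration of the timing rules above (not part of the payload format itself), the following Python sketch derives the RTP timestamp advance per packet from the 8000 Hz clock and the 30 ms frame length, and works out the bit rates implied by 399 bits or 50 octets per frame. All names are hypothetical and chosen for this example only; the 32-bit wrap-around of the timestamp field is taken from the RTP specification [4].

```python
# Non-normative sketch; every name here is an assumption of this example.
SAMPLE_RATE_HZ = 8000   # RTP timestamp clock for iLBC (Section 3)
FRAME_MS = 30           # duration of one iLBC frame
FRAME_BITS = 399        # encoded bits per frame
FRAME_OCTETS = 50       # 399 bits rounded up to whole octets


def timestamp_ticks_per_frame() -> int:
    """RTP timestamp units covered by one 30 ms frame."""
    return SAMPLE_RATE_HZ * FRAME_MS // 1000


def next_timestamp(timestamp: int, frames_in_packet: int) -> int:
    """Timestamp of the packet following one that carried
    frames_in_packet frames (the RTP timestamp is a 32-bit field)."""
    advance = frames_in_packet * timestamp_ticks_per_frame()
    return (timestamp + advance) % 2**32


def codec_bit_rate() -> float:
    """Bit rate in bit/s counting only the 399 coded bits."""
    return FRAME_BITS * 1000 / FRAME_MS


def payload_bit_rate() -> float:
    """Bit rate in bit/s counting the full 50-octet payload per frame."""
    return FRAME_OCTETS * 8 * 1000 / FRAME_MS
```

With one frame per packet the timestamp advances by 240 per packet, and with two frames per packet by 480. The arithmetic also shows that 399 bits per 30 ms is 13.3 kbit/s, while 13.33 kbit/s matches the full 50-octet (400-bit) payload.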
135 3.1 Bitstream definition 137 The total number of bits used to describe one block of 30 ms speech 138 is 399, which fits in 50 bytes and results in a bit rate of 13.33 139 kbit/s. In the bitstream definition the bits are distributed into 140 three classes according to their bit error or loss sensitivity. The 141 most sensitive bits (class 1) are placed first in the bitstream for 142 each frame. The less sensitive bits (class 2) are placed after the 143 class 1 bits. The least sensitive bits (class 3) are placed at the 144 end of the bitstream for each frame. 146 The class 1 bits occupy a total of 8 bytes (64 bits), the class 2 147 bits occupy 12 bytes (96 bits), and the class 3 bits occupy 30 bytes 148 (239 bits). This distribution of the bits enables the use of uneven 149 level protection (ULP). The detailed bit allocation is shown in the 150 table below. When a quantization index is distributed across more than one 151 class, the more significant bits belong to the lowest class.
153 Bitstream structure:
155 Parameter                                       Bits    Class 1,2,3
157 -------------------------------------------------------------------
158                                      Split 1       6    6,0,0
159                 LSF 1                Split 2       7    7,0,0
160 LSF                                  Split 3       7    7,0,0
161                ----------------------------------------------------
162                                      Split 1       6    6,0,0
163                 LSF 2                Split 2       7    7,0,0
164                                      Split 3       7    7,0,0
165                ----------------------------------------------------
166                                      Sum          40    40,0,0
167 -------------------------------------------------------------------
168 Block Class.                                       3    3,0,0
169 -------------------------------------------------------------------
170 Position 22 sample segment                         1    1,0,0
171 -------------------------------------------------------------------
172 Scale Factor State Coder                           6    6,0,0
173 -------------------------------------------------------------------
174                 Sample 0                           3    0,1,2
175 Quantized       Sample 1                           3    0,1,2
176 Residual           :                               :      :
177 State              :                               :      :
178 Samples            :                               :      :
179                 Sample 56                          3    0,1,2
180                 Sample 57                          3    0,1,2
181                ----------------------------------------------------
182                                      Sum         174    0,58,116
183 -------------------------------------------------------------------
184                                      Stage 1       7    4,2,1
185 CB for 22 samples in start state     Stage 2       7    0,0,7
186                                      Stage 3       7    0,0,7
187                ----------------------------------------------------
188                                      Sum          21    4,2,15
189 -------------------------------------------------------------------
190                                      Stage 1       5    1,1,3
191 Gain for 22 samples in start state   Stage 2       4    1,1,2
192                                      Stage 3       3    0,0,3
193                ----------------------------------------------------
194                                      Sum          12    2,2,8
195 -------------------------------------------------------------------
196                                      Stage 1       8    6,1,1
197                 Indices sub-block 1  Stage 2       7    0,0,7
198                                      Stage 3       7    0,0,7
199                ----------------------------------------------------
200                                      Stage 1       8    0,7,1
201                 Indices sub-block 2  Stage 2       8    0,0,8
202                                      Stage 3       8    0,0,8
203 CB sub-blocks  ----------------------------------------------------
204                                      Stage 1       8    0,7,1
205                 Indices sub-block 3  Stage 2       8    0,0,8
206                                      Stage 3       8    0,0,8
207                ----------------------------------------------------
208                                      Stage 1       8    0,7,1
209                 Indices sub-block 4  Stage 2       8    0,0,8
210                                      Stage 3       8    0,0,8
211                ----------------------------------------------------
212                                      Sum          94    6,22,66
213 -------------------------------------------------------------------
214                                      Stage 1       5    1,2,2
215                 Gains sub-block 1    Stage 2       4    1,2,1
216                                      Stage 3       3    0,0,3
217                ----------------------------------------------------
218                                      Stage 1       5    0,2,3
219                 Gains sub-block 2    Stage 2       4    0,2,2
220                                      Stage 3       3    0,0,3
221 Gain sub-blocks ---------------------------------------------------
222                                      Stage 1       5    0,1,4
223                 Gains sub-block 3    Stage 2       4    0,1,3
224                                      Stage 3       3    0,0,3
225                ----------------------------------------------------
226                                      Stage 1       5    0,1,4
227                 Gains sub-block 4    Stage 2       4    0,1,3
228                                      Stage 3       3    0,0,3
229                ----------------------------------------------------
230                                      Sum          48    2,12,34
231 -------------------------------------------------------------------
232                                      SUM         399    64,96,239
234 Table 3.1 The bitstream definition for iLBC.
236 When packetized into the payload, the bits MUST be sorted as follows: all the 237 class 1 bits in the order (from top to bottom) in which they are specified 238 in the table, then all the class 2 bits (from top to bottom), and finally 239 all the class 3 bits in the same sequential order. The last, unused 240 bit of the payload SHOULD be set to zero. 242 3.2 Multiple iLBC frames in a RTP packet 244 More than one iLBC frame may be included in a single RTP packet by a 245 sender. 247 It is important to observe that senders have the following 248 additional restrictions: 250 o SHOULD NOT include more iLBC frames in a single RTP packet than 251 will fit in the MTU of the RTP transport protocol. 253 o Frames MUST NOT be split between RTP packets. 255 It is RECOMMENDED that the number of frames contained within an RTP 256 packet is consistent with the application. For example, in 257 telephony and other real-time applications where delay is important, 258 the fewer frames per packet, the lower the delay, whereas for 259 bandwidth-constrained links or delay-insensitive streaming or messaging 260 applications, more than one or many frames per packet would be 261 acceptable. 263 Information describing the number of frames contained in an RTP 264 packet is not transmitted as part of the RTP payload. The way to 265 determine the number of iLBC frames is to count the total number of 266 octets within the RTP packet, and divide the octet count by the 267 number of expected octets per frame (50 per frame). 269 4. 
IANA CONSIDERATIONS 271 One new MIME sub-type as described in this section is to be 272 registered. 274 4.1 Storage Mode 276 The storage mode is used for storing speech frames (e.g. as a file 277 or e-mail attachment).
279 +------------------+
280 | Header           |
281 +------------------+
282 | Speech frame 1   |
283 +------------------+
284 :                  :
285 +------------------+
286 | Speech frame n   |
287 +------------------+
289 The file begins with a header that includes only a magic number to 290 identify that it is an iLBC file. The magic number for an iLBC file 291 MUST correspond to the ASCII character string "#!iLBC\n", or "0x23 292 0x21 0x69 0x4C 0x42 0x43 0x0A" in hexadecimal form. The speech frames 293 follow the header in consecutive order. 295 4.2 MIME registration of iLBC 297 MIME media type name: audio 299 MIME subtype: iLBC 301 Optional parameters: 303 This parameter applies to RTP transfer only. 305 maxptime: The maximum amount of media which can be 306 encapsulated in a payload packet, expressed 307 as time in milliseconds. The time is 308 calculated as the sum of the time the media 309 present in the packet represents. The time SHOULD be 310 a multiple of the frame size. If this parameter is 311 not present, the sender MAY encapsulate any number of 312 speech frames into one RTP packet. 314 Encoding considerations: 316 This type is defined for transfer via both RTP (RFC 317 1889) and stored-file methods as described in Section 318 4.1 of RFC XXXX. Audio data is binary data, and must 319 be encoded for non-binary transport; the Base64 320 encoding is suitable for Email. 322 Security considerations: 323 See Section 6 of RFC XXXX. 325 Public specification: 326 Please refer to RFC XXXX [1]. 
328 Additional information: 329 The following applies to stored-file transfer 330 methods: 332 Magic number: 333 ASCII character string "#!iLBC\n" 334 (or 0x23 0x21 0x69 0x4C 0x42 0x43 0x0A in 335 hexadecimal) 337 File extensions: lbc, LBC 338 Macintosh file type code: none 339 Object identifier or OID: none 341 Person & email address to contact for further information: 342 alan.duric@globalipsound.com 344 Intended usage: COMMON. 345 It is expected that many VoIP applications will use 346 this type. 348 Author/Change controller: 349 alan.duric@globalipsound.com 350 IETF Audio/Video transport working group 352 5. MAPPING TO SDP PARAMETERS 354 Parameters are mapped to SDP [7] in a standard way. When conveying 355 information by SDP, the encoding name SHALL be "iLBC" (the same as 356 the MIME subtype). An example of the media representation in SDP for 357 describing iLBC might be:
359 m=audio 49120 RTP/AVP 97
360 a=rtpmap:97 iLBC/8000
362 6. SECURITY CONSIDERATIONS 364 RTP packets using the payload format defined in this specification 365 are subject to the general security considerations discussed in [4] 366 and any appropriate profile (e.g. [5]). 368 As this format transports encoded speech, the main security issues 369 include confidentiality and authentication of the speech itself. The 370 payload format itself does not have any built-in security 371 mechanisms. Confidentiality of the media streams is achieved by 372 encryption; therefore, external mechanisms, such as SRTP [9], MAY be 373 used for that purpose. The data compression used with this payload 374 format is applied end-to-end; hence encryption may be performed 375 after compression with no conflict between the two operations. 377 A potential denial-of-service threat exists for data encoding using 378 compression techniques that have non-uniform receiver-end 379 computational load. 
The attacker can inject pathological datagrams 380 into the stream which are complex to decode and cause the receiver 381 to become overloaded. However, the encodings covered in this 382 document do not exhibit any significant non-uniformity. 384 7. REFERENCES 386 [1] Andersen, et al., "Internet Low Bit Rate Codec (iLBC)", 387 draft-ietf-avt-rtp-ilbc-00.txt, September 2002. 389 [2] S. Bradner, "Key words for use in RFCs to Indicate Requirement 390 Levels", BCP 14, RFC 2119, March 1997. 392 [3] S. Bradner, "The Internet Standards Process -- Revision 3", BCP 393 9, RFC 2026, October 1996. 395 [4] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: 396 A Transport Protocol for Real-Time Applications", IETF RFC 1889, 397 January 1996. 399 [5] H. Schulzrinne, "RTP Profile for Audio and Video Conferences 400 with Minimal Control", IETF RFC 1890, January 1996. 402 [6] Handley & Perkins, "Guidelines for Writers of RTP Payload 403 Formats", BCP 36, RFC 2736, December 1999. 405 [7] M. Handley and V. Jacobson, "SDP: Session Description Protocol", 406 IETF RFC 2327, April 1998. 408 [8] N. Freed and N. Borenstein, "Multipurpose Internet Mail 409 Extensions (MIME) Part One: Format of Internet Message Bodies", 410 IETF RFC 2045, November 1996. 412 [9] Baugher, et al., "The Secure Real Time Transport Protocol", IETF 413 Draft, June 2002. 415 [10] ITU-T Recommendation G.711, available online from the ITU 416 bookstore at http://www.itu.int. 418 [11] J. Sjoberg, M. Westerlund, A. Lakaniemi, Q. Xie, "RTP payload 419 format and file storage format for the Adaptive Multi-Rate (AMR) 420 and Adaptive Multi-Rate Wideband (AMR-WB) audio codecs", IETF RFC 421 3267, June 2002. 423 8. ACKNOWLEDGEMENTS 425 The authors wish to thank Henry Sinnreich and Patrik Faltstrom for 426 great support of the iLBC initiative and for their valuable feedback 427 and comments. 429 9. 
AUTHOR'S ADDRESSES 431 Alan Duric 432 Global IP Sound AB 433 Rosenlundsgatan 54 434 Stockholm, S-11863 435 Sweden 436 Phone: +46 8 54553040 437 Email: alan.duric@globalipsound.com 439 Soren Vang Andersen 440 Department of Communication Technology 441 Aalborg University 442 Fredrik Bajers Vej 7A 443 9200 Aalborg 444 Denmark 445 Phone: +45 9 6358627 446 Email: sva@kom.auc.dk
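The frame-counting rule of Section 3.2 and the storage-mode layout of Section 4.1 can be sketched informally as follows. This Python fragment is a non-normative illustration. The function names are hypothetical, and rejecting a payload whose length is not a whole multiple of 50 octets is an assumption of the sketch (motivated by the rule that frames MUST NOT be split between RTP packets), not a requirement stated by this document.

```python
from typing import List

# Non-normative sketch; all names are assumptions of this example.
FRAME_OCTETS = 50        # octets per iLBC frame (Section 3.2)
MAGIC = b"#!iLBC\n"      # storage-mode magic number (Section 4.1)


def frames_in_payload(payload: bytes) -> int:
    """Count frames by dividing the octet count by 50.

    Assumption: an empty payload or a trailing partial frame is
    treated as malformed and rejected.
    """
    count, remainder = divmod(len(payload), FRAME_OCTETS)
    if count == 0 or remainder != 0:
        raise ValueError("payload is not a whole number of 50-octet frames")
    return count


def split_frames(payload: bytes) -> List[bytes]:
    """Return the individual 50-octet frames, oldest first."""
    n = frames_in_payload(payload)
    return [payload[i * FRAME_OCTETS:(i + 1) * FRAME_OCTETS]
            for i in range(n)]


def read_storage_file(data: bytes) -> List[bytes]:
    """Check the magic number, then return the stored speech frames."""
    if not data.startswith(MAGIC):
        raise ValueError("missing iLBC magic number")
    body = data[len(MAGIC):]
    return split_frames(body) if body else []
```

For instance, a 100-octet RTP payload yields two frames, and a stored file consisting of the 7-octet header followed by 150 octets yields three.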