idnits 2.17.1

draft-ietf-avt-rtp-ipmr-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** The document seems to lack a License Notice according IETF Trust
     Provisions of 28 Dec 2009, Section 6.b.i or Provisions of 12 Sep 2009
     Section 6.b -- however, there's a paragraph with a matching beginning.
     Boilerplate error?

     (You're using the IETF Trust Provisions' Section 6.b License Notice from
     12 Feb 2009 rather than one of the newer Notices.  See
     https://trustee.ietf.org/license-info/.)

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document seems to contain a disclaimer for pre-RFC5378 work, and may
     have content which was first submitted before 10 November 2008.  The
     disclaimer is necessary when there are original authors that you have
     been unable to contact, or if some do not wish to grant the BCP78 rights
     to the IETF Trust.  If you are able to get all authors (current and
     original) to grant those rights, you can and should remove the
     disclaimer; otherwise, the disclaimer is needed and you can ignore this
     comment.  (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (May 20, 2009) is 5453 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866)

  ** Obsolete normative reference: RFC 5246 (Obsoleted by RFC 8446)

     Summary: 3 errors (**), 0 flaws (~~), 1 warning (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Audio/Video Transport Working Group                            S. Ikonin
Internet Draft                                                SPIRIT DSP
Intended status: Informational                              May 20, 2009

               RTP Payload Format for IP-MR Speech Codec
                    draft-ietf-avt-rtp-ipmr-04.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on November 20, 2009.

Abstract

   This document specifies the payload format for packetization of
   SPIRIT IP-MR encoded speech signals into the Real-time Transport
   Protocol (RTP). The payload format supports transmission of multiple
   frames per payload and the introduction of redundancy for robustness
   against packet loss.

Table of Contents

   1. Introduction
   2. IP-MR Codec Description
   3. Payload Format
      3.1. RTP Header Usage
      3.2. Payload Format Structure
      3.3. Payload Header
      3.4. Speech Table of Contents
      3.5. Speech Data
      3.6. Redundancy Header
      3.7. Redundancy Table of Contents
      3.8. Redundancy Data
   4. Payload Examples
      4.1. Payload Carrying a Single Frame
      4.2. Payload Carrying Multiple Frames with Redundancy
   5. Media Type Registration
      5.1. Registration of media subtype audio/ip-mr_v2.5
      5.2. Mapping Media Type Parameters into SDP
   6. Security Considerations
   7. Congestion Control
   8. IANA Considerations
   9. Normative References
   10. Author(s) Information
   11. Disclaimer
   12. Legal Terms

1. Introduction

   This document specifies the payload format for packetization of
   SPIRIT IP-MR encoded speech signals into the Real-time Transport
   Protocol (RTP). The payload format supports transmission of multiple
   frames per payload and the introduction of redundancy for robustness
   against packet loss.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC 2119].

2. IP-MR Codec Description

   The IP-MR codec is a scalable, adaptive multi-rate wideband speech
   codec designed by SPIRIT for use in IP-based networks. The codec is
   suitable for real-time communications such as telephony and
   videoconferencing.

   The codec operates on 20 ms frames at a 16 kHz sampling rate and has
   an algorithmic delay of 25 ms.

   IP-MR supports six wideband speech coding modes with bit rates
   ranging from about 7.7 to about 34.2 kbps. The coding mode can be
   changed at any 20 ms frame boundary, making it possible to adjust the
   speech encoding rate dynamically during a session to adapt to varying
   transmission conditions.

   A coded frame consists of multiple coding layers: a base (or core)
   layer and several enhancement layers, which are coded independently.
   Only the core layer is mandatory to decode understandable speech; the
   upper layers provide quality enhancement. The enhancement layers may
   be omitted, and the remaining base layer can be meaningfully decoded
   without artifacts.
   This makes the bit stream scalable and allows the bit rate to be
   reduced during transmission without re-encoding.

   This memo specifies an optional form of redundancy coding within RTP
   for protection against packet loss. It is based on the commonly known
   scheme in which previously transmitted frames are aggregated together
   with new ones. Each frame is retransmitted once in the following RTP
   payload packet. In the diagram below, f(n-2)...f(n+4) denote a
   sequence of speech frames, and p(n-1)...p(n+4) a sequence of payload
   packets:

   --+--------+--------+--------+--------+--------+--------+--------+--
     | f(n-2) | f(n-1) |  f(n)  | f(n+1) | f(n+2) | f(n+3) | f(n+4) |
   --+--------+--------+--------+--------+--------+--------+--------+--

     <---- p(n-1) ---->
              <----- p(n) ----->
                       <---- p(n+1) ---->
                                <---- p(n+2) ---->
                                         <---- p(n+3) ---->
                                                  <---- p(n+4) ---->

   Because of the scalable nature of the IP-MR codec, there is no need
   to duplicate the whole previous frame; only the core layer needs to
   be retransmitted. This reduces the redundancy overhead while
   preserving its effectiveness. Moreover, the speech bits encoded in
   the core layer are divided into six classes (A to F) according to
   their perceptual sensitivity to errors. Choosing which of these
   classes to send as redundancy makes it possible to adjust the
   trade-off between overhead and robustness against packet loss.

   The mechanism described does not require signaling at session setup.
   The sender is responsible for selecting an appropriate amount of
   redundancy based on feedback about the channel conditions.
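
   The sliding scheme described in this section can be illustrated with
   a minimal Python sketch. It is not part of the payload format: the
   function names packetize() and recover() are hypothetical, and for
   simplicity the sketch copies the whole previous frame, whereas IP-MR
   would carry only selected core-layer classes as redundancy.

```python
from typing import List, Optional, Set, Tuple

Packet = Tuple[bytes, Optional[bytes]]


def packetize(frames: List[bytes]) -> List[Packet]:
    """Pair each new frame with a copy of the previous frame.

    Returns one (new_frame, redundant_previous_frame) tuple per packet.
    The first packet has no predecessor, so its redundancy slot is None.
    """
    packets: List[Packet] = []
    previous: Optional[bytes] = None
    for frame in frames:
        packets.append((frame, previous))
        previous = frame
    return packets


def recover(packets: List[Packet], lost: Set[int]) -> List[Optional[bytes]]:
    """Rebuild the frame sequence when the packets listed in `lost`
    (by index) never arrive.

    A lost frame is taken from the redundancy slot of the following
    packet if that packet was received; otherwise it stays None.
    """
    frames: List[Optional[bytes]] = []
    for i, (frame, _) in enumerate(packets):
        if i not in lost:
            frames.append(frame)
        elif i + 1 < len(packets) and (i + 1) not in lost:
            frames.append(packets[i + 1][1])
        else:
            frames.append(None)
    return frames
```

   With this one-packet-deep scheme a single lost packet is fully
   covered, while two consecutive losses leave the first of the two
   frames unrecovered.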

   The main codec characteristics can be summarized as follows:

   o  Wideband, 16 kHz, speech codec

   o  Adaptive multi-rate with six modes from about 7.7 to about
      34.2 kbps

   o  Bit rate scalable

   o  Variable bit rate that changes in accordance with the actual
      speech content

   o  Discontinuous Transmission (DTX), silence suppression, and
      comfort noise generation

   o  In-band redundancy scheme for protection against packet loss

3. Payload Format

   The main purpose of the payload design for IP-MR is to maximize the
   potential of the codec with as little overhead as possible. The
   payload format allows the codec parameters (such as bit rate, level
   of scalability, DTX, and redundancy mode) to be changed at any packet
   boundary without re-negotiation. This makes it possible to adjust the
   streaming parameters dynamically in accordance with changing network
   conditions. The payload format also supports aggregation of multiple
   consecutive frames (up to 4) in a payload, which allows the trade-off
   between delay and header overhead to be controlled.

3.1. RTP Header Usage

   The RTP timestamp corresponds to the sampling instant of the first
   sample encoded for the first frame-block in the packet. The timestamp
   clock frequency SHALL be 16 kHz. The duration of one frame is 20 ms,
   corresponding to 320 samples at 16 kHz. Thus the timestamp is
   increased by 320 for each consecutive frame. The timestamp is also
   used to recover the correct decoding order of the frame-blocks.

   The RTP header marker bit (M) SHALL be set to 1 whenever the first
   frame-block carried in the packet is the first frame-block in a
   talkspurt (see the definition of talkspurt in Section 4.1 of
   [RFC 3551]). For all other packets, the marker bit SHALL be set to
   zero (M=0).

   The assignment of an RTP payload type for the format defined in this
   memo is outside the scope of this document.
   The RTP profiles in use currently mandate binding the payload type
   dynamically for this payload format. This is basically necessary
   because the payload type expresses the configuration of the payload
   itself, i.e. basic or interleaved mode, and the number of channels
   carried.

   The remaining RTP header fields are used as specified in [RFC 3550].

3.2. Payload Format Structure

   The IP-MR payload format consists of a payload header with general
   information about the packet, a speech table of contents (TOC), and
   speech data. An optional redundancy section follows the speech data.
   The redundancy section consists of a redundancy header, a redundancy
   TOC, and redundancy data.

   The following diagram shows the standard payload format layout:

   +---------+--------+--------+- - - - - - +- - - - - - +- - - - - - +
   | payload | speech | speech | redundancy | redundancy | redundancy |
   | header  |  TOC   |  data  |   header   |    TOC     |    data    |
   +---------+--------+--------+- - - - - - +- - - - - - +- - - - - - +

3.3. Payload Header

   The payload header has the following format:

    0                   1
    0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+
   |T| CR  | BR  |D|A|GR |R|
   +-+-+-+-+-+-+-+-+-+-+-+-+

   o  T (1 bit): Reserved for compatibility with future extensions.
      SHOULD be set to 0.

   o  CR (3 bits): coding rate of the frame(s) in this packet, as per
      the following table:

      +-------+--------------+
      |   CR  | avg. bitrate |
      +-------+--------------+
      |   0   |   7.7 kbps   |
      |   1   |   9.8 kbps   |
      |   2   |  14.3 kbps   |
      |   3   |  20.8 kbps   |
      |   4   |  27.9 kbps   |
      |   5   |  34.2 kbps   |
      |   6   |  (reserved)  |
      |   7   |   NO_DATA    |
      +-------+--------------+

      The CR value 7 (NO_DATA) indicates that there is no speech data
      (and accordingly no speech TOC) in the payload. This MAY be used
      to transmit redundancy data only. The value 6 is reserved. If this
      value is received, the packet SHOULD be discarded.

   o  BR (3 bits): base rate for the core layer of the frame(s) in this
      packet, using the table for CR. Values in the range 0-5 indicate
      bitrates for the core layer, the same as for CR. If any other
      value is received, the packet SHOULD be discarded. The base rate
      is the lowest rate for scalability, so the speech payload cannot
      be scaled down below the BR value. If a received packet has
      BR > CR, then during decoding it will be assumed that BR = CR.

   o  D (1 bit): indicates whether the DTX mode is allowed or not.

   o  A (1 bit): byte-aligned payload. If A=1, all speech frames MUST
      be byte-aligned. This mode speeds up access to the speech data.
      A=0 specifies the bandwidth-efficient mode with no byte alignment
      (including the end of the header).

   o  GR (2 bits): number of frames in the packet (grouping size). The
      actual grouping size is GR + 1, thus the maximum grouping
      supported is 4.

   o  R (1 bit): redundancy presence bit. If R=1, the packet contains
      redundancy information for recovery of lost packets. In this case
      the redundancy section follows the speech data.

3.4. Speech Table of Contents

   The speech TOC contains one entry for each frame in the packet (the
   grouping size in total). Each entry contains a single field:

    0
   +-+
   |E|
   +-+

   o  E (1 bit): frame existence indicator. If set to 0, this indicates
      that the corresponding frame is absent and the receiver should set
      the special LOST_FRAME flag for the decoder. This can be caused by
      the frame itself being lost or by empty frames generated by the
      encoder during silence intervals in DTX mode.

   Note that if the CR field of the payload header is 7 (NO_DATA), the
   speech TOC is empty.

3.5. Speech Data

   The speech data of a payload contains one or more speech frames or
   comfort noise frames, as specified in the speech TOC of the payload.

   Each speech frame represents 20 ms of speech encoded with the rate
   indicated in the CR field and the base rate indicated in the BR field
   of the payload header.
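
   The payload header of Section 3.3 and the speech TOC of Section 3.4
   can be unpacked with a short Python sketch. It assumes that bits are
   packed most significant bit first, in the order drawn in the header
   diagram; the names PayloadHeader and parse_header() are hypothetical
   and not defined by this memo.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PayloadHeader:
    cr: int                # coding rate index (7 = NO_DATA)
    br: int                # base rate index, clamped to CR
    dtx: bool              # D bit
    aligned: bool          # A bit
    frames: int            # grouping size, GR + 1
    redundancy: bool       # R bit
    toc: List[int]         # one E bit per frame, empty for NO_DATA
    data_offset_bits: int  # first bit after header, TOC and padding


def _bits(payload: bytes, start: int, count: int) -> int:
    """Read `count` bits starting at bit `start`, MSB first."""
    value = 0
    for i in range(start, start + count):
        value = (value << 1) | ((payload[i // 8] >> (7 - i % 8)) & 1)
    return value


def parse_header(payload: bytes) -> PayloadHeader:
    if len(payload) < 2:
        raise ValueError("payload too short for header and TOC")
    cr = _bits(payload, 1, 3)
    if cr == 6:
        raise ValueError("reserved CR value, packet should be discarded")
    br = _bits(payload, 4, 3)
    if cr <= 5 and br > cr:
        br = cr  # BR > CR is treated as BR = CR
    aligned = bool(_bits(payload, 8, 1))
    frames = _bits(payload, 9, 2) + 1
    # The speech TOC is empty when CR signals NO_DATA.
    toc = [] if cr == 7 else [_bits(payload, 12 + i, 1) for i in range(frames)]
    offset = 12 + len(toc)
    if aligned:
        offset = (offset + 7) // 8 * 8  # pad header and TOC to a byte
    return PayloadHeader(
        cr=cr,
        br=br,
        dtx=bool(_bits(payload, 7, 1)),
        aligned=aligned,
        frames=frames,
        redundancy=bool(_bits(payload, 11, 1)),
        toc=toc,
        data_offset_bits=offset,
    )
```

   For instance, the octets 0x01 0xDA encode CR=0, BR=0, D=1, A=1, GR=2,
   R=1 and the TOC bits 1 0 1 followed by one padding bit, so the speech
   data starts at bit 16.
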
   The size of a coded speech frame is variable due to the nature of the
   codec. The encoder's algorithm decides the size of each frame and
   returns it after encoding. To save bandwidth, the size is not
   explicitly carried in the payload. The decoder can calculate the
   frame size from the frame content and return it to the top-level
   application; in this way the size of each frame can be obtained.
   Moreover, there is a special service function that returns the frame
   size without full decoding, which may be used for this purpose.

3.6. Redundancy Header

   If a packet contains redundancy (the R field of the payload header is
   1), the speech data is followed by the redundancy header:

    0 1 2 3 4 5
   +-+-+-+-+-+-+
   | CL1 | CL2 |
   +-+-+-+-+-+-+

   The redundancy header consists of two fields. Each field contains a
   class specifier for the amount of redundancy taken from the preceding
   packet (CL1) and the pre-preceding packet (CL2), i.e. the packets 1
   and 2 packets before the current one, respectively. The values are
   listed in the table below:

   +-------+-------------------+
   |   CL  | redundancy amount |
   +-------+-------------------+
   |   0   |       NONE        |
   |   1   |      CLASS A      |
   |   2   |      CLASS B      |
   |   3   |      CLASS C      |
   |   4   |      CLASS D      |
   |   5   |      CLASS E      |
   |   6   |      CLASS F      |
   |   7   |    (reserved)     |
   +-------+-------------------+

   Each specifier takes 3 bits, thus the total redundancy header size is
   6 bits.

3.7. Redundancy Table of Contents

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Pkt1 Entries| Pkt2 Entries|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The redundancy TOC contains entries for the redundancy frames from
   the preceding and pre-preceding packets. Each entry takes 1 bit, like
   a speech TOC entry (Section 3.4):

    0
   +-+
   |E|
   +-+

   o  E (1 bit): frame existence indicator. If set to 0, this indicates
      that the corresponding frame is absent.

   o  For each of the preceding and pre-preceding packets, the number of
      entries is equal to the grouping size of the current packet. Thus
      the maximum number of entries is 4*2 = 8.

   o  If the class specifier in the redundancy header is CL=0 (NONE),
      there are no entries for the corresponding packet's redundancy.

3.8. Redundancy Data

   The redundancy data of a payload contains redundancy information for
   one or more speech frames or comfort noise frames that may be lost
   during transmission, as specified in the redundancy TOC of the
   payload. The redundancy is the most important part of the preceding
   frames, each representing 20 ms of speech. This data MAY be used for
   partial reconstruction of lost frames. The amount of available
   redundancy is specified by the CL field in the redundancy header
   (Section 3.6). This value SHOULD be passed to the decoder. The length
   of a redundancy frame is variable and can be calculated after
   decoding.

4. Payload Examples

   A few examples to highlight the payload format follow.

4.1. Payload Carrying a Single Frame

   The following diagram shows a standard IP-MR payload carrying a
   single speech frame without redundancy:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|CR=1 |BR=0 |0|0|0 0|0|1|sp(0)                                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      sp(193)|P|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   In this payload the speech frame is not damaged at the IP origin
   (E=1), the coding rate is 9.8 kbps (CR=1), the base rate is 7.7 kbps
   (BR=0), and the DTX mode is off. There is no byte alignment (A=0) and
   no redundancy (R=0). The encoded speech bits, sp(0) to sp(193), are
   placed immediately after the TOC. Finally, one zero bit is added at
   the end as padding to make the payload byte-aligned: 12 header bits,
   1 TOC bit, 194 speech bits, and 1 padding bit give 208 bits, or 26
   octets.

4.2. Payload Carrying Multiple Frames with Redundancy

   The following diagram shows a payload that contains three frames, one
   of them with no speech data. The coding rate is 7.7 kbps (CR=0), the
   base rate is 7.7 kbps (BR=0), and the DTX mode is on. The speech
   frames are byte-aligned (A=1), so 1 zero bit is added at the end of
   the header. Besides the speech frames, the payload contains six
   redundancy frames (three for each delayed packet).

   The first speech frame consists of bits sp1(0) to sp1(92). After
   that, 3 bits are added for byte alignment. The second frame does not
   contain any speech information, which is represented in the payload
   by its TOC entry.
   The third frame consists of bits sp3(0) to sp3(171).

   The redundancy header follows the speech data. The one-packet-delayed
   redundancy contains class A+B bits (CL1=2), and the
   two-packet-delayed redundancy contains class A bits (CL2=1). The
   one-packet-delayed redundancy contains three frames with 20, 39, and
   35 bits respectively.

   The first frame of the two-packet-delayed redundancy is absent, which
   is represented in its TOC entry; the two other frames have sizes of
   15 and 19 bits.

   Note that all speech frames are padded with zero bits for byte
   alignment.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0|CR=0 |BR=0 |1|1|1 0|1|1 0 1|P|sp1(0)                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  sp1(92)|P|P|P|sp3(0)                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                               sp3(171)|P|P|P|P|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |CL1=2|CL2=1|1 1 1|0 1 1|red1_1(0)                    red1_1(19)|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |red1_2(0)                                                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   red1_2(38)|red1_3(0)                                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         red1_3(34)|red2_2(0)          red2_2(14)|red2_3(0)    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             red2_3(18)|P|P|P|P|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

5. Media Type Registration

   This section describes the media types and names associated with this
   payload format.

5.1. Registration of media subtype audio/ip-mr_v2.5

   Type name: audio

   Subtype name: ip-mr_v2.5

   Required parameters: none

   Optional parameters:

   *  ptime: Gives the length of time in milliseconds represented by the
      media in a packet. Allowed values are: 20, 40, 60 and 80.

   Encoding considerations: This media type is framed binary data (see
   RFC 4288, Section 4.8).

   Security considerations: See RFC 3550 [RFC 3550]

   Interoperability considerations: none

   Published specification: RFC XXXX

   Applications that use this media type: Real-time audio applications
   like voice over IP and teleconference, and multi-media streaming.

   Additional information: none

   Person & email address to contact for further information:
      Elena Berlizova
      berlizova@spiritdsp.com

   Intended usage: COMMON

   Restrictions on usage: This media type depends on RTP framing, and
   hence is only defined for transfer via RTP [RFC 3550].

   Author:
      Sergey Ikonin

   Change controller: IETF Audio/Video Transport working group delegated
   from the IESG.

5.2. Mapping Media Type Parameters into SDP

   The information carried in the media type specification has a
   specific mapping to fields in the Session Description Protocol (SDP)
   [RFC 4566], which is commonly used to describe RTP sessions. When SDP
   is used to specify sessions employing the IP-MR codec, the mapping is
   as follows:

   o  The media type ("audio") goes in SDP "m=" as the media name.

   o  The media subtype (payload format name) goes in SDP "a=rtpmap" as
      the encoding name. The RTP clock rate in "a=rtpmap" MUST be 16000.

   o  The parameter "ptime" goes in the SDP "a=ptime" attribute.
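
   The mapping rules listed above can be illustrated with a minimal
   Python sketch that produces the corresponding SDP lines. The function
   name ipmr_sdp_lines(), the port number, and the payload type used in
   the example are hypothetical; the sketch also assumes the RTP/AVP
   profile [RFC 3551] with a dynamic payload type in the range 96-127.

```python
from typing import List


def ipmr_sdp_lines(port: int, payload_type: int, ptime: int = 20) -> List[str]:
    """Build the SDP media lines for one IP-MR audio stream."""
    if ptime not in (20, 40, 60, 80):
        raise ValueError("ptime must be 20, 40, 60 or 80 ms")
    if not 96 <= payload_type <= 127:
        raise ValueError("a dynamic payload type (96-127) is expected")
    return [
        # media type "audio" goes in "m=" as the media name
        f"m=audio {port} RTP/AVP {payload_type}",
        # media subtype is the encoding name; the clock rate is 16000
        f"a=rtpmap:{payload_type} ip-mr_v2.5/16000",
        # the optional "ptime" parameter has its own attribute
        f"a=ptime:{ptime}",
    ]
```

   For example, ipmr_sdp_lines(49120, 97, 40) describes a stream that
   carries two 20 ms frames per packet.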

   Any remaining parameters go in the SDP "a=fmtp" attribute by copying
   them directly from the media type parameter string as a
   semicolon-separated list of parameter=value pairs.

   Note that the payload format (encoding) names are commonly shown in
   upper case. Media subtypes are commonly shown in lower case. These
   names are case-insensitive in both places.

6. Security Considerations

   RTP packets using the payload format defined in this specification
   are subject to the security considerations discussed in the RTP
   specification [RFC 3550] and in any applicable RTP profile. The main
   security considerations for the RTP packet carrying the RTP payload
   format defined within this memo are confidentiality, integrity, and
   source authenticity. Confidentiality is achieved by encryption of the
   RTP payload. Integrity of the RTP packets is achieved through a
   suitable cryptographic integrity protection mechanism. Such a
   cryptographic system may also allow the authentication of the source
   of the payload.

   A suitable security mechanism for this RTP payload format should
   provide confidentiality, integrity protection, and at least source
   authentication capable of determining if an RTP packet is from a
   member of the RTP session.

   Note that the appropriate mechanism to provide security to RTP and
   payloads following this memo may vary. It is dependent on the
   application, the transport, and the signaling protocol employed.
   Therefore, a single mechanism is not sufficient, although if
   suitable, usage of the Secure Real-time Transport Protocol (SRTP)
   [RFC 3711] is recommended. Other mechanisms that may be used are
   IPsec [RFC 4301] and Transport Layer Security (TLS) [RFC 5246] (RTP
   over TCP); other alternatives may exist.

   This payload format does not exhibit any significant non-uniformity
   in the receiver-side computational complexity for packet processing,
   and thus is unlikely to pose a denial-of-service threat due to the
   receipt of pathological data.

7. Congestion Control

   The general congestion control considerations for transporting RTP
   data apply; see RTP [RFC 3550] and any applicable RTP profile like
   AVP [RFC 3551]. However, the multi-rate capability of IP-MR speech
   coding provides a mechanism that may help to control congestion,
   since the bandwidth demand can be adjusted by selecting a different
   encoding mode.

   The bit rate scalability of the IP-MR codec allows voice traffic to
   be reduced by omitting enhancement layers without re-encoding. This
   provides an additional means for congestion control. An intermediate
   network node MAY modify the IP-MR RTP payload by dropping some of the
   layers during transmission to meet the available bandwidth. If the
   payload is forwarded with modified content, at least the base layer
   MUST be preserved in the payload delivered to the receiving side;
   this guarantees meaningful speech decoding without a packet loss
   concealment procedure.

   The number of frames encapsulated in each RTP payload highly
   influences the overall bandwidth of the RTP stream due to header
   overhead constraints. Packetizing more frames in each RTP payload can
   reduce the number of packets sent and hence the overhead from
   IP/UDP/RTP headers, at the expense of increased delay.

   If the in-band redundancy scheme is used to protect against packet
   loss, the amount of introduced redundancy will need to be regulated
   so that the use of redundancy itself does not cause a congestion
   problem. In other words, a sender SHALL NOT increase the total
   bitrate when adding redundancy in response to packet loss, and needs
   instead to adjust it down in accordance with the congestion control
   algorithm being run. Thus, when adding redundancy, the media bitrate
   will need to be reduced to provide room for the redundancy.

8. IANA Considerations

   One media type has been defined and needs registration in the media
   types registry.

9. Normative References

   [RFC 2119]  Bradner, S., "Key words for use in RFCs to Indicate
               Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC 3550]  Schulzrinne, H., Casner, S., Frederick, R., and
               V. Jacobson, "RTP: A Transport Protocol for Real-Time
               Applications", STD 64, RFC 3550, July 2003.

   [RFC 3551]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio
               and Video Conferences with Minimal Control", STD 65,
               RFC 3551, July 2003.

   [RFC 4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
               Description Protocol", RFC 4566, July 2006.

   [RFC 3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and
               K. Norrman, "The Secure Real-Time Transport Protocol
               (SRTP)", RFC 3711, March 2004.

   [RFC 5246]  Dierks, T. and E. Rescorla, "The Transport Layer
               Security (TLS) Protocol Version 1.2", RFC 5246,
               August 2008.

   [RFC 4301]  Kent, S. and K. Seo, "Security Architecture for the
               Internet Protocol", RFC 4301, December 2005.

10. Author(s) Information

   Sergey Ikonin

   email: ikonin@spiritdsp.com

   Russia 109004
   Building 27, A. Solgenizyn street
   Tel: +7 495 661-2178
   Fax: +7 495 912-6786

11. Disclaimer

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008.
   The person(s) controlling the copyright in some of this material may
   not have granted the IETF Trust the right to allow modifications of
   such material outside the IETF Standards Process. Without obtaining
   an adequate license from the person(s) controlling the copyright in
   such materials, this document may not be modified outside the IETF
   Standards Process, and derivative works of it may not be created
   outside the IETF Standards Process, except to format it for
   publication as an RFC or to translate it into languages other than
   English.

12. Legal Terms

   All IETF Documents and the information contained therein are provided
   on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION THEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

   The IETF Trust takes no position regarding the validity or scope of
   any Intellectual Property Rights or other rights that might be
   claimed to pertain to the implementation or use of the technology
   described in any IETF Document or the extent to which any license
   under such rights might or might not be available; nor does it
   represent that it has made any independent effort to identify any
   such rights.

   Copies of Intellectual Property disclosures made to the IETF
   Secretariat and any assurances of licenses to be made available, or
   the result of an attempt made to obtain a general license or
   permission for the use of such proprietary rights by implementers or
   users of this specification can be obtained from the IETF on-line IPR
   repository at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   any standard or specification contained in an IETF Document. Please
   address the information to the IETF at ietf-ipr@ietf.org.

   The definitive version of an IETF Document is that published by, or
   under the auspices of, the IETF. Versions of IETF Documents that are
   published by third parties, including those that are translated into
   other languages, should not be considered to be definitive versions
   of IETF Documents. The definitive version of these Legal Provisions
   is that published by, or under the auspices of, the IETF. Versions of
   these Legal Provisions that are published by third parties, including
   those that are translated into other languages, should not be
   considered to be definitive versions of these Legal Provisions.

   For the avoidance of doubt, each Contributor to the IETF Standards
   Process licenses each Contribution that he or she makes as part of
   the IETF Standards Process to the IETF Trust pursuant to the
   provisions of RFC 5378. No language to the contrary, or terms,
   conditions or rights that differ from or are inconsistent with the
   rights and licenses granted under RFC 5378, shall have any effect and
   shall be null and void, whether published or posted by such
   Contributor, or included with or in such Contribution.