Network Working Group                                            X. Duan
Internet-Draft                                                   S. Wang
Intended status: Standards Track             China Mobile Communications
Expires: July 25, 2010                                       Corporation
                                                           M. Westerlund
                                                              K. Hellwig
                                                            I. Johansson
                                                             Ericsson AB
                                                        January 21, 2010


                     RTP Payload format for GSM-HR
                      draft-ietf-avt-rtp-gsm-hr-03

Abstract

   This document specifies the payload format for packetization of the
   GSM Half-Rate speech codec data into the Real-time Transport Protocol
   (RTP).  The payload format supports transmission of multiple frames
   per payload and packet loss robustness methods using redundancy.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on July 25, 2010.

Copyright Notice

   Copyright (c) 2010 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions
   3.  GSM Half Rate
   4.  Payload format Capabilities
     4.1.  Use of Forward Error Correction (FEC)
   5.  Payload format
     5.1.  RTP Header Usage
     5.2.  Payload Structure
       5.2.1.  Encoding of Speech Frames
       5.2.2.  Encoding of Silence Description Frames
     5.3.  Implementation Considerations
       5.3.1.  Transmission of SID frames
       5.3.2.  Receiving Redundant Frames
       5.3.3.  Decoding Validation
   6.  Examples
     6.1.  3 frames
     6.2.  3 Frames with lost frame in the middle
   7.  Payload Format Parameters
     7.1.  Media Type Definition
     7.2.  Mapping to SDP
       7.2.1.  Offer/Answer Considerations
       7.2.2.  Declarative SDP Considerations
   8.  IANA Considerations
   9.  Congestion Control
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1.  Introduction

   This document specifies the payload format for packetization of GSM
   Half Rate (GSM-HR) codec [TS46.002] encoded speech signals into the
   Real-time Transport Protocol (RTP) [RFC3550].  The payload format
   supports transmission of multiple frames per payload and packet loss
   robustness methods using redundancy.

   This document starts with conventions, a brief description of the
   codec, and the payload format's capabilities.  The payload format is
   specified in Section 5.  Examples can be found in Section 6.  The
   media type, its mapping to SDP, and its usage in SDP offer/answer
   are then specified.  The document ends with considerations around
   congestion control and security.

   This document registers a media type (audio/gsm-hr-08) for the
   Real-time Transport Protocol (RTP) payload format for the GSM-HR
   codec.  Note: This format is not compatible with the one that was
   drafted back in 1999 to 2000 in the Internet-Drafts
   draft-ietf-avt-profile-new-05 to draft-ietf-avt-profile-new-09.  A
   later version of the AVP profile draft was published as RFC 3551
   without any specification of the GSM-HR payload format.  To avoid a
   possible conflict with this older format, the payload format
   specified in this document uses a media type name that differs from
   "audio/gsm-hr".

2.  Conventions

   This document uses the normal IETF bit-order representation.  Bit
   fields in figures are read left to right and then down.  The
   leftmost bit in each field is the most significant.  The numbering
   starts from 0 and ascends, where bit 0 is the most significant.
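
   As an informal illustration of this numbering (not part of the
   specification), the sketch below reads single bits and multi-bit
   fields from a byte string, with bit 0 taken as the most significant
   bit of the first octet.  The helper names get_bit and get_field are
   hypothetical and chosen only for this example.

```python
def get_bit(data: bytes, n: int) -> int:
    """Return bit n of data; bit 0 is the MSB of the first octet."""
    octet = data[n // 8]
    return (octet >> (7 - n % 8)) & 1


def get_field(data: bytes, first_bit: int, width: int) -> int:
    """Read `width` bits starting at `first_bit`, most significant first."""
    value = 0
    for n in range(first_bit, first_bit + width):
        value = (value << 1) | get_bit(data, n)
    return value
```

   With this convention, a field may span an octet boundary: reading
   two bits starting at bit 7 takes the last bit of the first octet
   followed by the first bit of the second octet.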

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  GSM Half Rate

   The Global System for Mobile Communication (GSM) network provides
   mobile communication services for nearly 3 billion users (status
   2008).  The GSM Half Rate Codec (GSM-HR) is one of the speech codecs
   that are used in GSM networks.  GSM-HR denotes the Half-Rate speech
   codec as specified in [TS46.002].

   Note: for historical reasons these 46-series specifications are
   internally referenced as 06-series.  A simple mapping applies, for
   example 46.020 is referenced as 06.20 and so on.

   The GSM-HR codec has a frame length of 20 ms, with narrowband speech
   sampled at 8 kHz, i.e. 160 samples per frame.  Each speech frame is
   compressed into 112 bits of speech parameters, which is equivalent to
   a bit rate of 5.6 kbit/s.  Speech pauses are detected by a
   standardized Voice Activity Detection (VAD).  During speech pauses
   the transmission of speech frames is inhibited.  Silence Descriptor
   (SID) frames are transmitted at the end of a talk spurt and about
   every 480 ms during speech pauses to allow for a decent Comfort Noise
   (CN) quality at the receiver side.

   The SID frame generation in the GSM radio network is determined by
   the GSM mobile station and the GSM radio subsystem.  In uplink, the
   mobile station sends SID frames about every 480 ms during speech
   pauses.  In downlink to the mobile station, when they are generated
   by the encoder of the GSM radio subsystem, SID frames are sent every
   20 ms to the GSM base station, which then picks only one every 480 ms
   for downlink radio transmission.  For other applications, like
   transport over IP, it is more appropriate to send the SID frames less
   often than every 20 ms, but 480 ms may be too sparse.
   We recommend as
   a compromise that a GSM-HR encoder outside of the GSM radio network
   (i.e. not in the GSM mobile station and not in the GSM radio
   subsystem, but for example in the media gateway of the core network)
   should generate and send SID frames every 160 ms.

4.  Payload format Capabilities

   This RTP payload format carries one or more GSM-HR encoded frames,
   either full voice or silence descriptor (SID), representing a mono
   speech signal.  To maintain synchronization, or to express frames
   that were not sent or were lost, it has the capability to indicate
   No_Data frames.

4.1.  Use of Forward Error Correction (FEC)

   Generic forward error correction within RTP is defined, for example,
   in RFC 5109 [RFC5109].  Audio redundancy coding is defined in RFC
   2198 [RFC2198].  Either scheme can be used to add redundant
   information to the RTP packet stream and make it more resilient to
   packet losses, at the expense of a higher bit rate.  Please see
   either RFC for a discussion of the implications of the higher bit
   rate for network congestion.

   In addition to these media-unaware mechanisms, this memo specifies an
   optional, GSM-HR-specific form of audio redundancy coding, which may
   be beneficial in terms of packetization overhead.

   Conceptually, previously transmitted transport frames are aggregated
   together with new ones.  A sliding window can be used to group the
   frames to be sent in each payload.  Figure 1 below shows an example.

   --+--------+--------+--------+--------+--------+--------+--------+--
     | f(n-2) | f(n-1) |  f(n)  | f(n+1) | f(n+2) | f(n+3) | f(n+4) |
   --+--------+--------+--------+--------+--------+--------+--------+--

     <---- p(n-1) ---->
              <----- p(n) ----->
                       <---- p(n+1) ---->
                                <---- p(n+2) ---->
                                         <---- p(n+3) ---->
                                                  <---- p(n+4) ---->

              Figure 1: An example of redundant transmission

   Here, each frame is retransmitted once in the following RTP payload
   packet.
   f(n-2)...f(n+4) denote a sequence of audio frames, and
   p(n-1)...p(n+4) a sequence of payload packets.

   The mechanism described does not really require signaling at the
   session setup.  However, signaling has been defined to allow the
   sender to voluntarily bound the buffering and delay requirements.
   If nothing is signaled, the use of this mechanism is allowed and
   unbounded.  For a certain timestamp, the receiver may receive
   multiple copies of a frame containing encoded audio data.  The cost
   of this scheme is bandwidth and the receiver delay necessary to allow
   the redundant copy to arrive.

   This redundancy scheme provides a functionality similar to the one
   described in RFC 2198, but it works only if both original frames and
   redundant representations are GSM-HR frames.  When the use of other
   media coding schemes is desirable, one has to resort to RFC 2198.

   The sender is responsible for selecting an appropriate amount of
   redundancy based on feedback about the channel conditions, e.g., in
   the RTP Control Protocol (RTCP) [RFC3550] receiver reports.  The
   sender is also responsible for avoiding congestion, which may be
   exacerbated by redundancy (see Section 9 for more details).

5.  Payload format

   The format of the RTP header is specified in [RFC3550].  This payload
   format uses the fields of the header in a manner consistent with that
   specification.

   The duration of one speech frame is 20 ms.  The sampling frequency is
   8 kHz, corresponding to 160 speech samples per frame.  An RTP packet
   may contain multiple frames of encoded speech or SID parameters.
   Each packet covers a period of one or more contiguous 20 ms frame
   intervals.  During silence periods no speech packets are sent;
   however, SID packets are transmitted periodically (see
   Section 5.3.1).

   To allow for error resiliency through redundant transmission, the
   periods covered by multiple packets MAY overlap in time.
   A receiver
   MUST be prepared to receive any speech frame multiple times.  A given
   frame MUST NOT be encoded as speech frame in one packet and as SID
   frame or as No_Data frame in another packet.  Furthermore, a given
   frame MUST NOT be encoded with different voicing modes in different
   packets.

   The rules regarding maximum payload size given in Section 3.2 of
   [RFC5405] SHOULD be followed.

5.1.  RTP Header Usage

   The RTP timestamp corresponds to the sampling instant of the first
   sample encoded for the first frame in the packet.  The timestamp
   clock frequency SHALL be 8000 Hz.  The timestamp is also used to
   recover the correct decoding order of the frames.

   The RTP header marker bit (M) SHALL be set to 1 whenever the first
   frame carried in the packet is the first frame in a talkspurt (see
   definition of the talkspurt in Section 4.1 of [RFC3551]).  For all
   other packets the marker bit SHALL be set to zero (M=0).

   The assignment of an RTP payload type for the format defined in this
   memo is outside the scope of this document.  The RTP profiles
   currently in use mandate binding the payload type dynamically for
   this payload format.

   The remaining RTP header fields are used as specified in RFC 3550
   [RFC3550].

5.2.  Payload Structure

   The complete payload consists of a payload table of contents (ToC)
   section, followed by speech data representing one or more speech
   frames, SID frames or No_Data frames.  The following diagram shows
   the general payload format layout:

   +-------------+-------------------------
   | ToC section | speech data section ...
   +-------------+-------------------------

                 Figure 2: General payload format layout

   Each ToC element is one octet and corresponds to one speech frame;
   the number of ToC elements is thus equal to the number of speech
   frames (including SID frames and No_Data frames).
   Each ToC entry
   represents a consecutive speech or SID or No_Data frame.  The
   timestamp value for ToC element (and corresponding speech frame data)
   N within the payload is (RTP timestamp field + (N-1)*160) mod 2^32.
   The format of the ToC element is as follows.

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |F| FT  |R R R R|
   +-+-+-+-+-+-+-+-+

                      Figure 3: The ToC element

   F: Follow flag, 1 denotes that more ToC elements follow, 0 denotes
      the last ToC element.

   R: Reserved bits, MUST be set to zero and MUST be ignored by
      receiver.

   FT: Frame type
      000 = Good Speech frame
      001 = Reserved
      010 = Good SID frame
      011 = Reserved
      100 = Reserved
      101 = Reserved
      110 = Reserved
      111 = No_Data frame

   The length of the payload data depends on the frame type:

   Good Speech frame:  The 112 speech data bits are put in 14 octets.

   Good SID frame:  The 33 SID data bits are put in 14 octets, as in
      case of Speech frames, with the unused 79 bits set all to "1".

   No_Data frame:  Length of payload data is zero octets.

   Frames marked in the GSM radio subsystem as "Bad Speech frame", "Bad
   SID frame" or "No_Data frame" are not sent in RTP packets in order to
   save bandwidth.  They are marked as "No_Data frame", if they occur
   within an RTP packet that carries more than one speech frame, SID
   frame or No_Data frame.

5.2.1.  Encoding of Speech Frames

   The 112 bits of GSM-HR-coded speech (b1...b112) are defined in TS
   46.020, Annex B [TS46.020], in the order of occurrence.  The first
   bit (b1) of the first parameter is placed in bit 0 (the MSB) of the
   first octet (octet 1) of the payload field; the second bit is placed
   in bit 1 of the first octet and so on.  The last bit (b112) is placed
   in the LSB (bit 7) of octet 14.

5.2.2.  Encoding of Silence Description Frames

   The GSM-HR Codec applies a specific coding for silence periods in so
   called SID frames.
   The coding of SID frames is based on the coding
   of speech frames by using only the first 33 bits for SID parameters
   and by setting the remaining 79 bits all to "1".

5.3.  Implementation Considerations

   An application implementing this payload format MUST understand all
   the payload parameters that are defined in this specification.  Any
   mapping of the parameters to a signaling protocol MUST support all
   parameters.  So an implementation of this payload format in an
   application using SDP is required to understand all the payload
   parameters in their SDP-mapped form.  This requirement ensures that
   an implementation always can decide whether it is capable of
   communicating when the communicating entities support this version of
   the specification.

5.3.1.  Transmission of SID frames

   When using this RTP payload format the sender SHOULD generate and
   send SID frames every 160 ms, i.e. every 8th frame, during silent
   periods.  Other SID transmission intervals may occur due to gateways
   to other systems that use other transmission intervals.

5.3.2.  Receiving Redundant Frames

   The reception of redundant audio frames, i.e. more than one audio
   frame from the same source for the same time slot, MUST be supported
   by the implementation.

5.3.3.  Decoding Validation

   If the receiver finds a mismatch between the size of a received
   payload and the size indicated by the ToC of the payload, the
   receiver SHOULD discard the packet.  This is recommended because
   decoding a frame parsed from a payload based on erroneous ToC data
   could severely degrade the audio quality.

6.  Examples

   A few examples to highlight the payload format.

6.1.  3 frames

   A basic example of the aggregation of 3 consecutive speech frames
   into a single packet.

   The first 24 bits are ToC elements.
   Bit 0 is '1' as another ToC element follows.
   Bits 1..3 are 000 = Good speech frame
   Bits 4..7 are 0000 = Reserved
   Bit 8 is '1' as another ToC element follows.
   Bits 9..11 are 000 = Good speech frame
   Bits 12..15 are 0000 = Reserved
   Bit 16 is '0', no more ToC elements follow
   Bits 17..19 are 000 = Good speech frame
   Bits 20..23 are 0000 = Reserved

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|0 0 0|0 0 0 0|1|0 0 0|0 0 0 0|0|0 0 0|0 0 0 0|b1           b8|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |b9                   Frame 1                                b40|
   +                                                               +
   |b41                                                         b72|
   +                                                               +
   |b73                                                        b104|
   +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |b105       b112|b1                                          b24|
   +-+-+-+-+-+-+-+-+                                               +
   |b25                  Frame 2                                b56|
   +                                                               +
   |b57                                                         b88|
   +                                               +-+-+-+-+-+-+-+-+
   |b89                                        b112|b1           b8|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |b9                   Frame 3                                b40|
   +                                                               +
   |b41                                                         b72|
   +                                                               +
   |b73                                                        b104|
   +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |b105       b112|
   +-+-+-+-+-+-+-+-+

6.2.  3 Frames with lost frame in the middle

   An example of a payload carrying 3 frames where the middle one is
   No_Data, for example due to loss prior to transmission by the RTP
   source.

   The first 24 bits are ToC elements.
   Bit 0 is '1' as another ToC element follows.
   Bits 1..3 are 000 = Good speech frame
   Bits 4..7 are 0000 = Reserved
   Bit 8 is '1' as another ToC element follows.
   Bits 9..11 are 111 = No_Data frame
   Bits 12..15 are 0000 = Reserved
   Bit 16 is '0', no more ToC elements follow
   Bits 17..19 are 000 = Good speech frame
   Bits 20..23 are 0000 = Reserved

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|0 0 0|0 0 0 0|1|1 1 1|0 0 0 0|0|0 0 0|0 0 0 0|b1           b8|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |b9                   Frame 1                                b40|
   +                                                               +
   |b41                                                         b72|
   +                                                               +
   |b73                                                        b104|
   +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |b105       b112|b1                                          b24|
   +-+-+-+-+-+-+-+-+                                               +
   |b25                  Frame 3                                b56|
   +                                                               +
   |b57                                                         b88|
   +                                               +-+-+-+-+-+-+-+-+
   |b89                                        b112|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

7.  Payload Format Parameters

   This RTP payload format is identified using the media type
   "audio/gsm-hr-08", which is registered in accordance with [RFC4855]
   and using the template of [RFC4288].  Note: Media subtype names are
   case-insensitive.

7.1.  Media Type Definition

   The media type for the GSM-HR codec is allocated from the IETF tree
   since GSM-HR is a well-known speech codec.  This media type
   registration covers real-time transfer via RTP.  The media subtype
   name contains "-08" to avoid potential conflict with any earlier
   drafts of GSM-HR RTP payload types that aren't bit compatible.

   Note: any unspecified parameter that is received MUST be ignored by
   the receiver to ensure that additional parameters can be added in the
   future.

   Type name:  audio

   Subtype name:  GSM-HR-08

   Required parameters:  none

   Optional parameters:

      max-red:  The maximum duration in milliseconds that elapses between
         the primary (first) transmission of a frame and any redundant
         transmission that the sender will use.  This parameter allows a
         receiver to have a bounded delay when redundancy is used.
         Allowed
         values are integers between 0 (no redundancy will be used) and
         65535.  If the parameter is omitted, no limitation on the use of
         redundancy is present.

      ptime:  see [RFC4566].

      maxptime:  see [RFC4566].

   Encoding considerations:

      This media type is framed and binary, see Section 4.8 in RFC 4288
      [RFC4288].

   Security considerations:

      See Section 10 of RFC XXXX.

   Interoperability considerations:

   Published specification:

      RFC XXXX, 3GPP TS 46.002

   Applications that use this media type:

      Real-time audio applications like voice over IP and
      teleconference.

   Additional information:  none

   Person & email address to contact for further information:

      Ingemar Johansson

   Intended usage:  COMMON

   Restrictions on usage:

      This media type depends on RTP framing, and hence is only defined
      for transfer via RTP [RFC3550].  Transport within other framing
      protocols is not defined at this time.

   Author:

      Xiaodong Duan

      Shuaiyu Wang

      Magnus Westerlund

      Ingemar Johansson

      Karl Hellwig

   Change controller:

      IETF Audio/Video Transport working group delegated from the IESG.

7.2.  Mapping to SDP

   The information carried in the media type specification has a
   specific mapping to fields in the Session Description Protocol (SDP)
   [RFC4566], which is commonly used to describe RTP sessions.  When SDP
   is used to specify sessions employing the GSM-HR codec, the mapping
   is as follows:

   o  The media type ("audio") goes in SDP "m=" as the media name.

   o  The media subtype (payload format name) goes in SDP "a=rtpmap" as
      the encoding name.  The RTP clock rate in "a=rtpmap" MUST be 8000,
      and the encoding parameters (number of channels) MUST either be
      explicitly set to 1 or omitted, implying a default value of 1.

   o  The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and
      "a=maxptime" attributes, respectively.

   o  Any remaining parameters go in the SDP "a=fmtp" attribute by
      copying them directly from the media type parameter string as a
      semicolon-separated list of parameter=value pairs.

7.2.1.  Offer/Answer Considerations

   The following considerations apply when using SDP Offer-Answer
   procedures to negotiate the use of GSM-HR payload in RTP:

   o  The SDP offerer and answerer MUST generate GSM-HR packets as
      described by the offered parameters.

   o  In most cases, the parameters "maxptime" and "ptime" will not
      affect interoperability; however, the setting of the parameters
      can affect the performance of the application.  The SDP
      offer-answer handling of the "ptime" parameter is described in
      [RFC3264].  The "maxptime" parameter MUST be handled in the same
      way.

   o  The parameter "max-red" is a stream property parameter.  For
      sendonly or sendrecv unicast media streams, the parameter declares
      the limitation on redundancy that the stream sender will use.  For
      recvonly streams, it indicates the desired value for the stream
      sent to the receiver.  The answerer MAY change the value, but is
      RECOMMENDED to use the same limitation as the offer declares.  In
      the case of multicast, the offerer MAY declare a limitation; this
      SHALL be answered using the same value.  A media sender using this
      payload format is RECOMMENDED to always include the "max-red"
      parameter.  This information is likely to simplify the media
      stream handling in the receiver.  This is especially true if no
      redundancy will be used, in which case "max-red" is set to 0.

   o  Any unknown media type parameter in an offer SHALL be removed in
      the answer.

7.2.2.  Declarative SDP Considerations

   In declarative usage, like SDP in RTSP [RFC2326] or SAP [RFC2974],
   the parameters SHALL be interpreted as follows:

   o  The stream property parameter ("max-red") is declarative, and a
      participant MUST follow what is declared for the session.  In this
      case it means that the receiver MUST be prepared to allocate
      buffer memory for the given redundancy.  Any transmissions MUST
      NOT use more redundancy than what has been declared.  More than
      one configuration may be provided if necessary by declaring
      multiple RTP payload types; however, the number of types should be
      kept small.

   o  Any "maxptime" and "ptime" values should be selected with care to
      ensure that the session's participants can achieve reasonable
      performance.

8.  IANA Considerations

   One media type (audio/gsm-hr-08) has been defined and needs
   registration in the media types registry; see Section 7.1.

9.  Congestion Control

   The general congestion control considerations for transporting RTP
   data apply; see RTP [RFC3550] and any applicable RTP profile like AVP
   [RFC3551].

   The number of frames encapsulated in each RTP payload highly
   influences the overall bandwidth of the RTP stream due to header
   overhead constraints.  Packetizing more frames in each RTP payload
   can reduce the number of packets sent and hence the header overhead,
   at the expense of increased delay and reduced error robustness.  If
   forward error correction (FEC) is used, the amount of FEC-induced
   redundancy needs to be regulated such that the use of FEC itself does
   not cause a congestion problem.

10.  Security Considerations

   RTP packets using the payload format defined in this specification
   are subject to the security considerations discussed in the RTP
   specification [RFC3550], and in any applicable RTP profile.
   The
   main security considerations for the RTP packet carrying the RTP
   payload format defined within this memo are confidentiality,
   integrity and source authenticity.  Confidentiality is achieved by
   encryption of the RTP payload.  Integrity of the RTP packets is
   achieved through a suitable cryptographic integrity protection
   mechanism.  A cryptographic system may also allow the authentication
   of the source of the payload.  A suitable security mechanism for this
   RTP payload format should provide confidentiality, integrity
   protection and at least source authentication capable of determining
   if an RTP packet is from a member of the RTP session or not.

   Note that the appropriate mechanism to provide security to RTP and
   payloads following this memo may vary.  It is dependent on the
   application, the transport, and the signalling protocol employed.
   Therefore a single mechanism is not sufficient, although if suitable
   the usage of SRTP [RFC3711] is recommended.  Other mechanisms that
   may be used are IPsec [RFC4301] and TLS [RFC5246] (RTP over TCP), but
   other alternatives may also exist.

   This RTP payload format and its media decoder do not exhibit any
   significant non-uniformity in the receiver-side computational
   complexity for packet processing, and thus are unlikely to pose a
   denial-of-service threat due to the receipt of pathological data.
   Nor does the RTP payload format contain any active content.

11.  Acknowledgements

   The authors would like to thank Xiaodong Duan, Shuaiyu Wang, Rocky
   Wang and Ying Zhang for their initial work in this area.  Many thanks
   also go to Tomas Frankkila for useful input and comments.

12.  References

12.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3264]  Rosenberg, J. and H.
              Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264,
              June 2002.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC3551]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
              Video Conferences with Minimal Control", STD 65, RFC 3551,
              July 2003.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, July 2006.

   [RFC5405]  Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines
              for Application Designers", BCP 145, RFC 5405,
              November 2008.

   [TS46.002]
              3GPP, "Specification : 3GPP TS 46.002 http://www.3gpp.org/
              ftp/Specs/archive/46_series/46.002/46002-700.zip",
              June 2007.

   [TS46.020]
              3GPP, "Specification : 3GPP TS 46.020 http://www.3gpp.org/
              ftp/Specs/archive/46_series/46.002/46020-700.zip",
              June 2007.

12.2.  Informative References

   [RFC2198]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
              Handley, M., Bolot, J., Vega-Garcia, A., and S.
              Fosse-Parisis, "RTP Payload for Redundant Audio Data",
              RFC 2198, September 1997.

   [RFC2326]  Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time
              Streaming Protocol (RTSP)", RFC 2326, April 1998.

   [RFC2974]  Handley, M., Perkins, C., and E. Whelan, "Session
              Announcement Protocol", RFC 2974, October 2000.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, March 2004.

   [RFC4288]  Freed, N. and J. Klensin, "Media Type Specifications and
              Registration Procedures", BCP 13, RFC 4288, December 2005.

   [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
              Internet Protocol", RFC 4301, December 2005.

   [RFC4855]  Casner, S., "Media Type Registration of RTP Payload
              Formats", RFC 4855, February 2007.

   [RFC5109]  Li, A., "RTP Payload Format for Generic Forward Error
              Correction", RFC 5109, December 2007.

   [RFC5246]  Dierks, T. and E. Rescorla, "The Transport Layer Security
              (TLS) Protocol Version 1.2", RFC 5246, August 2008.

Authors' Addresses

   Xiaodong Duan
   China Mobile Communications Corporation
   53A, Xibianmennei Ave., Xuanwu District
   Beijing,   100053
   P.R. China

   Email: duanxiaodong@chinamobile.com


   Shuaiyu Wang
   China Mobile Communications Corporation
   53A, Xibianmennei Ave., Xuanwu District
   Beijing,   100053
   P.R. China

   Email: wangshuaiyu@chinamobile.com


   Magnus Westerlund
   Ericsson AB
   Farogatan 6
   Stockholm,   SE-164 80
   Sweden

   Phone: +46 8 719 0000
   Email: magnus.westerlund@ericsson.com


   Karl Hellwig
   Ericsson AB
   Kackertstrasse 7-9
   52072 Aachen
   Germany

   Phone: +49 2407 575-2054
   Email: karl.hellwig@ericsson.com


   Ingemar Johansson
   Ericsson AB
   Laboratoriegrand 11
   SE-971 28 Lulea
   SWEDEN

   Phone: +46 73 0783289
   Email: ingemar.s.johansson@ericsson.com
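
Illustrative sketch (non-normative)

   The ToC layout of Section 5.2, the per-frame timestamp rule, and the
   size check of Section 5.3.3 can be sketched as follows.  This is a
   minimal example under stated assumptions, not a reference
   implementation: the function name parse_gsm_hr_payload, the returned
   tuple layout, and the choice to reject reserved frame types with an
   error are all assumptions made for this example.  The specification
   does not define receiver behavior for reserved FT values.

```python
# Octets of frame data per frame type (FT), from Section 5.2.
FRAME_OCTETS = {0b000: 14, 0b010: 14, 0b111: 0}
FRAME_NAMES = {0b000: "speech", 0b010: "sid", 0b111: "no_data"}
SAMPLES_PER_FRAME = 160  # 20 ms at 8000 Hz


def parse_gsm_hr_payload(payload: bytes, rtp_timestamp: int):
    """Split a payload into (timestamp, frame_type_name, octets) tuples.

    Raises ValueError on a reserved frame type (an assumption of this
    sketch) or on a size mismatch, in which case Section 5.3.3
    recommends discarding the whole packet.
    """
    frame_types = []
    pos = 0
    while True:
        if pos >= len(payload):
            raise ValueError("ToC runs past the end of the payload")
        toc = payload[pos]
        pos += 1
        ft = (toc >> 4) & 0x07      # FT is bits 1..3; R bits are ignored
        if ft not in FRAME_OCTETS:
            raise ValueError(f"reserved frame type {ft:03b}")
        frame_types.append(ft)
        if not toc & 0x80:          # F (bit 0) is 0: last ToC element
            break

    expected = pos + sum(FRAME_OCTETS[ft] for ft in frame_types)
    if expected != len(payload):
        raise ValueError("payload size does not match the ToC")

    frames = []
    for index, ft in enumerate(frame_types):
        size = FRAME_OCTETS[ft]
        # index is zero-based, so it plays the role of (N-1) in Section 5.2.
        timestamp = (rtp_timestamp + index * SAMPLES_PER_FRAME) % 2**32
        frames.append((timestamp, FRAME_NAMES[ft], payload[pos:pos + size]))
        pos += size
    return frames
```

   For the payload of Section 6.2 (ToC octets 0x80, 0xF0, 0x00 followed
   by 28 octets of speech data), the sketch yields three entries, the
   middle one being a No_Data frame that carries no octets.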