Network Working Group                                            X. Duan
Internet-Draft                                                   S. Wang
Intended status: Standards Track             China Mobile Communications
Expires: April 3, 2010                                       Corporation
                                                           M. Westerlund
                                                              K. Hellwig
                                                            I. Johansson
                                                             Ericsson AB
                                                           Sept 30, 2009


                     RTP Payload format for GSM-HR
                      draft-ietf-avt-rtp-gsm-hr-02

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on April 3, 2010.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents in effect on the date of
   publication of this document (http://trustee.ietf.org/license-info).
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.

Abstract

   This document specifies the payload format for packetization of the
   GSM Half-Rate speech codec data into the Real-time Transport Protocol
   (RTP).
   The payload format supports transmission of multiple frames per
   payload and packet loss robustness methods using redundancy.

Table of Contents

   1.  Introduction
   2.  Conventions
   3.  GSM Half Rate
   4.  Payload format Capabilities
     4.1.  Use of Forward Error Correction (FEC)
   5.  Payload format
     5.1.  RTP Header Usage
     5.2.  Payload Structure
       5.2.1.  Encoding of Speech Frames
       5.2.2.  Encoding of Silence Description Frames
     5.3.  Implementation Considerations
       5.3.1.  Transmission of SID frames
       5.3.2.  Receiving Redundant Frames
       5.3.3.  Decoding Validation
   6.  Examples
     6.1.  3 frames
     6.2.  3 Frames with lost frame in the middle
   7.  Payload Format Parameters
     7.1.  Media Type Definition
     7.2.  Mapping to SDP
       7.2.1.  Offer/Answer Considerations
       7.2.2.  Declarative SDP Considerations
   8.  IANA Considerations
   9.  Congestion Control
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1.  Introduction

   This document specifies the payload format for packetization of GSM
   Half Rate (GSM-HR) codec [TS46.002] encoded speech signals into the
   Real-time Transport Protocol (RTP) [RFC3550]. The payload format
   supports transmission of multiple frames per payload and packet loss
   robustness methods using redundancy.

   This document starts with conventions, a brief description of the
   codec, and the payload format's capabilities. The payload format is
   specified in Section 5. Examples can be found in Section 6. The
   media type, its mapping to SDP, and its usage in SDP offer/answer
   are then specified. The document ends with considerations around
   congestion control and security.

   This document registers a media type (audio/gsm-hr-08) for the Real-
   time Transport Protocol (RTP) payload format for the GSM-HR codec.
   Note: This format is not compatible with the one that was drafted
   back in 1999 to 2000 in the Internet-Drafts
   draft-ietf-avt-profile-new-05 to draft-ietf-avt-profile-new-09. A
   later version of the AVP profile draft was published as RFC 3551
   without any specification of the GSM-HR payload format. To avoid a
   possible conflict with this older format, the payload format
   specified in this document has a media type name that differs from
   audio/gsm-hr.

2.  Conventions

   This document uses the normal IETF bit-order representation. Bit
   fields in figures are read left to right and then down. The leftmost
   bit in each field is the most significant. The numbering starts
   from 0 and ascends, where bit 0 is the most significant.

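
   As an illustration of this numbering, the following Python sketch
   extracts single bits and bit fields from an octet, with bit 0 as the
   most significant bit. The helper names get_bit and get_field are
   hypothetical and are not part of this specification.

```python
def get_bit(octet: int, index: int) -> int:
    """Return bit `index` of an octet, where index 0 is the MSB."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in range 0..255")
    if not 0 <= index <= 7:
        raise ValueError("bit index must be in range 0..7")
    return (octet >> (7 - index)) & 1


def get_field(octet: int, first: int, width: int) -> int:
    """Return `width` bits starting at bit `first` (bit 0 is the MSB)."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("octet must be in range 0..255")
    if width < 1 or first < 0 or first + width > 8:
        raise ValueError("field must lie within one octet")
    shift = 8 - first - width  # distance of the field from the LSB
    return (octet >> shift) & ((1 << width) - 1)
```

   For example, get_field(octet, 1, 3) returns the three bits that
   follow the most significant bit of the octet.
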
   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  GSM Half Rate

   The Global System for Mobile Communication (GSM) network provides
   mobile communication services for nearly 3 billion users (status
   2008). The GSM Half Rate Codec (GSM-HR) is one of the speech codecs
   that are used in GSM networks. GSM-HR denotes the Half-Rate speech
   codec as specified in [TS46.002].

   Note: for historical reasons these 46-series specifications are
   internally referenced as 06-series. A simple mapping applies, for
   example 46.020 is referenced as 06.20 and so on.

   The GSM-HR codec has a frame length of 20 ms, with narrowband speech
   sampled at 8 kHz, i.e. 160 samples per frame. Each speech frame is
   compressed into 112 bits of speech parameters, which is equivalent to
   a bit rate of 5.6 kbit/s. Speech pauses are detected by a
   standardized Voice Activity Detection (VAD). During speech pauses
   the transmission of speech frames is inhibited. Silence Descriptor
   (SID) frames are transmitted at the end of a talk spurt and about
   every 480 ms during speech pauses to allow for a decent Comfort Noise
   (CN) quality at the receiver side.

   The SID frame generation in the GSM radio network is determined by
   the GSM mobile station and the GSM radio subsystem. SID frames come
   during speech pauses in uplink from the mobile station about every
   480 ms. In downlink to the mobile station, when they are generated by
   the encoder of the GSM radio subsystem, SID frames are sent every
   20 ms to the GSM base station, which then picks only one every 480 ms
   for downlink radio transmission. For other applications, like
   transport over IP, it is more appropriate to send the SID frames less
   often than every 20 ms, but 480 ms may be too sparse.
   We recommend, as a compromise, that a GSM-HR encoder outside of the
   GSM radio network (i.e. not in the GSM mobile station and not in the
   GSM radio subsystem, but for example in the media gateway of the core
   network) should generate and send SID frames every 160 ms.

4.  Payload format Capabilities

   This RTP payload format carries one or more GSM-HR encoded frames,
   either full voice or silence descriptor (SID), representing a mono
   speech signal. To maintain synchronization, or to indicate frames
   that were not sent or were lost, it can mark frames as No_Data.

4.1.  Use of Forward Error Correction (FEC)

   Generic forward error correction within RTP is defined, for example,
   in RFC 5109 [RFC5109]. Audio redundancy coding is defined in RFC
   2198 [RFC2198]. Either scheme can be used to add redundant
   information to the RTP packet stream and make it more resilient to
   packet losses, at the expense of a higher bit rate. See either RFC
   for a discussion of the implications of the higher bit rate for
   network congestion.

   In addition to these media-unaware mechanisms, this memo specifies an
   optional-to-use, GSM-HR-specific form of audio redundancy coding,
   which may be beneficial in terms of packetization overhead.

   Conceptually, previously transmitted transport frames are aggregated
   together with new ones. A sliding window can be used to group the
   frames to be sent in each payload. Figure 1 below shows an example.

   --+--------+--------+--------+--------+--------+--------+--------+--
     | f(n-2) | f(n-1) |  f(n)  | f(n+1) | f(n+2) | f(n+3) | f(n+4) |
   --+--------+--------+--------+--------+--------+--------+--------+--

     <---- p(n-1) ---->
              <----- p(n) ----->
                       <---- p(n+1) ---->
                                <---- p(n+2) ---->
                                         <---- p(n+3) ---->
                                                  <---- p(n+4) ---->

              Figure 1: An example of redundant transmission

   Here, each frame is retransmitted once in the following RTP payload
   packet.
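
   The grouping shown in Figure 1 can be sketched in Python as follows.
   This is a minimal illustration, assuming a fixed window of two frames
   per payload; the function name sliding_window_payloads and its window
   parameter are hypothetical and are not defined by this memo.

```python
from typing import List, Sequence


def sliding_window_payloads(frames: Sequence[bytes],
                            window: int = 2) -> List[List[bytes]]:
    """Group frames into overlapping payloads, oldest frame first.

    Payload k carries the newest frame k plus up to `window - 1`
    previously sent frames, mirroring the overlap shown in Figure 1.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    payloads = []
    for newest in range(len(frames)):
        oldest = max(0, newest - window + 1)  # clamp at the stream start
        payloads.append(list(frames[oldest:newest + 1]))
    return payloads
```

   With a window of 2, payload k carries frames k-1 and k, so a frame
   lost with one packet is still available from the next one. A window
   of 1 disables the redundancy.
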
   f(n-2)...f(n+4) denote a sequence of audio frames, and
   p(n-1)...p(n+4) a sequence of payload packets.

   The mechanism described does not really require signaling at session
   setup. However, signalling has been defined to allow the sender to
   voluntarily bound the buffering and delay requirements. If nothing
   is signalled, the use of this mechanism is allowed and unbounded.
   For a certain timestamp, the receiver may receive multiple copies of
   a frame containing encoded audio data. The cost of this scheme is
   bandwidth and the receiver delay necessary to allow the redundant
   copy to arrive.

   This redundancy scheme provides a functionality similar to the one
   described in RFC 2198, but it works only if both original frames and
   redundant representations are GSM-HR frames. When the use of other
   media coding schemes is desirable, one has to resort to RFC 2198.

   The sender is responsible for selecting an appropriate amount of
   redundancy based on feedback about the channel conditions, e.g., in
   the RTP Control Protocol (RTCP) [RFC3550] receiver reports. The
   sender is also responsible for avoiding congestion, which may be
   exacerbated by redundancy (see Section 9 for more details).

5.  Payload format

   The format of the RTP header is specified in [RFC3550]. This payload
   format uses the fields of the header in a manner consistent with that
   specification.

   The duration of one speech frame is 20 ms. The sampling frequency is
   8 kHz, corresponding to 160 speech samples per frame. An RTP packet
   may contain multiple frames of encoded speech or SID parameters.
   Each packet covers a period of one or more contiguous 20 ms frame
   intervals. During silence periods no speech packets are sent;
   however, SID packets are transmitted at intervals (see
   Section 5.3.1).

   To allow for error resiliency through redundant transmission, the
   periods covered by multiple packets MAY overlap in time.
   A receiver MUST be prepared to receive any speech frame multiple
   times. A given frame MUST NOT be encoded as speech frame in one
   packet and as SID frame or as No_Data frame in another packet.
   Furthermore, a given frame MUST NOT be encoded with different voicing
   modes in different packets.

   The rules regarding maximum payload size given in Section 3.2 of
   [RFC5405] SHOULD be followed.

5.1.  RTP Header Usage

   The RTP timestamp corresponds to the sampling instant of the first
   sample encoded for the first frame in the packet. The timestamp
   clock frequency SHALL be 8000 Hz. The timestamp is also used to
   recover the correct decoding order of the frames.

   The RTP header marker bit (M) SHALL be set to 1 whenever the first
   frame carried in the packet is the first frame in a talkspurt (see
   definition of the talkspurt in section 4.1 of [RFC3551]). For all
   other packets the marker bit SHALL be set to zero (M=0).

   The assignment of an RTP payload type for the format defined in this
   memo is outside the scope of this document. The RTP profiles
   currently in use mandate binding the payload type dynamically for
   this payload format.

   The remaining RTP header fields are used as specified in RFC 3550
   [RFC3550].

5.2.  Payload Structure

   The complete payload consists of a payload table of contents (ToC)
   section, followed by speech data representing one or more speech
   frames, SID frames or No_Data frames. The following diagram shows
   the general payload format layout:

   +-------------+-------------------------
   | ToC section | speech data section ...
   +-------------+-------------------------

                 Figure 2: General payload format layout

   Each ToC element is one octet and corresponds to one speech frame;
   the number of ToC elements is thus equal to the number of speech
   frames (including SID frames and No_Data frames).
   Each ToC entry represents a consecutive speech or SID or No_Data
   frame. The timestamp value for ToC element (and corresponding speech
   frame data) N within the payload is (RTP timestamp field +
   (N-1)*160) mod 2^32. The format of the ToC element is as follows.

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |F| FT  |R R R R|
   +-+-+-+-+-+-+-+-+

                       Figure 3: The ToC element

   F: Follow flag, 1 denotes that more ToC elements follow, 0 denotes
      the last ToC element.

   R: Reserved bits, MUST be set to zero and MUST be ignored by the
      receiver.

   FT: Frame type
      000 = Good Speech frame
      001 = Reserved
      010 = Good SID frame
      011 = Reserved
      100 = Reserved
      101 = Reserved
      110 = Reserved
      111 = No_Data frame

   The length of the payload data depends on the frame type:

   Good Speech frame: The 112 speech data bits are put in 14 octets.

   Good SID frame: The 33 SID data bits are put in 14 octets, as in
      case of Speech frames, with the unused 79 bits set all to "1".

   No_Data frame: Length of payload data is zero octets.

   Frames marked in the GSM radio subsystem as "Bad Speech frame", "Bad
   SID frame" or "No_Data frame" are not sent in RTP packets in order to
   save bandwidth. They are marked as "No_Data frame" if they occur
   within an RTP packet that carries more than one speech frame, SID
   frame or No_Data frame.

5.2.1.  Encoding of Speech Frames

   The 112 bits of GSM-HR-coded speech (b1...b112) are defined in TS
   46.020, Annex B [TS46.020], in the order of occurrence. The first
   bit (b1) of the first parameter is placed in bit 0 (the MSB) of the
   first octet (octet 1) of the payload field; the second bit is placed
   in bit 1 of the first octet and so on. The last bit (b112) is placed
   in the LSB (bit 7) of octet 14.

5.2.2.  Encoding of Silence Description Frames

   The GSM-HR Codec applies a specific coding for silence periods in
   so-called SID frames.
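
   The ToC layout, frame sizes and timestamp rule of Section 5.2 can be
   sketched as a small Python parser. This is an illustrative sketch
   under stated assumptions, not a reference implementation: the
   function name parse_payload and the choice to raise an error on
   reserved frame types are assumptions made here. The size check
   follows the recommendation in Section 5.3.3 to discard payloads
   whose length does not match the ToC.

```python
from typing import List, Tuple

# Octets of frame data per frame type: speech, SID, No_Data.
FRAME_OCTETS = {0b000: 14, 0b010: 14, 0b111: 0}
SAMPLES_PER_FRAME = 160  # 20 ms at the 8000 Hz timestamp clock


def parse_payload(payload: bytes,
                  rtp_timestamp: int) -> List[Tuple[int, int, bytes]]:
    """Split a payload into (timestamp, frame type, frame data) tuples."""
    frame_types = []
    pos = 0
    while True:
        if pos >= len(payload):
            raise ValueError("payload ended inside the ToC section")
        toc = payload[pos]
        pos += 1
        ft = (toc >> 4) & 0x07  # FT is bits 1..3, bit 0 being the MSB
        if ft not in FRAME_OCTETS:
            # Assumption: reserved types are rejected by this sketch.
            raise ValueError(f"reserved frame type {ft:03b}")
        frame_types.append(ft)
        if not toc & 0x80:  # F = 0 marks the last ToC element
            break
    # The four R bits are ignored, as required of receivers.
    expected = pos + sum(FRAME_OCTETS[ft] for ft in frame_types)
    if expected != len(payload):
        raise ValueError("payload size does not match the ToC")
    frames = []
    for n, ft in enumerate(frame_types):  # n equals N - 1 in the text
        size = FRAME_OCTETS[ft]
        timestamp = (rtp_timestamp + n * SAMPLES_PER_FRAME) % 2**32
        frames.append((timestamp, ft, payload[pos:pos + size]))
        pos += size
    return frames
```

   A caller would typically catch ValueError and drop the packet. The
   returned timestamps allow frames received more than once to be
   matched against frames that were already decoded.
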
   The coding of SID frames is based on the coding of speech frames by
   using only the first 33 bits for SID parameters and by setting the
   remaining 79 bits all to "1".

5.3.  Implementation Considerations

   An application implementing this payload format MUST understand all
   the payload parameters that are defined in this specification. Any
   mapping of the parameters to a signaling protocol MUST support all
   parameters. So an implementation of this payload format in an
   application using SDP is required to understand all the payload
   parameters in their SDP-mapped form. This requirement ensures that
   an implementation can always decide whether it is capable of
   communicating when the communicating entities support this version of
   the specification.

5.3.1.  Transmission of SID frames

   When using this RTP payload format the sender SHOULD generate and
   send SID frames every 160 ms, i.e. every 8th frame. Other SID
   transmission intervals may occur due to gateways to other systems
   that use other transmission intervals.

5.3.2.  Receiving Redundant Frames

   The reception of redundant audio frames, i.e. more than one audio
   frame from the same source for the same time slot, MUST be supported
   by the implementation.

5.3.3.  Decoding Validation

   If the receiver finds a mismatch between the size of a received
   payload and the size indicated by the ToC of the payload, the
   receiver SHOULD discard the packet. This is recommended because
   decoding a frame parsed from a payload based on erroneous ToC data
   could severely degrade the audio quality.

6.  Examples

   A few examples to highlight the payload format.

6.1.  3 frames

   A basic example of the aggregation of 3 consecutive speech frames
   into a single packet.

   The first 24 bits are ToC elements.
   Bit 0 is '1' as another ToC element follows.
   Bits 1..3 are 000 = Good speech frame
   Bits 4..7 are 0000 = Reserved
   Bit 8 is '1' as another ToC element follows.
   Bits 9..11 are 000 = Good speech frame
   Bits 12..15 are 0000 = Reserved
   Bit 16 is '0', no more ToC elements follow
   Bits 17..19 are 000 = Good speech frame
   Bits 20..23 are 0000 = Reserved

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|0 0 0|0 0 0 0|1|0 0 0|0 0 0 0|0|0 0 0|0 0 0 0|b1           b8|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |b9              Frame 1                                     b40|
   +                                                               +
   |b41                                                         b72|
   +                                                               +
   |b73                                                        b104|
   +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |b105       b112|b1                                          b24|
   +-+-+-+-+-+-+-+-+                                               +
   |b25             Frame 2                                     b56|
   +                                                               +
   |b57                                                         b88|
   +                                               +-+-+-+-+-+-+-+-+
   |b89                                        b112|b1           b8|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |b9              Frame 3                                     b40|
   +                                                               +
   |b41                                                         b72|
   +                                                               +
   |b73                                                        b104|
   +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |b105       b112|
   +-+-+-+-+-+-+-+-+

6.2.  3 Frames with lost frame in the middle

   An example of a payload carrying 3 frames where the middle one is
   No_Data, for example due to loss prior to transmission by the RTP
   source.

   The first 24 bits are ToC elements.
   Bit 0 is '1' as another ToC element follows.
   Bits 1..3 are 000 = Good speech frame
   Bits 4..7 are 0000 = Reserved
   Bit 8 is '1' as another ToC element follows.
   Bits 9..11 are 111 = No_Data frame
   Bits 12..15 are 0000 = Reserved
   Bit 16 is '0', no more ToC elements follow
   Bits 17..19 are 000 = Good speech frame
   Bits 20..23 are 0000 = Reserved

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |1|0 0 0|0 0 0 0|1|1 1 1|0 0 0 0|0|0 0 0|0 0 0 0|b1           b8|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+               +
   |b9              Frame 1                                     b40|
   +                                                               +
   |b41                                                         b72|
   +                                                               +
   |b73                                                        b104|
   +               +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |b105       b112|b1                                          b24|
   +-+-+-+-+-+-+-+-+                                               +
   |b25             Frame 3                                     b56|
   +                                                               +
   |b57                                                         b88|
   +                                               +-+-+-+-+-+-+-+-+
   |b89                                        b112|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

7.  Payload Format Parameters

   This RTP payload format is identified using the media type
   "audio/gsm-hr-08", which is registered in accordance with [RFC4855]
   and using the template of [RFC4288]. Note: Media subtype names are
   case-insensitive.

7.1.  Media Type Definition

   The media type for the GSM-HR codec is allocated from the IETF tree
   since GSM-HR is a well-known speech codec. This media type
   registration covers real-time transfer via RTP. The media subtype
   name contains "-08" to avoid potential conflict with any earlier
   drafts of GSM-HR RTP payload types that aren't bit compatible.

   Note: any unspecified parameter MUST be ignored by the receiver to
   ensure that additional parameters can be added in the future.

   Type name: audio

   Subtype name: GSM-HR-08

   Required parameters: none

   Optional parameters:

      max-red: The maximum duration in milliseconds that elapses between
         the primary (first) transmission of a frame and any redundant
         transmission that the sender will use. This parameter allows a
         receiver to have a bounded delay when redundancy is used.
         Allowed values are integers between 0 (no redundancy will be
         used) and 65535. If the parameter is omitted, no limitation on
         the use of redundancy is present.

      ptime: see [RFC4566].

      maxptime: see [RFC4566].

   Encoding considerations:

      This media type is framed and binary, see section 4.8 in RFC 4288
      [RFC4288].

   Security considerations:

      See Section 10 of RFC XXXX.

   Interoperability considerations:

   Published specification:

      RFC XXXX, 3GPP TS 46.002

   Applications that use this media type:

      Real-time audio applications like voice over IP and
      teleconference.

   Additional information: none

   Person & email address to contact for further information:

      Ingemar Johansson

   Intended usage: COMMON

   Restrictions on usage:

      This media type depends on RTP framing, and hence is only defined
      for transfer via RTP [RFC3550]. Transport within other framing
      protocols is not defined at this time.

   Author:

      Xiaodong Duan

      Shuaiyu Wang

      Magnus Westerlund

      Ingemar Johansson

      Karl Hellwig

   Change controller:

      IETF Audio/Video Transport working group delegated from the IESG.

7.2.  Mapping to SDP

   The information carried in the media type specification has a
   specific mapping to fields in the Session Description Protocol (SDP)
   [RFC4566], which is commonly used to describe RTP sessions. When SDP
   is used to specify sessions employing the GSM-HR codec, the mapping
   is as follows:

   o  The media type ("audio") goes in SDP "m=" as the media name.

   o  The media subtype (payload format name) goes in SDP "a=rtpmap" as
      the encoding name. The RTP clock rate in "a=rtpmap" MUST be 8000,
      and the encoding parameters (number of channels) MUST either be
      explicitly set to 1 or omitted, implying a default value of 1.

   o  The parameters "ptime" and "maxptime" go in the SDP "a=ptime" and
      "a=maxptime" attributes, respectively.

   o  Any remaining parameters go in the SDP "a=fmtp" attribute by
      copying them directly from the media type parameter string as a
      semicolon-separated list of parameter=value pairs.

7.2.1.  Offer/Answer Considerations

   The following considerations apply when using SDP Offer-Answer
   procedures to negotiate the use of GSM-HR payload in RTP:

   o  The SDP offerer and answerer MUST generate GSM-HR packets as
      described by the offered parameters.

   o  In most cases, the parameters "maxptime" and "ptime" will not
      affect interoperability; however, the setting of the parameters
      can affect the performance of the application. The SDP offer-
      answer handling of the "ptime" parameter is described in
      [RFC3264]. The "maxptime" parameter MUST be handled in the same
      way.

   o  The parameter "max-red" is a stream property parameter. For
      sendonly or sendrecv unicast media streams, the parameter declares
      the limitation on redundancy that the stream sender will use. For
      recvonly streams, it indicates the desired value for the stream
      sent to the receiver. The answerer MAY change the value, but is
      RECOMMENDED to use the same limitation as the offer declares. In
      the case of multicast, the offerer MAY declare a limitation; this
      SHALL be answered using the same value. A media sender using this
      payload format is RECOMMENDED to always include the "max-red"
      parameter. This information is likely to simplify the media
      stream handling in the receiver. This is especially true if no
      redundancy will be used, in which case "max-red" is set to 0.

   o  Any unknown media type parameter in an offer SHALL be removed in
      the answer.

7.2.2.  Declarative SDP Considerations

   In declarative usage, like SDP in RTSP [RFC2326] or SAP [RFC2974],
   the parameters SHALL be interpreted as follows:

   o  The stream property parameter ("max-red") is declarative, and a
      participant MUST follow what is declared for the session. In this
      case it means that the receiver MUST be prepared to allocate
      buffer memory for the given redundancy. Any transmissions MUST
      NOT use more redundancy than what has been declared. More than
      one configuration may be provided if necessary by declaring
      multiple RTP payload types; however, the number of types should be
      kept small.

   o  Any "maxptime" and "ptime" values should be selected with care to
      ensure that the session's participants can achieve reasonable
      performance.

8.  IANA Considerations

   One media type (audio/gsm-hr-08) has been defined and needs
   registration in the media types registry; see Section 7.1.

9.  Congestion Control

   The general congestion control considerations for transporting RTP
   data apply; see RTP [RFC3550] and any applicable RTP profile like AVP
   [RFC3551].

   The number of frames encapsulated in each RTP payload strongly
   influences the overall bandwidth of the RTP stream because of the
   per-packet header overhead. Packetizing more frames in each RTP
   payload can reduce the number of packets sent and hence the header
   overhead, at the expense of increased delay and reduced error
   robustness. If forward error correction (FEC) is used, the amount of
   FEC-induced redundancy needs to be regulated such that the use of FEC
   itself does not cause a congestion problem.

10.  Security Considerations

   RTP packets using the payload format defined in this specification
   are subject to the security considerations discussed in the RTP
   specification [RFC3550], and in any applicable RTP profile.
   The main security considerations for the RTP packet carrying the RTP
   payload format defined within this memo are confidentiality,
   integrity and source authenticity. Confidentiality is achieved by
   encryption of the RTP payload. Integrity of the RTP packets is
   achieved through a suitable cryptographic integrity protection
   mechanism. A cryptographic system may also allow the authentication
   of the source of the payload. A suitable security mechanism for this
   RTP payload format should provide confidentiality, integrity
   protection and at least source authentication capable of determining
   if an RTP packet is from a member of the RTP session or not.

   Note that the appropriate mechanism to provide security to RTP and
   payloads following this memo may vary. It is dependent on the
   application, the transport, and the signalling protocol employed.
   Therefore a single mechanism is not sufficient, although if suitable
   the usage of SRTP [RFC3711] is recommended. Other mechanisms that
   may be used are IPsec [RFC4301] and TLS [RFC5246] (RTP over TCP), but
   also other alternatives may exist.

   This RTP payload format and its media decoder do not exhibit any
   significant non-uniformity in the receiver-side computational
   complexity for packet processing, and thus are unlikely to pose a
   denial-of-service threat due to the receipt of pathological data.
   Nor does the RTP payload format contain any active content.

11.  Acknowledgements

   The authors would like to thank Xiaodong Duan, Shuaiyu Wang, Rocky
   Wang and Ying Zhang for their initial work in this area. Many thanks
   also go to Tomas Frankkila for useful input and comments.

12.  References

12.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264,
              June 2002.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC3551]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
              Video Conferences with Minimal Control", STD 65, RFC 3551,
              July 2003.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, July 2006.

   [RFC5405]  Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines
              for Application Designers", BCP 145, RFC 5405,
              November 2008.

   [TS46.002]
              3GPP, "Specification : 3GPP TS 46.002
              http://www.3gpp.org/ftp/Specs/archive/46_series/46.002/
              46002-700.zip", June 2007.

   [TS46.020]
              3GPP, "Specification : 3GPP TS 46.020
              http://www.3gpp.org/ftp/Specs/archive/46_series/46.002/
              46020-700.zip", June 2007.

12.2.  Informative References

   [RFC2198]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
              Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-
              Parisis, "RTP Payload for Redundant Audio Data", RFC 2198,
              September 1997.

   [RFC2326]  Schulzrinne, H., Rao, A., and R. Lanphier, "Real Time
              Streaming Protocol (RTSP)", RFC 2326, April 1998.

   [RFC2974]  Handley, M., Perkins, C., and E. Whelan, "Session
              Announcement Protocol", RFC 2974, October 2000.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, March 2004.

   [RFC4288]  Freed, N. and J. Klensin, "Media Type Specifications and
              Registration Procedures", BCP 13, RFC 4288, December 2005.

   [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
              Internet Protocol", RFC 4301, December 2005.

   [RFC4855]  Casner, S., "Media Type Registration of RTP Payload
              Formats", RFC 4855, February 2007.

   [RFC5109]  Li, A., "RTP Payload Format for Generic Forward Error
              Correction", RFC 5109, December 2007.

   [RFC5246]  Dierks, T. and E. Rescorla, "The Transport Layer Security
              (TLS) Protocol Version 1.2", RFC 5246, August 2008.

Authors' Addresses

   Xiaodong Duan
   China Mobile Communications Corporation
   53A, Xibianmennei Ave., Xuanwu District
   Beijing,   100053
   P.R. China

   Phone:
   Fax:
   Email: duanxiaodong@chinamobile.com
   URI:

   Shuaiyu Wang
   China Mobile Communications Corporation
   53A, Xibianmennei Ave., Xuanwu District
   Beijing,   100053
   P.R. China

   Phone:
   Fax:
   Email: wangshuaiyu@chinamobile.com
   URI:

   Magnus Westerlund
   Ericsson AB
   Farogatan 6
   Stockholm,   SE-164 80
   Sweden

   Phone: +46 8 719 0000
   Fax:
   Email: magnus.westerlund@ericsson.com
   URI:

   Karl Hellwig
   Ericsson AB
   Kackertstrasse 7-9
   52072 Aachen
   Germany

   Phone: +49 2407 575-2054
   Email: karl.hellwig@ericsson.com

   Ingemar Johansson
   Ericsson AB
   Laboratoriegrand 11
   SE-971 28 Lulea
   SWEDEN

   Phone: +46 73 0783289
   Email: ingemar.s.johansson@ericsson.com