idnits 2.17.1

draft-barany-avt-efr-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate. This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 1
     longer page, the longest (page 1) being 450 lines

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.

  ** There is 1 instance of too long lines in the document, the longest one
     being 3 characters in excess of 72.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to
     grant the BCP78 rights to the IETF Trust, then this is fine, and you can
     ignore this comment. If not, you may need to add the pre-RFC5378
     disclaimer. (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (November 14, 2001) is 8198 days in the past. Is this
     intentional?
  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Missing reference section? 'PAGE 1' on line 50 looks like a reference

  -- Missing reference section? 'PAGE 2' on line 100 looks like a reference

  -- Missing reference section? 'PAGE 3' on line 150 looks like a reference

  -- Missing reference section? 'PAGE 4' on line 198 looks like a reference

  -- Missing reference section? 'PAGE 5' on line 247 looks like a reference

  -- Missing reference section? 'PAGE 6' on line 297 looks like a reference

  -- Missing reference section? 'PAGE 7' on line 346 looks like a reference

  -- Missing reference section? 'PAGE 8' on line 391 looks like a reference

     Summary: 3 errors (**), 0 flaws (~~), 2 warnings (==), 10 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    Internet Engineering Task Force            Peter Barany, Nortel Networks
3    Audio Video Transport WG                William Navarro, Nortel Networks
4    INTERNET-DRAFT
5    November 14, 2001
6    Expires: May 14, 2002

8                    RTP payload format for EFR speech codec
9

11   Status of this Memo

13      This document is an Internet-Draft and is in full conformance with
14      all provisions of Section 10 of RFC 2026.

16      Internet-Drafts are working documents of the Internet Engineering
17      Task Force (IETF), its areas, and its working groups. Note that other
18      groups may also distribute working documents as Internet-Drafts.

20      Internet-Drafts are draft documents valid for a maximum of six months
21      and may be updated, replaced, or obsoleted by other documents at any
22      time. It is inappropriate to use Internet-Drafts as reference
23      material or to cite them other than as "work in progress."
25      The list of current Internet-Drafts can be accessed at
26      http://www.ietf.org/ietf/1id-abstracts.txt

28      The list of Internet-Draft Shadow Directories can be accessed at
29      http://www.ietf.org/shadow.html.

31      This document is an individual submission to the IETF AVT WG.
32      Comments should be directed to the authors.

34   Abstract

36      This document specifies a Real-Time Transport Protocol (RTP) payload
37      format for the Global System for Mobile communications (GSM) Enhanced
38      Full Rate (EFR) speech codec. The EFR speech codec RTP payload format
39      specified in this document closely resembles the EFR speech codec RTP
40      payload format defined in TS 101 318 "Using GSM Speech Codecs Within
41      ITU-T Recommendation H.323". It is designed specifically to optimally
42      interoperate with existing (i.e., legacy) GSM circuit-switched
43      transceiver equipment in the sense that it supports the following
44      EFR speech codec circuit-switched domain functionality in the packet-
45      switched domain: error concealment of lost speech frames and SIlence
46      Descriptor (SID) frames. The EFR speech codec RTP payload format
47      defined in TS 101 318 does not support this functionality. A MIME
48      type registration for the EFR speech codec is also included.

50   Barany et al.                                                   [PAGE 1]
51   Revision history

53      -00: Document created for specification of an RTP payload format for
54      the EFR speech codec.

56   Conventions used in this document

58      The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
59      "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
60      document are to be interpreted as described in RFC 2119
61      [ref-RFC-2119].
63   Table of contents

65      Status of this memo.................................................1
66      Abstract............................................................1
67      Revision history (remove before publishing).........................2
68      Conventions used in this document...................................2
69      Table of contents...................................................2
70      1.   Introduction...................................................2
71      1.1. EFR speech codec...............................................2
72      1.2. Existing RTP payload format for EFR speech codec...............3
73      1.3. Legacy transceiver interoperability............................3
74      1.4. EFR speech codec and AMR speech codec comparison...............4
75      2.   Payload format.................................................5
76      3.   IANA considerations............................................6
77      4.   Security considerations........................................6
78      5.   MIME type registration.........................................6
79      5.1. Mapping to SDP parameters.....................................7
80      6.   References.....................................................7
81      7.   Authors' addresses.............................................8

83   1. Introduction

85      This document specifies a Real-Time Transport Protocol (RTP) payload
86      format for the Global System for Mobile communications (GSM) Enhanced
87      Full Rate (EFR) speech codec. The EFR speech codec RTP payload format
88      specified in this document closely resembles the EFR speech codec RTP
89      payload format defined in [ref-EFR-RTP]. It is designed specifically
90      to optimally interoperate with existing (i.e., legacy) GSM circuit-
91      switched transceiver equipment in the sense that it supports the
92      following EFR speech codec circuit-switched domain functionality in
93      the packet-switched domain: error concealment of lost speech frames
94      and SIlence Descriptor (SID) frames [ref-EFR-ERR].

96   1.1. EFR speech codec

98      The Enhanced Full Rate (EFR) speech codec [ref-EFR-COD] was developed

100  Barany et al.                                                   [PAGE 2]
101     by the European Telecommunications Standards Institute (ETSI). The
102     EFR speech codec is standardized for the Global System for Mobile
103     communications (GSM).

105     The EFR speech codec is a single-mode speech codec with a bit rate of
106     12.2 kbps (i.e., 244 speech bits per 20 ms speech frame). The
107     sampling frequency is 8,000 Hz; consequently, there are 160 samples
108     per 20 ms speech frame.

110     In the circuit-switched domain, the EFR speech codec supports the
111     following functionality:

113     (1) DTX operation [ref-EFR-DTX]; and

115     (2) error concealment of lost speech frames and SID frames [ref-EFR-
116     ERR].

118     This functionality is important because it makes it possible to
119     achieve optimum Mean Opinion Scores (MOS) for GSM circuit-switched
120     voice service using the EFR speech codec.

122  1.2. Existing RTP payload format for EFR speech codec

124     An existing RTP payload format for the EFR speech codec is defined
125     in [ref-EFR-RTP], which is referenced in [ref-RTP-PROF]. A MIME
126     registration for this RTP payload format is defined in [ref-RTP-
127     MIME].

129     While this EFR speech codec RTP payload format can be used to
130     interoperate with existing (i.e., legacy) GSM circuit-switched
131     transceiver equipment, the functionality will be suboptimal in the
132     sense that it does not support the following EFR speech codec
133     circuit-switched domain functionality in the packet-switched domain:
134     error concealment of lost speech frames and SID frames [ref-EFR-ERR].

136     Error concealment of lost speech frames and SID frames is not
137     possible because the RTP payload format does not incorporate a
138     payload quality indicator.

140  1.3. Legacy transceiver interoperability

142     The GSM/EDGE Radio Access Network (GERAN) (where EDGE stands for
143     Enhanced Data Rates for Global Evolution) is described in [ref-
144     GERAN]. GERAN is an evolution of:

146     (1) GSM circuit-switched voice and data radio access networks; and

148     (2) General Packet Radio Service (GPRS) and Enhanced GPRS (EGPRS)

150  Barany et al.                                                   [PAGE 3]
151     packet-switched radio access networks.

153     GERAN provides an interface between these radio access networks and
154     the Universal Mobile Telecommunications System (UMTS) core network.

156     Currently, there are a great many legacy GSM circuit-switched
157     transceivers deployed in the field by service providers that
158     implement a standardized scheme for channel coding/decoding,
159     interleaving/deinterleaving, CRC, modulation/demodulation, etc.
160     [ref-EFR-CH] for EFR speech codec based GSM circuit-switched voice
161     service.

163     GERAN defines a service known as the "optimized speech bearer"
164     [ref-GERAN] that makes it possible for a service provider to reuse
165     these legacy GSM circuit-switched transceivers for EFR speech codec
166     based GERAN packet-switched voice service. For the optimized speech
167     bearer service, network level and transport level headers (i.e.,
168     IP/UDP/RTP) are not transmitted over the air interface (i.e., Uu
169     interface). The receiving entity (i.e., terminal or radio network
170     controller) can regenerate the headers based upon (1) information
171     submitted during call setup and (2) information derived from lower
172     layers (i.e., link and physical layers). Note that the regenerated
173     headers may not always be semantically identical to the original
174     headers.

176     Figure 1 illustrates a likely EFR speech codec based GERAN optimized
177     speech bearer scenario where the EFR speech codec is used as a
178     packet-switched application in a GERAN system with existing (i.e.,
179     legacy) GSM circuit-switched transceiver equipment.
181             Uu interface                          Iu-ps interface
182     +----------+     +-------------+     +------------+     +-----------+
183     |          |---->|   LEGACY    |---->|   RADIO    |---->|           |
184     | TERMINAL |     |    BASE     |     |  NETWORK   |     |  GATEWAY  |
185     |          |<----|   STATION   |<----| CONTROLLER |<----|           |
186     |          |     | TRANSCEIVER |     |            |     |           |
187     +----------+     +-------------+     +------------+     +-----------+

189                  Figure 1. Terminal to gateway scenario.

191  1.4. EFR speech codec and AMR speech codec comparison

193     As mentioned in Section 1.1 of this document, the EFR speech codec is
194     a single-mode speech codec with a bit rate of 12.2 kbps (i.e., 244
195     speech bits per 20 ms speech frame). The sampling frequency is 8,000
196     Hz; consequently, there are 160 samples per 20 ms speech frame. The

198  Barany et al.                                                   [PAGE 4]
199     original order of the 244 speech bits for the EFR speech codec as
200     delivered from the speech encoder is defined in Table 5 in [ref-EFR-
201     COD]. The 244 speech bits pass through a preliminary channel encoder
202     which produces 260 bits corresponding to 244 input speech bits and 16
203     redundancy bits [ref-EFR-CH]. The 260 bits are then reordered in
204     descending bit error sensitivity order according to Table 6 in [ref-
205     EFR-CH]. This enables the use of Unequal Error Detection (UED) and
206     Unequal Error Protection (UEP). There are a total of 182 Class 1 bits
207     (protected) and 78 Class 2 bits (unprotected). The Class 1 bits are
208     further divided into Class 1a (the 50 most important bits) and Class
209     1b bits (the 132 next most important bits). The Class 1a bits are
210     protected by a cyclic code and a convolutional code whereas the Class
211     1b bits are protected by the convolutional code only.

213     The 12.2 kbps speech mode is one of the eight Adaptive Multi-Rate
214     (AMR) speech codec speech modes [ref-AMR-COD]. The original order of
215     the 244 speech bits for the 12.2 kbps speech mode of the AMR speech
216     codec as delivered from the speech encoder is defined in Table 9a in
217     [ref-AMR-COD]. This is the same as that defined for the EFR speech
218     codec. However, for the AMR speech codec, the 244 speech bits do not
219     pass through a preliminary channel encoder and 16 redundancy bits are
220     not added. Also, the 244 bits are reordered in descending bit error
221     sensitivity order in a different manner than that done for the EFR
222     speech codec (see Table 7 in [ref-EFR-CH]), with the bits being
223     classified as Class A bits (the 81 most important bits), Class B bits
224     (the 103 next most important bits), and Class C bits (the least
225     important 60 bits). See Table 2 in [ref-AMR-FRM].

227     Another significant difference between the two speech codecs
228     concerns DTX operation. The SID frames are different. The SID frame
229     for the EFR speech codec is defined in [ref-EFR-CN]. The SID frame
230     for the AMR speech codec is defined in [ref-AMR-CN, ref-AMR-FRM].
231     Also, the AMR speech codec has a SID_FIRST and SID_UPDATE frame (in
232     addition to the SID frame) while the EFR speech codec does not.

234     In light of these differences, the
235     EFR speech codec RTP payload format specified in this document is
236     not based upon the AMR speech codec RTP payload format defined in
237     [ref-AMR-RTP]. Instead, the EFR speech codec RTP payload format
238     specified in this document closely resembles the EFR speech codec
239     RTP payload format defined in [ref-EFR-RTP].

241  2. Payload format

243     As mentioned throughout this document, the EFR speech codec RTP
244     payload format specified in this document closely resembles the EFR
245     speech codec RTP payload format defined in [ref-EFR-RTP].

247  Barany et al.                                                   [PAGE 5]
248     The only difference is that the 4-bit signature (0xC, binary 1100)
249     at the beginning of every buffer for the EFR speech codec RTP payload
250     format defined in [ref-EFR-RTP] MUST be replaced by a 1-bit payload
251     quality indicator Q followed by 3 reserved bits R. The payload
252     quality indicator, if not set, indicates that the payload is severely
253     damaged and the receiver should set the Bad Frame Indicator (BFI),
254     see [ref-EFR-DTX], to either "Unusable frame" (for speech frames) or
255     "Invalid SID frame" (for SID frames). The 3 reserved bits MUST be set
256     to zero. All R bits MUST be ignored by the receiver.

258     As is the case for the EFR speech RTP payload format defined in [ref-
259     EFR-RTP], the bits in the buffer are numbered in the big-endian
260     manner, starting from r1 (the MSB of the first octet) and ending
261     at r248 (the least significant bit of the last octet). Therefore, for
262     the EFR speech codec RTP payload format specified in this document,
263     the first octet in the buffer contains QRRR in its 4 MSBs as opposed
264     to 1100 for the EFR speech codec RTP payload format defined in [ref-
265     EFR-RTP].

267  3. IANA considerations

269     One new MIME sub-type as described in this section is to be
270     registered.

272     The MIME-name for the EFR speech codec is allocated from the IETF
273     tree since the EFR speech codec may be a widely used speech codec
274     for GERAN packet-switched voice service using existing (i.e., legacy)
275     GSM circuit-switched transceiver equipment.

277  4. Security considerations

279     RTP packets using the payload format defined in this specification
280     are subject to the security considerations discussed in the RTP
281     specification [ref-RTP], and any appropriate profile. This implies
282     that confidentiality of the media streams is achieved by encryption.
283     Because the data encoding used with this payload format is applied
284     end-to-end, encryption may be performed after encoding so there is no
285     conflict between the two operations.

287     A potential denial-of-service threat exists for data encodings using
288     receiver side decoding. The attacker can inject pathological
289     datagrams into the stream, which are complex to decode and cause the
290     receiver to be overloaded. The decoder software should consider this
291     possibility and take the necessary precautions.

293     As with any IP-based protocol, in some circumstances, a receiver may
294     be overloaded simply by the receipt of too many packets, either
295     desired or undesired. Network-layer authentication may be used to

297  Barany et al.                                                   [PAGE 6]
298     discard packets from undesired sources, but the processing cost of
299     the authentication itself may be too high.

301  5. MIME type registration

303     Media Type name: audio

305     Media subtype name: GERAN-EFR

307     Required parameters: none

309     Optional parameters: none

311     Encoding considerations: See Section 2 of this document.

313     Security considerations: See Section 4 of this document.

315     Intended usage: COMMON

317  5.1. Mapping to SDP parameters

319     Example of usage of the EFR speech codec in SDP [ref-SDP] for a
320     possible GERAN "optimized speech bearer" service that utilizes
321     existing (i.e., legacy) GSM circuit-switched transceiver equipment:

323     m=audio 49120 RTP/AVP 97
324     a=rtpmap:97 GERAN-EFR/8000

326  6. References

328     [ref-RFC-2119] RFC 2119 "Key Words for Use in RFCs to Indicate
329                    Requirement Levels".

331     [ref-EFR-RTP]  TS 101 318 "Using GSM Speech Codecs Within ITU-T
332                    Recommendation H.323".

334     [ref-EFR-ERR]  3GPP TS 46.061 "Substitution and muting of lost frames
335                    for Enhanced Full Rate (EFR) Speech Traffic Channels".

337     [ref-EFR-COD]  3GPP TS 46.060 "Enhanced Full Rate (EFR) Speech
338                    Transcoding".

340     [ref-EFR-DTX]  3GPP TS 46.081 "Discontinuous Transmission (DTX) for
341                    Enhanced Full Rate (EFR) Speech Traffic Channels".

343     [ref-RTP-PROF] draft-ietf-avt-profile-new-11.txt "RTP Profile for
344                    Audio and Video Conferences with Minimal Control".

346  Barany et al.                                                   [PAGE 7]

348     [ref-RTP-MIME] draft-ietf-avt-rtp-mime-05.txt "MIME Type Registration
349                    of RTP Payload Formats".

351     [ref-GERAN]    3GPP TS 43.051 "GSM/EDGE Radio Access Network (GERAN);
352                    Overall Description-Stage 2".

354     [ref-EFR-CH]   3GPP TS 45.003 "Channel Coding".
356     [ref-AMR-COD]  3GPP TS 26.090 "AMR Speech Codec; Transcoding
357                    Functions".

359     [ref-AMR-FRM]  3GPP TS 26.101 "AMR Speech Codec Frame Structure".

361     [ref-EFR-CN]   3GPP TS 46.062 "Comfort noise aspects for Enhanced
362                    Full Rate (EFR) Speech Traffic Channels".

364     [ref-AMR-CN]   3GPP TS 26.092 "AMR Speech Codec; Comfort Noise
365                    Aspects".

367     [ref-AMR-RTP]  draft-ietf-avt-rtp-amr-10.txt "RTP Payload Format and
368                    File Storage Format for AMR and AMR-WB Audio".

370     [ref-RTP]      draft-ietf-avt-rtp-new-10.txt "RTP: A Transport
371                    Protocol for Real-Time Applications".

373     [ref-SDP]      draft-ietf-mmusic-sdp-new-03.txt "SDP: Session
374                    Description Protocol".

376  7. Authors' Addresses

378     Peter Barany                 Tel: +1 972 685 2471
379     Nortel Networks              EMail: pbarany@nortelnetworks.com
380     2201 Lakeside Boulevard
381     Richardson, Texas 75083
382     United States of America

384     William Navarro              Tel: +33 1 39 44 57 56
385     Nortel Networks              EMail: navarro@nortelnetworks.com
386     19, Avenue du Centre
387     Montigny-le-Bretonneaux - PC CT111
388     78928 Yvelines Cedex 9
389     France

391  Barany et al.                                                   [PAGE 8]
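
Illustrative note on Section 2 (not part of the draft). The first-octet
handling described in Section 2 can be sketched roughly as follows. This
is a minimal sketch, not a reference implementation. The function and
constant names are hypothetical, the choice of Python is arbitrary, and
the is_sid argument assumes the receiver already knows by some other
means whether a frame is a SID frame, which the draft does not specify.
The 31-octet length follows from the r1..r248 bit numbering.

```python
from typing import Optional

PAYLOAD_OCTETS = 31      # r1..r248 in Section 2 is 248 bits, i.e. 31 octets
LEGACY_SIGNATURE = 0xC   # 4-bit signature (binary 1100) of [ref-EFR-RTP]
Q_MASK = 0x80            # Q is the first of QRRR, i.e. the MSB (r1) of octet 0
LOW_NIBBLE = 0x0F        # the 4 LSBs of octet 0 are left untouched


def replace_signature(legacy: bytes, quality_ok: bool) -> bytes:
    """Swap the leading 1100 signature for QRRR, with the R bits set to zero."""
    if len(legacy) != PAYLOAD_OCTETS:
        raise ValueError("expected a 31-octet buffer")
    if legacy[0] >> 4 != LEGACY_SIGNATURE:
        raise ValueError("buffer does not start with the 0xC signature")
    first = (Q_MASK if quality_ok else 0) | (legacy[0] & LOW_NIBBLE)
    return bytes([first]) + legacy[1:]


def read_quality(payload: bytes) -> bool:
    """Return Q; the three R bits are ignored, as Section 2 requires."""
    if len(payload) != PAYLOAD_OCTETS:
        raise ValueError("expected a 31-octet buffer")
    return bool(payload[0] & Q_MASK)


def bfi_label(quality_ok: bool, is_sid: bool) -> Optional[str]:
    """Map a cleared Q bit to the BFI value named in Section 2."""
    if quality_ok:
        return None  # payload not flagged as damaged; no BFI value implied
    return "Invalid SID frame" if is_sid else "Unusable frame"
```

A sender converting a buffer in the [ref-EFR-RTP] layout would call
replace_signature() once per frame. A receiver would call read_quality()
and, when Q is clear, pass the bfi_label() result to its error
concealment logic. The reader masks out the R bits rather than checking
them, which matches the "MUST be ignored" rule.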