Internet Engineering Task Force                                   SIP WG
Internet Draft                                              G. Camarillo
                                                                Ericsson
draft-camarillo-sipping-transc-framework-00.txt
August 28, 2003
Expires: February, 2004

     Framework for Transcoding with the Session Initiation Protocol

STATUS OF THIS MEMO

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   To view the list of Internet-Draft Shadow Directories, see
   http://www.ietf.org/shadow.html.

Abstract

   This document defines a framework for transcoding with SIP. This
   framework includes how to discover the need for transcoding services
   in a session and how to invoke those transcoding services. Two models
   for transcoding services invocation are discussed: the conference
   bridge model and the third party call control model. Both models meet
   the requirements for SIP regarding transcoding services invocation to
   support deaf, hard of hearing and speech-impaired individuals.

Table of Contents

   1          Introduction
   2          Discovery of the Need for Transcoding Services
   3          Transcoding Services Invocation
   3.1        Third Party Call Control Transcoding Model
   3.2        Conference Bridge Transcoding Model
   4          Security Considerations
   5          Contributors
   6          Authors' Addresses
   7          Bibliography

1 Introduction

   Two user agents involved in a SIP [1] dialog may find it impossible
   to establish a media session due to a variety of incompatibilities.
   Assuming that both user agents understand the same session
   description format (e.g., SDP), incompatibilities can be found at the
   user agent level and at the user level. At the user agent level, the
   two terminals may not support any common codec or may not support
   common media types (e.g., a text-only terminal and an audio-only
   terminal). At the user level, a deaf person will not be able to
   understand what is said over an audio stream.

   In order to make communications possible in the presence of
   incompatibilities, user agents need to introduce intermediaries that
   provide transcoding services to a session.
   From the SIP point of view, the introduction of a transcoder is done
   in the same way to resolve both user level and user agent level
   incompatibilities. Therefore, the invocation mechanisms described in
   this document are generally applicable to any type of incompatibility
   related to how the information that needs to be communicated is
   encoded.

      Furthermore, although this framework focuses on transcoding,
      the mechanisms described are applicable to media manipulation
      in general. It would be possible to use them, for example, to
      invoke a server that simply increases the volume of an audio
      stream.

   This document does not describe media server discovery. That is an
   orthogonal problem that one can address using user agent provisioning
   or other methods.

   The remainder of this document is organized as follows. Section 2
   deals with the discovery of the need for transcoding services for a
   particular session. Section 3.1 introduces the third party call
   control model, and Section 3.2 introduces the conference bridge
   transcoding invocation model. Both models meet the requirements
   regarding transcoding services invocation in RFC 3351 [2] to support
   deaf, hard of hearing and speech-impaired individuals.

2 Discovery of the Need for Transcoding Services

   According to the one-party consent model defined in RFC 3238 [3],
   services that involve media manipulation are best invoked by one of
   the end-points involved in the communication, as opposed to being
   invoked by an intermediary in the network. Following this principle,
   one of the end-points should be the one detecting that transcoding is
   needed for a particular session.

   In order to decide whether or not transcoding is needed, a user agent
   needs to know the capabilities of the remote user agent.
   A user agent acting as an offerer typically obtains this knowledge by
   downloading a presence document that includes media capabilities
   (e.g., Bob is available on a terminal that only supports audio) or by
   getting an SDP description of media capabilities as defined in
   RFC 3264 [4]. Presence documents are typically received in a NOTIFY
   request as a result of a subscription. SDP media capabilities
   descriptions are typically received in a 200 (OK) response to an
   OPTIONS request or in a 488 (Not Acceptable Here) response to an
   INVITE.

   It is recommended that an offerer not invoke transcoding services
   before making sure that the answerer does not support the
   capabilities needed for the session. Making wrong assumptions about
   the answerer's capabilities can lead to situations where two
   transcoders are introduced (one by the offerer and one by the
   answerer) in a session that would not need any transcoding services
   at all.

      An example of the situation above is a call between two GSM
      phones (without using transcoding-free operation). Both
      phones use a GSM codec, but the speech is converted from
      GSM to PCM by the originating MSC and from PCM back to GSM
      by the terminating MSC.

   Note that transcoding services can be symmetric (e.g., speech-to-text
   plus text-to-speech) or asymmetric (e.g., a one-way speech-to-text
   transcoding for a hearing-impaired user who can talk).

3 Transcoding Services Invocation

   Once the need for transcoding for a particular session has been
   identified as described in Section 2, one of the user agents needs to
   invoke transcoding services.

   As stated previously, transcoder location is outside the scope of
   this document. Therefore, we assume that the user agent invoking
   transcoding services knows the URI of a server that provides them.

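   As an illustration only, the decision described in Section 2 could
   be sketched as follows. The sketch assumes that both capability
   descriptions are well-formed SDP bodies and compares nothing but
   format names per media type. The function names, the small table of
   static payload types, and the two SDP bodies are hypothetical and
   are not defined by this framework.

```python
from typing import Dict, List, Set, Tuple

# A few static RTP/AVP payload types (RFC 3551); illustrative, not complete.
STATIC_PAYLOADS = {"0": "PCMU", "3": "GSM", "8": "PCMA"}


def parse_capabilities(sdp: str) -> Dict[str, Set[str]]:
    """Map each media type in an SDP body to the format names it lists."""
    caps: Dict[str, Set[str]] = {}
    rtpmap: Dict[Tuple[str, str], str] = {}  # (media, payload type) -> name
    listed: List[Tuple[str, str]] = []       # (media, payload type) per m= line
    media = ""
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt> ...
            fields = line[2:].split()
            media = fields[0]
            caps.setdefault(media, set())
            listed.extend((media, fmt) for fmt in fields[3:])
        elif line.startswith("a=rtpmap:"):
            # a=rtpmap:<payload type> <encoding name>/<clock rate>
            payload, encoding = line[len("a=rtpmap:"):].split(None, 1)
            rtpmap[(media, payload)] = encoding.split("/")[0].upper()
    for media, fmt in listed:
        # Prefer an explicit rtpmap, then the static table, then the raw format.
        name = rtpmap.get((media, fmt)) or STATIC_PAYLOADS.get(fmt) or fmt
        caps[media].add(name)
    return caps


def media_needing_transcoding(local, remote):
    """Return local media types with no format in common with the remote side."""
    return sorted(m for m, fmts in local.items()
                  if not fmts & remote.get(m, set()))


# Hypothetical capability descriptions for the local and remote user agents.
LOCAL_SDP = "v=0\r\nm=audio 0 RTP/AVP 0 8\r\n"
REMOTE_SDP = ("v=0\r\nm=audio 0 RTP/AVP 3\r\n"
              "m=text 0 RTP/AVP 98\r\na=rtpmap:98 t140/1000\r\n")

needed = media_needing_transcoding(parse_capabilities(LOCAL_SDP),
                                   parse_capabilities(REMOTE_SDP))
print(needed)
```

   A non-empty result suggests that the offerer may want to invoke a
   transcoder for the listed media types. Note that a comparison of
   this kind only covers user agent level incompatibilities; user level
   incompatibilities (e.g., a deaf user on an audio-capable terminal)
   would have to be learned by other means, such as a presence
   document.
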
   Invoking transcoding services from a server (T) for a session between
   two user agents (A and B) involves establishing two media sessions:
   one between A and T and another between T and B. How to invoke T's
   services (i.e., how to establish both A-T and T-B sessions) depends
   on how we model the transcoding service. We have considered two
   models for invoking a transcoding service. The first is to use third
   party call control [5], also referred to as 3pcc. The second is to
   use a (dial-in and possibly dial-out) conference bridge that
   negotiates the appropriate media parameters on each individual leg
   (i.e., A-T and T-B).

   Section 3.1 analyzes the applicability of the third party call
   control model and Section 3.2 analyzes the applicability of the
   conference bridge transcoding invocation model.

3.1 Third Party Call Control Transcoding Model

   In the 3pcc transcoding model, defined in
   (draft-camarillo-sipping-transc-3pcc), the user agent invoking the
   transcoding service has a signalling relationship with the transcoder
   and another signalling relationship with the remote user agent. There
   is no signalling relationship between the transcoder and the remote
   user agent, as shown in Figure 1.

   This model is suitable for advanced end-points that are able to
   perform third party call control. It allows end-points to invoke
   transcoding services on a stream basis. That is, the media streams
   that need transcoding are routed through the transcoder while the
   streams that do not need it are sent directly between the end-points.
   This model also allows invoking one transcoder for the sending
   direction and a different one for the receiving direction of the same
   stream.

   Invoking a transcoder in the middle of an ongoing session is also
   quite simple.
   This is useful when session changes occur (e.g., an audio session is
   upgraded to an audio/video session) and the end-points cannot cope
   with the changes (e.g., they had common audio codecs but no common
   video codecs).

   The privacy level that is achieved using 3pcc is high, since the
   transcoder does not see the signalling between both end-points. In
   this model, the transcoder only has access to the information that is
   strictly needed to perform its function.

               +-------+
               |       |
               |   T   |**
               |       |  **
               +-------+    **
                 ^   *        **
                 |   *          **
                 |   *            **
                SIP  *              **
                 |   *                **
                 |   *                  **
                 v   *                    **
               +-------+               +-------+
               |       |               |       |
               |   A   |<-----SIP----->|   B   |
               |       |               |       |
               +-------+               +-------+

                <-SIP->  Signalling
                *******  Media

              Figure 1: Third Party Call Control Model

3.2 Conference Bridge Transcoding Model

   In a centralized conference, there are a number of media streams
   between the conference server and each participant of a conference.
   For a given media type (e.g., audio) the conference server sends over
   each individual stream the media received over the rest of the
   streams, typically performing some mixing. If the capabilities of all
   the end-points participating in the conference are not the same, the
   conference server may have to send audio to different participants
   using different audio codecs.

   Consequently, we can model a transcoding service as a two-party
   conference server that may change not only the codec in use, but also
   the format of the media (e.g., audio to text).

   Using this model, T behaves as a B2BUA and the whole A-T-B session is
   established as described in (draft-camarillo-sipping-transc-b2bua).
   Figure 2 shows the signalling relationships between the end-points
   and the transcoder.

   In the conference bridge model, the end-point invoking the transcoder
   is generally involved in fewer signalling exchanges than in the 3pcc
   model.
   This may be an important feature for end-points using low-bandwidth
   or high-delay access links (e.g., some wireless accesses).

               +-------+
               |       |**
               |   T   |  **
               |       |\   **
               +-------+ \\   **
                 ^   *    \\    **
                 |   *     \\     **
                 |   *      SIP     **
                SIP  *       \\       **
                 |   *        \\        **
                 |   *         \\         **
                 v   *          \           **
               +-------+               +-------+
               |       |               |       |
               |   A   |               |   B   |
               |       |               |       |
               +-------+               +-------+

                <-SIP->  Signalling
                *******  Media

              Figure 2: Conference Bridge Control Model

   However, this model is less flexible than the 3pcc model. It is not
   possible to use different transcoders for different streams or for
   different directions of a stream.

   Invoking a transcoder in the middle of an ongoing session or changing
   from one transcoder to another requires the remote end-point to
   support the Replaces [6] extension. At present, not many user agents
   support it.

   Simple end-points that cannot perform 3pcc and thus cannot use the
   3pcc model, of course, need to use the conference bridge model.

4 Security Considerations

   This document does not introduce any new security considerations.

5 Contributors

   This document is the result of discussions amongst the conferencing
   design team. The members of this team include Eric Burger, Henning
   Schulzrinne and Arnoud van Wijk.

6 Authors' Addresses

   Gonzalo Camarillo
   Ericsson
   Advanced Signalling Research Lab.
   FIN-02420 Jorvas
   Finland
   electronic mail: Gonzalo.Camarillo@ericsson.com

7 Bibliography

   [1] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. R. Johnston, J.
   Peterson, R. Sparks, M. Handley, and E. Schooler, "SIP: session
   initiation protocol," RFC 3261, Internet Engineering Task Force, June
   2002.

   [2] N. Charlton, M. Gasson, G. Gybels, M. Spanner, and A.
   van Wijk, "User requirements for the session initiation protocol
   (SIP) in support of deaf, hard of hearing and speech-impaired
   individuals," RFC 3351, Internet Engineering Task Force, Aug. 2002.

   [3] S. Floyd and L. Daigle, "IAB architectural and policy
   considerations for open pluggable edge services," RFC 3238, Internet
   Engineering Task Force, Jan. 2002.

   [4] J. Rosenberg and H. Schulzrinne, "An offer/answer model with
   session description protocol (SDP)," RFC 3264, Internet Engineering
   Task Force, June 2002.

   [5] J. Rosenberg, J. L. Peterson, H. Schulzrinne, and G. Camarillo,
   "Best current practices for third party call control in the session
   initiation protocol," internet draft, Internet Engineering Task
   Force, July 2003. Work in progress.

   [6] B. Biggs, R. W. Dean, and R. Mahy, "The session initiation
   protocol (SIP) "Replaces" header," internet draft, Internet
   Engineering Task Force, Aug. 2003. Work in progress.

   The IETF takes no position regarding the validity or scope of any
   intellectual property or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; neither does it represent that it
   has made any effort to identify any such rights. Information on the
   IETF's procedures with respect to rights in standards-track and
   standards-related documentation can be found in BCP-11. Copies of
   claims of rights made available for publication and any assurances of
   licenses to be made available, or the result of an attempt made to
   obtain a general license or permission for the use of such
   proprietary rights by implementors or users of this specification can
   be obtained from the IETF Secretariat.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights which may cover technology that may be required to practice
   this standard. Please address the information to the IETF Executive
   Director.

Full Copyright Statement

   Copyright (c) The Internet Society (2003). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.