idnits 2.17.1 draft-gellens-negotiating-human-language-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (February 24, 2013) is 4077 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'I-D.iab-privacy-considerations' is defined on line 380, but no explicit reference was found in the text == Unused Reference: 'I-D.saintandre-sip-xmpp-chat' is defined on line 386, but no explicit reference was found in the text ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866) == Outdated reference: A later version (-09) exists of draft-iab-privacy-considerations-03 == Outdated reference: A later version (-06) exists of draft-saintandre-sip-xmpp-chat-04 -- Obsolete informational reference (is this intentional?): RFC 3066 (Obsoleted by RFC 4646, RFC 4647) Summary: 1 error (**), 0 flaws (~~), 5 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 MMUSIC Working Group R. Gellens 3 Internet-Draft Qualcomm Technologies, Inc. 4 Intended status: Standards Track February 24, 2013 5 Expires: August 26, 2013 7 Negotiating Human Language Using SDP 8 draft-gellens-negotiating-human-language-02 10 Abstract 12 Users have various human (natural) language needs, abilities, and 13 preferences regarding spoken, written, and signed languages. When 14 establishing interactive communication "calls" there needs to be a 15 way to communicate and ideally match (i.e., negotiate) the caller's 16 language needs, abilities, and preferences with the capabilities of 17 the called party. This is especially important with emergency 18 calling, where a call can be routed to a PSAP or call taker capable 19 of communicating with the user, or a translator or relay operator can 20 be bridged into the call during setup, but this applies to non- 21 emergency calls as well (as an example, when calling an airline 22 reservation desk). 24 This document describes the need and expected use, and discusses the 25 solution using either an existing or new SDP attribute. 27 Status of this Memo 29 This Internet-Draft is submitted in full conformance with the 30 provisions of BCP 78 and BCP 79. 32 Internet-Drafts are working documents of the Internet Engineering 33 Task Force (IETF). Note that other groups may also distribute 34 working documents as Internet-Drafts. The list of current Internet- 35 Drafts is at http://datatracker.ietf.org/drafts/current/. 37 Internet-Drafts are draft documents valid for a maximum of six months 38 and may be updated, replaced, or obsoleted by other documents at any 39 time. It is inappropriate to use Internet-Drafts as reference 40 material or to cite them other than as "work in progress." 42 This Internet-Draft will expire on August 26, 2013. 
44 Copyright Notice 46 Copyright (c) 2013 IETF Trust and the persons identified as the 47 document authors. All rights reserved. 49 This document is subject to BCP 78 and the IETF Trust's Legal 50 Provisions Relating to IETF Documents (http://trustee.ietf.org/ 51 license-info) in effect on the date of publication of this document. 52 Please review these documents carefully, as they describe your rights 53 and restrictions with respect to this document. Code Components 54 extracted from this document must include Simplified BSD License text 55 as described in Section 4.e of the Trust Legal Provisions and are 56 provided without warranty as described in the Simplified BSD License. 58 Table of Contents 60 1. Introduction 61 2. Terminology 62 3. Expected Use 63 4. Desired Semantics 64 5. Proposed Solution 65 5.1. Possibility: Re-Use existing 'lang' attribute 66 5.2. Possibility: Define new 'humintlang' attribute 67 6. Silly States 68 7. IANA Considerations 69 8. Security Considerations 70 9. Changes from Previous Versions 71 9.1. Changes from -00 to -01 72 9.2. Changes from -01 to -02 73 10. Acknowledgments 74 11. References 75 11.1. Normative References 76 11.2. Informational References 77 Author's Address 79 1.
Introduction 81 When setting up interactive communication sessions, human (natural) 82 language negotiation is needed in some cases. When the caller and 83 callee are known to each other or where context implies language, 84 such language negotiation may not be needed. In other cases, there 85 is a need for the caller to indicate language preferences, abilities, 86 or needs, including specific spoken, signed, or written languages. 87 This need exists when setting up SIP or other sessions (including 88 emergency and non-emergency calling). For various reasons, including 89 the ability to establish multiple streams each using a different 90 media (e.g., voice, text, video), it makes sense to use a per-stream 91 negotiation mechanism, using SDP. 93 This approach has a number of benefits, including that it is generic 94 and not limited to emergency calls. In some cases such a facility 95 isn't needed, because the language is known from the context (such as 96 when a caller places a call to a sign language relay center). But it 97 seems clearly useful in many other cases. For example, it seems 98 generally useful that someone calling a company call center be able 99 to indicate if a specific sign and/or spoken language is needed. The 100 UE would need to set this, but could default to the language used for 101 the interface with the user. 103 Including the user's human (natural) language requirements in the 104 session establishment negotiation is independent of the use of a 105 relay service and is transparent to a voice service provider. For 106 example, assume a user within the United States who speaks Spanish 107 but not English places a voice call using an IMS device. It doesn't 108 matter if the call is an emergency call or not (e.g., to an airline 109 reservation desk). The language information is transparent to the 110 IMS carrier, but is part of the session negotiation between the UE 111 and the terminating entity. 
In the case of a call to e.g., an 112 airline, the call can be automatically routed to a Spanish-speaking 113 agent. In the case of an emergency call, the ESInet and the PSAP may 114 choose to take the language into account when determining how to 115 route and process the call (e.g., language and media needs may be 116 considered within policy-based routing). 118 By treating language as another attribute that is negotiated along 119 with other aspects of a media stream, it becomes possible to 120 accommodate a wide range of users' needs and called party facilities. 121 For example, some users may be able to speak several languages, but 122 have a preference. Some called parties may support some of those 123 languages internally but require the use of a translation service for 124 others, or may have a limited number of call takers able to use 125 certain languages. Another example would be a user who is able to 126 speak but is deaf or hard-of-hearing and requires a voice stream plus 127 a text stream (known as voice carry over). Making language a media 128 attribute allows the standard session negotiation mechanism to handle 129 this by providing the information and mechanism for the endpoints to 130 make appropriate decisions. 132 Regarding relay services, in the case of an emergency call requiring 133 sign language such as ASL, there are two common approaches: the 134 caller initiates the call to a relay center, or the caller places the 135 call to emergency services (e.g., 911 or 112). In the former case, 136 the language need is ancillary and supplemental. In the latter case, 137 the ESInet and/or PSAP may take the need for sign language into 138 account and bridge in a relay center. In this case, the ESInet and 139 PSAP have all the standard information available (such as location) 140 but are able to bridge the relay sooner in the call processing. 
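The scenario described earlier in this section (a user who speaks several languages but has a preference, and a called party that supports only some of them) can be illustrated with a minimal, non-normative sketch in Python. It assumes the caller's languages are listed most preferred first and that the called party simply has a set of supported languages. The function name select_language and the case-insensitive exact comparison are illustrative assumptions and are not defined by this document; real matching of language tags is richer than this.

```python
from typing import Iterable, Optional, Sequence


def select_language(caller_prefs: Sequence[str],
                    callee_supported: Iterable[str]) -> Optional[str]:
    """Return the caller's most-preferred language that the callee supports.

    caller_prefs is ordered from most preferred to least preferred.
    Returns None when the two sides have no language in common.
    """
    # Assumption: compare tags case-insensitively and by exact match only.
    supported = {tag.lower() for tag in callee_supported}
    for tag in caller_prefs:
        if tag.lower() in supported:
            return tag
    return None


# A caller who prefers Spanish but can also use English, calling a
# call center that handles English, French, and Spanish.
caller = ["es", "en"]
call_center = {"en", "fr", "es"}
print(select_language(caller, call_center))  # prints "es"
```

When no common language is found (the function returns None), the called party would presumably fall back on a policy decision such as bridging a translator or relay service; that decision is outside the scope of this sketch.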
142 By making this facility part of the end-to-end negotiation, the 143 question of which entity provides or engages the relay service 144 becomes separate from the call processing mechanics; if the caller 145 directs the call to a relay service then the human language facility 146 provides extra information to the relay service but calls will still 147 function without it; if the caller directs the call to emergency 148 services, then the ESInet/PSAP are able to take the user's human 149 language needs into account, e.g., by routing to a particular PSAP or 150 call taker or bridging a relay service or translator. 152 The term "negotiation" is used here rather than "indication" because 153 human language (spoken/written/signed) is something that can be 154 negotiated in the same way as which forms of media (audio/text/video) 155 or which codecs. For example, if we think of non-emergency calls, 156 such as a user calling an airline reservation center, the user may 157 have a set of languages he or she speaks, with perhaps preferences 158 for one or a few, while the airline reservation center will support a 159 fixed set of languages. Negotiation should select whichever language 160 supported by the call center is most preferred by the user. Both 161 sides should be aware of which language was negotiated. This is 162 conceptually similar to the way other aspects of each media stream 163 are negotiated using SDP (e.g., media type and codecs). 165 2. Terminology 167 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 168 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 169 document are to be interpreted as described in RFC 2119 [RFC2119]. 171 3. Expected Use 173 This facility is expected to be used by NENA and 3GPP. 
NENA is 174 likely to reference it in NENA 08-01 (i3 Stage 3) in describing 175 attributes of calls presented to an ESInet, and in that or other 176 documents describing Policy-Based Routing capabilities within a 177 Policy-Based Routing Function (PRF). 3GPP is expected to reference 178 this mechanism in general call handling and emergency call handling. 179 Recent CRs introduced in SA1 have anticipated this functionality 180 being provided within SDP. 182 4. Desired Semantics 183 The desired solution is a media attribute that may be used within an 184 offer to indicate the preferred language of each media stream, and 185 within an answer to indicate the accepted language. The semantics of 186 including multiple values for a media stream within an offer is that 187 the languages are listed in order of preference. 189 (While it is true that a conversation among multilingual people often 190 involves multiple languages, it does not seem useful enough as a 191 general facility to warrant complicating the desired semantics of the 192 SDP attribute to allow negotiation of multiple simultaneous languages 193 within an interactive media stream.) 195 5. Proposed Solution 197 An SDP attribute seems the natural choice to negotiate the human 198 (natural) language of an interactive media stream. The attribute 199 value should be a language tag from RFC 5646 [RFC5646] or the IANA 200 registry [IANA-lang-tags]. 202 5.1. Possibility: Re-Use existing 'lang' attribute 204 RFC 4566 specifies an attribute 'lang' which sounds similar to what 205 is needed here, the difference being that it specifies that 'a=lang' 206 is declarative with the semantics of multiple 'lang' attributes being 207 that all of them are used, while we want a means to negotiate which 208 one is used in each stream.
This difference means that either the 209 existing 'lang' attribute can't be used and we need to define a new 210 attribute, or we finesse/update the semantics of 'lang' such that the 211 existing semantics apply to non-interactive streams (multiple 'lang' 212 values mean all are used), while for interactive streams, one is 213 used (or possibly the author of this memo has misunderstood RFC 214 4566). 216 The text from RFC 4566 [RFC4566] is: 218 a=lang:<language tag> 220 This can be a session-level attribute or a media-level attribute. 221 As a session-level attribute, it specifies the default language 222 for the session being described. As a media-level attribute, it 223 specifies the language for that media, overriding any session- 224 level language specified. Multiple lang attributes can be 225 provided either at session or media level if the session 226 description or media use multiple languages, in which case the 227 order of the attributes indicates the order of importance of the 228 various languages in the session or media from most important to 229 least important. 231 The "lang" attribute value must be a single [RFC3066] language tag 232 in US-ASCII [RFC3066]. It is not dependent on the charset 233 attribute. A "lang" attribute SHOULD be specified when a session 234 is of sufficient scope to cross geographic boundaries where the 235 language of recipients cannot be assumed, or where the session is 236 in a different language from the locally assumed norm. 238 The question is: Can the 'lang' attribute be used for our purposes? 239 Using it to negotiate the language for a media seems at first glance 240 to violate its semantics as defined in RFC 4566 [RFC4566]. But there 241 are existing examples of it being used in exactly the way we need.
242 For example, draft-saintandre-sip-xmpp-chat-04 [I-D.saintandre-sip- 243 xmpp-chat] contains an example where the initial invitation contains 244 two 'a=lang' entries for a media stream (for English and Italian) and 245 the OK accepts one of them (Italian), which matches what we need: 247 Example: (F1) SIP user starts the session 249 INVITE sip:juliet@example.com SIP/2.0 250 To: 251 From: ;tag=576 252 Subject: Open chat with Romeo? 253 Call-ID: 742507no 254 Content-Type: application/sdp 256 c=IN IP4 s2x.example.net 257 m=message 7313 TCP/MSRP * 258 a=accept-types:text/plain 259 a=lang:en 260 a=lang:it 261 a=path:msrp://s2x.example.net:7313/ansp71weztas;tcp 263 Example: (F2) Gateway accepts session on Juliet's behalf 265 SIP/2.0 200 OK 266 To: ;tag=534 267 From: ;tag=576 268 Call-ID: 742507no 269 Content-Type: application/sdp 271 c=IN IP4 x2s.example.com 272 m=message 8763 TCP/MSRP * 273 a=accept-types:text/plain 274 a=lang:it 275 a=path:msrp://x2s.example.com:8763/lkjh37s2s20w2a;tcp 277 To re-use the existing 'lang' attribute, we'd update its registration 278 to specify that for non-interactive media, multiple 'lang' values in 279 an offer have the existing RFC 4566 [RFC4566] semantics (all 280 languages are used in the media), while for interactive media 281 streams, one of the values should be selected in the answer and that 282 language used in the media stream. 284 5.2. Possibility: Define new 'humintlang' attribute 286 Instead of re-using 'lang', we may define a new media-level attribute 287 'humintlang' (short for "human interactive language") to negotiate 288 which human language is used in each (interactive) media stream: 290 a=humintlang:<language tag> 292 This is a media-level attribute. In an offer, it specifies the 293 desired language(s) for the media.
Multiple "humintlang" 294 attributes can be provided in an offer for a media stream, in 295 which case the order of the attributes indicates the order of 296 preference of the various languages from most preferred to least 297 preferred. When the "humintlang" attribute appears within an 298 answer, it indicates the accepted language for the media. 300 The "humintlang" attribute value MUST be a language tag per RFC 301 5646 [RFC5646]. A "humintlang" attribute SHOULD be specified for 302 each media stream in an offer when placing an emergency call (to 303 avoid ambiguity) and in any other case where the language cannot 304 be assumed from context. 306 When an offer includes media with one or more language tags, each 307 accepted media in the answer MUST include one of the language tags 308 offered for the media. BCP 47 (which includes RFC 5646) describes mechanisms for matching 309 language tags. 311 6. Silly States 313 It's possible to specify a "silly state" where the language specified 314 does not make sense for the media type, such as specifying a signed 315 language for an audio media stream. 317 An offer MUST NOT be created where the language does not make sense 318 for the media type. If such an offer is received, the receiver MAY 319 reject the media, ignore the language specified, or attempt to 320 interpret the intent (e.g., if American Sign Language is specified 321 for an audio media stream, this might be interpreted as a desire to 322 use spoken English). 324 7. IANA Considerations 326 TBD. 328 8. Security Considerations 330 TBD. 332 9. Changes from Previous Versions 334 9.1. Changes from -00 to -01 335 o Changed name of (possible) new attribute from "humlang" to 336 "humintlang" 338 o Added discussion of silly state (language not appropriate for 339 media type) 341 o Added Voice Carry Over example 343 o Added mention of multilingual people and multiple languages 345 o Minor text clarifications 347 9.2.
Changes from -01 to -02 349 o Updated text for (possible) new attribute "humintlang" to 350 reference RFC 5646 352 o Added clarifying text for (possible) re-use of existing 'lang' 353 attribute saying that the registration would be updated to reflect 354 different semantics for multiple values for interactive versus 355 non-interactive media. 357 o Added clarifying text for (possible) new attribute "humintlang" to 358 attempt to better describe the role of language tags in media in 359 an offer and an answer. 361 10. Acknowledgments 363 Many thanks to Doug Ewell for his review and corrections/suggestions. 365 11. References 367 11.1. Normative References 369 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 370 Requirement Levels", BCP 14, RFC 2119, March 1997. 372 [RFC4566] Handley, M., Jacobson, V. and C. Perkins, "SDP: Session 373 Description Protocol", RFC 4566, July 2006. 375 [RFC5646] Phillips, A. and M. Davis, "Tags for Identifying 376 Languages", BCP 47, RFC 5646, September 2009. 378 11.2. Informational References 380 [I-D.iab-privacy-considerations] 381 Cooper, A., Tschofenig, H., Aboba, B., Peterson, J., 382 Morris, J., Hansen, M. and R. Smith, "Privacy 383 Considerations for Internet Protocols", Internet-Draft 384 draft-iab-privacy-considerations-03, July 2012. 386 [I-D.saintandre-sip-xmpp-chat] 387 Saint-Andre, P., Gavita, E., Hossain, N. and S. Loreto, 388 "Interworking between the Session Initiation Protocol 389 (SIP) and the Extensible Messaging and Presence Protocol 390 (XMPP): One-to-One Text Chat", Internet-Draft draft- 391 saintandre-sip-xmpp-chat-04, October 2012. 393 [IANA-lang-tags] 394 "IANA Language Subtag Registry". 397 [RFC3066] Alvestrand, H., "Tags for the Identification of 398 Languages", RFC 3066, January 2001. 400 Author's Address 402 Randall Gellens 403 Qualcomm Technologies, Inc. 404 5775 Morehouse Drive 405 San Diego, CA 92121 406 US 408 Email: rg+ietf@qti.qualcomm.com
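Illustrative sketch (non-normative)

As an illustration of the offer/answer behavior proposed for 'humintlang' in Section 5.2 and of the "silly state" handling in Section 6, the following Python sketch shows one way an answerer might pick a language for a single media section. It is a sketch under stated assumptions, not a definitive implementation: the helper names, the tiny table of signed-language tags, the port and payload values in the sample offer, and the exact-match comparison are all hypothetical, and of the three options that Section 6 allows for a silly state, this sketch arbitrarily chooses to ignore the offending language.

```python
from typing import List, Optional, Set

# Hypothetical, deliberately tiny table of signed-language tags:
# "ase" is American Sign Language, "bfi" is British Sign Language.
SIGNED_LANGUAGES = {"ase", "bfi"}

PREFIX = "a=humintlang:"


def media_type(media_lines: List[str]) -> Optional[str]:
    """Return the media type (e.g. "audio") from the m= line, if any."""
    for line in media_lines:
        if line.startswith("m="):
            return line[2:].split()[0]
    return None


def offered_languages(media_lines: List[str]) -> List[str]:
    """Collect humintlang values in offer order (most preferred first)."""
    return [line[len(PREFIX):].strip()
            for line in media_lines if line.startswith(PREFIX)]


def is_silly_state(media: Optional[str], tag: str) -> bool:
    """True when a signed language is requested on an audio stream."""
    return media == "audio" and tag.lower() in SIGNED_LANGUAGES


def answer_language(media_lines: List[str],
                    supported: Set[str]) -> Optional[str]:
    """Return the a=humintlang line for the answer, or None if no match."""
    media = media_type(media_lines)
    for tag in offered_languages(media_lines):
        if is_silly_state(media, tag):
            continue  # assumption: ignore the language rather than reject
        if tag.lower() in supported:
            return PREFIX + tag
    return None


offer = [
    "m=audio 49170 RTP/AVP 0",
    "a=humintlang:ase",   # silly state: signed language on audio
    "a=humintlang:es",
    "a=humintlang:en",
]
print(answer_language(offer, {"en", "es"}))  # prints "a=humintlang:es"
```

The answer carries exactly one of the offered tags, which matches the requirement in Section 5.2 that each accepted media in the answer include one of the language tags offered for that media.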