idnits 2.17.1

draft-ietf-slim-negotiating-human-language-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The exact meaning of the all-uppercase expression 'NOT REQUIRED' is not
     defined in RFC 2119.  If it is intended as a requirements expression, it
     should be rewritten using one of the combinations defined in RFC 2119;
     otherwise it should not be all-uppercase.

  -- The document date (March 20, 2016) is 2959 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'I-D.iab-privacy-considerations' is defined on line
     603, but no explicit reference was found in the text

  == Unused Reference: 'I-D.saintandre-sip-xmpp-chat' is defined on line 609,
     but no explicit reference was found in the text

  ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866)

  -- Obsolete informational reference (is this intentional?): RFC 3066
     (Obsoleted by RFC 4646, RFC 4647)

     Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

Network Working Group                                         R. Gellens
Internet-Draft
Intended status: Standards Track                          March 20, 2016
Expires: September 21, 2016

         Negotiating Human Language in Real-Time Communications
             draft-ietf-slim-negotiating-human-language-01

Abstract

   Users have various human (natural) language needs, abilities, and
   preferences regarding spoken, written, and signed languages.  When
   establishing interactive communication ("calls") there needs to be a
   way to negotiate (communicate and match) the caller's language and
   media needs with the capabilities of the called party.  This is
   especially important with emergency calls, where a call can be
   handled by a call taker capable of communicating with the user, or a
   translator or relay operator can be bridged into the call during
   setup, but this applies to non-emergency calls as well (as an
   example, when calling a company call center).

   This document describes the need and a solution using new SDP stream
   attributes.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 21, 2016.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Expected Use
   4.  Desired Semantics
   5.  The existing 'lang' attribute
   6.  Proposed Solution
     6.1.  Rationale
     6.2.  New 'humintlang-send' and 'humintlang-recv' attributes
     6.3.  Advisory vs Required
     6.4.  Silly States
   7.  IANA Considerations
   8.  Security Considerations
   9.  Changes from Previous Versions
     9.1.  Changes from draft-ietf-slim-...-00 to draft-ietf-slim-...-01
     9.2.  Changes from draft-gellens-slim-...-03 to draft-ietf-slim-...-00
     9.3.  Changes from draft-gellens-slim-...-02 to
           draft-gellens-slim-...-03
     9.4.  Changes from draft-gellens-slim-...-01 to
           draft-gellens-slim-...-02
     9.5.  Changes from draft-gellens-slim-...-00 to
           draft-gellens-slim-...-01
     9.6.  Changes from draft-gellens-mmusic-...-02 to
           draft-gellens-slim-...-00
     9.7.  Changes from draft-gellens-mmusic-...-01 to -02
     9.8.  Changes from draft-gellens-mmusic-...-00 to -01
     9.9.  Changes from draft-gellens-...-02 to draft-gellens-mmusic-...-00
     9.10. Changes from draft-gellens-...-01 to -02
     9.11. Changes from draft-gellens-...-00 to -01
   10. Contributors
   11. Acknowledgments
   12. References
     12.1.  Normative References
     12.2.  Informational References
   Appendix A.  Historic Alternative Proposal: Caller-prefs
     A.1.  Use of Caller Preferences Without Additions
     A.2.  Additional Caller Preferences for Asymmetric Needs
       A.2.1.  Caller Preferences for Asymmetric Modality Needs
       A.2.2.  Caller Preferences for Asymmetric Language Tags
   Author's Address

1.  Introduction

   A mutually comprehensible language is helpful for human
   communication.  This document addresses the real-time, interactive
   side of the issue.  A companion document on language selection in
   email [draft-tomkinson-multilangcontent] addresses the non-real-time
   side.

   When setting up interactive communication sessions (using SIP or
   other protocols), human (natural) language and media modality (voice,
   video, text) negotiation may be needed.
   Unless the caller and callee
   know each other or there is contextual or out of band information
   from which the language(s) and media modalities can be determined,
   there is a need for spoken, signed, or written languages to be
   negotiated based on the caller's needs and the callee's capabilities.
   This need applies to both emergency and non-emergency calls.  For
   various reasons, including the ability to establish multiple streams
   using different media (e.g., voice, text, video), it makes sense to
   use a per-stream negotiation mechanism, in this case, SDP.

   This approach has a number of benefits, including that it is generic
   (applies to all interactive communications negotiated using SDP) and
   not limited to emergency calls.  In some cases such a facility isn't
   needed, because the language is known from the context (such as when
   a caller places a call to a sign language relay center, to a friend,
   or colleague).  But it is clearly useful in many other cases.  For
   example, someone calling a company call center or a Public Safety
   Answering Point (PSAP) should be able to indicate if one or more
   specific signed, written, and/or spoken languages are preferred, the
   callee should be able to indicate its capabilities in this area, and
   the call should proceed using in-common language(s) and media forms.

   Since this is a protocol mechanism, the user equipment (UE client)
   needs to know the user's preferred languages; a reasonable technique
   could include a configuration mechanism with a default of the
   language of the user interface.  In some cases, a UE could tie
   language and media preferences, such as a preference for a video
   stream using a signed language and/or a text or audio stream using a
   written/spoken language.

   Including the user's human (natural) language preferences in the
   session establishment negotiation is independent of the use of a
   relay service and is transparent to a voice service provider.  For
   example, assume a user within the United States who speaks Spanish
   but not English places a voice call using an IMS device.  It doesn't
   matter if the call is an emergency call or not (e.g., to an airline
   reservation desk).  The language information is transparent to the
   IMS carrier, but is part of the session negotiation between the UE
   and the terminating entity.  In the case of a call to e.g., an
   airline, the call can be automatically handled by a Spanish-speaking
   agent.  In the case of an emergency call, the Emergency Services IP
   network (ESInet) and the PSAP may choose to take the language and
   media preferences into account when determining how to process the
   call.

   By treating language as another attribute that is negotiated along
   with other aspects of a media stream, it becomes possible to
   accommodate a range of users' needs and called party facilities.  For
   example, some users may be able to speak several languages, but have
   a preference.  Some called parties may support some of those
   languages internally but require the use of a translation service for
   others, or may have a limited number of call takers able to use
   certain languages.  Another example would be a user who is able to
   speak but is deaf or hard-of-hearing and requires a voice stream plus
   a text stream (known as voice carry over).  Making language a media
   attribute allows the standard session negotiation mechanism to handle
   this by providing the information and mechanism for the endpoints to
   make appropriate decisions.

   Regarding relay services, in the case of an emergency call requiring
   sign language such as ASL, there are two common approaches: the
   caller initiates the call to a relay center, or the caller places the
   call to emergency services (e.g., 911 in the U.S. or 112 in Europe).
   In the former case, the language need is ancillary and supplemental.
   In the latter case, the ESInet and/or PSAP may take the need for sign
   language into account and bridge in a relay center.  In this case,
   the ESInet and PSAP have all the standard information available (such
   as location) but are able to bridge the relay sooner in the call
   processing.

   By making this facility part of the end-to-end negotiation, the
   question of which entity provides or engages the relay service
   becomes separate from the call processing mechanics; if the caller
   directs the call to a relay service then the human language
   negotiation facility provides extra information to the relay service
   but calls will still function without it; if the caller directs the
   call to emergency services, then the ESInet/PSAP are able to take the
   user's human language needs into account, e.g., by assigning to a
   specific queue or call taker or bridging in a relay service or
   translator.

   The term "negotiation" is used here rather than "indication" because
   human language (spoken/written/signed) is something that can be
   negotiated in the same way as which forms of media (audio/text/video)
   or which codecs.  For example, if we think of non-emergency calls,
   such as a user calling an airline reservation center, the user may
   have a set of languages he or she speaks, with perhaps preferences
   for one or a few, while the airline reservation center will support a
   fixed set of languages.  Negotiation should select the user's most
   preferred language that is supported by the call center.
   Both sides
   should be aware of which language was negotiated.  This is
   conceptually similar to the way other aspects of each media stream
   are negotiated using SDP (e.g., media type and codecs).

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  Expected Use

   This facility may be used by NENA and 3GPP.  NENA has already
   referenced it in NENA 08-01 (i3 Stage 3 version 2) in describing
   attributes of calls presented to an ESInet, and may add further
   details in that or other documents. 3GPP may reference this
   mechanism in general call handling and emergency call handling.  Some
   CRs introduced in SA1 have anticipated this functionality being
   provided within SDP.

4.  Desired Semantics

   The desired solution is a media attribute that may be used within an
   offer to indicate the preferred language of each media stream, and
   within an answer to indicate the accepted language.  The semantics of
   including multiple values for a media stream within an offer is that
   the languages are listed in order of preference.

   (While it is true that a conversation among multilingual people often
   involves multiple languages, the usefulness of providing a way to
   negotiate this as a general facility is outweighed by the complexity
   of the desired semantics of the SDP attribute to allow negotiation of
   multiple simultaneous languages within an interactive media stream.)

5.  The existing 'lang' attribute

   RFC 4566 [RFC4566] specifies an attribute 'lang' which appears
   similar to what is needed here.
   It specifies that 'a=lang' is
   declarative; multiple 'lang' attributes indicate that the "media use
   multiple languages", and that "the order of the attributes indicates
   the order of importance of the various languages in the ... media"
   (we interpret this to mean that the media contains all of the
   languages indicated, for example, a video of an interview by a person
   speaking one language of a person speaking another, with subtitles in
   the language of the interviewer; this would list first the language
   of the interviewer and second the language of the person being
   interviewed).  We need a means to negotiate which language is used in
   each stream.  This difference means that the existing 'lang'
   attribute can't be used and we need to define a new attribute.

   The text from RFC 4566 [RFC4566] is:

      a=lang:
      This can be a session-level attribute or a media-level attribute.
      As a session-level attribute, it specifies the default language
      for the session being described.  As a media-level attribute, it
      specifies the language for that media, overriding any session-
      level language specified.  Multiple lang attributes can be
      provided either at session or media level if the session
      description or media use multiple languages, in which case the
      order of the attributes indicates the order of importance of the
      various languages in the session or media from most important to
      least important.
      The "lang" attribute value must be a single [RFC3066] language tag
      in US-ASCII [RFC3066].  It is not dependent on the charset
      attribute.  A "lang" attribute SHOULD be specified when a session
      is of sufficient scope to cross geographic boundaries where the
      language of recipients cannot be assumed, or where the session is
      in a different language from the locally assumed norm.

   A recent search of RFCs and Internet Drafts turned up only one use of
   the 'lang' attribute (in a now-expired draft), and that sole use was
   coincidentally in exactly the way we need (erroneously assuming that
   the attribute was used for negotiation).  The sole use was in an
   example in a draft not directly related to language, where the
   initial invitation contains two 'a=lang' entries for a media stream
   (for English and Italian) and the OK accepts one of them (Italian).

   The example serves as evidence of the need for an SDP attribute with
   the semantics as described in this document; unfortunately, the
   existing 'lang' attribute is not it.

6.  Proposed Solution

   An SDP attribute seems the natural choice to negotiate human
   (natural) language of an interactive media stream.  The attribute
   value should be a language tag per RFC 5646 [RFC5646].

6.1.  Rationale

   The decision to base the proposal at the media negotiation level, and
   specifically to use SDP, came after significant debate and
   discussion.  From an engineering standpoint, it is possible to meet
   the objectives using a variety of mechanisms, but none are perfect.
   None of the proposed alternatives was clearly better technically in
   enough ways to win over proponents of the others, and none were
   clearly so bad technically as to be easily rejected.  As is often the
   case in engineering, choosing the solution is a matter of balancing
   trade-offs, and ultimately more a matter of taste than technical
   merit.  The two main proposals were to use SDP and SIP.  SDP has the
   advantage that the language is negotiated with the media to which it
   applies, while SIP has the issue that the languages expressed may not
   match the SDP media negotiated (for example, a session could
   negotiate video at the SIP level but fail to negotiate any video
   media stream at the SDP layer).

   The mechanism described here for SDP can be adapted to media
   negotiation protocols other than SDP.

6.2.  New 'humintlang-send' and 'humintlang-recv' attributes

   Rather than re-use 'lang' we define two new media-level attributes
   starting with 'humintlang' (short for "human interactive language")
   to negotiate which human language is used in each (interactive) media
   stream.  There are two attributes, one ending in "-send" and the
   other in "-recv" to indicate the language used when sending and
   receiving media:

      a=humintlang-send:
      a=humintlang-recv:

   Each can appear multiple times in an offer for a media stream.

   In an offer, the 'humintlang-send' values constitute a list in
   preference order (first is most preferred) of the languages the
   offerer wishes to send using the media, and the 'humintlang-recv'
   values constitute a list in preference order of the languages the
   offerer wishes to receive using the media.  In cases where the user
   wishes to use one media for sending and another for receiving (such
   as a speech-impaired user who wishes to send using text and receive
   using audio), one of the two MAY be unset.  In cases where a media is
   not primarily intended for language (for example, a video or audio
   stream intended for background only) both SHOULD be unset.  In other
   cases, both SHOULD have the same values in the same order.  The two
   SHOULD NOT be set to languages which are difficult to match together
   (e.g., specifying a desire to send audio in Hungarian and receive
   audio in Portuguese will make it difficult to successfully complete
   the call).
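
   As an illustration only, the offer semantics above might be sketched
   in Python as follows.  The helper name humintlang_offer_lines and the
   example tags are assumptions made for this sketch and are not defined
   by this document; the sketch also assumes that each language tag
   directly follows the colon of the attribute name.

```python
def humintlang_offer_lines(send_langs, recv_langs):
    """Return the humintlang attribute lines for one media stream of an offer.

    Hypothetical helper: each argument is a list of language tags in
    preference order (most preferred first). An empty list leaves that
    direction unset for the stream.
    """
    lines = [f"a=humintlang-send:{tag}" for tag in send_langs]
    lines += [f"a=humintlang-recv:{tag}" for tag in recv_langs]
    return lines


# Symmetric case: an audio stream, Spanish preferred over English,
# with the same values in the same order for both directions.
audio_lines = humintlang_offer_lines(["es", "en"], ["es", "en"])

# Asymmetric case: the user sends using a text stream and receives
# using an audio stream, so one direction is left unset on each.
text_lines = humintlang_offer_lines(["en"], [])
speech_lines = humintlang_offer_lines([], ["en"])

print("\n".join(audio_lines))
```

   A stream not primarily intended for language (such as background
   video) would simply get no lines from such a helper.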

   In an answer, 'humintlang-send' is the accepted language the answerer
   will send (which in most cases is one of the languages in the offer's
   'humintlang-recv'), and 'humintlang-recv' is the accepted language
   the answerer expects to receive (which in most cases is one of the
   languages in the offer's 'humintlang-send').

   Each value MUST be a language tag per RFC 5646 [RFC5646].  RFC 5646
   describes mechanisms for matching language tags.  While RFC 5646
   provides a mechanism accommodating increasingly fine-grained
   distinctions, in the interest of maximum interoperability for real-
   time interactive communications, each 'humintlang-send' and
   'humintlang-recv' value SHOULD be restricted to the largest
   granularity of language tags; in other words, it is RECOMMENDED to
   specify only a Primary-subtag and NOT to include subtags (e.g., for
   region or dialect) unless the languages might be mutually
   incomprehensible without them.

   In an offer, each language tag value MAY have an asterisk appended as
   the last character (after the registry value).  The asterisk
   indicates a request by the caller to not fail the call if there is no
   language in common.  See Section 6.3 for more information and
   discussion.

   When placing an emergency call, and in any other case where the
   language cannot be assumed from context, each media stream in an
   offer primarily intended for human language communication SHOULD
   specify one or both 'humintlang-send' and 'humintlang-recv'
   attributes (to avoid ambiguity).

   Note that while signed language tags are used with a video stream to
   indicate sign language, a spoken language tag for a video stream in
   parallel with an audio stream with the same spoken language tag
   indicates a request for a supplemental video stream to see the
   speaker.
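
   One way an answerer might act on these rules is sketched below in
   Python.  This is a non-normative illustration: the function names are
   hypothetical, and comparing only the primary subtag is a simplifying
   assumption made here, not the matching procedure of RFC 5646.  The
   second return value of select_language only reports what the caller
   requested; the answerer remains free to decide otherwise.

```python
def split_advisory(value):
    """Split an offered value into (tag, asterisk_present)."""
    if value.endswith("*"):
        return value[:-1], True
    return value, False


def primary_subtag(tag):
    """Return the primary subtag, lowercased (tags are case-insensitive)."""
    return tag.split("-", 1)[0].lower()


def select_language(offered_values, supported_tags):
    """Pick the first offered language the answerer supports.

    offered_values are in the offerer's preference order. Returns
    (chosen_tag_or_None, caller_accepts_outcome), where the flag is True
    if a language matched or if any offered value carried the asterisk
    asking that the call not fail. Matching on the primary subtag only
    is an assumption of this sketch.
    """
    supported = {primary_subtag(tag) for tag in supported_tags}
    asterisk_seen = False
    chosen = None
    for value in offered_values:
        tag, flagged = split_advisory(value)
        asterisk_seen = asterisk_seen or flagged
        if chosen is None and primary_subtag(tag) in supported:
            chosen = tag
    return chosen, (chosen is not None or asterisk_seen)


def build_answer(offer_send, offer_recv, supported_tags):
    """The answerer sends what the offerer receives, and vice versa."""
    send, _ = select_language(offer_recv, supported_tags)
    recv, _ = select_language(offer_send, supported_tags)
    return {"humintlang-send": send, "humintlang-recv": recv}


answer = build_answer(["es", "en"], ["es", "en"], ["en-US"])
```

   With the values shown, a call center supporting only English would
   accept English in both directions even though the caller listed
   Spanish first.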

   Clients acting on behalf of end users are expected to set one or both
   'humintlang-send' and 'humintlang-recv' attributes on each media
   stream primarily intended for human communication in an offer when
   placing an outgoing session, but either ignore or take into
   consideration the attributes when receiving incoming calls, based on
   local configuration and capabilities.  Systems acting on behalf of
   call centers and PSAPs are expected to take into account the values
   when processing inbound calls.

6.3.  Advisory vs Required

   One important consideration with this mechanism is whether the call
   fails if the callee does not support any of the languages requested
   by the caller.

   In order to provide for maximum likelihood of a successful
   communication session, especially in the case of emergency calling,
   the mechanism defined here provides a way for the caller to indicate
   a preference for the call failing or succeeding when there is no
   language in common.  However, the callee is NOT REQUIRED to honor
   this preference.  For example, a PSAP MAY choose to attempt the call
   even with no language in common, while a corporate call center MAY
   choose to fail the call.

   The mechanism for indicating this preference is that, in an offer, if
   the last character of any of the 'humintlang-recv' or 'humintlang-
   send' values is an asterisk, this indicates a request to not fail the
   call (similar to SIP Accept-Language syntax).  Either way, the called
   party MAY ignore this, e.g., for the emergency services use case, a
   PSAP will likely not fail the call.

6.4.  Silly States

   It is possible to specify a "silly state" where the language
   specified does not make sense for the media type, such as specifying
   a signed language for an audio media stream.

   An offer MUST NOT be created where the language does not make sense
   for the media type.
   If such an offer is received, the receiver MAY
   reject the media, ignore the language specified, or attempt to
   interpret the intent (e.g., if American Sign Language is specified
   for an audio media stream, this might be interpreted as a desire to
   use spoken English).

   A spoken language tag for a video stream in conjunction with an audio
   stream with the same language might indicate a request for
   supplemental video to see the speaker.

7.  IANA Considerations

   IANA is kindly requested to add two entries to the 'att-field (media
   level only)' table of the SDP parameters registry:

   +------------------------------+-----------------+-----------------+
   | Type                         | Name            | Reference       |
   +------------------------------+-----------------+-----------------+
   | att-field (media level only) | humintlang-send | (this document) |
   | att-field (media level only) | humintlang-recv | (this document) |
   +------------------------------+-----------------+-----------------+

            Table 1: 'att-field (media level only)' entries

8.  Security Considerations

   The Security Considerations of RFC 5646 [RFC5646] apply here (as a
   use of that RFC).  In addition, if the 'humintlang-send' or
   'humintlang-recv' values are altered or deleted en route, the session
   could fail or languages incomprehensible to the caller could be
   selected; however, this is also a risk if any SDP parameters are
   modified en route.

9.  Changes from Previous Versions

9.1.  Changes from draft-ietf-slim-...-00 to draft-ietf-slim-...-01

   o  FOO

9.2.  Changes from draft-gellens-slim-...-03 to draft-ietf-slim-...-00

   o  Updated title to reflect WG adoption

9.3.  Changes from draft-gellens-slim-...-02 to draft-gellens-
      slim-...-03

   o  Removed Use Cases section, per face-to-face discussion at IETF 93
   o  Removed discussion of routing, per face-to-face discussion at IETF
      93

9.4.  Changes from draft-gellens-slim-...-01 to draft-gellens-
      slim-...-02

   o  Updated NENA usage mention
   o  Removed background text reference to draft-saintandre-sip-xmpp-
      chat-04 since that draft expired

9.5.  Changes from draft-gellens-slim-...-00 to draft-gellens-
      slim-...-01

   o  Revision to keep draft from expiring

9.6.  Changes from draft-gellens-mmusic-...-02 to draft-gellens-
      slim-...-00

   o  Changed name from -mmusic- to -slim- to reflect proposed WG name
   o  As a result of the face-to-face discussion in Toronto, the SDP vs
      SIP issue was resolved by going back to SDP, taking out the SIP
      hint, and converting what had been a set of alternate proposals
      for various ways of doing it within SIP into an informative annex
      section which includes background on why SDP is the proposal
   o  Added mention that enabling a mutually comprehensible language is
      a general problem of which this document addresses the real-time
      side, with reference to [draft-tomkinson-multilangcontent] which
      addresses the non-real-time side.

9.7.  Changes from draft-gellens-mmusic-...-01 to -02

   o  Added clarifying text on leaving attributes unset for media not
      primarily intended for human language communication (e.g.,
      background audio or video).
   o  Added new section Appendix A ("Alternative Proposal: Caller-
      prefs") discussing use of SIP-level Caller-prefs instead of SDP-
      level.

9.8.  Changes from draft-gellens-mmusic-...-00 to -01

   o  Relaxed language on setting -send and -receive to same values;
      added text on leaving one empty to indicate asymmetric usage.
   o  Added text that clients on behalf of end users are expected to set
      the attributes on outgoing calls and ignore on incoming calls
      while systems on behalf of call centers and PSAPs are expected to
      take the attributes into account when processing incoming calls.

9.9.  Changes from draft-gellens-...-02 to draft-gellens-mmusic-...-00

   o  Updated text to refer to RFC 5646 rather than the IANA language
      subtags registry directly.
   o  Moved discussion of existing 'lang' attribute out of "Proposed
      Solution" section and into own section now that it is not part of
      proposal.
   o  Updated text about existing 'lang' attribute.
   o  Added example use cases.
   o  Replaced proposed single 'humintlang' attribute with 'humintlang-
      send' and 'humintlang-recv' per Harald's request/information that
      it was a misuse of SDP to use the same attribute for sending and
      receiving.
   o  Added section describing usage being advisory vs required and text
      in attribute section.
   o  Added section on SIP "hint" header (not yet nailed down between
      new and existing header).
   o  Added text discussing usage in policy-based routing function or
      use of SIP header "hint" if unable to do so.
   o  Added SHOULD that the value of the parameters stick to the largest
      granularity of language tags.
   o  Added text to Introduction to try to be more clear about the
      purpose of the document and the problem being solved.
   o  Many wording improvements and clarifications throughout the
      document.
   o  Filled in Security Considerations.
   o  Filled in IANA Considerations.
   o  Added to Acknowledgments those who participated in the Orlando ad-
      hoc discussion as well as those who participated in email
      discussion and side one-on-one discussions.

9.10.  Changes from draft-gellens-...-01 to -02

   o  Updated text for (possible) new attribute "humintlang" to
      reference RFC 5646
   o  Added clarifying text for (possible) re-use of existing 'lang'
      attribute saying that the registration would be updated to reflect
      different semantics for multiple values for interactive versus
      non-interactive media.
549 o Added clarifying text for (possible) new attribute "humintlang" to 550 attempt to better describe the role of language tags in media in 551 an offer and an answer. 553 9.11. Changes from draft-gellens-...-00 to -01 555 o Changed name of (possible) new attribute from 'humlang" to 556 "humintlang" 557 o Added discussion of silly state (language not appropriate for 558 media type) 559 o Added Voice Carry Over example 560 o Added mention of multilingual people and multiple languages 561 o Minor text clarifications 563 10. Contributors 565 Gunnar Hellstrom deserves special mention for his reviews, 566 assistance, and especially for contributing the core text in 567 Appendix A. 569 11. Acknowledgments 571 Many thanks to Bernard Aboba, Harald Alvestrand, Flemming Andreasen, 572 Francois Audet, Eric Burger, Keith Drage, Doug Ewell, Christian 573 Groves, Andrew Hutton, Hadriel Kaplan, Ari Keranen, John Klensin, 574 Paul Kyzivat, John Levine, Alexey Melnikov, James Polk, Pete Resnick, 575 Peter Saint-Andre, and Dale Worley for reviews, corrections, 576 suggestions, and participating in in-person and email discussions. 578 12. References 580 12.1. Normative References 582 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 583 Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/ 584 RFC2119, March 1997, 585 . 587 [RFC3840] Rosenberg, J., Schulzrinne, H., and P. Kyzivat, 588 "Indicating User Agent Capabilities in the Session 589 Initiation Protocol (SIP)", RFC 3840, August 2004. 591 [RFC3841] Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Caller 592 Preferences for the Session Initiation Protocol (SIP)", 593 RFC 3841, August 2004. 595 [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session 596 Description Protocol", RFC 4566, July 2006. 598 [RFC5646] Phillips, A. and M. Davis, "Tags for Identifying 599 Languages", BCP 47, RFC 5646, September 2009. 601 12.2. 
Informational References

603     [I-D.iab-privacy-considerations]
604                Cooper, A., Tschofenig, H., Aboba, B., Peterson, J.,
605                Morris, J., Hansen, M., and R. Smith, "Privacy
606                Considerations for Internet Protocols",
607                draft-iab-privacy-considerations-09 (work in progress), May 2013.

609     [I-D.saintandre-sip-xmpp-chat]
610                Saint-Andre, P., Loreto, S., Gavita, E., and N. Hossain,
611                "Interworking between the Session Initiation Protocol
612                (SIP) and the Extensible Messaging and Presence Protocol
613                (XMPP): One-to-One Text Chat",
614                draft-saintandre-sip-xmpp-chat-06 (work in progress), June 2013.

616     [RFC3066]  Alvestrand, H., "Tags for the Identification of
617                Languages", RFC 3066, January 2001.

619     [draft-tomkinson-multilangcontent]
620                Tomkinson, N. and N. Borenstein, "Multiple Language
621                Content Type", draft-tomkinson-multilangcontent (work in
622                progress), April 2014.

624  Appendix A.  Historic Alternative Proposal: Caller-prefs

626     The decision to base the proposal at the media negotiation level, and
627     specifically to use SDP, came after significant debate and
628     discussion.  It is possible to meet the objectives using a variety of
629     mechanisms, but none are perfect.  Using SDP means dealing with the
630     complexity of SDP, and leaves out real-time session protocols that do
631     not use SDP.  The major alternative proposal was to use SIP.  Using
632     SIP leaves out non-SIP session protocols, but more fundamentally,
633     would occur at a different layer than the media negotiation.  This
634     results in a more fragile solution since the media modality and
635     language would be negotiated using SIP, and then the specific media
636     formats (which inherently include the modality) would be negotiated
637     at a different level (typically SDP, especially in the emergency
638     calling cases), making it easier to have mismatches (such as where
639     the media modality negotiated in SIP doesn't match what was negotiated
640     using SDP).
642     An alternative proposal was to use the SIP-level Caller Preferences
643     mechanism from RFC 3840 [RFC3840] and RFC 3841 [RFC3841].

645     The Caller-prefs mechanism includes a priority system; this would
646     allow different combinations of media and languages to be assigned
647     different priorities.  The evaluation and decisions on what to do
648     with the call can be done either by proxies along the call path, or
649     by the addressed UA.  Evaluation of alternatives for routing is
650     described in RFC 3841 [RFC3841].

652  A.1.  Use of Caller Preferences Without Additions

654     The following would be possible without adding any new registered
655     tags:

657     Potential callers and recipients MAY include media and language tags
658     in the Contact field of their SIP registrations, reflecting the joint
659     capabilities of the UA and the human user, as described in RFC 3840
660     [RFC3840].

662     The most relevant media capability tags are "video", "text" and
663     "audio".  Each tag represents a capability to use the media in
664     two-way communication.

666     Language capabilities are declared with a comma-separated list of
667     languages that can be used in the call as parameters to the tag
668     "language=".

670     This is an example of how it is used in a SIP REGISTER:

672        REGISTER user@example.net
673        Contact: audio; video; text;
674        language="en,es,ase"

676     Including this information in SIP REGISTER allows proxies to act on
677     the information.  For the problem set addressed by this document, it
678     is not anticipated that proxies will do so using registration data.
679     Further, there are classes of devices (such as cellular mobile
680     phones) that are not anticipated to include this information in their
681     registrations.  Hence, use in registration is OPTIONAL.

683     In a call, a list of acceptable media and language combinations is
684     declared, and a priority assigned to each combination.
686     This is done by the Accept-Contact header field, which defines
687     different combinations of media and languages and assigns priorities
688     for completing the call with the SIP URI represented by that Contact.
689     A priority is assigned to each set as a so-called "q-value" which
690     ranges from 1 (most preferred) to 0 (least preferred).

692     Using the Accept-Contact header field in INVITE requests and
693     responses allows these capabilities to be expressed and used during
694     call set-up.  Clients SHOULD include this information in INVITE
695     requests and responses.

697     Example:

699        Accept-Contact: *; text; language="en"; q=0.2
700        Accept-Contact: *; video; language="ase"; q=0.8

702     This example shows that the caller's highest preference is
703     to use video with American Sign Language (language code "ase").  As a
704     fallback, it is acceptable to get the call connected with only
705     English text used for human communication.  Other media may of course
706     be connected as well, without expectation that they will be usable by
707     the caller for interactive communications (but may still be helpful
708     to the caller).

710     This system satisfies all the needs described in the previous
711     sections, except that language specifications do not make any
712     distinction between spoken and written language, and that the need
713     for directionality in the specification cannot be fulfilled.

715     To some degree, the lack of distinction between speech and
716     text in language tags can be compensated for by specifying only the
717     important medium in the Accept-Contact field.
719     Thus, a user who wants to use English mainly for text would specify:

721        Accept-Contact: *;text;language="en";q=1.0

723     A user who wants to use English mainly for speech, but will accept
724     it for text, would specify:

726        Accept-Contact: *;audio;language="en";q=0.8
727        Accept-Contact: *;text;language="en";q=0.2

729     However, a user who would like to talk but receive text back has no
730     way to express this with the existing specification.

732  A.2.  Additional Caller Preferences for Asymmetric Needs

734     In order to be able to specify asymmetric preferences, there are two
735     possibilities.  Either new language tags in the style of the
736     humintlang parameters described above for SDP could be registered, or
737     additional media tags describing the asymmetry could be registered.

739  A.2.1.  Caller Preferences for Asymmetric Modality Needs

741     The following new media tags should be defined:

743        speech-receive
744        speech-send
745        text-receive
746        text-send
747        sign-send
748        sign-receive

750     A user who prefers to talk and get text in return in English would
751     register the following (if including this information in registration
752     data):

754        REGISTER user@example.net
755        Contact: audio;text;speech-send;
756        text-receive;language="en"

758     At call time, a user who prefers to talk and get text in return in
759     English would set the Accept-Contact header field to:

761        Accept-Contact: *; audio; text; speech-receive; text-send;
762        language="en";q=0.8
763        Accept-Contact: *; text; language="en"; q=0.2

765     Note that the directions specified here are as viewed from the callee
766     side to match what the callee has registered.
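
As a non-normative illustration, the following Python sketch parses
Accept-Contact values of the shape used in the examples above, ranks
them by q-value, and converts the proposed direction tags between the
caller's view and the callee's view.  The function names, the FLIP
table, and the default q-value of 1.0 are assumptions made for this
illustration only; the parser handles just the simple forms shown in
this appendix and is not a complete RFC 3841 implementation.

```python
# Illustrative sketch only: a simplified parser for the Accept-Contact
# examples in this appendix, not a full RFC 3841 implementation.

# Hypothetical table pairing each proposed direction tag with its
# counterpart as seen from the other side of the call.
FLIP = {
    "speech-send": "speech-receive",
    "speech-receive": "speech-send",
    "text-send": "text-receive",
    "text-receive": "text-send",
    "sign-send": "sign-receive",
    "sign-receive": "sign-send",
}


def parse_accept_contact(value):
    """Split one Accept-Contact value into tags, languages, and q-value."""
    tags, languages = [], []
    q = 1.0  # assumption for this sketch: treat a missing q as 1.0
    # Naive split: assumes no ';' appears inside a quoted string.
    for part in value.split(";"):
        part = part.strip()
        if not part or part == "*":
            continue
        name, _, raw = part.partition("=")
        name = name.strip().lower()
        raw = raw.strip().strip('"')
        if name == "q":
            q = float(raw)
        elif name == "language":
            languages = [t.strip() for t in raw.split(",") if t.strip()]
        else:
            tags.append(name)
    return {"tags": tags, "languages": languages, "q": q}


def flip_directions(tags):
    """Convert direction tags between the caller's and callee's view."""
    return [FLIP.get(tag, tag) for tag in tags]


def rank(values):
    """Order Accept-Contact values from most to least preferred."""
    parsed = [parse_accept_contact(v) for v in values]
    return sorted(parsed, key=lambda entry: entry["q"], reverse=True)


# Caller's own view: talks (speech-send) and reads text (text-receive).
caller_view = ["audio", "text", "speech-send", "text-receive"]
callee_view = flip_directions(caller_view)

headers = [
    '*; text; language="en"; q=0.2',
    "*; " + "; ".join(callee_view) + '; language="en";q=0.8',
]
ranked = rank(headers)
```

Here the caller's own preferences (speech-send, text-receive) are
flipped into the callee's view before being placed in Accept-Contact,
matching the note above, and the asymmetric combination sorts ahead of
the text-only fallback because it carries the higher q-value.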
768     A bridge arranged for invoking a relay service specifically arranged
769     for captioned telephony would register the following for supporting
770     calling users:

772        REGISTER ct@ctrelay.net
773        Contact: audio; text; speech-receive;
774        text-send; language="en"

776     A bridge arranged for invoking a relay service specifically arranged
777     for captioned telephony would register the following for supporting
778     called users:

780        REGISTER ct@ctrelay.net
781        Contact: audio; text; speech-send;
782        text-receive; language="en"

784     At call time, these alternatives are included in the list of possible
785     outcomes of the call routing by the SIP proxies, and the proper relay
786     service is invoked.

788  A.2.2.  Caller Preferences for Asymmetric Language Tags

790     An alternative is to register new language tags for the purpose of
791     asymmetric language usage.

793     Instead of using "language=", six new language tags would be
794     registered:

796        humintlang-text-recv
797        humintlang-text-send
798        humintlang-speech-recv
799        humintlang-speech-send
800        humintlang-sign-recv
801        humintlang-sign-send

803     These language tags would be used instead of the regular
804     bidirectional language tags, and users with bidirectional
805     capabilities SHOULD specify values for both directions.  Services
806     specifically arranged for supporting users with asymmetric needs
807     SHOULD specify only the asymmetry they support.

809  Author's Address

811     Randall Gellens

813     Email: rg+ietf@randy.pensive.org