Network Working Group                                         R. Gellens
Internet-Draft                                Core Technology Consulting
Intended status: Standards Track                        February 2, 2017
Expires: August 6, 2017


         Negotiating Human Language in Real-Time Communications
             draft-ietf-slim-negotiating-human-language-06

Abstract

   Users have various human (natural) language needs, abilities, and
   preferences regarding spoken, written, and signed languages. When
   establishing interactive communication ("calls") there needs to be a
   way to negotiate (communicate and match) the caller's language and
   media needs with the capabilities of the called party. This is
   especially important with emergency calls, where a call can be
   handled by a call taker capable of communicating with the user, or a
   translator or relay operator can be bridged into the call during
   setup, but this applies to non-emergency calls as well (as an
   example, when calling a company call center).

   This document describes the need and a solution using new SDP stream
   attributes.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 6, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Desired Semantics
   4.  The existing 'lang' attribute
   5.  Proposed Solution
     5.1.  Rationale
     5.2.  New 'humintlang-send' and 'humintlang-recv' attributes
     5.3.  Advisory vs Required
     5.4.  Silly States
     5.5.  Examples
   6.  IANA Considerations
   7.  Security Considerations
   8.  Privacy Considerations
   9.  Changes from Previous Versions
     9.1.  Changes from draft-ietf-slim-...-04 to
           draft-ietf-slim-...-06
     9.2.  Changes from draft-ietf-slim-...-02 to
           draft-ietf-slim-...-03
     9.3.  Changes from draft-ietf-slim-...-01 to
           draft-ietf-slim-...-02
     9.4.  Changes from draft-ietf-slim-...-00 to
           draft-ietf-slim-...-01
     9.5.  Changes from draft-gellens-slim-...-03 to
           draft-ietf-slim-...-00
     9.6.  Changes from draft-gellens-slim-...-02 to
           draft-gellens-slim-...-03
     9.7.  Changes from draft-gellens-slim-...-01 to
           draft-gellens-slim-...-02
     9.8.  Changes from draft-gellens-slim-...-00 to
           draft-gellens-slim-...-01
     9.9.  Changes from draft-gellens-mmusic-...-02 to
           draft-gellens-slim-...-00
     9.10. Changes from draft-gellens-mmusic-...-01 to -02
     9.11. Changes from draft-gellens-mmusic-...-00 to -01
     9.12. Changes from draft-gellens-...-02 to
           draft-gellens-mmusic-...-00
     9.13. Changes from draft-gellens-...-01 to -02
     9.14. Changes from draft-gellens-...-00 to -01
   10. Contributors
   11. Acknowledgments
   12. References
     12.1.  Normative References
     12.2.  Informational References
   Appendix A.  Historic Alternative Proposal: Caller-prefs
     A.1.  Use of Caller Preferences Without Additions
     A.2.  Additional Caller Preferences for Asymmetric Needs
       A.2.1.  Caller Preferences for Asymmetric Modality Needs
       A.2.2.  Caller Preferences for Asymmetric Language Tags
   Author's Address

1.  Introduction

   A mutually comprehensible language is helpful for human
   communication. This document addresses the real-time, interactive
   side of the issue. A companion document on language selection in
   email [I-D.ietf-slim-multilangcontent] addresses the non-real-time
   side.

   When setting up interactive communication sessions (using SIP or
   other protocols), human (natural) language and media modality
   (spoken, signed, written) negotiation may be needed. Unless the
   caller and callee know each other or there is contextual or
   out-of-band information from which the language(s) and media
   modalities can be determined, there is a need for spoken, signed, or
   written languages to be negotiated based on the caller's needs and
   the callee's capabilities. This need applies to both emergency and
   non-emergency calls. For various reasons, including the ability to
   establish multiple streams using different media (e.g., voice, text,
   video), it makes sense to use a per-stream negotiation mechanism, in
   this case, SDP.

   This approach has a number of benefits, including that it is generic
   (applies to all interactive communications negotiated using SDP) and
   not limited to emergency calls. In some cases such a facility isn't
   needed, because the language is known from the context (such as when
   a caller places a call to a sign language relay center, to a friend,
   or colleague). But it is clearly useful in many other cases. For
   example, someone calling a company call center or a Public Safety
   Answering Point (PSAP) should be able to indicate if one or more
   specific signed, written, and/or spoken languages are preferred, the
   callee should be able to indicate its capabilities in this area, and
   the call should proceed using in-common language(s) and media forms.

   Since this is a protocol mechanism, the user equipment (UE client)
   needs to know the user's preferred languages; a reasonable technique
   could include a configuration mechanism with a default of the
   language of the user interface. In some cases, a UE could tie
   language and media preferences, such as a preference for a video
   stream using a signed language and/or a text or audio stream using a
   written/spoken language.

   Including the user's human (natural) language preferences in the
   session establishment negotiation is independent of the use of a
   relay service and is transparent to a voice service provider. For
   example, assume a user within the United States who speaks Spanish
   but not English places a voice call. The call could be an emergency
   call or perhaps to an airline reservation desk. The language
   information is transparent to the voice service provider, but is part
   of the session negotiation between the UE and the terminating entity.
   In the case of a call to, e.g., an airline, the call could be
   automatically handled by a Spanish-speaking agent. In the case of an
   emergency call, the Emergency Services IP network (ESInet) and the
   PSAP may choose to take the language and media preferences into
   account when determining how to process the call.

   By treating language as another attribute that is negotiated along
   with other aspects of a media stream, it becomes possible to
   accommodate a range of users' needs and called party facilities. For
   example, some users may be able to speak several languages, but have
   a preference. Some called parties may support some of those
   languages internally but require the use of a translation service for
   others, or may have a limited number of call takers able to use
   certain languages. Another example would be a user who is able to
   speak but is deaf or hard-of-hearing and requires a voice stream plus
   a text stream. Making language a media attribute allows the standard
   session negotiation mechanism to handle this by providing the
   information and mechanism for the endpoints to make appropriate
   decisions.

   Regarding relay services, in the case of an emergency call requiring
   sign language such as ASL, there are currently two common approaches:
   the caller initiates the call to a relay center, or the caller places
   the call to emergency services (e.g., 911 in the U.S. or 112 in
   Europe). (In a variant of the second case, the voice service
   provider invokes a relay service as well as emergency services.) In
   the former case, the language need is ancillary and supplemental. In
   the non-variant second case, the ESInet and/or PSAP may take the need
   for sign language into account and bridge in a relay center. In this
   case, the ESInet and PSAP have all the standard information available
   (such as location) but are able to bridge the relay sooner in the
   call processing.

   By making this facility part of the end-to-end negotiation, the
   question of which entity provides or engages the relay service
   becomes separate from the call processing mechanics; if the caller
   directs the call to a relay service then the human language
   negotiation facility provides extra information to the relay service
   but calls will still function without it; if the caller directs the
   call to emergency services, then the ESInet/PSAP are able to take the
   user's human language needs into account, e.g., by assigning to a
   specific queue or call taker or bridging in a relay service or
   translator.

   The term "negotiation" is used here rather than "indication" because
   human language (spoken/written/signed) is something that can be
   negotiated in the same way as which forms of media (audio/text/video)
   or which codecs. For example, if we think of non-emergency calls,
   such as a user calling an airline reservation center, the user may
   have a set of languages he or she speaks, with perhaps preferences
   for one or a few, while the airline reservation center will support a
   fixed set of languages. Negotiation should select the user's most
   preferred language that is supported by the call center. Both sides
   should be aware of which language was negotiated. This is
   conceptually similar to the way other aspects of each media stream
   are negotiated using SDP (e.g., media type and codecs).

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  Desired Semantics

   The desired solution is a media attribute (preferably per direction)
   that may be used within an offer to indicate the preferred language
   of each (direction of a) media stream, and within an answer to
   indicate the accepted language. The semantics of including multiple
   values for a media stream within an offer is that the languages are
   listed in order of preference.

   (Negotiating multiple simultaneous languages within a media stream is
   out of scope, as the complexity of doing so outweighs the
   usefulness.)

4.  The existing 'lang' attribute

   RFC 4566 [RFC4566] specifies an attribute 'lang' which appears
   similar to what is needed here, but is not sufficiently detailed for
   use here. In addition, it is not mentioned in [RFC3264] and there
   are no known implementations in SIP. Further, there is value in
   being able to specify language per direction (sending and receiving).
   This document therefore defines two new attributes.

5.  Proposed Solution

   An SDP attribute (per direction) seems the natural choice to
   negotiate human (natural) language of an interactive media stream.
   The attribute value should be a language tag per BCP 47 [RFC5646].

5.1.  Rationale

   The decision to base the proposal at the media negotiation level, and
   specifically to use SDP, came after significant debate and
   discussion. From an engineering standpoint, it is possible to meet
   the objectives using a variety of mechanisms, but none are perfect.
   None of the proposed alternatives was clearly better technically in
   enough ways to win over proponents of the others, and none were
   clearly so bad technically as to be easily rejected. As is often the
   case in engineering, choosing the solution is a matter of balancing
   trade-offs, and ultimately more a matter of taste than technical
   merit. The two main proposals were to use SDP and SIP. SDP has the
   advantage that the language is negotiated with the media to which it
   applies, while SIP has the issue that the languages expressed may not
   match the SDP media negotiated (for example, a session could
   negotiate video at the SIP level but fail to negotiate any video
   media stream at the SDP layer).

   The mechanism described here for SDP can be adapted to media
   negotiation protocols other than SDP.

5.2.  New 'humintlang-send' and 'humintlang-recv' attributes

   This document defines two new media-level attributes starting with
   'humintlang' (short for "human interactive language") to negotiate
   which human language is used in each interactive media stream. There
   are two attributes, one ending in "-send" and the other in "-recv",
   registered in Section 6 and described here:

      a=humintlang-send:<language tag>
      a=humintlang-recv:<language tag>

   Each can appear multiple times in an offer for a media stream.

   In an offer, 'humintlang-send' indicates the language(s) the offerer
   is willing to use when sending using the media, and 'humintlang-recv'
   indicates the language(s) the offerer is willing to use when
   receiving using the media. The values constitute a list of languages
   in preference order (first is most preferred). When a media stream
   is intended for use in one direction only (such as a speech-impaired
   user sending using text and receiving using audio), either
   humintlang-send or humintlang-recv MAY be omitted. When a media
   stream is not primarily intended for language (for example, a video
   or audio stream intended for background only) both SHOULD be omitted.
   Otherwise, both SHOULD have the same values in the same order. The
   two SHOULD NOT be set to languages which are difficult to match
   together (e.g., specifying a desire to send audio in Hungarian and
   receive audio in Portuguese will make it difficult to successfully
   complete the call).

   In an answer, 'humintlang-send' is the accepted language the answerer
   will send (which in most cases is one of the languages in the offer's
   'humintlang-recv'), and 'humintlang-recv' is the accepted language
   the answerer expects to receive (which in most cases is one of the
   languages in the offer's 'humintlang-send').

   Each value MUST be a language tag per BCP 47 [RFC5646]. BCP 47
   describes mechanisms for matching language tags. Note that [RFC5646]
   Section 4.1 advises to "tag content wisely" and not include
   unnecessary subtags.

   In an offer, each language tag value MAY have an asterisk appended as
   the last character (after the language tag). The asterisk indicates
   a request by the caller to not fail the call if there is no language
   in common. See Section 5.3 for more information and discussion.

   When placing an emergency call, and in any other case where the
   language cannot be assumed from context, each media stream in an
   offer primarily intended for human language communication SHOULD
   specify both (or in some cases, one of) the 'humintlang-send' and
   'humintlang-recv' attributes.
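
   As an illustration only, the attribute syntax described above can be
   read with a few lines of Python. This is a minimal sketch, not part
   of the mechanism: the names HumIntLang, parse_humintlang, and
   collect_preferences are hypothetical, and the regular expression
   checks only the general shape of a tag (hyphen-separated alphanumeric
   subtags), not the full BCP 47 grammar.

```python
import re
from typing import Dict, Iterable, List, NamedTuple, Optional

# Loose shape check only (assumption for this sketch): alphanumeric
# subtags of 1 to 8 characters separated by hyphens. This is NOT a
# complete BCP 47 validator.
_TAG_SHAPE = re.compile(r"^[A-Za-z0-9]{1,8}(-[A-Za-z0-9]{1,8})*$")


class HumIntLang(NamedTuple):
    direction: str   # "send" or "recv"
    tag: str         # language tag with any trailing asterisk removed
    asterisk: bool   # True if the value ended with "*"


def parse_humintlang(line: str) -> Optional[HumIntLang]:
    """Parse one SDP line; return None if it is not a humintlang attribute."""
    line = line.strip()
    for direction in ("send", "recv"):
        prefix = "a=humintlang-" + direction + ":"
        if line.startswith(prefix):
            value = line[len(prefix):].strip()
            asterisk = value.endswith("*")
            if asterisk:
                value = value[:-1]
            if not _TAG_SHAPE.match(value):
                raise ValueError("malformed language tag: %r" % value)
            return HumIntLang(direction, value, asterisk)
    return None


def collect_preferences(media_lines: Iterable[str]) -> Dict[str, List[HumIntLang]]:
    """Group the attributes of one media section by direction.

    Order of appearance is preserved, since the values form a
    preference-ordered list (first is most preferred).
    """
    prefs: Dict[str, List[HumIntLang]] = {"send": [], "recv": []}
    for line in media_lines:
        parsed = parse_humintlang(line)
        if parsed is not None:
            prefs[parsed.direction].append(parsed)
    return prefs
```

   Because the values are kept in the order they appear, the first entry
   in each list is the most preferred language for that direction.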

   Note that while signed language tags are used with a video stream to
   indicate sign language, a spoken language tag for a video stream in
   parallel with an audio stream with the same spoken language tag
   indicates a request for a supplemental video stream to see the
   speaker.

   Clients acting on behalf of end users are expected to set one or both
   of the 'humintlang-send' and 'humintlang-recv' attributes on each
   media stream primarily intended for human communication in an offer
   when placing an outgoing session, and either ignore or take into
   consideration the attributes when receiving incoming calls, based on
   local configuration and capabilities. Systems acting on behalf of
   call centers and PSAPs are expected to take into account the values
   when processing inbound calls.

   Note that media and language negotiation might result in more media
   streams being accepted than are needed by the users (e.g., if more
   preferred and less preferred combinations of media and language are
   all accepted).

5.3.  Advisory vs Required

   One important consideration with this mechanism is whether the call
   fails when the callee does not support any of the languages requested
   by the caller.

   In order to provide for maximum likelihood of a successful
   communication session, especially in the case of emergency calling,
   the mechanism defined here provides a way for the caller to indicate
   a preference for the call failing or succeeding when there is no
   language in common. However, it is OPTIONAL for the callee to honor
   this preference. For example, a PSAP MAY choose to attempt the call
   even with no language in common, while a corporate call center MAY
   choose to fail the call.
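
   The caller preference and callee policy described above could be
   combined roughly as follows. This Python sketch rests on several
   assumptions that are not defined by this document: the function
   names, the two callee policy flags (honor_asterisk and
   fail_on_no_match), the returned strings, and the use of exact,
   case-insensitive tag comparison in place of the richer matching
   described in BCP 47.

```python
from typing import List, Optional, Tuple


def select_language(offered: List[str],
                    supported: List[str]) -> Tuple[Optional[str], bool]:
    """Pick the first offered language that the callee supports.

    offered:   attribute values from the offer, most preferred first,
               each optionally ending in "*".
    supported: language tags the callee can handle.
    Returns (chosen tag or None, whether any offered value had "*").
    """
    caller_asks_not_to_fail = any(v.endswith("*") for v in offered)
    supported_lower = {s.lower() for s in supported}
    for value in offered:
        tag = value[:-1] if value.endswith("*") else value
        # Assumption: exact case-insensitive comparison only.
        if tag.lower() in supported_lower:
            return tag, caller_asks_not_to_fail
    return None, caller_asks_not_to_fail


def decide(offered: List[str], supported: List[str],
           honor_asterisk: bool, fail_on_no_match: bool) -> str:
    """Apply a hypothetical callee policy to the outcome of matching."""
    chosen, caller_asks_not_to_fail = select_language(offered, supported)
    if chosen is not None:
        return "accept:" + chosen
    # No language in common. Honoring the caller's request is OPTIONAL
    # for the callee, so it is modeled here as a local policy flag.
    if caller_asks_not_to_fail and honor_asterisk:
        return "proceed-without-common-language"
    if fail_on_no_match:
        return "reject"
    return "proceed-without-common-language"
```

   Under these assumptions, a PSAP-like policy would set
   fail_on_no_match to False so that the call is always attempted, while
   a call center could set it to True and decide separately whether to
   honor the caller's asterisk.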

   The mechanism for indicating this preference is that, in an offer, if
   the last character of any of the 'humintlang-recv' or
   'humintlang-send' values is an asterisk, this indicates a request to
   not fail the call (similar to SIP Accept-Language syntax). Either
   way, the called party MAY ignore this, e.g., for the emergency
   services use case, a PSAP will likely not fail the call.

5.4.  Silly States

   It is possible to specify a "silly state" where the language
   specified does not make sense for the media type, such as specifying
   a signed language for an audio media stream.

   An offer MUST NOT be created where the language does not make sense
   for the media type. If such an offer is received, the receiver MAY
   reject the media, ignore the language specified, or attempt to
   interpret the intent (e.g., if American Sign Language is specified
   for an audio media stream, this might be interpreted as a desire to
   use spoken English).

   A spoken language tag for a video stream in conjunction with an audio
   stream with the same language might indicate a request for
   supplemental video to see the speaker.

5.5.  Examples

   Some examples are shown below. Only the most directly relevant
   portions of the SDP block are shown, for clarity.

      m=audio 49170 RTP/AVP 0
      a=humintlang-send:en
      a=humintlang-recv:en

      m=video 51372 RTP/AVP 31 32
      a=humintlang-send:ase*
      a=humintlang-recv:ase*

      m=audio 49250 RTP/AVP 20
      a=humintlang-send:es*
      a=humintlang-recv:es*
      a=humintlang-send:eu*
      a=humintlang-recv:eu*
      a=humintlang-send:en*
      a=humintlang-recv:en*

      m=text 45020 RTP/AVP 103 104
      a=humintlang-send:gr
      a=humintlang-recv:gr

6.  IANA Considerations

   IANA is kindly requested to add two entries to the 'att-field (media
   level only)' table of the SDP parameters registry:

   Contact Name: Randall Gellens
   Contact Email Address: rg+ietf@randy.pensive.org
   Attribute Name: humintlang-recv
   Attribute Syntax:

      humintlang-value = Language-Tag [ asterisk ]
              ; Language-Tag defined in RFC 5646
      asterisk = "*"

   Attribute Semantics: Described in Section 5.2 of TBD: THIS DOCUMENT
   Usage Level: media
   Charset Dependent: No
   Purpose: See Section 5.2 of TBD: THIS DOCUMENT
   O/A Procedures: See Section 5.2 of TBD: THIS DOCUMENT
   Reference: TBD: THIS DOCUMENT

   Contact Name: Randall Gellens
   Contact Email Address: rg+ietf@randy.pensive.org
   Attribute Name: humintlang-send
   Attribute Syntax:

      humintlang-value = Language-Tag [ asterisk ]
              ; Language-Tag defined in RFC 5646
      asterisk = "*"

   Attribute Semantics: Described in Section 5.2 of TBD: THIS DOCUMENT
   Usage Level: media
   Charset Dependent: No
   Purpose: See Section 5.2 of TBD: THIS DOCUMENT
   O/A Procedures: See Section 5.2 of TBD: THIS DOCUMENT
   Reference: TBD: THIS DOCUMENT

7.  Security Considerations

   The Security Considerations of BCP 47 [RFC5646] apply here. In
   addition, if the 'humintlang-send' or 'humintlang-recv' values are
   altered or deleted en route, the session could fail or languages
   incomprehensible to the caller could be selected; however, this is
   also a risk if any SDP parameters are modified en route.

8.  Privacy Considerations

   Language and media information can suggest a user's nationality,
   background, abilities, disabilities, etc.

9.  Changes from Previous Versions

   RFC EDITOR: Please remove this section prior to publication.

9.1.  Changes from draft-ietf-slim-...-04 to draft-ietf-slim-...-06

   o  Deleted Section 3 ("Expected Use")
   o  Reworded modalities in Introduction from "voice, video, text" to
      "spoken, signed, written"
   o  Reworded text about "increasingly fine-grained distinctions" to
      instead merely point to BCP 47 Section 4.1's advice to "tag
      content wisely" and not include unnecessary subtags
   o  Changed IANA registration of new SDP attributes to follow RFC 4566
      template with extra fields suggested in 4566-bis (expired draft)
   o  Deleted "(known as voice carry over)"
   o  Changed textual instances of RFC 5646 to BCP 47, although actual
      reference remains RFC due to xml2rfc limitations

9.2.  Changes from draft-ietf-slim-...-02 to draft-ietf-slim-...-03

   o  Added Examples
   o  Added Privacy Considerations section
   o  Other editorial changes for clarity

9.3.  Changes from draft-ietf-slim-...-01 to draft-ietf-slim-...-02

   o  Deleted most of Section 4 and replaced with a very short summary
   o  Replaced "wishes to" with "is willing to" in Section 5.2
   o  Reworded description of attribute usage to clarify when to set
      both, only one, or neither
   o  Deleted all uses of "IMS"
   o  Other editorial changes for clarity

9.4.  Changes from draft-ietf-slim-...-00 to draft-ietf-slim-...-01

   o  Editorial changes to wording in Section 5.

9.5.  Changes from draft-gellens-slim-...-03 to draft-ietf-slim-...-00

   o  Updated title to reflect WG adoption

9.6.  Changes from draft-gellens-slim-...-02 to
      draft-gellens-slim-...-03

   o  Removed Use Cases section, per face-to-face discussion at IETF 93
   o  Removed discussion of routing, per face-to-face discussion at IETF
      93

9.7.  Changes from draft-gellens-slim-...-01 to
      draft-gellens-slim-...-02

   o  Updated NENA usage mention
   o  Removed background text reference to
      draft-saintandre-sip-xmpp-chat-04 since that draft expired

9.8.  Changes from draft-gellens-slim-...-00 to
      draft-gellens-slim-...-01

   o  Revision to keep draft from expiring

9.9.  Changes from draft-gellens-mmusic-...-02 to
      draft-gellens-slim-...-00

   o  Changed name from -mmusic- to -slim- to reflect proposed WG name
   o  As a result of the face-to-face discussion in Toronto, the SDP vs
      SIP issue was resolved by going back to SDP, taking out the SIP
      hint, and converting what had been a set of alternate proposals
      for various ways of doing it within SIP into an informative annex
      section which includes background on why SDP is the proposal
   o  Added mention that enabling a mutually comprehensible language is
      a general problem of which this document addresses the real-time
      side, with reference to [I-D.ietf-slim-multilangcontent] which
      addresses the non-real-time side.

9.10.  Changes from draft-gellens-mmusic-...-01 to -02

   o  Added clarifying text on leaving attributes unset for media not
      primarily intended for human language communication (e.g.,
      background audio or video).
   o  Added new section Appendix A ("Alternative Proposal:
      Caller-prefs") discussing use of SIP-level Caller-prefs instead of
      SDP-level.

9.11.  Changes from draft-gellens-mmusic-...-00 to -01

   o  Relaxed language on setting -send and -receive to same values;
      added text on leaving one empty to indicate asymmetric usage.
   o  Added text that clients on behalf of end users are expected to set
      the attributes on outgoing calls and ignore on incoming calls
      while systems on behalf of call centers and PSAPs are expected to
      take the attributes into account when processing incoming calls.

9.12.  Changes from draft-gellens-...-02 to draft-gellens-mmusic-...-00

   o  Updated text to refer to RFC 5646 rather than the IANA language
      subtags registry directly.
553 o Moved discussion of existing 'lang' attribute out of "Proposed 554 Solution" section and into own section now that it is not part of 555 proposal. 556 o Updated text about existing 'lang' attribute. 557 o Added example use cases. 558 o Replaced proposed single 'humintlang' attribute with 'humintlang- 559 send' and 'humintlang-recv' per Harald's request/information that 560 it was a misuse of SDP to use the same attribute for sending and 561 receiving. 562 o Added section describing usage being advisory vs required and text 563 in attribute section. 564 o Added section on SIP "hint" header (not yet nailed down between 565 new and existing header). 566 o Added text discussing usage in policy-based routing function or 567 use of SIP header "hint" if unable to do so. 568 o Added SHOULD that the value of the parameters stick to the largest 569 granularity of language tags. 571 o Added text to Introduction to be try and be more clear about 572 purpose of document and problem being solved. 573 o Many wording improvements and clarifications throughout the 574 document. 575 o Filled in Security Considerations. 576 o Filled in IANA Considerations. 577 o Added to Acknowledgments those who participated in the Orlando ad- 578 hoc discussion as well as those who participated in email 579 discussion and side one-on-one discussions. 581 9.13. Changes from draft-gellens-...-01 to -02 583 o Updated text for (possible) new attribute "humintlang" to 584 reference RFC 5646 585 o Added clarifying text for (possible) re-use of existing 'lang' 586 attribute saying that the registration would be updated to reflect 587 different semantics for multiple values for interactive versus 588 non-interactive media. 589 o Added clarifying text for (possible) new attribute "humintlang" to 590 attempt to better describe the role of language tags in media in 591 an offer and an answer. 593 9.14. 
Changes from draft-gellens-...-00 to -01

   o  Changed name of (possible) new attribute from "humlang" to
      "humintlang"
   o  Added discussion of silly state (language not appropriate for
      media type)
   o  Added Voice Carry Over example
   o  Added mention of multilingual people and multiple languages
   o  Minor text clarifications

10.  Contributors

   Gunnar Hellstrom deserves special mention for his reviews,
   assistance, and especially for contributing the core text in
   Appendix A.

11.  Acknowledgments

   Many thanks to Bernard Aboba, Harald Alvestrand, Flemming Andreasen,
   Francois Audet, Eric Burger, Keith Drage, Doug Ewell, Christian
   Groves, Andrew Hutton, Hadriel Kaplan, Ari Keranen, John Klensin,
   Paul Kyzivat, John Levine, Alexey Melnikov, James Polk, Pete Resnick,
   Peter Saint-Andre, and Dale Worley for reviews, corrections,
   suggestions, and participating in in-person and email discussions.

12.  References

12.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, DOI 10.17487/RFC4566,
              July 2006.

   [RFC5646]  Phillips, A., Ed. and M. Davis, Ed., "Tags for Identifying
              Languages", BCP 47, RFC 5646, DOI 10.17487/RFC5646,
              September 2009.

12.2.  Informational References

   [I-D.ietf-slim-multilangcontent]
              Tomkinson, N. and N. Borenstein, "Multiple Language
              Content Type", draft-ietf-slim-multilangcontent-06 (work
              in progress), October 2016.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264,
              DOI 10.17487/RFC3264, June 2002.

   [RFC3840]  Rosenberg, J., Schulzrinne, H., and P. Kyzivat,
              "Indicating User Agent Capabilities in the Session
              Initiation Protocol (SIP)", RFC 3840,
              DOI 10.17487/RFC3840, August 2004.

   [RFC3841]  Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Caller
              Preferences for the Session Initiation Protocol (SIP)",
              RFC 3841, DOI 10.17487/RFC3841, August 2004.

Appendix A.  Historic Alternative Proposal: Caller-prefs

   The decision to base the proposal at the media negotiation level, and
   specifically to use SDP, came after significant debate and
   discussion.  It is possible to meet the objectives using a variety of
   mechanisms, but none are perfect.  Using SDP means dealing with the
   complexity of SDP, and leaves out real-time session protocols that do
   not use SDP.  The major alternative proposal was to use SIP.  Using
   SIP leaves out non-SIP session protocols, but more fundamentally,
   would occur at a different layer than the media negotiation.  This
   results in a more fragile solution since the media modality and
   language would be negotiated using SIP, and then the specific media
   formats (which inherently include the modality) would be negotiated
   at a different level (typically SDP, especially in the emergency
   calling cases), making it easier to have mismatches (such as where
   the media modality negotiated in SIP does not match what was
   negotiated using SDP).

   An alternative proposal was to use the SIP-level Caller Preferences
   mechanism from RFC 3840 [RFC3840] and RFC 3841 [RFC3841].

   The Caller-prefs mechanism includes a priority system; this would
   allow different combinations of media and languages to be assigned
   different priorities.  The evaluation and decisions on what to do
   with the call can be done either by proxies along the call path, or
   by the addressed UA.  Evaluation of alternatives for routing is
   described in RFC 3841 [RFC3841].

A.1.  Use of Caller Preferences Without Additions

   The following would be possible without adding any new registered
   tags:

   Potential callers and recipients MAY include media and language tags
   in the Contact field of their SIP registrations, reflecting the
   joint capabilities of the UA and the human user, as described in
   RFC 3840 [RFC3840].

   The most relevant media capability tags are "video", "text" and
   "audio".  Each tag represents a capability to use the media in
   two-way communication.

   Language capabilities are declared with a comma-separated list of
   languages that can be used in the call as parameters to the tag
   "language=".

   This is an example of how it is used in a SIP REGISTER:

      REGISTER user@example.net
      Contact: audio; video; text;
      language="en,es,ase"

   Including this information in SIP REGISTER allows proxies to act on
   the information.  For the problem set addressed by this document, it
   is not anticipated that proxies will do so using registration data.
   Further, there are classes of devices (such as cellular mobile
   phones) that are not anticipated to include this information in their
   registrations.  Hence, use in registration is OPTIONAL.

   In a call, a list of acceptable media and language combinations is
   declared, and a priority assigned to each combination.

   This is done by the Accept-Contact header field, which defines
   different combinations of media and languages and assigns priorities
   for completing the call with the SIP URI represented by that Contact.
   A priority is assigned to each set as a so-called "q-value" which
   ranges from 1 (most preferred) to 0 (least preferred).

   Using the Accept-Contact header field in INVITE requests and
   responses allows these capabilities to be expressed and used during
   call set-up.  Clients SHOULD include this information in INVITE
   requests and responses.
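
   As a rough illustration of how a recipient might order such
   combinations by q-value, the following Python sketch parses
   Accept-Contact values written in the simplified form used in this
   appendix.  It is not a full implementation of the RFC 3841 header
   grammar; the function names parse_accept_contact and
   rank_preferences are hypothetical, and treating a missing q-value
   as 1.0 is an assumption made only for this sketch.

```python
def parse_accept_contact(value):
    """Parse one simplified Accept-Contact value into media, languages, q.

    Assumption: parameters are separated by ';' and no quoted value
    contains a ';'. A missing q-value is treated as 1.0 (sketch only).
    """
    entry = {"media": set(), "languages": [], "q": 1.0}
    for part in value.split(";"):
        part = part.strip()
        if not part or part == "*":
            continue  # skip empty pieces and the wildcard contact
        name, sep, raw = part.partition("=")
        name = name.strip().lower()
        raw = raw.strip().strip('"')
        if not sep:
            # A bare tag such as "audio" or "text" is a media capability.
            entry["media"].add(name)
        elif name == "language":
            entry["languages"] = [t.strip() for t in raw.split(",") if t.strip()]
        elif name == "q":
            entry["q"] = float(raw)
    return entry


def rank_preferences(values):
    """Return parsed entries ordered from most to least preferred."""
    parsed = [parse_accept_contact(v) for v in values]
    return sorted(parsed, key=lambda e: e["q"], reverse=True)


prefs = rank_preferences([
    '*; text; language="en"; q=0.2',
    '*; video; language="ase"; q=0.8',
])
for p in prefs:
    print(sorted(p["media"]), p["languages"], p["q"])
```

   A proxy or UA using something like this would try the first entry
   whose media and language it can satisfy and fall back to later
   entries otherwise.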

   Example:

      Accept-Contact: *; text; language="en"; q=0.2
      Accept-Contact: *; video; language="ase"; q=0.8

   This example shows that the caller's highest preference is to use
   video with American Sign Language (language code "ase").  As a
   fallback, it is acceptable to get the call connected with only
   English text used for human communication.  Other media may of course
   be connected as well, without expectation that it will be usable by
   the caller for interactive communications (but may still be helpful
   to the caller).

   This system satisfies all the needs described in the previous
   sections, except that language specifications do not make any
   distinction between spoken and written language, and that the need
   for directionality in the specification cannot be fulfilled.

   To some degree, the lack of a distinction between speech and text in
   language tags can be compensated for by specifying only the
   important medium in the Accept-Contact field.

   Thus, a user who wants to use English mainly for text would specify:

      Accept-Contact: *;text;language="en";q=1.0

   While a user who wants to use English mainly for speech but accept it
   for text would specify:

      Accept-Contact: *;audio;language="en";q=0.8
      Accept-Contact: *;text;language="en";q=0.2

   However, a user who would like to talk but receive text back has no
   way to express this with the existing specification.

A.2.  Additional Caller Preferences for Asymmetric Needs

   In order to be able to specify asymmetric preferences, there are two
   possibilities.  Either new language tags in the style of the
   humintlang parameters described above for SDP could be registered, or
   additional media tags describing the asymmetry could be registered.

A.2.1.  Caller Preferences for Asymmetric Modality Needs

   The following new media tags should be defined:

      speech-receive
      speech-send
      text-receive
      text-send
      sign-send
      sign-receive

   A user who prefers to talk and get text in return in English would
   register the following (if including this information in registration
   data):

      REGISTER user@example.net
      Contact: audio;text;speech-send;text-receive;language="en"

   At call time, a user who prefers to talk and get text in return in
   English would set the Accept-Contact header field to:

      Accept-Contact: *; audio; text; speech-receive; text-send;
      language="en";q=0.8
      Accept-Contact: *; text; language="en"; q=0.2

   Note that the directions specified here are as viewed from the callee
   side to match what the callee has registered.

   A bridge arranged for invoking a relay service specifically arranged
   for captioned telephony would register the following for supporting
   calling users:

      REGISTER ct@ctrelay.net
      Contact: audio; text; speech-receive;
      text-send; language="en"

   A bridge arranged for invoking a relay service specifically arranged
   for captioned telephony would register the following for supporting
   called users:

      REGISTER ct@ctrelay.net
      Contact: audio; text; speech-send; text-receive; language="en"

   At call time, these alternatives are included in the list of possible
   outcomes of the call routing by the SIP proxies, and the proper relay
   service is invoked.

A.2.2.  Caller Preferences for Asymmetric Language Tags

   An alternative is to register new language tags for the purpose of
   asymmetric language usage.

   Instead of using "language=", six new language tags would be
   registered:

      humintlang-text-recv
      humintlang-text-send
      humintlang-speech-recv
      humintlang-speech-send
      humintlang-sign-recv
      humintlang-sign-send

   These language tags would be used instead of the regular
   bidirectional language tags, and users with bidirectional
   capabilities SHOULD specify values for both directions.  Services
   specifically arranged for supporting users with asymmetric needs
   SHOULD specify only the asymmetry they support.

Author's Address

   Randall Gellens
   Core Technology Consulting

   Email: rg+ietf@randy.pensive.org