idnits 2.17.1 

draft-ietf-slim-negotiating-human-language-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (February 2, 2017) is 2630 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866)

  == Outdated reference: A later version (-14) exists of
     draft-ietf-slim-multilangcontent-06


     Summary: 1 error (**), 0 flaws (~~), 2 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                         R. Gellens
3	Internet-Draft                                Core Technology Consulting
4	Intended status: Standards Track                        February 2, 2017
5	Expires: August 6, 2017

7	         Negotiating Human Language in Real-Time Communications
8	             draft-ietf-slim-negotiating-human-language-06

10	Abstract

12	   Users have various human (natural) language needs, abilities, and
13	   preferences regarding spoken, written, and signed languages.  When
14	   establishing interactive communication ("calls") there needs to be a
15	   way to negotiate (communicate and match) the caller's language and
16	   media needs with the capabilities of the called party.  This is
17	   especially important with emergency calls, where a call can be
18	   handled by a call taker capable of communicating with the user, or a
19	   translator or relay operator can be bridged into the call during
20	   setup, but this applies to non-emergency calls as well (as an
21	   example, when calling a company call center).

23	   This document describes the need and a solution using new SDP stream
24	   attributes.

26	Status of This Memo

28	   This Internet-Draft is submitted in full conformance with the
29	   provisions of BCP 78 and BCP 79.

31	   Internet-Drafts are working documents of the Internet Engineering
32	   Task Force (IETF).  Note that other groups may also distribute
33	   working documents as Internet-Drafts.  The list of current Internet-
34	   Drafts is at http://datatracker.ietf.org/drafts/current/.

36	   Internet-Drafts are draft documents valid for a maximum of six months
37	   and may be updated, replaced, or obsoleted by other documents at any
38	   time.  It is inappropriate to use Internet-Drafts as reference
39	   material or to cite them other than as "work in progress."

41	   This Internet-Draft will expire on August 6, 2017.

43	Copyright Notice

45	   Copyright (c) 2017 IETF Trust and the persons identified as the
46	   document authors.  All rights reserved.

48	   This document is subject to BCP 78 and the IETF Trust's Legal
49	   Provisions Relating to IETF Documents
50	   (http://trustee.ietf.org/license-info) in effect on the date of
51	   publication of this document.  Please review these documents
52	   carefully, as they describe your rights and restrictions with respect
53	   to this document.  Code Components extracted from this document must
54	   include Simplified BSD License text as described in Section 4.e of
55	   the Trust Legal Provisions and are provided without warranty as
56	   described in the Simplified BSD License.

58	Table of Contents

60	   1.  Introduction
61	   2.  Terminology
62	   3.  Desired Semantics
63	   4.  The existing 'lang' attribute
64	   5.  Proposed Solution
65	     5.1.  Rationale
66	     5.2.  New 'humintlang-send' and 'humintlang-recv' attributes
67	     5.3.  Advisory vs Required
68	     5.4.  Silly States
69	     5.5.  Examples
70	   6.  IANA Considerations
71	   7.  Security Considerations
72	   8.  Privacy Considerations
73	   9.  Changes from Previous Versions
74	     9.1.  Changes from draft-ietf-slim-...-04 to draft-ietf-
75	           slim-...-06
76	     9.2.  Changes from draft-ietf-slim-...-02 to draft-ietf-
77	           slim-...-03
78	     9.3.  Changes from draft-ietf-slim-...-01 to draft-ietf-
79	           slim-...-02
80	     9.4.  Changes from draft-ietf-slim-...-00 to draft-ietf-
81	           slim-...-01
82	     9.5.  Changes from draft-gellens-slim-...-03 to draft-ietf-
83	           slim-...-00
84	     9.6.  Changes from draft-gellens-slim-...-02 to draft-gellens-
85	           slim-...-03
86	     9.7.  Changes from draft-gellens-slim-...-01 to draft-gellens-
87	           slim-...-02
88	     9.8.  Changes from draft-gellens-slim-...-00 to draft-gellens-
89	           slim-...-01
90	     9.9.  Changes from draft-gellens-mmusic-...-02 to draft-
91	           gellens-slim-...-00
92	     9.10. Changes from draft-gellens-mmusic-...-01 to -02
93	     9.11. Changes from draft-gellens-mmusic-...-00 to -01
94	     9.12. Changes from draft-gellens-...-02 to draft-gellens-
95	           mmusic-...-00
97	     9.13. Changes from draft-gellens-...-01 to -02
98	     9.14. Changes from draft-gellens-...-00 to -01
99	   10. Contributors
100	   11. Acknowledgments
101	   12. References
102	     12.1.  Normative References
103	     12.2.  Informational References
104	   Appendix A.  Historic Alternative Proposal: Caller-prefs
105	     A.1.  Use of Caller Preferences Without Additions
106	     A.2.  Additional Caller Preferences for Asymmetric Needs
107	       A.2.1.  Caller Preferences for Asymmetric Modality Needs
108	       A.2.2.  Caller Preferences for Asymmetric Language Tags
109	   Author's Address

111	1.  Introduction

113	   A mutually comprehensible language is helpful for human
114	   communication.  This document addresses the real-time, interactive
115	   side of the issue.  A companion document on language selection in
116	   email [I-D.ietf-slim-multilangcontent] addresses the non-real-time
117	   side.

119	   When setting up interactive communication sessions (using SIP or
120	   other protocols), human (natural) language and media modality
121	   (spoken, signed, written) negotiation may be needed.  Unless the
122	   caller and callee know each other or there is contextual or out of
123	   band information from which the language(s) and media modalities can
124	   be determined, there is a need for spoken, signed, or written
125	   languages to be negotiated based on the caller's needs and the
126	   callee's capabilities.  This need applies to both emergency and non-
127	   emergency calls.  For various reasons, including the ability to
128	   establish multiple streams using different media (e.g., voice, text,
129	   video), it makes sense to use a per-stream negotiation mechanism, in
130	   this case, SDP.

132	   This approach has a number of benefits, including that it is generic
133	   (applies to all interactive communications negotiated using SDP) and
134	   not limited to emergency calls.  In some cases such a facility isn't
135	   needed, because the language is known from the context (such as when
136	   a caller places a call to a sign language relay center, to a friend,
137	   or colleague).  But it is clearly useful in many other cases.  For
138	   example, someone calling a company call center or a Public Safety
139	   Answering Point (PSAP) should be able to indicate if one or more
140	   specific signed, written, and/or spoken languages are preferred, the
141	   callee should be able to indicate its capabilities in this area, and
142	   the call should proceed using in-common language(s) and media forms.

144	   Since this is a protocol mechanism, the user equipment (UE client)
145	   needs to know the user's preferred languages; a reasonable technique
146	   could include a configuration mechanism with a default of the
147	   language of the user interface.  In some cases, a UE could tie
148	   language and media preferences, such as a preference for a video
149	   stream using a signed language and/or a text or audio stream using a
150	   written/spoken language.

152	   Including the user's human (natural) language preferences in the
153	   session establishment negotiation is independent of the use of a
154	   relay service and is transparent to a voice service provider.  For
155	   example, assume a user within the United States who speaks Spanish
156	   but not English places a voice call.  The call could be an emergency
157	   call or perhaps to an airline reservation desk.  The language
158	   information is transparent to the voice service provider, but is part
159	   of the session negotiation between the UE and the terminating entity.
160	   In the case of a call to, e.g., an airline, the call could be
161	   automatically handled by a Spanish-speaking agent.  In the case of an
162	   emergency call, the Emergency Services IP network (ESInet) and the
163	   PSAP may choose to take the language and media preferences into
164	   account when determining how to process the call.

166	   By treating language as another attribute that is negotiated along
167	   with other aspects of a media stream, it becomes possible to
168	   accommodate a range of users' needs and called party facilities.  For
169	   example, some users may be able to speak several languages, but have
170	   a preference.  Some called parties may support some of those
171	   languages internally but require the use of a translation service for
172	   others, or may have a limited number of call takers able to use
173	   certain languages.  Another example would be a user who is able to
174	   speak but is deaf or hard-of-hearing and requires a voice stream plus
175	   a text stream.  Making language a media attribute allows the standard
176	   session negotiation mechanism to handle this by providing the
177	   information and mechanism for the endpoints to make appropriate
178	   decisions.

180	   Regarding relay services, in the case of an emergency call requiring
181	   sign language such as ASL, there are currently two common approaches:
182	   the caller initiates the call to a relay center, or the caller places
183	   the call to emergency services (e.g., 911 in the U.S. or 112 in
184	   Europe).  (In a variant of the second case, the voice service
185	   provider invokes a relay service as well as emergency services.)  In
186	   the former case, the language need is ancillary and supplemental.  In
187	   the non-variant second case, the ESInet and/or PSAP may take the need
188	   for sign language into account and bridge in a relay center.  In this
189	   case, the ESInet and PSAP have all the standard information available
190	   (such as location) but are able to bridge the relay sooner in the
191	   call processing.

193	   By making this facility part of the end-to-end negotiation, the
194	   question of which entity provides or engages the relay service
195	   becomes separate from the call processing mechanics; if the caller
196	   directs the call to a relay service then the human language
197	   negotiation facility provides extra information to the relay service
198	   but calls will still function without it; if the caller directs the
199	   call to emergency services, then the ESInet/PSAP are able to take the
200	   user's human language needs into account, e.g., by assigning to a
201	   specific queue or call taker or bridging in a relay service or
202	   translator.

204	   The term "negotiation" is used here rather than "indication" because
205	   human language (spoken/written/signed) is something that can be
206	   negotiated in the same way as which forms of media (audio/text/video)
207	   or which codecs.  For example, if we think of non-emergency calls,
208	   such as a user calling an airline reservation center, the user may
209	   have a set of languages he or she speaks, with perhaps preferences
210	   for one or a few, while the airline reservation center will support a
211	   fixed set of languages.  Negotiation should select the user's most
212	   preferred language that is supported by the call center.  Both sides
213	   should be aware of which language was negotiated.  This is
214	   conceptually similar to the way other aspects of each media stream
215	   are negotiated using SDP (e.g., media type and codecs).

217	2.  Terminology

219	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
220	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
221	   document are to be interpreted as described in RFC 2119 [RFC2119].

223	3.  Desired Semantics

225	   The desired solution is a media attribute (preferably per direction)
226	   that may be used within an offer to indicate the preferred language
227	   of each (direction of a) media stream, and within an answer to
228	   indicate the accepted language.  The semantics of including multiple
229	   values for a media stream within an offer is that the languages are
230	   listed in order of preference.

232	   (Negotiating multiple simultaneous languages within a media stream is
233	   out of scope, as the complexity of doing so outweighs the
234	   usefulness.)

236	4.  The existing 'lang' attribute

238	   RFC 4566 [RFC4566] specifies an attribute 'lang' which appears
239	   similar to what is needed here, but is not sufficiently detailed for
240	   use here.  In addition, it is not mentioned in [RFC3264] and there
241	   are no known implementations in SIP.  Further, there is value in
242	   being able to specify language per direction (sending and receiving).
243	   This document therefore defines two new attributes.

245	5.  Proposed Solution

247	   An SDP attribute (per direction) seems the natural choice to
248	   negotiate human (natural) language of an interactive media stream.
249	   The attribute value should be a language tag per BCP 47 [RFC5646]

251	5.1.  Rationale

253	   The decision to base the proposal at the media negotiation level, and
254	   specifically to use SDP, came after significant debate and
255	   discussion.  From an engineering standpoint, it is possible to meet
256	   the objectives using a variety of mechanisms, but none are perfect.
257	   None of the proposed alternatives was clearly better technically in
258	   enough ways to win over proponents of the others, and none were
259	   clearly so bad technically as to be easily rejected.  As is often the
260	   case in engineering, choosing the solution is a matter of balancing
261	   trade-offs, and ultimately more a matter of taste than technical
262	   merit.  The two main proposals were to use SDP and SIP.  SDP has the
263	   advantage that the language is negotiated with the media to which it
264	   applies, while SIP has the issue that the languages expressed may not
265	   match the SDP media negotiated (for example, a session could
266	   negotiate video at the SIP level but fail to negotiate any video
267	   media stream at the SDP layer).

269	   The mechanism described here for SDP can be adapted to media
270	   negotiation protocols other than SDP.

272	5.2.  New 'humintlang-send' and 'humintlang-recv' attributes

274	   This document defines two new media-level attributes starting with
275	   'humintlang' (short for "human interactive language") to negotiate
276	   which human language is used in each interactive media stream.  There
277	   are two attributes, one ending in "-send" and the other in "-recv",
278	   registered in Section 6 and described here:

280	      a=humintlang-send:<language tag>
281	      a=humintlang-recv:<language tag>

283	   Each can appear multiple times in an offer for a media stream.

285	   In an offer, 'humintlang-send' indicates the language(s) the offerer
286	   is willing to use when sending using the media, and 'humintlang-recv'
287	   indicates the language(s) the offerer is willing to use when
288	   receiving using the media.  The values constitute a list of languages
289	   in preference order (first is most preferred).  When a media is
290	   intended for use in one direction only (such as a speech-impaired
291	   user sending using text and receiving using audio), either
292	   humintlang-send or humintlang-recv MAY be omitted.  When a media is
293	   not primarily intended for language (for example, a video or audio
294	   stream intended for background only) both SHOULD be omitted.
295	   Otherwise, both SHOULD have the same values in the same order.  The
296	   two SHOULD NOT be set to languages which are difficult to match
297	   together (e.g., specifying a desire to send audio in Hungarian and
298	   receive audio in Portuguese will make it difficult to successfully
299	   complete the call).

301	   In an answer, 'humintlang-send' is the accepted language the answerer
302	   will send (which in most cases is one of the languages in the offer's
303	   'humintlang-recv'), and 'humintlang-recv' is the accepted language
304	   the answerer expects to receive (which in most cases is one of the
305	   languages in the offer's 'humintlang-send').
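
   As a rough illustration, the answer rule above could be sketched in
   Python as follows.  The helper names (select_language, answer_lines)
   and the case-insensitive exact comparison are assumptions of this
   sketch, not something this document defines; an implementation
   might instead use one of the matching mechanisms that BCP 47
   describes.  The sketch also assumes that any trailing asterisk has
   already been removed from the offered values.

```python
def select_language(offered, supported):
    """Return the first offered tag the answerer supports, else None.

    'offered' is in the offerer's preference order (most preferred first).
    Comparison is a simple case-insensitive exact match (an assumption).
    """
    supported_lower = {tag.lower() for tag in supported}
    for tag in offered:
        if tag.lower() in supported_lower:
            return tag
    return None


def answer_lines(offer_send, offer_recv, supported):
    """Build the answer's humintlang lines for one media stream."""
    lines = []
    # The answerer sends in a language the offerer is willing to receive.
    send = select_language(offer_recv, supported)
    # The answerer receives a language the offerer is willing to send.
    recv = select_language(offer_send, supported)
    if send is not None:
        lines.append(f"a=humintlang-send:{send}")
    if recv is not None:
        lines.append(f"a=humintlang-recv:{recv}")
    return lines
```

   With an offer listing es, eu, and en in both directions and an
   answerer that supports only en and eu, this sketch selects eu in
   both directions, since eu is the offerer's most preferred language
   that the answerer supports.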

307	   Each value MUST be a language tag per BCP 47 [RFC5646].  BCP 47
308	   describes mechanisms for matching language tags.  Note that [RFC5646]
309	   Section 4.1 advises to "tag content wisely" and not include
310	   unnecessary subtags.

312	   In an offer, each language tag value MAY have an asterisk appended as
313	   the last character (after the language tag).  The asterisk indicates
314	   a request by the caller to not fail the call if there is no language
315	   in common.  See Section 5.3 for more information and discussion.
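
   A minimal Python sketch of how an offerer might generate these
   attribute lines for one media stream is shown here.  The function
   name offer_attributes and its do_not_fail parameter are hypothetical
   names chosen for this illustration; the sketch simply appends the
   asterisk to every value when the caller asks not to fail the call.

```python
def offer_attributes(send_langs, recv_langs, do_not_fail=False):
    """Return humintlang lines for one media stream in an offer.

    send_langs and recv_langs are lists of BCP 47 language tags in
    preference order (most preferred first).  An empty list means the
    attribute for that direction is omitted.
    """
    # The asterisk is the last character, directly after the language tag.
    suffix = "*" if do_not_fail else ""
    lines = [f"a=humintlang-send:{tag}{suffix}" for tag in send_langs]
    lines += [f"a=humintlang-recv:{tag}{suffix}" for tag in recv_langs]
    return lines
```

   Passing an empty list for one direction corresponds to the case
   where a media stream is intended for use in one direction only.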

317	   When placing an emergency call, and in any other case where the
318	   language cannot be assumed from context, each media stream in an
319	   offer primarily intended for human language communication SHOULD
320	   specify both (or in some cases, one of) the 'humintlang-send' and
321	   'humintlang-recv' attributes.

323	   Note that while signed language tags are used with a video stream to
324	   indicate sign language, a spoken language tag for a video stream in
325	   parallel with an audio stream with the same spoken language tag
326	   indicates a request for a supplemental video stream to see the
327	   speaker.

329	   Clients acting on behalf of end users are expected to set one or both
330	   'humintlang-send' and 'humintlang-recv' attributes on each media
331	   stream primarily intended for human communication in an offer when
332	   placing an outgoing session, and either ignore or take into
333	   consideration the attributes when receiving incoming calls, based on
334	   local configuration and capabilities.  Systems acting on behalf of
335	   call centers and PSAPs are expected to take into account the values
336	   when processing inbound calls.

338	   Note that media and language negotiation might result in more media
339	   streams being accepted than are needed by the users (e.g., if more
340	   preferred and less preferred combinations of media and language are
341	   all accepted).

343	5.3.  Advisory vs Required

345	   One important consideration with this mechanism is whether the call
346	   fails if the callee does not support any of the languages requested
347	   by the caller.

349	   In order to provide for maximum likelihood of a successful
350	   communication session, especially in the case of emergency calling,
351	   the mechanism defined here provides a way for the caller to indicate
352	   a preference for the call failing or succeeding when there is no
353	   language in common.  However, it is OPTIONAL for the callee to honor
354	   this preference.  For example, a PSAP MAY choose to attempt the call
355	   even with no language in common, while a corporate call center MAY
356	   choose to fail the call.

358	   The mechanism for indicating this preference is that, in an offer, if
359	   the last character of any of the 'humintlang-recv' or 'humintlang-
360	   send' values is an asterisk, this indicates a request to not fail the
361	   call (similar to SIP Accept-Language syntax).  Either way, the called
362	   party MAY ignore this, e.g., for the emergency services use case, a
363	   PSAP will likely not fail the call.
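
   The asterisk convention could be read on the receiving side roughly
   as in the following Python sketch.  The names split_value and
   requests_no_failure are assumptions made for illustration, and the
   sketch deliberately leaves the final decision to local policy, since
   honoring the request is OPTIONAL.

```python
def split_value(value):
    """Split a humintlang value into (language_tag, has_asterisk)."""
    if value.endswith("*"):
        return value[:-1], True
    return value, False


def requests_no_failure(values):
    """True if any humintlang-send or humintlang-recv value in the
    offer ends with an asterisk."""
    return any(split_value(value)[1] for value in values)
```

   A callee could combine the result of requests_no_failure with its
   own policy (for example, a PSAP proceeding regardless) when no
   language is in common.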

365	5.4.  Silly States

367	   It is possible to specify a "silly state" where the language
368	   specified does not make sense for the media type, such as specifying
369	   a signed language for an audio media stream.

371	   An offer MUST NOT be created where the language does not make sense
372	   for the media type.  If such an offer is received, the receiver MAY
373	   reject the media, ignore the language specified, or attempt to
374	   interpret the intent (e.g., if American Sign Language is specified
375	   for an audio media stream, this might be interpreted as a desire to
376	   use spoken English).
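
   A receiver that wants to detect the audio case described above might
   use a check along the lines of this Python sketch.  The set
   SIGNED_LANGUAGE_SUBTAGS is an illustrative subset only (American,
   British, and German Sign Language), and both it and the function
   names are assumptions of this sketch; a real implementation would
   need a complete source of signed language subtags.

```python
# Illustrative subset of signed language primary subtags (assumption):
# ase = American Sign Language, bfi = British Sign Language,
# gsg = German Sign Language.
SIGNED_LANGUAGE_SUBTAGS = {"ase", "bfi", "gsg"}


def is_signed(value):
    """True if the value's primary language subtag is a known signed
    language.  A trailing asterisk, if present, is ignored."""
    primary = value.rstrip("*").split("-")[0].lower()
    return primary in SIGNED_LANGUAGE_SUBTAGS


def is_silly_state(media_type, value):
    """True when a signed language is specified for an audio stream."""
    return media_type == "audio" and is_signed(value)
```

   What to do once such a state is detected (reject the media, ignore
   the language, or interpret the intent) remains a local decision.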

378	   A spoken language tag for a video stream in conjunction with an audio
379	   stream with the same language might indicate a request for
380	   supplemental video to see the speaker.

382	5.5.  Examples

384	   Some examples are shown below.  Only the most directly relevant
385	   portions of the SDP block are shown, for clarity.

387	      m=audio 49170 RTP/AVP 0
388	      a=humintlang-send:en
389	      a=humintlang-recv:en

391	      m=video 51372 RTP/AVP 31 32
392	      a=humintlang-send:ase*
393	      a=humintlang-recv:ase*

395	      m=audio 49250 RTP/AVP 20
396	      a=humintlang-send:es*
397	      a=humintlang-recv:es*
398	      a=humintlang-send:eu*
399	      a=humintlang-recv:eu*
400	      a=humintlang-send:en*
401	      a=humintlang-recv:en*

403	      m=text 45020 RTP/AVP 103 104
404	      a=humintlang-send:el
405	      a=humintlang-recv:el
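
   For readers who want to experiment with fragments like those above,
   the following Python sketch collects the attribute values for each
   media stream.  The function name parse_humintlang and the returned
   structure are assumptions made for this illustration; the sketch
   handles only the lines shown in these examples and is not a general
   SDP parser.

```python
def parse_humintlang(sdp_text):
    """Return a list of (media_line, {"send": [...], "recv": [...]}).

    Values are kept in the order they appear, which is the preference
    order, and any trailing asterisk is left in place.
    """
    prefix = "a=humintlang-"
    streams = []
    for raw in sdp_text.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # A new media description starts; attributes that follow
            # belong to it.
            streams.append((line, {"send": [], "recv": []}))
        elif line.startswith(prefix) and streams:
            direction, _, value = line[len(prefix):].partition(":")
            if direction in ("send", "recv"):
                streams[-1][1][direction].append(value)
    return streams
```

   Applied to the third example above, the "send" and "recv" lists
   would each hold the values es*, eu*, and en* in that order.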

407	6.  IANA Considerations

409	   IANA is kindly requested to add two entries to the 'att-field (media
410	   level only)' table of the SDP parameters registry:

412	   Contact Name:  Randall Gellens
413	   Contact Email Address:  rg+ietf@randy.pensive.org
414	   Attribute Name:  humintlang-recv
415	   Attribute Syntax:

417	      humintlang-value =  Language-Tag [ asterisk ]
418	                          ; Language-Tag defined in RFC 5646
419	      asterisk         =  "*"

421	   Attribute Semantics:  Described in Section 5.2 of TBD: THIS DOCUMENT
422	   Usage Level:  media
423	   Charset Dependent:  No
424	   Purpose:  See Section 5.2 of TBD: THIS DOCUMENT
425	   O/A Procedures:  See Section 5.2 of TBD: THIS DOCUMENT
426	   Reference:  TBD: THIS DOCUMENT

428	   Contact Name:  Randall Gellens
429	   Contact Email Address:  rg+ietf@randy.pensive.org
430	   Attribute Name:  humintlang-send
431	   Attribute Syntax:

433	      humintlang-value =  Language-Tag [ asterisk ]
434	                          ; Language-Tag defined in RFC 5646
435	      asterisk         =  "*"

437	   Attribute Semantics:  Described in Section 5.2 of TBD: THIS DOCUMENT
438	   Usage Level:  media
439	   Charset Dependent:  No
440	   Purpose:  See Section 5.2 of TBD: THIS DOCUMENT
441	   O/A Procedures:  See Section 5.2 of TBD: THIS DOCUMENT
442	   Reference:  TBD: THIS DOCUMENT

444	7.  Security Considerations

446	   The Security Considerations of BCP 47 [RFC5646] apply here.  In
447	   addition, if the 'humintlang-send' or 'humintlang-recv' values are
448	   altered or deleted en route, the session could fail or languages
449	   incomprehensible to the caller could be selected; however, this is
450	   also a risk if any SDP parameters are modified en route.

452	8.  Privacy Considerations

454	   Language and media information can suggest a user's nationality,
455	   background, abilities, disabilities, etc.

457	9.  Changes from Previous Versions

459	   RFC EDITOR: Please remove this section prior to publication.

461	9.1.  Changes from draft-ietf-slim-...-04 to draft-ietf-slim-...-06

463	   o  Deleted Section 3 ("Expected Use")
464	   o  Reworded modalities in Introduction from "voice, video, text" to
465	      "spoken, signed, written"
466	   o  Reworded text about "increasingly fine-grained distinctions" to
467	      instead merely point to BCP 47 Section 4.1's advice to "tag
468	      content wisely" and not include unnecessary subtags
469	   o  Changed IANA registration of new SDP attributes to follow RFC 4566
470	      template with extra fields suggested in 4566-bis (expired draft)
471	   o  Deleted "(known as voice carry over)"
472	   o  Changed textual instances of RFC 5646 to BCP 47, although actual
473	      reference remains RFC due to xml2rfc limitations

475	9.2.  Changes from draft-ietf-slim-...-02 to draft-ietf-slim-...-03

477	   o  Added Examples
478	   o  Added Privacy Considerations section
479	   o  Other editorial changes for clarity

481	9.3.  Changes from draft-ietf-slim-...-01 to draft-ietf-slim-...-02

483	   o  Deleted most of Section 4 and replaced with a very short summary
484	   o  Replaced "wishes to" with "is willing to" in Section 5.2
485	   o  Reworded description of attribute usage to clarify when to set
486	      both, only one, or neither
487	   o  Deleted all uses of "IMS"
488	   o  Other editorial changes for clarity

490	9.4.  Changes from draft-ietf-slim-...-00 to draft-ietf-slim-...-01

492	   o  Editorial changes to wording in Section 5.

494	9.5.  Changes from draft-gellens-slim-...-03 to draft-ietf-slim-...-00

496	   o  Updated title to reflect WG adoption

498	9.6.  Changes from draft-gellens-slim-...-02 to draft-gellens-
499	      slim-...-03

501	   o  Removed Use Cases section, per face-to-face discussion at IETF 93
502	   o  Removed discussion of routing, per face-to-face discussion at IETF
503	      93

505	9.7.  Changes from draft-gellens-slim-...-01 to draft-gellens-
506	      slim-...-02

508	   o  Updated NENA usage mention
509	   o  Removed background text reference to draft-saintandre-sip-xmpp-
510	      chat-04 since that draft expired

512	9.8.  Changes from draft-gellens-slim-...-00 to draft-gellens-
513	      slim-...-01

515	   o  Revision to keep draft from expiring

517	9.9.  Changes from draft-gellens-mmusic-...-02 to draft-gellens-
518	      slim-...-00

520	   o  Changed name from -mmusic- to -slim- to reflect proposed WG name
521	   o  As a result of the face-to-face discussion in Toronto, the SDP vs
522	      SIP issue was resolved by going back to SDP, taking out the SIP
523	      hint, and converting what had been a set of alternate proposals
524	      for various ways of doing it within SIP into an informative annex
525	      section which includes background on why SDP is the proposal
526	   o  Added mention that enabling a mutually comprehensible language is
527	      a general problem of which this document addresses the real-time
528	      side, with reference to [I-D.ietf-slim-multilangcontent] which
529	      addresses the non-real-time side.

531	9.10.  Changes from draft-gellens-mmusic-...-01 to -02

533	   o  Added clarifying text on leaving attributes unset for media not
534	      primarily intended for human language communication (e.g.,
535	      background audio or video).
536	   o  Added new section Appendix A ("Alternative Proposal: Caller-
537	      prefs") discussing use of SIP-level Caller-prefs instead of SDP-
538	      level.

540	9.11.  Changes from draft-gellens-mmusic-...-00 to -01

542	   o  Relaxed language on setting -send and -receive to same values;
543	      added text on leaving one empty to indicate asymmetric usage.
544	   o  Added text that clients on behalf of end users are expected to set
545	      the attributes on outgoing calls and ignore on incoming calls
546	      while systems on behalf of call centers and PSAPs are expected to
547	      take the attributes into account when processing incoming calls.

549	9.12.  Changes from draft-gellens-...-02 to draft-gellens-mmusic-...-00

551	   o  Updated text to refer to RFC 5646 rather than the IANA language
552	      subtags registry directly.
553	   o  Moved discussion of existing 'lang' attribute out of "Proposed
554	      Solution" section and into own section now that it is not part of
555	      proposal.
556	   o  Updated text about existing 'lang' attribute.
557	   o  Added example use cases.
558	   o  Replaced proposed single 'humintlang' attribute with 'humintlang-
559	      send' and 'humintlang-recv' per Harald's request/information that
560	      it was a misuse of SDP to use the same attribute for sending and
561	      receiving.
562	   o  Added section describing usage being advisory vs required and text
563	      in attribute section.
564	   o  Added section on SIP "hint" header (not yet nailed down between
565	      new and existing header).
566	   o  Added text discussing usage in policy-based routing function or
567	      use of SIP header "hint" if unable to do so.
568	   o  Added SHOULD that the value of the parameters stick to the largest
569	      granularity of language tags.

571	   o  Added text to Introduction to try to be clearer about the
572	      purpose of document and problem being solved.
573	   o  Many wording improvements and clarifications throughout the
574	      document.
575	   o  Filled in Security Considerations.
576	   o  Filled in IANA Considerations.
577	   o  Added to Acknowledgments those who participated in the Orlando ad-
578	      hoc discussion as well as those who participated in email
579	      discussion and side one-on-one discussions.

581	9.13.  Changes from draft-gellens-...-01 to -02

583	   o  Updated text for (possible) new attribute "humintlang" to
584	      reference RFC 5646
585	   o  Added clarifying text for (possible) re-use of existing 'lang'
586	      attribute saying that the registration would be updated to reflect
587	      different semantics for multiple values for interactive versus
588	      non-interactive media.
589	   o  Added clarifying text for (possible) new attribute "humintlang" to
590	      attempt to better describe the role of language tags in media in
591	      an offer and an answer.

593	9.14.  Changes from draft-gellens-...-00 to -01

595	   o  Changed name of (possible) new attribute from "humlang" to
596	      "humintlang"
597	   o  Added discussion of silly state (language not appropriate for
598	      media type)
599	   o  Added Voice Carry Over example
600	   o  Added mention of multilingual people and multiple languages
601	   o  Minor text clarifications

603	10.  Contributors

605	   Gunnar Hellstrom deserves special mention for his reviews,
606	   assistance, and especially for contributing the core text in
607	   Appendix A.

609	11.  Acknowledgments

611	   Many thanks to Bernard Aboba, Harald Alvestrand, Flemming Andreasen,
612	   Francois Audet, Eric Burger, Keith Drage, Doug Ewell, Christian
613	   Groves, Andrew Hutton, Hadriel Kaplan, Ari Keranen, John Klensin,
614	   Paul Kyzivat, John Levine, Alexey Melnikov, James Polk, Pete Resnick,
615	   Peter Saint-Andre, and Dale Worley for reviews, corrections,
616	   suggestions, and participation in in-person and email discussions.

618	12.  References

620	12.1.  Normative References

622	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
623	              Requirement Levels", BCP 14, RFC 2119,
624	              DOI 10.17487/RFC2119, March 1997,
625	              <http://www.rfc-editor.org/info/rfc2119>.

627	   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
628	              Description Protocol", RFC 4566, DOI 10.17487/RFC4566,
629	              July 2006, <http://www.rfc-editor.org/info/rfc4566>.

631	   [RFC5646]  Phillips, A., Ed. and M. Davis, Ed., "Tags for Identifying
632	              Languages", BCP 47, RFC 5646, DOI 10.17487/RFC5646,
633	              September 2009, <http://www.rfc-editor.org/info/rfc5646>.

635	12.2.  Informational References

637	   [I-D.ietf-slim-multilangcontent]
638	              Tomkinson, N. and N. Borenstein, "Multiple Language
639	              Content Type", draft-ietf-slim-multilangcontent-06 (work
640	              in progress), October 2016.

642	   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
643	              with Session Description Protocol (SDP)", RFC 3264,
644	              DOI 10.17487/RFC3264, June 2002,
645	              <http://www.rfc-editor.org/info/rfc3264>.

647	   [RFC3840]  Rosenberg, J., Schulzrinne, H., and P. Kyzivat,
648	              "Indicating User Agent Capabilities in the Session
649	              Initiation Protocol (SIP)", RFC 3840,
650	              DOI 10.17487/RFC3840, August 2004,
651	              <http://www.rfc-editor.org/info/rfc3840>.

653	   [RFC3841]  Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Caller
654	              Preferences for the Session Initiation Protocol (SIP)",
655	              RFC 3841, DOI 10.17487/RFC3841, August 2004,
656	              <http://www.rfc-editor.org/info/rfc3841>.

658	Appendix A.  Historic Alternative Proposal: Caller-prefs

660	   The decision to base the proposal at the media negotiation level, and
661	   specifically to use SDP, came after significant debate and
662	   discussion.  It is possible to meet the objectives using a variety of
663	   mechanisms, but none are perfect.  Using SDP means dealing with the
664	   complexity of SDP, and leaves out real-time session protocols that do
665	   not use SDP.  The major alternative proposal was to use SIP.  Using
666	   SIP leaves out non-SIP session protocols, but more fundamentally,
667	   would occur at a different layer than the media negotiation.  This
668	   results in a more fragile solution since the media modality and
669	   language would be negotiated using SIP, and then the specific media
670	   formats (which inherently include the modality) would be negotiated
671	   at a different level (typically SDP, especially in the emergency
672	   calling cases), making it easier to have mismatches (such as where
673	   the media modality negotiated in SIP does not match what was
674	   negotiated using SDP).

676	   An alternative proposal was to use the SIP-level Caller Preferences
677	   mechanism from RFC 3840 [RFC3840] and RFC 3841 [RFC3841].

679	   The Caller-prefs mechanism includes a priority system; this would
680	   allow different combinations of media and languages to be assigned
681	   different priorities.  The evaluation and decisions on what to do
682	   with the call can be done either by proxies along the call path, or
683	   by the addressed UA.  Evaluation of alternatives for routing is
684	   described in RFC 3841 [RFC3841].

686	A.1.  Use of Caller Preferences Without Additions

688	   The following would be possible without adding any new registered
689	   tags:

691	   Potential callers and recipients MAY include in the Contact field in
692	   their SIP registrations media and language tags according to the
693	   joint capabilities of the UA and the human user according to RFC 3840
694	   [RFC3840].

696	   The most relevant media capability tags are "video", "text" and
697	   "audio".  Each tag represents a capability to use the media in two-
698	   way communication.

700	   Language capabilities are declared with a comma-separated list of
701	   languages that can be used in the call as parameters to the tag
702	   "language=".

704	   This is an example of how it is used in a SIP REGISTER:

706	      REGISTER    user@example.net
707	      Contact:    <sip:user1@example.net> audio; video; text;
708	                  language="en,es,ase"

710	   Including this information in SIP REGISTER allows proxies to act on
711	   the information.  For the problem set addressed by this document, it
712	   is not anticipated that proxies will do so using registration data.
713	   Further, there are classes of devices (such as cellular mobile
714	   phones) that are not anticipated to include this information in their
715	   registrations.  Hence, use in registration is OPTIONAL.

717	   In a call, a list of acceptable media and language combinations is
718	   declared, and a priority assigned to each combination.

720	   This is done by the Accept-Contact header field, which defines
721	   different combinations of media and languages and assigns priorities
722	   for completing the call with the SIP URI represented by that Contact.
723	   A priority is assigned to each set as a so-called "q-value" which
724	   ranges from 1 (most preferred) to 0 (least preferred).

726	   Using the Accept-Contact header field in INVITE requests and
727	   responses allows these capabilities to be expressed and used during
728	   call set-up.  Clients SHOULD include this information in INVITE
729	   requests and responses.

731	   Example:

733	      Accept-Contact:    *; text; language="en"; q=0.2
734	      Accept-Contact:    *; video; language="ase"; q=0.8

736	   This example shows the highest preference expressed by the caller is
737	   to use video with American Sign Language (language code "ase").  As a
738	   fallback, it is acceptable to get the call connected with only
739	   English text used for human communication.  Other media may of course
740	   be connected as well, without expectation that it will be usable by
741	   the caller for interactive communications (but may still be helpful
742	   to the caller).
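
   The ranking described above can be sketched in Python.  This is a
   minimal illustration, not the RFC 3841 matching procedure: it
   assumes each Accept-Contact value carries a "q" parameter exactly as
   in the example, that no quoted value contains a semicolon, and that
   a missing "q" defaults to 1.0.  The function names are hypothetical.

```python
def parse_accept_contact(value):
    """Split one Accept-Contact value into (feature_params, q)."""
    parts = [p.strip() for p in value.split(";")]
    features = {}
    q = 1.0  # assumption: default used when no q parameter is present
    for part in parts[1:]:  # parts[0] is the "*" wildcard
        if not part:
            continue
        name, sep, raw = part.partition("=")
        name = name.strip().lower()
        raw = raw.strip().strip('"')
        if name == "q":
            q = float(raw)
        elif sep:
            # Valued parameter such as language="en,es"
            features[name] = [v.strip() for v in raw.split(",")]
        else:
            # Bare media tag such as "video" or "text"
            features[name] = True
    return features, q


def rank_preferences(values):
    """Return parsed preferences ordered from most to least preferred."""
    parsed = [parse_accept_contact(v) for v in values]
    return sorted(parsed, key=lambda item: item[1], reverse=True)


headers = [
    '*; text; language="en"; q=0.2',
    '*; video; language="ase"; q=0.8',
]
ranked = rank_preferences(headers)
```

   With the two header values from the example, the first entry of
   "ranked" is the video/"ase" combination and the second is the
   text/"en" fallback.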

744	   This system satisfies all the needs described in the previous
745	   sections, with two exceptions: language specifications make no
746	   distinction between spoken and written language, and the need
747	   for directionality in the specification cannot be met.

749	   To some degree, the lack of a distinction between speech and text in
750	   language tags can be compensated for by specifying only the
751	   important medium in the Accept-Contact field.

753	   Thus, a user who wants to use English mainly for text would specify:

755	      Accept-Contact:    *;text;language="en";q=1.0

757	   While a user who wants to use English mainly for speech but accept it
758	   for text would specify:

760	      Accept-Contact:    *;audio;language="en";q=0.8
761	      Accept-Contact:    *;text;language="en";q=0.2

763	   However, a user who would like to talk but receive text back has no
764	   way to express this with the existing specification.

766	A.2.  Additional Caller Preferences for Asymmetric Needs

768	   In order to be able to specify asymmetric preferences, there are two
769	   possibilities.  Either new language tags in the style of the
770	   humintlang parameters described above for SDP could be registered, or
771	   additional media tags describing the asymmetry could be registered.

773	A.2.1.  Caller Preferences for Asymmetric Modality Needs

775	   The following new media tags should be defined:

777	      speech-receive
778	      speech-send
779	      text-receive
780	      text-send
781	      sign-send
782	      sign-receive

784	   A user who prefers to talk and get text in return in English would
785	   register the following (if including this information in registration
786	   data):

788	      REGISTER    user@example.net
789	      Contact:    <sip:user1@example.net> audio;text;speech-send;
790	                  text-receive;language="en"

792	   At call time, a user who prefers to talk and get text in return in
793	   English would set the Accept-Contact header field to:

795	      Accept-Contact:    *; audio; text; speech-receive; text-send;
796	                         language="en";q=0.8
797	      Accept-Contact:    *; text; language="en"; q=0.2

799	   Note that the directions specified here are as viewed from the callee
800	   side to match what the callee has registered.

802	   A bridge arranged for invoking a relay service specifically arranged
803	   for captioned telephony would register the following for supporting
804	   calling users:

806	      REGISTER    ct@ctrelay.net
807	      Contact:    <sip:ct1@ctrelay.net> audio; text; speech-receive;
808	                  text-send; language="en"

810	   A bridge arranged for invoking a relay service specifically arranged
811	   for captioned telephony would register the following for supporting
812	   called users:

814	      REGISTER    ct@ctrelay.net
815	      Contact:    <sip:ct2@ctrelay.net> audio; text; speech-send;
816	                  text-receive; language="en"

818	   At call time, these alternatives are included in the list of possible
819	   outcomes of the call routing by the SIP proxies, and the proper relay
820	   service is invoked.
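
   How a proxy might select between the two relay registrations can be
   sketched as follows.  This is a simplified illustration and not the
   RFC 3841 algorithm: it assumes a registration is a candidate when it
   carries every media tag named in the caller's preference and shares
   at least one language.  The helper names are hypothetical, and the
   relay URIs use the host name spelled as in the REGISTER lines.

```python
def feature_set(params):
    """Turn a ';'-separated parameter string into (tags, languages)."""
    tags, languages = set(), set()
    for part in params.split(";"):
        part = part.strip()
        if not part or part == "*":
            continue
        name, _, raw = part.partition("=")
        name = name.strip().lower()
        raw = raw.strip().strip('"')
        if name == "language":
            languages = {v.strip().lower() for v in raw.split(",")}
        elif name == "q":
            continue  # priority is ignored for matching in this sketch
        else:
            tags.add(name)
    return tags, languages


def matches(preference, registration):
    """True if the registration offers every requested tag and a shared language."""
    want_tags, want_langs = feature_set(preference)
    have_tags, have_langs = feature_set(registration)
    return want_tags <= have_tags and bool(want_langs & have_langs)


registered = {
    "sip:ct1@ctrelay.net": 'audio; text; speech-receive; text-send; language="en"',
    "sip:ct2@ctrelay.net": 'audio; text; speech-send; text-receive; language="en"',
}
preference = '*; audio; text; speech-receive; text-send; language="en"; q=0.8'
candidates = [uri for uri, params in registered.items()
              if matches(preference, params)]
```

   Under these assumptions only the first relay registration is a
   candidate for the caller's preferred combination, because the
   direction tags are expressed from the callee side.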

822	A.2.2.  Caller Preferences for Asymmetric Language Tags

824	   An alternative is to register new language tags for the purpose of
825	   asymmetric language usage.

827	   Instead of using "language=", six new language tags would be
828	   registered:

830	      humintlang-text-recv
831	      humintlang-text-send
832	      humintlang-speech-recv
833	      humintlang-speech-send
834	      humintlang-sign-recv
835	      humintlang-sign-send

837	   These language tags would be used instead of the regular
838	   bidirectional language tags, and users with bidirectional
839	   capabilities SHOULD specify values for both directions.  Services
840	   specifically arranged for supporting users with asymmetric needs
841	   SHOULD specify only the asymmetry they support.
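
   As a rough sketch of how such tags might be assembled, the Python
   fragment below builds a parameter string from per-direction language
   capabilities.  The value syntax (a quoted, comma-separated list, as
   used with "language=") is an assumption, since this appendix does
   not define one, and the function name is hypothetical.

```python
MODALITIES = ("text", "speech", "sign")
DIRECTIONS = ("send", "recv")


def humintlang_params(capabilities):
    """Build a ';'-joined parameter string.

    capabilities maps (modality, direction) to a list of language tags.
    Combinations with no languages are omitted.
    """
    params = []
    for modality in MODALITIES:
        for direction in DIRECTIONS:
            langs = capabilities.get((modality, direction))
            if langs:
                joined = ",".join(langs)
                params.append(f'humintlang-{modality}-{direction}="{joined}"')
    return ";".join(params)


# A user who talks and receives text, both in English.
asymmetric = humintlang_params({
    ("speech", "send"): ["en"],
    ("text", "recv"): ["en"],
})
```

   A user with bidirectional capabilities would supply entries for both
   the "send" and "recv" directions of the same modality.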

843	Author's Address

845	   Randall Gellens
846	   Core Technology Consulting

848	   Email: rg+ietf@randy.pensive.org