idnits 2.17.1 

draft-hellstrom-avtcore-multi-party-rtt-solutions-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses
     in the document.  If these are example addresses, they should be changed.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (30 October 2020) is 1268 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'ICE' is mentioned on line 1041, but not defined

  == Unused Reference: 'RFC3264' is defined on line 1968, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-20) exists of
     draft-ietf-avtcore-multi-party-rtt-mix-06


     Summary: 0 errors (**), 0 flaws (~~), 5 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	Internet Engineering Task Force                             G. Hellstrom
3	Internet-Draft                 Gunnar Hellstrom Accessible Communication
4	Intended status: Informational                           30 October 2020
5	Expires: 3 May 2021

7	           Real-time text solutions for multi-party sessions
8	          draft-hellstrom-avtcore-multi-party-rtt-solutions-04

10	Abstract

12	   This document specifies methods for Real-Time Text (RTT) media
13	   handling in multi-party calls.  The main discussed transport is to
14	   carry Real-Time text by the RTP protocol in a time-sampled mode
15	   according to RFC 4103.  The mechanisms enable the receiving
16	   application to present the received real-time text media, separated
17	   per source, in different ways according to user preferences.  Some
18	   presentation related features are also described explaining suitable
19	   variations of transmission and presentation of text.

21	   Call control features are described for the SIP environment.  A
22	   number of alternative methods for providing the multi-party
23	   negotiation, transmission and presentation are discussed and a
24	   recommendation for the main ones is provided.  The main solution for
25	   SIP based centralized multi-party handling of real-time text is
26	   achieved through a media control unit coordinating multiple RTP text
27	   streams into one RTP stream.

29	   Alternative methods using a single RTP stream and source
30	   identification inline in the text stream are also described, one of
31	   them being provided as a lower functionality fallback method for
32	   endpoints with no multi-party awareness for RTT.

34	   Bridging methods where the text stream is carried without the
35	   contents being dealt with in detail by the bridge are also discussed.

37	   Brief information is also provided for multi-party RTT in the WebRTC
38	   environment.

40	   The intention is to provide background for decisions, specification
41	   and implementation of selected methods.

43	Status of This Memo

45	   This Internet-Draft is submitted in full conformance with the
46	   provisions of BCP 78 and BCP 79.

48	   Internet-Drafts are working documents of the Internet Engineering
49	   Task Force (IETF).  Note that other groups may also distribute
50	   working documents as Internet-Drafts.  The list of current Internet-
51	   Drafts is at https://datatracker.ietf.org/drafts/current/.

53	   Internet-Drafts are draft documents valid for a maximum of six months
54	   and may be updated, replaced, or obsoleted by other documents at any
55	   time.  It is inappropriate to use Internet-Drafts as reference
56	   material or to cite them other than as "work in progress."

58	   This Internet-Draft will expire on 3 May 2021.

60	Copyright Notice

62	   Copyright (c) 2020 IETF Trust and the persons identified as the
63	   document authors.  All rights reserved.

65	   This document is subject to BCP 78 and the IETF Trust's Legal
66	   Provisions Relating to IETF Documents (https://trustee.ietf.org/
67	   license-info) in effect on the date of publication of this document.
68	   Please review these documents carefully, as they describe your rights
69	   and restrictions with respect to this document.  Code Components
70	   extracted from this document must include Simplified BSD License text
71	   as described in Section 4.e of the Trust Legal Provisions and are
72	   provided without warranty as described in the Simplified BSD License.

74	Table of Contents

76	   1.  Introduction  . . . . . . . . . . . . . . . . . . . . . . . .   4
77	     1.1.  Requirements Language . . . . . . . . . . . . . . . . . .   5
78	   2.  Centralized conference model  . . . . . . . . . . . . . . . .   5
79	   3.  Requirements on multi-party RTT . . . . . . . . . . . . . . .   6
80	     3.1.  General requirements  . . . . . . . . . . . . . . . . . .   6
81	     3.2.  Performance requirements  . . . . . . . . . . . . . . . .   7
82	   4.  RTP based solutions . . . . . . . . . . . . . . . . . . . . .   8
83	     4.1.  Coordination of text RTP streams  . . . . . . . . . . . .   8
84	       4.1.1.  RTP-based solutions with a central mixer  . . . . . .   8
85	         4.1.1.1.  RTP Mixer using default RFC 4103 methods  . . . .   8
86	         4.1.1.2.  RTP Mixer using the default method but decreased
87	                 transmission interval . . . . . . . . . . . . . . .   9
88	         4.1.1.3.  RTP Mixer with frequent transmission and indicating
89	                 sources in CSRC-list  . . . . . . . . . . . . . . .  10
90	         4.1.1.4.  RTP Mixer using timestamp to identify
91	                 redundancy  . . . . . . . . . . . . . . . . . . . .  11
92	         4.1.1.5.  RTP Mixer with multiple primary data in each packet
93	                 and individual sequence numbers . . . . . . . . . .  12
94	         4.1.1.6.  RTP Mixer with multiple primary data in each
95	                 packet  . . . . . . . . . . . . . . . . . . . . . .  13

97	         4.1.1.7.  RTP Mixer with RFC 5109 FEC and RFC 2198 redundancy
98	                 in the packets  . . . . . . . . . . . . . . . . . .  14
99	         4.1.1.8.  RTP Mixer with RFC 5109 FEC and RFC 2198 redundancy
100	                 and separate sequence number in the packets . . . .  16
101	         4.1.1.9.  RTP Mixer indicating participants by a control code
102	                 in the stream . . . . . . . . . . . . . . . . . . .  18
103	         4.1.1.10. Mixing for multi-party unaware user agents  . . .  20
104	       4.1.2.  RTP-based bridging with minor RTT media contents
105	               reformatting by the bridge  . . . . . . . . . . . . .  21
106	         4.1.2.1.  RTP Translator sending one RTT stream per
107	                 participant . . . . . . . . . . . . . . . . . . . .  21
108	         4.1.2.2.  Distributing packets in an end-to-end encryption
109	                 structure . . . . . . . . . . . . . . . . . . . . .  24
110	         4.1.2.3.  Mesh of RTP endpoints . . . . . . . . . . . . . .  25
111	         4.1.2.4.  Multiple RTP sessions, one for each
112	                 participant . . . . . . . . . . . . . . . . . . . .  25
113	   5.  Preferred RTP-based multi-party RTT transport method  . . . .  26
114	   6.  Session control of RTP-based multi-party RTT sessions . . . .  26
115	     6.1.  Implicit RTT multi-party capability indication  . . . . .  27
116	     6.2.  RTT multi-party capability declared by SIP media-tags . .  28
117	     6.3.  SDP media attribute for RTT multi-party capability
118	           indication  . . . . . . . . . . . . . . . . . . . . . . .  29
119	     6.4.  Simplified SDP media attribute for RTT multi-party
120	           capability indication . . . . . . . . . . . . . . . . . .  31
121	     6.5.  SDP format parameter for RTT multi-party capability
122	           indication  . . . . . . . . . . . . . . . . . . . . . . .  31
123	     6.6.  A text media subtype for support of multi-party rtt . . .  33
124	     6.7.  Preferred capability declaration method for RTP-based
125	           transport.  . . . . . . . . . . . . . . . . . . . . . . .  33
126	     6.8.  Identification of the source of text for RTP-based
127	           solutions . . . . . . . . . . . . . . . . . . . . . . . .  33
128	   7.  RTT bridging in WebRTC  . . . . . . . . . . . . . . . . . . .  34
129	     7.1.  RTT bridging in WebRTC with one data channel per
130	           source  . . . . . . . . . . . . . . . . . . . . . . . . .  34
131	     7.2.  RTT bridging in WebRTC with one common data channel . . .  35
132	     7.3.  Preferred rtt multi-party method for WebRTC . . . . . . .  35
133	   8.  Presentation of multi-party text  . . . . . . . . . . . . . .  36
134	     8.1.  Associating identities with text streams  . . . . . . . .  36
135	     8.2.  Presentation details for multi-party aware endpoints. . .  36
136	       8.2.1.  Bubble style presentation . . . . . . . . . . . . . .  37
137	       8.2.2.  Other presentation styles . . . . . . . . . . . . . .  38
138	   9.  Presentation details for multi-party unaware endpoints. . . .  39
139	   10. Security Considerations . . . . . . . . . . . . . . . . . . .  39
140	   11. IANA Considerations . . . . . . . . . . . . . . . . . . . . .  39
141	   12. Congestion considerations . . . . . . . . . . . . . . . . . .  40
142	   13. Acknowledgements  . . . . . . . . . . . . . . . . . . . . . .  40
143	   14. Change history  . . . . . . . . . . . . . . . . . . . . . . .  40
144	     14.1.  Changes to
145	            draft-hellstrom-avtcore-multi-party-rtt-solutions-04 . .  40
146	     14.2.  Changes to
147	            draft-hellstrom-avtcore-multi-party-rtt-solutions-03 . .  40
148	     14.3.  Changes to
149	            draft-hellstrom-avtcore-multi-party-rtt-solutions-02 . .  40
150	     14.4.  Changes to
151	            draft-hellstrom-avtcore-multi-party-rtt-solutions-01 . .  40
152	     14.5.  Changes from draft-hellstrom-mmusic-multi-party-rtt-02 to
153	            draft-hellstrom-avtcore-multi-party-rtt-solutions-00 . .  41
154	     14.6.  Changes from version
155	            draft-hellstrom-mmusic-multi-party-rtt-01 to -02 . . . .  41
156	   15. References  . . . . . . . . . . . . . . . . . . . . . . . . .  41
157	     15.1.  Normative References . . . . . . . . . . . . . . . . . .  41
158	     15.2.  Informative References . . . . . . . . . . . . . . . . .  42
159	   Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .  45

161	1.  Introduction

163	   Real-time text (RTT) is a medium in real-time conversational
164	   sessions.  Text entered by participants in a session is transmitted
165	   in a time-sampled fashion, so that no specific user action is needed
166	   to cause transmission.  This gives a direct flow of text at the rate
167	   it is created, which is suitable in a real-time conversational
168	   setting.  The real-time text medium can be combined with other media
169	   in multimedia sessions.

171	   Media from a number of multimedia session participants can be
172	   combined in a multi-party session.  The present document specifies
173	   how the real-time text streams can be handled in multi-party
174	   sessions.  Recommendations are provided for preferred methods.

176	   The description is mainly focused on the transport level, but also
177	   describes a few session and presentation level aspects.

179	   Transport of real-time text is specified in RFC 4103 [RFC4103], "RTP
180	   Payload for Text Conversation".  It uses the Real-time Transport
181	   Protocol (RTP), RFC 3550 [RFC3550].  Robustness against network
182	   transmission problems is normally achieved through redundant
183	   transmission based on the principle from RFC 2198 [RFC2198], with one
184	   primary and two redundant transmissions of each text element.
185	   Primary and redundant transmissions are combined in packets and
186	   described by a redundancy header.  This transport is usually used in
187	   the Session Initiation Protocol (SIP) RFC 3261 [RFC3261] environment.

189	   A very brief overview of functions for real-time text handling in
190	   multi-party sessions is given in RFC 4597 [RFC4597], "Conferencing
191	   Scenarios", Sections 4.8 and 4.10.  The present specification builds
192	   on that description and indicates which protocol mechanisms should be
193	   used to implement multi-party handling of real-time text.

195	   Real-time text can also be transported in the WebRTC environment, by
196	   using WebRTC data channels according to
197	   [I-D.ietf-mmusic-t140-usage-data-channel].  Multi-party aspects for
198	   WebRTC solutions are briefly covered.

200	1.1.  Requirements Language

202	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
203	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
204	   document are to be interpreted as described in RFC 2119 [RFC2119].

206	2.  Centralized conference model

208	   In the centralized conference model for SIP, introduced in RFC 4353
209	   [RFC4353] "A Framework for Conferencing with the Session Initiation
210	   Protocol (SIP)", one function co-ordinates the communication with
211	   participants in the multi-party session.  This function also controls
212	   media mixer functions for the media appearing in the session.  The
213	   central function is common for control of all media, while the media
214	   mixers may work differently for each media.

216	   The central function is called the Focus UA.  Many variants exist for
217	   setting up sessions including the multipoint control centre.  It is
218	   not within scope of this description to describe these, but rather
219	   the media specific handling in the mixer required to handle multi-
220	   party calls with RTT.

222	   The main principle for handling real-time text media in a centralized
223	   conference is that one RTP session for real-time text is established
224	   including the multipoint media control centre and the participating
225	   endpoints which are going to have real-time text exchange with the
226	   others.

228	   The different possible mechanisms for mixing and transporting RTT
229	   differ in the way they multiplex the text streams and how they
230	   identify the sources of the streams.  RFC 7667 [RFC7667] describes a
231	   number of possible use cases for RTP.  This specification refers to
232	   different sections of RFC 7667 for further reading of the situations
233	   caused by the different possible design choices.

235	   The recommended method for using RTP based RTT in a centralized
236	   conference model is specified in
237	   [I-D.ietf-avtcore-multi-party-rtt-mix] based on the recommendations
238	   in this document.

240	   Real-time text can also be transported in the WebRTC environment, by
241	   using WebRTC data channels according to
242	   [I-D.ietf-mmusic-t140-usage-data-channel].  Ways to handle multi-
243	   party calls in that environment are also specified.

245	3.  Requirements on multi-party RTT

247	3.1.  General requirements

249	   The following general requirements are placed on multi-party RTT:

251	      A solution shall be applicable to IMS (3GPP TS 22.173)[TS22173],
252	      SIP based VoIP and Next Generation Emergency Services (NENA i3
253	      [NENAi3], ETSI TS 103 479 [TS103479], RFC 6443[RFC6443]).

255	      The transmission interval for text should not be longer than 500
256	      milliseconds when there is anything available to send.  Ref ITU-T
257	      T.140 [T140].

259	      If text loss is detected or suspected, a missing text marker
260	      should be inserted in the text stream.  Ref ITU-T T.140 Amendment
261	      1 [T140ad1].  ETSI EN 301 549 [EN301549]

263	      The display of text from the members of the conversation shall be
264	      arranged so that the text from each participant is clearly
265	      readable, and its source and the relative timing of entered text
266	      is visualized in the display.  Mechanisms for looking back in the
267	      contents from the current session should be provided.  The text
268	      should be displayed as soon as it is received.  Ref ITU-T T.140
269	      [T140]

271	      Bridges must be multimedia capable (voice, video, text).  Ref NENA
272	      i3 STA-010.2.  [NENAi3]

274	      It MUST be possible to use real-time text in conferences both as a
275	      medium of discussion between individual participants (for example,
276	      for sidebar discussions in real-time text while listening to the
277	      main conference audio) and for central support of the conference
278	      with real-time text interpretation of speech.  Ref (R7) in RFC
279	      5194.[RFC5194]

281	      It should be possible to protect RTT contents with usual means for
282	      privacy and integrity.  Ref RFC 6881 section 16.  [RFC6881]
283	      Conferencing procedures are documented in RFC 4579 [RFC4579].  Ref
284	      NENA i3 STA-010.2.[NENAi3]

286	      Conferencing applies to any kind of media stream by which users
287	      may want to communicate.  Ref 3GPP TS 24.147 [TS24147]

289	      The framework for SIP conferences is specified in RFC 4353
290	      [RFC4353].  Ref 3GPP TS 24.147 [TS24147]

292	3.2.  Performance requirements

294	   The mixer performance requirements can be expressed in one number,
295	   extracted from the user requirements on real-time text expressed in
296	   ITU-T F.700, where it is stated that for "good" usability, text
297	   characters should not be delayed more than 1 second from creation to
298	   presentation.  For "usable" usability the figure is 2 seconds.  The
299	   main factor behind these limits is that turn-taking in a
300	   conversation is disturbed when a response is delayed in becoming
301	   visible to the receiving party.  If that time gets too long, the
302	   receiving party becomes unsure whether the previous utterance was
303	   well perceived and may prepare to repeat it.  This is similar to the
304	   effect in voice communication, where the usability limit is a delay
305	   of 400 ms.

307	   Another important factor in a multi-party conference is the
308	   opportunity for a participant using real-time text to provide timely
309	   comments and get a chance to enter the discussion if the majority of
310	   participants use voice in the conference.  A complicating factor when
311	   stating the requirements is that some transport methods do not cause
312	   a total delay, but instead an increasing jerkiness when the number of
313	   simultaneously sending participants is increased.

315	   It should however be remembered that the expected number of
316	   participants sending real-time text simultaneously is low.  Just as
317	   with voice or sign language, the capability of the participants to
318	   perceive utterances from more than one participant at a time is very
319	   limited.  Therefore the normal case in multi-party situations is that
320	   one participant at a time is the main provider of text.  Others might
321	   usually just provide very brief comments such as "yes" or "no" or
322	   "may I comment?".  Only in very rare situations do two participants
323	   provide more information simultaneously.

325	   *  The number of expected simultaneously transmitting users is
326	      different for different applications.  In all cases, just one
327	      transmitting user is the normal case.  Two simultaneously
328	      transmitting participants can occasionally be expected in
329	      emergency services, relay services, small unmanaged conferences
330	      and group calls and large managed conferences.  Three
331	      simultaneously transmitting participants may appear occasionally
332	      in large unmanaged conferences.  The following can therefore
333	      express the performance requirement.

335	   *  The mean delay of text passing the mixer introduced when only one
336	      participant is sending text should be kept to a minimum and should
337	      not be more than 400 ms.

339	   *  The mean delay of text passing the mixer should not be more than 1
340	      second during moments when up to three users are sending text
341	      simultaneously.

343	   *  For the very rare case that more than three participants send text
344	      simultaneously, the mixer may take action to limit the introduced
345	      delay of the text passing the mixer to 7 seconds e.g. by
346	      discarding text from some participants and instead inserting a
347	      general warning about possible text loss in the stream.
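
   The delay bounds in the list above can be collected into a small
   lookup.  This is only an illustrative sketch: the function names are
   hypothetical, and treating the 7 second figure as a bound for more
   than three senders is an interpretation of the last list item.

```python
def mixer_delay_limit_ms(simultaneous_senders: int) -> int:
    """Return the mixer delay bound in milliseconds for a sender count.

    Assumption: the bounds are taken directly from the list above
    (400 ms, 1 second, 7 seconds).
    """
    if simultaneous_senders < 1:
        raise ValueError("at least one sender is required")
    if simultaneous_senders == 1:
        return 400      # single sender: mean delay at most 400 ms
    if simultaneous_senders <= 3:
        return 1000     # up to three senders: mean delay at most 1 s
    return 7000         # very rare case: mixer may limit delay to 7 s


def meets_requirement(mean_delay_ms: float, simultaneous_senders: int) -> bool:
    """Check a measured mean delay against the bound for that sender count."""
    return mean_delay_ms <= mixer_delay_limit_ms(simultaneous_senders)
```

   For example, meets_requirement(250, 2) is true, while
   meets_requirement(900, 1) is false.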

349	4.  RTP based solutions

351	4.1.  Coordination of text RTP streams

353	   Coordinating and sending text RTP streams in the multi-party session
354	   can be done in a number of ways.  The most suitable methods are
355	   specified here with pros and cons.

357	   A receiving and presenting endpoint MUST separate text from the
358	   different sources and identify and display them accordingly.

360	4.1.1.  RTP-based solutions with a central mixer

362	   A set of solutions can be based on the central RTP mixer.  They are
363	   described here and a preferred method selected.

365	4.1.1.1.  RTP Mixer using default RFC 4103 methods

367	   Without any extra specifications, a mixer would transmit with 300
368	   milliseconds intervals, and use RFC 4103 [RFC4103] with the default
369	   redundancy of one original and two redundant transmissions.  The
370	   source of the text would be indicated by a single member in the CSRC
371	   list.  Text from different sources cannot be transmitted in the same
372	   packet.  Therefore, from the time when the mixer sent one piece of
373	   new text from one source, it will need to transmit that text again
374	   twice as redundant data, before it can send text from another source.
375	   The jerkiness, that is, the time between transmissions of new text,
376	   is 900 ms.  This is clearly insufficient.
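
   The 900 ms figure follows from one primary and two redundant
   transmissions at 300 ms intervals.  A minimal sketch of that
   arithmetic, with a hypothetical function name:

```python
def jerkiness_ms(interval_ms: int, redundant_generations: int = 2) -> int:
    """Time before the mixer can switch to text from another source.

    The mixer sends the text once as primary data and then once per
    redundant generation, one packet per transmission interval.
    """
    return interval_ms * (1 + redundant_generations)
```

   With the default 300 ms interval this gives jerkiness_ms(300) == 900.
   A 100 ms interval gives 300 ms.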

378	   Pros:

380	   Only a capability negotiation method is needed.  No other updates of
381	   standards are needed, just a general remark that traditional RTP-
382	   mixing is used.

384	   Cons:

386	   Clearly insufficient mixer switching performance.

388	   Somewhat complex handling of transmission when there is new text
389	   available from more than one source.  The mixer needs to send two
390	   more packets with redundant text from the current source before
391	   starting to send anything from the other source.

393	4.1.1.2.  RTP Mixer using the default method but decreased transmission
394	          interval

396	   This method makes use of the default RTP-mixing method briefly
397	   described in Section 4.1.1.1.  The only difference is that the
398	   transmission interval is decreased to 100 milliseconds when there is
399	   text from more than one source available for transmission.  The
400	   jerkiness is 300 ms.  The mean delay with two simultaneously sending
401	   participants is 250 ms, and with three simultaneously sending
402	   participants 500 ms.  This is acceptable performance.

404	   Pros:

406	   Minor influence on standards

408	   Can be introduced relatively rapidly in the intended technical
409	   environments.

411	   Can be declared in SDP as the already existing "text/red" format with
412	   a multi-party attribute for capability negotiation.

414	   Cons:

416	   The introduced jerkiness of new text from more than the required
417	   three simultaneously sending sources is high.

419	   Slightly higher risk for loss of text at bursty packet loss than for
420	   the recommended transmission interval (300 ms) for RFC 4103.

422	   When complete loss of packets occurs (beyond recovery), it is not
423	   possible to deduce from which source text was lost.

425	   Somewhat complex handling of transmission when there is new text
426	   available from more than one source.  The mixer needs to send two
427	   more packets with redundant text from the current source before
428	   starting to send anything from the other source.

430	4.1.1.3.  RTP Mixer with frequent transmission and indicating sources in
431	          CSRC-list

433	   An RTP media mixer combines text from participants into one RTP
434	   stream, thus all using the same destination address/port combination,
435	   the same RTP SSRC, and one sequence number series as described in
436	   Sections 7.1 and 7.3 of RFC 3550 [RFC3550] about the Mixer
437	   function.  This method is also briefly described in RFC 7667, section
438	   3.6.1 Media mixing mixer [RFC7667].

440	   The sources of the text in each RTP packet are identified by the CSRC
441	   list in the RTP packets, containing the SSRC of the initial sources
442	   of text.  The order of the CSRC parameters is with the SSRC of the
443	   source of the primary text first, followed by the SSRC of the first
444	   level redundancy, and then the second level redundancy.
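
   The CSRC ordering described above can be sketched as follows.  The
   function name and the "source_history" input are hypothetical, and
   keeping repeated SSRCs in the list (so that position always maps to a
   redundancy level) is an assumption, not something stated in the text.

```python
def build_csrc_list(source_history, redundancy_levels=2):
    """Build the ordered CSRC list for the next outgoing packet.

    source_history: SSRCs of the sources of the primary text in
    successive packets, oldest first, ending with the source of the
    primary text in the packet being built.
    Returns: SSRC of primary first, then first-level redundancy,
    then second-level redundancy.
    """
    if not source_history:
        raise ValueError("source_history must contain the primary source")
    # Keep the current source plus one earlier source per redundancy level.
    recent = source_history[-(redundancy_levels + 1):]
    # Newest first: primary, first-level, second-level.
    return list(reversed(recent))
```

   For example, build_csrc_list([0xA, 0xB, 0xC]) returns
   [0xC, 0xB, 0xA].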

446	   The transmission interval should be 100 milliseconds when there is
447	   text to transmit from more than one source, and otherwise 300 ms.

449	   The identification of the sources is made through the CSRC fields and
450	   can be made more readable at the receiver through the RTCP SDES CNAME
451	   and NAME packets as described in RTP[RFC3550].

453	   Information provided through the notification according to RFC 4575
454	   [RFC4575] when the participant joined the conference also provides
455	   suitable information and a reference to the SSRC.

457	   A receiving endpoint is supposed to separate text items from the
458	   different sources and identify and display them accordingly.

460	   The ordered CSRC lists in the RFC 4103 [RFC4103] packets make it
461	   possible to recover from loss of one and two packets in sequence and
462	   assign the recovered text to the right source.  For more loss, a
463	   marker for possible loss should be inserted or presented.

465	   The conference server needs to have authority to decrypt the payload
466	   in the received RTP packets in order to be able to recover text from
467	   redundant data or insert the missing text marker in the stream, and
468	   repack the text in new packets.

470	   Even if the format is very similar to "text/red" of RFC 4103, it
471	   needs to be declared as a new media subtype, e.g. "text/rex".

473	   Pros:

475	   This method has low overhead and less complexity than the methods in
476	   Section 4.1.1.1, Section 4.1.1.2, Section 4.1.1.4 and
477	   Section 4.1.1.6.

479	   When loss of packets occurs, it is possible to recover text from
480	   redundancy at loss of up to the number of redundancy levels carried
481	   in the RFC 4103 [RFC4103] stream (normally primary and two redundant
482	   levels).

484	   This method can be implemented with most RTP implementations.

486	   The source switching performance is sufficient for well-behaving
487	   conference participants.  The jerkiness is 100 ms.

489	   Cons:

491	   When more consecutive packets are lost than the number of generations
492	   of redundant data, it is not possible to deduce the sources of
493	   the totally lost data.

495	   Slightly higher risk for loss of text at bursty packet loss than for
496	   the recommended transmission interval for RFC 4103.

498	   Requires a different sub media format, e.g. "text/rex".  This takes a
499	   long time in standardisation and releases of target technical
500	   environments.

502	   The conference server needs to be allowed to decrypt/encrypt the
503	   packet payload.  This is however normal for media mixers for other
504	   media.

506	4.1.1.4.  RTP Mixer using timestamp to identify redundancy

508	   This method has text only from one source per packet, as the original
509	   RFC 4103 [RFC4103] specifies.  Packets with text from different
510	   sources are instead allowed to be merged.  The recovery procedure in
511	   the receiver will use the RTP timestamp and timestamp offsets in the
512	   redundancy headers to evaluate if a piece of redundant data should be
513	   recovered or not in case of packet loss.
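
   One possible reading of the timestamp-based recovery is sketched
   below.  It is not the procedure of any published specification.  The
   "Block" type and the function name are hypothetical.  The sketch
   assumes that the receiver keeps the timestamp of the last presented
   text per source, and that timestamp wrap-around can be ignored.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Block:
    timestamp_offset: int  # offset from the RTP timestamp; 0 for primary
    text: str


def blocks_to_present(packet_timestamp: int,
                      blocks: List[Block],
                      last_presented_ts: Optional[int]
                      ) -> Tuple[List[str], Optional[int]]:
    """Select the blocks of one packet that carry text not yet presented.

    blocks are ordered oldest redundant generation first, primary last.
    last_presented_ts is the state kept for this packet's source.
    """
    out = []
    newest = last_presented_ts
    for block in blocks:
        original_ts = packet_timestamp - block.timestamp_offset
        # A block is new only if it was first sent after the last
        # text already presented for this source.
        if newest is None or original_ts > newest:
            if block.text:
                out.append(block.text)
            newest = original_ts
    return out, newest
```

   If the previous packet from the source was lost, the first-level
   redundant block has a timestamp later than the stored one and is
   therefore recovered together with the primary text.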

515	   In this method, the transmission interval is 100 milliseconds when
516	   text from more than one source is available for transmission.

518	   Pros:

520	   The format of each packet is equal to what is specified in RFC 4103
521	   [RFC4103].

523	   The source switching performance is sufficient.  Text from five
524	   participants can be transmitted simultaneously with 500 milliseconds
525	   interval per source.

527	   New text from five simultaneous sources can be transmitted within 500
528	   milliseconds.  This is sufficient.

530	   Cons:

532	   The recovery time in case of packet loss is long.  With five
533	   simultaneously sending participants, it will be 1.5 seconds.

535	   The recovery procedure is complex and very different from what is
536	   described in RFC 4103 [RFC4103].

538	   It is not certain that this change can be regarded as an update to
539	   RFC 4103.  It may need a new media subtype.

541	4.1.1.5.  RTP Mixer with multiple primary data in each packet and
542	          individual sequence numbers

544	   This method allows primary as well as redundant text from more than
545	   one source per packet.  The packet payload contains an ordered set of
546	   redundant and primary data with the same number of generations of
547	   redundancy as once agreed in the SDP negotiation.  The data header
548	   reflects these parts of the payload.  The CSRC list contains one CSRC
549	   member per source in the payload and in the same order.  An
550	   individual sequence number per source is included in the data header
551	   replacing the t140 payload type number that is instead assumed to be
552	   constant in this format.  This allows an individual extra sequence
553	   number per source with maximum value 127, suitable for checking for
554	   which source loss of text appeared when recovery was not possible.

556	   The data header would contain the following fields:
557	     0                   1                    2                   3
558	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
559	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
560	   |F| Source-seq  |  timestamp offset         |   block length    |
561	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
562	   Where "Source-seq" is the sequence number per source.
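
   As an illustration, the 32-bit data header in the figure can be
   packed and parsed as shown below.  The field widths are read from
   the figure: 1 bit for F, 7 bits for Source-seq, 14 bits for the
   timestamp offset and 10 bits for the block length.  The function
   names are hypothetical.

```python
import struct


def pack_data_header(f: int, source_seq: int,
                     timestamp_offset: int, block_length: int) -> bytes:
    """Pack one data header into four bytes in network byte order."""
    if not 0 <= source_seq <= 0x7F:
        raise ValueError("Source-seq must fit in 7 bits")
    if not 0 <= timestamp_offset <= 0x3FFF:
        raise ValueError("timestamp offset must fit in 14 bits")
    if not 0 <= block_length <= 0x3FF:
        raise ValueError("block length must fit in 10 bits")
    word = ((1 if f else 0) << 31) | (source_seq << 24) \
        | (timestamp_offset << 10) | block_length
    return struct.pack("!I", word)


def unpack_data_header(data: bytes):
    """Return (F, Source-seq, timestamp offset, block length)."""
    (word,) = struct.unpack("!I", data[:4])
    return ((word >> 31) & 0x1, (word >> 24) & 0x7F,
            (word >> 10) & 0x3FFF, word & 0x3FF)
```

   For example, pack_data_header(1, 5, 300, 12) yields the bytes
   85 04 b0 0c in hexadecimal, and unpack_data_header() of those bytes
   returns (1, 5, 300, 12).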

564	   The maximum number of members in the CSRC-list is 15, and that is
565	   therefore the maximum number of sources that can be represented in
566	   each packet provided that all data can be fitted into the size
567	   allowable in one packet.

569	   Transmission is done as soon as there is new text available, but not
570	   with shorter interval than 150 ms and not longer than 300 ms while
571	   there is anything to send.
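
   The timing rule above can be sketched as a scheduling function.  The
   names are hypothetical.  Treating "anything to send" as either new
   text or pending redundant data is an assumption made for this sketch.

```python
MIN_INTERVAL_MS = 150
MAX_INTERVAL_MS = 300


def next_send_time_ms(last_sent_ms, new_text_at_ms, redundancy_pending):
    """Return when the next packet should be sent, or None if idle.

    last_sent_ms: time the previous packet was sent.
    new_text_at_ms: time new text became available, or None.
    redundancy_pending: True if redundant data still has to be sent.
    """
    earliest = last_sent_ms + MIN_INTERVAL_MS
    latest = last_sent_ms + MAX_INTERVAL_MS
    if new_text_at_ms is not None:
        # Send as soon as text is available, but not before 150 ms.
        send_at = max(new_text_at_ms, earliest)
        if redundancy_pending:
            # Pending redundancy forces a packet no later than 300 ms.
            send_at = min(send_at, latest)
        return send_at
    if redundancy_pending:
        return latest
    return None
```

   With the previous packet sent at 1000 ms and new text arriving at
   1050 ms, the next packet is sent at 1150 ms.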

573	   A new media subtype is needed, e.g. "text/rex".

575	   This is an SDP offer example for both traditional "text/red"
576	   and multi-party "text/rex" format:

578	         m=text 11000 RTP/AVP 101 100 98
579	         a=rtpmap:98 t140/1000
580	         a=rtpmap:100 red/1000
581	         a=rtpmap:101 rex/1000
582	         a=fmtp:100 98/98/98
583	         a=fmtp:101 98/98/98
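
	   A receiver of such an offer could inspect it as in the following
	   Python sketch.  It is a minimal illustration, not a complete SDP
	   parser, and the function names are hypothetical.  The "rex"
	   subtype is the example name used above.

```python
OFFER = """\
m=text 11000 RTP/AVP 101 100 98
a=rtpmap:98 t140/1000
a=rtpmap:100 red/1000
a=rtpmap:101 rex/1000
a=fmtp:100 98/98/98
a=fmtp:101 98/98/98
"""


def parse_formats(sdp):
    """Return {payload type: {"name": ..., "fmtp": ...}} for one media section."""
    formats = {}
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("a=rtpmap:"):
            pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
            formats.setdefault(int(pt), {})["name"] = encoding.split("/")[0].lower()
        elif line.startswith("a=fmtp:"):
            pt, params = line[len("a=fmtp:"):].split(None, 1)
            formats.setdefault(int(pt), {})["fmtp"] = params
    return formats


def redundant_generations(fmtp):
    """'98/98/98' lists the primary plus two redundant generations."""
    return len(fmtp.split("/")) - 1


def choose_payload_type(formats, supports_rex):
    """Prefer the multi-party format when the local side supports it."""
    wanted = ["rex", "red"] if supports_rex else ["red"]
    for name in wanted:
        for pt, info in formats.items():
            if info.get("name") == name:
                return pt
    return None
```

	   An endpoint without multi-party support would select payload type
	   100 from this offer and continue with traditional "text/red".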

585	   Pros:

587	   The source switching performance is good.  Text from 15 participants
588	   can be transmitted simultaneously.

590	   New text from 15 simultaneous sources can be transmitted within 300
591	   milliseconds.  This is good performance.

593	   When more consecutive packets are lost than the number of generations
594	   of redundant data, it is still possible to deduce the sources of the
595	   totally lost data when the next text from these sources arrives.

597	   Cons:

599	   The format of each packet is different from what is specified in RFC
600	   4103 [RFC4103].

602	   The processing time in standards organisations will be long.

604	   A new media subtype is needed, making negotiation somewhat complex.

606	   The recovery procedure is somewhat complex.

608	4.1.1.6.  RTP Mixer with multiple primary data in each packet

610	   This method allows primary as well as redundant text from more than
611	   one source per packet.  The packet payload contains an ordered set of
612	   redundant and primary data with the same number of generations of
613	   redundancy as once agreed in the SDP negotiation.  The data header
614	   reflects these parts of the payload.  The CSRC list contains one CSRC
615	   member per source in the payload and in the same order.

617	   The maximum number of members in the CSRC-list is 15, and that is
618	   therefore the maximum number of sources that can be represented in
619	   each packet provided that all data can be fitted into the size
620	   allowable in one packet.

622	   Transmission is done as soon as new text is available, but at
623	   intervals of no less than 150 ms and no more than 300 ms while there
624	   is anything to send.

626	   A new media subtype is needed, e.g. "text/rex".

628	   SDP would be the same as in Section 4.1.1.5.

630	   Pros:

632	   The source switching performance is good.  Text from 15 participants
633	   can be transmitted simultaneously.

635	   New text from 15 simultaneous sources can be transmitted within 150
636	   milliseconds.  This is good performance.

638	   Cons:

640	   The format of each packet is different from what is specified in RFC
641	   4103 [RFC4103].

643	   A new media subtype is needed, making negotiation somewhat complex.

647	   The processing time in standards organisations will be long.

649	   The recovery procedure is somewhat complex [RFC4103].

651	   When more consecutive packets are lost than the number of generations
652	   of redundant data, it is not possible to deduce the sources of the
653	   totally lost data.

655	4.1.1.7.  RTP Mixer with RFC 5109 FEC and RFC 2198 redundancy in the
656	          packets

658	   This method allows primary data from one source and redundant text
659	   from other sources in each packet.  The packet payload contains
660	   primary data in "text/t140" format, and redundant data in RFC 5109
661	   FEC [RFC5109] format called "text/ulpfec".  That means that the
662	   redundant data contains the sequence number and the CSRC and other
663	   characteristics from the RTP header when the data was sent as
664	   primary.  The redundancy can be sent at a selected number of packets
665	   after when it was sent as primary, in order to improve the protection
666	   against bursty packet loss.  The redundancy level is recommended to
667	   be the same as in original RFC 4103.

669	   RFC 4103 says that the protection against loss can be made by other
670	   methods than plain redundancy, so this method is in line with that
671	   statement.

673	   Transmission is done as soon as new text is available, but at
674	   intervals of no less than 100 ms and no more than 300 ms while there
675	   is anything to send (new or redundant text).

677	   When more consecutive packets are lost than the number of generations
678	   of redundant data, it is not possible to deduce the sources of the
679	   totally lost data.

681	   The SDP can indicate the format as "text/red" with "text/ulpfec"
682	   redundant data in the following way, with traditional RFC 4103
683	   "text/red" with "text/t140" redundant data as a fallback:

685	   m=text 49170 RTP/AVP 98 101 100 102
686	   a=rtpmap:98 red/1000
687	   a=fmtp:98 100/102/102
688	   a=rtpmap:102 ulpfec/1000
689	   a=rtpmap:100 t140/1000
690	   a=rtpmap:101 red/1000
691	   a=fmtp:101 100/100/100
692	   a=fmtp:100 cps=200

694	   The "text/ulpfec" format includes an indication of how far back the
695	   redundancy belongs, making it possible to cover bursty packet loss
696	   better than the other formats with short transmission intervals.  For
697	   real-time text, it is recommended to send three packets between the
698	   primary and the redundant transmissions of text.  That makes the
699	   transmission cover between 500 and 1500 ms of bursty packet loss.
700	   The variation is because of the varying packet interval between many
701	   and one simultaneously transmitting source.

703	   The "text/ulpfec" format has a number of parameters.  One is the
704	   length of the data to be protected which in this case must be the
705	   whole t140block.

707	   Pros:

709	   The source switching performance is good.  Text from 5 participants
710	   can be transmitted within 500 ms.

712	   Good recovery from bursty packet loss.

714	   The method is based on existing standards.  No new registrations are
715	   needed.

717	   Cons:

719	   When more consecutive packets are lost than the number of generations
720	   of redundant data, it is not possible to deduce the sources of the
721	   totally lost data.

723	   Even if the switching performance is good, it is not as good as for
724	   the method called "RTP Mixer with multiple primary data in each
725	   packet" (Section 4.1.1.6).  With more than 5 simultaneously sending
726	   sources, there will be a noticeable delay of text of over 500 ms,
727	   with 100 ms added per simultaneous source.  This is, however, beyond
728	   the requirements and would be a concern only in congestion
729	   situations.

731	   The recovery procedure is a bit complex [RFC5109].

733	   There is more overhead in terms of extra data and extra packets sent
734	   than in the other methods.  With the recommended two redundant
735	   generations of data, each packet will be 36 bytes longer than with
736	   traditional RFC 4103, and at each pause in transmission five extra
737	   packets with only redundant data will be sent compared to two extra
738	   packets for the traditional RFC 4103 case.

740	4.1.1.8.  RTP Mixer with RFC 5109 FEC and RFC 2198 redundancy and
741	          separate sequence number in the packets

743	   This method allows primary data from one source and redundant text
744	   from other sources in each packet.  The packet payload contains
745	   primary data in a new "text/t140e" format, and redundant data in RFC
746	   5109 FEC [RFC5109] format called "text/ulpfec".  That means that the
747	   redundant data contains the sequence number and the CSRC and other
748	   characteristics from the RTP header when the data was sent as
749	   primary.  The redundancy can be sent at a selected number of packets
750	   after when it was sent as primary, in order to improve the protection
751	   against bursty packet loss.  The redundancy level is recommended to
752	   be the same as in original RFC 4103.  The "text/t140e" format
753	   contains a source-specific sequence number and the t140block.

755	   RFC 4103 says that the protection against loss can be made by other
756	   methods than plain redundancy, so this method is in line with that
757	   statement.

759	   Transmission is done as soon as new text is available, but at
760	   intervals of no less than 100 ms and no more than 300 ms while there
761	   is anything to send (new or redundant text).

763	   When more consecutive packets are lost than the number of
764	   generations of redundant data, it is possible to deduce which
765	   sources lost data when new data arrives from the sources.  This is
766	   done by monitoring the received source-specific sequence numbers
767	   preceding the text.

769	   This is an example of how SDP can indicate the format as "text/red"
770	   with "text/t140e" as primary and "text/ulpfec" redundant data, with
771	   traditional RFC 4103 "text/red" with "text/t140" redundant data as
772	   a fallback:

774	   m=text 49170 RTP/AVP 98 101 100 102 103
775	   a=rtpmap:98 red/1000
776	   a=fmtp:98 100/102/102
777	   a=rtpmap:102 ulpfec/1000
778	   a=rtpmap:103 t140/1000
779	   a=rtpmap:100 t140e/1000
780	   a=rtpmap:101 red/1000
781	   a=fmtp:101 103/103/103
782	   a=fmtp:100 cps=200

784	   The "text/ulpfec" format includes an indication of how far back the
785	   redundancy belongs, making it possible to cover bursty packet loss
786	   better than the other formats with short transmission intervals.  For
787	   real-time text, it is recommended to send three packets between the
788	   primary and the redundant transmissions of text.  That makes the
789	   transmission cover between 500 and 1500 ms of bursty packet loss.
790	   The variation is because of the varying packet interval between many
791	   and one simultaneously transmitting source.

793	   The "text/ulpfec" format has a number of parameters.  One is the
794	   length of the data to be protected which in this case must be the
795	   whole t140block.

797	   Pros:

799	   The source switching performance is good.  Text from 5 participants
800	   can be transmitted within 500 ms.

802	   Good recovery from bursty packet loss.

804	   The method is based on an existing standard for FEC.

806	   When more consecutive packets are lost than the number of generations
807	   of redundant data, it is possible to deduce the source of the lost
808	   data when new text arrives from the source.

810	   Cons:

812	   Even if the switching performance is good, it is not as good as for
813	   the method called "RTP Mixer with multiple primary data in each
814	   packet" (Section 4.1.1.6).  With more than 5 simultaneously sending
815	   sources, there will be a noticeable delay of text of over 500 ms,
816	   with 100 ms added per simultaneous source.  This is, however, beyond
817	   the requirements and would be a concern only in congestion
818	   situations.

820	   The recovery procedure is a bit complex [RFC5109].

822	   There is more overhead in terms of extra data and extra packets sent
823	   than in the other methods.  With the recommended two redundant
824	   generations of data, each packet will be 40 bytes longer than with
825	   traditional RFC 4103, and at each pause in transmission five extra
826	   packets with only redundant data will be sent compared to two extra
827	   packets for the traditional RFC 4103 case.

829	   A new text media subtype "text/t140e" needs to be registered.

831	   The processing time in standards organisations will be long.

833	4.1.1.9.  RTP Mixer indicating participants by a control code in the
834	          stream

836	   Text from all participants except the receiving one is transmitted
837	   from the media mixer in the same RTP session and stream, thus all
838	   using the same destination address/port combination, the same RTP
839	   SSRC, and one sequence number series as described in Sections 7.1 and
840	   7.3 of RTP RFC 3550 [RFC3550] about the Mixer function.  The sources
841	   of the text in each RTP packet are identified by a newly defined
842	   T.140 control code "c" followed by a unique identification of the
843	   source in UTF-8 string format.

845	   The receiver can use the string for presenting the source of text.
846	   On the RTP level, this method is described in RFC 7667, section 3.6.1
847	   Media mixing mixer [RFC7667].

849	   The inline coding of the source of text is applied in the data stream
850	   itself, and an RTP mixer function is used for coordinating the
851	   sources of text into one RTP stream.

853	   Information uniquely identifying each user in the multi-party session
854	   is placed as the parameter value "n" in the T.140 application
855	   protocol function with the function code "c".  The identifier shall
856	   thus be formatted like this: SOS c n ST, where SOS and ST are coded
857	   as specified in ITU-T T.140 [T140].  The "c" is the letter "c".  The
858	   n parameter value is a string uniquely identifying the source.  This
859	   parameter shall be kept short so that it can be repeated in the
860	   transmission without concerns for network load.
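
	   The proposed identifier could be produced and consumed as in the
	   following Python sketch.  It assumes that SOS and ST are the
	   ISO/IEC 6429 control characters U+0098 and U+009C, which should
	   be verified against T.140 [T140].  The function names are
	   hypothetical, and other SOS functions are not handled.

```python
SOS = "\u0098"  # Start Of String (assumed code point)
ST = "\u009c"   # String Terminator (assumed code point)


def source_marker(label):
    """Build the sequence 'SOS c n ST' for the source label n."""
    return f"{SOS}c{label}{ST}"


def split_by_source(stream):
    """Split received text into (source label, text) segments."""
    segments = []
    source = None  # unknown until the first marker is seen
    pos = 0
    while pos < len(stream):
        start = stream.find(SOS + "c", pos)
        end = stream.find(ST, start) if start != -1 else -1
        if start == -1 or end == -1:
            # No further complete marker: the rest belongs to the current source.
            segments.append((source, stream[pos:]))
            break
        if start > pos:
            segments.append((source, stream[pos:start]))
        source = stream[start + 2:end]
        pos = end + 1
    return segments
```

	   With a short label, the marker adds only a few bytes each time it
	   is repeated; for the label "Alice" it is 10 bytes in UTF-8.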

862	   A receiving endpoint is supposed to separate text items from the
863	   different sources and identify and display them accordingly.

865	   The conference server needs to be allowed to decrypt/encrypt the
866	   packet payload in order to check the source and repack the text.

868	   Pros:

870	   If packets are lost, it is possible to recover text from
871	   redundancy at loss of up to the number of redundancy levels carried
872	   in the RFC 4103 [RFC4103] stream (normally primary and two redundant
873	   levels).

875	   This method can be implemented with most RTP implementations.

877	   The method can also be used with other transports than RTP.

879	   Cons:

881	   The method implies a moderate load by the need to insert the source
882	   often in the stream.

884	   If more consecutive packets are lost than the number of generations
885	   of redundant data, it is not possible to deduce the source of the
886	   totally lost data.

888	   The mixer needs to be able to generate unique source
889	   identifications that are suitable as labels for the sources.

891	   Requires an extension of the ITU-T T.140 standard, best made by the
892	   ITU.

894	   There is a risk that the control code indicating the change of source
895	   is lost, resulting in a false source indication for the text.

897	   The conference server needs to be allowed to decrypt/encrypt the
898	   packet payload.

900	4.1.1.10.  Mixing for multi-party unaware user agents

902	   Multi-party real-time text contents can be transmitted to multi-party
903	   unaware user agents if source labelling and formatting of the text is
904	   performed by a mixer.  This method has the limitations that the
905	   layout of the presentation and the format of source identification
906	   are purely controlled by the mixer, and that only one source at a
907	   time is allowed to present in real time.  Text from other sources
908	   needs to be stored temporarily, waiting for an appropriate moment to
909	   switch the source of transmitted text.  The mixer controls the
910	   switching of sources and inserts a source identifier in text format
911	   at the beginning of text after a switch of source.  The logic of the
912	   mixer to detect when a switch is appropriate should detect a number
913	   of places in text where a switch can be allowed, including new line,
914	   end of sentence, end of phrase, a period of inactivity, and a word
915	   separator after a long time of active transmission.
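
	   The switching conditions listed above could be tested as in the
	   following Python sketch.  The thresholds for inactivity and for
	   "a long time" are not given in this document; the values used
	   here, like the function name, are hypothetical placeholders.

```python
NEW_LINE = ("\n", "\u2028")
SENTENCE_OR_PHRASE_END = ".!?,;:"


def switch_allowed(sent_text, idle_ms, active_ms,
                   idle_limit_ms=5000, long_active_ms=20000):
    """Decide whether the mixer may switch away from the current source.

    sent_text: text already forwarded from the current source.
    idle_ms:   time since the current source last produced text.
    active_ms: time the current source has been transmitting.
    """
    if idle_ms >= idle_limit_ms:
        return True  # a period of inactivity
    if not sent_text:
        return False
    last = sent_text[-1]
    if last in NEW_LINE or last in SENTENCE_OR_PHRASE_END:
        return True  # new line, end of sentence or end of phrase
    # A word separator is accepted only after long active transmission.
    return last == " " and active_ms >= long_active_ms
```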

917	   This method MAY be used when no support for multi-party awareness is
918	   detected in the receiving endpoint.  The basis for this method is
919	   described in RFC 7667, section 3.6.1 Media mixing mixer [RFC7667].

921	   See [I-D.ietf-avtcore-multi-party-rtt-mix] for a procedure for mixing
922	   RTT for a conference-unaware endpoint.

924	   Pros:

926	   Can be transmitted to conference-unaware endpoints.

928	   Can be used with other transports than RTP.

930	   Cons:

932	   Does not allow full real-time presentation of more than one source at
933	   a time.  Text from other sources will be delayed.

935	   The only realistic presentation format is a style with the text from
936	   the different sources presented with a text label indicating source,
937	   and the text collected in a chat style presentation but with more
938	   frequent turn-taking.

940	   Endpoints often have their own system for adding labels to the RTT
941	   presentation.  In that case there will be two levels of labels in the
942	   presentation, one for the mixer and one for the sources.

944	   If loss of more packets than can be recovered by the redundancy
945	   appears, it is not possible to detect which source was struck by the
946	   loss.  It is also possible that a source switch occurred during the
947	   loss, and therefore a false indication of the source of text can be
948	   provided to the user after such loss.

950	   Because of all these cons, this method is not recommended as the main
951	   method; it should be used only as a fallback and last resort for
952	   backwards interoperability with multi-party unaware endpoints.

954	   The conference server needs to be allowed to decrypt/encrypt the
955	   packet payload.

957	4.1.2.  RTP-based bridging with minor RTT media contents reformatting by
958	        the bridge

960	   It may be desirable to send text in a multi-party setting in a way
961	   that allows the text stream contents to be distributed without being
962	   dealt with in detail in any central server.  A number of such methods
963	   are described.  However, at the time of writing, none of these
964	   methods has a specified way of establishing the session by
965	   SDP.

967	4.1.2.1.  RTP Translator sending one RTT stream per participant

969	   Within the RTP session, text from each participant is transmitted
970	   from the RTP media translator (bridge) in a separate RTP stream, thus
971	   using the same destination address/port combination, the same payload
972	   type number (PT) but separate RTP SSRC parameters and sequence number
973	   series as described in Sections 7.1 and 7.2 of RTP RFC 3550 [RFC3550]
974	   about the Translator function.  The source of the text in each RTP
975	   packet is identified by the SSRC parameter in the RTP packets,
976	   containing the SSRC of the initial source of text.

978	   A receiving and presenting endpoint is supposed to separate text
979	   items from the different sources and identify and display them in a
980	   suitable way.

982	   This method is described in RFC 7667, section 3.5.1 Relay-transport
983	   translator or 3.5.2 Media translator [RFC7667].

985	   The identification of the source is made through the SSRC.  The
986	   translation to a readable label can be done by mapping to information
987	   from the RTCP SDES CNAME and NAME packets as described in
988	   RTP [RFC3550], and also through information in the text media member
989	   in the conference notification described in RFC 4575 [RFC4575].
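
	   A receiver could keep text and labels per SSRC as in the
	   following Python sketch.  The class and method names are
	   hypothetical; how the NAME and CNAME values are obtained from
	   RTCP or from the conference notification is outside the sketch.

```python
class SourceDirectory:
    """Keeps one text buffer and one readable label per SSRC."""

    def __init__(self):
        self.labels = {}
        self.text = {}

    def announce(self, ssrc, name=None, cname=None):
        """Register a source; prefer NAME, then CNAME, then the SSRC itself."""
        self.labels[ssrc] = name or cname or f"SSRC {ssrc:#010x}"

    def deliver(self, ssrc, t140_text):
        """Append text to the buffer of its source and return the label."""
        if ssrc not in self.labels:
            self.announce(ssrc)  # source seen before any announcement
        self.text[ssrc] = self.text.get(ssrc, "") + t140_text
        return self.labels[ssrc]
```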

991	   The SDP exchange for establishing this mixing type can be equal to
992	   what is used for basic two-party use of RFC 4103 with just an added
993	   attribute for indicating multi-party capability.

995	   m=text 49170 RTP/AVP 98 103
996	   a=rtpmap:98 red/1000
997	   a=fmtp:98 103/103/103
998	   a=rtpmap:103 t140/1000
999	   a=fmtp:103 cps=150
1000	   a=RTT-mixing:RTP-translator

1002	   A similar answer including the same RTT-mixing attribute would
1003	   indicate that multi-party coding can begin.  An answer without the
1004	   same RTT-mixing attribute could result in a fallback to the mixing
1005	   method for multi-party unaware endpoints (Section 4.1.1.10) if more
1006	   than two parties are involved in the session.
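
	   The decision described above could be sketched in Python as
	   follows.  The attribute name is taken from the example above;
	   the function names and the returned labels are hypothetical.

```python
ATTRIBUTE = "a=RTT-mixing:"


def rtt_mixing_value(sdp):
    """Return the value of the RTT-mixing attribute, or None if absent."""
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith(ATTRIBUTE):
            return line[len(ATTRIBUTE):]
    return None


def select_rtt_method(offer, answer, parties):
    """Choose how text is conveyed after the offer/answer exchange."""
    offered = rtt_mixing_value(offer)
    if offered is not None and rtt_mixing_value(answer) == offered:
        return offered  # both sides declared the same multi-party method
    if parties > 2:
        return "multi-party-unaware-mixing"  # see Section 4.1.1.10
    return "two-party"
```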

1008	   The bridge can add new sources in the communication to a participant
1009	   by first sending a conference notification according to RFC 4575
1010	   [RFC4575] with the SSRC of the new source included in the
1011	   corresponding "text" media member, or by sending an RTCP message with
1012	   the new SSRC in an SDES packet.

1014	   A receiver should be prepared to receive such indications of new
1015	   streams being added to the multi-party session, so that the new SSRC
1016	   is not taken for a change in SSRC value for an already established
1017	   RTP stream.

1019	   Transmission, reception, packet loss recovery and text loss
1020	   indication are performed per source in the separate RTP streams in
1021	   the same way as in two-party sessions with RFC 4103 [RFC4103].

1023	   Text is recommended to be sent by the bridge as soon as it is
1024	   available for transmission, but not less than 250 ms after a previous
1025	   transmission.  This will in many cases result in close to 0 added
1026	   delay by the bridge, because most RTT senders use a 300 ms
1027	   transmission interval.

1029	   It is sometimes said that this configuration is not supported by
1030	   current media declarations in SDP.  RFC 3264 [RFC3264] specifies in
1031	   some places that one media description is supposed to describe just
1032	   one RTP media stream.  However, this is not directly referencing an
1033	   RTP stream, and use of multiple RTP streams in the same RTP session
1034	   is recommended in many other RFCs.

1036	   This confusion is clarified in RFC 5576 [RFC5576] section 3 by the
1037	   following statements:

1039	   "The term "media stream" does not appear in the SDP specification
1040	   itself, but is used by a number of SDP extensions, for instance,
1041	   Interactive Connectivity Establishment (ICE) [ICE], to denote the
1042	   object described by an SDP media description.  This term is
1043	   unfortunately rather confusing, as the RTP specification [RFC3550]
1044	   uses the term "media stream" to refer to an individual media source
1045	   or RTP packet stream, identified by an SSRC, whereas an SDP media
1046	   stream describes an entire RTP session, which can contain any number
1047	   of RTP sources."

1049	   In most cases, it will be sufficient that new sources are introduced
1050	   with a conference notification or RTCP message.  However, RFC 5576
1051	   [RFC5576] specifies attributes which may be used to more explicitly
1052	   announce new sources or restart of earlier established RTP streams.

1054	   This method is encouraged by draft-ietf-avtcore-multiplex-guidelines
1055	   [I-D.ietf-avtcore-multiplex-guidelines] section 5.2.

1057	   Normal operation will be that the bridge receives text packets from
1058	   the source and handles any text recovery and indication of loss
1059	   needed before queueing the resulting clean text for transmission from
1060	   the bridge to the receivers.

1062	   It may however also be possible for the bridge to just convey the
1063	   packet contents as received from the sources, with minor adjustments,
1064	   and let the receiving endpoint handle all aspects of recovery and
1065	   indication of loss, even for the source to bridge path.  In that case
1066	   also the sequence number must be maintained as it was at reception in
1067	   the bridge.  This mode needs further study before application.

1069	   Pros:

1071	   This method is the natural way to do multi-party bridging with RFC
1072	   4103 based RTT.  Only a small addition is included in the session
1073	   establishment to verify capability by the parties because many
1074	   implementations are done without multi-party capability.

1076	   This method has moderate overhead in terms of work for the mixer, but
1077	   high in terms of packet transmission rate.  Five sources sending
1078	   simultaneously cause the bridge to send 15 packets per second to each
1079	   receiver.

1081	   When packets are lost, it is possible to recover text from
1082	   redundancy at loss of up to the number of redundancy levels carried
1083	   in the RFC 4103 [RFC4103] stream (normally primary and two redundant
1084	   levels).

1086	   More loss than can be recovered can be detected, and the marker for
1087	   text loss can be inserted in the correct stream.

1089	   It may be possible in some scenarios to keep the text encrypted
1090	   through the Translator.

1092	   Minimal delay.  The delay can often be kept close to 0 with at least
1093	   5 simultaneous sending participants.

1095	   Cons:

1097	   There are RTP implementations not supporting the Translator model.
1098	   They will need to use the fall-back to multi-party-unaware mixing.
1099	   An investigation of how common this is will be needed before the
1100	   method is used.

1102	   The processing time in standards organisations will be long.

1104	   With many simultaneous sending sources, the total rate of packets
1105	   will be high, and can cause congestion.  The requirement to handle 3
1106	   simultaneous sources in this specification will cause 10 packets per
1107	   second, which is manageable in most cases, e.g., considering that
1108	   audio usually uses 50 packets per second.
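
	   The packet rate figure can be reproduced with the arithmetic
	   below, written as a Python sketch.  It assumes that each source
	   is forwarded at the 300 ms interval mentioned earlier in this
	   section; the function name is hypothetical.

```python
AUDIO_PACKETS_PER_SECOND = 50  # typical figure quoted in the text above


def packets_per_second(sources, interval_ms=300):
    """Packets sent by the bridge to one receiver, one stream per source."""
    return sources * 1000 / interval_ms


rate = packets_per_second(3)
share_of_audio = rate / AUDIO_PACKETS_PER_SECOND
```

	   For three sources this gives 10 packets per second, one fifth of
	   the audio packet rate used for comparison.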

1110	4.1.2.2.  Distributing packets in an end-to-end encryption structure

1112	   In order to achieve end-to-end encryption, it is possible to let the
1113	   packets from the sources just pass through a central distributor, and
1114	   handle the security agreements between the participants.
1115	   Specifications exist for a framework with this functionality for
1116	   application on RTP based conferences in
1117	   [I-D.ietf-perc-private-media-framework].  The RTP flow and mixing
1118	   characteristics have similarities with the method described under
1119	   "RTP Translator sending one RTT stream per participant" above.  RFC
1120	   4103 RTP streams [RFC4103] would fit into the structure and it would
1121	   provide a base for end-to-end encrypted RTT multi-party conferencing.

1123	   Pros:

1125	   Good security.

1127	   Straightforward multi-party handling.

1129	   Cons:

1131	   Does not operate under the usual SIP central conferencing
1132	   architecture.

1134	   Requires the participants to perform a lot of key handling.

1136	   Is work in progress when this is written.

1138	4.1.2.3.  Mesh of RTP endpoints

1140	   Text from all participants is transmitted directly to all others in
1141	   one RTP session, without a central bridge.  The sources of the text
1142	   in each RTP packet are identified by the source network address and
1143	   the SSRC.

1145	   This method is described in RFC 7667, section 3.4 Point to multi-
1146	   point using mesh [RFC7667].

1148	   Pros:

1150	   When packets are lost, it is possible to recover text from
1151	   redundancy at loss of up to the number of redundancy levels carried
1152	   in the RFC 4103 [RFC4103] stream (normally primary and two redundant
1153	   levels).

1155	   This method can be implemented with most RTP implementations.

1157	   Transmitted text can also be used with other transports than RTP.

1159	   Cons:

1161	   This model is not described in IMS, NENA and EENA specifications, and
1162	   therefore does not meet the requirements.

1164	   Requires a drastically increasing number of connections when the
1165	   number of participants increases.

1167	4.1.2.4.  Multiple RTP sessions, one for each participant

1169	   Text from all participants is transmitted directly to all others in
1170	   one RTP session each, without a central bridge.  Each session is
1171	   established with a separate media description in SDP.  The sources of
1172	   the text in each RTP packet are identified by the source network
1173	   address and the SSRC.

1175	   Pros:

1177	   When packets are lost, it is possible to recover text from
1178	   redundancy at loss of up to the number of redundancy levels carried
1179	   in the RFC 4103 [RFC4103] stream (normally primary and two redundant
1180	   levels).

1182	   Complete loss of text can be indicated in the received stream.

1184	   This method can be implemented with most RTP implementations.

1186	   End-to-end encryption is achievable.

1188	   Cons:

1190	   This method is not described in IMS, NENA and ETSI specifications and
1191	   therefore does not meet the requirements.

1193	   A lot of network resources are spent on setting up separate sessions
1194	   for each participant.

1196	5.  Preferred RTP-based multi-party RTT transport method

1198	   For RTP transport of RTT using RTP-mixer technology, one method for
1199	   multi-party mixing and transport stands out as fulfilling the goals
1200	   best and is therefore recommended: "RTP Mixer using the default
1201	   method but decreased transmission interval" (Section 4.1.1.2).

1203	   For RTP transport in separate streams or sessions, no current
1204	   recommendation can be made.  A bridging method in the process of
1205	   standardisation with interesting characteristics is the end-to-end
1206	   encryption model "perc" (Section 4.1.2.2).

1208	6.  Session control of RTP-based multi-party RTT sessions

1210	   General session control aspects for multi-party sessions are
1211	   described in RFC 4575 [RFC4575] A Session Initiation Protocol (SIP)
1212	   Event Package for Conference State, and RFC 4579 [RFC4579] Session
1213	   Initiation Protocol (SIP) Call Control - Conferencing for User
1214	   Agents.  The nomenclature of these specifications is used here.

1216	   The procedures for a multi-party aware model for RTT-transmission
1217	   shall only be applied if a capability exchange for multi-party aware
1218	   real-time text transmission has been completed and a supported method
1219	   for multi-party real-time text transmission can be negotiated.

1221	   A method for detection of conference-awareness for centralized SIP
1222	   conferencing in general is specified in RFC 4579 [RFC4579].  The
1223	   focus sends the "isfocus" feature tag in a SIP Contact header.  This
1224	   causes the conference-aware endpoint to subscribe to conference
1225	   notifications from the focus.  The focus then sends notifications to
1226	   the endpoint about entering and disappearing conference participants
1227	   and their media capabilities.  The information is carried XML-
1228	   formatted in a 'conference-info' block in the notification according
1229	   to RFC 4575 [RFC4575].  The mechanism is described in detail in RFC
1230	   4575 [RFC4575].

1232	   Before a conference media server starts sending multi-party RTT to an
1233	   endpoint, a verification of its ability to handle multi-party RTT
1234	   must be made.  A decision on which mechanism to use for identifying
1235	   text from the different participants must also be taken, implicitly
1236	   or explicitly.  These verifications and decisions can be done in a
1237	   number of ways.  The most apparent ways are specified here and their
1238	   pros and cons described.  One of the methods is selected to be the
1239	   one to be used by implementations of the centralized conference model
1240	   according to this specification.

1242	6.1.  Implicit RTT multi-party capability indication

1244	   Capability for RTT multi-party handling can be decided to be
1245	   implicitly indicated by session control items.

1247	   The focus may implicitly indicate multi-party RTT capability by
1248	   including the media child with value "text" in the RFC 4575 [RFC4575]
1249	   conference-info provided in conference notifications.

1251	   An endpoint may implicitly indicate multi-party RTT capability by
1252	   including the text media in the SDP in the session control
1253	   transactions with the conference focus after the subscription to the
1254	   conference has taken place.

1256	   The implicit RTT capability indication means for the focus that it
1257	   can handle multi-party RTT according to the preferred method
1258	   indicated in the RTT multi-party methods section above.

1260	   The implicit RTT capability indication means for the endpoint that it
1261	   can handle multi-party RTT according to the preferred method
1262	   indicated in the RTT multi-party methods section above.

1264	   If the focus detects that an endpoint implicitly declared RTT multi-
1265	   party capability, it SHALL provide RTT according to the preferred
1266	   method.

1268	   If the focus detects that the endpoint does not indicate any RTT
1269	   multi-party capability, then it shall either provide RTT multi-party
1270	   text in the way specified for conference-unaware endpoints above, or
1271	   refuse to set up the session.

1273	   If the endpoint detects that the focus has implicitly declared RTT
1274	   multi-party capability, it shall be prepared to present RTT in a
1275	   multi-party fashion according to the preferred method.

1277	   Pros:

1279	   Acceptance of implicit multi-party capability implies that no
1280	   standardisation of explicit RTT multi-party capability exchange is
1281	   required.

1283	   Cons:

1285	   If other methods for multi-party RTT are to be used in the same
1286	   implementation environment as the preferred ones, then capability
1287	   exchange needs to be defined for them.

1289	   Cannot be used outside a strictly applied SIP central conference
1290	   model.

1292	6.2.  RTT multi-party capability declared by SIP media-tags

1294	   Specifications for RTT multi-party capability declarations can be
1295	   agreed for use as SIP media feature tags, to be exchanged during SIP
1296	   call control operation according to the mechanisms in RFC 3840
1297	   [RFC3840] and RFC 3841 [RFC3841].  RTT multi-party capability is
1298	   then indicated by the media feature tag "rtt-mix", with a set of
1299	   possible values for the different possible methods.

1301	   The possible values in the list may for example be:

1303	      rtp-mixer

1305	      perc

1307	   rtp-mixer indicates capability for using the RTP-mixer based
1308	   presentation of multi-party text.

1310	   perc indicates capability for using the perc based transmission of
1311	   multi-party text.

1313	   Example: Contact: <sip:a2@beco.example.com>
1315	            ;methods="INVITE,ACK,OPTIONS,BYE,CANCEL"
1316	            ;+sip.rtt-mix="rtp-mixer"

1318	   If, after evaluation of the alternatives in this specification, only
1319	   one mixing method is selected to be brought to implementation, then
1320	   the media tag can be reduced to a single tag with no list of values.

1322	   An offer-answer exchange should take place and the common method
1323	   selected by the answering party shall be used in the session with
1324	   that UA.

1326	   When no common method is declared, then only the fallback method for
1327	   multi-party unaware participants can be used, or the session dropped.

1329	   If more than one text media section is included in SDP, all must be
1330	   capable of using the declared RTT multi-party method.
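
	   As an illustration only, the selection described in this section
	   could be sketched in Python as follows.  The tag name "rtt-mix" and
	   its values are the examples used above.  The function names are
	   hypothetical, and the regular expression is a simplification, not a
	   complete RFC 3840 parser.

```python
import re


def rtt_mix_methods(contact_header):
    """Return the methods listed in the +sip.rtt-mix feature tag, or []."""
    match = re.search(r';\s*\+sip\.rtt-mix="([^"]*)"', contact_header)
    if match is None:
        return []
    return [v.strip().lower() for v in match.group(1).split(",") if v.strip()]


def select_method(offered, local_preference):
    """Pick the first locally preferred method that the peer also declared."""
    for method in local_preference:
        if method in offered:
            return method
    # No common method: only the fallback for multi-party unaware
    # participants can be used, or the session is dropped.
    return None


contact = ('Contact: <sip:a2@beco.example.com>'
           ';methods="INVITE,ACK,OPTIONS,BYE,CANCEL"'
           ';+sip.rtt-mix="rtp-mixer"')
offered = rtt_mix_methods(contact)
chosen = select_method(offered, ["perc", "rtp-mixer"])
```

	   Here the answering party prefers "perc" but the peer declared only
	   "rtp-mixer", so "rtp-mixer" is chosen.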

1332	   Pros:

1334	   Provides a clear decision method.

1336	   Can be extended with new mixing methods.

1338	   Can guide call routing to a suitable capable focus.

1340	   Cons:

1342	   Requires standardization and IANA registration.

1344	   Is not stream specific.  If more than one text stream is specified,
1345	   all must have the same type of multi-party capability.

1347	   Cannot be used in the WebRTC environment.

1349	6.3.  SDP media attribute for RTT multi-party capability indication

1351	   An attribute can be specified on media level, to be used in text
1352	   media SDP declarations for negotiating RTT multi-party capabilities.
1353	   The attribute can have the name "rtt-mixing".

1355	   More than one attribute can be included in one media description.

1357	   The attribute can have a value.  The value can for example be:

1359	      rtp-mixer

1361	      rtp-translator

1362	      perc

1364	   rtp-mixer indicates capability for using the RTP-mixer and CSRC-list
1365	   based mixing of multi-party text.

1367	   rtp-translator indicates capability for using the RTP-translator
1368	   based mixing of multi-party text.

1370	   perc indicates capability for using the perc based transmission of
1371	   multi-party text.

1373	   An offer-answer exchange should take place and the common method
1374	   selected by the answering party shall be used in the session with
1375	   that endpoint.

1377	   When no common method is declared, then only the fallback method for
1378	   multi-party unaware endpoints can be used.

1380	   Example: a=rtt-mixing:rtp-mixer

1382	   If, after evaluation of the alternatives in this specification, only
1383	   one mixing method is selected to be brought to implementation, then
1384	   the attribute can be reduced to a single attribute with no list of
1385	   values.
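
	   A minimal Python sketch of how the attribute could be read from an
	   SDP body and how an answering party could select a common method is
	   given here.  The attribute name "rtt-mixing" and its values are the
	   examples from this section.  The function names, port numbers and
	   payload type numbers are assumptions made for illustration.

```python
def rtt_mixing_methods(sdp):
    """Map each text media section (by m-line index) to its declared methods."""
    result = {}
    index = -1
    in_text = False
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            index += 1
            in_text = line.startswith("m=text ")
            if in_text:
                result[index] = []
        elif in_text and line.startswith("a=rtt-mixing:"):
            # More than one attribute may appear in one media description.
            result[index].append(line.split(":", 1)[1].strip())
    return result


def answer_method(offered, supported):
    """Return the first supported method that was offered, or None."""
    for method in supported:
        if method in offered:
            return method
    return None  # only the fallback for unaware endpoints can be used


offer = "\r\n".join([
    "v=0",
    "m=audio 49170 RTP/AVP 0",
    "m=text 11000 RTP/AVP 100 98",
    "a=rtpmap:98 t140/1000",
    "a=rtpmap:100 red/1000",
    "a=fmtp:100 98/98/98",
    "a=rtt-mixing:rtp-mixer",
    "a=rtt-mixing:perc",
])
declared = rtt_mixing_methods(offer)
```

	   Because the attribute is read per media section, each text stream
	   can be negotiated separately.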

1387	   Pros:

1389	   Provides a clear decision method.

1391	   Can be extended with new mixing methods.

1393	   Can be used on specific text media.

1395	   Can be used also for SDP-controlled WebRTC sessions with multiple
1396	   streams in the same data channel.

1398	   Cons:

1400	   Requires standardization and IANA registration.

1402	   Cannot guide SIP routing.

1404	6.4.  Simplified SDP media attribute for RTT multi-party capability
1405	      indication

1407	   An attribute can be specified on media level, to be used in text
1408	   media SDP declarations for negotiating RTT multi-party capabilities.
1409	   The attribute can have a name suitable for the selected method and no
1410	   value.  It would be selected and used if only one method for multi-
1411	   party RTT is brought forward from this specification, and the others
1412	   are left unspecified for now or found to be possible to negotiate in
1413	   another way.

1415	   An offer-answer exchange should take place and if both parties
1416	   specify rtt-mixing capability with the same attribute, the selected
1417	   mixing method shall be used.

1419	   When no common method is declared, then only the fallback method for
1420	   multi-party unaware endpoints can be used, or the session not
1421	   accepted for multi-party use.

1423	   Example: a=rtt-mix
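
	   The decision rule for the simplified attribute can be sketched in
	   Python as below.  The attribute name "rtt-mix" is the example from
	   this section; the function names and the representation of a media
	   section as a list of attribute lines are assumptions.

```python
def has_rtt_mix(text_media_attributes):
    """True if the valueless a=rtt-mix attribute is present."""
    return any(a.strip() == "a=rtt-mix" for a in text_media_attributes)


def use_multi_party_mixing(offer_attrs, answer_attrs):
    """The mixing method is used only if both parties declared the attribute."""
    return has_rtt_mix(offer_attrs) and has_rtt_mix(answer_attrs)


offer_attrs = ["a=rtpmap:98 t140/1000", "a=rtt-mix"]
aware_answer = ["a=rtpmap:98 t140/1000", "a=rtt-mix"]
unaware_answer = ["a=rtpmap:98 t140/1000"]
```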

1425	   Pros:

1427	   Provides a clear decision method.

1429	   Very simple syntax and semantics.

1431	   Can be used on specific text media.

1433	   Cons:

1435	   Requires standardization and IANA registration.

1437	   If another RTT mixing method is specified in the future, that method
1438	   may need to specify and register its own attribute.  With an
1439	   attribute that takes a parameter value, only the addition of a new
1440	   possible value would be needed.

1442	   Cannot guide SIP routing.

1444	6.5.  SDP format parameter for RTT multi-party capability indication

1446	   An fmtp format parameter can be specified for the RFC 4103 [RFC4103]
1447	   media, to be used in text media SDP declarations for
1448	   negotiating RTT multi-party capabilities.  The parameter can have the
1449	   name "rtt-mixing", with one or more of its possible values.

1451	   The possible values in the list are:

1453	      rtp-mixer

1455	      perc

1457	   rtp-mixer indicates capability for using the RTP-mixer based mixing
1458	   and presentation of multi-party text using the CSRC-list.

1460	   perc indicates capability for using the perc based transmission of
1461	   multi-party text.

1463	   Example: a=fmtp:96 98/98/98 rtt-mixing=rtp-mixer

1465	   If, after evaluation of the alternatives in this specification, only
1466	   one mixing method is selected to be brought to implementation, then
1467	   the parameter can be reduced to a single parameter with no list of
1468	   values.

1470	   An offer-answer exchange should take place and the common method
1471	   selected by the answering party shall be used in the session with
1472	   that UA.

1474	   When no common method is declared, then only the fallback method can
1475	   be used, or the session denied.

1477	   Pros:

1479	   Provides a clear decision method.

1481	   Can be extended with new mixing methods.

1483	   Can be used on specific text media.

1485	   Can be used also for SDP-controlled WebRTC sessions with multiple
1486	   streams in the same data channel.

1488	   Cons:

1490	   Requires standardization and IANA registration.

1492	   May cause interop problems with current RFC4103 [RFC4103]
1493	   implementations not expecting a new fmtp-parameter.

1495	   Cannot guide SIP routing.

1497	6.6.  A text media subtype for support of multi-party rtt

1499	   Indicating a specific text media subtype in SDP is a straightforward
1500	   way to negotiate multi-party capability.  Especially if there are
1501	   format differences from the "text/red" and "text/t140" formats of
1502	   RFC 4103 [RFC4103], this is a natural way to do the negotiation
1503	   for multi-party RTT.

1505	   Pros:

1507	   No extra efforts if a new format is needed anyway.

1509	   Cons:

1511	   None specific to using the format indication for negotiation of
1512	   multi-party capability.  But only feasible if a new format is needed
1513	   anyway.

1515	6.7.  Preferred capability declaration method for RTP-based transport.

1517	   If the preferred transport method is one with a specific media
1518	   subtype in SDP, then specification by media subtype is preferred.

1520	   Otherwise, the preferred capability declaration method is the one
1521	   with a specific SDP attribute for the selected mixing method
1522	   (Section 6.4), because it is straightforward.

1524	6.8.  Identification of the source of text for RTP-based solutions

1526	   The main way to identify the source of text in the RTP based solution
1527	   is by the SSRC of the sending participant.  In the RTP-mixer
1528	   solution, this SSRC is included in the CSRC list of the transmitted
1529	   packets.  Further identification that may be needed for better
1530	   labelling of received text may be achieved from a number of sources.
1531	   These include the RTCP SDES CNAME and NAME reports and the conference
1532	   notification data of RFC 4575 [RFC4575].

1534	   As soon as a new member is added to the RTP session, its
1535	   characteristics should be transmitted in RTCP SDES CNAME and NAME
1536	   reports according to section 6.5 in RFC 3550 [RFC3550].  The
1537	   information about the participant should also be included in the
1538	   conference data including the text media member in a notification
1539	   according to RFC 4575 [RFC4575].

1541	   The RTCP SDES report SHOULD contain identification of the source
1542	   represented by the SSRC/CSRC identifier.  This identification MUST
1543	   contain the CNAME field and MAY contain the NAME field and other
1544	   defined fields of the SDES report.

1546	   A focus UA SHOULD primarily convey SDES information received from the
1547	   sources of the session members.  When such information is not
1548	   available, the focus UA SHOULD compose SSRC/CSRC, CNAME and NAME
1549	   information from available information from the SIP session with the
1550	   participant.
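
	   The lookup described above can be sketched in Python as follows.
	   The class and method names are hypothetical.  The sketch assumes
	   that, in the RTP-mixer solution, each packet carries text from a
	   single source so that the first CSRC entry identifies it, and that
	   information composed from the SIP session is used only when no SDES
	   information has been received for that source.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SourceInfo:
    cname: str
    name: Optional[str] = None


class SourceTable:
    def __init__(self):
        self._by_id: Dict[int, SourceInfo] = {}

    def on_sdes(self, ssrc, cname, name=None):
        """SDES information received from the source takes precedence."""
        self._by_id[ssrc] = SourceInfo(cname, name)

    def on_sip_participant(self, ssrc, sip_uri, display_name=None):
        """Compose an entry from SIP data only if no entry exists yet."""
        self._by_id.setdefault(ssrc, SourceInfo(sip_uri, display_name))

    def source_of(self, ssrc, csrc_list):
        """Return (source id, info) for a received text packet."""
        source_id = csrc_list[0] if csrc_list else ssrc
        return source_id, self._by_id.get(source_id)


table = SourceTable()
table.on_sdes(0x1111, "alice@host.example", "Alice")
table.on_sip_participant(0x1111, "sip:alice@example.com")
table.on_sip_participant(0x2222, "sip:bob@example.com", "Bob")
```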

1552	   Provision of detailed information in the NAME field has security
1553	   implications, especially if provided without encryption.

1555	7.  RTT bridging in WebRTC

1557	   Within WebRTC, real-time text is specified to be carried in WebRTC
1558	   data channels as specified in
1559	   [I-D.ietf-mmusic-t140-usage-data-channel].  A few ways to handle
1560	   multi-party RTT are mentioned briefly there and are repeated below.

1562	7.1.  RTT bridging in WebRTC with one data channel per source

1564	   A straightforward way to handle multi-party RTT is for the bridge to
1565	   open one T.140 data channel per source towards the receiving
1566	   participants.

1568	   The stream-id forms a unique stream identification.

1570	   The identification of the source is made through the Label property
1571	   of the channel, and session information belonging to the source.  The
1572	   endpoint can compose a readable label for the presentation from this
1573	   information.

1575	   Pros:

1577	   This is a straightforward solution.

1579	   The load per source is low.

1581	   Cons:

1583	   With a high number of participants, the overhead of establishing and
1584	   maintaining the high number of data channels required may be high,
1585	   even if the load per channel is low.

1587	7.2.  RTT bridging in WebRTC with one common data channel

1589	   A way to handle multi-party RTT in WebRTC is for the bridge to
1590	   combine text from all sources into one data channel and insert the
1591	   sources in the stream by a T.140 control code for source.

1593	   This method is described in a corresponding section for RTP
1594	   transmission above in Section 4.1.1.9.

1596	   The identification of the source is made through insertion in the
1597	   beginning of each text transmission from a source of a control code
1598	   extension "c" followed by a string representing the source, framed by
1599	   the control code start and end flags SOS and ST (See ITU-T T.140
1600	   [T140]).

1602	   A receiving endpoint is supposed to separate text items from the
1603	   different sources and identify and display them in a suitable way.

1605	   The endpoint does not always display the source identification in the
1606	   received text at the place where it is received, but has the
1607	   information as a guide for planning the presentation of received
1608	   text.  A label corresponding to the source identification is
1609	   presented when needed depending on the selected presentation style.
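
	   A receiver for this format could separate the sources roughly as in
	   the Python sketch below.  The code points used for SOS (U+0098) and
	   ST (U+009C) are assumptions to be checked against ITU-T T.140
	   [T140], the "c" extension is the one proposed in this section and
	   is not standardised, and the function name is hypothetical.

```python
SOS = "\u0098"  # Start Of String (assumed code point)
ST = "\u009c"   # String Terminator (assumed code point)


def split_by_source(stream, current=None):
    """Return a list of (source, text) chunks from a mixed text stream."""
    chunks = []
    pos = 0
    while pos < len(stream):
        start = stream.find(SOS, pos)
        end = stream.find(ST, start + 1) if start != -1 else -1
        if start == -1 or end == -1:
            # No complete control sequence is left.  The remainder is kept
            # as text here; a real receiver would buffer an unterminated
            # sequence until the rest of it arrives.
            if stream[pos:]:
                chunks.append((current, stream[pos:]))
            break
        if start > pos:
            chunks.append((current, stream[pos:start]))
        body = stream[start + 1:end]
        if body.startswith("c"):
            current = body[1:]  # the string after "c" names the source
        # Other control sequences are skipped in this sketch.
        pos = end + 1
    return chunks


mixed = SOS + "cAlice" + ST + "Hi " + SOS + "cBob" + ST + "Hello"
parts = split_by_source(mixed)
```

	   The returned chunks carry the source with each text item, leaving
	   the choice of presentation style to the receiving application.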

1611	   Pros:

1613	   This solution has relatively low overhead on session and network
1614	   level.

1616	   Cons:

1618	   This solution has higher overhead on the media contents level than
1619	   the WebRTC solution above.

1621	   Standardisation of the new control code "c" in ITU-T T.140 [T140] is
1622	   required.

1624	   The conference server needs to be allowed to decrypt/encrypt the
1625	   data channel contents.

1627	7.3.  Preferred rtt multi-party method for WebRTC

1629	   For WebRTC, one method is preferred because of its simplicity.  So,
1630	   for WebRTC, the method to implement for multi-party RTT with multi-
1631	   party aware parties when no other method is explicitly agreed between
1632	   implementing parties is: "RTT bridging in WebRTC with one data
1633	   channel per source" (Section 7.1).

1635	8.  Presentation of multi-party text

1637	   All session participants with RTP based transport MUST observe the
1638	   SSRC/CSRC field of incoming text RTP packets, and make note of which
1639	   source they came from in order to be able to present text in a way
1640	   that makes it easy to read text from each participant in a session,
1641	   and get information about the source of the text.

1643	   In the WebRTC case, the Label parameter and other provided endpoint
1644	   information should be used for the same purpose.

1646	8.1.  Associating identities with text streams

1648	   A source identity SHOULD be composed from available information
1649	   sources and displayed together with the text as indicated in ITU-T
1650	   T.140 Appendix [T140].

1652	   The source identity should primarily be the NAME field from incoming
1653	   SDES packets.  If this information is not available, and the session
1654	   is a two-party session, then the T.140 source identity SHOULD be
1655	   composed from the SIP session participant information.  For multi-
1656	   party sessions the source identity may be composed by local
1657	   information if sufficient information is not available in the
1658	   session.

1660	   Applications may abbreviate the presented source identity to a
1661	   suitable form for the available display.

1663	   Applications may also replace received source information with
1664	   internally used nicknames.
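
	   One possible way to compose the presented label is sketched in
	   Python below.  The order of preference follows the text above in
	   simplified form.  The function name, the "Unknown" placeholder and
	   the default length limit are assumptions.

```python
def source_label(sdes_name=None, sip_display_name=None, local_name=None,
                 nicknames=None, max_length=12):
    """Compose a display label for a text source."""
    # Prefer the SDES NAME field, then SIP session data, then local data.
    label = sdes_name or sip_display_name or local_name or "Unknown"
    # The application may replace received names with its own nicknames.
    if nicknames and label in nicknames:
        label = nicknames[label]
    # Abbreviate to a form suitable for the available display.
    if len(label) > max_length:
        label = label[:max_length - 3] + "..."
    return label
```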

1666	8.2.  Presentation details for multi-party aware endpoints.

1668	   The multi-party aware endpoint should, after any action for recovery
1669	   of data from lost packets, separate the incoming streams and present
1670	   them according to the style that the receiving application supports
1671	   and the user has selected.  The decisions taken for presentation of
1672	   the multi-party interchange shall be purely on the receiving side.
1673	   The sending application must not insert any item in the stream to
1674	   influence presentation that is not requested by the sending
1675	   participant.

1677	8.2.1.  Bubble style presentation

1679	   One often used style is to present real-time text in chunks in
1680	   readable bubbles identified by labels containing names of sources.
1681	   Bubbles are placed in one column in the presentation area and are
1682	   closed and moved upwards in the presentation area after certain items
1683	   or events, when there is also newer text from another source that
1684	   would go into a new bubble.  The text items that allow bubble
1685	   closing are any character closing a phrase or sentence followed by a
1686	   space or a timeout of a suitable time (about 10 seconds).
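
	   The closing rule can be expressed as a small Python predicate.  This
	   is a sketch only: the set of phrase-closing characters and the
	   function name are assumptions, and the timeout uses the value of
	   about 10 seconds mentioned above.

```python
PHRASE_END = ".!?,;:"   # assumed set of phrase or sentence closing characters
IDLE_TIMEOUT = 10.0     # seconds


def may_close_bubble(text, seconds_since_last_char, other_source_has_newer_text):
    """Decide whether a real-time active bubble may be closed and moved up."""
    if not other_source_has_newer_text:
        return False
    ends_phrase = (len(text) >= 2 and text[-1] == " "
                   and text[-2] in PHRASE_END)
    timed_out = seconds_since_last_char >= IDLE_TIMEOUT
    return ends_phrase or timed_out
```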

1688	   Real-time active text sent from the local user should be presented in
1689	   a separate area.  When there is a reason to close a bubble from the
1690	   local user, the bubble should be placed above all real-time active
1691	   bubbles, so that the time order that real-time text entries were
1692	   completed is visible.

1694	   Scrolling is usually provided for viewing of recent or older text.
1695	   When scrolling is done to an earlier point in the text, the
1696	   presentation shall not move the scroll position by new received text.
1697	   It must be the decision of the local user to return to automatic
1698	   viewing of latest text actions.  An indication that there is new
1699	   text to read may be useful after scrolling to an earlier position
1700	   has been activated.

1702	   The presentation area may become too small to present all text in all
1703	   real-time active bubbles.  Various techniques can be applied to
1704	   provide a good overview and good reading opportunity even in such
1705	   situations.  The active real-time bubble may have a limited number of
1706	   lines and if their contents need more lines, then a scrolling
1707	   opportunity within the real-time active bubble is provided.  Another
1708	   method can be to only show the label and the last line of the active
1709	   real-time bubble contents, and make it possible to expand or compress
1710	   the bubble presentation between full view and one line view.

1712	   Erasures require special consideration.  Erasure within a real-time
1713	   active bubble is straightforward.  But if erasure from one
1714	   participant affects the last character before a bubble, the whole
1715	   previous bubble becomes the actual bubble for real-time action by
1716	   that participant and is placed below all other bubbles in the
1717	   presentation area.  If the border between bubbles was caused by the
1718	   CRLF characters (instead of the normal "Line Separator"), only one
1719	   erasure action is required to erase this bubble border.  When a
1720	   bubble is closed, it is moved up, above all real-time active bubbles.

1722	   A three-party view is shown in this example.

1724	                 _________________________________________________
1725	                |                                              |^|
1726	                |                                              |-|
1727	                |[Alice] Hi, Alice here.                       | |
1728	                |                                              | |
1729	                |[Bob] Bob as well.                            | |
1730	                |                                              | |
1731	                |[Eve] Hi, this is Eve, calling from Paris.    | |
1732	                |      I thought you should be here.           | |
1733	                |                                              | |
1734	                |[Alice] I am coming on Thursday, my           | |
1735	                |      performance is not until Friday morning.| |
1736	                |                                              | |
1737	                |[Bob] And I on Wednesday evening.             | |
1738	                |                                              | |
1739	                |[Alice] Can we meet on Thursday evening?      | |
1740	                |                                              | |
1741	                |[Eve] Yes, definitely. How about 7pm.         | |
1742	                |     at the entrance of the restaurant        | |
1743	                |     Le Lion Blanc?                           | |
1744	                |[Eve] we can have dinner and then take a walk | |
1745	                |                                              | |
1746	                | <Eve-typing> But I need to be back to        | |
1747	                |    the hotel by 11 because I need            | |
1748	                |                                              | |
1749	                | <Bob-typing> I wou                           |-|
1750	                |______________________________________________|v|
1751	                | of course, I underst                           |
1752	                |________________________________________________|

1754	               Figure 1: Three-party call with bubble style.

1756	   Figure 1: Example of a three-party call presented in the bubble
1757	   style.

1759	8.2.2.  Other presentation styles

1761	   Other presentation styles than the bubble style may be arranged and
1762	   appreciated by the users.  In a video conference one way may be to
1763	   have a real-time text area below the video view of each participant.
1764	   Another view may be to provide one column in a presentation area for
1765	   each participant and place the text entries in a relative vertical
1766	   position corresponding to when text entry in them was completed.  The
1767	   labels can then be placed in the column header.  The considerations
1768	   for ending and moving and erasure of entered text discussed above for
1769	   the bubble style are valid also for these styles.

1771	   This figure shows how a coordinated column view MAY be presented.

1773	   _____________________________________________________________________
1774	   |       Bob          |       Eve            |       Alice           |
1775	   |____________________|______________________|_______________________|
1776	   |                    |                      |I will arrive by TGV.  |
1777	   |My flight is to Orly|                      |Convenient to the main |
1778	   |                    |Hi all, can we plan   |station.               |
1779	   |                    |for the seminar?      |                       |
1780	   |Eve, will you do    |                      |                       |
1781	   |your presentation on|                      |                       |
1782	   |Friday?             |Yes, Friday at 10.    |                       |
1783	   |Fine, wo            |                      |We need to meet befo   |
1784	   |___________________________________________________________________|

1786	   Figure 2: A coordinated column-view of a three-party session with
1787	   entries ordered in approximate time-order.

1789	9.  Presentation details for multi-party unaware endpoints.

1791	   Multi-party unaware endpoints are prepared only for presentation of
1792	   two sources of text, the local user and a remote user.  If mixing for
1793	   multi-party unaware endpoints is to be supported, in order to enable
1794	   some multi-party communication with such endpoints, the mixer needs
1795	   to plan the presentation and insert labels and line breaks before
1796	   labels.  Many limitations appear for this presentation mode, and it
1797	   must be seen as a fallback and a last resort.

1799	   A procedure for presenting RTT to a conference-unaware endpoint is
1800	   included in [I-D.ietf-avtcore-multi-party-rtt-mix].

1802	10.  Security Considerations

1804	   The security considerations valid for RFC 4103 [RFC4103] and RFC 3550
1805	   [RFC3550] are valid also for the multi-party sessions with text.

1807	11.  IANA Considerations

1809	   The items for indication and negotiation of capability for multi-
1810	   party RTT should be registered with IANA in the specifications where
1811	   they are specified in detail.

1813	12.  Congestion considerations

1815	   The congestion considerations described in RFC 4103 [RFC4103] are
1816	   valid also for the recommended RTP-based multi-party use of the real-
1817	   time text transport.  A risk for congestion may appear if a number of
1818	   conference participants are actively transmitting text
1819	   simultaneously, because the recommended RTP-based multi-party
1820	   transmission method does not allow multiple sources of text to
1821	   contribute to the same packet.

1823	   In situations of risk for congestion, the Focus UA MAY combine
1824	   packets from the same source to increase the transmission interval
1825	   per source up to one second.  Local conference policy in the Focus UA
1826	   may be used to decide which streams shall be selected for such
1827	   transmission frequency reduction.
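
	   The interval increase could be realised with one buffer per source,
	   as in this Python sketch.  The class name and the default interval
	   are assumptions; only the upper limit of one second is taken from
	   the text above.

```python
class SourceBuffer:
    """Collects text from one source and releases it at a chosen interval."""

    MAX_INTERVAL = 1.0  # seconds, upper limit for the transmission interval

    def __init__(self, interval=0.3):
        self.interval = min(interval, self.MAX_INTERVAL)
        self._pending = []
        self._last_sent = None

    def add(self, text):
        """Queue text received from the source."""
        self._pending.append(text)

    def poll(self, now):
        """Return the combined pending text if it is time to send, else None."""
        if not self._pending:
            return None
        if self._last_sent is not None and now - self._last_sent < self.interval:
            return None
        combined = "".join(self._pending)
        self._pending.clear()
        self._last_sent = now
        return combined
```

	   Local conference policy would decide for which sources the interval
	   is raised and by how much.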

1829	13.  Acknowledgements

1831	   Thanks to Arnoud van Wijk for contributions to an earlier, expired
1832	   draft of this memo.

1834	14.  Change history

1836	14.1.  Changes to draft-hellstrom-avtcore-multi-party-rtt-solutions-04

1838	   Change name of simplified sdp attribute to "rtt-mix" to match a
1839	   change in the draft draft-ietf-avtcore-multi-party-rtt-mix-09.

1841	14.2.  Changes to draft-hellstrom-avtcore-multi-party-rtt-solutions-03

1843	   Modified info on the method with RFC 4103 format and sdp attribute
1844	   "rtt-mix-rtp-mixer".

1846	   Increased the performance requirements section.

1848	   Inserted recommendations, with emphasis on ease of implementation and
1849	   ease of standardisation.

1851	14.3.  Changes to draft-hellstrom-avtcore-multi-party-rtt-solutions-02

1853	   Added detail in the section on RTP translator model alternative
1854	   4.1.2.1.

1856	14.4.  Changes to draft-hellstrom-avtcore-multi-party-rtt-solutions-01

1858	   Added three more methods for RTP-mixer mixing.  Two RFC 5109 FEC
1859	   based and another with modified data header to detect source of
1860	   completely lost text.

1862	   Separated RTP-based and WebRTC based solutions.

1864	   Deleted the multi-party-unaware mixing procedure appendix.  It is now
1865	   included in the draft draft-ietf-avtcore-multi-party-rtt-mix.  Kept a
1866	   section with a reference to the new place.

1868	14.5.  Changes from draft-hellstrom-mmusic-multi-party-rtt-02 to draft-
1869	       hellstrom-avtcore-multi-party-rtt-solutions-00

1871	   Add discussion about switching performance, as discussed in avtcore
1872	   on March 13.

1874	   Added that a decrease of transmission interval to 100 ms increases
1875	   switching performance by a factor 3, but still not sufficient.

1877	   Added that the CSRC-list method also uses 100 milliseconds
1878	   transmission interval.

1880	   Added the method with multiple primary text in each packet.

1882	   Added the timestamp-based method for rtp-mixing proposed by James
1883	   Hamlin on March 14.

1885	   Corrected the chat style presentation example picture.  Delete a few
1886	   "[mix]".

1888	14.6.  Changes from version draft-hellstrom-mmusic-multi-party-rtt-01 to
1889	       -02

1891	   Change from a general overview to overview with clear
1892	   recommendations.

1894	   Splits text coordination methods in three groups.

1896	   Recommends rtt-mixer with sources in CSRC-list but refers to its spec
1897	   for details.

1899	   Shortened Appendix with conference-unaware example.

1901	   Cleaned up preferences.

1903	   Inserted pictures of screen-views.

1905	15.  References

1907	15.1.  Normative References

1909	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1910	              Requirement Levels", BCP 14, RFC 2119,
1911	              DOI 10.17487/RFC2119, March 1997,
1912	              <https://www.rfc-editor.org/info/rfc2119>.

1914	15.2.  Informative References

1916	   [EN301549] ETSI, "EN 301 549. Accessibility requirements for ICT
1917	              products and services", November 2019,
1918	              <https://www.etsi.org/deliver/
1919	              etsi_en/301500_301599/301549/03.01.01_60/
1920	              en_301549v030101p.pdf>.

1922	   [I-D.ietf-avtcore-multi-party-rtt-mix]
1923	              Hellstrom, G., "RTP-mixer formatting of multi-party Real-
1924	              time text", Work in Progress, Internet-Draft, draft-ietf-
1925	              avtcore-multi-party-rtt-mix-06, 11 June 2020,
1926	              <https://tools.ietf.org/html/draft-ietf-avtcore-multi-
1927	              party-rtt-mix-06>.

1929	   [I-D.ietf-avtcore-multiplex-guidelines]
1930	              Westerlund, M., Burman, B., Perkins, C., Alvestrand, H.,
1931	              and R. Even, "Guidelines for using the Multiplexing
1932	              Features of RTP to Support Multiple Media Streams", Work
1933	              in Progress, Internet-Draft, draft-ietf-avtcore-multiplex-
1934	              guidelines-12, 16 June 2020, <https://tools.ietf.org/html/
1935	              draft-ietf-avtcore-multiplex-guidelines-12>.

1937	   [I-D.ietf-mmusic-t140-usage-data-channel]
1938	              Holmberg, C. and G. Hellstrom, "T.140 Real-time Text
1939	              Conversation over WebRTC Data Channels", Work in Progress,
1940	              Internet-Draft, draft-ietf-mmusic-t140-usage-data-channel-
1941	              14, 10 April 2020, <https://tools.ietf.org/html/draft-
1942	              ietf-mmusic-t140-usage-data-channel-14>.

1944	   [I-D.ietf-perc-private-media-framework]
1945	              Jones, P., Benham, D., and C. Groves, "A Solution
1946	              Framework for Private Media in Privacy Enhanced RTP
1947	              Conferencing (PERC)", Work in Progress, Internet-Draft,
1948	              draft-ietf-perc-private-media-framework-12, 5 June 2019,
1949	              <https://tools.ietf.org/html/draft-ietf-perc-private-
1950	              media-framework-12>.

1952	   [NENAi3]   NENA, "NENA-STA-010.2-2016. Detailed Functional and
1953	              Interface Standards for the NENA i3 Solution", October
1954	              2016, <https://www.nena.org/page/i3_Stage3>.

1956	   [RFC2198]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
1957	              Handley, M., Bolot, J.C., Vega-Garcia, A., and S. Fosse-
1958	              Parisis, "RTP Payload for Redundant Audio Data", RFC 2198,
1959	              DOI 10.17487/RFC2198, September 1997,
1960	              <https://www.rfc-editor.org/info/rfc2198>.

1962	   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
1963	              A., Peterson, J., Sparks, R., Handley, M., and E.
1964	              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
1965	              DOI 10.17487/RFC3261, June 2002,
1966	              <https://www.rfc-editor.org/info/rfc3261>.

1968	   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
1969	              with Session Description Protocol (SDP)", RFC 3264,
1970	              DOI 10.17487/RFC3264, June 2002,
1971	              <https://www.rfc-editor.org/info/rfc3264>.

1973	   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
1974	              Jacobson, "RTP: A Transport Protocol for Real-Time
1975	              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
1976	              July 2003, <https://www.rfc-editor.org/info/rfc3550>.

1978	   [RFC3840]  Rosenberg, J., Schulzrinne, H., and P. Kyzivat,
1979	              "Indicating User Agent Capabilities in the Session
1980	              Initiation Protocol (SIP)", RFC 3840,
1981	              DOI 10.17487/RFC3840, August 2004,
1982	              <https://www.rfc-editor.org/info/rfc3840>.

1984	   [RFC3841]  Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Caller
1985	              Preferences for the Session Initiation Protocol (SIP)",
1986	              RFC 3841, DOI 10.17487/RFC3841, August 2004,
1987	              <https://www.rfc-editor.org/info/rfc3841>.

1989	   [RFC4103]  Hellstrom, G. and P. Jones, "RTP Payload for Text
1990	              Conversation", RFC 4103, DOI 10.17487/RFC4103, June 2005,
1991	              <https://www.rfc-editor.org/info/rfc4103>.

1993	   [RFC4353]  Rosenberg, J., "A Framework for Conferencing with the
1994	              Session Initiation Protocol (SIP)", RFC 4353,
1995	              DOI 10.17487/RFC4353, February 2006,
1996	              <https://www.rfc-editor.org/info/rfc4353>.

1998	   [RFC4575]  Rosenberg, J., Schulzrinne, H., and O. Levin, Ed., "A
1999	              Session Initiation Protocol (SIP) Event Package for
2000	              Conference State", RFC 4575, DOI 10.17487/RFC4575, August
2001	              2006, <https://www.rfc-editor.org/info/rfc4575>.

2003	   [RFC4579]  Johnston, A. and O. Levin, "Session Initiation Protocol
2004	              (SIP) Call Control - Conferencing for User Agents",
2005	              BCP 119, RFC 4579, DOI 10.17487/RFC4579, August 2006,
2006	              <https://www.rfc-editor.org/info/rfc4579>.

2008	   [RFC4597]  Even, R. and N. Ismail, "Conferencing Scenarios",
2009	              RFC 4597, DOI 10.17487/RFC4597, August 2006,
2010	              <https://www.rfc-editor.org/info/rfc4597>.

2012	   [RFC5109]  Li, A., Ed., "RTP Payload Format for Generic Forward Error
2013	              Correction", RFC 5109, DOI 10.17487/RFC5109, December
2014	              2007, <https://www.rfc-editor.org/info/rfc5109>.

2016	   [RFC5194]  van Wijk, A., Ed. and G. Gybels, Ed., "Framework for Real-
2017	              Time Text over IP Using the Session Initiation Protocol
2018	              (SIP)", RFC 5194, DOI 10.17487/RFC5194, June 2008,
2019	              <https://www.rfc-editor.org/info/rfc5194>.

2021	   [RFC5576]  Lennox, J., Ott, J., and T. Schierl, "Source-Specific
2022	              Media Attributes in the Session Description Protocol
2023	              (SDP)", RFC 5576, DOI 10.17487/RFC5576, June 2009,
2024	              <https://www.rfc-editor.org/info/rfc5576>.

2026	   [RFC6443]  Rosen, B., Schulzrinne, H., Polk, J., and A. Newton,
2027	              "Framework for Emergency Calling Using Internet
2028	              Multimedia", RFC 6443, DOI 10.17487/RFC6443, December
2029	              2011, <https://www.rfc-editor.org/info/rfc6443>.

2031	   [RFC6881]  Rosen, B. and J. Polk, "Best Current Practice for
2032	              Communications Services in Support of Emergency Calling",
2033	              BCP 181, RFC 6881, DOI 10.17487/RFC6881, March 2013,
2034	              <https://www.rfc-editor.org/info/rfc6881>.

2036	   [RFC7667]  Westerlund, M. and S. Wenger, "RTP Topologies", RFC 7667,
2037	              DOI 10.17487/RFC7667, November 2015,
2038	              <https://www.rfc-editor.org/info/rfc7667>.

2040	   [T140]     ITU-T, "Recommendation ITU-T T.140 (02/1998), Protocol for
2041	              multimedia application text conversation", February 1998,
2042	              <https://www.itu.int/rec/T-REC-T.140-199802-I/en>.

2044	   [T140ad1]  ITU-T, "Recommendation ITU-T T.140 Addendum 1 (02/2000),
2045	              Protocol for multimedia application text conversation",
2046	              February 2000,
2047	              <https://www.itu.int/rec/T-REC-T.140-200002-I!Add1/en>.

2049	   [TS103479] ETSI, "TS 103 479. Emergency communications (EMTEL); Core
2050	              elements for network independent access to emergency
2051	              services", December 2019, <https://www.etsi.org/deliver/
2052	              etsi_ts/103400_103499/103479/01.01.01_60/
2053	              ts_103479v010101p.pdf>.

2055	   [TS22173]  3GPP, "IP Multimedia Core Network Subsystem (IMS)
2056	              Multimedia Telephony Service and supplementary services;
2057	              Stage 1", 3GPP TS 22.173 17.1.0, 20 December 2019,
2058	              <http://www.3gpp.org/ftp/Specs/html-info/22173.htm>.

2060	   [TS24147]  3GPP, "Conferencing using the IP Multimedia (IM) Core
2061	              Network (CN) subsystem; Stage 3", 3GPP TS 24.147 16.0.0,
2062	              19 December 2019,
2063	              <http://www.3gpp.org/ftp/Specs/html-info/24147.htm>.

2065	Author's Address

2067	   Gunnar Hellstrom
2068	   Gunnar Hellstrom Accessible Communication
2069	   Esplanaden 30
2070	   SE-136 70 Vendelso
2071	   Sweden

2073	   Phone: +46 708 204 288
2074	   Email: gunnar.hellstrom@ghaccess.se