idnits 2.17.1 

draft-hellstrom-avtcore-multi-party-rtt-solutions-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([RFC4103]), which it
     shouldn't.  Please replace those with straight textual mentions of the
     documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (15 June 2020) is 1409 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'RFC3264' is defined on line 1782, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-20) exists of
     draft-ietf-avtcore-multi-party-rtt-mix-06


     Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Engineering Task Force                             G. Hellstrom
3	Internet-Draft                 Gunnar Hellstrom Accessible Communication
4	Intended status: Informational                              15 June 2020
5	Expires: 17 December 2020

7	           Real-time text solutions for multi-party sessions
8	          draft-hellstrom-avtcore-multi-party-rtt-solutions-01

10	Abstract

12	   This document specifies methods for Real-Time Text (RTT) media
13	   handling in multi-party calls.  The main transport is to carry Real-
14	   Time text by the RTP protocol in a time-sampled mode according to RFC
15	   4103 [RFC4103].  The mechanisms enable the receiving application to
16	   present the received real-time text media separated per source, in
17	   different ways according to user preferences.  Some presentation
18	   related features are also described explaining suitable variations of
19	   transmission and presentation of text.

21	   Call control features are described for the SIP environment.  A
22	   number of alternative methods for providing the multi-party
23	   negotiation, transmission and presentation are discussed and a
24	   recommendation for the main ones is provided.  The main solution for
25	   SIP based centralized multi-party handling of real-time text is
26	   achieved through a media control unit coordinating multiple RTP text
27	   streams into one RTP stream.

29	   Alternative methods using a single RTP stream and source
30	   identification inline in the text stream are also described, one of
31	   them being provided as a lower functionality fallback method for
32	   endpoints with no multi-party awareness for RTT.

34	   Bridging methods where the text stream is carried without the
35	   contents being dealt with in detail by the bridge are also discussed.

37	   Brief information is also provided for multi-party RTT in the WebRTC
38	   environment.

40	   The intention is to provide background for decisions, specification
41	   and implementation of selected methods.

43	Status of This Memo

45	   This Internet-Draft is submitted in full conformance with the
46	   provisions of BCP 78 and BCP 79.

48	   Internet-Drafts are working documents of the Internet Engineering
49	   Task Force (IETF).  Note that other groups may also distribute
50	   working documents as Internet-Drafts.  The list of current Internet-
51	   Drafts is at https://datatracker.ietf.org/drafts/current/.

53	   Internet-Drafts are draft documents valid for a maximum of six months
54	   and may be updated, replaced, or obsoleted by other documents at any
55	   time.  It is inappropriate to use Internet-Drafts as reference
56	   material or to cite them other than as "work in progress."

58	   This Internet-Draft will expire on 17 December 2020.

60	Copyright Notice

62	   Copyright (c) 2020 IETF Trust and the persons identified as the
63	   document authors.  All rights reserved.

65	   This document is subject to BCP 78 and the IETF Trust's Legal
66	   Provisions Relating to IETF Documents (https://trustee.ietf.org/
67	   license-info) in effect on the date of publication of this document.
68	   Please review these documents carefully, as they describe your rights
69	   and restrictions with respect to this document.  Code Components
70	   extracted from this document must include Simplified BSD License text
71	   as described in Section 4.e of the Trust Legal Provisions and are
72	   provided without warranty as described in the Simplified BSD License.

74	Table of Contents

76	   1.  Introduction
77	     1.1.  Requirements Language
78	   2.  Centralized conference model
79	   3.  Requirements on multi-party RTT
80	   4.  RTP based solutions
81	     4.1.  Coordination of text RTP streams
82	       4.1.1.  RTP-based solutions with a central mixer
83	         4.1.1.1.  RTP Mixer using default RFC 4103 methods
84	         4.1.1.2.  RTP Mixer using the default method but decreased
85	                 transmission interval
86	         4.1.1.3.  RTP Mixer with frequent transmission and indicating
87	                 sources in CSRC-list
88	         4.1.1.4.  RTP Mixer using timestamp to identify
89	                 redundancy
90	         4.1.1.5.  RTP Mixer with multiple primary data in each packet
91	                 and individual sequence numbers
92	         4.1.1.6.  RTP Mixer with multiple primary data in each
93	                 packet
94	         4.1.1.7.  RTP Mixer with RFC 5109 FEC and RFC 2198 redundancy
95	                 in the packets
97	         4.1.1.8.  RTP Mixer with RFC 5109 FEC and RFC 2198 redundancy
98	                 and separate sequence number in the packets
99	         4.1.1.9.  RTP Mixer indicating participants by a control code
100	                 in the stream
101	         4.1.1.10. Mixing for multi-party unaware user agents
102	       4.1.2.  RTP-based bridging with minor RTT media contents
103	               reformatting by the bridge
104	         4.1.2.1.  RTP Translator sending one RTT stream per
105	                 participant
106	         4.1.2.2.  Distributing packets in an end-to-end encryption
107	                 structure
108	         4.1.2.3.  Mesh of RTP endpoints
109	         4.1.2.4.  Multiple RTP sessions, one for each
110	                 participant
111	   5.  Preferred RTP-based multi-party RTT transport method
112	   6.  Session control of RTP-based multi-party RTT sessions
113	     6.1.  Implicit RTT multi-party capability indication
114	     6.2.  RTT multi-party capability declared by SIP media-tags
115	     6.3.  SDP media attribute for RTT multi-party capability
116	           indication
117	     6.4.  Simplified SDP media attribute for RTT multi-party
118	           capability indication
119	     6.5.  SDP format parameter for RTT multi-party capability
120	           indication
121	     6.6.  A text media subtype for support of multi-party rtt
122	     6.7.  Preferred capability declaration method for RTP-based
123	           transport.
124	     6.8.  Identification of the source of text for RTP-based
125	           solutions
126	   7.  RTT bridging in WebRTC
127	     7.1.  RTT bridging in WebRTC with one data channel per
128	           source
129	     7.2.  RTT bridging in WebRTC with one common data channel
130	     7.3.  Preferred rtt multi-party method for WebRTC
131	   8.  Presentation of multi-party text
132	     8.1.  Associating identities with text streams
133	     8.2.  Presentation details for multi-party aware endpoints.
134	       8.2.1.  Bubble style presentation
135	       8.2.2.  Other presentation styles
136	   9.  Presentation details for multi-party unaware endpoints.
137	   10. Security Considerations
138	   11. IANA Considerations
139	   12. Congestion considerations
140	   13. Acknowledgements
141	   14. Change history
142	     14.1.  Changes to
143	            draft-hellstrom-avtcore-multi-party-rtt-solutions-01
145	     14.2.  Changes from draft-hellstrom-mmusic-multi-party-rtt-02 to
146	            draft-hellstrom-avtcore-multi-party-rtt-solutions-00
147	     14.3.  Changes from version
148	            draft-hellstrom-mmusic-multi-party-rtt-01 to -02
149	   15. References
150	     15.1.  Normative References
151	     15.2.  Informative References
152	   Author's Address

154	1.  Introduction

156	   Real-time text (RTT) is a medium in real-time conversational
157	   sessions.  Text entered by participants in a session is transmitted
158	   in a time-sampled fashion, so that no specific user action is needed
159	   to cause transmission.  This gives a direct flow of text at the rate
160	   it is created, which is suitable in a real-time conversational
161	   setting.  The real-time text medium can be combined with other media
162	   in multimedia sessions.

164	   Media from a number of multimedia session participants can be
165	   combined in a multi-party session.  The present document specifies
166	   how the real-time text streams can be handled in multi-party
167	   sessions.  Recommendations are provided for preferred methods.

169	   The description is mainly focused on the transport level, but also
170	   describes a few session and presentation level aspects.

172	   Transport of real-time text is specified in RFC 4103 [RFC4103], "RTP
173	   Payload for Text Conversation".  It uses RTP, RFC 3550 [RFC3550], for
174	   transport.  Robustness against network transmission problems is
175	   normally achieved through redundant transmission based on the
176	   principle from RFC 2198 [RFC2198], with one primary and two redundant
177	   transmissions of each text element.  Primary and redundant
178	   transmissions are combined in packets and described by a redundancy
179	   header.  This transport is usually used in the Session Initiation
180	   Protocol (SIP), RFC 3261 [RFC3261], environment.
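
   The redundancy principle described above can be illustrated with a
   minimal sketch.  It is not taken from RFC 4103; the function name
   redundant_packets and the tuple layout are assumptions made for
   illustration only.  Each text chunk is sent once as primary data and
   then repeated in the following two packets as redundant data.

```python
from collections import deque


def redundant_packets(chunks, generations=2):
    """Yield (primary, redundant) pairs for a sequence of text chunks.

    redundant[0] is the text sent as primary in the previous packet,
    redundant[1] the text sent as primary two packets ago.
    Trailing packets with empty primary data flush the redundancy.
    """
    # Most recent earlier chunk first; empty strings mean "nothing to repeat".
    history = deque([""] * generations, maxlen=generations)
    for chunk in list(chunks) + [""] * generations:
        yield chunk, list(history)
        # appendleft on a full deque drops the oldest generation.
        history.appendleft(chunk)


packets = list(redundant_packets(["a", "b"]))
for primary, redundant in packets:
    print(repr(primary), redundant)
```

   With the two chunks in the example, four packets are produced, and
   each chunk appears in three consecutive packets.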

182	   A very brief overview of functions for real-time text handling in
183	   multi-party sessions is described in RFC 4597 [RFC4597] Conferencing
184	   Scenarios, sections 4.8 and 4.10.  The present specification builds
185	   on that description and indicates which protocol mechanisms should be
186	   used to implement multi-party handling of real-time text.

188	   Real-time text can also be transported in the WebRTC environment, by
189	   using WebRTC data channels according to
190	   [I-D.ietf-mmusic-t140-usage-data-channel].  Multi-party aspects for
191	   WebRTC solutions are briefly covered.

193	1.1.  Requirements Language

195	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
196	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
197	   document are to be interpreted as described in RFC 2119 [RFC2119].

199	2.  Centralized conference model

201	   In the centralized conference model for SIP, introduced in RFC 4353
202	   [RFC4353] "A Framework for Conferencing with the Session Initiation
203	   Protocol (SIP)", one function co-ordinates the communication with
204	   participants in the multi-party session.  This function also controls
205	   media mixer functions for the media appearing in the session.  The
206	   central function is common for control of all media, while the media
207	   mixers may work differently for each media.

209	   The central function is called the Focus UA.  Many variants exist for
210	   setting up sessions including the multipoint control centre.  It is
211	   not within scope of this description to describe these, but rather
212	   the media specific handling in the mixer required to handle multi-
213	   party calls with RTT.

215	   The main principle for handling real-time text media in a centralized
216	   conference is that one RTP session for real-time text is established
217	   including the multipoint media control centre and the participating
218	   endpoints which are going to have real-time text exchange with the
219	   others.

221	   The different possible mechanisms for mixing and transporting RTT
222	   differ in the way they multiplex the text streams and how they
223	   identify the sources of the streams.  RFC 7667 [RFC7667] describes a
224	   number of possible use cases for RTP.  This specification refers to
225	   different sections of RFC 7667 for further reading of the situations
226	   caused by the different possible design choices.

228	   The recommended method for using RTT in a centralized conference
229	   model is specified in [I-D.ietf-avtcore-multi-party-rtt-mix] based on
230	   the recommendations in the present document.

232	   Real-time text can also be transported in the WebRTC environment, by
233	   using WebRTC data channels according to
234	   [I-D.ietf-mmusic-t140-usage-data-channel].  Ways to handle multi-
235	   party calls in that environment are also specified.

237	3.  Requirements on multi-party RTT

239	   The following requirements are placed on multi-party RTT:

241	      A solution shall be applicable to IMS (3GPP TS 22.173) [TS22173],
242	      SIP based VoIP and Next Generation Emergency Services (NENA i3
243	      [NENAi3], ETSI TS 103 479 [TS103479], RFC 6443 [RFC6443]).

245	      The transmission interval for text must not be longer than 500
246	      milliseconds when there is anything available to send.  Ref ITU-T
247	      T.140 [T140].

249	      If text loss is detected or suspected, a missing text marker shall
250	      be inserted in the text stream.  Ref ITU-T T.140 Amendment 1
251	      [T140ad1].  ETSI EN 301 549 [EN301549]

253	      The display of text from the members of the conversation shall be
254	      arranged so that the text from each participant is clearly
255	      readable, and its source and the relative timing of entered text
256	      is visualized in the display.  Mechanisms for looking back in the
257	      contents from the current session should be provided.  The text
258	      should be displayed as soon as it is received.  Ref ITU-T T.140
259	      [T140]

261	      Bridges must be multimedia capable (voice, video, text).  Ref NENA
262	      i3 STA-010.2.  [NENAi3]

264	      R7: It MUST be possible to use real-time text in conferences both
265	      as a medium of discussion between individual participants (for
266	      example, for sidebar discussions in real-time text while listening
267	      to the main conference audio) and for central support of the
268	      conference with real-time text interpretation of speech.  Ref RFC
269	      5194 [RFC5194].

271	      It should be possible to protect RTT contents with usual means for
272	      privacy and integrity.  Ref RFC 6881 section 16 [RFC6881].

274	      Conferencing procedures are documented in RFC 4579 [RFC4579].  Ref
275	      NENA i3 STA-010.2 [NENAi3].

277	      Conferencing applies to any kind of media stream by which users
278	      may want to communicate.  Ref 3GPP TS 24.147 [TS24147]

280	      The framework for SIP conferences is specified in RFC 4353
281	      [RFC4353].  Ref 3GPP TS 24.147 [TS24147]

283	      The mixer performance requirements can be expressed in two
284	      numbers.

286	      1) The number of participants who can transmit simultaneously with
287	      the text not being delayed in the mixer more than 500
288	      milliseconds.  This requirement depends on the application.
289	      Five simultaneous transmitting participants is a sufficiently high
290	      number for most situations.

292	      2) The switching time from when the mixer is transmitting text
293	      from one participant and text arrives from another participant,
294	      until the mixer sends the text from the second participant.  This
295	      time should not be more than 500 milliseconds when there are up to
296	      five participants sending text simultaneously.

298	4.  RTP based solutions

300	4.1.  Coordination of text RTP streams

302	   Coordinating and sending text RTP streams in the multi-party session
303	   can be done in a number of ways.  The most suitable methods are
304	   specified here with pros and cons.

306	   A receiving and presenting endpoint MUST separate text from the
307	   different sources and identify and display them accordingly.

309	4.1.1.  RTP-based solutions with a central mixer

311	   A set of solutions can be based on the central RTP mixer.  They are
312	   described here and a preferred method selected.

314	4.1.1.1.  RTP Mixer using default RFC 4103 methods

316	   Without any extra specifications, a mixer would transmit at 300
317	   millisecond intervals, and use RFC 4103 [RFC4103] with the default
318	   redundancy of one original and two redundant transmissions.  The
319	   source of the text would be indicated by a single member in the CSRC
320	   list.  Text from different sources cannot be transmitted in the same
321	   packet.  Therefore, from the time when the mixer sent one piece of
322	   new text from one source, it will need to transmit that text again
323	   twice as redundant data, before it can send text from another source.
324	   The switching time will thus be 900 milliseconds.  The mixer cannot
325	   even send text from two simultaneous sources without introducing more
326	   than 500 milliseconds delay.  This is clearly insufficient.
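
   The 900 millisecond figure follows from one primary and two redundant
   transmissions at the transmission interval.  The arithmetic can be
   sketched as below; the function names are hypothetical and the model
   assumes that the mixer must finish all redundant transmissions for
   one source before switching to the next.

```python
def switching_time_ms(interval_ms, redundant_generations=2):
    """Time a mixer is occupied by one source before it can switch.

    One primary transmission plus the redundant repetitions, each
    separated by the transmission interval.
    """
    return interval_ms * (1 + redundant_generations)


def meets_switching_requirement(interval_ms, limit_ms=500):
    """Compare against the 500 ms switching requirement in Section 3."""
    return switching_time_ms(interval_ms) <= limit_ms


print(switching_time_ms(300))            # default RFC 4103 interval
print(meets_switching_requirement(300))
```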

328	   Pros:

330	   Only a capability negotiation method is needed.  No other updates of
331	   standards are needed, just a general remark that traditional RTP-
332	   mixing is used.

334	   Cons:

336	   Clearly insufficient mixer switching performance.

338	   Somewhat complex handling of transmission when there is new text
339	   available from more than one source.  The mixer needs to send two
340	   more packets with redundant text from the current source before
341	   starting to send anything from the other source.

343	4.1.1.2.  RTP Mixer using the default method but decreased transmission
344	          interval

346	   This method makes use of the default RTP-mixing method briefly
347	   described in Section 4.1.1.1.  The only difference is that the
348	   transmission interval is decreased to 100 milliseconds when there is
349	   text from more than one source available for transmission.  This
350	   increases the switching performance to three source switches per
351	   second.  The delay of new text from a participant can be one second
352	   if five users send new text simultaneously.  Text from two
353	   simultaneous users would not be delayed more than 400 ms.

355	   Pros:

357	   Minor influence on standards

359	   Can be SDP-declared as "text/red" with a multi-party attribute for
360	   capability negotiation.

362	   Cons:

364	   Too long delay of new text from more than two simultaneous sources.

366	   Slightly higher risk for loss of text at bursty packet loss than for
367	   the recommended transmission interval (300 ms) for RFC 4103.

369	   When complete loss of packets occurs (beyond recovery), it is not
370	   possible to deduce from which source text was lost.

372	   Somewhat complex handling of transmission when there is new text
373	   available from more than one source.  The mixer needs to send two
374	   more packets with redundant text from the current source before
375	   starting to send anything from the other source.

377	4.1.1.3.  RTP Mixer with frequent transmission and indicating sources in
378	          CSRC-list

380	   An RTP media mixer combines text from participants into one RTP
381	   stream, thus all using the same destination address/port combination,
382	   the same RTP SSRC, and one sequence number series as described in
383	   Sections 7.1 and 7.3 of RTP, RFC 3550 [RFC3550], about the Mixer
384	   function.  This method is also briefly described in RFC 7667, section
385	   3.6.1 Media mixing mixer [RFC7667].

387	   The sources of the text in each RTP packet are identified by the CSRC
388	   list in the RTP packets, containing the SSRC of the initial sources
389	   of text.  The order of the CSRC parameters is with the SSRC of the
390	   source of the primary text first, followed by the SSRC of the first
391	   level redundancy, and then the second level redundancy.

393	   The transmission interval should be 100 milliseconds when there is
394	   text to transmit from more than one source, and otherwise 300 ms.

396	   The identification of the sources is made through the CSRC fields and
397	   can be made more readable at the receiver through the RTCP SDES CNAME
398	   and NAME packets as described in RTP[RFC3550].

400	   Information provided through the notification according to RFC 4575
401	   [RFC4575] when the participant joined the conference also provides
402	   suitable information and a reference to the SSRC.

404	   A receiving endpoint is supposed to separate text items from the
405	   different sources and identify and display them accordingly.

407	   The ordered CSRC lists in the RFC 4103 [RFC4103] packets make it
408	   possible to recover from loss of one and two packets in sequence and
409	   assign the recovered text to the right source.  For more loss, a
410	   marker for possible loss should be inserted or presented.
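
   The recovery step can be sketched as follows.  This is a simplified
   illustration, not a definitive receiver: the MixedPacket class and
   the receive function are hypothetical, sequence number wrap-around is
   ignored, and the mixer is assumed to always fill three CSRC entries.
   U+FFFD is used as the missing text marker.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

MISSING_TEXT = "\ufffd"  # missing text marker


@dataclass
class MixedPacket:
    seq: int              # sequence number in the mixer's single stream
    csrc: List[int]       # [primary source, 1st-level source, 2nd-level source]
    primary: str          # new text
    redundant: List[str]  # [text from packet seq-1, text from packet seq-2]


def receive(packets: List[MixedPacket],
            generations: int = 2) -> List[Tuple[Optional[int], str]]:
    """Return (source, text) pairs in presentation order."""
    shown = []
    expected = None
    for pkt in packets:
        if expected is not None and pkt.seq < expected:
            continue  # late or duplicate packet, already handled
        if expected is not None and pkt.seq > expected:
            gap = pkt.seq - expected
            if gap > generations:
                # More loss than redundancy can cover: source unknown.
                shown.append((None, MISSING_TEXT))
            # Recover the oldest lost text first.
            for level in range(min(gap, generations), 0, -1):
                text = pkt.redundant[level - 1]
                if text:
                    shown.append((pkt.csrc[level], text))
        if pkt.primary:
            shown.append((pkt.csrc[0], pkt.primary))
        expected = pkt.seq + 1
    return shown


ALICE, BOB = 0x1111, 0x2222
p1 = MixedPacket(1, [ALICE, 0, 0], "Hi", ["", ""])
p2 = MixedPacket(2, [BOB, ALICE, 0], "Yo", ["Hi", ""])
p3 = MixedPacket(3, [ALICE, BOB, ALICE], "!", ["Yo", "Hi"])
print(receive([p1, p3]))  # packet 2 lost, Bob's text recovered from p3
```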

412	   The conference server needs to have authority to decrypt the payload
413	   in the received RTP packets in order to be able to recover text from
414	   redundant data or insert the missing text marker in the stream, and
415	   repack the text in new packets.

417	   Even if the format is very similar to "text/red" of RFC 4103, it has
418	   been indicated that it needs to be declared as a new media subtype,
419	   e.g. "text/rex".

421	   Pros:

423	   This method has low overhead and less complexity than the methods in
424	   Section 4.1.1.1, Section 4.1.1.2, Section 4.1.1.4 and
425	   Section 4.1.1.6.

427	   When loss of packets occurs, it is possible to recover text from
428	   redundancy at loss of up to the number of redundancy levels carried
429	   in the RFC 4103 [RFC4103] stream (normally primary and two redundant
430	   levels).

432	   This method can be implemented with most RTP implementations.

434	   The source switching performance is sufficient for well-behaving
435	   conference participants.  There can be switching between five sources
436	   per second with an introduced delay of maximum 500 ms.  With just two
437	   parties typing simultaneously, the delay will be a maximum of 100 ms.

439	   Cons:

441	   When more consecutive packets are lost than the number of generations
442	   of redundant data, it is not possible to deduce the sources of the
443	   totally lost data.

445	   Slightly higher risk for loss of text at bursty packet loss than for
446	   the recommended transmission interval for RFC 4103.

448	   Requires a different sub media format, e.g. "text/rex".

450	   The conference server needs to be allowed to decrypt/encrypt the
451	   packet payload.  This is however normal for media mixers for other
452	   media.

454	4.1.1.4.  RTP Mixer using timestamp to identify redundancy

456	   This method has text only from one source per packet, as the original
457	   RFC 4103 [RFC4103] specifies.  Packets with text from different
458	   sources are instead allowed to be merged.  The recovery procedure in
459	   the receiver will use the RTP timestamp and timestamp offsets in the
460	   redundancy headers to evaluate if a piece of redundant data should be
461	   recovered or not in case of packet loss.

463	   In this method, the transmission interval is 100 milliseconds when
464	   text from more than one source is available for transmission.

466	   Pros:

468	   The format of each packet is equal to what is specified in RFC 4103
469	   [RFC4103].

471	   The source switching performance is sufficient.  Text from five
472	   participants can be transmitted simultaneously with 500 milliseconds
473	   interval per source.

475	   New text from five simultaneous sources can be transmitted within 500
476	   milliseconds.  This is sufficient.

478	   Cons:

480	   The recovery time in case of packet loss is long.  With five
481	   participants, it will be 1.5 seconds.

483	   The recovery procedure is complex and very different from what is
484	   described in RFC 4103 [RFC4103].

486	   It is not certain that this change can be regarded as an update to
487	   RFC 4103.  It may need a new media subtype.

489	4.1.1.5.  RTP Mixer with multiple primary data in each packet and
490	          individual sequence numbers

492	   This method allows primary as well as redundant text from more than
493	   one source per packet.  The packet payload contains an ordered set of
494	   redundant and primary data with the same number of generations of
495	   redundancy as once agreed in the SDP negotiation.  The data header
496	   reflects these parts of the payload.  The CSRC list contains one CSRC
497	   member per source in the payload and in the same order.  An
498	   individual sequence number per source is included in the data header
499	   replacing the t140 payload type number that is instead assumed to be
500	   constant in this format.  This allows an individual extra sequence
501	   number per source with maximum value 127, suitable for checking for
502	   which source loss of text appeared when recovery was not possible.
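
   A per-source sequence number limited to 127 wraps around, so loss
   checks need modulo arithmetic.  The sketch below is an assumption
   about how a receiver might use it: it presumes that the number
   increases by one for each new text block from a source, which this
   document does not state explicitly.  The names lost_blocks and
   SourceLossTracker are hypothetical.

```python
SEQ_MODULO = 128  # 7-bit per-source sequence number, values 0..127


def lost_blocks(last_seq, new_seq):
    """Number of text blocks missing between two per-source numbers."""
    return (new_seq - last_seq - 1) % SEQ_MODULO


class SourceLossTracker:
    """Tracks the last per-source sequence number seen for each CSRC."""

    def __init__(self):
        self._last = {}

    def observe(self, csrc, source_seq):
        """Return how many blocks from this source were lost before this one."""
        previous = self._last.get(csrc)
        if previous == source_seq:
            return 0  # redundant repeat of a block already seen
        self._last[csrc] = source_seq
        return 0 if previous is None else lost_blocks(previous, source_seq)


tracker = SourceLossTracker()
tracker.observe(0xA, 126)
tracker.observe(0xA, 127)
print(tracker.observe(0xA, 1))  # block 0 from source 0xA was lost
```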

504	   The data header would contain the following fields:
505	     0                   1                    2                   3
506	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
507	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
508	   |F| Source-seq  |  timestamp offset         |   block length    |
509	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
510	   Where "Source-seq" is the sequence number per source.
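
   The header above can be packed and parsed as one 32-bit word in
   network byte order.  The sketch assumes the field widths of the
   RFC 2198 redundancy header that it is derived from: 1 bit F, 7 bits
   Source-seq (in place of the payload type), 14 bits timestamp offset
   and 10 bits block length.  The function names are hypothetical.

```python
import struct


def pack_data_header(f, source_seq, ts_offset, block_len):
    """Pack F, Source-seq, timestamp offset and block length into 4 bytes."""
    if not 0 <= source_seq < (1 << 7):
        raise ValueError("Source-seq must fit in 7 bits")
    if not 0 <= ts_offset < (1 << 14):
        raise ValueError("timestamp offset must fit in 14 bits")
    if not 0 <= block_len < (1 << 10):
        raise ValueError("block length must fit in 10 bits")
    word = (int(bool(f)) << 31) | (source_seq << 24) | (ts_offset << 10) | block_len
    return struct.pack("!I", word)


def unpack_data_header(data):
    """Return (f, source_seq, ts_offset, block_len) from the first 4 bytes."""
    (word,) = struct.unpack("!I", data[:4])
    return (word >> 31) & 0x1, (word >> 24) & 0x7F, (word >> 10) & 0x3FFF, word & 0x3FF


header = pack_data_header(1, 5, 300, 12)
print(header.hex())
print(unpack_data_header(header))
```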

512	   The maximum number of members in the CSRC-list is 16, and that is
513	   therefore the maximum number of sources that can be represented in
514	   each packet provided that all data can be fitted into the size
515	   allowable in one packet.

517	   Transmission is done as soon as there is new text available, but not
518	   with shorter interval than 150 ms and not longer than 300 ms while
519	   there is anything to send.
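
   The pacing rule can be sketched as a small scheduling function.  It
   is an illustration under the assumption that something (new or
   redundant text) is pending whenever it is called; the function name
   and the millisecond clock are hypothetical.

```python
def next_send_time_ms(last_sent_ms, new_text_at_ms=None,
                      min_gap_ms=150, max_gap_ms=300):
    """Return when the next packet should be sent.

    New text is sent as soon as it is available, but never sooner than
    min_gap_ms after the previous packet.  While anything is pending,
    a packet is sent no later than max_gap_ms after the previous one.
    """
    latest = last_sent_ms + max_gap_ms
    if new_text_at_ms is None:
        return latest  # only redundant text pending
    earliest = last_sent_ms + min_gap_ms
    return min(max(new_text_at_ms, earliest), latest)


print(next_send_time_ms(1000, 1050))  # held back to respect the 150 ms minimum
print(next_send_time_ms(1000, 1200))  # sent immediately
print(next_send_time_ms(1000))        # redundancy only, sent at the 300 ms limit
```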

521	   A new media subtype is needed, e.g. "text/rex".

523	   This is an SDP offer example for both traditional "text/red"
524	   and multi-party "text/rex" format:

526	         m=text 11000 RTP/AVP 101 100 98
527	         a=rtpmap:98 t140/1000
528	         a=rtpmap:100 red/1000
529	         a=rtpmap:101 rex/1000
530	         a=fmtp:100 98/98/98
531	         a=fmtp:101 98/98/98
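
   An answering endpoint could select a format from such an offer
   roughly as sketched below.  This is a simplified illustration, not a
   complete offer/answer implementation: the function name and the
   "supported" set are hypothetical, only the m=text and a=rtpmap lines
   are read, and only the first supported format in the offerer's
   preference order is returned.

```python
OFFER = """\
m=text 11000 RTP/AVP 101 100 98
a=rtpmap:98 t140/1000
a=rtpmap:100 red/1000
a=rtpmap:101 rex/1000
a=fmtp:100 98/98/98
a=fmtp:101 98/98/98
"""


def select_text_format(sdp_offer, supported):
    """Return (payload_type, encoding_name) of the first supported format."""
    preference = []  # payload types in the order given on the m= line
    names = {}       # payload type -> encoding name from a=rtpmap
    for line in sdp_offer.splitlines():
        line = line.strip()
        if line.startswith("m=text "):
            preference = line.split()[3:]
        elif line.startswith("a=rtpmap:"):
            pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
            names[pt] = encoding.split("/")[0].lower()
    for pt in preference:
        if names.get(pt) in supported:
            return pt, names[pt]
    return None


print(select_text_format(OFFER, {"rex", "red", "t140"}))  # multi-party aware
print(select_text_format(OFFER, {"red", "t140"}))         # traditional endpoint
```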

533	   Pros:

535	   The source switching performance is good.  Text from 16 participants
536	   can be transmitted simultaneously.

538	   New text from 16 simultaneous sources can be transmitted within 300
539	   milliseconds.  This is good performance.

541	   When more consecutive packets are lost than the number of generations
542	   of redundant data, it is still possible to deduce the sources of the
543	   totally lost data when the next text from these sources arrives.

545	   Cons:

547	   The format of each packet is different from what is specified in RFC
548	   4103 [RFC4103].

550	   A new media subtype is needed.

552	   The recovery procedure is a bit complex.

554	4.1.1.6.  RTP Mixer with multiple primary data in each packet

556	   This method allows primary as well as redundant text from more than
557	   one source per packet.  The packet payload contains an ordered set of
558	   redundant and primary data with the same number of generations of
559	   redundancy as once agreed in the SDP negotiation.  The data header
560	   reflects these parts of the payload.  The CSRC list contains one CSRC
561	   member per source in the payload and in the same order.
562	   The maximum number of members in the CSRC-list is 16, and that is
563	   therefore the maximum number of sources that can be represented in
564	   each packet provided that all data can be fitted into the size
565	   allowable in one packet.

567	   Transmission is done as soon as there is new text available, but not
568	   with shorter interval than 150 ms and not longer than 300 ms while
569	   there is anything to send.

571	   A new media subtype is needed, e.g. "text/rex".

573	   SDP would be the same as in Section 4.1.1.5.

575	   Pros:

577	   The source switching performance is good.  Text from 16 participants
578	   can be transmitted simultaneously.

580	   New text from 16 simultaneous sources can be transmitted within 150
581	   milliseconds.  This is good performance.

583	   Cons:

585	   The format of each packet is different from what is specified in RFC
586	   4103 [RFC4103].

588	   A new media subtype is needed.

590	   The recovery procedure is a bit complex [RFC4103].

592	   When more consecutive packets are lost than the number of generations
593	   of redundant data, it is not possible to deduce the sources of the
594	   totally lost data.

596	4.1.1.7.  RTP Mixer with RFC 5109 FEC and RFC 2198 redundancy in the
597	          packets

599	   This method allows primary data from one source and redundant text
600	   from other sources in each packet.  The packet payload contains
601	   primary data in "text/t140" format, and redundant data in RFC 5109
602	   FEC [RFC5109] format called "text/ulpfec".  That means that the
603	   redundant data contains the sequence number and the CSRC and other
604	   characteristics from the RTP header when the data was sent as
605	   primary.  The redundancy can be sent a selected number of packets
606	   after it was sent as primary, in order to improve the protection
607	   against bursty packet loss.  The redundancy level is recommended to
608	   be the same as in original RFC 4103.

610	   RFC 4103 says that the protection against loss can be made by other
611	   methods than plain redundancy, so this method is in line with that
612	   statement.

614	   Transmission is done as soon as there is new text available, but not
615	   at intervals shorter than 100 ms or longer than 300 ms while there
616	   is anything to send (new or redundant text).

618	   When more consecutive packets are lost than the number of
619	   generations of redundant data, it is not possible to deduce the
620	   sources of the totally lost data.

622	   The SDP can indicate the format as "text/red" with "text/ulpfec" as
623	   redundant data in the following way, with traditional RFC 4103
624	   "text/red" with "text/t140" as redundant data as a fallback:

626	   m=text 49170 RTP/AVP 98 101 100 102
627	   a=rtpmap:98 red/1000
628	   a=fmtp:98 100/102/102
629	   a=rtpmap:102 ulpfec/1000
630	   a=rtpmap:100 t140/1000
631	   a=rtpmap:101 red/1000
632	   a=fmtp:101 100/100/100
633	   a=fmtp:100 cps=200
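
	   The declaration above can be read mechanically, as in the following
	   Python sketch.  It assumes that the first entry of a "red" fmtp list
	   names the primary encoding and the remaining entries name the
	   redundant generations.  The function name red_layout and the
	   constant EXAMPLE_SDP are hypothetical.

```python
# The SDP example from this section, repeated so the sketch is runnable.
EXAMPLE_SDP = """\
m=text 49170 RTP/AVP 98 101 100 102
a=rtpmap:98 red/1000
a=fmtp:98 100/102/102
a=rtpmap:102 ulpfec/1000
a=rtpmap:100 t140/1000
a=rtpmap:101 red/1000
a=fmtp:101 100/100/100
a=fmtp:100 cps=200
"""


def red_layout(sdp):
    """Map each "red" payload type to the encodings of its generations."""
    rtpmap = {}  # payload type -> encoding name
    fmtp = {}    # payload type -> raw format parameters
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("a=rtpmap:"):
            pt, encoding = line[len("a=rtpmap:"):].split(None, 1)
            rtpmap[pt] = encoding.split("/")[0]
        elif line.startswith("a=fmtp:"):
            pt, params = line[len("a=fmtp:"):].split(None, 1)
            fmtp[pt] = params
    layout = {}
    for pt, name in rtpmap.items():
        if name == "red" and pt in fmtp:
            # Entries are payload types separated by "/".
            layout[pt] = [rtpmap.get(p, "?") for p in fmtp[pt].split("/")]
    return layout
```

	   With the SDP above, payload type 98 resolves to t140 followed by two
	   ulpfec generations, and the fallback payload type 101 resolves to
	   three t140 generations.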

635	   The "text/ulpfec" format includes an indication of how far back the
636	   redundancy belongs, making it possible to cover bursty packet loss
637	   better than the other formats with short transmission intervals.  For
638	   real-time text, it is recommended to send three packets between the
639	   primary and the redundant transmissions of text.  That makes the
640	   transmission cover between 500 and 1500 ms of bursty packet loss.
641	   The variation is because of the varying packet interval between many
642	   and one simultaneously transmitting source.

644	   The "text/ulpfec" format has a number of parameters.  One is the
645	   length of the data to be protected which in this case must be the
646	   whole t140block.

648	   Pros:

650	   The source switching performance is good.  Text from 5 participants
651	   can be transmitted within 500 ms.

653	   Good recovery from bursty packet loss.

655	   The method is based on existing standards.  No new registrations are
656	   needed.

658	   Cons:

660	   When more consecutive packets are lost than the number of
661	   generations of redundant data, it is not possible to deduce the
662	   sources of the totally lost data.

664	   Even if the switching performance is good, it is not as good as for
665	   the method called "RTP Mixer with multiple primary data in each
666	   packet" (Section 4.1.1.6).  With more than 5 simultaneously sending
667	   sources, there will be a noticeable delay of text of over 500 ms,
668	   with 100 ms added per simultaneous source.  This is however beyond
669	   the requirements and would be a concern only in congestion
670	   situations.

672	   The recovery procedure is a bit complex [RFC5109].

674	   There is more overhead in terms of extra data and extra packets sent
675	   than in the other methods.  With the recommended two redundant
676	   generations of data, each packet will be 36 bytes longer than with
677	   traditional RFC 4103, and at each pause in transmission five extra
678	   packets with only redundant data will be sent compared to two extra
679	   packets for the traditional RFC 4103 case.
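
	   The delay figures quoted for this method follow from simple
	   arithmetic, sketched below in Python.  The sketch assumes that the
	   mixer serves the active sources in turn, one packet per 100 ms,
	   which is the minimum interval stated above.  The function name is
	   hypothetical.

```python
PACKET_INTERVAL_MS = 100  # minimum packet interval stated for this method


def worst_case_delay_ms(simultaneous_sources):
    """Time in ms for the mixer to serve every active source once."""
    if simultaneous_sources < 1:
        raise ValueError("at least one source is required")
    # One packet carries primary text from one source, so a full round
    # over all sources takes one interval per source.
    return simultaneous_sources * PACKET_INTERVAL_MS
```

	   Five sources give 500 ms under this assumption, and each further
	   simultaneous source adds another 100 ms.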

681	4.1.1.8.  RTP Mixer with RFC 5109 FEC and RFC 2198 redundancy and
682	          separate sequence number in the packets

684	   This method allows primary data from one source and redundant text
685	   from other sources in each packet.  The packet payload contains
686	   primary data in a new "text/t140e" format, and redundant data in RFC
687	   5109 FEC [RFC5109] format called "text/ulpfec".  That means that the
688	   redundant data contains the sequence number and the CSRC and other
689	   characteristics from the RTP header when the data was sent as
690	   primary.  The redundancy can be sent a selected number of packets
691	   after the primary transmission, in order to improve the protection
692	   against bursty packet loss.  The redundancy level is recommended to
693	   be the same as in the original RFC 4103.  The "text/t140e" format
694	   contains a source-specific sequence number and the t140block.

696	   RFC 4103 says that the protection against loss can be made by other
697	   methods than plain redundancy, so this method is in line with that
698	   statement.

700	   Transmission is done as soon as there is new text available, but not
701	   at intervals shorter than 100 ms or longer than 300 ms while there
702	   is anything to send (new or redundant text).

704	   When more consecutive packets are lost than the number of
705	   generations of redundant data, it is possible to deduce which
706	   sources lost data when new data arrives from those sources.  This is
707	   done by monitoring the received source-specific sequence numbers
708	   preceding the text.
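
	   A receiver could monitor the source-specific sequence numbers as in
	   the following Python sketch.  The "text/t140e" format is not defined
	   in detail here, so the 16-bit sequence number width, the function
	   name, and the dictionary used to hold state are all assumptions.

```python
SEQ_MODULO = 65536  # assumed 16-bit counter; no width is specified here


def missed_blocks(last_seq, source_id, seq):
    """Record seq for source_id and return how many blocks were missed.

    last_seq is a dict mapping a source identifier to the latest
    sequence number seen from that source (hypothetical state layout).
    """
    previous = last_seq.get(source_id)
    if previous is None:
        last_seq[source_id] = seq
        return 0  # first block from this source; nothing to compare with
    delta = (seq - previous) % SEQ_MODULO
    if delta == 0 or delta > SEQ_MODULO // 2:
        return 0  # duplicate or late block; keep the newer state
    last_seq[source_id] = seq
    return delta - 1
```

	   A non-zero result tells the receiver that a loss marker belongs in
	   the presentation area of that particular source.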

710	   This is an example of how the SDP can indicate the format as
711	   "text/red" with "text/t140e" as primary and "text/ulpfec" as
712	   redundant data, with traditional RFC 4103 "text/red" with
713	   "text/t140" as redundant data as a fallback.

715	   m=text 49170 RTP/AVP 98 101 100 102 103
716	   a=rtpmap:98 red/1000
717	   a=fmtp:98 100/102/102
718	   a=rtpmap:102 ulpfec/1000
719	   a=rtpmap:103 t140/1000
720	   a=rtpmap:100 t140e/1000
721	   a=rtpmap:101 red/1000
722	   a=fmtp:101 103/103/103
723	   a=fmtp:100 cps=200

725	   The "text/ulpfec" format includes an indication of how far back the
726	   redundancy belongs, making it possible to cover bursty packet loss
727	   better than the other formats with short transmission intervals.  For
728	   real-time text, it is recommended to send three packets between the
729	   primary and the redundant transmissions of text.  That makes the
730	   transmission cover between 500 and 1500 ms of bursty packet loss.
731	   The variation is because of the varying packet interval between many
732	   and one simultaneously transmitting source.

734	   The "text/ulpfec" format has a number of parameters.  One is the
735	   length of the data to be protected which in this case must be the
736	   whole t140block.

738	   Pros:

740	   The source switching performance is good.  Text from 5 participants
741	   can be transmitted within 500 ms.

743	   Good recovery from bursty packet loss.

745	   The method is based on an existing standard for FEC.

747	   When more consecutive packets are lost than the number of
748	   generations of redundant data, it is possible to deduce the source
749	   of the lost data when new text arrives from the source.

751	   Cons:

753	   Even if the switching performance is good, it is not as good as for
754	   the method called "RTP Mixer with multiple primary data in each
755	   packet" (Section 4.1.1.6).  With more than 5 simultaneously sending
756	   sources, there will be a noticeable delay of text of over 500 ms,
757	   with 100 ms added per simultaneous source.  This is however beyond
758	   the requirements and would be a concern only in congestion
759	   situations.

761	   The recovery procedure is a bit complex [RFC5109].

763	   There is more overhead in terms of extra data and extra packets sent
764	   than in the other methods.  With the recommended two redundant
765	   generations of data, each packet will be 40 bytes longer than with
766	   traditional RFC 4103, and at each pause in transmission five extra
767	   packets with only redundant data will be sent compared to two extra
768	   packets for the traditional RFC 4103 case.

770	   A new text media subtype "text/t140e" needs to be registered.

772	4.1.1.9.  RTP Mixer indicating participants by a control code in the
773	          stream

775	   Text from all participants except the receiving one is transmitted
776	   from the media mixer in the same RTP session and stream, thus all
777	   using the same destination address/port combination, the same RTP
778	   SSRC, and one sequence number series, as described in Sections 7.1
779	   and 7.3 of RFC 3550 [RFC3550] about the Mixer function.  The sources
780	   of the text in each RTP packet are identified by a newly defined
781	   T.140 control code "c" followed by a unique identification of the
782	   source in UTF-8 string format.

784	   The receiver can use the string for presenting the source of text.
785	   On the RTP level, this method is described in RFC 7667, section
786	   3.6.1 Media mixing mixer [RFC7667].

788	   The inline coding of the source of text is applied in the data stream
789	   itself, and an RTP mixer function is used for coordinating the
790	   sources of text into one RTP stream.

792	   Information uniquely identifying each user in the multi-party session
793	   is placed as the parameter value "n" in the T.140 application
794	   protocol function with the function code "c".  The identifier shall
795	   thus be formatted like this: SOS c n ST, where SOS and ST are coded
796	   as specified in ITU-T T.140 [T140].  The "c" is the letter "c".  The
797	   n parameter value is a string uniquely identifying the source.  This
798	   parameter shall be kept short so that it can be repeated in the
799	   transmission without concerns for network load.
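
	   The framing of the source identifier can be illustrated with a
	   short Python sketch.  It assumes that SOS and ST are the ISO 6429 C1
	   controls U+0098 and U+009C, as used by T.140, and that the stream is
	   UTF-8 encoded.  The function names are hypothetical, and the "c"
	   function itself is only proposed here and is not part of T.140.

```python
SOS = "\u0098"  # Start Of String (assumed code point)
ST = "\u009c"   # String Terminator (assumed code point)


def encode_source_label(source_id):
    """Return the UTF-8 bytes of "SOS c n ST" for the given identifier."""
    if SOS in source_id or ST in source_id:
        raise ValueError("identifier must not contain SOS or ST")
    return (SOS + "c" + source_id + ST).encode("utf-8")


def split_by_source(stream, initial_source=None):
    """Split a received UTF-8 stream into (source, text) pairs."""
    text = stream.decode("utf-8")
    result = []
    current = initial_source
    pos = 0
    while pos < len(text):
        start = text.find(SOS, pos)
        if start == -1:
            result.append((current, text[pos:]))
            break
        if start > pos:
            result.append((current, text[pos:start]))
        end = text.find(ST, start)
        if end == -1:
            break  # incomplete control sequence; more data is needed
        body = text[start + 1:end]
        if body.startswith("c"):
            current = body[1:]  # switch to the newly announced source
        pos = end + 1
    return result
```

	   Text that arrives before any identifier is attributed to
	   initial_source, which is None unless the caller supplies a value.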

801	   A receiving endpoint is supposed to separate text items from the
802	   different sources and identify and display them accordingly.

804	   The conference server needs to be allowed to decrypt/encrypt the
805	   packet payload in order to check the source and repack the text.

807	   Pros:

809	   If packets are lost, it is possible to recover text from redundancy
810	   for loss of up to the number of redundancy levels carried in the RFC
811	   4103 [RFC4103] stream (normally primary and two redundant levels).

814	   This method can be implemented with most RTP implementations.

816	   The method can also be used with other transports than RTP.

818	   Cons:

820	   The method implies a moderate load by the need to insert the source
821	   often in the stream.

823	   If more consecutive packets are lost than the number of generations
824	   of redundant data, it is not possible to deduce the source of the
825	   totally lost data.

827	   The mixer needs to be able to generate unique source identifications
828	   that are suitable as labels for the sources.

830	   Requires an extension on the ITU-T T.140 standard, best made by the
831	   ITU.

833	   There is a risk that the control code indicating the change of source
834	   is lost and the result is false source indication of text.

836	   The conference server needs to be allowed to decrypt/encrypt the
837	   packet payload.

839	4.1.1.10.  Mixing for multi-party unaware user agents

841	   Multi-party real-time text contents can be transmitted to multi-party
842	   unaware user agents if source labelling and formatting of the text is
843	   performed by a mixer.  This method has the limitations that the
844	   layout of the presentation and the format of source identification is
845	   purely controlled by the mixer, and that only one source at a time is
846	   allowed to present in real-time.  Other sources need to be stored
847	   temporarily waiting for an appropriate moment to switch the source of
848	   transmitted text.  The mixer controls the switching of sources and
849	   inserts a source identifier in text format at the beginning of text
850	   after switch of source.  The logic of the mixer to detect when a
851	   switch is appropriate should detect a number of places in text where
852	   a switch can be allowed, including new line, end of sentence, end of
853	   phrase, a period of inactivity, and a word separator after a long
854	   time of active transmission.
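
	   The switching logic described above could look roughly like the
	   following Python sketch.  The thresholds, the punctuation set, the
	   new-line characters, and the function name are assumptions chosen
	   for illustration; the text above gives no specific values.

```python
LINE_BREAKS = ("\n", "\u2028")     # assumed new-line representations
SENTENCE_OR_PHRASE_END = ".!?,;:"  # assumed punctuation set
IDLE_MS = 2000                     # assumed period of inactivity
LONG_ACTIVE_MS = 15000             # assumed "long time" of activity


def may_switch_source(last_char, idle_ms, active_ms):
    """Return True if the mixer may switch to another waiting source.

    last_char is the last character sent from the current source,
    idle_ms the time since that source last sent text, and active_ms
    the time the current source has held the transmission.
    """
    if idle_ms >= IDLE_MS:
        return True  # a period of inactivity
    if not last_char:
        return False  # nothing sent yet from the current source
    if last_char in LINE_BREAKS or last_char in SENTENCE_OR_PHRASE_END:
        return True  # new line, end of sentence, or end of phrase
    # A word separator is accepted only after long continuous activity.
    return last_char == " " and active_ms >= LONG_ACTIVE_MS
```

	   A real mixer would also have to decide which waiting source to
	   serve next; that choice is outside this sketch.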

856	   This method MAY be used when no support for multi-party awareness is
857	   detected in the receiving endpoint.  The base for this method is
858	   described in RFC 7667, section 3.6.1 Media mixing mixer [RFC7667].

860	   See [I-D.ietf-avtcore-multi-party-rtt-mix] for a procedure for mixing
861	   RTT for a conference-unaware endpoint.

863	   Pros:

865	   Can be transmitted to conference-unaware endpoints.

867	   Can be used with other transports than RTP.

869	   Cons:

871	   Does not allow full real-time presentation of more than one source at
872	   a time.  Text from other sources will be delayed.

874	   The only realistic presentation format is a style with the text from
875	   the different sources presented with a text label indicating source,
876	   and the text collected in a chat style presentation but with more
877	   frequent turn-taking.

879	   Endpoints often have their own system for adding labels to the RTT
880	   presentation.  In that case there will be two levels of labels in the
881	   presentation, one for the mixer and one for the sources.

883	   If loss of more packets than can be recovered by the redundancy
884	   appears, it is not possible to detect which source was struck by the
885	   loss.  It is also possible that a source switch occurred during the
886	   loss, and therefore a false indication of the source of text can be
887	   provided to the user after such loss.

889	   Because of all these cons, this method is not recommended and MUST
890	   NOT be used as the main method, but only as the last resort for
891	   backwards interoperability with multi-party unaware endpoints.

893	   The conference server needs to be allowed to decrypt/encrypt the
894	   packet payload.

896	4.1.2.  RTP-based bridging with minor RTT media contents reformatting by
897	        the bridge

899	   It may be desirable to send text in a multi-party setting in a way
900	   that allows the text stream contents to be distributed without being
901	   dealt with in detail in any central server.  A number of such methods
902	   are described.  However, at the time of writing, none of these
903	   methods has a specified way of establishing the session by SDP.

906	4.1.2.1.  RTP Translator sending one RTT stream per participant

908	   Within the RTP session, text from each participant is transmitted
909	   from the RTP media translator in a separate RTP stream, thus using
910	   the same destination address/port combination, but separate RTP SSRC
911	   parameters and sequence number series, as described in Sections 7.1
912	   and 7.2 of RFC 3550 [RFC3550] about the Translator function.  The
913	   source of the text in each RTP packet is identified by the SSRC
914	   parameter in the RTP packets, containing the SSRC of the initial
915	   source of text.

917	   A receiving and presenting endpoint is supposed to separate text
918	   items from the different sources and identify and display them in a
919	   suitable way.

921	   This method is described in RFC 7667, section 3.5.1 Relay-transport
922	   translator or 3.5.2 Media translator [RFC7667].

924	   The identification of the source is made through the SSRC and the
925	   RTCP SDES CNAME and NAME packets as described in RTP [RFC3550].

927	   Pros:

929	   This method has moderate overhead in terms of work for the mixer, but
930	   high in terms of packet transmission rate.  When packets are lost,
931	   it is possible to recover text from redundancy for loss of up to the
932	   number of redundancy levels carried in the RFC 4103 [RFC4103] stream
933	   (normally primary and two redundant levels).

935	   Loss beyond what can be recovered can be detected, and the marker
936	   for text loss can be inserted in the correct stream.

938	   It may be possible in some scenarios to keep the text encrypted
939	   through the Translator.

941	   Cons:

943	   There may be RTP implementations not supporting the Translator model.

945	   With many simultaneous sending sources, the total rate of packets
946	   will be high, and can cause congestion.

948	   This configuration is not supported by current media declarations in
949	   SDP.  RFC 3264 [RFC3264] specifies in many places that one media
950	   description is supposed to describe just one RTP stream.

952	4.1.2.2.  Distributing packets in an end-to-end encryption structure

954	   In order to achieve end-to-end encryption, it is possible to let the
955	   packets from the sources just pass through a central distributor, and
956	   handle the security agreements between the participants.
957	   Specifications exist for a framework with this functionality for
958	   application on RTP based conferences in
959	   [I-D.ietf-perc-private-media-framework].  The RTP flow and mixing
960	   characteristics have similarities with the method described under
961	   "RTP Translator sending one RTT stream per participant" above.  RFC
962	   4103 RTP streams [RFC4103] would fit into the structure and it would
963	   provide a base for end-to-end encrypted RTT multi-party conferencing.

965	   Pros:

967	   Good security.

969	   Straightforward multi-party handling.

971	   Cons:

973	   Does not operate under the usual SIP central conferencing
974	   architecture.

976	   Requires the participants to perform a lot of key handling.

978	   Is work in progress when this is written.

980	4.1.2.3.  Mesh of RTP endpoints

982	   Text from all participants is transmitted directly to all others in
983	   one RTP session, without a central bridge.  The sources of the text
984	   in each RTP packet are identified by the source network address and
985	   the SSRC.

987	   This method is described in RFC 7667, section 3.4 Point to multi-
988	   point using mesh [RFC7667].

990	   Pros:

992	   When packets are lost, it is possible to recover text from
993	   redundancy for loss of up to the number of redundancy levels carried
994	   in the RFC 4103 [RFC4103] stream (normally primary and two redundant
995	   levels).

997	   This method can be implemented with most RTP implementations.

999	   Transmitted text can also be used with other transports than RTP.

1001	   Cons:

1003	   This model is not described in IMS, NENA and EENA specifications, and
1004	   therefore does not meet the requirements.

1006	   Requires a drastically increasing number of connections when the
1007	   number of participants increases.
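
	   The growth in connections can be made concrete with a small Python
	   calculation.  It assumes one bidirectional connection per pair of
	   endpoints in the mesh, and one connection per endpoint when a
	   central bridge is used.  The function names are hypothetical.

```python
def mesh_links(participants):
    """Pairwise connections in a full mesh of the given size."""
    return participants * (participants - 1) // 2


def bridge_links(participants):
    """Connections when every endpoint talks only to a central bridge."""
    return participants
```

	   Under these assumptions ten participants need 45 connections in a
	   mesh, compared to ten with a central bridge.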

1009	4.1.2.4.  Multiple RTP sessions, one for each participant

1011	   Text from all participants is transmitted directly to all others in
1012	   one RTP session each, without a central bridge.  Each session is
1013	   established with a separate media description in SDP.  The sources of
1014	   the text in each RTP packet are identified by the source network
1015	   address and the SSRC.

1017	   Pros:

1019	   When packets are lost, it is possible to recover text from
1020	   redundancy for loss of up to the number of redundancy levels carried
1021	   in the RFC 4103 [RFC4103] stream (normally primary and two redundant
1022	   levels).

1024	   Complete loss of text can be indicated in the received stream.

1026	   This method can be implemented with most RTP implementations.

1028	   End-to-end encryption is achievable.

1030	   Cons:

1032	   This method is not described in IMS, NENA and ETSI specifications and
1033	   therefore does not meet the requirements.

1035	   A lot of network resources are spent on setting up separate sessions
1036	   for each participant.

1038	5.  Preferred RTP-based multi-party RTT transport method

1040	   For RTP transport of RTT using RTP-mixer technology, one method for
1041	   multi-party mixing and transport stands out as fulfilling the goals
1042	   best and is therefore recommended.  That is: TBD

1044	   For RTP transport in separate streams or sessions, no current
1045	   recommendation can be made.  A bridging method in the process of
1046	   standardisation with interesting characteristics is the end-to-end
1047	   encryption model "perc" (Section 4.1.2.2).

1049	6.  Session control of RTP-based multi-party RTT sessions

1051	   General session control aspects for multi-party sessions are
1052	   described in RFC 4575 [RFC4575] A Session Initiation Protocol (SIP)
1053	   Event Package for Conference State, and RFC 4579 [RFC4579] Session
1054	   Initiation Protocol (SIP) Call Control - Conferencing for User
1055	   Agents.  The nomenclature of these specifications is used here.

1057	   The procedures for a multi-party aware model for RTT-transmission
1058	   shall only be applied if a capability exchange for multi-party aware
1059	   real-time text transmission has been completed and a supported method
1060	   for multi-party real-time text transmission can be negotiated.

1062	   A method for detection of conference-awareness for centralized SIP
1063	   conferencing in general is specified in RFC 4579 [RFC4579].  The
1064	   focus sends the "isfocus" feature tag in a SIP Contact header.  This
1065	   causes the conference-aware endpoint to subscribe to conference
1066	   notifications from the focus.  The focus then sends notifications to
1067	   the endpoint about entering and disappearing conference participants
1068	   and their media capabilities.  The information is carried XML-
1069	   formatted in a 'conference-info' block in the notification according
1070	   to RFC 4575 [RFC4575].  The mechanism is described in detail in RFC
1071	   4575 [RFC4575].

1073	   Before a conference media server starts sending multi-party RTT to an
1074	   endpoint, a verification of its ability to handle multi-party RTT
1075	   must be made.  A decision on which mechanism to use for identifying
1076	   text from the different participants must also be taken, implicitly
1077	   or explicitly.  These verifications and decisions can be done in a
1078	   number of ways.  The most apparent ways are specified here and their
1079	   pros and cons described.  One of the methods is selected to be the
1080	   one to be used by implementations of the centralized conference model
1081	   according to this specification.

1083	6.1.  Implicit RTT multi-party capability indication

1085	   Capability for RTT multi-party handling can be decided to be
1086	   implicitly indicated by session control items.

1088	   The focus may implicitly indicate multi-party RTT capability by
1089	   including the media child with value "text" in the RFC 4575 [RFC4575]
1090	   conference-info provided in conference notifications.

1092	   An endpoint may implicitly indicate multi-party RTT capability by
1093	   including the text media in the SDP in the session control
1094	   transactions with the conference focus after the subscription to the
1095	   conference has taken place.

1097	   The implicit RTT capability indication means for the focus that it
1098	   can handle multi-party RTT according to the preferred method
1099	   indicated in the RTT multi-party methods section above.

1101	   The implicit RTT capability indication means for the endpoint that it
1102	   can handle multi-party RTT according to the preferred method
1103	   indicated in the RTT multi-party methods section above.

1105	   If the focus detects that an endpoint implicitly declared RTT multi-
1106	   party capability, it SHALL provide RTT according to the preferred
1107	   method.

1109	   If the focus detects that the endpoint does not indicate any RTT
1110	   multi-party capability, then it shall either provide RTT multi-party
1111	   text in the way specified for conference-unaware endpoint above, or
1112	   refuse to set up the session.

1114	   If the endpoint detects that the focus has implicitly declared RTT
1115	   multi-party capability, it shall be prepared to present RTT in a
1116	   multi-party fashion according to the preferred method.

1118	   Pros:

1120	   Acceptance of implicit multi-party capability implies that no
1121	   standardisation of explicit RTT multi-party capability exchange is
1122	   required.

1124	   Cons:

1126	   If other methods for multi-party RTT are to be used in the same
1127	   implementation environment as the preferred ones, then capability
1128	   exchange needs to be defined for them.

1130	   Cannot be used outside a strictly applied SIP central conference
1131	   model.

1133	6.2.  RTT multi-party capability declared by SIP media-tags

1135	   Specifications for RTT multi-party capability declarations can be
1136	   agreed for use as SIP media feature tags, to be exchanged during SIP
1137	   call control operation according to the mechanisms in RFC 3840
1138	   [RFC3840] and RFC 3841 [RFC3841].  Capability for the RTT Multi-party
1139	   capability is then indicated by the media feature tag "rtt-mix", with
1140	   a set of possible values for the different possible methods.

1142	   The possible values in the list may for example be:

1144	      rtp-mixer

1146	      perc

1148	   rtp-mixer indicates capability for using the RTP-mixer based
1149	   presentation of multi-party text.

1151	   perc indicates capability for using the perc based transmission of
1152	   multi-party text.

1154	   Example: Contact: <sip:a2@beco.example.com>
1156	      ;methods="INVITE,ACK,OPTIONS,BYE,CANCEL"
1158	      ;+sip.rtt-mix="rtp-mixer"

1160	   If, after evaluation of the alternatives in this specification, only
1161	   one mixing method is selected to be brought to implementation, then
1162	   the media tag can be reduced to a single tag with no list of values.

1164	   An offer-answer exchange should take place and the common method
1165	   selected by the answering party shall be used in the session with
1166	   that UA.

1168	   When no common method is declared, then only the fallback method for
1169	   multi-party unaware participants can be used, or the session dropped.

1171	   If more than one text media section is included in SDP, all must be
1172	   capable of using the declared RTT multi-party method.
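
	   A receiver of such a Contact header could extract the declared
	   methods as in this Python sketch.  The "rtt-mix" feature tag is only
	   proposed here, and the function name and the comma-separated value
	   list inside the quotes are assumptions for illustration.

```python
import re

# Matches ;+sip.rtt-mix="..." among the Contact header parameters.
_RTT_MIX = re.compile(r';\s*\+sip\.rtt-mix="([^"]*)"', re.IGNORECASE)


def rtt_mix_methods(contact):
    """Return the list of rtt-mix values found in a Contact header value."""
    match = _RTT_MIX.search(contact)
    if not match:
        return []  # no declaration: treat the peer as multi-party unaware
    return [v.strip() for v in match.group(1).split(",") if v.strip()]
```

	   An empty result would lead to the fallback method for multi-party
	   unaware participants, or to the session being dropped.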

1174	   Pros:

1176	   Provides a clear decision method.

1178	   Can be extended with new mixing methods.

1180	   Can guide call routing to a suitable capable focus.

1182	   Cons:

1184	   Requires standardization and IANA registration.

1186	   Is not stream specific.  If more than one text stream is specified,
1187	   all must have the same type of multi-party capability.

1189	   Cannot be used in the WebRTC environment.

1191	6.3.  SDP media attribute for RTT multi-party capability indication

1193	   An attribute can be specified on media level, to be used in text
1194	   media SDP declarations for negotiating RTT multi-party capabilities.
1195	   The attribute can have the name "rtt-mix".

1197	   More than one attribute can be included in one media description.

1199	   The attribute can have a value.  The value can for example be:

1201	      rtp-mixer

1203	      rtp-translator

1205	      perc

1207	   rtp-mixer indicates capability for using the RTP-mixer and CSRC-list
1208	   based mixing of multi-party text.

1210	   rtp-translator indicates capability for using the RTP-translator
1211	   based mixing.

1213	   perc indicates capability for using the perc based transmission of
1214	   multi-party text.

1216	   An offer-answer exchange should take place and the common method
1217	   selected by the answering party shall be used in the session with
1218	   that endpoint.

1220	   When no common method is declared, then only the fallback method for
1221	   multi-party unaware endpoints can be used.

1223	   Example: a=rtt-mix:rtp-mixer

1224	   If, after evaluation of the alternatives in this specification, only
1225	   one mixing method is selected to be brought to implementation, then
1226	   the attribute can be reduced to a single attribute with no list of
1227	   values.
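
	   The selection of a common method by the answering party could be
	   sketched in Python as follows.  The "rtt-mix" attribute is only
	   proposed here; the function names and the rule that the answerer
	   picks the first offered method it supports are assumptions.

```python
ATTRIBUTE = "a=rtt-mix:"


def offered_methods(sdp_media):
    """Return the rtt-mix values of one media description, in order."""
    methods = []
    for line in sdp_media.splitlines():
        line = line.strip()
        if line.startswith(ATTRIBUTE):
            methods.append(line[len(ATTRIBUTE):].strip())
    return methods


def select_method(offer, supported):
    """Pick the first offered method that the answerer supports.

    Returns None when there is no common method, in which case only the
    fallback for multi-party unaware endpoints remains.
    """
    for method in offered_methods(offer):
        if method in supported:
            return method
    return None
```

	   The answerer would then place only the selected value in its own
	   media description.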

1229	   Pros:

1231	   Provides a clear decision method.

1233	   Can be extended with new mixing methods.

1235	   Can be used on specific text media.

1237	   Can be used also for SDP-controlled WebRTC sessions with multiple
1238	   streams in the same data channel.

1240	   Cons:

1242	   Requires standardization and IANA registration.

1244	   Cannot guide SIP routing.

1246	6.4.  Simplified SDP media attribute for RTT multi-party capability
1247	      indication

1249	   An attribute can be specified on media level, to be used in text
1250	   media SDP declarations for negotiating RTT multi-party capabilities.
1251	   The attribute can have the name "rtt-mix" with no value.  It would be
1252	   selected and used if only one method for multi-party RTT is brought
1253	   forward from this specification, and the others are suppressed or
1254	   found to be possible to negotiate in another way.

1256	   An offer-answer exchange should take place and if both parties
1257	   specify "rtt-mix" capability, the selected mixing method shall be
1258	   used.

1260	   When no common method is declared, then only the fallback method for
1261	   multi-party unaware endpoints can be used, or the session not
1262	   accepted for multi-party use.

1264	   Example: a=rtt-mix

1266	   Pros:

1268	   Provides a clear decision method.

1270	   Very simple syntax and semantics.

1272	   Can be used on specific text media.

1274	   Could possibly be used also for SDP-controlled WebRTC sessions with
1275	   multiple streams in the same data channel.

1277	   Cons:

1279	   Requires standardization and IANA registration.

1281	   If another RTT mixing method is specified in the future, that method
1282	   may need to specify and register its own attribute.  With an
1283	   attribute that takes a parameter value, only the addition of a new
1284	   possible value would be needed.

1286	   Cannot guide SIP routing.

1288	6.5.  SDP format parameter for RTT multi-party capability indication

1290	   An FMTP format parameter can be specified for the RFC 4103 [RFC4103]
1291	   media, to be used in text media SDP declarations for negotiating RTT
1292	   multi-party capabilities.  The parameter can have the name
1293	   "rtt-mix", with one or more of its possible values.

1295	   The possible values in the list are:

1297	      rtp-mixer

1299	      perc

1301	   rtp-mixer indicates capability for using the RTP-mixer based mixing
1302	   and presentation of multi-party text using the CSRC-list.

1304	   perc indicates capability for using the perc based transmission of
1305	   multi-party text.

1307	   Example: a=fmtp:96 98/98/98 rtt-mix=rtp-mixer

1309	   If, after evaluation of the alternatives in this specification, only
1310	   one mixing method is selected to be brought to implementation, then
1311	   the parameter can be reduced to a single parameter with no list of
1312	   values.

1314	   An offer-answer exchange should take place and the common method
1315	   selected by the answering party shall be used in the session with
1316	   that UA.

1318	   When no common method is declared, then only the fallback method can
1319	   be used, or the session denied.

1321	   Pros:

1323	   Provides a clear decision method.

1325	   Can be extended with new mixing methods.

1327	   Can be used on specific text media.

1329	   Can be used also for SDP-controlled WebRTC sessions with multiple
1330	   streams in the same data channel.

1332	   Cons:

1334	   Requires standardization and IANA registration.

1336	   May cause interop problems with current RFC4103 [RFC4103]
1337	   implementations not expecting a new fmtp-parameter.

1339	   Cannot guide SIP routing.

1341	6.6.  A text media subtype for support of multi-party rtt

1343	   Indicating a specific text media subtype in SDP is a straightforward
1344	   way for negotiating multi-party capability.  Especially if there are
1345	   format differences from the "text/red" and "text/t140" formats of
1346	   RFC4103 [RFC4103], then this is a natural way to do the negotiation
1347	   for multi-party RTT.

1349	   Pros:

1351	   No extra efforts if a new format is needed anyway.

1353	   Cons:

1355	   None specific to using the format indication for negotiation of
1356	   multi-party capability.  But only feasible if a new format is needed
1357	   anyway.

1359	6.7.  Preferred capability declaration method for RTP-based transport

1361	   If the preferred transport method is one with a specific media
1362	   subtype in SDP, then specification by media subtype is preferred.

1364	   Otherwise, the preferred capability declaration method is the
1365	   simplified SDP attribute "a=rtt-mix" (Section 6.4), because it is
1366	   straightforward and partially usable also for WebRTC if so
1367	   needed.

1369	6.8.  Identification of the source of text for RTP-based solutions

1371	   The main way to identify the source of text in the RTP based solution
1372	   is by the SSRC of the sending participant.  In the RTP-mixer
1373	   solution, this SSRC is included in the CSRC list of the transmitted
1374	   packets.  Further identification that may be needed for better
1375	   labelling of received text may be obtained from a number of sources,
1376	   such as the RTCP SDES CNAME and NAME reports and the conference
1377	   notification data (RFC 4575) [RFC4575].

1379	   As soon as a new member is added to the RTP session, its
1380	   characteristics should be transmitted in RTCP SDES CNAME and NAME
1381	   reports according to section 6.5 in RFC 3550 [RFC3550].  The
1382	   information about the participant should also be included in the
1383	   conference data including the text media member in a notification
1384	   according to RFC 4575 [RFC4575].

1386	   The RTCP SDES report SHOULD contain identification of the source
1387	   represented by the SSRC/CSRC identifier.  This identification MUST
1388	   contain the CNAME field and MAY contain the NAME field and other
1389	   defined fields of the SDES report.

1391	   A focus UA SHOULD primarily convey SDES information received from the
1392	   sources of the session members.  When such information is not
1393	   available, the focus UA SHOULD compose SSRC/CSRC, CNAME and NAME
1394	   information from available information from the SIP session with the
1395	   participant.

1397	7.  RTT bridging in WebRTC

1399	   Within WebRTC, real-time text is specified to be carried in WebRTC
1400	   data channels as specified in
1401	   [I-D.ietf-mmusic-t140-usage-data-channel].  A few ways to handle
1402	   multi-party RTT are mentioned briefly there and are repeated below.

1404	7.1.  RTT bridging in WebRTC with one data channel per source

1406	   A straightforward way to handle multi-party RTT is for the bridge to
1407	   open one T.140 data channel per source towards the receiving
1408	   participants.

1410	   The stream-id forms a unique stream identification.

1412	   The identification of the source is made through the Label property
1413	   of the channel, and session information belonging to the source.  The
1414	   endpoint can compose a readable label for the presentation from this
1415	   information.

1417	   Pros:

1419	   This is a straightforward solution.

1421	   The load per source is low.

1423	   Cons:

1425	   With a high number of participants, the overhead of establishing and
1426	   maintaining the high number of data channels required may be high,
1427	   even if the load per channel is low.

1429	7.2.  RTT bridging in WebRTC with one common data channel

1431	   A way to handle multi-party RTT in WebRTC is for the bridge to
1432	   combine text from all sources into one data channel and mark the
1433	   source of each part of the stream with a T.140 control code.

1435	   This method is described in a corresponding section for RTP
1436	   transmission above in Section 4.1.1.9.

1438	   The identification of the source is made through insertion in the
1439	   beginning of each text transmission from a source of a control code
1440	   extension "c" followed by a string representing the source, framed by
1441	   the control code start and end flags SOS and ST (See ITU-T T.140
1442	   [T140]).

1444	   A receiving endpoint is supposed to separate text items from the
1445	   different sources and identify and display them in a suitable way.

1447	   The endpoint does not always display the source identification in the
1448	   received text at the place where it is received, but has the
1449	   information as a guide for planning the presentation of received
1450	   text.  A label corresponding to the source identification is
1451	   presented when needed depending on the selected presentation style.
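
   The framing and separation described above can be illustrated with
   the sketch below.  It assumes the proposed, not yet standardised,
   control code extension "c", and uses the T.140 SOS (U+0098) and ST
   (U+009C) characters as delimiters.  The function names and the use
   of an empty label for text received before any source marker are
   assumptions made for the example.

```python
from typing import Dict

SOS = "\u0098"  # T.140 Start Of String
ST = "\u009c"   # T.140 String Terminator


def frame_source(label: str) -> str:
    """Build the source marker a bridge would insert before a source's text."""
    return f"{SOS}c{label}{ST}"


def split_by_source(stream: str) -> Dict[str, str]:
    """Separate a combined stream into the text received from each source."""
    texts: Dict[str, str] = {}
    current = ""  # label used for text seen before any source marker
    i = 0
    while i < len(stream):
        if stream[i] == SOS:
            end = stream.find(ST, i)
            if end == -1:
                break  # incomplete control string; a real receiver would wait
            body = stream[i + 1:end]
            if body.startswith("c"):
                current = body[1:]  # switch to the announced source
            i = end + 1  # other control strings are skipped in this sketch
            continue
        texts[current] = texts.get(current, "") + stream[i]
        i += 1
    return texts
```

   A receiver would feed the resulting per-source text into its chosen
   presentation style rather than display the markers where they occur
   in the stream.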

1453	   Pros:

1455	   This solution has relatively low overhead on session and network
1456	   level.

1458	   Cons:

1460	   This solution has higher overhead on the media contents level than
1461	   the WebRTC solution above.

1463	   Standardisation of the new control code "c" in ITU-T T.140 [T140] is
1464	   required.

1466	   The conference server needs to be allowed to decrypt/encrypt the data
1467	   channel contents.

1469	7.3.  Preferred rtt multi-party method for WebRTC

1471	   For WebRTC, one method is preferred because of its simplicity.  So,
1472	   for WebRTC, the method to implement for multi-party RTT with multi-
1473	   party aware parties when no other method is explicitly agreed between
1474	   implementing parties is: "RTT bridging in WebRTC with one data
1475	   channel per source" Section 7.1.

1477	8.  Presentation of multi-party text

1479	   All session participants with RTP based transport MUST observe the
1480	   SSRC/CSRC field of incoming text RTP packets, and make note of which
1481	   source they came from in order to be able to present text in a way
1482	   that makes it easy to read text from each participant in a session,
1483	   and get information about the source of the text.

1485	   In the WebRTC case, the Label parameter and other provided endpoint
1486	   information should be used for the same purpose.
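
   As an illustration of the first requirement, the sketch below reads
   the SSRC and the CSRC list from the fixed RTP header defined in RFC
   3550.  The function name is an assumption made for the example, and
   header extensions, padding and payload handling are deliberately
   left out.

```python
import struct
from typing import List, Tuple


def text_sources(packet: bytes) -> Tuple[int, List[int]]:
    """Return (SSRC, CSRC list) from the header of one RTP packet."""
    if len(packet) < 12:
        raise ValueError("too short for an RTP header")
    # First byte: version (2 bits), padding, extension, CSRC count (4 bits).
    first, _m_pt, _seq, _timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    if first >> 6 != 2:
        raise ValueError("not RTP version 2")
    cc = first & 0x0F
    if len(packet) < 12 + 4 * cc:
        raise ValueError("truncated CSRC list")
    csrcs = list(struct.unpack(f"!{cc}I", packet[12:12 + 4 * cc]))
    return ssrc, csrcs
```

   With an RTP-mixer, the SSRC identifies the mixer and the CSRC list
   carries the original sources; without a mixer, the CSRC list is
   empty and the SSRC itself identifies the source.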

1488	8.1.  Associating identities with text streams

1490	   A source identity SHOULD be composed from available information
1491	   sources and displayed together with the text as indicated in ITU-T
1492	   T.140 Appendix [T140].

1494	   The source identity should primarily be the NAME field from incoming
1495	   SDES packets.  If this information is not available, and the session
1496	   is a two-party session, then the T.140 source identity SHOULD be
1497	   composed from the SIP session participant information.  For multi-
1498	   party sessions the source identity may be composed by local
1499	   information if sufficient information is not available in the
1500	   session.

1502	   Applications may abbreviate the presented source identity to a
1503	   suitable form for the available display.

1505	   Applications may also replace received source information with
1506	   internally used nicknames.
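
   One possible order of preference for composing the presented label
   is sketched below.  The function and parameter names, the
   "Unknown" placeholder and the simple truncation used for
   abbreviation are assumptions made for the example.  The sketch also
   simplifies the rules above by using SIP participant information as
   the second choice regardless of the number of parties.

```python
from typing import Optional


def source_label(sdes_name: Optional[str],
                 sip_display_name: Optional[str],
                 local_nickname: Optional[str] = None,
                 max_len: int = 12) -> str:
    """Compose a display label for a text source."""
    # A locally configured nickname replaces any received information.
    # Otherwise the SDES NAME is preferred, then SIP session information.
    label = local_nickname or sdes_name or sip_display_name or "Unknown"
    # Abbreviate to fit the available display (plain truncation here).
    return label[:max_len]
```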

1508	8.2.  Presentation details for multi-party aware endpoints

1510	   The multi-party aware endpoint should, after any action for recovery
1511	   of data from lost packets, separate the incoming streams and present
1512	   them according to the style that the receiving application supports
1513	   and the user has selected.  The decisions taken for presentation of
1514	   the multi-party interchange shall be purely on the receiving side.
1515	   The sending application must not insert any item in the stream to
1516	   influence presentation that is not requested by the sending
1517	   participant.

1519	8.2.1.  Bubble style presentation

1521	   One often used style is to present real-time text in chunks in
1522	   readable bubbles identified by labels containing names of sources.
1523	   Bubbles are placed in one column in the presentation area and are
1524	   closed and moved upwards in the presentation area after certain items
1525	   or events, when there is also newer text from another source that
1526	   would go into a new bubble.  The text items that allow bubble
1527	   closing are any character closing a phrase or sentence followed by a
1528	   space or a timeout of a suitable time (about 10 seconds).
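
   The text condition for closing a bubble can be sketched as below.
   The sketch reads the rule as "a phrase-closing character followed by
   a space, or an idle timeout"; the set of phrase-closing characters
   and the function name are assumptions made for the example.  It
   covers only this condition; whether newer text from another source
   is waiting for a new bubble must be checked separately.

```python
PHRASE_END = ".?!"       # assumed set of phrase-closing characters
IDLE_TIMEOUT = 10.0      # seconds, "about 10 seconds" in the text above


def may_close_bubble(text: str, idle_seconds: float) -> bool:
    """Return True if the bubble contents allow the bubble to be closed."""
    if idle_seconds >= IDLE_TIMEOUT:
        return True
    # A phrase-closing character immediately followed by a space.
    return len(text) >= 2 and text[-1] == " " and text[-2] in PHRASE_END
```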

1530	   Real-time active text sent from the local user should be presented in
1531	   a separate area.  When there is a reason to close a bubble from the
1532	   local user, the bubble should be placed above all real-time active
1533	   bubbles, so that the time order that real-time text entries were
1534	   completed is visible.

1536	   Scrolling is usually provided for viewing of recent or older text.
1537	   When scrolling is done to an earlier point in the text, the
1538	   presentation shall not move the scroll position by new received text.
1539	   It must be the decision of the local user to return to automatic
1540	   viewing of latest text actions.  An indication that there is new
1541	   text to read may be useful after scrolling to an earlier position
1542	   has been activated.

1544	   The presentation area may become too small to present all text in all
1545	   real-time active bubbles.  Various techniques can be applied to
1546	   provide a good overview and good reading opportunity even in such
1547	   situations.  The active real-time bubble may have a limited number of
1548	   lines and if their contents need more lines, then a scrolling
1549	   opportunity within the real-time active bubble is provided.  Another
1550	   method can be to only show the label and the last line of the active
1551	   real-time bubble contents, and make it possible to expand or compress
1552	   the bubble presentation between full view and one line view.

1554	   Erasures require special consideration.  Erasure within a real-time
1555	   active bubble is straightforward.  But if erasure from one
1556	   participant affects the last character before a bubble, the whole
1557	   previous bubble becomes the actual bubble for real-time action by
1558	   that participant and is placed below all other bubbles in the
1559	   presentation area.  If the border between bubbles was caused by the
1560	   CRLF characters (instead of the normal "Line Separator"), only one
1561	   erasure action is required to erase this bubble border.  When a
1562	   bubble is closed, it is moved up, above all real-time active bubbles.

1564	   A three-party view is shown in this example.

1566	                 _________________________________________________
1567	                |                                              |^|
1568	                |                                              |-|
1569	                |[Alice] Hi, Alice here.                       | |
1570	                |                                              | |
1571	                |[Bob] Bob as well.                            | |
1572	                |                                              | |
1573	                |[Eve] Hi, this is Eve, calling from Paris.    | |
1574	                |      I thought you should be here.           | |
1575	                |                                              | |
1576	                |[Alice] I am coming on Thursday, my           | |
1577	                |      performance is not until Friday morning.| |
1578	                |                                              | |
1579	                |[Bob] And I on Wednesday evening.             | |
1580	                |                                              | |
1581	                |[Alice] Can we meet on Thursday evening?      | |
1582	                |                                              | |
1583	                |[Eve] Yes, definitely. How about 7pm.         | |
1584	                |     at the entrance of the restaurant        | |
1585	                |     Le Lion Blanc?                           | |
1586	                |[Eve] we can have dinner and then take a walk | |
1587	                |                                              | |
1588	                | <Eve-typing> But I need to be back to        | |
1589	                |    the hotel by 11 because I need            | |
1590	                |                                              | |
1591	                | <Bob-typing> I wou                           |-|
1592	                |______________________________________________|v|
1593	                | of course, I underst                           |
1594	                |________________________________________________|

1596	               Figure 1: Three-party call with bubble style.

1601	8.2.2.  Other presentation styles

1603	   Other presentation styles than the bubble style may be arranged and
1604	   appreciated by the users.  In a video conference one way may be to
1605	   have a real-time text area below the video view of each participant.
1606	   Another view may be to provide one column in a presentation area for
1607	   each participant and place the text entries in a relative vertical
1608	   position corresponding to when text entry in them was completed.  The
1609	   labels can then be placed in the column header.  The considerations
1610	   for ending and moving and erasure of entered text discussed above for
1611	   the bubble style are valid also for these styles.

1613	   This figure shows how a coordinated column view MAY be presented.

1615	   _____________________________________________________________________
1616	   |       Bob          |       Eve            |       Alice           |
1617	   |____________________|______________________|_______________________|
1618	   |                    |                      |I will arrive by TGV.  |
1619	   |My flight is to Orly|                      |Convenient to the main |
1620	   |                    |Hi all, can we plan   |station.               |
1621	   |                    |for the seminar?      |                       |
1622	   |Eve, will you do    |                      |                       |
1623	   |your presentation on|                      |                       |
1624	   |Friday?             |Yes, Friday at 10.    |                       |
1625	   |Fine, wo            |                      |We need to meet befo   |
1626	   |___________________________________________________________________|

1628	   Figure 2: A coordinated column-view of a three-party session with
1629	   entries ordered in approximate time-order.

1631	9.  Presentation details for multi-party unaware endpoints

1633	   Multi-party unaware endpoints are prepared only for presentation of
1634	   two sources of text, the local user and a remote user.  If mixing for
1635	   multi-party unaware endpoints is to be supported, in order to enable
1636	   some multi-party communication with such endpoints, the mixer needs
1637	   to plan the presentation and insert labels and line breaks before
1638	   labels.  Many limitations appear for this presentation mode, and it
1639	   must be seen as a fallback and a last resort.

1641	   A procedure for presenting RTT to a conference-unaware endpoint is
1642	   included in [I-D.ietf-avtcore-multi-party-rtt-mix].

1644	10.  Security Considerations

1646	   The security considerations valid for RFC 4103 [RFC4103] and RFC 3550
1647	   [RFC3550] are valid also for the multi-party sessions with text.

1649	11.  IANA Considerations

1651	   The items for indication and negotiation of capability for multi-
1652	   party rtt should be registered with IANA in the specifications where
1653	   they are specified in detail.

1655	12.  Congestion considerations

1657	   The congestion considerations described in RFC 4103 [RFC4103] are
1658	   valid also for the recommended RTP-based multi-party use of the real-
1659	   time text transport.  A risk for congestion may appear if a number of
1660	   conference participants are actively transmitting text simultaneously,
1661	   because the recommended RTP-based multi-party transmission method
1662	   does not allow multiple sources of text to contribute to the same
1663	   packet.

1665	   In situations of risk for congestion, the Focus UA MAY combine
1666	   packets from the same source to increase the transmission interval
1667	   per source up to one second.  Local conference policy in the Focus UA
1668	   may be used to decide which streams shall be selected for such
1669	   transmission frequency reduction.
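
   The interval increase described above can be illustrated with the
   per-source buffer sketched below.  The class and method names are
   assumptions made for the example, the caller supplies the clock, and
   redundancy generations and RTP packetisation are left out.

```python
from typing import Dict, List, Tuple


class SourceCombiner:
    """Combine text per source so each source is sent once per interval."""

    def __init__(self, interval: float) -> None:
        if not 0 < interval <= 1.0:
            raise ValueError("interval must be within (0, 1] second")
        self.interval = interval
        self._pending: Dict[int, str] = {}
        self._last_sent: Dict[int, float] = {}

    def add(self, ssrc: int, text: str, now: float) -> None:
        """Queue text from one source; `now` is a time in seconds."""
        self._pending[ssrc] = self._pending.get(ssrc, "") + text
        self._last_sent.setdefault(ssrc, now)

    def due(self, now: float) -> List[Tuple[int, str]]:
        """Return (ssrc, combined text) for sources whose interval elapsed."""
        out: List[Tuple[int, str]] = []
        for ssrc in list(self._pending):
            if now - self._last_sent[ssrc] >= self.interval:
                out.append((ssrc, self._pending.pop(ssrc)))
                self._last_sent[ssrc] = now
        return out
```

   A Focus UA could apply a longer interval only to the streams that
   local conference policy selects, leaving the other streams at their
   normal transmission interval.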

1671	13.  Acknowledgements

1673	   Arnoud van Wijk for contributions to an earlier, expired draft of
1674	   this memo.

1676	14.  Change history

1678	14.1.  Changes to draft-hellstrom-avtcore-multi-party-rtt-solutions-01

1680	   Added three more methods for RTP-mixer mixing.  Two RFC 5109 FEC
1681	   based and another with modified data header to detect source of
1682	   completely lost text.

1684	   Separated RTP-based and WebRTC based solutions.

1686	   Deleted the multi-party-unaware mixing procedure appendix.  It is now
1687	   included in the draft draft-ietf-avtcore-multi-party-rtt-mix.  Kept a
1688	   section with a reference to the new place.

1690	14.2.  Changes from draft-hellstrom-mmusic-multi-party-rtt-02 to draft-
1691	       hellstrom-avtcore-multi-party-rtt-solutions-00

1693	   Add discussion about switching performance, as discussed in avtcore
1694	   on March 13.

1696	   Added that a decrease of transmission interval to 100 ms increases
1697	   switching performance by a factor of 3, but is still not sufficient.

1699	   Added that the CSRC-list method also uses 100 milliseconds
1700	   transmission interval.

1702	   Added the method with multiple primary text in each packet.

1704	   Added the timestamp-based method for rtp-mixing proposed by James
1705	   Hamlin on March 14.

1707	   Corrected the chat style presentation example picture.  Delete a few
1708	   "[mix]".

1710	14.3.  Changes from version draft-hellstrom-mmusic-multi-party-rtt-01 to
1711	       -02

1713	   Change from a general overview to overview with clear
1714	   recommendations.

1716	   Splits text coordination methods in three groups.

1718	   Recommends rtt-mixer with sources in CSRC-list but refers to its
1719	   spec for details.

1721	   Shortened Appendix with conference-unaware example.

1723	   Cleaned up preferences.

1725	   Inserted pictures of screen-views.

1727	15.  References

1729	15.1.  Normative References

1731	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1732	              Requirement Levels", BCP 14, RFC 2119,
1733	              DOI 10.17487/RFC2119, March 1997,
1734	              <https://www.rfc-editor.org/info/rfc2119>.

1736	15.2.  Informative References

1738	   [EN301549] ETSI, "EN 301 549. Accessibility requirements for ICT
1739	              products and services", November 2019,
1740	              <https://www.etsi.org/deliver/
1741	              etsi_en/301500_301599/301549/03.01.01_60/
1742	              en_301549v030101p.pdf>.

1744	   [I-D.ietf-avtcore-multi-party-rtt-mix]
1745	              Hellstrom, G., "RTP-mixer formatting of multi-party Real-
1746	              time text", Work in Progress, Internet-Draft, draft-ietf-
1747	              avtcore-multi-party-rtt-mix-06, 11 June 2020,
1748	              <https://tools.ietf.org/html/draft-ietf-avtcore-multi-
1749	              party-rtt-mix-06>.

1751	   [I-D.ietf-mmusic-t140-usage-data-channel]
1752	              Holmberg, C. and G. Hellstrom, "T.140 Real-time Text
1753	              Conversation over WebRTC Data Channels", Work in Progress,
1754	              Internet-Draft, draft-ietf-mmusic-t140-usage-data-channel-
1755	              14, 10 April 2020, <https://tools.ietf.org/html/draft-
1756	              ietf-mmusic-t140-usage-data-channel-14>.

1758	   [I-D.ietf-perc-private-media-framework]
1759	              Jones, P., Benham, D., and C. Groves, "A Solution
1760	              Framework for Private Media in Privacy Enhanced RTP
1761	              Conferencing (PERC)", Work in Progress, Internet-Draft,
1762	              draft-ietf-perc-private-media-framework-12, 5 June 2019,
1763	              <https://tools.ietf.org/html/draft-ietf-perc-private-
1764	              media-framework-12>.

1766	   [NENAi3]   NENA, "NENA-STA-010.2-2016. Detailed Functional and
1767	              Interface Standards for the NENA i3 Solution", October
1768	              2016, <https://www.nena.org/page/i3_Stage3>.

1770	   [RFC2198]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
1771	              Handley, M., Bolot, J.C., Vega-Garcia, A., and S. Fosse-
1772	              Parisis, "RTP Payload for Redundant Audio Data", RFC 2198,
1773	              DOI 10.17487/RFC2198, September 1997,
1774	              <https://www.rfc-editor.org/info/rfc2198>.

1776	   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
1777	              A., Peterson, J., Sparks, R., Handley, M., and E.
1778	              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
1779	              DOI 10.17487/RFC3261, June 2002,
1780	              <https://www.rfc-editor.org/info/rfc3261>.

1782	   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
1783	              with Session Description Protocol (SDP)", RFC 3264,
1784	              DOI 10.17487/RFC3264, June 2002,
1785	              <https://www.rfc-editor.org/info/rfc3264>.

1787	   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
1788	              Jacobson, "RTP: A Transport Protocol for Real-Time
1789	              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
1790	              July 2003, <https://www.rfc-editor.org/info/rfc3550>.

1792	   [RFC3840]  Rosenberg, J., Schulzrinne, H., and P. Kyzivat,
1793	              "Indicating User Agent Capabilities in the Session
1794	              Initiation Protocol (SIP)", RFC 3840,
1795	              DOI 10.17487/RFC3840, August 2004,
1796	              <https://www.rfc-editor.org/info/rfc3840>.

1798	   [RFC3841]  Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Caller
1799	              Preferences for the Session Initiation Protocol (SIP)",
1800	              RFC 3841, DOI 10.17487/RFC3841, August 2004,
1801	              <https://www.rfc-editor.org/info/rfc3841>.

1803	   [RFC4103]  Hellstrom, G. and P. Jones, "RTP Payload for Text
1804	              Conversation", RFC 4103, DOI 10.17487/RFC4103, June 2005,
1805	              <https://www.rfc-editor.org/info/rfc4103>.

1807	   [RFC4353]  Rosenberg, J., "A Framework for Conferencing with the
1808	              Session Initiation Protocol (SIP)", RFC 4353,
1809	              DOI 10.17487/RFC4353, February 2006,
1810	              <https://www.rfc-editor.org/info/rfc4353>.

1812	   [RFC4575]  Rosenberg, J., Schulzrinne, H., and O. Levin, Ed., "A
1813	              Session Initiation Protocol (SIP) Event Package for
1814	              Conference State", RFC 4575, DOI 10.17487/RFC4575, August
1815	              2006, <https://www.rfc-editor.org/info/rfc4575>.

1817	   [RFC4579]  Johnston, A. and O. Levin, "Session Initiation Protocol
1818	              (SIP) Call Control - Conferencing for User Agents",
1819	              BCP 119, RFC 4579, DOI 10.17487/RFC4579, August 2006,
1820	              <https://www.rfc-editor.org/info/rfc4579>.

1822	   [RFC4597]  Even, R. and N. Ismail, "Conferencing Scenarios",
1823	              RFC 4597, DOI 10.17487/RFC4597, August 2006,
1824	              <https://www.rfc-editor.org/info/rfc4597>.

1826	   [RFC5109]  Li, A., Ed., "RTP Payload Format for Generic Forward Error
1827	              Correction", RFC 5109, DOI 10.17487/RFC5109, December
1828	              2007, <https://www.rfc-editor.org/info/rfc5109>.

1830	   [RFC5194]  van Wijk, A., Ed. and G. Gybels, Ed., "Framework for Real-
1831	              Time Text over IP Using the Session Initiation Protocol
1832	              (SIP)", RFC 5194, DOI 10.17487/RFC5194, June 2008,
1833	              <https://www.rfc-editor.org/info/rfc5194>.

1835	   [RFC6443]  Rosen, B., Schulzrinne, H., Polk, J., and A. Newton,
1836	              "Framework for Emergency Calling Using Internet
1837	              Multimedia", RFC 6443, DOI 10.17487/RFC6443, December
1838	              2011, <https://www.rfc-editor.org/info/rfc6443>.

1840	   [RFC6881]  Rosen, B. and J. Polk, "Best Current Practice for
1841	              Communications Services in Support of Emergency Calling",
1842	              BCP 181, RFC 6881, DOI 10.17487/RFC6881, March 2013,
1843	              <https://www.rfc-editor.org/info/rfc6881>.

1845	   [RFC7667]  Westerlund, M. and S. Wenger, "RTP Topologies", RFC 7667,
1846	              DOI 10.17487/RFC7667, November 2015,
1847	              <https://www.rfc-editor.org/info/rfc7667>.

1849	   [T140]     ITU-T, "Recommendation ITU-T T.140 (02/1998), Protocol for
1850	              multimedia application text conversation", February 1998,
1851	              <https://www.itu.int/rec/T-REC-T.140-199802-I/en>.

1853	   [T140ad1]  ITU-T, "Recommendation ITU-T.140 Addendum 1 - (02/2000),
1854	              Protocol for multimedia application text conversation",
1855	              February 2000,
1856	              <https://www.itu.int/rec/T-REC-T.140-200002-I!Add1/en>.

1858	   [TS103479] ETSI, "TS 103 479. Emergency communications (EMTEL); Core
1859	              elements for network independent access to emergency
1860	              services", December 2019, <https://www.etsi.org/deliver/
1861	              etsi_ts/103400_103499/103479/01.01.01_60/
1862	              ts_103479v010101p.pdf>.

1864	   [TS22173]  3GPP, "IP Multimedia Core Network Subsystem (IMS)
1865	              Multimedia Telephony Service and supplementary services;
1866	              Stage 1", 3GPP TS 22.173 17.1.0, 20 December 2019,
1867	              <http://www.3gpp.org/ftp/Specs/html-info/22173.htm>.

1869	   [TS24147]  3GPP, "Conferencing using the IP Multimedia (IM) Core
1870	              Network (CN) subsystem; Stage 3", 3GPP TS 24.147 16.0.0,
1871	              19 December 2019,
1872	              <http://www.3gpp.org/ftp/Specs/html-info/24147.htm>.

1874	Author's Address

1876	   Gunnar Hellstrom
1877	   Gunnar Hellstrom Accessible Communication
1878	   Esplanaden 30
1879	   SE-136 70 Vendelso
1880	   Sweden

1882	   Phone: +46 708 204 288
1883	   Email: gunnar.hellstrom@ghaccess.se