idnits 2.17.1 

draft-ietf-avtcore-multi-party-rtt-mix-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([RFC4102], [RFC4103]), which
     it shouldn't.  Please replace those with straight textual mentions of the
     documents in question.

  == The 'Updates: ' line in the draft header should list only the _numbers_
     of the RFCs which will be updated by this document (if approved); it
     should not include the word 'RFC' in the list.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 647 has weird spacing: '...example  from ...'

  == Line 1368 has weird spacing: '...example  from ...'

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'MUST not' in this paragraph:
     
     A party not performing as a mixer MUST not include the CSRC list.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'MUST not' in this paragraph:
     
     A party not performing as a mixer MUST not include the CSRC list if
     it has a single source of text.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'SHOULD not' in this paragraph:
     
     BEL 0007 Bell  Alert in session, provides for alerting during an
     active session.  The display count SHOULD not be altered.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'SHOULD not' in this paragraph:
     
     INT ESC 0061  Interrupt (used to initiate mode negotiation
     procedure).  The display count SHOULD not be altered.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'SHOULD not' in this paragraph:
     
     SGR 009B Ps 006D  Select graphic rendition.  Ps is rendition
     parameters specified in ISO 6429.  The display count SHOULD not be
     altered.  The SGR code SHOULD be stored for the current source.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'SHOULD not' in this paragraph:
     
     SOS 0098  Start of string, used as a general protocol element
     introducer, followed by a maximum 256 bytes string and the ST. The
     display count SHOULD not be altered.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'SHOULD not' in this paragraph:
     
     ST 009C  String terminator, end of SOS string.  The display count
     SHOULD not be altered.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'SHOULD not' in this paragraph:
     
     ESC 001B  Escape - used in control strings.  The display count
     SHOULD not be altered for the complete escape code.

     (Using the creation date from RFC4102, updated by this document, for
     RFC5378 checks: 2003-12-18)

     (Using the creation date from RFC4103, updated by this document, for
     RFC5378 checks: 2003-11-21)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (12 July 2020) is 1384 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'Bob' is mentioned on line 1975, but not defined

  ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866)

  ** Downref: Normative reference to an Informational RFC: RFC 8643

  -- Possible downref: Non-RFC (?) normative reference: ref. 'T140'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'T140ad1'


     Summary: 3 errors (**), 0 flaws (~~), 13 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	AVTCore                                                     G. Hellstrom
3	Internet-Draft                 Gunnar Hellstrom Accessible Communication
4	Updates: 4102, 4103 (if approved)                           12 July 2020
5	Intended status: Standards Track
6	Expires: 13 January 2021

8	           RTP-mixer formatting of multi-party Real-time text
9	               draft-ietf-avtcore-multi-party-rtt-mix-07

11	Abstract

13	   Real-time text mixers for multi-party sessions need to identify the
14	   source of each transmitted group of text so that the text can be
15	   presented by endpoints in suitable grouping with other text from the
16	   same source.

18	   Regional regulatory requirements specify provision of real-time text
19	   in multi-party calls.  RFC 4103 mixer implementations can use
20	   traditional RTP functions for source identification, but the mixer
21	   source switching performance is limited when using the default
22	   transmission with redundancy.

24	   Enhancements for RFC 4103 real-time text mixing are provided in this
25	   document, suitable for a centralized conference model that enables
26	   source identification and source switching.  The intended use is for
27	   real-time text mixers and multi-party-aware participant endpoints.
28	   Two mechanisms are provided.  The mechanisms build on use of the
29	   CSRC list in the RTP packet for source identification.  One method
30	   makes use of the same "text/red" format as for two-party sessions,
31	   while the other makes use of an extended packet format "text/rex" for
32	   more efficient transmission.

34	   A capability exchange is specified so that it can be verified that a
35	   participant can handle the multi-party coded real-time text stream.
36	   The capability for one method is indicated by the SDP media attribute
37	   "rtt-mix-rtp-mixer".  The other method is indicated by the media
38	   subtype "text/rex".

40	   The document updates RFC 4102 and RFC 4103.

42	   A brief description of how a mixer can format text for the case
43	   when the endpoint is not multi-party aware is also provided.

45	Status of This Memo

47	   This Internet-Draft is submitted in full conformance with the
48	   provisions of BCP 78 and BCP 79.

50	   Internet-Drafts are working documents of the Internet Engineering
51	   Task Force (IETF).  Note that other groups may also distribute
52	   working documents as Internet-Drafts.  The list of current Internet-
53	   Drafts is at https://datatracker.ietf.org/drafts/current/.

55	   Internet-Drafts are draft documents valid for a maximum of six months
56	   and may be updated, replaced, or obsoleted by other documents at any
57	   time.  It is inappropriate to use Internet-Drafts as reference
58	   material or to cite them other than as "work in progress."

60	   This Internet-Draft will expire on 13 January 2021.

62	Copyright Notice

64	   Copyright (c) 2020 IETF Trust and the persons identified as the
65	   document authors.  All rights reserved.

67	   This document is subject to BCP 78 and the IETF Trust's Legal
68	   Provisions Relating to IETF Documents (https://trustee.ietf.org/
69	   license-info) in effect on the date of publication of this document.
70	   Please review these documents carefully, as they describe your rights
71	   and restrictions with respect to this document.  Code Components
72	   extracted from this document must include Simplified BSD License text
73	   as described in Section 4.e of the Trust Legal Provisions and are
74	   provided without warranty as described in the Simplified BSD License.

76	Table of Contents

78	   1.  Introduction
79	     1.1.  Selected solution and considered alternative
80	     1.2.  Nomenclature
81	     1.3.  Intended application
82	   2.  Specified solutions
83	     2.1.  Negotiated use of the RFC 4103 format for multi-party in a
84	           single RTP stream
85	     2.2.  Use of an extended packet format "text/rex" with text from
86	           multiple sources
87	     2.3.  Mixing for multi-party unaware endpoints
88	   3.  Presentation level considerations
89	     3.1.  Presentation by multi-party aware endpoints
90	     3.2.  Multi-party mixing for multi-party unaware endpoints
91	   4.  Gateway Considerations
92	     4.1.  Gateway considerations with Textphones (e.g.  TTYs).
93	     4.2.  Gateway considerations with WebRTC.
94	   5.  Updates to RFC 4102 and RFC 4103
95	   6.  Congestion considerations
96	   7.  Acknowledgements
97	   8.  IANA Considerations
98	     8.1.  Registration of the "rtt-mix-rtp-mixer" sdp media
99	           attribute
100	     8.2.  Registration of "text/rex" media subtype
101	   9.  Security Considerations
102	   10. Change history
103	     10.1.  Changes included in
104	             draft-ietf-avtcore-multi-party-rtt-mix-07
105	     10.2.  Changes included in
106	             draft-ietf-avtcore-multi-party-rtt-mix-06
107	     10.3.  Changes included in
108	             draft-ietf-avtcore-multi-party-rtt-mix-05
109	     10.4.  Changes included in
110	             draft-ietf-avtcore-multi-party-rtt-mix-04
111	     10.5.  Changes included in
112	             draft-ietf-avtcore-multi-party-rtt-mix-03
113	     10.6.  Changes included in
114	             draft-ietf-avtcore-multi-party-rtt-mix-02
115	     10.7.  Changes to draft-ietf-avtcore-multi-party-rtt-mix-01
116	     10.8.  Changes from
117	             draft-hellstrom-avtcore-multi-party-rtt-source-03 to
118	             draft-ietf-avtcore-multi-party-rtt-mix-00
119	     10.9.  Changes from
120	             draft-hellstrom-avtcore-multi-party-rtt-source-02 to
121	             -03
122	     10.10. Changes from
123	             draft-hellstrom-avtcore-multi-party-rtt-source-01 to
124	             -02
125	     10.11. Changes from
126	             draft-hellstrom-avtcore-multi-party-rtt-source-00 to
127	             -01
128	   11. References
129	     11.1.  Normative References
130	     11.2.  Informative References
131	   Author's Address

133	1.  Introduction

135	   RFC 4103 [RFC4103] specifies use of RFC 3550 RTP [RFC3550] for
136	   transmission of real-time text (RTT) and the "text/t140" format.  It
137	   also specifies a redundancy format "text/red" for increased
138	   robustness.  RFC 4102 [RFC4102] registers the "text/red" format.
139	   Regional regulatory requirements specify provision of real-time text
140	   in multi-party calls.

142	   Real-time text is usually provided together with audio and sometimes
143	   with video in conversational sessions.

145	   The redundancy scheme of RFC 4103 [RFC4103] enables efficient
146	   transmission of redundant text in packets together with new text.
147	   However, the redundancy header format has no source indicators for the
148	   redundant transmissions.  An assumption has had to be made that the
149	   redundant parts in a packet are from the same source as the new text.
150	   The recommended transmission is one new and two redundant generations
151	   of text (T140blocks) in each packet and the recommended transmission
152	   interval is 300 ms.

154	   A mixer, selecting between text input from different sources and
155	   transmitting it in a common stream needs to make sure that the
156	   receiver can assign the received text to the proper sources for
157	   presentation.  Therefore, using RFC 4103 without any extra rule for
158	   source identification, the mixer needs to stop sending new text from
159	   one source and then make sure that all text so far has been sent with
160	   all intended redundancy levels (usually two) before switching to
161	   another source.  That results in a delay of about one second when
162	   switching from transmission of text from one source to text from
163	   another source with the default transmission interval of 300 ms.
164	   Both the total throughput and the switching performance in the mixer
165	   would be too low for most applications.  However, by shortening the
166	   transmission interval to 100 ms, good performance is achieved for up
167	   to 3 simultaneously sending sources and usable performance for up to
168	   5 simultaneously sending sources.  This method is negotiated through
169	   an SDP media attribute "rtt-mix-rtp-mixer".
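The switching times mentioned above can be reproduced with a simplified model. The sketch below assumes that a source occupies the stream for one packet carrying its text as primary plus one further packet per redundant generation, each spaced by the transmission interval. The function name is hypothetical and not part of this specification.

```python
def source_switch_time_ms(interval_ms: int, redundant_generations: int = 2) -> int:
    """Time the mixer spends on one source before it may switch to another.

    Simplified model (an assumption): one packet carries the text as
    primary, then one more packet is needed per redundant generation.
    """
    packets_needed = 1 + redundant_generations
    return packets_needed * interval_ms


print(source_switch_time_ms(300))  # 900, i.e. roughly one second
print(source_switch_time_ms(100))  # 300
```

With the default of two redundant generations, the 300 ms interval gives about one second per source switch, while the 100 ms interval brings it down to 300 ms.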

171	   A more efficient source identification scheme requires that each
172	   redundant T140block has its source individually preserved.  This
173	   document introduces a source indicator by specific rules for
174	   populating the CSRC-list and the data header in the RTP-packet.

176	   An extended packet format "text/rex" is specified for this purpose,
177	   providing the possibility to include text from up to 15 sources in
178	   each packet in order to enhance mixer source switching performance.
179	   With these extensions, the "text/rex" solution in this document
180	   exceeds the performance requirements on multi-party mixing for
181	   real-time text.

183	   A negotiation mechanism can therefore be based on selection of the
184	   "text/red" with media attribute "rtt-mix-rtp-mixer" or the "text/rex"
185	   media format for verification that the parties are able to handle a
186	   multi-party coded stream and agreeing on which method to use.

188	   A fall-back mixing procedure is specified for cases when the
189	   negotiation results in "text/red" without the "rtt-mix-rtp-mixer"
190	   attribute being the only common submedia format.

192	   The document updates RFC 4102 [RFC4102] and RFC 4103 [RFC4103] by
193	   introducing an attribute for indicating multi-party capability, an
194	   extended packet format for the multi-party mixing case, and stricter
195	   rules for the source indications.

197	1.1.  Selected solution and considered alternative

199	   A number of alternatives were considered when searching for an
200	   efficient multi-party method for real-time text.  This section
201	   explains a few of them briefly.

203	   One RTP stream per source, sent in the same RTP session with
204	   "text/red" format.  From some points of view, use of multiple RTP
205	      streams, one for each source, sent in the same RTP session, called
206	      the RTP translator model in RFC 3550 [RFC3550], would be
207	      efficient, and use exactly the same packet format as RFC 4103, the
208	      same payload type and a simple SDP declaration.  However, there is
209	      currently a lack of support for multi-stream RTP in certain
210	      implementation technologies.  The multi-stream solution would also
211	      cause more overhead than the single RTP stream solution "text/rex"
212	      specified in this document, increasingly so the more participants
213	      that send simultaneously.

215	   The "text/red" format in RFC 4103 with shorter transmission
216	   interval, and indicating source in CSRC.  The "text/red" format with
217	      "text/t140" payload in a single RTP stream can be sent with 100 ms
218	      packet intervals instead of the regular 300 ms.  The source is
219	      indicated in the CSRC field.  Source switching can then be done
220	      every 300 ms while simultaneous transmission occurs.  With two
221	      participants sending text simultaneously, the switching and
222	      transmission performance is good.  With three or more
223	      simultaneously sending participants, there will be a noticeable
224	      jerkiness in text presentation, growing with the number of
225	      simultaneous senders.  With three sending participants, the
226	      jerkiness will be about 450 ms, and with five, about 1350 ms.
227	      Text sent from a source at the end of the period its text is sent
228	      by the mixer will have close to zero extra delay.  Recent text
229	      will be presented with no or low delay.  The 1350 ms jerkiness
230	      will be noticeable and slightly unpleasant, but corresponds in time
231	      to what typing humans often cause by hesitation or changing
232	      position while typing.  A benefit of this method is that no new
233	      packet format needs to be introduced and implemented.  Since
234	      simultaneous typing by more than two parties is rare, and calls
235	      with more than three parties are also rare in many applications,
236	      this method can be used successfully without its limitations
237	      becoming annoying.  Negotiation is based on a new SDP media
238	      attribute "rtt-mix-rtp-mixer".

240	   The "text/rex" packet format with up to 15 sources in one packet.
241	      The mechanism called "text/rex" specified in this document makes
242	      use of the RTP mixer model specified in RFC 3550 [RFC3550].  Text
243	      from up to 15 sources can be included in each packet.  Packets are
244	      normally sent every 300 ms.  The mean delay will be 150 ms.  The
245	      sources are indicated in the CSRC list of the RTP packets.  A new
246	      redundancy packet format is specified, named "text/rex".

248	   The presentation planned by the mixer for multi-party unaware
249	   endpoints.  It is desirable to have a method that does not require
250	      any modifications in existing user devices implementing RFC 4103
251	      for RTT without explicit support of multi-party sessions.  This is
252	      possible by having the mixer insert a new line and a text
253	      formatted source label before each switch of text source in the
254	      stream.  Switch of source can only be done in places in the text
255	      where it does not disturb the perception of the contents.  Text
256	      from only one source can be presented in real time at a time.  The
257	      delay will therefore vary.  The method also has other
258	      limitations, but is included in this document as a fallback
259	      method.  In calls where parties take turns properly by ending
260	      their entries with a new line, the limitations will have limited
261	      influence on the user experience.  While only two parties send
262	      text, these two will see the text in real time with no delay.

264	1.2.  Nomenclature

266	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
267	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
268	   document are to be interpreted as described in [RFC2119].

270	   The terms SDES, CNAME, NAME, SSRC, CSRC, CSRC list, CC, RTCP,
271	   RTP-mixer, and RTP-translator are explained in [RFC3550].

273	   The term "T140block" is defined in RFC 4103 [RFC4103] to contain one
274	   or more T.140 code elements.

276	   "TTY" stands for a text telephone type used in North America.

278	   "WebRTC" stands for web based communication specified by W3C and
279	   IETF.

281	   "DTLS-SRTP" stands for security specified in RFC 5764 [RFC5764].

283	1.3.  Intended application

285	   The methods for multi-party real-time text are primarily intended for
286	   use in transmission between mixers and endpoints in centralised
287	   mixing configurations.  They are also applicable between endpoints as
288	   well as between mixers.  An often mentioned application is for
289	   emergency service calls with real-time text and voice, where a
290	   calltaker wants to make an attended handover of a call to another
291	   agent, and stay observing in the session.  Multimedia conference
292	   sessions with support for participants to contribute in text are
293	   another application.  Conferences with central support for speech-to-
294	   text conversion are yet another mentioned application.

296	   In all these applications, normally only one participant at a time
297	   will send long text utterances.  In some cases, one other participant
298	   will occasionally contribute with a longer comment simultaneously.
299	   That may also happen in some rare cases when text is interpreted to
300	   text in another language in a conference.  Apart from these cases,
301	   other participants are only expected to contribute with very brief
302	   utterances while others are sending text.

304	   Text is supposed to be human generated, by some text input means,
305	   such as typing on a keyboard or using speech-to-text technology.
306	   Occasional small cut-and-paste operations may appear even if that is
307	   not the initial purpose of real-time text.

309	   The real-time characteristics of real-time text are essential for the
310	   participants to be able to contribute to a conversation.  If the
311	   delay from typing a letter to its presentation is too long, then,
312	   in some conference situations, the opportunity to comment will be
313	   gone and someone else will grab the turn.  A delay of more than one
314	   second in such situations is an obstacle for good conversation.

316	2.  Specified solutions

318	2.1.  Negotiated use of the RFC 4103 format for multi-party in a single
319	      RTP stream

321	   This section specifies use of the current format specified in
322	   [RFC4103] for true multi-party real-time text.  It updates RFC 4103
323	   by clarifying one way to use it in the multi-party situation.  This
324	   is done by completing a negotiation for this kind of multi-party
325	   capability and by indicating the source in the CSRC element in the
326	   RTP packets.  Please use [RFC4103] as reference when reading the
327	   following description.

329	2.1.1.  Negotiation for use of this method

331	   RFC 4103 [RFC4103] specifies use of RFC 3550 RTP [RFC3550], and a
332	   redundancy format "text/red" for increased robustness of real-time
333	   text transmission.  This document updates RFC 4102 [RFC4102] and RFC
334	   4103 [RFC4103] by introducing a capability negotiation for handling
335	   multi-party real-time text.  The capability negotiation is based on
336	   use of the SDP media attribute "rtt-mix-rtp-mixer".

338	   The syntax is as follows:
339	      a=rtt-mix-rtp-mixer

341	   A transmitting party SHALL send text according to the multi-party
342	   format only when the negotiation for this method was successful and
343	   when the CC field in the RTP packet is 1.  In all other cases, the
344	   packets SHALL be populated as for a two-party session.
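As an illustration of the negotiation check, the sketch below scans an SDP body for the attribute inside a text media section. The helper names and the example SDP (payload type numbers and port) are assumptions made for illustration, and treating the negotiation as successful only when both the offer and the answer carry the attribute is likewise an assumption of this sketch.

```python
MIXER_ATTRIBUTE = "a=rtt-mix-rtp-mixer"


def text_media_has_mixer_attribute(sdp: str) -> bool:
    """Return True if an m=text section of the SDP carries the attribute."""
    in_text_section = False
    for raw_line in sdp.splitlines():
        line = raw_line.strip()
        if line.startswith("m="):
            # A new media section starts; remember whether it is a text section.
            in_text_section = line.startswith("m=text ")
        elif in_text_section and line == MIXER_ATTRIBUTE:
            return True
    return False


def negotiation_successful(offer: str, answer: str) -> bool:
    # Assumption: success means both offer and answer carry the attribute.
    return (text_media_has_mixer_attribute(offer)
            and text_media_has_mixer_attribute(answer))


# Illustrative media section; port and payload type numbers are made up.
EXAMPLE_SDP = "\r\n".join([
    "m=text 11000 RTP/AVP 100 98",
    "a=rtpmap:98 t140/1000",
    "a=rtpmap:100 red/1000",
    "a=fmtp:100 98/98/98",
    "a=rtt-mix-rtp-mixer",
])
```

A transmitter would fall back to two-party packet population whenever `negotiation_successful` returns False.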

346	2.1.2.  Use of fields in the RTP packets

348	   The CC field SHALL show the number of members in the CSRC list, which
349	   is one (1) in transmissions from a mixer involved in a multi-party
350	   session, and otherwise 0.

352	   When transmitted from a mixer during a multi-party session, a CSRC
353	   list is included in the packet.  The single member in the CSRC-list
354	   SHALL contain the SSRC of the source of the T140blocks in the packet.
355	   When redundancy is used, the recommended level of redundancy is to
356	   use one primary and two redundant generations of T140blocks.  In some
357	   cases, a primary or redundant T140block is empty, but is still
358	   represented by a member in the redundancy header.

360	   In other respects, the contents of the RTP packets are the same as
361	   what is specified in RFC 4103.
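A minimal sketch of how a sender might lay out the RTP header with the CC field and the single-member CSRC list, following the fixed header layout of RFC 3550. The function name and parameters are hypothetical; padding and header extensions are left out.

```python
import struct
from typing import Optional


def build_rtp_header(payload_type: int, seq: int, timestamp: int,
                     sender_ssrc: int, source_ssrc: Optional[int] = None,
                     marker: bool = False) -> bytes:
    """Pack an RTP header whose CSRC list has zero or one member.

    source_ssrc is the SSRC of the original text source; a mixer sets it,
    while a party not acting as a mixer leaves it as None (CC = 0).
    """
    csrc_list = [] if source_ssrc is None else [source_ssrc]
    byte0 = (2 << 6) | len(csrc_list)  # V=2, P=0, X=0, CC in the low 4 bits
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, sender_ssrc)
    for csrc in csrc_list:
        header += struct.pack("!I", csrc)
    return header
```

A receiver can read the CC value from the low four bits of the first byte and, when it is 1, take the 32-bit word following the SSRC as the source of the T140blocks.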

363	2.1.3.  Transmission of multi-party contents

365	   As soon as a participant is known to participate in a session and
366	   to be available for text reception, a Unicode BOM character SHALL be
367	   sent to it according to the procedures in this section.  If the
368	   transmitter is a mixer, then the source of this character SHALL be
369	   indicated to be the mixer itself.

371	2.1.4.  Keep-alive

373	   After that, the transmitter SHALL send keep-alive traffic to the
374	   receivers at regular intervals when no other traffic has occurred
375	   during that interval, if that is decided for the actual connection.
376	   Recommendations for keep-alive can be found in RFC 6263 [RFC6263].

378	2.1.5.  Transmission interval

380	   A "text/red" transmitter SHOULD send packets distributed in time as
381	   long as there is something (new or redundant T140blocks) to transmit.
382	   The maximum transmission interval SHOULD then be 300 ms.  It is
383	   RECOMMENDED to send the next packet to a receiver as soon as new
384	   text to that receiver is available, as long as the time after the
385	   latest sent packet to the same receiver is more than or equal to
386	   100 ms, and the maximum character rate to the receiver is not
387	   exceeded.  The intention is to keep the latency low while keeping
388	   good protection against text loss in bursty packet loss conditions.
389	   New and redundant text from one source MAY be transmitted in the same
390	   packet.  Text from different sources MUST NOT be transmitted in the
391	   same packet.
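The timing rule above can be sketched as a small decision function. This is one possible reading, not a normative algorithm: the function and constant names are hypothetical, and the check against the maximum character rate is deliberately left out.

```python
MIN_INTERVAL_MS = 100  # earliest time a new packet may follow the previous one
MAX_INTERVAL_MS = 300  # latest time while something remains to be sent


def should_send_now(now_ms: int, last_sent_ms: int,
                    has_new_text: bool, has_pending_redundancy: bool) -> bool:
    """Decide whether a packet should be sent to one receiver right now."""
    elapsed = now_ms - last_sent_ms
    if has_new_text and elapsed >= MIN_INTERVAL_MS:
        # New text goes out as soon as the minimum spacing is respected.
        return True
    if has_pending_redundancy and elapsed >= MAX_INTERVAL_MS:
        # Only redundant generations remain; send at the maximum interval.
        return True
    return False
```

When neither new text nor unsent redundant generations exist, the function returns False and transmission pauses until new text arrives or a keep-alive is due.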

393	2.1.6.  Do not send received text to the originating source

395	   Text received from a participant SHOULD NOT be included in
396	   transmission to that participant.

398	2.1.7.  Clean incoming text

400	   A mixer SHALL handle reception and recovery of packet loss, marking
401	   of possible text loss and deletion of 'BOM' characters from each
402	   participant before queueing received text for transmission to
403	   receiving participants.

405	2.1.8.  Redundancy

407	   The transmitting party using redundancy SHALL send redundant
408	   repetitions of T140blocks already transmitted in earlier packets.  The
409	   number of redundant generations of T140blocks to include in
410	   transmitted packets SHALL be deduced from the SDP negotiation.  It
411	   SHOULD be set to the minimum of the number declared by the two
412	   parties negotiating a connection.
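As a hedged illustration, the number of redundant generations can be read from the "red" fmtp parameter, where a value such as 98/98/98 lists the primary followed by two redundant generations. The helper names below are hypothetical, and the sketch assumes each party's fmtp value has already been extracted from its SDP.

```python
def redundant_generations_from_fmtp(fmtp_value: str) -> int:
    """Count redundant generations in a value like '98/98/98' (gives 2)."""
    entries = [part for part in fmtp_value.strip().split("/") if part]
    # The first entry is the primary; every further entry is one redundant generation.
    return max(len(entries) - 1, 0)


def negotiated_redundant_generations(local_fmtp: str, remote_fmtp: str) -> int:
    """Use the minimum of what the two parties declared."""
    return min(redundant_generations_from_fmtp(local_fmtp),
               redundant_generations_from_fmtp(remote_fmtp))


print(negotiated_redundant_generations("98/98/98", "98/98"))  # 1
```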

414	2.1.9.  Text placement in packets

416	   At time of transmission, the mixer SHALL populate the RTP packet with
417	   all T140blocks queued for transmission originating from the source in
418	   turn for transmission as long as this is not in conflict with the
419	   allowed number of characters per second or the maximum packet size.
420	   The SSRC of the source shall be placed as the member in the CSRC-
421	   list.

423	2.1.10.  Maximum number of sources per packet

425	   When text from more than one source is available for transmission,
426	   the mixer SHALL let the sources take turns in having their text
427	   transmitted.  When switching from transmission of one source to allow
428	   another source to have its text sent, all intended redundant
429	   generations of the last text from the current source MUST be
430	   transmitted before text from another source can be transmitted.
431	   Actively transmitting sources SHOULD be allowed to take turns as
432	   frequently as possible to have their text transmitted.  That implies
433	   that with the recommended redundancy, the mixer SHALL send primary
434	   text and two packets with redundant text from the current source
435	   before text from another source is transmitted.  The source with the
436	   oldest received text in the mixer SHOULD be next in turn to get all
437	   its available text transmitted.
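The turn-taking rule can be sketched as follows. The data structure and names (`SourceQueue`, `redundancy_left`, `next_source`) are hypothetical; the sketch only decides which source the next packet belongs to, under the assumption that the mixer tracks how many redundant transmissions are still owed for the current source.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class SourceQueue:
    # (arrival time in the mixer, text) for text not yet sent as primary
    pending: List[Tuple[float, str]] = field(default_factory=list)
    # redundant transmissions still owed for text already sent as primary
    redundancy_left: int = 0


def next_source(queues: Dict[int, SourceQueue],
                current: Optional[int]) -> Optional[int]:
    """Pick the SSRC whose text goes into the next packet, or None."""
    if current is not None and queues[current].redundancy_left > 0:
        # All intended redundant generations must be sent before switching.
        return current
    waiting = {ssrc: q.pending[0][0] for ssrc, q in queues.items() if q.pending}
    if not waiting:
        return None
    # The source with the oldest received text is next in turn.
    return min(waiting, key=lambda ssrc: waiting[ssrc])
```

With two redundant generations, `redundancy_left` would be set to 2 after a primary transmission and decremented for each following packet from that source.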

439	   Note: The CSRC-list in an RTP packet only includes the participant
440	   whose text is included in text blocks.  It is not the same as the
441	   total list of participants in a conference.  With audio and video
442	   media, the CSRC-list would often contain all participants who are not
443	   muted whereas text participants that don't type are completely silent
444	   and thus are not represented in RTP packet CSRC-lists once their text
445	   has been transmitted as primary and the intended number of redundant
446	   generations.

448	2.1.11.  Empty T140blocks

450	   If no unsent T140blocks were available for a source at the time of
451	   populating a packet, but T140blocks are available which have not yet
452	   been sent the full intended number of redundant transmissions, then
453	   the primary T140block for that source is composed of an empty
454	   T140block, and populated (without taking up any length) in a packet
455	   for transmission.  The corresponding SSRC SHALL be placed as usual in
456	   its place in the CSRC-list.

458	2.1.12.  Creation of the redundancy

460	   The primary T140block from a source in the latest transmitted packet
461	   is used to populate the first redundant T140block for that source.
462	   The first redundant T140block for that source from the latest
463	   transmission is placed as the second redundant T140block.

465	   Usually this is the level of redundancy used.  If a higher level of
466	   redundancy is negotiated, then the procedure SHALL be maintained
467	   until all available redundant levels of T140blocks are placed in the
468	   packet.  If a receiver has negotiated a lower number of "text/red"
469	   generations, then that level shall be the maximum used by the
470	   transmitter.
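
The shifting of generations described above can be sketched as follows.
This is a minimal illustration, not part of the specification; the
function name and the list layout (highest redundant generation first,
primary last) are assumptions made for the example.

```python
from typing import List

EMPTY = b""  # an empty T140block occupies no payload bytes


def next_generations(previous: List[bytes], new_primary: bytes) -> List[bytes]:
    """Return [R_n, ..., R2, R1, P] for the next packet from one source.

    `previous` holds the blocks of the latest transmitted packet in the
    same order: highest redundant generation first, primary last.
    """
    # Every block ages by one generation and the oldest one is dropped.
    return previous[1:] + [new_primary]


# Mirrors packets 1 and 2 of the packet sequence example in this document.
packet1 = [b"A1", b"A2", b"A3"]             # R2, R1, P
packet2 = next_generations(packet1, EMPTY)  # no new text from source A
```

With the recommended two redundant generations the list always has
three entries; a lower negotiated level would simply use a shorter list.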

472	2.1.13.  Timer offset fields

474	   The timestamp offset values are inserted in the data header, with the
475	   time offset from the RTP timestamp in the packet when the
476	   corresponding T140block was sent from its original source as primary.

478	   The timestamp offsets are expressed in the same clock tick units as
479	   the RTP timestamp.

481	   The timestamp offset values for empty T140blocks have no relevance
482	   but SHOULD be assigned realistic values.

484	2.1.14.  Other RTP header fields

486	   The number of members in the CSRC list (0 or 1) shall be placed in
487	   the "CC" header field.  Only mixers place value 1 in the "CC" field.

489	   The current time SHALL be inserted in the timestamp.

491	   The SSRC of the mixer for the RTT session SHALL be inserted in the
492	   SSRC field of the RTP header.

494	   The M-bit shall be handled as specified in [RFC4103].

496	2.1.15.  Pause in transmission

498	   When there is no new T140block to transmit, and no redundant
499	   T140block that has not been retransmitted the intended number of
500	   times from any source, the transmission process can stop until either
501	   new T140blocks arrive, or a keep-alive method calls for transmission
502	   of keep-alive packets.

504	2.1.16.  RTCP considerations

506	   A mixer SHALL send RTCP reports with SDES, CNAME and NAME information
507	   about the sources in the multi-party call.  This makes it possible
508	   for participants to compose a suitable label for text from each
509	   source.

511	2.1.17.  Reception of multi-party contents

513	   The "text/red" receiver included in an endpoint with presentation
514	   functions will receive RTP packets in the single stream from the
515	   mixer, and SHALL distribute the T140blocks for presentation in
516	   presentation areas for each source.  Other receiver roles, such as
517	   gateways or chained mixers, are also feasible, and require
518	   consideration of whether the stream shall just be forwarded, or
519	   distributed based on the different sources.

521	2.1.17.1.  Multi-party vs two-party use

523	   If the "CC" field value of a received packet is 1, it indicates that
524	   multi-party transmission is active, and the receiver MUST be prepared
525	   to act on the source according to its role.  If the CC value is 0,
526	   the connection is point-to-point.
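
As an illustration of the check above, the following sketch reads the
"CC" field and the CSRC list from a received packet.  It relies only on
the fixed RTP header layout of RFC 3550 (12 octets, CC in the low four
bits of the first octet); the function names are hypothetical.

```python
import struct
from typing import List, Tuple


def parse_cc_and_csrc(packet: bytes) -> Tuple[int, List[int]]:
    """Return (CC, CSRC list) from the start of an RTP packet."""
    if len(packet) < 12:
        raise ValueError("shorter than the fixed RTP header")
    cc = packet[0] & 0x0F          # low four bits of the first octet
    end = 12 + 4 * cc              # CSRC entries follow the 12-octet header
    if len(packet) < end:
        raise ValueError("CSRC list is truncated")
    csrc = list(struct.unpack("!%dI" % cc, packet[12:end]))
    return cc, csrc


def is_multi_party(packet: bytes) -> bool:
    """True if the packet carries a CSRC list, i.e. it comes from a mixer."""
    cc, _ = parse_cc_and_csrc(packet)
    return cc > 0
```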

528	2.1.17.2.  Level of redundancy

530	   The used level of redundancy generations SHALL be evaluated from the
531	   received packet contents.  The number of generations (including the
532	   primary) is equal to the number of members in the redundancy header.

534	2.1.17.3.  Extracting text and handling recovery and loss

536	   The RTP sequence numbers of the received packets SHALL be monitored
537	   for gaps and packets out of order.

539	   As long as the sequence is correct, each packet SHALL be unpacked in
540	   order.  The T140blocks SHALL be extracted from the primary area, and
541	   the corresponding SSRC SHALL be extracted from the CSRC list and used
542	   for assigning the new T140block to the correct presentation areas (or
543	   correspondingly for other receiver roles).

545	   If a sequence number gap appears and is still there after some
546	   defined time for jitter resolution, T140data SHALL be recovered from
547	   redundant data.  If the gap is wider than the number of generations
548	   of redundant T140blocks in the packet, then a T140block SHALL be
549	   created with a marker for possible text loss [T140ad1] and assigned
550	   to the SSRC of the transmitter as a general input from the mixer
551	   because in general it is not possible to deduce from which source(s)
552	   text was lost.  It is in some cases possible to deduce that no text
553	   was lost even for a gap wider than the redundancy generations, and in
554	   some cases it can be concluded which source likely had loss.
555	   Therefore, the receiver MAY insert the marker for possible text loss
556	   [T140ad1] in the presentation area corresponding to the source which
557	   may have had loss.

559	   Then, the T140block in the received packet SHALL be retrieved
560	   beginning with the highest redundant generation, and assigning it to
561	   the presentation area of that source.  Finally the primary T140block
562	   SHALL be retrieved from the packet and similarly assigned to the
563	   corresponding presentation area for the source.

565	   If the sequence number gap was equal to or less than the number of
566	   redundancy generations in the received packet, a missing text marker
567	   SHALL NOT be inserted, and instead the T140block and the SSRC are
568	   fully recovered from the redundancy information and the CSRC-list in
569	   the way indicated above.
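
The basic gap rule of this section can be sketched as below.  The
sketch is an assumption-laden simplification: it treats the gap as the
count of missing packets between two sequence numbers received in
order, does not model the jitter wait, and does not attempt the extra
deductions mentioned above that can sometimes show that nothing was
lost.  The function names are hypothetical.

```python
def missing_packets(last_seq: int, seq: int) -> int:
    """Number of packets missing between two received RTP sequence numbers."""
    return (seq - last_seq - 1) & 0xFFFF   # sequence numbers wrap at 2**16


def needs_loss_marker(last_seq: int, seq: int, redundant_generations: int) -> bool:
    """True if the gap is wider than the redundancy in the packet can cover."""
    return missing_packets(last_seq, seq) > redundant_generations
```

For example, with two redundant generations, receiving sequence number
5 after 2 leaves two missing packets, which the redundancy covers, so
no marker is called for by the basic rule.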

571	2.1.17.4.  Delete BOM

573	   Unicode character "BOM" is used as a start indication and sometimes
574	   used as a filler or keep alive by transmission implementations.
575	   These SHALL be deleted on reception.

577	2.1.17.5.  Empty T140blocks

579	   Empty T140blocks are included as fillers for unused redundancy levels
580	   in the packets.  They just do not provide any contents and do not
581	   contribute to the received streams.

583	2.1.18.  Performance considerations

585	   This solution has good performance up to three participants
586	   simultaneously sending text.  At higher numbers of participants
587	   simultaneously sending text, a jerkiness is visible in the
588	   presentation of text.  With five participants simultaneously
589	   transmitting text, the jerkiness is about 1400 ms.  Even so, the
590	   transmission of text catches up, so there is no resulting delay
591	   introduced.  The solution is therefore suitable for emergency service
592	   use, relay service use, and small or well-managed larger multimedia
593	   conferences.  It is only less suitable for large conferences with a
594	   high number of participants sending text simultaneously.  It should
595	   be noted that it is only the number of users sending text within the
596	   same moment that causes jerkiness, not the total number of users with
597	   RTT capability.

599	2.1.19.  Offer/answer considerations

601	   A party which has negotiated the "rtt-mix-rtp-mixer" SDP media
602	   attribute MUST populate the CSRC-list and format the packets
603	   according to this section if it acts as an RTP mixer and sends multi-
604	   party text.

606	   A party which has negotiated the "rtt-mix-rtp-mixer" SDP media
607	   attribute MUST interpret the contents of the CSRC-list and the
608	   packets according to this section in received RTP packets in the
609	   corresponding RTP stream.

611	   A party performing as a mixer, which has not negotiated the "rtt-mix-
612	   rtp-mixer" SDP media attribute, but negotiated a "text/red" or "text/
613	   t140" format in a session with a participant SHOULD, if nothing else
614	   is specified for the application, format transmitted text to that
615	   participant to be suitable to present on a multi-party unaware
616	   endpoint as further specified in Section 3.2.

618	   A party not performing as a mixer MUST NOT include the CSRC list.

620	2.1.20.  Security for session control and media

622	   Security SHOULD be applied on both session control and media.  In
623	   applications where legacy endpoints without security may exist, a
624	   negotiation between security and no security SHOULD be applied.  If
625	   no other security solution is mandated by the application, then RFC
626	   8643 OSRTP[RFC8643] SHOULD be applied to negotiate SRTP media
627	   security with DTLS.  Most SDP examples below are for simplicity
628	   expressed without the security additions.  The principles (but not
629	   all details) for applying DTLS-SRTP security are shown in a couple of
630	   the following examples.

632	2.1.21.  SDP offer/answer examples

634	   This section shows some examples of SDP for session negotiation of
635	   the real-time text media in SIP sessions.  Audio is usually provided
636	   in the same session, and sometimes also video.  The examples only
637	   show the part of importance for the real-time text media.

639	     Offer example for "text/red" format and multi-party support:

641	           m=text 11000 RTP/AVP 100 98
642	           a=rtpmap:98 t140/1000
643	           a=rtpmap:100 red/1000
644	           a=fmtp:100 98/98/98
645	           a=rtt-mix-rtp-mixer

647	      Answer example from a multi-party capable device:
648	           m=text 11000 RTP/AVP 100 98
649	           a=rtpmap:98 t140/1000
650	           a=rtpmap:100 red/1000
651	           a=fmtp:100 98/98/98
652	           a=rtt-mix-rtp-mixer

654	      Offer example for "text/red" format including multi-party
655	      and security:
656	            a=fingerprint: SHA-1 \
657	            4A:AD:B9:B1:3F:82:18:3B:54:02:12:DF:3E:5D:49:6B:19:E5:7C:AB
658	            m=text 11000 RTP/AVP 100 98
659	            a=rtpmap:98 t140/1000
660	            a=rtpmap:100 red/1000
661	            a=fmtp:100 98/98/98
662	            a=rtt-mix-rtp-mixer

664	   The "Fingerprint" is sufficient to offer DTLS-SRTP, with the media
665	   line still indicating RTP/AVP.

667	       Answer example from a multi-party capable device with security:
668	            a=fingerprint: SHA-1 \
669	            FF:FF:FF:B1:3F:82:18:3B:54:02:12:DF:3E:5D:49:6B:19:E5:7C:AB
670	            m=text 11000 RTP/AVP 100 98
671	            a=rtpmap:98 t140/1000
672	            a=rtpmap:100 red/1000
673	            a=fmtp:100 98/98/98
674	            a=rtt-mix-rtp-mixer

676	   With the "fingerprint" the device acknowledges use of SRTP/DTLS.

678	     Answer example from a multi-party unaware device that also
679	     does not support security:

681	           m=text 12000 RTP/AVP 100 98
682	           a=rtpmap:98 t140/1000
683	           a=rtpmap:100 red/1000
684	           a=fmtp:100 98/98/98

686	2.1.22.  Packet sequence example

688	   This example shows a symbolic flow of packets from a mixer with loss
689	   and recovery.  A and B are sources of RTT.  P indicates primary data.
690	   R1 is first redundant generation data and R2 is second redundant
691	   generation data.  A1, B1, A2 etc are text chunks (T140blocks)
692	   received from the respective sources.  X indicates dropped packet
693	   between the mixer and a receiver.

695	     |----------------|
696	     |Seq no 1        |
697	     |CC=1            |
698	     |CSRC list A     |
699	     |R2: A1          |
700	     |R1: A2          |
701	     |P:  A3          |
702	     |----------------|

704	   Assuming that earlier packets (with text A1 and A2) were received in
705	   sequence, text A3 is received from packet 1 and assigned to reception
706	   area A.  The mixer is now assumed to have received text from source B
707	   and needs to prepare for sending that text.  First it must send the
708	   remaining redundant generations of text A2 and A3.

710	     |----------------|
711	     |Seq no 2        |
712	     |CC=1            |
713	     |CSRC list A     |
714	     |R2: A2          |
715	     |R1: A3          |
716	     |P: Empty        |
717	     |----------------|
718	     Nothing needs to be retrieved from this packet.

720	     X----------------|
721	     X Seq no 3       |
722	     X CC=1           |
723	     X CSRC list A    |
724	     X R2: A3         |
725	     X R1: Empty      |
726	     X P:  Empty      |
727	     X----------------|
728	     Packet 3 is assumed to be dropped due to network problems.

730	     X----------------|
731	     X Seq no 4       |
732	     X CC=1           |
733	     X CSRC list B    |
734	     X R2: Empty      |
735	     X R1: Empty      |
736	     X P:  B1         |
737	     X----------------|
738	     Packet 4 contains text from B, assumed dropped due to network
739	     problems.  The mixer is assumed to have received text from A, which
740	     is next in turn to send.  Sending of text from B must therefore be
741	     temporarily ended by sending redundancy twice.

743	     X----------------|
744	     X Seq no 5       |
745	     X CC=1           |
746	     X CSRC list B    |
747	     X R2: Empty      |
748	     X R1: B1         |
749	     X P:  Empty      |
750	     X----------------|
751	     Packet 5 is assumed to be dropped due to network problems.

753	     |----------------|
754	     |Seq no 6        |
755	     |CC=1            |
756	     |CSRC list B     |
757	     |R2: B1          |
758	     |R1: Empty       |
759	     |P:  Empty       |
760	     |----------------|

762	   Packet 6 is received.  The latest received sequence number was 2.
763	   Recovery is therefore tried for 3, 4 and 5.  There is no coverage for
764	   seq no 3.  But knowing that A3 must have been sent as R2 in packet 3,
765	   it can be concluded that nothing was lost.

767	   For seqno 4, text B1 is recovered from the second generation
768	   redundancy and appended to the reception area of B.  For seqno 5,
769	   nothing needs to be recovered.  No primary text is available in
770	   packet 6.

772	   After this sequence, A3 and B1 have been received.  In this case no
773	   text was lost.  Even if packet 2 had also been lost, it could be
774	   concluded that no text was lost.

776	   If packets 1 and 2 had also been lost, there would be a need to
777	   create a marker for possibly lost text (U+FFFD) [T140ad1], inserted
778	   generally and possibly also in text sequences A and B.

780	2.2.  Use of an extended packet format "text/rex" with text from
781	      multiple sources

783	   The method specified in this section called "text/rex" has higher
784	   performance than the previous method.  Text from up to 15 sources can
785	   be included in each packet.  This may be of value in large non-
786	   managed conferences.

788	2.2.1.  Use of fields in the RTP packets

790	   RFC 4103[RFC4103] specifies use of RFC 3550 RTP[RFC3550], and a
791	   redundancy format "text/red" for increased robustness of real-time
792	   text transmission.  This document updates RFC 4102[RFC4102] and RFC
793	   4103[RFC4103] by introducing a format "text/rex" with a rule for
794	   populating and using the CSRC-list in the RTP packet and extending
795	   the redundancy header to be called a data header.  This is done in
796	   order to enhance the performance in multi-party RTT sessions.

798	   The "text/rex" format can be seen as an "n-tuple" variant of the
799	   "text/red" format intended to carry text information from up to 15
800	   sources per packet.

802	   The CC field SHALL show the number of members in the CSRC list, which
803	   is one per source represented in the packet.

805	   When transmitted from a mixer, a CSRC list is included in the packet.
806	   The members in the CSRC-list SHALL contain the SSRCs of the sources
807	   of the T140blocks in the packet.  The order of the CSRC members MUST
808	   be the same as the order of the sources of the data header fields and
809	   the T140blocks.  When redundancy is used, text from all included
810	   sources MUST have the same number of redundant generations.  The
811	   primary, first redundant, second redundant and possible further
812	   redundant generations of T140blocks MUST be grouped per source in the
813	   packet in "source groups".  The recommended level of redundancy is to
814	   use one primary and two redundant generations of T140blocks.  In some
815	   cases, a primary or redundant T140block is empty, but is still
816	   represented by a member in the data header.
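
The ordering rules above can be illustrated with the following sketch,
which flattens per-source groups into the CSRC order and block order
used in a packet.  The function name and the in-memory representation
(a list of SSRC and block-list pairs, each block list ordered from the
highest redundant generation to the primary) are assumptions for the
example only.

```python
from typing import List, Tuple

MAX_SOURCES = 15  # limited by the four-bit CC field


def layout_source_groups(
    groups: List[Tuple[int, List[bytes]]]
) -> Tuple[List[int], List[bytes]]:
    """Return (CSRC list, T140blocks in packet order) for the given groups."""
    if not 1 <= len(groups) <= MAX_SOURCES:
        raise ValueError("between 1 and 15 sources per packet")
    generations = len(groups[0][1])
    if any(len(blocks) != generations for _, blocks in groups):
        raise ValueError("all sources must use the same number of generations")
    csrc = [ssrc for ssrc, _ in groups]
    # Blocks stay grouped per source, in the same order as the CSRC list.
    ordered = [block for _, blocks in groups for block in blocks]
    return csrc, ordered
```

Empty T140blocks are represented here as empty byte strings; they still
occupy a position so that every source keeps the same number of
entries.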

818	   The RTP header is followed by one or more source groups of data
819	   headers: one header for each text block to be included.  Each of
820	   these data headers provides the timestamp offset and length of the
821	   corresponding data block, in addition to the payload type number
822	   corresponding to the payload format "text/t140".  The data headers
823	   are followed by the data fields carrying T140blocks from the sources.

825	    0                   1                   2                   3
826	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
827	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
828	   |F|   block PT  |  timestamp offset         |   block length    |
829	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

831	   Figure 1: The bits in the data header.

833	   The bits in the data header are specified as follows:

835	   F:  1 bit First bit in header indicates whether another header block
836	      follows.  It has value 1 if further header blocks follow, and
837	      value 0 if this is the last header block.

839	   block PT:  7 bits RTP payload type number for this block,
840	      corresponding to the t140 payload type from the RTPMAP SDP
841	      attribute.

843	   timestamp offset:  14 bits Unsigned offset of timestamp of this block
844	      relative to the timestamp given in the RTP header.  The offset is
845	      a time to be subtracted from the current timestamp to determine
846	      the timestamp of the data when the latest part of this block was
847	      sent from the original source.  If the timestamp offset would be
848	      >15 000, it SHALL be set to 15 000.  For redundant data, the
849	      resulting time is the time when the data was sent as primary from
850	      the original source.  If the value would be >15 000, then it SHALL
851	      be set to 15 000 plus 300 times the redundancy level of the data.
852	      The high values appear only in exceptional cases, e.g. when some
853	      data has been held in order to keep the text flow under the
854	      Characters Per Second (CPS) limit.

856	   block length:  10 bits Length in bytes of the corresponding data
857	      block excluding the header.

859	   The header for the final block has a zero F bit, and apart from that
860	   the same fields as other data headers.
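
As an illustration of the bit layout in Figure 1, the sketch below
packs and unpacks one 32-bit data header in network byte order.  The
function names are hypothetical, and the sketch does not apply the
15 000 capping rule for the timestamp offset; it only checks that each
value fits its field.

```python
import struct
from typing import Tuple


def pack_data_header(more: bool, block_pt: int, ts_offset: int, length: int) -> bytes:
    """Encode F (1 bit), block PT (7), timestamp offset (14), block length (10)."""
    if not (0 <= block_pt < 1 << 7 and 0 <= ts_offset < 1 << 14 and 0 <= length < 1 << 10):
        raise ValueError("field value does not fit its bit width")
    word = (int(more) << 31) | (block_pt << 24) | (ts_offset << 10) | length
    return struct.pack("!I", word)


def unpack_data_header(data: bytes) -> Tuple[bool, int, int, int]:
    """Decode the first four octets into (F, block PT, timestamp offset, length)."""
    (word,) = struct.unpack("!I", data[:4])
    return bool(word >> 31), (word >> 24) & 0x7F, (word >> 10) & 0x3FFF, word & 0x3FF
```

A header with F=1, payload type 98, offset 300 and length 5 encodes to
the octets E2 04 B0 05.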

862	   Note: The "text/rex" packet format is similar to that of RFC 2198
863	   [RFC2198] but differs in some aspects.  RFC 2198 associates
864	   the whole of the CSRC-list with the primary data and assumes that the
865	   same list applies to reconstructed redundant data.  In this section a
866	   T140block is associated with exactly one CSRC list member as
867	   described above.  Also RFC 2198 [RFC2198] anticipates infrequent
868	   change to CSRCs; implementers should be aware that the order of the
869	   CSRC-list according to this section will vary during transitions
870	   between transmission from the mixer of text originated by different
871	   participants.  Another difference is that the last member in the data
872	   header area in RFC 2198 [RFC2198] only contains the payload type
873	   number while in this section it has the same format as all other
874	   entries in the data header.

876	   The picture below shows a typical "text/rex" RTP packet with multi-
877	   party RTT contents from three sources and coding according to this
878	   section.

880	       0                   1                   2                   3
881	       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
882	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
883	      |V=2|P|X| CC=3  |M|  "REX" PT   |   RTP sequence number         |
884	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
885	      |               timestamp of packet creation                    |
886	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
887	      |           synchronization source (SSRC) identifier            |
888	      +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
889	      |  CSRC list member 1 = SSRC of source of "A"                   |
890	      |  CSRC list member 2 = SSRC of source of "B"                   |
891	      |  CSRC list member 3 = SSRC of source of "C"                   |
892	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
893	      |1|   T140 PT   |timestamp offset of "A-R2" |"A-R2" block length|
894	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
895	      |1|   T140 PT   |timestamp offset of "A-R1" |"A-R1" block length|
896	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
897	      |1|   T140 PT   | timestamp offset of "A-P" |"A-P" block length |
898	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
899	      |1|   T140 PT   |timestamp offset of "B-R2" |"B-R2" block length|
900	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
901	      |1|   T140 PT   |timestamp offset of "B-R1" |"B-R1" block length|
902	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
903	      |1|   T140 PT   | timestamp offset of "B-P" | "B-P" block length|
904	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
905	      |1|   T140 PT   |timestamp offset of "C-R2" |"C-R2" block length|
906	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
907	      |1|   T140 PT   |timestamp offset of "C-R1" |"C-R1" block length|
908	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
909	      |0|   T140 PT   |timestamp offset of "C-P"  |"C-P" block length |
910	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
911	      | "A-R2" T.140 encoded redundant data                           |
912	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
913	      |               |"A-R1" T.140 encoded redundant data            |
914	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
915	      |"A-P" T.140 encoded primary    |                               |
916	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
917	      |     "B-R2" T.140 encoded redundant data       |               |
918	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
919	      |      "B-R1" T.140 encoded redundant data                      |
920	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
921	      | "B-P" T.140 encoded primary data              |               |
922	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
923	      |     "C-R2" T.140 encoded redundant data       |               |
924	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
925	      |      "C-R1" T.140 encoded redundant data                      |
926	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
927	      |      "C-P" T.140 encoded primary data         |
928	      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
929	      Figure 2: A "text/rex" packet with text from three sources A, B, C.

931	   A-P, B-P, C-P are primary data from A, B and C.

933	   A-R1, B-R1, C-R1 are first redundant generation data from A, B and C.

935	   A-R2, B-R2, C-R2 are second redundant generation data from A, B and C.

937	   In a real case, some of the data headers will likely indicate a zero
938	   block length, and no corresponding T.140 data.

940	2.2.2.  Actions at transmission by a mixer

942	2.2.2.1.  Initial BOM transmission

944	   As soon as a participant is known to participate in a session and
945	   to be available for text reception, a Unicode "BOM" character SHALL
946	   be sent to it according to the procedures in this section.  If the
947	   transmitter is a mixer, then the source of this character SHALL be
948	   indicated to be the mixer itself.

950	2.2.2.2.  Keep-alive

952	   After that, the transmitter SHALL send keep-alive traffic to the
953	   receivers at regular intervals when no other traffic has occurred
954	   during that interval if that is decided for the actual connection.
955	   Recommendations for keep-alive can be found in RFC 6263[RFC6263].

957	2.2.2.3.  Transmission interval

959	   A "text/rex" transmitter SHOULD send packets distributed in time as
960	   long as there is something (new or redundant T140blocks) to transmit.
961	   The maximum transmission interval SHOULD then be 300 ms.  It is
962	   RECOMMENDED to send a packet to a receiver as soon as new text to
963	   that receiver is available, as long as the time after the latest sent
964	   packet to the same receiver is more than 150 ms, and also the maximum
965	   character rate to the receiver is not exceeded.  The intention is to
966	   keep the latency low while keeping a good protection against text
967	   loss in bursty packet loss conditions.
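
One possible reading of the timing rules above is sketched below.  It
is an interpretation under stated assumptions, not a normative
algorithm: the function name is hypothetical, elapsed time is measured
per receiver since the latest packet sent to it, and the check against
the maximum character rate is left out.

```python
MIN_INTERVAL_MS = 150   # new text is sent once more than this has elapsed
MAX_INTERVAL_MS = 300   # longest interval while anything remains to send


def should_send(elapsed_ms: int, has_new_text: bool, has_pending_redundancy: bool) -> bool:
    """Decide whether a packet should be sent to one receiver now."""
    if has_new_text and elapsed_ms > MIN_INTERVAL_MS:
        return True
    if has_pending_redundancy and elapsed_ms >= MAX_INTERVAL_MS:
        return True
    return False
```

With this reading, new text arriving 100 ms after the latest packet
waits until just after the 150 ms mark, while remaining redundant
generations alone are sent at the 300 ms interval.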

969	2.2.2.4.  Do not send received text to the originating source

971	   Text received from a participant SHOULD NOT be included in
972	   transmission to that participant.

974	2.2.2.5.  Clean incoming text

976	   A mixer SHALL handle reception and recovery of packet loss, marking
977	   of possible text loss and deletion of 'BOM' characters from each
978	   participant before queueing received text for transmission to
979	   receiving participants.

981	2.2.2.6.  Redundancy

983	   The transmitting party using redundancy SHALL send redundant
984	   repetitions of T140blocks already transmitted in earlier packets.  The
985	   number of redundant generations of T140blocks to include in
986	   transmitted packets SHALL be deduced from the SDP negotiation.  It
987	   SHOULD be set to the minimum of the number declared by the two
988	   parties negotiating a connection.  The same number of redundant
989	   generations MUST be used for text from all sources when it is
990	   transmitted to a receiver.  The number of generations sent to a
991	   receiver SHALL be the same during the whole session unless it is
992	   modified by session renegotiation.

994	2.2.2.7.  Text placement in packets

996	   At time of transmission, the mixer SHALL populate the RTP packet with
997	   T140blocks combined from all T140blocks queued for transmission
998	   originating from each source as long as this is not in conflict with
999	   the allowed number of characters per second or the maximum packet
1000	   size.  These T140blocks SHALL be placed in the packet interleaved
1001	   with redundant T140blocks and new T140blocks from other sources.  The
1002	   SSRC of each source shall be placed as a member in the CSRC-list at a
1003	   place corresponding to the place of its T140blocks in the packet.

1005	2.2.2.8.  Maximum number of sources per packet

1007	   Text from a maximum of 15 sources MAY be included in a packet.  The
1008	   reason for this limitation is the maximum number of CSRC list members
1009	   allowed in a packet.  If text from more sources needs to be
1010	   transmitted, the mixer MAY let the sources take turns in having their
1011	   text transmitted.  When stopping transmission of one source to allow
1012	   another source to have its text sent, all intended redundant
1013	   generations of the last text from the source to be stopped MUST be
1014	   transmitted before text from another source can be transmitted.
1015	   Actively transmitting sources SHOULD be allowed to take turns with
1016	   short intervals to have their text transmitted.

1018	   Note: The CSRC-list in an RTP packet only includes participants whose
1019	   text is included in text blocks.  It is not the same as the total
1020	   list of participants in a conference.  With audio and video media,
1021	   the CSRC-list would often contain all participants who are not muted
1022	   whereas text participants that don't type are completely silent and
1023	   thus are not represented in RTP packet CSRC-lists once their text
1024	   has been transmitted as primary and the intended number of redundant
1025	   generations.

1027	2.2.2.9.  Empty T140blocks

1029	   If no unsent T140blocks were available for a source at the time of
1030	   populating a packet, but T140blocks are available which have not yet
1031	   been sent the full intended number of redundant transmissions, then
1032	   the primary T140block for that source is composed of an empty
1033	   T140block, and populated (without taking up any length) in a packet
1034	   for transmission.  The corresponding SSRC SHALL be placed in its
1035	   place in the CSRC-list.

1037	2.2.2.10.  Creation of the redundancy

1039	   The primary T140block from each source in the latest transmitted
1040	   packet is used to populate the first redundant T140block for that
1041	   source.  The first redundant T140block for that source from the
1042	   latest transmission is placed as the second redundant T140block.

1044	   Usually this is the level of redundancy used.  If a higher level of
1045	   redundancy is negotiated, then the procedure SHALL be maintained
1046	   until all available redundant levels of T140blocks and their sources
1047	   are placed in the packet.  If a receiver has negotiated a lower
1048	   number of "text/rex" generations, then that level shall be the
1049	   maximum used by the transmitter.

1051	2.2.2.11.  Timer offset fields

1053	   The timestamp offset values are inserted in the data header, with the
1054	   time offset from the RTP timestamp in the packet when the
1055	   corresponding T140block was sent from its original source as primary.

1057	   The timestamp offsets are expressed in the same clock tick units as
1058	   the RTP timestamp.

1060	   The timestamp offset values for empty T140blocks have no relevance
1061	   but SHOULD be assigned realistic values.

1063	2.2.2.12.  Other RTP header fields

1065	   The number of members in the CSRC list shall be placed in the "CC"
1066	   header field.  Only mixers place values >0 in the "CC" field.

1068	   The current time SHALL be inserted in the timestamp.

1070	   The SSRC of the mixer for the RTT session SHALL be inserted in the
1071	   SSRC field of the RTP header.

1073	   The M-bit SHALL be set to 1 first in the session and after a pause.

1075	2.2.2.13.  Pause in transmission

1077	   When there is no new T140block to transmit, and no redundant
1078	   T140block that has not been retransmitted the intended number of
1079	   times, the transmission process can stop until either new T140blocks
1080	   arrive, or a keep-alive method calls for transmission of keep-alive
1081	   packets.

1083	2.2.3.  Actions at reception

1085	   The "text/rex" receiver included in an endpoint with presentation
1086	   functions will receive RTP packets in the single stream from the
1087	   mixer, and SHALL distribute the T140blocks for presentation in
1088	   presentation areas for each source.  Other receiver roles, such as
1089	   gateways or chained mixers, are also feasible, and require
1090	   consideration of whether the stream shall just be forwarded, or
1091	   distributed based on the different sources.

1093	2.2.3.1.  Multi-party vs two-party use

1095	   If the "CC" field value of a received packet is >0, it indicates that
1096	   multi-party transmission is active, and the receiver MUST be prepared
1097	   to act on the different sources according to its role.  If the CC
1098	   value is 0, the transmission is point-to-point.

1100	2.2.3.2.  Level of redundancy

1102	   The used level of redundancy generations SHALL be evaluated from the
1103	   received packet contents.  If the CC value is 0, the number of
1104	   generations (including the primary) is equal to the number of members
1105	   in the data header.  If the CC value is >0, the number of generations
1106	   (including the primary) is equal to the number of members in the data
1107	   header divided by the CC value.  If the remainder from the division
1108	   is >0, then the packet is malformed and SHALL cause an error
1109	   indication in the receiver.
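
The calculation above can be sketched as follows.  The function name
is hypothetical, and raising an exception stands in for whatever error
indication the receiver uses.

```python
def generations_per_source(header_count: int, cc: int) -> int:
    """Number of generations (including the primary) carried per source."""
    if header_count <= 0 or cc < 0:
        raise ValueError("invalid header count or CC value")
    if cc == 0:
        # Point-to-point: every data header belongs to the single sender.
        return header_count
    generations, remainder = divmod(header_count, cc)
    if remainder:
        raise ValueError("malformed packet: headers not divisible by CC")
    return generations
```

For the packet in Figure 2, nine data headers and a CC value of 3 give
three generations per source: the primary and two redundant
generations.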

1111	2.2.3.3.  Extracting text and handling recovery and loss

1113	   The RTP sequence numbers of the received packets SHALL be monitored
1114	   for gaps and packets out of order.

1116	   As long as the sequence is correct, each packet SHALL be unpacked in
1117	   order.  The T140blocks SHALL be extracted from the primary areas, and
1118	   the corresponding SSRCs SHALL be extracted from the corresponding
1119	   positions in the CSRC list and used for assigning the new T140block
1120	   to the correct presentation areas (or correspondingly).

1122	   If a sequence number gap appears and is still there after some
1123	   defined time for jitter resolution, T140data SHALL be recovered from
1124	   redundant data.  If the gap is wider than the number of generations
1125	   of redundant T140blocks in the packet, then a T140block SHALL be
1126	   created with a marker for possible text loss [T140ad1] and assigned
1127	   to the SSRC of the transmitter as a general input from the mixer
1128	   because in general it is not possible to deduce from which sources
1129	   text was lost.  It is however likely that the sources which had loss
1130	   were active in transmission just before or after the sequence number
1131	   gap.  Therefore, the receiver MAY insert the marker for possible text
1132	   loss [T140ad1] in the presentation areas corresponding to the sources
1133	   which had text in the packets just before and after the gap.

1135	   Then, the T140blocks in the received packet SHALL be retrieved
1136	   beginning with the highest redundant generation, grouping them with
1137	   the corresponding SSRC from the CSRC-list and assigning them to the
1138	   presentation areas per source.  Finally the primary T140blocks SHALL
1139	   be retrieved from the packet and similarly their sources retrieved
1140	   from the corresponding positions in the CSRC-list, and then assigned
1141	   to the corresponding presentation areas for the sources.

1143	   If the sequence number gap is equal to or less than the number of
1144	   redundancy generations in the received packet, a missing text marker
1145	   SHALL NOT be inserted, and instead the T140blocks and their SSRCs
1146	   SHALL be fully recovered from the redundancy information and the
1147	   CSRC-list in the way indicated above.
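
   The procedure above can be sketched, non-normatively, as follows.
   The packet representation (a dictionary with "seq", "csrc" and
   "groups"), the function name and the list-based presentation areas
   are assumptions made for this illustration.  The sketch assumes that
   jitter resolution has already taken place, so packets arrive in
   increasing sequence order, and it omits the optional per-source loss
   markers.

```python
MISSING_TEXT = "\ufffd"  # marker for possible text loss

def process_packet(last_seq, packet, areas, mixer_area):
    """Append text from `packet` to the per-source `areas`; return new last_seq.

    packet = {"seq": int, "csrc": [source, ...],
              "groups": [[oldest redundant, ..., newest redundant, primary], ...]}
    groups[i] belongs to csrc[i].  Empty T140blocks are empty strings.
    """
    # Number of packets missing between last_seq and this packet (16-bit wrap).
    missing = (packet["seq"] - last_seq - 1) % 65536
    redundant = len(packet["groups"][0]) - 1 if packet["groups"] else 0
    if missing > redundant:
        # Gap wider than the redundancy: loss cannot be tied to one source,
        # so the marker goes to the area for the mixer itself.
        mixer_area.append(MISSING_TEXT)
    usable = min(missing, redundant)
    # Walk from the oldest needed redundant generation down to the primary.
    for offset in range(usable, -1, -1):
        for source, blocks in zip(packet["csrc"], packet["groups"]):
            block = blocks[len(blocks) - 1 - offset]
            if block:  # empty T140blocks contribute nothing
                areas.setdefault(source, []).append(block)
    return packet["seq"]
```

   With no gap only the primary T140blocks are used.  With a gap within
   the redundancy, the older generations are appended first so that the
   text is presented in its original order.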

1149	2.2.3.4.  Delete BOM

1151	   Unicode character "BOM" is used as a start indication and sometimes
1152	   used as a filler or keep-alive by transmission implementations.
1153	   BOM characters SHALL be deleted on reception.
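
   A minimal, non-normative sketch of this step is shown below.  The
   function name is hypothetical, and the sketch assumes that each
   T140block holds complete UTF-8 encoded characters.

```python
BOM = "\ufeff"  # Unicode BOM (ZERO WIDTH NO-BREAK SPACE)

def strip_bom(t140block: bytes) -> str:
    """Decode a UTF-8 T140block and delete every BOM character in it."""
    return t140block.decode("utf-8").replace(BOM, "")
```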

1155	2.2.3.5.  Empty T140blocks

1157	   Empty T140blocks are included as fillers for unused redundancy levels
1158	   in the packets.  They do not carry any content and do not contribute
1159	   to the received streams.

1161	2.2.4.  RTCP considerations

1163	   A mixer SHALL send RTCP reports with SDES CNAME and NAME information
1164	   about the sources in the multi-party call.  This makes it possible
1165	   for participants to compose a suitable label for text from each
1166	   source.

1168	2.2.5.  Chained operation

1170	   If all conforming devices strictly apply the rules for the "text/rex"
1171	   packet format, mixers MAY be arranged in chains.

1173	2.2.6.  Usage without redundancy

1175	   The "text/rex" format SHALL also be used for multi-party
1176	   communication when the redundancy mechanism is not used.  That MAY be
1177	   the case when robustness in transmission is provided by some other
1178	   means than by redundancy.  All aspects of this section SHALL be
1179	   applied except the redundant generations in transmission.

1181	   The "text/rex" format SHOULD thus be used for multi-party operation,
1182	   also when some other protection against packet loss is utilized, for
1183	   example a reliable network or transport.  The format is also suitable
1184	   to be used for point-to-point operation.

1186	2.2.7.  Use with SIP centralized conferencing framework

1188	   The SIP conferencing framework, mainly specified in RFC 4353
1189	   [RFC4353], RFC 4579 [RFC4579] and RFC 4575 [RFC4575], is suitable
1190	   for coordinating sessions including multi-party RTT.  The RTT stream
1191	   between the mixer and a participant is one and the same during the
1192	   conference.  Participants are announced by notifications when they
1193	   join or leave, and further user information may
1194	   be provided.  The SSRC of the text to expect from joined users MAY be
1195	   included in a notification.  The notifications MAY be used both for
1196	   security purposes and for translation to a label for presentation to
1197	   other users.

1199	2.2.8.  Conference control

1201	   In managed conferences, control of the real-time text media SHOULD be
1202	   provided in the same way as for other media, e.g. for muting and
1203	   unmuting by the direction attributes in SDP [RFC4566].

1205	   Note that floor control functions may be of value for RTT users as
1206	   well as for users of other media in a conference.

1208	2.2.9.  Media Subtype Registration

1210	   This registration is done using the template defined in [RFC6838] and
1211	   following [RFC4855].

1213	   Type name:
1214	      text

1216	   Subtype name:
1217	      rex

1219	   Required parameters:
1220	      rate:
1221	         The RTP timestamp (clock) rate.  The only valid value is 1000.

1223	      pt:
1224	         a comma-separated list of RTP payload types.  Because comma is
1225	         a special character, the list must be a quoted-string (enclosed
1226	         in double quotes).  Each list element is a mapping of the
1227	         dynamic payload type number to an embedded Content-type
1228	         specification for the payload format corresponding to the
1229	         payload type.  The format of the mapping is:

1231	         payload-type-number "=" content-type

1233	         If the content-type string includes a comma, then the
1234	         content-type string MUST be a quoted-string.  If the
1235	         content-type string does not include a comma, it MAY still be
1236	         quoted.  Since it is part of the list which must itself be a
1237	         quoted-string, that means the quotation marks MUST be quoted
1238	         with backslash quoting as specified in RFC 2045.  If the
1239	         content-type string itself contains a quoted-string, then the
1240	         requirement for backslash quoting is recursively applied.  To
1241	         specify the text/rex payload format in SDP, the pt parameter
1242	         is mapped to an a=fmtp attribute by eliminating the parameter
1243	         name (pt) and changing the commas to slashes.  For example:

1245	         pt = " = \"text/t140;cps=200,text/t140,text/t140\" "

1247	         implies the following SDP:

1249	                 m=text 49170 RTP/AVP 98 100
1250	                 a=rtpmap:98 rex/1000
1251	                 a=fmtp:98 100/100/100
1252	                 a=rtpmap:100 t140/1000
1253	                 a=fmtp:100 cps=200

1255	   Encoding considerations:
1256	      binary; see Section 4.8 of [RFC6838].

1258	   Security considerations:
1259	      See Section 9 of RFC XXXX.  [RFC Editor: Upon publication as an
1260	      RFC, please replace "XXXX" with the number assigned to this
1261	      document and remove this note.]

1263	   Interoperability considerations:
1264	      None.

1266	   Published specification:
1267	      RFC XXXX.  [RFC Editor: Upon publication as an RFC, please replace
1268	      "XXXX" with the number assigned to this document and remove this
1269	      note.]

1271	   Applications which use this media type:
1272	      For example: text conferencing tools, multimedia conferencing
1273	      tools, real-time conversational tools.

1275	   Fragment identifier considerations:
1276	      N/A.

1278	   Additional information:
1279	      None.

1281	   Person & email address to contact for further information:
1282	      Gunnar Hellstrom <gunnar.hellstrom@ghaccess.se>

1284	   Intended usage:
1285	      COMMON

1287	   Restrictions on usage:
1288	      This media type depends on RTP framing, and hence is only defined
1289	      for transfer via RTP [RFC3550].

1291	   Author:
1292	      Gunnar Hellstrom <gunnar.hellstrom@ghaccess.se>

1294	   Change controller:
1295	      IETF AVTCore Working Group delegated from the IESG.

1297	2.2.10.  SDP considerations

1299	   There are receiving RTT implementations which implement RFC 4103
1300	   [RFC4103] but not the source separation by the CSRC.  Sending mixed
1301	   text according to the usual CSRC convention from RFC 2198 [RFC2198]
1302	   to a device implementing only RFC 4103 [RFC4103] and no multi-party
1303	   mechanism would risk producing unreadable presented text.
1304	   Therefore, in order to negotiate RTT mixing capability according to
1305	   the "text/rex" method, all devices supporting "text/rex" for
1306	   multi-party aware participants SHALL include an SDP media format
1307	   "text/rex" in the SDP [RFC4566], indicating this format in offers and
1308	   answers.  Multi-party streams using the coding of this section
1309	   intended for multi-party aware endpoints MUST NOT be sent to devices
1310	   which have not indicated the "text/rex" format.

1312	   Implementations not understanding the "text/rex" format MUST ignore
1313	   it according to common SDP rules.

1315	   The SDP media format defined here is named "rex", for extended
1316	   "red".  It is intended to be used in "text" media descriptions with
1317	   "text/rex" and "text/t140" formats.  Both formats MUST be declared
1318	   for the "text/rex" format to be used.  It indicates capability to use
1319	   source indications in the CSRC list and the packet format according
1320	   to this section.  It also indicates ability to receive 150 real-time
1321	   text characters per second by default.

1323	2.2.10.1.  Mapping of media parameters to SDP

1325	   The information carried in the media type registration has a specific
1326	   mapping to fields in the Session Description Protocol (SDP), which
1327	   is commonly used to describe RTP sessions.  When SDP [RFC4566] is
1328	   used to specify sessions employing the "text/rex" format,
1329	   the mapping is as follows:

1331	   *  The media type ("text") goes in SDP "m=" as the media name.

1333	   *  The media subtype (payload format name) goes in SDP "a=rtpmap" as
1334	      the encoding name.  The RTP clock rate in "a=rtpmap" MUST be 1000
1335	      for "text/rex".

1337	   *  When the payload type is used with redundancy, the level of
1338	      redundancy is shown by the number of elements in the
1339	      slash-separated payload type list in the "fmtp" attribute of the
1340	      "text/rex" media format.
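
   The mapping can be illustrated by the following non-normative
   sketch, which composes the SDP lines for a "text/rex" media
   description.  The function name, its parameters and the ordering of
   the attribute lines are assumptions for this illustration only.

```python
def rex_media_lines(port, rex_pt, t140_pt, generations, cps=None):
    """Compose SDP lines for a "text" media description using "text/rex".

    generations: number of generations including the primary; it becomes
                 the number of elements in the slash-separated fmtp list.
    cps:         optional CPS value for the "text/t140" format.
    """
    lines = [
        f"m=text {port} RTP/AVP {rex_pt} {t140_pt}",  # media type in "m="
        f"a=rtpmap:{t140_pt} t140/1000",
        f"a=rtpmap:{rex_pt} rex/1000",                # clock rate is 1000
        "a=fmtp:{} {}".format(rex_pt, "/".join([str(t140_pt)] * generations)),
    ]
    if cps is not None:
        lines.append(f"a=fmtp:{t140_pt} cps={cps}")
    return lines
```

   For instance, rex_media_lines(11000, 101, 98, 3) yields a media
   description with "a=fmtp:101 98/98/98", that is, a primary and two
   redundant generations.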

1342	2.2.10.2.  Security for session control and media

1344	   Security SHOULD be applied on both session control and media.  In
1345	   applications where legacy endpoints without security may exist, a
1346	   negotiation between security and no security SHOULD be applied.  If
1347	   no other security solution is mandated by the application, then RFC
1348	   8643 OSRTP [RFC8643] SHOULD be applied to negotiate SRTP media
1349	   security with DTLS.  Most SDP examples below are for simplicity
1350	   expressed without the security additions.  The principles (but not
1351	   all details) for applying DTLS-SRTP security are shown in a couple of
1352	   the following examples.

1354	2.2.10.3.  SDP offer/answer examples

1356	   This section shows some examples of SDP for session negotiation of
1357	   the real-time text media in SIP sessions.  Audio is usually provided
1358	   in the same session, and sometimes also video.  The examples only
1359	   show the part of importance for the real-time text media.

1361	   Offer example for just "text/rex" multi-party capability:

1363	         m=text 11000 RTP/AVP 101 98
1364	         a=rtpmap:98 t140/1000
1365	         a=rtpmap:101 rex/1000
1366	         a=fmtp:101 98/98/98

1368	   Answer example from a multi-party capable device:
1369	         m=text 12000 RTP/AVP 101 98
1370	         a=rtpmap:98 t140/1000
1371	         a=rtpmap:101 rex/1000
1372	         a=fmtp:101 98/98/98

1374	   Offer example for "text/red" and "text/rex" multi-party support:

1376	         m=text 11000 RTP/AVP 101 100 98
1377	         a=rtpmap:98 t140/1000
1378	         a=rtpmap:100 red/1000
1379	         a=rtpmap:101 rex/1000
1380	         a=fmtp:100 98/98/98
1381	         a=fmtp:101 98/98/98
1382	         a=rtt-mix-rtp-mixer

1384	   Answer example from a multi-party capable device using "text/rex":
1385	         m=text 11000 RTP/AVP 101 98
1386	         a=rtpmap:98 t140/1000
1387	         a=rtpmap:101 rex/1000
1388	         a=fmtp:101 98/98/98

1390	    Offer example for both traditional "text/red" and multi-party format
1391	    including security:
1392	          a=fingerprint:SHA-1 \
1393	          4A:AD:B9:B1:3F:82:18:3B:54:02:12:DF:3E:5D:49:6B:19:E5:7C:AB
1394	          m=text 11000 RTP/AVP 101 100 98
1395	          a=rtpmap:98 t140/1000
1396	          a=rtpmap:100 red/1000
1397	          a=rtpmap:101 rex/1000
1398	          a=fmtp:100 98/98/98
1399	          a=fmtp:101 98/98/98
1400	          a=rtt-mix-rtp-mixer

1402	   The "fingerprint" attribute is sufficient to offer DTLS-SRTP, with the media
1403	   line still indicating RTP/AVP.

1405	   Answer example from a multi-party capable device including security:
1406	          a=fingerprint:SHA-1 \
1407	          FF:FF:FF:B1:3F:82:18:3B:54:02:12:DF:3E:5D:49:6B:19:E5:7C:AB
1408	          m=text 11000 RTP/AVP 101 98
1409	          a=rtpmap:98 t140/1000
1410	          a=rtpmap:101 rex/1000
1411	          a=fmtp:101 98/98/98

1413	   With the "fingerprint" attribute, the device acknowledges use of DTLS-SRTP.

1415	   Answer example from a multi-party unaware device that also
1416	   does not support security:

1418	         m=text 12000 RTP/AVP 100 98
1419	         a=rtpmap:98 t140/1000
1420	         a=rtpmap:100 red/1000
1421	         a=fmtp:100 98/98/98

1423	   A party which has negotiated the "text/rex" format MUST populate the
1424	   CSRC-list and format the packets according to this section if it acts
1425	   as an RTP mixer and sends multi-party text.

1427	   A party which has negotiated the "text/rex" capability MUST interpret
1428	   the contents of the CSRC-list and the packets according to this
1429	   section in received RTP packets using the corresponding payload type.

1431	   A party performing as a mixer, which has not negotiated the
1432	   "text/rex" format, but negotiated a "text/red" or "text/t140" format
1433	   in a session with a participant SHOULD, if nothing else is specified
1434	   for the application, format transmitted text to that participant to
1435	   be suitable to present on a multi-party unaware endpoint as further
1436	   specified in Section 3.2.

1438	   A party not performing as a mixer MUST NOT include the CSRC list if
1439	   it has a single source of text.
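
   As a non-normative illustration of how an implementation might
   decide which of the rules above applies, the following sketch checks
   whether a received "text" media description declares both the
   "text/rex" and the "text/t140" formats.  The function name and the
   simplified line-based parsing are assumptions for this illustration;
   a real implementation would use its regular SDP parser.

```python
def negotiated_rex_payload_type(media_section: str):
    """Return the payload type mapped to rex/1000, or None if not declared.

    Both "rex/1000" and "t140/1000" must be mapped and listed in the
    "m=text" line for the "text/rex" format to be considered declared.
    """
    listed, encodings = set(), {}
    for line in media_section.splitlines():
        line = line.strip()
        if line.startswith("m=text "):
            # m=text <port> <proto> <payload types...>
            listed = set(line.split()[3:])
        elif line.startswith("a=rtpmap:"):
            pt, _, encoding = line[len("a=rtpmap:"):].partition(" ")
            encodings[encoding.strip().lower()] = pt
    rex_pt = encodings.get("rex/1000")
    t140_pt = encodings.get("t140/1000")
    if rex_pt in listed and t140_pt in listed:
        return int(rex_pt)
    return None
```

   A mixer receiving None for a participant would fall back to the
   procedures for multi-party unaware endpoints.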

1441	2.2.10.4.  Packet examples

1443	   This example shows a symbolic flow of packets from a mixer with loss
1444	   and recovery.  A, B and C are sources of RTT.  M is the mixer.  Pn
1445	   indicates primary data in source group "n".  Rn1 is first redundant
1446	   generation data and Rn2 is second redundant generation data in source
1447	   group "n".  A1, B1, A2, etc. are text chunks (T140blocks) received
1448	   from the respective sources.  X indicates a packet dropped between
1449	   the mixer and a receiver.

1451	   |----------------|
1452	   |Seq no 1        |
1453	   |CC=1            |
1454	   |CSRC list A     |
1455	   |R12: Empty      |
1456	   |R11: Empty      |
1457	   |P1: A1          |
1458	   |----------------|

1460	   Assuming that earlier packets were received in sequence, text A1 is
1461	   received from packet 1 and assigned to reception area A.

1463	   |----------------|
1464	   |Seq no 2        |
1465	   |CC=2            |
1466	   |CSRC list C,A   |
1467	   |R12: Empty      |
1468	   |R11: Empty      |
1469	   |P1: C1          |
1470	   |R22: Empty      |
1471	   |R21: A1         |
1472	   |P2: Empty       |
1473	   |----------------|
1474	   Text C1 is received from packet 2 and assigned to reception area C.

1476	   X----------------|
1477	   X Seq no 3       |
1478	   X CC=2           |
1479	   X CSRC list C,A  |
1480	   X R12: Empty     |
1481	   X R11: C1        |
1482	   X P1:  Empty     |
1483	   X R22: A1        |
1484	   X R21: Empty     |
1485	   X P2:  A2        |
1486	   X----------------|
1487	   Packet 3 is assumed to be dropped due to network problems.

1489	   X----------------|
1490	   X Seq no 4       |
1491	   X CC=3           |
1492	   X CSRC list C,B,A|
1493	   X R12: C1        |
1494	   X R11: Empty     |
1495	   X P1: C2         |
1496	   X R22: Empty     |
1497	   X R21: Empty     |
1498	   X P2: B1         |
1499	   X R32: Empty     |
1500	   X R31: A2        |
1501	   X P3:  A3        |
1502	   X----------------|
1503	   Packet 4 is assumed to be dropped due to network problems.

1505	   X----------------|
1506	   X Seq no 5       |
1507	   X CC=3           |
1508	   X CSRC list C,B,A|
1509	   X R12: Empty     |
1510	   X R11: C2        |
1511	   X P1: Empty      |
1512	   X R22: Empty     |
1513	   X R21: B1        |
1514	   X P2: B2         |
1515	   X R32: A2        |
1516	   X R31: A3        |
1517	   X P3:  A4        |
1518	   X----------------|
1519	   Packet 5 is assumed to be dropped due to network problems.
1520	   |----------------|
1521	   |Seq no 6        |
1522	   |CC=3            |
1523	   |CSRC list C,B,A |
1524	   | R12: C2        |
1525	   | R11: Empty     |
1526	   | P1: Empty      |
1527	   | R22: B1        |
1528	   | R21: B2        |
1529	   | P2:  B3        |
1530	   | R32: A3        |
1531	   | R31: A4        |
1532	   | P3:  A5        |
1533	   |----------------|

1535	   Packet 6 is received.  The latest received sequence number was 2.
1536	   Recovery is therefore attempted for 3, 4 and 5, but there is no
1537	   coverage for seq no 3.  A missing text mark (U+FFFD) [T140ad1] is
1538	   created and appended to the common mixer reception area.  A missing
1539	   text mark (U+FFFD) MAY also be appended in all streams which had text
1540	   in the packets before and after the gap, in this case after A1 and
1541	   C1, and before B1.

1543	   For seqno 4, texts C2, B1 and A3 are recovered from the second
1544	   generation redundancy and appended to their respective reception
1545	   areas.  For seqno 5, texts B2 and A4 are recovered from the first
1546	   generation redundancy and appended to their respective reception
1547	   areas.  Primary text B3 and A5 are received and appended to their
1548	   respective reception areas.

1550	   After this sequence, the following has been received: A1, A3, A4, A5;
1551	   B1, B2, B3; C1, C2.  A possible loss is indicated by the general
1552	   missing text mark in time between A1 and A3, and in the streams after
1553	   A1 and C1 and before B1.

1555	   With only one or two packets lost, there would not be any need to
1556	   create a missing text marker, and all text would be recovered.

1558	   It will be a design decision how to present the missing text markers
1559	   assigned to the mixer as a source.

1561	2.2.10.5.  Performance considerations

1563	   This method allows new text from up to 15 sources per packet.  A
1564	   mixer implementing the specification will normally cause a latency of
1565	   0 to 150 milliseconds in text from up to 15 simultaneous sources.
1566	   This performance meets the realistic requirements for conference and
1567	   conversational applications, for which up to 5 simultaneous
1568	   sources should not be delayed more than 500 milliseconds by a mixer.
1569	   In order to achieve good performance, a receiver for multi-party
1570	   calls SHOULD declare a sufficient CPS value for the "text/t140"
1571	   format in SDP for the number of allowable characters per second.

1573	   As a comparison, if the "text/red" format were used for multi-party
1574	   communication with its default timing and redundancy, 5
1575	   simultaneously sending parties would cause jerky presentation of the
1576	   text from them in text spurts at 5-second intervals.  With a
1577	   reduction of the transmission interval to 150 ms, the time between
1578	   text spurts for 5 simultaneous sending parties would be 2.5 seconds.

1580	   Five simultaneous sending parties may occasionally occur in a
1581	   conference with one or two main sending parties and three parties
1582	   giving very brief comments.

1584	   The default maximum rate of reception of "text/t140" real-time text
1585	   is specified in RFC 4103 [RFC4103] to be 30 characters per second.
1586	   The value MAY be modified in the CPS parameter of the fmtp attribute
1587	   in the media section for the "text/t140" media.  A mixer combining
1588	   real-time text from a number of sources may have a higher combined
1589	   flow of text coming from the sources.  Endpoints SHOULD therefore
1590	   specify a suitable higher value for the CPS parameter, corresponding
1591	   to their real reception capability.  A value for CPS of 150 is the
1592	   default for the "text/t140" stream in the "text/rex" format.  See RFC
1593	   4103 [RFC4103] for the format and use of the CPS parameter.  The same
1594	   rules apply for the "text/rex" format except for the default value.

1596	2.3.  Mixing for multi-party unaware endpoints

1598	   A method is specified in this section for cases when the
1599	   participating endpoint does not implement any solution for multi-
1600	   party presentation of real-time text.  The solution requires the
1601	   mixer to insert text dividers and readable labels and only send text
1602	   from one source at a time until a suitable point appears for source
1603	   change.  This solution is a fallback method with functional
1604	   limitations that acts on the presentation level and is further
1605	   specified in Section 3.2.

1607	3.  Presentation level considerations

1609	   ITU-T T.140 [T140] provides the presentation level requirements for
1610	   the RFC 4103 [RFC4103] transport.  T.140 [T140] has functions for
1611	   erasure and other formatting functions and has the following general
1612	   statement for the presentation:

1614	   "The display of text from the members of the conversation should be
1615	   arranged so that the text from each participant is clearly readable,
1616	   and its source and the relative timing of entered text is visualized
1617	   in the display.  Mechanisms for looking back in the contents from the
1618	   current session should be provided.  The text should be displayed as
1619	   soon as it is received."

1621	   Strict application of T.140 [T140] is of essence for the
1622	   interoperability of real-time text implementations and to fulfill the
1623	   intention that the session participants have the same information of
1624	   the text contents of the conversation without necessarily having the
1625	   exact same layout of the conversation.

1627	   T.140 [T140] specifies a set of presentation control codes to include
1628	   in the stream.  Some of them are optional.  Implementations MUST be
1629	   able to ignore optional control codes that they do not support.

1631	   There is no strict "message" concept in real-time text.  Line
1632	   Separator SHALL be used as a separator allowing a part of received
1633	   text to be grouped in presentation.  The characters "CRLF" may be
1634	   used by other implementations as replacement for Line Separator.  The
1635	   "CRLF" combination SHALL be erased by just one erasing action, just
1636	   as the Line Separator.  Presentation functions are allowed to group
1637	   text for presentation in smaller groups than the line separators
1638	   imply and present such groups with source indication together with
1639	   text groups from other sources (see the following presentation
1640	   examples).  Erasure has no specific limit by any delimiter in the
1641	   text stream.

1643	3.1.  Presentation by multi-party aware endpoints

1645	   A multi-party aware receiving party presenting real-time text MUST
1646	   separate text from different sources and present them in separate
1647	   presentation fields.  The receiving party MAY separate presentation
1648	   of parts of text from a source in readable groups based on other
1649	   criteria than line separator and merge these groups in the
1650	   presentation area when it benefits the user to most easily find and
1651	   read text from the different participants.  The criteria MAY be, for
1652	   example, a received comma, full stop, or other phrase delimiter, or a
1653	   long pause.

1655	   When text is received from multiple original sources simultaneously,
1656	   the presentation SHOULD provide a view where text is added in
1657	   multiple places simultaneously.

1659	   If the presentation presents text from different sources in one
1660	   common area, the presenting endpoint SHOULD insert text from the
1661	   local user, ended at suitable points, merged with received text to
1662	   indicate the relative timing for when the text groups were completed.
1663	   In this presentation mode, the receiving endpoint SHALL present the
1664	   source of the different groups of text.

1666	   A view of a three-party RTT call in chat style is shown in this
1667	   example.

1669	                 _________________________________________________
1670	                |                                              |^|
1671	                |[Alice] Hi, Alice here.                       |-|
1672	                |                                              | |
1673	                |[Bob] Bob as well.                            | |
1674	                |                                              | |
1675	                |[Eve] Hi, this is Eve, calling from Paris.    | |
1676	                |      I thought you should be here.           | |
1677	                |                                              | |
1678	                |[Alice] I am coming on Thursday, my           | |
1679	                |      performance is not until Friday morning.| |
1680	                |                                              | |
1681	                |[Bob] And I on Wednesday evening.             | |
1682	                |                                              | |
1683	                |[Alice] Can we meet on Thursday evening?      | |
1684	                |                                              | |
1685	                |[Eve] Yes, definitely. How about 7pm.         | |
1686	                |     at the entrance of the restaurant        | |
1687	                |     Le Lion Blanc?                           | |
1688	                |[Eve] we can have dinner and then take a walk |-|
1689	                |______________________________________________|v|
1690	                | <Eve-typing> But I need to be back to        |^|
1691	                |    the hotel by 11 because I need            |-|
1692	                |                                              | |
1693	                | <Bob-typing> I wou                           |-|
1694	                |______________________________________________|v|
1695	                | of course, I underst                           |
1696	                |________________________________________________|

1698	   Figure 3: Example of a three-party RTT call presented in chat style
1699	   as seen at participant Alice's endpoint.

1701	   Other presentation styles than the chat style may be arranged.

1703	   This figure shows how a coordinated column view MAY be presented.

1705	   _____________________________________________________________________
1706	   |       Bob          |       Eve            |       Alice           |
1707	   |____________________|______________________|_______________________|
1708	   |                    |                      |I will arrive by TGV.  |
1709	   |My flight is to Orly|                      |Convenient to the main |
1710	   |                    |Hi all, can we plan   |station.               |
1711	   |                    |for the seminar?      |                       |
1712	   |Eve, will you do    |                      |                       |
1713	   |your presentation on|                      |                       |
1714	   |Friday?             |Yes, Friday at 10.    |                       |
1715	   |Fine, wo            |                      |We need to meet befo   |
1716	   |___________________________________________________________________|

1718	   Figure 4: An example of a coordinated column-view of a three-party
1719	   session with entries ordered vertically in approximate time-order.

1721	3.2.  Multi-party mixing for multi-party unaware endpoints

1723	   When the mixer has indicated multi-party capability by the
1724	   "rtt-mix-rtp-mixer" SDP attribute or the "text/rex" format in an SDP
1725	   negotiation, but the multi-party capability negotiation fails with an
1726	   endpoint, then the agreed "text/red" or "text/t140" format SHALL be
1727	   used and the mixer SHOULD compose a best-effort presentation of
1728	   multi-party real-time text in one stream intended to be presented by
1729	   an endpoint with no multi-party awareness.

1731	   This presentation format has functional limitations and SHOULD be
1732	   used only to enable participation in multi-party calls by legacy
1733	   deployed endpoints implementing only RFC 4103 without any multi-party
1734	   extensions specified in this document.

1736	   The principles and procedures below do not specify any new protocol
1737	   elements.  They are instead composed from the information in ITU-T
1738	   T.140 [T140] and an ambition to provide a best effort presentation on
1739	   an endpoint which has functions only for two-party calls.

1741	   The mixer mixing for multi-party unaware endpoints SHALL compose a
1742	   simulated limited multi-party RTT view suitable for presentation in
1743	   one presentation area.  The mixer SHALL group text in suitable groups
1744	   and prepare for presentation of them by inserting a new line between
1745	   them if the transmitted text did not already end with a new line.  A
1746	   presentable label SHOULD be composed and sent for the source
1747	   initially in the session and after each source switch.  With this
1748	   procedure the time for source switching depends on the actions
1749	   of the users.  In order to expedite source switch, a user can, for
1750	   example, end their turn with a new line.

1752	3.2.1.  Actions by the mixer at reception from the call participants

1754	   When text is received by the mixer from the different participants,
1755	   the mixer SHALL recover text from redundancy if any packets are lost.
1756	   The mark for lost text [T140ad1] SHOULD be inserted in the stream if
1757	   unrecoverable loss appears.  Any Unicode "BOM" characters, possibly
1758	   used for keep-alive, SHALL be deleted.  The time of creation of text
1759	   (retrieved from the RTP timestamp) SHALL be stored together with the
1760	   received text from each source in queues for transmission to the
1761	   recipients.

1763	3.2.2.  Actions by the mixer for transmission to the recipients

1765	   The following procedure SHOULD be applied for each recipient of
1766	   multi-party text from the mixer.

1768	   The text for transmission SHOULD be formatted by the mixer for each
1769	   receiving user for presentation in one single presentation area.
1770	   Text received from a participant SHOULD NOT be included in
1771	   transmission to that participant.  When there is text available for
1772	   transmission from the mixer to a receiving party from more than one
1773	   participant, the mixer SHOULD switch between transmission of text
1774	   from the different sources at suitable points in the transmitted
1775	   stream.

1777	   When switching source, the mixer SHOULD insert a line separator if
1778	   the already transmitted text did not end with a new line (line
1779	   separator or CRLF).  A label SHOULD be composed from information in
1780	   the CNAME and NAME fields in RTCP reports from the participant whose
1781	   text is to be transmitted, or from other session information for
1782	   that user.  The label SHOULD be delimited by suitable characters
1783	   (e.g. '[ ]') and transmitted.  The CSRC SHOULD indicate the selected
1784	   source.  Then text from that selected participant SHOULD be
1785	   transmitted until the next suitable point for switching is reached.

1787	   Seeking a suitable point for switching source SHOULD be done when
1788	   text waiting for transmission from any party is older than the last
1789	   transmitted text.  Suitable points for switching are:

1791	   *  A completed phrase ended by comma

1793	   *  A completed sentence

1795	   *  A new line (line separator or CRLF)

1797	   *  A long pause (e.g. > 10 seconds) in received text from the
1798	      currently transmitted source

1800	   *  If text from one participant has been transmitted with text from
1801	      other sources waiting for transmission for a long time (e.g. > 1
1802	      minute) and none of the other suitable points for switching has
1803	      occurred, a source switch MAY be forced by the mixer at next word
1804	      delimiter, and also if even a word delimiter does not occur within
1805	      a time (e.g. 15 seconds) after the scan for word delimiter
1806	      started.
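
   The list above can be read as a predicate over the text already sent
   from the current source.  The following Python sketch is one possible
   reading, not a definitive implementation.  The function name, its
   parameters and the set of sentence-ending characters are assumptions;
   the time limits are the example values given in the list.  The caller
   is assumed to invoke it only while older text from another source is
   waiting.

```python
LINE_SEPARATOR = "\u2028"
SENTENCE_END = ".!?"   # assumption: characters taken to end a sentence
LONG_PAUSE_S = 10.0    # "e.g. > 10 seconds"
FORCE_AFTER_S = 60.0   # "e.g. > 1 minute"
FORCE_SCAN_S = 15.0    # "e.g. 15 seconds"


def is_switch_point(sent, idle_s, others_waiting_s):
    """Return True if a source switch may take place now.

    sent             -- text transmitted so far from the current source
    idle_s           -- seconds since text last arrived from that source
    others_waiting_s -- seconds that text from other sources has waited
    """
    # Completed phrase ended by comma, or a new line (LS or CRLF).
    if sent.endswith((",", LINE_SEPARATOR, "\r\n")):
        return True
    # Completed sentence.
    if sent and sent[-1] in SENTENCE_END:
        return True
    # Long pause in text from the currently transmitted source.
    if idle_s > LONG_PAUSE_S:
        return True
    # Forced switch: at the next word delimiter, or unconditionally once
    # the scan for a word delimiter has itself timed out.
    if others_waiting_s > FORCE_AFTER_S:
        if sent.endswith(" "):
            return True
        if others_waiting_s > FORCE_AFTER_S + FORCE_SCAN_S:
            return True
    return False
```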

1808	   When switching source, the source which has the oldest text in queue
1809	   SHOULD be selected to be transmitted.  A character display count
1810	   SHOULD be maintained for the currently transmitted source, starting
1811	   at zero after the label is transmitted for the currently transmitted
1812	   source.

1814	   The status SHOULD be maintained for the latest control code for
1815	   Select Graphic Rendition (SGR) from each source.  If there is an SGR
1816	   code stored as the status for the current source before the source
1817	   switch is done, a reset of SGR shall be sent by the sequence SGR 0
1818	   [009B 0000 006D] after the new line and before the new label during a
1819	   source switch.  See SGR below for an explanation.  This transmission
1820	   does not influence the display count.

1822	   If there is an SGR code stored for the new source after the source
1823	   switch, that SGR code SHOULD be transmitted to the recipient before
1824	   the label.  This transmission does not influence the display count.
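
   A minimal Python sketch of the source switch described above is given
   here for illustration.  The function names are hypothetical.  The
   SGR reset sequence is copied as written in the text above.  The
   oldest_time values are assumed to be on a time scale comparable
   between sources, since RTP timestamps from different sources have
   independent offsets.  Not sending a line separator when nothing has
   been transmitted yet is also an assumption.

```python
from typing import Dict, Optional

LINE_SEPARATOR = "\u2028"
SGR_RESET = "\u009b\u0000\u006d"  # SGR 0, code points as given in the text


def pick_source(oldest_time: Dict[str, float]) -> str:
    """Select the source that has the oldest text in queue."""
    return min(oldest_time, key=oldest_time.get)


def switch_prefix(last_sent: str, old_sgr: Optional[str],
                  new_sgr: Optional[str], label: str) -> str:
    """Compose what is sent at a source switch, before the new text."""
    out = []
    # New line, unless the transmitted text already ended with one.
    if last_sent and not last_sent.endswith((LINE_SEPARATOR, "\r\n")):
        out.append(LINE_SEPARATOR)
    # Reset rendition if the previous source had an SGR code stored.
    if old_sgr:
        out.append(SGR_RESET)
    # Restore the stored SGR code of the new source before the label.
    if new_sgr:
        out.append(new_sgr)
    # Label delimited by '[ ]'.
    out.append("[" + label + "]")
    return "".join(out)
```

   None of the characters produced by switch_prefix influence the
   display count, which starts at zero after the label.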

1826	3.2.3.  Actions on transmission of text

1828	   Text from a source sent to the recipient SHOULD increase the display
1829	   count by one per transmitted character.

1831	3.2.4.  Actions on transmission of control codes

1833	   The following control codes specified by T.140 require specific
1834	   actions.  They SHOULD cause specific considerations in the mixer.
1835	   Note that the codes presented here are expressed in UCS-16, while
1836	   transmission is made in the UTF-8 transform of these codes.

1838	   BEL 0007 Bell  Alert in session, provides for alerting during an
1839	      active session.  The display count SHOULD NOT be altered.

1841	   NEW LINE 2028  Line separator.  Check and perform a source switch if
1842	      appropriate.  Increase display count by 1.

1844	   CR LF 000D 000A  A supported, but not preferred way of requesting a
1845	      new line.  Check and perform a source switch if appropriate.
1846	      Increase display count by 1.

1848	   INT ESC 0061  Interrupt (used to initiate mode negotiation
1849	      procedure).  The display count SHOULD NOT be altered.

1851	   SGR 009B Ps 006D  Select graphic rendition.  Ps is rendition
1852	      parameters specified in ISO 6429.  The display count SHOULD NOT be
1853	      altered.  The SGR code SHOULD be stored for the current source.

1855	   SOS 0098  Start of string, used as a general protocol element
1856	      introducer, followed by a string of at most 256 bytes and the ST.
1857	      The display count SHOULD NOT be altered.

1859	   ST 009C  String terminator, end of SOS string.  The display count
1860	      SHOULD NOT be altered.

1862	   ESC 001B  Escape - used in control strings.  The display count SHOULD
1863	      NOT be altered for the complete escape code.

1865	   Byte order mark "BOM" (U+FEFF)  "Zero width, no break space", used
1866	      for synchronization and keep-alive.  SHOULD be deleted from
1867	      incoming streams.  Shall be sent first after session establishment
1868	      to the recipient.  The display count shall not be altered.

1870	   Missing text mark (U+FFFD)  "Replacement character", represented as a
1871	      question mark in a rhombus, or if that is not feasible, replaced
1872	      by an apostrophe ', marks place in stream of possible text loss.
1873	      SHOULD be inserted by the reception procedure in case of
1874	      unrecoverable loss of packets.  The display count SHOULD be
1875	      increased by one when sent as for any other character.

1877	   SGR  If a control code for selecting graphic rendition (SGR), other
1878	      than reset of the graphic rendition (SGR 0) is sent to a
1879	      recipient, that control code shall also be stored as status for
1880	      the source in the storage for SGR status.  If a reset graphic
1881	      rendition (SGR 0) originated from a source is sent, then the SGR
1882	      status storage for that source shall be cleared.  The display
1883	      count shall not be increased.

1885	   BS (U+0008)  Back Space, intended to erase the last entered character
1886	      by a source.  Erasure by backspace cannot always be performed as
1887	      the erasing party intended.  If an erasing action erases all text
1888	      up to the end of the leading label after a source switch, then the
1889	      mixer must not transmit more backspaces.  Instead it is
1890	      RECOMMENDED that a letter "X" is inserted in the text stream for
1891	      each backspace as an indication of the intent to erase more.  A
1892	      new line is usually coded by a Line Separator, but the character
1893	      combination "CRLF" MAY be used instead.  Erasure of a new line is
1894	      in both cases done by just one erasing action (Backspace).  If the
1895	      display count has a positive value it is decreased by one when the
1896	      BS is sent.  If the display count is at zero, it is not altered.
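
   The display count rules can be illustrated with the following Python
   sketch.  It is a simplification under stated assumptions: the
   function name is hypothetical, and only ordinary characters, BEL,
   CRLF and BS are handled.  SGR, SOS/ST and ESC sequences, which do not
   alter the count, are left out for brevity.

```python
BS = "\u0008"
BEL = "\u0007"


def forward(text, count):
    """Return (text to send, new display count) for text from a source."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == BS:
            if count > 0:
                # Erasure is still possible: send BS and decrease count.
                out.append(BS)
                count -= 1
            else:
                # Erasure reached the label: indicate intent with "X".
                # The count stays at zero.
                out.append("X")
        elif ch == BEL:
            # Alert in session: forwarded, count not altered.
            out.append(ch)
        elif text.startswith("\r\n", i):
            # CRLF is one new line: count increases by 1 for the pair.
            out.append("\r\n")
            count += 1
            i += 1
        else:
            # Any other character, including LS and the missing text mark.
            out.append(ch)
            count += 1
        i += 1
    return "".join(out), count
```

   For example, two typed characters followed by three backspaces result
   in two backspaces and one "X" being sent, with the count back at
   zero.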

1898	3.2.5.  Packet transmission

1900	   A mixer transmitting to a multi-party unaware terminal SHOULD send
1901	   primary data only from one source per packet.  The SSRC SHOULD be the
1902	   SSRC of the mixer.  The CSRC list SHOULD contain one member and be
1903	   the SSRC of the source of the primary data.
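
   As a sketch of the header layout this implies, the following Python
   function packs an RTP fixed header (RFC 3550) with the mixer's SSRC
   and a CSRC list of one member.  The function name and the default
   payload type number 98 are assumptions made for illustration.

```python
import struct


def rtp_header(mixer_ssrc, source_ssrc, seq, timestamp,
               payload_type=98, marker=False):
    """Pack a 16-byte RTP header: fixed part plus one CSRC entry."""
    # Version 2, no padding, no extension, CSRC count = 1.
    byte0 = (2 << 6) | 1
    # Marker bit and 7-bit payload type.
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    return struct.pack(
        "!BBHIII",
        byte0,
        byte1,
        seq & 0xFFFF,
        timestamp & 0xFFFFFFFF,
        mixer_ssrc,    # SSRC is the SSRC of the mixer
        source_ssrc,   # single CSRC: SSRC of the source of the primary data
    )
```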

1905	3.2.6.  Functional limitations

1907	   When a multi-party unaware endpoint presents a conversation in one
1908	   display area in a chat style, it inserts source indications for
1909	   remote text and local user text as they are merged in completed text
1910	   groups.  When an endpoint using this layout receives and presents
1911	   text mixed for multi-party unaware endpoints, there will be two
1912	   levels of source indicators for the received text; one generated by
1913	   the mixer and inserted in a label after each source switch, and
1914	   another generated by the receiving endpoint and inserted after each
1915	   switch between local and remote source in the presentation area.
1916	   This will waste display space and look inconsistent to the reader.

1918	   New text can be presented only from one source at a time.  Switch of
1919	   source to be presented takes place at suitable places in the text,
1920	   such as end of phrase, end of sentence, line separator and
1921	   inactivity.  Therefore the time to switch to present waiting text
1922	   from other sources may become long and will vary and depend on the
1923	   actions of the currently presented source.

1925	   Erasure can only be done up to the latest source switch.  If a user
1926	   tries to erase more text, the erasing actions will be presented as
1927	   letter X after the label.

1929	   Text loss because of network errors may hit the label between entries
1930	   from different parties, creating a risk of misunderstanding which
1931	   source a piece of text comes from.

1933	   These facts make it strongly RECOMMENDED to implement multi-party
1934	   awareness in RTT endpoints.  The use of the mixing method for multi-
1935	   party-unaware endpoints should be left for use with endpoints which
1936	   are impossible to upgrade to become multi-party aware.

1938	3.2.7.  Example views of presentation on multi-party unaware endpoints

1940	   The following pictures are examples of the view on a participant's
1941	   display for the multi-party-unaware case.

1943	     _________________________________________________
1944	    |       Conference       |          Alice          |
1945	    |________________________|_________________________|
1946	    |                        |I will arrive by TGV.    |
1947	    |[Bob]:My flight is to   |Convenient to the main   |
1948	    |Orly.                   |station.                 |
1949	    |[Eve]:Hi all, can we    |                         |
1950	    |plan for the seminar.   |                         |
1951	    |                        |                         |
1952	    |[Bob]:Eve, will you do  |                         |
1953	    |your presentation on    |                         |
1954	    |Friday?                 |                         |
1955	    |[Eve]:Yes, Friday at 10.|                         |
1956	    |[Bob]: Fine, wo         |We need to meet befo     |
1957	    |________________________|_________________________|

1959	   Figure 5: Alice who has a conference-unaware client is receiving the
1960	   multi-party real-time text in a single stream.  This figure shows how
1961	   a coordinated column view MAY be presented on Alice's device.

1963	     _________________________________________________
1964	    |                                              |^|
1965	    |[Alice] Hi, Alice here.                       |-|
1966	    |                                              | |
1967	    |[mix][Bob] Bob as well.                       | |
1968	    |                                              | |
1969	    |[Eve] Hi, this is Eve, calling from Paris     | |
1970	    |      I thought you should be here.           | |
1971	    |                                              | |
1972	    |[Alice] I am coming on Thursday, my           | |
1973	    |      performance is not until Friday morning.| |
1974	    |                                              | |
1975	    |[mix][Bob] And I on Wednesday evening.        | |
1976	    |                                              | |
1977	    |[Eve] we can have dinner and then walk        | |
1978	    |                                              | |
1979	    |[Eve] But I need to be back to                | |
1980	    |    the hotel by 11 because I need            | |
1981	    |                                              |-|
1982	    |______________________________________________|v|
1983	    | of course, I underst                           |
1984	    |________________________________________________|

1986	   Figure 6: An example of a view of the multi-party unaware
1987	   presentation in chat style.  Alice is the local user.

1989	4.  Gateway Considerations

1991	4.1.  Gateway considerations with Textphones (e.g.  TTYs).

1993	   Multi-party RTT sessions may involve gateways of different kinds.
1994	   Gateways involved in setting up sessions SHALL correctly reflect the
1995	   multi-party capability or unawareness of the combination of the
1996	   gateway and the remote endpoint beyond the gateway.

1998	   One case that may occur is a gateway to PSTN for communication with
1999	   textphones (e.g.  TTYs).  Textphones are limited devices with no
2000	   multi-party awareness, and it SHOULD therefore be suitable for the
2001	   gateway to not indicate multi-party awareness for that case.  Another
2002	   solution is that the gateway indicates multi-party capability towards
2003	   the mixer, and includes the multi-party mixer function for multi-
2004	   party unaware endpoints itself.  This solution makes it possible to
2005	   make adaptations for the functional limitations of the textphone
2006	   (TTY).

2008	   More information on gateways to textphones (TTYs) is found in RFC
2009	   5194 [RFC5194].

2011	4.2.  Gateway considerations with WebRTC.

2013	   Gateway operation to real-time text in WebRTC may also be required.
2014	   In WebRTC, RTT is specified in
2015	   [I-D.ietf-mmusic-t140-usage-data-channel].

2017	   A multi-party bridge may have functionality for communicating by RTT
2018	   both in RTP streams with RTT and WebRTC t140 data channels.  Other
2019	   configurations may consist of a multi-party bridge with either
2020	   technology for RTT transport and a separate gateway for conversion of
2021	   the text communication streams between RTP and t140 data channel.

2023	   In WebRTC, it is assumed that for a multi-party session, one t140
2024	   data channel is established for each source from a gateway or bridge
2025	   to each participant.  Each participant also has a data channel with
2026	   two-way connection with the gateway or bridge.

2028	   The t140 channel used both ways is for text from the WebRTC user and
2029	   from the bridge or gateway itself to the WebRTC user.  The label
2030	   parameter of this t140 channel is used as NAME field in RTCP to
2031	   participants on the RTP side.  The other t140 channels are only for
2032	   text from other participants to the WebRTC user.

2034	   When a new participant has entered the session with RTP transport of
2035	   RTT, a new t140 channel SHOULD be established to WebRTC users with
2036	   the label parameter composed from the NAME field in RTCP on the RTP
2037	   side.

2039	   When a new participant has entered the multi-party session with RTT
2040	   transport in a WebRTC t140 data channel, the new participant SHOULD
2041	   be announced by a notification to RTP users.  The label parameter
2042	   from the WebRTC side SHOULD be used as the NAME RTCP field on the RTP
2043	   side, or other available session information.

2045	5.  Updates to RFC 4102 and RFC 4103

2047	   This document updates RFC 4102 [RFC4102] and RFC 4103 [RFC4103] by
2048	   introducing an SDP media attribute "rtt-mix-rtp-mixer" for
2049	   negotiation of multi-party mixing capability with the [RFC4103]
2050	   format and an extended packet format "text/rex" for the enhanced
2051	   performance multi-party mixing case and more strict rules for the use
2052	   of redundancy, and population of the CSRC list in the packets.
2053	   Implications for the CSRC list use from RFC 2198 [RFC2198] are not in
2054	   effect for the "text/rex" format.

2056	   The update is in line with the statement in RFC 4103 section 4,
2057	   saying that "Forward Error Correction mechanisms, ..., or any other
2058	   mechanism with the purpose of increasing the reliability of text
2059	   transmission, MAY be used as an alternative or complement to
2060	   redundancy."

2062	6.  Congestion considerations

2064	   The congestion considerations and recommended actions from RFC 4103
2065	   [RFC4103] are valid also in multi-party situations.

2067	   The first action in case of congestion SHOULD be to temporarily
2068	   increase the transmission interval up to two seconds.

2070	7.  Acknowledgements

2072	   James Hamlin for format input.

2074	8.  IANA Considerations

2076	8.1.  Registration of the "rtt-mix-rtp-mixer" SDP media attribute

2078	   [RFC EDITOR NOTE: Please replace all instances of RFCXXXX with the
2079	   RFC number of this document.]

2081	   IANA is asked to register the new SDP attribute "rtt-mix-rtp-mixer".

2083	   Contact name:  IESG

2085	   Contact email:  iesg@ietf.org

2087	   Attribute name:  rtt-mix-rtp-mixer

2089	   Attribute syntax:  a=rtt-mix-rtp-mixer

2091	   Attribute semantics:  See RFCXXXX Section 2.1.1

2093	   Attribute value:  none

2095	   Usage level:  media

2097	   Purpose:  Indicate support by mixer or endpoint of multi-party mixing
2098	      for real-time text transmission, using a common RTP-stream for
2099	      transmission of text from a number of sources mixed with one
2100	      source at a time and the source indicated in a single CSRC-list
2101	      member.

2103	   Charset Dependent:  no
2104	   O/A procedure:  See RFCXXXX Section 2.1.19

2106	   Mux Category:  normal

2108	   Reference:  RFCXXXX
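
   To illustrate how the attribute might appear and be detected at
   media level, the following Python sketch scans an SDP description
   for "a=rtt-mix-rtp-mixer" within a text media section.  The function
   name, the port number and the payload type numbers in the example
   SDP are hypothetical.

```python
EXAMPLE_SDP = """\
m=text 11000 RTP/AVP 100 98
a=rtpmap:98 t140/1000
a=rtpmap:100 red/1000
a=fmtp:100 98/98/98
a=rtt-mix-rtp-mixer
"""


def media_has_rtt_mix(sdp: str) -> bool:
    """True if a text media section carries a=rtt-mix-rtp-mixer."""
    in_text_media = False
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("m="):
            # A new media section starts; the attribute is media level.
            in_text_media = line.startswith("m=text ")
        elif in_text_media and line == "a=rtt-mix-rtp-mixer":
            return True
    return False
```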

2110	8.2.  Registration of "text/rex" media subtype

2112	   The IANA is requested to register the media type "text/rex" as
2113	   specified in Section 2.2.9.  The media type is also requested to be
2114	   added to the IANA registry for "RTP Payload Format Media Types"
2115	   <http://www.iana.org/assignments/rtp-parameters>.

2117	9.  Security Considerations

2119	   The RTP-mixer model requires the mixer to be allowed to decrypt, pack
2120	   and encrypt secured text from the conference participants.  Therefore
2121	   the mixer needs to be trusted.  This is similar to the situation for
2122	   central mixers of audio and video.

2124	   The requirement to transfer information about the user in RTCP
2125	   reports in SDES, CNAME and NAME fields, and in conference
2126	   notifications, for creation of labels may raise privacy concerns as
2127	   already stated in RFC 3550 [RFC3550], and may be restricted for
2128	   privacy reasons.  The receiving user will then get a more symbolic
2129	   label for the source.

2131	10.  Change history

2133	10.1.  Changes included in draft-ietf-avtcore-multi-party-rtt-mix-07

2135	   Added a method based on the "text/red" format and single source per
2136	   packet, negotiated by the "rtt-mix-rtp-mixer" SDP attribute.

2138	   Added reasoning and recommendation about indication of loss.

2140	   The highest number of sources in one packet is 15, not 16.  Changed.

2142	   Added in information on update to RFC 4103 that RFC 4103 explicitly
2143	   allows addition of FEC method.  The redundancy is a kind of forward
2144	   error correction.

2146	10.2.  Changes included in draft-ietf-avtcore-multi-party-rtt-mix-06

2148	   Improved definitions list format.

2150	   The format of the media subtype parameters is made to match the
2151	   requirements.

2153	   The mapping of media subtype parameters to SDP is included.

2155	   The CPS parameter belongs to the t140 subtype and does not need to be
2156	   registered here.

2158	10.3.  Changes included in draft-ietf-avtcore-multi-party-rtt-mix-05

2160	   nomenclature and editorial improvements

2162	   "this document" used consistently to refer to this document.

2164	10.4.  Changes included in draft-ietf-avtcore-multi-party-rtt-mix-04

2166	   'Redundancy header' renamed to 'data header'.

2168	   More clarifications added.

2170	   Language and figure number corrections.

2172	10.5.  Changes included in draft-ietf-avtcore-multi-party-rtt-mix-03

2174	   Mention possible need to mute and raise hands as for other media.
2175	   ---done ----

2177	   Make sure that use in two-party calls is also possible and explained.
2178	   - may need more wording -

2180	   Clarify the RTT is often used together with other media. --done--

2182	   Tell that text mixing is N-1.  A user's own text is not received in
2183	   the mix. -done-

2185	   In 3. correct the interval to: A "text/rex" transmitter SHOULD send
2186	   packets distributed in time as long as there is something (new or
2187	   redundant T140blocks) to transmit.  The maximum transmission interval
2188	   SHOULD then be 300 ms.  It is RECOMMENDED to send a packet to a
2189	   receiver as soon as new text to that receiver is available, as long
2190	   as the time after the latest sent packet to the same receiver is more
2191	   than 150 ms, and also the maximum character rate to the receiver is
2192	   not exceeded.  The intention is to keep the latency low while keeping
2193	   a good protection against text loss in bursty packet loss conditions.
2194	   -done-

2196	   In 1.3 say that the format is used both ways. -done-

2198	   In 13.1 change presentation area to presentation field so that reader
2199	   does not think it shall be totally separated. -done-
2200	   In Performance and intro, tell the performance in number of
2201	   simultaneous sending users and introduced delay 16, 150 vs
2202	   requirements 5 vs 500. -done --

2204	   Clarify redundancy level per connection.  -done-

2206	   Timestamp also for the last data header.  To make it possible for all
2207	   text to have time offset as for transmission from the source.  Make
2208	   that header equal to the others. -done-

2210	   Mixer always use the CSRC list, even for its own BOM. -done-

2212	   Combine all talk about transmission interval (300 ms vs when text has
2213	   arrived) in section 3 in one paragraph or close to each other. -done-

2215	   Documents the goal of good performance with low delay for 5
2216	   simultaneous typers in the introduction. -done-

2218	   Describe better that only primary text shall be sent on to receivers.
2219	   Redundancy and loss must be resolved by the mixer. -done-

2221	10.6.  Changes included in draft-ietf-avtcore-multi-party-rtt-mix-02

2223	   SDP and better description and visibility of security by OSRTP RFC
2224	   8643 needed.

2226	   The description of gatewaying to WebRTC extended.

2228	   The description of the data header in the packet is improved.

2230	10.7.  Changes to draft-ietf-avtcore-multi-party-rtt-mix-01

2232	   2,5,6 More efficient format "text/rex" introduced and attribute
2233	   a=rtt-mix deleted.

2235	   3.  Brief about use of OSRTP for security included.  More needed.

2237	   4.  Brief motivation for the solution and why not rtp-translator is
2238	   used added to intro.

2240	   7.  More limitations for the multi-party unaware mixing method
2241	   inserted.

2243	   8.  Updates to RFC 4102 and 4103 more clearly expressed.

2245	   9.  Gateway to WebRTC started.  More needed.

2247	10.8.  Changes from draft-hellstrom-avtcore-multi-party-rtt-source-03 to
2248	       draft-ietf-avtcore-multi-party-rtt-mix-00

2250	   Changed file name to draft-ietf-avtcore-multi-party-rtt-mix-00

2252	   Replaced CDATA in IANA registration table with better coding.

2254	   Converted to xml2rfc version 3.

2256	10.9.  Changes from draft-hellstrom-avtcore-multi-party-rtt-source-02 to
2257	       -03

2259	   Changed company and e-mail of the author.

2261	   Changed title to "RTP-mixer formatting of multi-party Real-time text"
2262	   to better match contents.

2264	   Check and modification where needed of use of RFC 2119 words SHALL
2265	   etc.

2267	   More about the CC value in sections on transmitters and receivers so
2268	   that 1-to-1 sessions do not use the mixer format.

2270	   Enhanced section on presentation for multi-party-unaware endpoints

2272	   A paragraph recommending CPS=150 inserted in the performance section.

2274	10.10.  Changes from draft-hellstrom-avtcore-multi-party-rtt-source-01
2275	        to -02

2277	   In Abstract and 1.  Introduction: Introduced wording about regulatory
2278	   requirements.

2280	   In section 5: The transmission interval is decreased to 100 ms when
2281	   there is text from more than one source to transmit.

2283	   In section 11 about SDP negotiation, a SHOULD-requirement is
2284	   introduced that the mixer should make a mix for multi-party unaware
2285	   endpoints if the negotiation is not successful.  And a reference to a
2286	   later chapter about it.

2288	   The presentation considerations chapter 14 is extended with more
2289	   information about presentation on multi-party aware endpoints, and a
2290	   new section on the multi-party unaware mixing, which has low
2291	   functionality but SHOULD be implemented in mixers.  Presentation
2292	   examples are added.

2294	   A short chapter 15 on gateway considerations is introduced.

2296	   Clarification about the text/t140 format included in chapter 10.

2298	   This sentence added to the chapter 10 about use without redundancy.
2299	   "The text/red format SHOULD be used unless some other protection
2300	   against packet loss is utilized, for example a reliable network or
2301	   transport."

2303	   Note about deviation from RFC 2198 added in chapter 4.

2305	   In chapter 9.  "Use with SIP centralized conferencing framework" the
2306	   following note is inserted: Note: The CSRC-list in an RTP packet only
2307	   includes participants whose text is included in one or more text
2308	   blocks.  It is not the same as the list of participants in a
2309	   conference.  With audio and video media, the CSRC-list would often
2310	   contain all participants who are not muted whereas text participants
2311	   that don't type are completely silent and so don't show up in RTP
2312	   packet CSRC-lists.

2314	10.11.  Changes from draft-hellstrom-avtcore-multi-party-rtt-source-00
2315	        to -01

2317	   Editorial cleanup.

2319	   Changed capability indication from fmtp-parameter to SDP attribute
2320	   "rtt-mix".

2322	   Swapped order of redundancy elements in the example to match reality.

2324	   Increased the SDP negotiation section

2326	11.  References

2328	11.1.  Normative References

2330	   [I-D.ietf-mmusic-t140-usage-data-channel]
2331	              Holmberg, C. and G. Hellstrom, "T.140 Real-time Text
2332	              Conversation over WebRTC Data Channels", Work in Progress,
2333	              Internet-Draft, draft-ietf-mmusic-t140-usage-data-channel-
2334	              14, 10 April 2020, <https://tools.ietf.org/html/draft-
2335	              ietf-mmusic-t140-usage-data-channel-14>.

2337	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
2338	              Requirement Levels", BCP 14, RFC 2119,
2339	              DOI 10.17487/RFC2119, March 1997,
2340	              <https://www.rfc-editor.org/info/rfc2119>.

2342	   [RFC2198]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
2343	              Handley, M., Bolot, J.C., Vega-Garcia, A., and S. Fosse-
2344	              Parisis, "RTP Payload for Redundant Audio Data", RFC 2198,
2345	              DOI 10.17487/RFC2198, September 1997,
2346	              <https://www.rfc-editor.org/info/rfc2198>.

2348	   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
2349	              Jacobson, "RTP: A Transport Protocol for Real-Time
2350	              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
2351	              July 2003, <https://www.rfc-editor.org/info/rfc3550>.

2353	   [RFC4102]  Jones, P., "Registration of the text/red MIME Sub-Type",
2354	              RFC 4102, DOI 10.17487/RFC4102, June 2005,
2355	              <https://www.rfc-editor.org/info/rfc4102>.

2357	   [RFC4103]  Hellstrom, G. and P. Jones, "RTP Payload for Text
2358	              Conversation", RFC 4103, DOI 10.17487/RFC4103, June 2005,
2359	              <https://www.rfc-editor.org/info/rfc4103>.

2361	   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
2362	              Description Protocol", RFC 4566, DOI 10.17487/RFC4566,
2363	              July 2006, <https://www.rfc-editor.org/info/rfc4566>.

2365	   [RFC4855]  Casner, S., "Media Type Registration of RTP Payload
2366	              Formats", RFC 4855, DOI 10.17487/RFC4855, February 2007,
2367	              <https://www.rfc-editor.org/info/rfc4855>.

2369	   [RFC5764]  McGrew, D. and E. Rescorla, "Datagram Transport Layer
2370	              Security (DTLS) Extension to Establish Keys for the Secure
2371	              Real-time Transport Protocol (SRTP)", RFC 5764,
2372	              DOI 10.17487/RFC5764, May 2010,
2373	              <https://www.rfc-editor.org/info/rfc5764>.

2375	   [RFC6263]  Marjou, X. and A. Sollaud, "Application Mechanism for
2376	              Keeping Alive the NAT Mappings Associated with RTP / RTP
2377	              Control Protocol (RTCP) Flows", RFC 6263,
2378	              DOI 10.17487/RFC6263, June 2011,
2379	              <https://www.rfc-editor.org/info/rfc6263>.

2381	   [RFC6838]  Freed, N., Klensin, J., and T. Hansen, "Media Type
2382	              Specifications and Registration Procedures", BCP 13,
2383	              RFC 6838, DOI 10.17487/RFC6838, January 2013,
2384	              <https://www.rfc-editor.org/info/rfc6838>.

2386	   [RFC8643]  Johnston, A., Aboba, B., Hutton, A., Jesske, R., and T.
2387	              Stach, "An Opportunistic Approach for Secure Real-time
2388	              Transport Protocol (OSRTP)", RFC 8643,
2389	              DOI 10.17487/RFC8643, August 2019,
2390	              <https://www.rfc-editor.org/info/rfc8643>.

2392	   [T140]     ITU-T, "Recommendation ITU-T T.140 (02/1998), Protocol for
2393	              multimedia application text conversation", February 1998,
2394	              <https://www.itu.int/rec/T-REC-T.140-199802-I/en>.

2396	   [T140ad1]  ITU-T, "Recommendation ITU-T T.140 Addendum 1 - (02/2000),
2397	              Protocol for multimedia application text conversation",
2398	              February 2000,
2399	              <https://www.itu.int/rec/T-REC-T.140-200002-I!Add1/en>.

2401	11.2.  Informative References

2403	   [RFC4353]  Rosenberg, J., "A Framework for Conferencing with the
2404	              Session Initiation Protocol (SIP)", RFC 4353,
2405	              DOI 10.17487/RFC4353, February 2006,
2406	              <https://www.rfc-editor.org/info/rfc4353>.

2408	   [RFC4575]  Rosenberg, J., Schulzrinne, H., and O. Levin, Ed., "A
2409	              Session Initiation Protocol (SIP) Event Package for
2410	              Conference State", RFC 4575, DOI 10.17487/RFC4575, August
2411	              2006, <https://www.rfc-editor.org/info/rfc4575>.

2413	   [RFC4579]  Johnston, A. and O. Levin, "Session Initiation Protocol
2414	              (SIP) Call Control - Conferencing for User Agents",
2415	              BCP 119, RFC 4579, DOI 10.17487/RFC4579, August 2006,
2416	              <https://www.rfc-editor.org/info/rfc4579>.

2418	   [RFC5194]  van Wijk, A., Ed. and G. Gybels, Ed., "Framework for Real-
2419	              Time Text over IP Using the Session Initiation Protocol
2420	              (SIP)", RFC 5194, DOI 10.17487/RFC5194, June 2008,
2421	              <https://www.rfc-editor.org/info/rfc5194>.

2423	Author's Address

2425	   Gunnar Hellstrom
2426	   Gunnar Hellstrom Accessible Communication
2427	   Esplanaden 30
2428	   SE-13670 Vendelso
2429	   Sweden

2431	   Email: gunnar.hellstrom@ghaccess.se