idnits 2.17.1

draft-ietf-avtcore-multi-party-rtt-mix-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([RFC4102], [RFC4103]), which
     it shouldn't. Please replace those with straight textual mentions of the
     documents in question.

  == The 'Updates: ' line in the draft header should list only the _numbers_
     of the RFCs which will be updated by this document (if approved); it
     should not include the word 'RFC' in the list.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 820 has weird spacing: '...example from ...'

  == Line 835 has weird spacing: '...example from ...'

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'MUST not' in this paragraph:

     A party not performing as a mixer MUST not include the CSRC list if it
     has a single source of text.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'SHOULD not' in this paragraph:

     BEL 0007 Bell Alert in session, provides for alerting during an active
     session. The display count SHOULD not be altered.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'SHOULD not' in this paragraph:

     INT ESC 0061 Interrupt (used to initiate mode negotiation procedure).
     The display count SHOULD not be altered.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'SHOULD not' in this paragraph:

     SGR 009B Ps 006D Select graphic rendition. Ps is rendition parameters
     specified in ISO 6429. The display count SHOULD not be altered. The SGR
     code SHOULD be stored for the current source.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'SHOULD not' in this paragraph:

     SOS 0098 Start of string, used as a general protocol element introducer,
     followed by a maximum 256 bytes string and the ST. The display count
     SHOULD not be altered.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'SHOULD not' in this paragraph:

     ST 009C String terminator, end of SOS string. The display count SHOULD
     not be altered.

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'SHOULD not' in this paragraph:

     ESC 001B Escape - used in control strings. The display count SHOULD not
     be altered for the complete escape code.

     (Using the creation date from RFC4102, updated by this document, for
     RFC5378 checks: 2003-12-18)

     (Using the creation date from RFC4103, updated by this document, for
     RFC5378 checks: 2003-11-21)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (11 June 2020) is 1414 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'Bob' is mentioned on line 1409, but not defined

  ** Obsolete normative reference: RFC 4566 (Obsoleted by RFC 8866)

  ** Downref: Normative reference to an Informational RFC: RFC 8643

  -- Possible downref: Non-RFC (?) normative reference: ref. 'T140'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'T140ad1'

     Summary: 3 errors (**), 0 flaws (~~), 12 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    AVTCore                                                     G. Hellstrom
3    Internet-Draft                 Gunnar Hellstrom Accessible Communication
4    Updates: RFC 4102, RFC 4103 (if approved)                   11 June 2020
5    Intended status: Standards Track
6    Expires: 13 December 2020

8              RTP-mixer formatting of multi-party Real-time text
9                  draft-ietf-avtcore-multi-party-rtt-mix-06

11   Abstract

13      Real-time text mixers for multi-party sessions need to identify the
14      source of each transmitted group of text so that the text can be
15      presented by endpoints in suitable grouping with other text from the
16      same source.

18      Regional regulatory requirements specify provision of real-time text
19      in multi-party calls. RFC 4103 mixer implementations can use
20      traditional RTP functions for source identification, but the mixer
21      source switching performance is limited when using the default
22      transmission with redundancy.

24      An enhancement for RFC 4103 real-time text mixing is provided in this
25      document, suitable for a centralized conference model that enables
26      source identification and efficient source switching. The intended
27      use is for real-time text mixers and multi-party-aware participant
28      endpoints. The mechanism builds on use of the CSRC list in the RTP
29      packet and an extended packet format "text/rex".

31      A capability exchange is specified so that it can be verified that a
32      participant can handle the multi-party coded real-time text stream.
33      The capability is indicated by the media subtype "text/rex".

35      The document updates RFC 4102 [RFC4102] and RFC 4103 [RFC4103].

37      A brief description about how a mixer can format text for the case
38      when the endpoint is not multi-party aware is also provided.

40   Status of This Memo

42      This Internet-Draft is submitted in full conformance with the
43      provisions of BCP 78 and BCP 79.

45      Internet-Drafts are working documents of the Internet Engineering
46      Task Force (IETF). Note that other groups may also distribute
47      working documents as Internet-Drafts. The list of current Internet-
48      Drafts is at https://datatracker.ietf.org/drafts/current/.

50      Internet-Drafts are draft documents valid for a maximum of six months
51      and may be updated, replaced, or obsoleted by other documents at any
52      time. It is inappropriate to use Internet-Drafts as reference
53      material or to cite them other than as "work in progress."

55      This Internet-Draft will expire on 13 December 2020.

57   Copyright Notice

59      Copyright (c) 2020 IETF Trust and the persons identified as the
60      document authors. All rights reserved.

62      This document is subject to BCP 78 and the IETF Trust's Legal
63      Provisions Relating to IETF Documents (https://trustee.ietf.org/
64      license-info) in effect on the date of publication of this document.
65      Please review these documents carefully, as they describe your rights
66      and restrictions with respect to this document. Code Components
67      extracted from this document must include Simplified BSD License text
68      as described in Section 4.e of the Trust Legal Provisions and are
69      provided without warranty as described in the Simplified BSD License.

71   Table of Contents

73      1.  Introduction
74        1.1.  Selected solution and considered alternative
75        1.2.  Nomenclature
76        1.3.  Intended application
77      2.  Use of fields in the RTP packets
78      3.  Actions at transmission by a mixer
79        3.1.  Initial BOM transmission
80        3.2.  Keep-alive
81        3.3.  Transmission interval
82        3.4.  Do not send received text to the originating source
83        3.5.  Clean incoming text
84        3.6.  Redundancy
85        3.7.  Text placement in packets
86        3.8.  Maximum number of sources per packet
87        3.9.  Empty T140blocks
88        3.10. Creation of the redundancy
89        3.11. Timer offset fields
90        3.12. Other RTP header fields
91        3.13. Pause in transmission
92      4.  Actions at reception
93        4.1.  Multi-party vs two-party use
94        4.2.  Level of redundancy
95        4.3.  Extracting text and handling recovery and loss
96        4.4.  Delete BOM
97        4.5.  Empty T140blocks
99      5.  RTCP considerations
100     6.  Chained operation
101     7.  Usage without redundancy
102     8.  Use with SIP centralized conferencing framework
103     9.  Conference control
104     10. Media Subtype Registration
105     11. SDP considerations
106       11.1.  Mapping of media parameters to sdp
107       11.2.  Security for session control and media
108       11.3.  SDP offer/answer examples
109     12. Examples
110     13. Performance considerations
111     14. Presentation level considerations
112       14.1.  Presentation by multi-party aware endpoints
113       14.2.  Multi-party mixing for multi-party unaware endpoints
114     15. Gateway Considerations
115       15.1.  Gateway considerations with Textphones (e.g. TTYs).
116       15.2.  Gateway considerations with WebRTC.
117     16. Updates to RFC 4102 and RFC 4103
118     17. Congestion considerations
119     18. Acknowledgements
120     19. IANA Considerations
121     20. Security Considerations
122     21. Change history
123       21.1.  Changes included in
124              draft-ietf-avtcore-multi-party-rtt-mix-06
125       21.2.  Changes included in
126              draft-ietf-avtcore-multi-party-rtt-mix-05
127       21.3.  Changes included in
128              draft-ietf-avtcore-multi-party-rtt-mix-04
129       21.4.  Changes included in
130              draft-ietf-avtcore-multi-party-rtt-mix-03
131       21.5.  Changes included in
132              draft-ietf-avtcore-multi-party-rtt-mix-02
133       21.6.  Changes to draft-ietf-avtcore-multi-party-rtt-mix-01
134       21.7.  Changes from
135              draft-hellstrom-avtcore-multi-party-rtt-source-03 to
136              draft-ietf-avtcore-multi-party-rtt-mix-00
137       21.8.  Changes from
138              draft-hellstrom-avtcore-multi-party-rtt-source-02 to
139              -03
140       21.9.  Changes from
141              draft-hellstrom-avtcore-multi-party-rtt-source-01 to
142              -02
143       21.10. Changes from
144              draft-hellstrom-avtcore-multi-party-rtt-source-00 to
145              -01
146     22. References
147       22.1.  Normative References
148       22.2.  Informative References
149     Author's Address

151  1.  Introduction

153     RFC 4103 [RFC4103] specifies use of RFC 3550 RTP [RFC3550] for
154     transmission of real-time text (RTT) and the "text/t140" format. It
155     also specifies a redundancy format "text/red" for increased
156     robustness. RFC 4102 [RFC4102] registers the "text/red" format.
157     Regional regulatory requirements specify provision of real-time text
158     in multi-party calls.

160     Real-time text is usually provided together with audio and sometimes
161     with video in conversational sessions.

163     The redundancy scheme of RFC 4103 [RFC4103] enables efficient
164     transmission of redundant text in packets together with new text.
165     However, the redundant header format has no source indicators for the
166     redundant transmissions. An assumption has had to be made that the
167     redundant parts in a packet are from the same source as the new text.
168     The recommended transmission is one new and two redundant generations
169     of text (T140blocks) in each packet and the recommended transmission
170     interval is 300 ms.

172     A mixer, selecting between text input from different sources and
173     transmitting it in a common stream, needs to make sure that the
174     receiver can assign the received text to the proper sources for
175     presentation. Therefore, using RFC 4103 without any extra rule for
176     source identification, the mixer needs to stop sending new text from
177     one source and then make sure that all text so far has been sent with
178     all intended redundancy levels (usually two) before switching to
179     another source. That causes the very long time of one second to
180     switch between transmission of text from one source to text from
181     another source. Both the total throughput and the switching
182     performance in the mixer are too low for most applications.

184     A more efficient source identification scheme requires that each
185     redundant T140block has its source individually preserved. This
186     document introduces a source indicator by specific rules for
187     populating the CSRC-list and the data header in the RTP-packet.

189     An extended packet format "text/rex" is specified for this purpose,
190     providing the possibility to include text from up to 16 sources in
191     each packet in order to enhance mixer source switching performance.
192     With these extensions, the solution in this document exceeds the
193     performance requirements on multi-party mixing for real-time
194     text.

196     A negotiation mechanism can therefore be based on selection between
197     the "text/red" and the "text/rex" media formats for verification that
198     the receiver is able to handle the multi-party coded stream.

200     A fall-back mixing procedure is specified for cases when the
201     negotiation results in "text/red" being the only common submedia
202     format.

204     The document updates RFC 4102 [RFC4102] and RFC 4103 [RFC4103] by
205     introducing an extended packet format for the multi-party mixing case
206     and stricter rules for the source indications.

208  1.1.  Selected solution and considered alternative

210     The mechanism specified in this document makes use of the RTP mixer
211     model specified in RFC 3550 [RFC3550]. From some points of view, use
212     of the RTP translator model specified in RFC 3550 would be more
213     efficient, because then the text packets can pass the translator with
214     only minor modification. However, there may be a lack of support for
215     the translator model in existing RTP implementations, and therefore
216     the more common RTP-mixer model was selected. The translator model
217     would also more easily cause congestion if many users send text
218     simultaneously.

220  1.2.  Nomenclature

222     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
223     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
224     document are to be interpreted as described in [RFC2119].

226     The terms SDES, CNAME, NAME, SSRC, CSRC, CSRC list, CC, RTCP, RTP-
227     mixer, RTP-translator are explained in [RFC3550].

229     The term "T140block" is defined in RFC 4103 [RFC4103] to contain one
230     or more T.140 code elements.

232     "TTY" stands for a text telephone type used in North America.

234     "WebRTC" stands for web based communication specified by W3C and
235     IETF.

237     "DTLS-SRTP" stands for security specified in RFC 5764 [RFC5764].

239  1.3.  Intended application

241     The format for multi-party real-time text is primarily intended for
242     use in transmission between mixers and endpoints in centralised
243     mixing configurations. It is also applicable between endpoints as
244     well as between mixers.

246  2.  Use of fields in the RTP packets

248     RFC 4103 [RFC4103] specifies use of RFC 3550 RTP [RFC3550], and a
249     redundancy format "text/red" for increased robustness of real-time
250     text transmission. This document updates RFC 4102 [RFC4102] and RFC
251     4103 [RFC4103] by introducing a format "text/rex" with a rule for
252     populating and using the CSRC-list in the RTP packet and extending
253     the redundancy header to be called a data header. This is done in
254     order to enhance the performance in multi-party RTT sessions.

256     The "text/rex" format can be seen as an "n-tuple" of the "text/red"
257     format intended to carry text information from up to 16 sources per
258     packet.

260     The CC field SHALL show the number of members in the CSRC list, which
261     is one per source represented in the packet.

263     When transmitted from a mixer, a CSRC list is included in the packet.
264     The members in the CSRC-list SHALL contain the SSRCs of the sources
265     of the T140blocks in the packet. The order of the CSRC members MUST
266     be the same as the order of the sources of the data header fields and
267     the T140blocks. When redundancy is used, text from all included
268     sources MUST have the same number of redundant generations. The
269     primary, first redundant, second redundant and possible further
270     redundant generations of T140blocks MUST be grouped per source in the
271     packet in "source groups". The recommended level of redundancy is to
272     use one primary and two redundant generations of T140blocks. In some
273     cases, a primary or redundant T140block is empty, but is still
274     represented by a member in the data header.

276     The RTP header is followed by one or more source groups of data
277     headers: one header for each text block to be included. Each of
278     these data headers provides the timestamp offset and length of the
279     corresponding data block, in addition to the payload type number
280     corresponding to the payload format "text/t140". The data headers
281     are followed by the data fields carrying T140blocks from the sources.

283      0                   1                   2                   3
284      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
285     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
286     |F|   block PT  |  timestamp offset         |   block length    |
287     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

289                  Figure 1: The bits in the data header.

291     The bits in the data header are specified as follows:

293     F: 1 bit  First bit in header indicates whether another header block
294        follows. It has value 1 if further header blocks follow, and
295        value 0 if this is the last header block.

297     block PT: 7 bits  RTP payload type number for this block,
298        corresponding to the t140 payload type from the RTPMAP SDP
299        attribute.

301     timestamp offset: 14 bits  Unsigned offset of timestamp of this block
302        relative to the timestamp given in the RTP header. The offset is
303        a time to be subtracted from the current timestamp to determine
304        the timestamp of the data when the latest part of this block was
305        sent from the original source. If the timestamp offset would be
306        >15 000, it SHALL be set to 15 000. For redundant data, the
307        resulting time is the time when the data was sent as primary from
308        the original source. If the value would be >15 000, then it SHALL
309        be set to 15 000 plus 300 times the redundancy level of the data.
310        The high values appear only in exceptional cases, e.g. when some
311        data has been held in order to keep the text flow under the
312        Characters Per Second (CPS) limit.

314     block length: 10 bits  Length in bytes of the corresponding data
315        block excluding the header.

317     The header for the final block has a zero F bit, and apart from that
318     the same fields as other data headers.

320     Note: This document has a packet format that is similar to that of
321     RFC 2198 [RFC2198] but differs in some aspects. RFC 2198
322     associates the whole of the CSRC-list with the primary data and
323     assumes that the same list applies to reconstructed redundant data.
324     In this document a T140block is associated with exactly one CSRC list
325     member as described above. Also, RFC 2198 [RFC2198] anticipates
326     infrequent change to CSRCs; implementers should be aware that the
327     order of the CSRC-list according to this document will vary during
328     transitions between transmission from the mixer of text originated by
329     different participants. Another difference is that the last member
330     in the data header area in RFC 2198 [RFC2198] only contains the
331     payload type number while in this document it has the same format as
332     all other entries in the data header.

334     The picture below shows a typical "text/rex" RTP packet with multi-
335     party RTT contents from three sources and coding according to this
336     document.
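
As an illustration of the data header fields specified above, the following is a minimal Python sketch that packs and parses one 32-bit data header. The names DataHeader, clamp_ts_offset, pack_data_header and unpack_data_header are hypothetical and not defined by this document. Treating primary data as redundancy level 0 in the ceiling rule is an assumption of the sketch.

```python
import struct
from typing import NamedTuple


class DataHeader(NamedTuple):
    """One 32-bit data header entry (hypothetical helper type)."""
    more_follows: bool  # F bit: True if another header block follows
    block_pt: int       # 7-bit RTP payload type of the block
    ts_offset: int      # 14-bit unsigned timestamp offset
    block_len: int      # 10-bit length of the data block in bytes


def clamp_ts_offset(offset: int, redundancy_level: int = 0) -> int:
    """Apply the 15 000 ceiling; level 0 is assumed to mean primary data."""
    if offset > 15000:
        return 15000 + 300 * redundancy_level
    return offset


def pack_data_header(h: DataHeader) -> bytes:
    """Encode F (1 bit), block PT (7), timestamp offset (14), length (10)."""
    if not 0 <= h.block_pt < (1 << 7):
        raise ValueError("block PT does not fit in 7 bits")
    if not 0 <= h.ts_offset < (1 << 14):
        raise ValueError("timestamp offset does not fit in 14 bits")
    if not 0 <= h.block_len < (1 << 10):
        raise ValueError("block length does not fit in 10 bits")
    word = ((int(h.more_follows) << 31) | (h.block_pt << 24)
            | (h.ts_offset << 10) | h.block_len)
    return struct.pack("!I", word)  # network byte order


def unpack_data_header(data: bytes) -> DataHeader:
    """Decode the first four bytes of data as one data header."""
    (word,) = struct.unpack("!I", data[:4])
    return DataHeader(bool(word >> 31), (word >> 24) & 0x7F,
                      (word >> 10) & 0x3FFF, word & 0x3FF)
```

A receiver following this sketch would call unpack_data_header repeatedly, four bytes at a time, until it meets a header whose F bit is 0.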

338      0                   1                   2                   3
339      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
340     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
341     |V=2|P|X| CC=3  |M|  "REX" PT   |       RTP sequence number     |
342     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
343     |                 timestamp of packet creation                  |
344     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
345     |           synchronization source (SSRC) identifier            |
346     +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
347     |          CSRC list member 1 = SSRC of source of "A"           |
348     |          CSRC list member 2 = SSRC of source of "B"           |
349     |          CSRC list member 3 = SSRC of source of "C"           |
350     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
351     |1|   T140 PT   |timestamp offset of "A-R2" |"A-R2" block length|
352     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
353     |1|   T140 PT   |timestamp offset of "A-R1" |"A-R1" block length|
354     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
355     |1|   T140 PT   | timestamp offset of "A-P" |"A-P" block length |
356     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
357     |1|   T140 PT   |timestamp offset of "B-R2" |"B-R2" block length|
358     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
359     |1|   T140 PT   |timestamp offset of "B-R1" |"B-R1" block length|
360     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
361     |1|   T140 PT   | timestamp offset of "B-P" |"B-P" block length |
362     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
363     |1|   T140 PT   |timestamp offset of "C-R2" |"C-R2" block length|
364     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
365     |1|   T140 PT   |timestamp offset of "C-R1" |"C-R1" block length|
366     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
367     |0|   T140 PT   | timestamp offset of "C-P" |"C-P" block length |
368     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
369     |                "A-R2" T.140 encoded redundant data            |
370     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
371     |               |"A-R1" T.140 encoded redundant data            |
372     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
373     |"A-P" T.140 encoded primary                    |               |
374     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
375     | "B-R2" T.140 encoded redundant data           |               |
376     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
377     |                "B-R1" T.140 encoded redundant data            |
378     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
379     | "B-P" T.140 encoded primary data              |               |
380     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
381     | "C-R2" T.140 encoded redundant data           |               |
382     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
383     |                "C-R1" T.140 encoded redundant data            |
384     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
385     | "C-P" T.140 encoded primary data              |
386     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
387     Figure 2: A "text/rex" packet with text from three sources A, B, C.

389     A-P, B-P, C-P are primary data from A, B and C.

391     A-R1, B-R1, C-R1 are first redundant generation data from A, B and C.

393     A-R2, B-R2, C-R2 are second redundant generation data from A, B and C.

395     In a real case, some of the data headers will likely indicate a zero
396     block length, and no corresponding T.140 data.

398  3.  Actions at transmission by a mixer

400  3.1.  Initial BOM transmission

402     As soon as a participant is known to participate in a session and
403     to be available for text reception, a Unicode BOM character SHALL be
404     sent to it according to the procedures in this document. If the
405     transmitter is a mixer, then the source of this character SHALL be
406     indicated to be the mixer itself.

408  3.2.  Keep-alive

410     After that, the transmitter SHALL send keep-alive traffic to the
411     receivers at regular intervals when no other traffic has occurred
412     during that interval, if that is decided for the actual connection.
413 Recommendations for keep-alive can be found in RFC 6263[RFC6263]. 415 3.3. Transmission interval 417 A "text/rex" transmitter SHOULD send packets distributed in time as 418 long as there is something (new or redundant T140blocks) to transmit. 419 The maximum transmission interval SHOULD then be 300 ms. It is 420 RECOMMENDED to send a packet to a receiver as soon as new text to 421 that receiver is available, as long as the time after the latest sent 422 packet to the same receiver is more than 150 ms, and also the maximum 423 character rate to the receiver is not exceeded. The intention is to 424 keep the latency low while keeping a good protection against text 425 loss in bursty packet loss conditions. 427 3.4. Do not send received text to the originating source 429 Text received from a participant SHOULD NOT be included in 430 transmission to that participant. 432 3.5. Clean incoming text 434 A mixer SHALL handle reception and recovery of packet loss, marking 435 of possible text loss and deletion of 'BOM' characters from each 436 participant before queueing received text for transmission to 437 receiving participants. 439 3.6. Redundancy 441 The transmitting party using redundancy SHALL send redundant 442 repetitions of T140blocks aleady transmitted in earlier packets. The 443 number of redundant generations of T140blocks to include in 444 transmitted packets SHALL be deducted from the SDP negotiation. It 445 SHOULD be set to the minimum of the number declared by the two 446 parties negotiating a connection. The same number of redundant 447 generations MUST be used for text from all sources when it is 448 transmitted to a receiver. The number of generations sent to a 449 receiver SHALL be the same during the whole session unless it is 450 modified by session renegotiation. 452 3.7. 
Text placement in packets 454 At time of transmission, the mixer SHALL populate the RTP packet with 455 T140blocks combined from all T140blocks queued for transmission 456 originating from each source as long as this is not in conflict with 457 the allowed number of characters per second or the maximum packet 458 size. These T140blocks SHALL be placed in the packet interleaved 459 with redundant T140blocks and new T140blocks from other sources. The 460 SSRC of each source shall be placed as a member in the CSRC-list at a 461 place corresponding to the place of its T140blocks in the packet. 463 3.8. Maximum number of sources per packet 465 Text from a maximum of 16 sources MAY be included in a packet. The 466 reason for this limitation is the maximum number of CSRC list members 467 allowed in a packet. If text from more sources need to be 468 transmitted, the mixer MAY let the sources take turns in having their 469 text transmitted. When stopping transmission of one source to allow 470 another source to have its text sent, all intended redundant 471 generations of the last text from the source to be stopped MUST be 472 transmitted before text from another source can be transmitted. 473 Actively transmitting sources SHOULD be allowed to take turns with 474 short intervals to have their text transmitted. 476 Note: The CSRC-list in an RTP packet only includes participants who's 477 text is included in text blocks. It is not the same as the total 478 list of participants in a conference. With audio and video media, 479 the CSRC-list would often contain all participants who are not muted 480 whereas text participants that don't type are completely silent and 481 thus are not represented in RTP packet CSRC-lists once their text 482 have been transmitted as primary and the intended number of redundant 483 generations. 485 3.9. 
Empty T140blocks 487 If no unsent T140blocks were available for a source at the time of 488 populating a packet, but T140blocks are available which have not yet 489 been sent the full intended number of redundant transmissions, then 490 the primary T140block for that source is composed of an empty 491 T140block, and populated (without taking up any length) in a packet 492 for transmission. The corresponding SSRC SHALL be placed in its 493 place in the CSRC-list. 495 3.10. Creation of the redundancy 497 The primary T140block from each source in the latest transmitted 498 packet is used to populate the first redundant T140block for that 499 source. The first redundant T140block for that source from the 500 latest transmission is placed as the second redundant T140block 501 source. 503 Usually this is the level of redundancy used. If a higher number of 504 redundancy is negotiated, then the procedure SHALL be maintained 505 until all available redundant levels of T140blocks and their sources 506 are placed in the packet. If a receiver has negotiated a lower 507 number of "text/rex" generations, then that level shall be the 508 maximum used by the transmitter. 510 3.11. Timer offset fields 512 The timer offset values are inserted in the data header, with the 513 time offset from the RTP timestamp in the packet when the 514 corresponding T140block was sent from its original source as primary. 516 The timer offsets are expressed in the same clock tick units as the 517 RTP timestamp. 519 The timestamp offset values for empty T140blocks have no relevance 520 but SHOULD be assigned realistic values. 522 3.12. Other RTP header fields 524 The number of members in the CSRC list shall be placed in the "CC" 525 header field. Only mixers place values >0 in the "CC" field. 527 The current time SHALL be inserted in the timestamp. 529 The SSRC of the mixer for the RTT session SHALL be inserted in the 530 SSRC field of the RTP header. 532 3.13. 
Pause in transmission

534    When there is no new T140block to transmit, and no redundant
535    T140block that has not been retransmitted the intended number of
536    times, the transmission process can stop until either new T140blocks
537    arrive, or a keep-alive method calls for transmission of keep-alive
538    packets.

540 4. Actions at reception

542    The "text/rex" receiver included in an endpoint with presentation
543    functions will receive RTP packets in the single stream from the
544    mixer, and SHALL distribute the T140blocks for presentation in
545    presentation areas for each source. Other receiver roles, such as
546    gateways or chained mixers, are also feasible, and require
547    consideration of whether the stream shall just be forwarded, or
548    distributed based on the different sources.

550 4.1. Multi-party vs two-party use

552    If the "CC" field value of a received packet is >0, it indicates that
553    multi-party transmission is active, and the receiver MUST be prepared
554    to act on the different sources according to its role. If the CC
555    value is 0, the connection is point-to-point.

557 4.2. Level of redundancy

559    The used level of redundancy generations SHALL be evaluated from the
560    received packet contents. If the CC value is 0, the number of
561    generations (including the primary) is equal to the number of members
562    in the data header. If the CC value is >0, the number of generations
563    (including the primary) is equal to the number of members in the data
564    header divided by the CC value. If the remainder from the division
565    is >0, then the packet is malformed and SHALL cause an error
566    indication in the receiver.

568 4.3. Extracting text and handling recovery and loss

570    The RTP sequence numbers of the received packets SHALL be monitored
571    for gaps and packets out of order.

573    As long as the sequence is correct, each packet SHALL be unpacked in
574    order.
The T140blocks SHALL be extracted from the primary areas, and
575    the corresponding SSRCs SHALL be extracted from the corresponding
576    positions in the CSRC list and used for assigning the new T140block
577    to the correct presentation areas (or correspondingly).

579    If a sequence number gap appears and is still there after some
580    defined time for jitter resolution, T140data SHALL be recovered from
581    redundant data. If the gap is wider than the number of generations
582    of redundant T140blocks in the packet, then a T140block SHALL be
583    created with a marker for text loss [T140ad1] and assigned to the
584    SSRC of the transmitter as a general input from the mixer because in
585    general it is not possible to deduce from which sources text was
586    lost.

588    Then, the T140blocks in the received packet SHALL be retrieved
589    beginning with the highest redundant generation, grouping them with
590    the corresponding SSRC from the CSRC-list and assigning them to the
591    presentation areas per source. Finally the primary T140blocks SHALL
592    be retrieved from the packet and similarly their sources retrieved
593    from the corresponding positions in the CSRC-list, and then assigned
594    to the corresponding presentation areas for the sources.

596    If the sequence number gap was equal to or less than the number of
597    redundancy generations in the received packet, a missing text marker
598    SHALL NOT be inserted, and instead the T140blocks and their SSRCs
599    are fully recovered from the redundancy information and the CSRC-list
600    in the way indicated above.

602 4.4. Delete BOM

604    Unicode character BOM is used as a start indication and sometimes
605    used as a filler or keep alive by transmission implementations.
606    These SHALL be deleted on reception.

608 4.5. Empty T140blocks

610    Empty T140blocks are included as fillers for unused redundancy levels
611    in the packets. They just do not provide any contents and do not
612    contribute to the received streams.

614 5.
RTCP considerations

616    A mixer SHALL send RTCP reports with SDES, CNAME and NAME information
617    about the sources in the multi-party call. This makes it possible
618    for participants to compose a suitable label for text from each
619    source.

621 6. Chained operation

623    By strictly applying the rules for "text/rex" packet format by all
624    conforming devices, mixers MAY be arranged in chains.

626 7. Usage without redundancy

628    The "text/rex" format SHALL be used also for multi-party
629    communication when the redundancy mechanism is not used. That MAY be
630    the case when robustness in transmission is provided by some other
631    means than by redundancy. All aspects of this document SHALL be
632    applied except the redundant generations in transmission.

634    The "text/rex" format SHOULD thus be used for multi-party operation,
635    also when some other protection against packet loss is utilized, for
636    example a reliable network or transport. The format is also suitable
637    to be used for point-to-point operation.

639 8. Use with SIP centralized conferencing framework

641    The SIP conferencing framework, mainly specified in RFC
642    4353 [RFC4353], RFC 4579 [RFC4579] and RFC 4575 [RFC4575], is suitable
643    for coordinating sessions including multi-party RTT. The RTT stream
644    between the mixer and a participant is one and the same during the
645    conference. Participants are announced by notifications when
646    they join or leave, and further user information may
647    be provided. The SSRC of the text to expect from joined users MAY be
648    included in a notification. The notifications MAY be used both for
649    security purposes and for translation to a label for presentation to
650    other users.

652 9. Conference control

654    In managed conferences, control of the real-time text media SHOULD be
655    provided in the same way as for other media, e.g. for muting and
656    unmuting by the direction attributes in SDP [RFC4566].
658    Note that floor control functions may be of value for RTT users as
659    well as for users of other media in a conference.

661 10. Media Subtype Registration

663    This registration is done using the template defined in [RFC6838] and
664    following [RFC4855].

666    Type name:
667       text

669    Subtype name:
670       rex

672    Required parameters:
673       rate:
674          The RTP timestamp (clock) rate. The only valid value is 1000.

676       pt:
677          a comma-separated list of RTP payload types. Because comma is
678          a special character, the list must be a quoted-string (enclosed
679          in double quotes). Each list element is a mapping of the
680          dynamic payload type number to an embedded Content-type
681          specification for the payload format corresponding to the
682          payload type. The format of the mapping is:

684          payload-type-number "=" content-type
685          If the content-type string includes a comma, then the content-
686          type string MUST be a quoted-string. If the content-type
687          string does not include a comma, it MAY still be quoted. Since
688          it is part of the list which must itself be a quoted-string,
689          that means the quotation marks MUST be quoted with backslash
690          quoting as specified in RFC 2045. If the content-type string
691          itself contains a quoted-string, then the requirement for
692          backslash quoting is recursively applied. To specify the text/
693          rex payload format in SDP, the pt parameter is mapped to an
694          a=fmtp attribute by eliminating the parameter name (pt) and
695          changing the commas to slashes. For example:

697          pt = " = \"text/t140;cps=200,text/t140,text/t140\" "

699          implies the following SDP:

701          m=text 49170 RTP/AVP 98 100
702          a=rtpmap:98 rex/1000
703          a=fmtp:98 100/100/100
704          a=rtpmap:100 t140/1000
705          a=fmtp:100 cps=200

707    Encoding considerations:
708       binary; see Section 4.8 of [RFC6838].

710    Security considerations:
711       See Section 20 of RFC XXXX.
[RFC Editor: Upon publication as an
712       RFC, please replace "XXXX" with the number assigned to this
713       document and remove this note.]

715    Interoperability considerations:
716       None.

718    Published specification:
719       RFC XXXX. [RFC Editor: Upon publication as an RFC, please replace
720       "XXXX" with the number assigned to this document and remove this
721       note.]

723    Applications which use this media type:
724       For example: Text conferencing tools, multimedia conferencing
725       tools, real-time conversational tools.

727    Fragment identifier considerations:
728       N/A.

730    Additional information:
731       None.

733    Person & email address to contact for further information:
734       Gunnar Hellstrom

736    Intended usage:
737       COMMON

739    Restrictions on usage:
740       This media type depends on RTP framing, and hence is only defined
741       for transfer via RTP [RFC3550].

743    Author:
744       Gunnar Hellstrom

746    Change controller:
747       IETF AVTCore Working Group delegated from the IESG.

749 11. SDP considerations

751    There are receiving RTT implementations which implement RFC 4103
752    [RFC4103] but not the source separation by the CSRC. Sending mixed
753    text according to the usual CSRC convention from RFC 2198 [RFC2198]
754    to a device implementing only RFC 4103 [RFC4103] would risk leading
755    to unreadable presented text. Therefore, in order to negotiate RTT
756    mixing capability according to this document, all devices supporting
757    this document for multi-party aware participants SHALL include an SDP
758    media format "text/rex" in the SDP [RFC4566], indicating this
759    capability in offers and answers. Multi-party streams using the
760    coding of this document intended for multi-party aware endpoints MUST
761    NOT be sent to devices which have not indicated the "text/rex"
762    format.

764    Implementations not understanding the "text/rex" format MUST ignore
765    it according to common SDP rules.

767    The SDP media format defined here is named "rex", for extended
768    "red".
It is intended to be used in "text" media descriptions with
769    "text/rex" and "text/t140" formats. Both formats MUST be declared
770    for the "text/rex" format to be used. It indicates capability to use
771    source indications in the CSRC list and the packet format according
772    to this document. It also indicates ability to receive 150 real-time
773    text characters per second by default.

775 11.1. Mapping of media parameters to sdp

777    The information carried in the media type registration has a specific
778    mapping to fields in the Session Description Protocol (SDP), which
779    is commonly used to describe RTP sessions. When SDP RFC 4566
780    [RFC4566] is used to specify sessions employing the "text/rex" format,
781    the mapping is as follows:

783    * The media type ("text") goes in SDP "m=" as the media name.

785    * The media subtype (payload format name) goes in SDP "a=rtpmap" as
786      the encoding name. The RTP clock rate in "a=rtpmap" MUST be 1000
787      for "text/rex".

789    * When the payload type is used with redundancy, the level of
790      redundancy is shown by the number of elements in the slash-
791      separated payload type list in the "fmtp" parameter of the "text/
792      rex" media format.

794 11.2. Security for session control and media

796    Security SHOULD be applied on both session control and media. In
797    applications where legacy endpoints without security may exist, a
798    negotiation between security and no security SHOULD be applied. If
799    no other security solution is mandated by the application, then RFC
800    8643 OSRTP [RFC8643] SHOULD be applied to negotiate SRTP media
801    security with DTLS. Most SDP examples below are for simplicity
802    expressed without the security additions. The principles (but not
803    all details) for applying DTLS-SRTP security are shown in a couple of
804    the following examples.

806 11.3. SDP offer/answer examples

808    This section shows some examples of SDP for session negotiation of
809    the real-time text media in SIP sessions.
Audio is usually provided
810    in the same session, and sometimes also video. The examples only
811    show the part of importance for the real-time text media.

813    Offer example for just multi-party capability:

815       m=text 11000 RTP/AVP 101 98
816       a=rtpmap:98 t140/1000
817       a=rtpmap:101 rex/1000
818       a=fmtp:101 98/98/98

820    Answer example from a multi-party capable device
821       m=text 12000 RTP/AVP 101 98
822       a=rtpmap:98 t140/1000
823       a=rtpmap:101 rex/1000
824       a=fmtp:101 98/98/98

826    Offer example for both traditional "text/red" and multi-party format:

828       m=text 11000 RTP/AVP 101 100 98
829       a=rtpmap:98 t140/1000
830       a=rtpmap:100 red/1000
831       a=rtpmap:101 rex/1000
832       a=fmtp:100 98/98/98
833       a=fmtp:101 98/98/98

835    Answer example from a multi-party capable device
836       m=text 11000 RTP/AVP 101 98
837       a=rtpmap:98 t140/1000
838       a=rtpmap:101 rex/1000
839       a=fmtp:101 98/98/98

841    Offer example for both traditional "text/red" and multi-party format
842    including security:
843       a=fingerprint: SHA-1 \
844       4A:AD:B9:B1:3F:82:18:3B:54:02:12:DF:3E:5D:49:6B:19:E5:7C:AB
845       m=text 11000 RTP/AVP 101 100 98
846       a=rtpmap:98 t140/1000
847       a=rtpmap:100 red/1000
848       a=rtpmap:101 rex/1000
849       a=fmtp:100 98/98/98
850       a=fmtp:101 98/98/98

852    The "Fingerprint" is sufficient to offer DTLS-SRTP, with the media
853    line still indicating RTP/AVP.

855    Answer example from a multi-party capable device including security
856       a=fingerprint: SHA-1 \
857       FF:FF:FF:B1:3F:82:18:3B:54:02:12:DF:3E:5D:49:6B:19:E5:7C:AB
858       m=text 11000 RTP/AVP 101 98
859       a=rtpmap:98 t140/1000
860       a=rtpmap:101 rex/1000
861       a=fmtp:101 98/98/98

863    With the "fingerprint" the device acknowledges use of SRTP/DTLS.
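As an illustration of how an implementation might read the examples above, the following Python sketch scans the attribute lines of one text media section, finds the payload type mapped to "rex", and counts the elements of its slash-separated "fmtp" list. The function name and the list-of-strings input are assumptions made for this sketch, not part of the specification. Reading the element count as the primary plus the redundant generations follows the "text/red" convention and is likewise an assumption here.

```python
def rex_generations(sdp_lines):
    """Return (payload_type, element_count) for "rex", or None if absent.

    Hypothetical helper: sdp_lines holds the lines of one text media
    section, such as the offer and answer examples in this section.
    """
    rtpmap = {}  # payload type -> lower-case encoding name
    fmtp = {}    # payload type -> raw format parameter string
    for raw in sdp_lines:
        line = raw.strip()
        if line.startswith("a=rtpmap:"):
            pt, _, encoding = line[len("a=rtpmap:"):].partition(" ")
            rtpmap[pt] = encoding.strip().split("/")[0].lower()
        elif line.startswith("a=fmtp:"):
            pt, _, params = line[len("a=fmtp:"):].partition(" ")
            fmtp[pt] = params.strip()
    for pt, name in rtpmap.items():
        if name == "rex":
            # Count the elements of the slash-separated payload type list.
            elements = [e for e in fmtp.get(pt, "").split("/") if e]
            return pt, len(elements)
    return None  # "text/rex" not declared: treat as multi-party unaware


offer = [
    "m=text 11000 RTP/AVP 101 100 98",
    "a=rtpmap:98 t140/1000",
    "a=rtpmap:100 red/1000",
    "a=rtpmap:101 rex/1000",
    "a=fmtp:100 98/98/98",
    "a=fmtp:101 98/98/98",
]
print(rex_generations(offer))
```

A result of None would correspond to a peer that has not indicated "text/rex", to which multi-party streams in this format MUST NOT be sent.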
865    Answer example from a multi-party unaware device that also
866    does not support security:

868       m=text 12000 RTP/AVP 100 98
869       a=rtpmap:98 t140/1000
870       a=rtpmap:100 red/1000
871       a=fmtp:100 98/98/98

873    A party which has negotiated the "text/rex" format MUST populate the
874    CSRC-list and format the packets according to this document if it
875    acts as an RTP mixer and sends multi-party text.

877    A party which has negotiated the "text/rex" capability MUST interpret
878    the contents of the CSRC-list and the packets according to this
879    document in received RTP packets using the corresponding payload
880    type.

882    A party performing as a mixer, which has not negotiated the "text/
883    rex" format, but negotiated a "text/red" or "text/t140" format in a
884    session with a participant SHOULD, if nothing else is specified for
885    the application, format transmitted text to that participant to be
886    suitable to present on a multi-party unaware endpoint as further
887    specified in Section 14.2.

889    A party not performing as a mixer MUST NOT include the CSRC list if
890    it has a single source of text.

892 12. Examples

894    This example shows a symbolic flow of packets from a mixer with loss
895    and recovery. A, B and C are sources of RTT. M is the mixer. Pn
896    indicates primary data in source group "n". Rn1 is first redundant
897    generation data and Rn2 is second redundant generation data in source
898    group "n". A1, B1, A2 etc. are text chunks (T140blocks) received from
899    the respective sources. X indicates dropped packet between the mixer
900    and a receiver.

902       |----------------|
903       |Seq no 1        |
904       |CC=1            |
905       |CSRC list A     |
906       |R12: Empty      |
907       |R11: Empty      |
908       |P1: A1          |
909       |----------------|

911    Assuming that earlier packets were received in sequence, text A1 is
912    received from packet 1 and assigned to reception area A.
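The generation count rule of Section 4.2 can be checked against packet 1 above: with CC=1 and three T140blocks (R12, R11, P1), the single source carries three generations including the primary. A minimal Python sketch of that rule follows. The function name and the choice to pass the number of data header members as a plain integer are assumptions made for illustration only.

```python
def redundancy_generations(cc, header_members):
    """Number of generations (including the primary) per Section 4.2.

    cc             -- value of the RTP "CC" header field
    header_members -- number of members in the data header
    Raises ValueError for a malformed packet (non-zero remainder).
    """
    if cc == 0:
        # Point-to-point: every member belongs to the one source.
        return header_members
    if header_members % cc != 0:
        # Members must divide evenly among the sources in the CSRC list.
        raise ValueError("malformed packet: members not divisible by CC")
    return header_members // cc


# Packet 1 of the example: CC=1, members R12, R11 and P1.
print(redundancy_generations(1, 3))
```

Raising an exception is only one way to produce the error indication that the text requires; a receiver could equally log the condition and discard the packet.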
914       |----------------|
915       |Seq no 2        |
916       |CC=2            |
917       |CSRC list C,A   |
918       |R12: Empty      |
919       |R11: Empty      |
920       |P1: C1          |
921       |R22: Empty      |
922       |R21: A1         |
923       |P2: Empty       |
924       |----------------|
925    Text C1 is received from packet 2 and assigned to reception area C.

927       X----------------|
928       X Seq no 3       |
929       X CC=2           |
930       X CSRC list C,A  |
931       X R12: Empty     |
932       X R11: C1        |
933       X P1: Empty      |
934       X R22: A1        |
935       X R21: Empty     |
936       X P2: A2         |
937       X----------------|
938    Packet 3 is assumed to be dropped due to network problems.

940       X----------------|
941       X Seq no 4       |
942       X CC=3           |
943       X CSRC list C,B,A|
944       X R12: Empty     |
945       X R11: Empty     |
946       X P1: C2         |
947       X R22: Empty     |
948       X R21: Empty     |
949       X P2: B1         |
950       X R32: Empty     |
951       X R31: A2        |
952       X P3: A3         |
953       X----------------|
954    Packet 4 is assumed to be dropped due to network problems.
955       X----------------|
956       X Seq no 5       |
957       X CC=3           |
958       X CSRC list C,B,A|
959       X R12: Empty     |
960       X R11: C2        |
961       X P1: Empty      |
962       X R22: Empty     |
963       X R21: B1        |
964       X P2: B2         |
965       X R32: A2        |
966       X R31: A3        |
967       X P3: A4         |
968       X----------------|
969    Packet 5 is assumed to be dropped due to network problems.

971       |----------------|
972       |Seq no 6        |
973       |CC=3            |
974       |CSRC list C,B,A |
975       | R12: C2        |
976       | R11: Empty     |
977       | P1: Empty      |
978       | R22: B1        |
979       | R21: B2        |
980       | P2: B3         |
981       | R32: A3        |
982       | R31: A4        |
983       | P3: A5         |
984       |----------------|

986    Packet 6 is received. The latest received sequence number was 2.
987    Recovery is therefore tried for 3,4,5. But there is no coverage for
988    seq no 3. A missing text mark (U+FFFD) [T140ad1] is created and
989    appended to the common mixer reception area. For seqno 4, texts C2,
990    B1 and A3 are recovered from the second generation redundancy and
991    appended to their respective reception areas. For seqno 5, texts B2
992    and A4 are recovered from the first generation redundancy and
993    appended to their respective reception areas.
Primary text B3 and A5 994 are received and appended to their respective reception areas. 996 After this sequence, the following has been received: A1,A3, A4, A5; 997 B1, B2, B3; C1, C2. A possible loss is indicated by the missing text 998 mark in time between A1 and A3. 1000 With only one or two packets lost, there would not be any need to 1001 create a missing text marker, and all text would be recovered. 1003 It will be a design decision how to present the missing text markers 1004 assigned to the mixer as a source. 1006 13. Performance considerations 1008 This document allows new text from up to 16 sources per packet. A 1009 mixer implementing the specification will normally cause a latency of 1010 0 to 150 milliseconds in text from up to 16 simultaneous sources. 1011 This performance meets well the realistic requirements for conference 1012 and conversational applications for which up to 5 simultaneous 1013 sources should not be delayed more than 500 milliseconds by a mixer. 1014 In order to achieve good performance, a receiver for multi-party 1015 calls SHOULD declare a sufficient CPS value for the "text/t140" 1016 format in SDP for the number of allowable characters per second. 1018 As comparison, if the "text/red" format would be used for multi-party 1019 communication with its default timing and redundancy, 5 1020 simultaneously sending parties would cause jerky presentation of the 1021 text from them in text spurts with 5 seconds intervals. With a 1022 reduction of the transmission interval to 150 ms, the time between 1023 text spurts for 5 simultaneous sending parties would be 2.5 seconds. 1025 Five simultaneous sending parties may occasionally occur in a 1026 conference with one or two main sending parties and three parties 1027 giving very brief comments. 1029 The default maximum rate of reception of "text/t140" real-time text 1030 is in RFC 4103 [RFC4103] specified to be 30 characters per second. 
1031 The value MAY be modified in the CPS parameter of the FMTP attribute 1032 in the media section for the "text/t140" media. A mixer combining 1033 real-time text from a number of sources may have a higher combined 1034 flow of text coming from the sources. Endpoints SHOULD therefore 1035 specify a suitable higher value for the CPS parameter, corresponding 1036 to its real reception capability. A value for CPS of 150 is the 1037 default for the "text/t140" stream in the "text/rex" format. See RFC 1038 4103 [RFC4103] for the format and use of the CPS parameter. The same 1039 rules apply for the "text/rex" format except for the default value. 1041 14. Presentation level considerations 1043 ITU-T T.140 [T140] provides the presentation level requirements for 1044 the RFC 4103 [RFC4103] transport. T.140 [T140] has functions for 1045 erasure and other formatting functions and has the following general 1046 statement for the presentation: 1048 "The display of text from the members of the conversation should be 1049 arranged so that the text from each participant is clearly readable, 1050 and its source and the relative timing of entered text is visualized 1051 in the display. Mechanisms for looking back in the contents from the 1052 current session should be provided. The text should be displayed as 1053 soon as it is received." 1055 Strict application of T.140 [T140] is of essence for the 1056 interoperability of real-time text implementations and to fulfill the 1057 intention that the session participants have the same information of 1058 the text contents of the conversation without necessarily having the 1059 exact same layout of the conversation. 1061 T.140 [T140] specifies a set of presentation control codes to include 1062 in the stream. Some of them are optional. Implementations MUST be 1063 able to ignore optional control codes that they do not support. 1065 There is no strict "message" concept in real-time text. 
Line 1066 Separator SHALL be used as a separator allowing a part of received 1067 text to be grouped in presentation. The characters "CRLF" may be 1068 used by other implementations as replacement for Line Separator. The 1069 "CRLF" combination SHALL be erased by just one erasing action, just 1070 as the Line Separator. Presentation functions are allowed to group 1071 text for presentation in smaller groups than the line separators 1072 imply and present such groups with source indication together with 1073 text groups from other sources (see the following presentation 1074 examples). Erasure has no specific limit by any delimiter in the 1075 text stream. 1077 14.1. Presentation by multi-party aware endpoints 1079 A multi-party aware receiving party, presenting real-time text MUST 1080 separate text from different sources and present them in separate 1081 presentation fields. The receiving party MAY separate presentation 1082 of parts of text from a source in readable groups based on other 1083 criteria than line separator and merge these groups in the 1084 presentation area when it benefits the user to most easily find and 1085 read text from the different participants. The criteria MAY e.g. be 1086 a received comma, full stop, or other phrase delimiters, or a long 1087 pause. 1089 When text is received from multiple original sources simultaneously, 1090 the presentation SHOULD provide a view where text is added in 1091 multiple places simultaneously. 1093 If the presentation presents text from different sources in one 1094 common area, the presenting endpoint SHOULD insert text from the 1095 local user ended at suitable points merged with received text to 1096 indicate the relative timing for when the text groups were completed. 1097 In this presentation mode, the receiving endpoint SHALL present the 1098 source of the different groups of text. 1100 A view of a three-party RTT call in chat style is shown in this 1101 example . 
1103     _________________________________________________
1104    |                                              |^|
1105    |[Alice] Hi, Alice here.                       |-|
1106    |                                              | |
1107    |[Bob] Bob as well.                            | |
1108    |                                              | |
1109    |[Eve] Hi, this is Eve, calling from Paris.    | |
1110    |      I thought you should be here.           | |
1111    |                                              | |
1112    |[Alice] I am coming on Thursday, my           | |
1113    |      performance is not until Friday morning.| |
1114    |                                              | |
1115    |[Bob] And I on Wednesday evening.             | |
1116    |                                              | |
1117    |[Alice] Can we meet on Thursday evening?      | |
1118    |                                              | |
1119    |[Eve] Yes, definitely. How about 7pm.         | |
1120    |      at the entrance of the restaurant       | |
1121    |      Le Lion Blanc?                          | |
1122    |[Eve] we can have dinner and then take a walk |-|
1123    |______________________________________________|v|
1124    |          But I need to be back to            |^|
1125    |          the hotel by 11 because I need      |-|
1126    |                                              | |
1127    |          I wou                               |-|
1128    |______________________________________________|v|
1129    |          of course, I underst                  |
1130    |________________________________________________|

1132    Figure 3: Example of a three-party RTT call presented in chat style
1133    seen at participant 'Alice's endpoint.

1135    Other presentation styles than the chat style may be arranged.

1137    This figure shows how a coordinated column view MAY be presented.

1139     ___________________________________________________________________
1140    | Bob                | Eve                  | Alice                 |
1141    |____________________|______________________|_______________________|
1142    |                    |                      |I will arrive by TGV.  |
1143    |My flight is to Orly|                      |Convenient to the main |
1144    |                    |Hi all, can we plan   |station.               |
1145    |                    |for the seminar?      |                       |
1146    |Eve, will you do    |                      |                       |
1147    |your presentation on|                      |                       |
1148    |Friday?             |Yes, Friday at 10.    |                       |
1149    |Fine, wo            |                      |We need to meet befo   |
1150    |___________________________________________________________________|

1152    Figure 4: An example of a coordinated column-view of a three-party
1153    session with entries ordered vertically in approximate time-order.

1155 14.2.
Multi-party mixing for multi-party unaware endpoints 1157 When the mixer has indicated multi-party capability by the "text/rex" 1158 format in an SDP negotiation, but the multi-party capability 1159 negotiation fails with an endpoint, then the agreed "text/red" or 1160 "text/t140" format SHALL be used and the mixer SHOULD compose a best- 1161 effort presentation of multi-party real-time text in one stream 1162 intended to be presented by an endpoint with no multi-party 1163 awareness. 1165 This presentation format has functional limitations and SHOULD be 1166 used only to enable participation in multi-party calls by legacy 1167 deployed endpoints implementing only RFC 4103 without the multi-party 1168 extension specified in this document. 1170 The principles and procedures below do not specify any new protocol 1171 elements. They are instead composed from the information in ITU-T 1172 T.140 [T140] and an ambition to provide a best effort presentation on 1173 an endpoint which has functions only for two-party calls. 1175 The mixer mixing for multi-party unaware endpoints SHALL compose a 1176 simulated limited multi-party RTT view suitable for presentation in 1177 one presentation area. The mixer SHALL group text in suitable groups 1178 and prepare for presentation of them by inserting a new line between 1179 them if the transmitted text did not already end with a new line. A 1180 presentable label SHOULD be composed and sent for the source 1181 initially in the session and after each source switch. With this 1182 procedure the time for source switching is depending on the actions 1183 of the users. In order to expedite source switch, a user can for 1184 example end its turn with a new line. 1186 14.2.1. Actions by the mixer at reception from the call participants 1188 When text is received by the mixer from the different participants, 1189 the mixer SHALL recover text from redundancy if any packets are lost. 
1190    The mark for lost text [T140ad1] SHOULD be inserted in the stream if
1191    unrecoverable loss appears. Any Unicode "BOM" characters, possibly
1192    used for keep-alive, shall be deleted. The time of creation of text
1193    (retrieved from the RTP timestamp) SHALL be stored together with the
1194    received text from each source in queues for transmission to the
1195    recipients.

1197 14.2.2. Actions by the mixer for transmission to the recipients

1199    The following procedure SHOULD be applied for each recipient of
1200    multi-party text from the mixer.

1202    The text for transmission SHOULD be formatted by the mixer for each
1203    receiving user for presentation in one single presentation area.
1204    Text received from a participant SHOULD NOT be included in
1205    transmission to that participant. When there is text available for
1206    transmission from the mixer to a receiving party from more than one
1207    participant, the mixer SHOULD switch between transmission of text
1208    from the different sources at suitable points in the transmitted
1209    stream.

1211    When switching source, the mixer SHOULD insert a line separator if
1212    the already transmitted text did not end with a new line (line
1213    separator or CRLF). A label SHOULD be composed from information in
1214    the CNAME and NAME fields in RTCP reports from the participant to
1215    have its text transmitted, or from other session information for that
1216    user. The label SHOULD be delimited by suitable characters (e.g. '[
1217    ]') and transmitted. The CSRC SHOULD indicate the selected source.
1218    Then text from that selected participant SHOULD be transmitted until
1219    a new suitable point for switching source is reached.

1221    Seeking a suitable point for switching source SHOULD be done when
1222    text waiting for transmission from any party is older than
1223    the last transmitted text.
Suitable points for switching are: 1225 * A completed phrase ended by comma 1227 * A completed sentence 1229 * A new line (line separator or CRLF) 1231 * A long pause (e.g. > 10 seconds) in received text from the 1232 currently transmitted source 1234 * If text from one participant has been transmitted with text from 1235 other sources waiting for transmission for a long time (e.g. > 1 1236 minute) and none of the other suitable points for switching has 1237 occurred, a source switch MAY be forced by the mixer at next word 1238 delimiter, and also if even a word delimiter does not occur within 1239 a time (e.g. 15 seconds) after the scan for word delimiter 1240 started. 1242 When switching source, the source which has the oldest text in queue 1243 SHOULD be selected to be transmitted. A character display count 1244 SHOULD be maintained for the currently transmitted source, starting 1245 at zero after the label is transmitted for the currently transmitted 1246 source. 1248 The status SHOULD be maintained for the latest control code for 1249 Select Graphic Rendition (SGR) from each source. If there is an SGR 1250 code stored as the status for the current source before the source 1251 switch is done, a reset of SGR shall be sent by the sequence SGR 0 1252 [009B 0000 006D] after the new line and before the new label during a 1253 source switch. See SGR below for an explanation. This transmission 1254 does not influence the display count. 1256 If there is an SGR code stored for the new source after the source 1257 switch, that SGR code SHOULD be transmitted to the recipient before 1258 the label. This transmission does not influence the display count. 1260 14.2.3. Actions on transmission of text 1262 Text from a source sent to the recipient SHOULD increase the display 1263 count by one per transmitted character. 1265 14.2.4. Actions on transmission of control codes 1267 The following control codes specified by T.140 require specific 1268 actions. 
They SHOULD cause specific considerations in the mixer.
1269    Note that the codes presented here are expressed in UCS-16, while
1270    transmission is made in UTF-8 transform of these codes.

1272    BEL 0007 Bell Alert in session, provides for alerting during an
1273       active session. The display count SHOULD NOT be altered.

1275    NEW LINE 2028 Line separator. Check and perform a source switch if
1276       appropriate. Increase display count by 1.

1278    CR LF 000D 000A A supported, but not preferred way of requesting a
1279       new line. Check and perform a source switch if appropriate.
1280       Increase display count by 1.

1282    INT ESC 0061 Interrupt (used to initiate mode negotiation
1283       procedure). The display count SHOULD NOT be altered.

1285    SGR 009B Ps 006D Select graphic rendition. Ps is rendition
1286       parameters specified in ISO 6429. The display count SHOULD NOT be
1287       altered. The SGR code SHOULD be stored for the current source.

1289    SOS 0098 Start of string, used as a general protocol element
1290       introducer, followed by a maximum 256 bytes string and the ST.
1291       The display count SHOULD NOT be altered.

1293    ST 009C String terminator, end of SOS string. The display count
1294       SHOULD NOT be altered.

1296    ESC 001B Escape - used in control strings. The display count SHOULD
1297       NOT be altered for the complete escape code.

1299    Byte order mark "BOM" (U+FEFF) "Zero width, no break space", used
1300       for synchronization and keep-alive. SHOULD be deleted from
1301       incoming streams. Shall be sent first after session establishment
1302       to the recipient. The display count shall not be altered.

1304    Missing text mark (U+FFFD) "Replacement character", represented as a
1305       question mark in a rhombus, or if that is not feasible, replaced
1306       by an apostrophe ', marks place in stream of possible text loss.
1307       SHOULD be inserted by the reception procedure in case of
1308       unrecoverable loss of packets.
      The display count SHOULD be increased by one when it is sent, as
      for any other character.

   SGR  If a control code for selecting graphic rendition (SGR), other
      than reset of the graphic rendition (SGR 0), is sent to a
      recipient, that control code shall also be stored as status for
      the source in the storage for SGR status.  If a reset graphic
      rendition (SGR 0) originating from a source is sent, then the SGR
      status storage for that source shall be cleared.  The display
      count shall not be increased.

   BS (U+0008)  Back Space, intended to erase the last entered
      character by a source.  Erasure by backspace cannot always be
      performed as the erasing party intended.  If an erasing action
      erases all text up to the end of the leading label after a source
      switch, then the mixer must not transmit more backspaces.
      Instead it is RECOMMENDED that a letter "X" is inserted in the
      text stream for each backspace as an indication of the intent to
      erase more.  A new line is usually coded by a Line Separator, but
      the character combination "CRLF" MAY be used instead.  Erasure of
      a new line is in both cases done by just one erasing action
      (Backspace).  If the display count has a positive value, it is
      decreased by one when the BS is sent.  If the display count is at
      zero, it is not altered.

14.2.5.  Packet transmission

   A mixer transmitting to a multi-party unaware terminal SHOULD send
   primary data only from one source per packet.  The SSRC SHOULD be
   the SSRC of the mixer.  The CSRC list SHOULD contain one member,
   which is the SSRC of the source of the primary data.

14.2.6.  Functional limitations

   When a multi-party unaware endpoint presents a conversation in one
   display area in a chat style, it inserts source indications for
   remote text and local user text as they are merged in completed text
   groups.
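   The display count and Back Space rules of Section 14.2.4, which are
   also the reason for the erasure limitation described in this
   section, can be illustrated with a minimal Python sketch.  The class
   DisplayCount and its method names are hypothetical and not part of
   this document.  The sketch covers only ordinary characters, new
   lines, BEL, BOM and BS; SGR and other control sequences are left
   out.  Keeping the count at zero when an "X" is inserted is one
   reading of the BS rule, not a stated requirement.

```python
BS, BEL, BOM = "\u0008", "\u0007", "\ufeff"


class DisplayCount:
    """Hypothetical helper: characters shown since the latest label."""

    def __init__(self):
        self.count = 0   # display count for the current source
        self.sent = []   # units forwarded to the recipient

    def label_sent(self):
        # The count starts at zero after the label is transmitted.
        self.count = 0

    def forward(self, text):
        i = 0
        while i < len(text):
            # CR LF is one new line and adds one to the display count.
            unit = "\r\n" if text.startswith("\r\n", i) else text[i]
            i += len(unit)
            if unit == BOM:
                continue                 # deleted from incoming streams
            if unit == BEL:
                self.sent.append(unit)   # forwarded, count not altered
            elif unit == BS:
                if self.count > 0:
                    self.sent.append(BS)
                    self.count -= 1
                else:
                    # Assumption: "X" replaces the BS, count stays zero.
                    self.sent.append("X")
            else:
                self.sent.append(unit)   # text, new line or U+FFFD
                self.count += 1
```

   With this sketch, three backspaces received after only two
   characters result in two forwarded backspaces followed by one "X".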
   When an endpoint using this layout receives and presents
   text mixed for multi-party unaware endpoints, there will be two
   levels of source indicators for the received text: one generated by
   the mixer and inserted in a label after each source switch, and
   another generated by the receiving endpoint and inserted after each
   switch between local and remote source in the presentation area.
   This will waste display space and look inconsistent to the reader.

   New text can be presented only from one source at a time.  Switch of
   source to be presented takes place at suitable places in the text,
   such as end of phrase, end of sentence, line separator and
   inactivity.  Therefore the time to switch to present waiting text
   from other sources may become long and will vary and depend on the
   actions of the currently presented source.

   Erasure can only be done up to the latest source switch.  If a user
   tries to erase more text, the erasing actions will be presented as
   letter X after the label.

   Text loss because of network errors may hit the label between
   entries from different parties, causing a risk of misunderstanding
   which source a piece of text comes from.

   These facts make it strongly RECOMMENDED to implement multi-party
   awareness in RTT endpoints.  The use of the mixing method for
   multi-party-unaware endpoints should be left for use with endpoints
   which are impossible to upgrade to become multi-party aware.

14.2.7.  Example views of presentation on multi-party unaware endpoints

   The following pictures are examples of the view on a participant's
   display for the multi-party-unaware case.

    __________________________________________________
   |       Conference       |          Alice          |
   |________________________|_________________________|
   |                        |I will arrive by TGV.    |
   |[Bob]:My flight is to   |Convenient to the main   |
   |Orly.                   |station.                 |
   |[Eve]:Hi all, can we    |                         |
   |plan for the seminar.   |                         |
   |                        |                         |
   |[Bob]:Eve, will you do  |                         |
   |your presentation on    |                         |
   |Friday?                 |                         |
   |[Eve]:Yes, Friday at 10.|                         |
   |[Bob]: Fine, wo         |We need to meet befo     |
   |________________________|_________________________|

   Figure 5: Alice, who has a conference-unaware client, is receiving
   the multi-party real-time text in a single stream.  This figure
   shows how a coordinated column view MAY be presented on Alice's
   device.

    ________________________________________________
   |                                              |^|
   |[Alice] Hi, Alice here.                       |-|
   |                                              | |
   |[mix][Bob] Bob as well.                       | |
   |                                              | |
   |[Eve] Hi, this is Eve, calling from Paris     | |
   |      I thought you should be here.           | |
   |                                              | |
   |[Alice] I am coming on Thursday, my           | |
   |      performance is not until Friday morning.| |
   |                                              | |
   |[mix][Bob] And I on Wednesday evening.        | |
   |                                              | |
   |[Eve] we can have dinner and then walk        | |
   |                                              | |
   |[Eve] But I need to be back to                | |
   |      the hotel by 11 because I need          | |
   |                                              |-|
   |______________________________________________|v|
   | of course, I underst                           |
   |________________________________________________|

   Figure 6: An example of a view of the multi-party unaware
   presentation in chat style.  Alice is the local user.

15.  Gateway Considerations

15.1.  Gateway considerations with Textphones (e.g. TTYs).

   Multi-party RTT sessions may involve gateways of different kinds.
   Gateways involved in setting up sessions SHALL correctly reflect the
   multi-party capability or unawareness of the combination of the
   gateway and the remote endpoint beyond the gateway.

   One case that may occur is a gateway to PSTN for communication with
   textphones (e.g. TTYs).
   Textphones are limited devices with no
   multi-party awareness, and it SHOULD therefore be suitable for the
   gateway to not indicate multi-party awareness for that case.
   Another solution is that the gateway indicates multi-party
   capability towards the mixer, and includes the multi-party mixer
   function for multi-party unaware endpoints itself.  This solution
   makes it possible to make adaptations for the functional limitations
   of the textphone (TTY).

   More information on gateways to textphones (TTYs) is found in RFC
   5194 [RFC5194].

15.2.  Gateway considerations with WebRTC.

   Gateway operation to real-time text in WebRTC may also be required.
   In WebRTC, RTT is specified in
   draft-ietf-mmusic-t140-usage-data-channel
   [I-D.ietf-mmusic-t140-usage-data-channel].

   A multi-party bridge may have functionality for communicating by RTT
   both in RTP streams and in WebRTC t140 data channels.  Other
   configurations may consist of a multi-party bridge with either
   technology for RTT transport and a separate gateway for conversion
   of the text communication streams between RTP and t140 data channel.

   In WebRTC, it is assumed that for a multi-party session, one t140
   data channel is established for each source from a gateway or bridge
   to each participant.  Each participant also has a data channel with
   a two-way connection with the gateway or bridge.

   The t140 channel used both ways is for text from the WebRTC user and
   from the bridge or gateway itself to the WebRTC user.  The label
   parameter of this t140 channel is used as the NAME field in RTCP to
   participants on the RTP side.  The other t140 channels are only for
   text from other participants to the WebRTC user.
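   The channel layout described above can be illustrated with a small
   Python sketch.  The names T140Channel and plan_channels are
   hypothetical and used only for illustration.  Labelling each
   per-source channel with the name of its source is an assumption of
   the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class T140Channel:
    label: str      # label parameter of the t140 data channel
    two_way: bool   # True only for the WebRTC user's own channel


def plan_channels(webrtc_user, other_sources):
    """Channels between a bridge or gateway and one WebRTC user."""
    # One two-way channel: text from the user, and text from the
    # bridge or gateway itself to the user.
    channels = [T140Channel(label=webrtc_user, two_way=True)]
    # One additional channel per other source, carrying text only
    # towards the WebRTC user.
    channels += [T140Channel(label=name, two_way=False)
                 for name in other_sources if name != webrtc_user]
    return channels
```

   For a WebRTC user Alice in a session with Bob and Eve, the sketch
   yields three channels: one two-way channel labelled with Alice's
   name and one channel towards Alice for each of Bob and Eve.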

   When a new participant has entered the session with RTP transport of
   RTT, a new t140 channel SHOULD be established to WebRTC users with
   the label parameter composed from the NAME field in RTCP on the RTP
   side.

   When a new participant has entered the multi-party session with RTT
   transport in a WebRTC t140 data channel, the new participant SHOULD
   be announced by a notification to RTP users.  The label parameter
   from the WebRTC side, or other available session information, SHOULD
   be used as the NAME RTCP field on the RTP side.

16.  Updates to RFC 4102 and RFC 4103

   This document updates RFC 4102 [RFC4102] and RFC 4103 [RFC4103] by
   introducing an extended packet format "text/rex" for the multi-party
   mixing case and more strict rules for the use of redundancy, and
   population of the CSRC list in the packets.  Implications for the
   CSRC list use from RFC 2198 [RFC2198] are hereby not in effect.

17.  Congestion considerations

   The congestion considerations and recommended actions from RFC 4103
   [RFC4103] are valid also in multi-party situations.

   The first action in case of congestion SHOULD be to temporarily
   increase the transmission interval up to one second.

18.  Acknowledgements

   James Hamlin for format input.

19.  IANA Considerations

   The IANA is requested to register the media type "text/rex" as
   specified in Section 10.  The media type is also requested to be
   added to the IANA registry for "RTP Payload Format Media Types".

20.  Security Considerations

   The RTP-mixer model requires the mixer to be allowed to decrypt,
   pack and encrypt secured text from the conference participants.
   Therefore the mixer needs to be trusted.  This is similar to the
   situation for central mixers of audio and video.

   The requirement to transfer information about the user in RTCP
   reports in SDES, CNAME and NAME fields, and in conference
   notifications, for creation of labels may raise privacy concerns, as
   already stated in RFC 3550 [RFC3550], and the transfer may be
   restricted for privacy reasons.  The receiving user will then get a
   more symbolic label for the source.

21.  Change history

21.1.  Changes included in draft-ietf-avtcore-multi-party-rtt-mix-06

   Improved definitions list format.

   The format of the media subtype parameters is made to match the
   requirements.

   The mapping of media subtype parameters to SDP is included.

   The CPS parameter belongs to the t140 subtype and does not need to
   be registered here.

21.2.  Changes included in draft-ietf-avtcore-multi-party-rtt-mix-05

   Nomenclature and editorial improvements.

   "this document" used consistently to refer to this document.

21.3.  Changes included in draft-ietf-avtcore-multi-party-rtt-mix-04

   'Redundancy header' renamed to 'data header'.

   More clarifications added.

   Language and figure number corrections.

21.4.  Changes included in draft-ietf-avtcore-multi-party-rtt-mix-03

   Mention possible need to mute and raise hands as for other media.
   -done-

   Make sure that use in two-party calls is also possible and
   explained.  - may need more wording -

   Clarify that RTT is often used together with other media.  -done-

   Tell that text mixing is N-1.  A user's own text is not received in
   the mix.  -done-

   In 3. correct the interval to: A "text/rex" transmitter SHOULD send
   packets distributed in time as long as there is something (new or
   redundant T140blocks) to transmit.  The maximum transmission
   interval SHOULD then be 300 ms.
   It is RECOMMENDED to send a packet to a
   receiver as soon as new text to that receiver is available, as long
   as the time after the latest sent packet to the same receiver is
   more than 150 ms, and also the maximum character rate to the
   receiver is not exceeded.  The intention is to keep the latency low
   while keeping a good protection against text loss in bursty packet
   loss conditions.  -done-

   In 1.3 say that the format is used both ways.  -done-

   In 13.1 change presentation area to presentation field so that the
   reader does not think it shall be totally separated.  -done-

   In Performance and intro, tell the performance in number of
   simultaneous sending users and introduced delay 16, 150 vs
   requirements 5 vs 500.  -done-

   Clarify redundancy level per connection.  -done-

   Timestamp also for the last data header.  To make it possible for
   all text to have time offset as for transmission from the source.
   Make that header equal to the others.  -done-

   Mixer always uses the CSRC list, even for its own BOM.  -done-

   Combine all talk about transmission interval (300 ms vs when text
   has arrived) in section 3 in one paragraph or close to each other.
   -done-

   Document the goal of good performance with low delay for 5
   simultaneous typers in the introduction.  -done-

   Describe better that only primary text shall be sent on to
   receivers.  Redundancy and loss must be resolved by the mixer.
   -done-

21.5.  Changes included in draft-ietf-avtcore-multi-party-rtt-mix-02

   SDP and better description and visibility of security by OSRTP RFC
   8643 needed.

   The description of gatewaying to WebRTC extended.

   The description of the data header in the packet is improved.

21.6.  Changes to draft-ietf-avtcore-multi-party-rtt-mix-01

   2,5,6  More efficient format "text/rex" introduced and attribute
      a=rtt-mix deleted.

   3.  Brief about use of OSRTP for security included.  More needed.

   4.  Brief motivation for the solution, and for why an RTP translator
      is not used, added to intro.

   7.  More limitations for the multi-party unaware mixing method
      inserted.

   8.  Updates to RFC 4102 and 4103 more clearly expressed.

   9.  Gateway to WebRTC started.  More needed.

21.7.  Changes from draft-hellstrom-avtcore-multi-party-rtt-source-03 to
       draft-ietf-avtcore-multi-party-rtt-mix-00

   Changed file name to draft-ietf-avtcore-multi-party-rtt-mix-00

   Replaced CDATA in IANA registration table with better coding.

   Converted to xml2rfc version 3.

21.8.  Changes from draft-hellstrom-avtcore-multi-party-rtt-source-02 to
       -03

   Changed company and e-mail of the author.

   Changed title to "RTP-mixer formatting of multi-party Real-time
   text" to better match contents.

   Check and modification where needed of use of RFC 2119 words SHALL
   etc.

   More about the CC value in sections on transmitters and receivers so
   that 1-to-1 sessions do not use the mixer format.

   Enhanced section on presentation for multi-party-unaware endpoints.

   A paragraph recommending CPS=150 inserted in the performance
   section.

21.9.  Changes from draft-hellstrom-avtcore-multi-party-rtt-source-01 to
       -02

   In Abstract and 1. Introduction: Introduced wording about regulatory
   requirements.

   In section 5: The transmission interval is decreased to 100 ms when
   there is text from more than one source to transmit.

   In section 11 about SDP negotiation, a SHOULD-requirement is
   introduced that the mixer should make a mix for multi-party unaware
   endpoints if the negotiation is not successful, and a reference to a
   later chapter about it is added.

   The presentation considerations chapter 14 is extended with more
   information about presentation on multi-party aware endpoints, and a
   new section on mixing for multi-party unaware endpoints, which has
   low functionality but SHOULD be implemented in mixers.  Presentation
   examples are added.

   A short chapter 15 on gateway considerations is introduced.

   Clarification about the text/t140 format included in chapter 10.

   This sentence added to the chapter 10 about use without redundancy.
   "The text/red format SHOULD be used unless some other protection
   against packet loss is utilized, for example a reliable network or
   transport."

   Note about deviation from RFC 2198 added in chapter 4.

   In chapter 9. "Use with SIP centralized conferencing framework" the
   following note is inserted: Note: The CSRC-list in an RTP packet
   only includes participants whose text is included in one or more
   text blocks.  It is not the same as the list of participants in a
   conference.  With audio and video media, the CSRC-list would often
   contain all participants who are not muted whereas text participants
   that don't type are completely silent and so don't show up in RTP
   packet CSRC-lists.

21.10.  Changes from draft-hellstrom-avtcore-multi-party-rtt-source-00
        to -01

   Editorial cleanup.

   Changed capability indication from fmtp-parameter to SDP attribute
   "rtt-mix".

   Swapped order of redundancy elements in the example to match
   reality.

   Extended the SDP negotiation section.

22.  References

22.1.  Normative References

   [I-D.ietf-mmusic-t140-usage-data-channel]
              Holmberg, C. and G. Hellstrom, "T.140 Real-time Text
              Conversation over WebRTC Data Channels", Work in
              Progress, Internet-Draft,
              draft-ietf-mmusic-t140-usage-data-channel-14,
              10 April 2020.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC2198]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
              Handley, M., Bolot, J.C., Vega-Garcia, A., and S. Fosse-
              Parisis, "RTP Payload for Redundant Audio Data",
              RFC 2198, DOI 10.17487/RFC2198, September 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550,
              July 2003.

   [RFC4102]  Jones, P., "Registration of the text/red MIME Sub-Type",
              RFC 4102, DOI 10.17487/RFC4102, June 2005.

   [RFC4103]  Hellstrom, G. and P. Jones, "RTP Payload for Text
              Conversation", RFC 4103, DOI 10.17487/RFC4103, June 2005.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, DOI 10.17487/RFC4566,
              July 2006.

   [RFC4855]  Casner, S., "Media Type Registration of RTP Payload
              Formats", RFC 4855, DOI 10.17487/RFC4855, February 2007.

   [RFC5764]  McGrew, D. and E. Rescorla, "Datagram Transport Layer
              Security (DTLS) Extension to Establish Keys for the
              Secure Real-time Transport Protocol (SRTP)", RFC 5764,
              DOI 10.17487/RFC5764, May 2010.

   [RFC6263]  Marjou, X. and A. Sollaud, "Application Mechanism for
              Keeping Alive the NAT Mappings Associated with RTP / RTP
              Control Protocol (RTCP) Flows", RFC 6263,
              DOI 10.17487/RFC6263, June 2011.

   [RFC6838]  Freed, N., Klensin, J., and T. Hansen, "Media Type
              Specifications and Registration Procedures", BCP 13,
              RFC 6838, DOI 10.17487/RFC6838, January 2013.

   [RFC8643]  Johnston, A., Aboba, B., Hutton, A., Jesske, R., and T.
              Stach, "An Opportunistic Approach for Secure Real-time
              Transport Protocol (OSRTP)", RFC 8643,
              DOI 10.17487/RFC8643, August 2019.

   [T140]     ITU-T, "Recommendation ITU-T T.140 (02/1998), Protocol
              for multimedia application text conversation",
              February 1998.

   [T140ad1]  ITU-T, "Recommendation ITU-T T.140 Addendum 1 -
              (02/2000), Protocol for multimedia application text
              conversation", February 2000.

22.2.  Informative References

   [RFC4353]  Rosenberg, J., "A Framework for Conferencing with the
              Session Initiation Protocol (SIP)", RFC 4353,
              DOI 10.17487/RFC4353, February 2006.

   [RFC4575]  Rosenberg, J., Schulzrinne, H., and O. Levin, Ed., "A
              Session Initiation Protocol (SIP) Event Package for
              Conference State", RFC 4575, DOI 10.17487/RFC4575,
              August 2006.

   [RFC4579]  Johnston, A. and O. Levin, "Session Initiation Protocol
              (SIP) Call Control - Conferencing for User Agents",
              BCP 119, RFC 4579, DOI 10.17487/RFC4579, August 2006.

   [RFC5194]  van Wijk, A., Ed. and G. Gybels, Ed., "Framework for
              Real-Time Text over IP Using the Session Initiation
              Protocol (SIP)", RFC 5194, DOI 10.17487/RFC5194,
              June 2008.

Author's Address

   Gunnar Hellstrom
   Gunnar Hellstrom Accessible Communication
   Esplanaden 30
   SE-13670 Vendelso
   Sweden

   Email: gunnar.hellstrom@ghaccess.se