Network Working Group                                          B. Burman
Internet-Draft                                             M. Westerlund
Intended status: Standards Track                                Ericsson
Expires: September 6, 2019                                 S. Nandakumar
                                                               M. Zanaty
                                                                   Cisco
                                                           March 5, 2019


                Using Simulcast in SDP and RTP Sessions
                   draft-ietf-mmusic-sdp-simulcast-14

Abstract

   In some application scenarios it may be desirable to send multiple
   differently encoded versions of the same media source in different
   RTP streams. This is called simulcast. This document describes how
   to accomplish simulcast in RTP and how to signal it in SDP. The
   described solution uses an RTP/RTCP identification method to identify
   RTP streams belonging to the same media source, and makes an
   extension to SDP to relate those RTP streams as being different
   simulcast formats of that media source. The SDP extension consists
   of a new media level SDP attribute that expresses capability to send
   and/or receive simulcast RTP streams.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on September 6, 2019.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Definitions
     2.1.  Terminology
     2.2.  Requirements Language
   3.  Use Cases
     3.1.  Reaching a Diverse Set of Receivers
     3.2.  Application Specific Media Source Handling
     3.3.  Receiver Media Source Preferences
   4.  Overview
   5.  Detailed Description
     5.1.  Simulcast Attribute
     5.2.  Simulcast Capability
     5.3.  Offer/Answer Use
       5.3.1.  Generating the Initial SDP Offer
       5.3.2.  Creating the SDP Answer
       5.3.3.  Offerer Processing the SDP Answer
       5.3.4.  Modifying the Session
     5.4.  Use with Declarative SDP
     5.5.  Relating Simulcast Streams
     5.6.  Signaling Examples
       5.6.1.  Single-Source Client
       5.6.2.  Multi-Source Client
       5.6.3.  Simulcast and Redundancy
   6.  RTP Aspects
     6.1.  Outgoing from Endpoint with Media Source
     6.2.  RTP Middlebox to Receiver
       6.2.1.  Media-Switching Mixer
       6.2.2.  Selective Forwarding Middlebox
     6.3.  RTP Middlebox to RTP Middlebox
   7.  Network Aspects
     7.1.  Bitrate Adaptation
   8.  Limitation
   9.  IANA Considerations
   10. Security Considerations
   11. Contributors
   12. Acknowledgements
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Appendix A.  Requirements
   Appendix B.  Changes From Earlier Versions
     B.1.  Modifications Between WG Version -13 and -14
     B.2.  Modifications Between WG Version -12 and -13
     B.3.  Modifications Between WG Version -11 and -12
     B.4.  Modifications Between WG Version -10 and -11
     B.5.  Modifications Between WG Version -09 and -10
     B.6.  Modifications Between WG Version -08 and -09
     B.7.  Modifications Between WG Version -07 and -08
     B.8.  Modifications Between WG Version -06 and -07
     B.9.  Modifications Between WG Version -05 and -06
     B.10. Modifications Between WG Version -04 and -05
     B.11. Modifications Between WG Version -03 and -04
     B.12. Modifications Between WG Version -02 and -03
     B.13. Modifications Between WG Version -01 and -02
     B.14. Modifications Between WG Version -00 and -01
     B.15. Modifications Between Individual Version -00 and WG
           Version -00
   Authors' Addresses

1.  Introduction

   Most of today's multiparty video conference solutions make use of
   centralized servers to reduce the bandwidth and CPU consumption in
   the endpoints. Those servers receive RTP streams from each
   participant and send some suitable set of possibly modified RTP
   streams to the rest of the participants, which usually have
   heterogeneous capabilities (screen size, CPU, bandwidth, codec, etc.).
   One of the biggest issues is how to perform RTP stream adaptation to
   different participants' constraints with the minimum possible impact
   on both video quality and server performance.

   Simulcast is defined in this memo as the act of simultaneously
   sending multiple different encoded streams of the same media source,
   e.g. the same video source encoded with different video encoder types
   or image resolutions. This can be done in several ways and for
   different purposes. This document focuses on the case where it is
   desirable to provide a media source as multiple encoded streams over
   RTP [RFC3550] towards an intermediary so that the intermediary can
   provide the wanted functionality by selecting which RTP stream(s) to
   forward to other participants in the session, and more specifically
   how the identification and grouping of the involved RTP streams are
   done.

   The intended scope of the defined mechanism is to support negotiation
   and usage of simulcast when using SDP offer/answer and media
   transport over RTP.
   The media transport topologies considered are
   point-to-point RTP sessions as well as centralized multi-party RTP
   sessions, where a media sender will provide the simulcasted streams
   to an RTP middlebox or endpoint, and middleboxes may further
   distribute the simulcast streams to other middleboxes or endpoints.
   Simulcast could, as part of a distributed multi-party scenario, be
   used point-to-point between middleboxes. Usage of multicast or
   broadcast transport is out of scope and left for future extensions.

   This document describes a few scenarios that motivate the use of
   simulcast, and also defines the needed RTP/RTCP and SDP signaling for
   it.

2.  Definitions

2.1.  Terminology

   This document makes use of the terminology defined in RTP Taxonomy
   [RFC7656], and RTP Topologies [RFC7667]. The following terms are
   especially noted or here defined:

   RTP Mixer:  An RTP middle node, defined in [RFC7667] (Section 3.6 to
      3.9).

   RTP Session:  An association among a group of participants
      communicating with RTP, as defined in [RFC3550] and amended by
      [RFC7656].

   RTP Stream:  A stream of RTP packets containing media data, as
      defined in [RFC7656].

   RTP Switch:  A common short term for the terms "switching RTP mixer",
      "source projecting middlebox", and "video switching MCU" as
      discussed in [RFC7667].

   Simulcast Stream:  One encoded stream or dependent stream from a set
      of concurrently transmitted encoded streams and optional dependent
      streams, all sharing a common media source, as defined in
      [RFC7656]. For example, HD and thumbnail video simulcast versions
      of a single media source sent concurrently as separate RTP
      Streams.

   Simulcast Format:  Different formats of a simulcast stream serve the
      same purpose as alternative RTP payload types in non-simulcast
      SDP: to allow multiple alternative media formats for a given RTP
      stream.
      As for multiple RTP payload types on the m-line in
      offer/answer [RFC3264], any one of the negotiated alternative
      formats can be used in a single RTP stream at a given point in
      time, but not more than one (based on RTP timestamp). What format
      is used can change dynamically from one RTP packet to another.

2.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

3.  Use Cases

   The use cases of simulcast described in this document relate to a
   multi-party communication session where one or more central nodes are
   used to adapt the view of the communication session towards
   individual participants, and facilitate the media transport between
   participants. Thus, these cases target the RTP Mixer type of
   topology.

   There are two principal approaches for an RTP Mixer to provide this
   adapted view of the communication session to each receiving
   participant:

   o  Transcoding (decoding and re-encoding) received RTP streams with
      characteristics adapted to each receiving participant. This often
      includes mixing or composition of media sources from multiple
      participants into a mixed media source originated by the RTP
      Mixer. The main advantage of this approach is that it achieves
      close to optimal adaptation to individual receiving participants.
      The main disadvantages are that it can be very computationally
      expensive to the RTP Mixer, typically degrades media Quality of
      Experience (QoE) such as end-to-end delay for the receiving
      participants, and requires RTP Mixer access to media content.

   o  Switching a subset of all received RTP streams or sub-streams to
      each receiving participant, where the used subset is typically
      specific to each receiving participant. The main advantages of
      this approach are that it is computationally cheap to the RTP
      Mixer, has very limited impact on media QoE, and does not require
      RTP Mixer (full) access to media content. The main disadvantage
      is that it can be difficult to combine a subset of received RTP
      streams into a perfect fit to the resource situation of a
      receiving participant. It is also a disadvantage that sending
      multiple RTP streams consumes more network resources from the
      sending participant to the RTP Mixer.

   The use of simulcast relates to the latter approach, where it is more
   important to reduce the load on the RTP Mixer and/or minimize QoE
   impact than to achieve an optimal adaptation of resource usage.

3.1.  Reaching a Diverse Set of Receivers

   The media sources provided by a sending participant potentially need
   to reach several receiving participants that differ in terms of
   available resources. The receiver resources that typically differ
   include, but are not limited to:

   Codec:  This includes codec type (such as RTP payload format MIME
      type) and can include codec configuration. A couple of codec
      resources that differ only in codec configuration will be
      "different" if they are somehow not "compatible", like if they
      differ in video codec profile, or the transport packetization
      configuration.

   Sampling:  This relates to how the media source is sampled, in
      spatial as well as in temporal domain. For video streams, spatial
      sampling affects image resolution and temporal sampling affects
      video frame rate. For audio, spatial sampling relates to the
      number of audio channels and temporal sampling affects audio
      bandwidth.
      This may be used to suit different rendering
      capabilities or needs at the receiving endpoints.

   Bitrate:  This relates to the number of bits sent per second to
      transmit the media source as an RTP stream, which typically also
      affects the QoE for the receiving user.

   Letting the sending participant create a simulcast of a few
   differently configured RTP streams per media source can be a good
   tradeoff when using an RTP switch as middlebox, instead of sending a
   single RTP stream and using an RTP mixer to create individual
   transcodings to each receiving participant.

   This requires that the receiving participants can be categorized in
   terms of available resources and that the sending participant can
   choose a matching configuration for a single RTP stream per category
   and media source. For example, a set of receiving participants
   differ only in screen resolution; some are able to display video with
   at most 360p resolution and some support 720p resolution. A sending
   participant can then reach all receivers with best possible
   resolution by creating a simulcast of RTP streams with 360p and 720p
   resolution for each sent video media source.

   The maximum number of simulcasted RTP streams that can be sent is
   mainly limited by the amount of processing and uplink network
   resources available to the sending participant.

3.2.  Application Specific Media Source Handling

   The application logic that controls the communication session may
   include special handling of some media sources. It is, for example,
   commonly the case that the media from a sending participant is not
   sent back to itself.

   It is also common that a currently active speaker participant is
   shown in larger size or higher quality than other participants (the
   sampling or bitrate aspects of Section 3.1) in a receiving client.
   Many conferencing systems do not send the active speaker's media back
   to the sender itself, which means there is some other participant's
   media that instead is forwarded to the active speaker; typically the
   previous active speaker. This way, the previously active speaker is
   needed both in larger size (to current active speaker) and in small
   size (to the rest of the participants), which can be solved with a
   simulcast from the previously active speaker to the RTP switch.

3.3.  Receiver Media Source Preferences

   The application logic that controls the communication session may
   allow receiving participants to state preferences on the
   characteristics of the RTP stream they would like to receive, for
   example in terms of the aspects listed in Section 3.1. Sending a
   simulcast of RTP streams is one way of accommodating receivers with
   conflicting or otherwise incompatible preferences.

4.  Overview

   This memo defines SDP [RFC4566] signaling that covers the above
   described simulcast use cases and functionalities. A number of
   requirements for such signaling are elaborated in Appendix A.

   The RID mechanism, as defined in [I-D.ietf-mmusic-rid], enables an
   SDP offerer or answerer to specify a number of different RTP stream
   restrictions for a rid-id by using the "a=rid" line. Examples of
   such restrictions are maximum bitrate, maximum spatial video
   resolution (width and height), maximum video framerate, etc. Each
   rid-id may also be restricted to use only a subset of the RTP payload
   types in the associated SDP media description. Those RTP payload
   types can have their own configurations and parameters affecting what
   can be sent or received, using the "a=fmtp" line as well as other SDP
   attributes.

   A new SDP media level attribute "a=simulcast" is defined.
   The attribute describes, independently for send and receive
   directions, the number of simulcast RTP streams as well as potential
   alternative formats for each simulcast RTP stream. Each simulcast
   RTP stream, including alternatives, is identified using the RID
   identifier (rid-id), defined in [I-D.ietf-mmusic-rid].

   a=simulcast:send 1;2,3 recv 4

   If the above line is included in an SDP offer, the "send" part
   indicates the offerer's capability and proposal to send two simulcast
   RTP streams. Each simulcast stream is described by one or more RTP
   stream identifiers (rid-id), and each group of rid-ids for a
   simulcast stream is separated by a semicolon (";"). When a simulcast
   stream has multiple rid-ids that are separated by a comma (","), they
   describe alternative representations for that particular simulcast
   RTP stream. Thus, the above "send" part is interpreted as an
   intention to send two simulcast RTP streams. The first simulcast RTP
   stream is identified and restricted according to rid-id 1. The
   second simulcast RTP stream can be sent as two alternatives,
   identified and restricted according to rid-ids 2 and 3. The "recv"
   part of the above line indicates that the offerer desires to receive
   a single RTP stream (no simulcast) according to rid-id 4.
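
   As a non-normative illustration of the reading described above, the
   following minimal sketch splits an "a=simulcast" value into
   directions, simulcast streams, and alternative rid-ids. The function
   name parse_simulcast and the returned data layout are assumptions
   made for this example only; they are not defined by this memo.

```python
def parse_simulcast(value):
    """Split an "a=simulcast" value into {direction: [[rid-id, ...], ...]}.

    Each inner list holds the alternative rid-ids of one simulcast stream.
    Hypothetical helper for illustration; not part of any specification.
    """
    tokens = value.split()
    if len(tokens) not in (2, 4):
        raise ValueError("expected one or two direction/list pairs")
    result = {}
    for direction, stream_list in zip(tokens[0::2], tokens[1::2]):
        if direction not in ("send", "recv") or direction in result:
            raise ValueError("bad or repeated direction: " + direction)
        # ";" separates simulcast streams, "," separates alternatives.
        result[direction] = [alts.split(",") for alts in stream_list.split(";")]
    return result


parsed = parse_simulcast("send 1;2,3 recv 4")
print(parsed["send"])  # two streams; the second has two alternatives
print(parsed["recv"])  # one stream, no alternatives
```

   In this sketch the number of inner lists per direction corresponds to
   the number of simulcast RTP streams in that direction.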

   A more complete example SDP offer media description is provided
   below:

   m=video 49300 RTP/AVP 97 98 99
   a=rtpmap:97 H264/90000
   a=rtpmap:98 H264/90000
   a=rtpmap:99 VP8/90000
   a=fmtp:97 profile-level-id=42c01f;max-fs=3600;max-mbps=108000
   a=fmtp:98 profile-level-id=42c00b;max-fs=240;max-mbps=3600
   a=fmtp:99 max-fs=240; max-fr=30
   a=rid:1 send pt=97;max-width=1280;max-height=720
   a=rid:2 send pt=98;max-width=320;max-height=180
   a=rid:3 send pt=99;max-width=320;max-height=180
   a=rid:4 recv pt=97
   a=simulcast:send 1;2,3 recv 4
   a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id

         Figure 1: Example Simulcast Media Description in Offer

   The above SDP media description can be interpreted at a high level to
   say that the offerer is capable of sending two simulcast RTP streams,
   one H.264 encoded stream in up to 720p resolution, and one additional
   stream encoded as either H.264 or VP8 with a maximum resolution of
   320x180 pixels. The offerer can receive one H.264 stream with
   maximum 720p resolution.

   The receiver of this SDP offer can generate an SDP answer that
   indicates what it accepts. It uses the "a=simulcast" attribute to
   indicate simulcast capability and specify what simulcast RTP streams
   and alternatives to receive and/or send. An example of such an
   answering "a=simulcast" attribute, corresponding to the above offer,
   is:

   a=simulcast:recv 1;2 send 4

   With this SDP answer, the answerer indicates in the "recv" part that
   it wants to receive the two simulcast RTP streams. It has removed an
   alternative that it doesn't support (rid-id 3). The "send" part
   confirms to the offerer that it will receive one stream for this
   media source according to rid-id 4.
   The corresponding, more complete
   example SDP answer media description could look like:

   m=video 49674 RTP/AVP 97 98
   a=rtpmap:97 H264/90000
   a=rtpmap:98 H264/90000
   a=fmtp:97 profile-level-id=42c01f;max-fs=3600;max-mbps=108000
   a=fmtp:98 profile-level-id=42c00b;max-fs=240;max-mbps=3600
   a=rid:1 recv pt=97;max-width=1280;max-height=720
   a=rid:2 recv pt=98;max-width=320;max-height=180
   a=rid:4 send pt=97
   a=simulcast:recv 1;2 send 4
   a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id

         Figure 2: Example Simulcast Media Description in Answer

   It is assumed that a single SDP media description is used to describe
   a single media source. This is aligned with the concepts defined in
   [RFC7656] and will work in a WebRTC context, both with and without
   BUNDLE [I-D.ietf-mmusic-sdp-bundle-negotiation] grouping of media
   descriptions.

   To summarize, the "a=simulcast" line describes send and receive
   direction simulcast streams separately. Each direction can in turn
   describe one or more simulcast streams, separated by semicolon. The
   identifiers describing simulcast streams on the "a=simulcast" line
   are rid-id, as defined by "a=rid" lines in [I-D.ietf-mmusic-rid].
   Each simulcast stream can be offered as a list of alternative rid-id,
   with each alternative separated by comma (as for rid-ids 2 and 3 in
   the "send" part of the offer in Figure 1). A detailed specification
   can be found in Section 5 and more detailed examples are outlined in
   Section 5.6.

5.  Detailed Description

   This section further details the overview above (Section 4). First,
   formal syntax is provided (Section 5.1), followed by the rest of the
   SDP attribute definition in Section 5.2. Relating Simulcast Streams
   (Section 5.5) provides the definition of the RTP/RTCP mechanisms
   used. The section is concluded with a number of examples.

5.1.  Simulcast Attribute

   This document defines a new SDP media-level "a=simulcast" attribute,
   with value according to the following ABNF [RFC5234] syntax and its
   update for Case-Sensitive String Support in ABNF [RFC7405]:

   sc-value     = ( sc-send [SP sc-recv] ) / ( sc-recv [SP sc-send] )
   sc-send      = %s"send" SP sc-str-list
   sc-recv      = %s"recv" SP sc-str-list
   sc-str-list  = sc-alt-list *( ";" sc-alt-list )
   sc-alt-list  = sc-id *( "," sc-id )
   sc-id-paused = "~"
   sc-id        = [sc-id-paused] rid-id
   ; SP defined in [RFC5234]
   ; rid-id defined in [I-D.ietf-mmusic-rid]

                  Figure 3: ABNF for Simulcast Value

      Note to RFC Editor: Replace "I-D.ietf-mmusic-rid" in the above
      figure with RFC number of draft-ietf-mmusic-rid before publication
      of this document.

   The "a=simulcast" attribute has a parameter in the form of one or two
   simulcast stream descriptions, each consisting of a direction ("send"
   or "recv"), followed by a list of one or more simulcast streams.
   Each simulcast stream consists of one or more alternative simulcast
   formats. Each simulcast format is identified by a simulcast stream
   identifier (rid-id). The rid-id MUST have the form of an RTP stream
   identifier, as described by RTP Payload Format Restrictions
   [I-D.ietf-mmusic-rid].

   In the list of simulcast streams, each simulcast stream is separated
   by a semicolon (";"). Each simulcast stream can in turn be offered
   in one or more alternative formats, represented by rid-ids, separated
   by a comma (","). Each rid-id can also be specified as initially
   paused [RFC7728], indicated by prepending a "~" to the rid-id. The
   reason to allow separate initial pause states for each rid-id is that
   pause capability can be specified individually for each RTP payload
   type referenced by an rid-id.
   Since pause capability specified via
   the "a=rtcp-fb" attribute applies only to specified payload types and
   rid-id specified by "a=rid" can refer to multiple different payload
   types, it is infeasible to pause streams with rid-id where any of the
   related RTP payload type(s) do not have pause capability.

5.2.  Simulcast Capability

   Simulcast capability is expressed through a new media level SDP
   attribute, "a=simulcast" (Section 5.1). The use of this attribute at
   the session level is undefined. Implementations of this
   specification MUST NOT use it at the session level and MUST ignore it
   if received at the session level. Extensions to this specification
   may define such session level usage. Each SDP media description MUST
   contain at most one "a=simulcast" line.

   There are separate and independent sets of simulcast streams in send
   and receive directions. When listing multiple directions, each
   direction MUST NOT occur more than once on the same line.

   Simulcast streams using undefined rid-id MUST NOT be used as valid
   simulcast streams by an RTP stream receiver. The direction for an
   rid-id MUST be aligned with the direction specified for the
   corresponding RTP stream identifier on the "a=rid" line.

   The listed number of simulcast streams for a direction sets a limit
   to the number of supported simulcast streams in that direction. The
   order of the listed simulcast streams in the "send" direction
   suggests a proposed order of preference, in decreasing order: the
   rid-id listed first is the most preferred and subsequent streams have
   progressively lower preference. The order of the listed rid-id in
   the "recv" direction expresses which simulcast streams are
   preferred, with the leftmost being most preferred. This can be of
   importance if the number of actually sent simulcast streams has to
   be reduced for some reason.
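
   The consistency rules above can be checked mechanically. The sketch
   below is a non-normative illustration that covers only the rules
   stated so far in this section: at most one "a=simulcast" line per
   media description, no repeated direction, and every rid-id defined
   by an "a=rid" line with a matching direction. The function name
   validate_simulcast and both argument layouts are assumptions made
   for this example.

```python
def validate_simulcast(simulcast_values, rid_directions):
    """Return a list of problem descriptions (empty if none were found).

    simulcast_values: values of all "a=simulcast" lines found in one
        media description, e.g. ["send 1;2,3 recv 4"].
    rid_directions: hypothetical mapping built from the "a=rid" lines,
        rid-id -> "send" or "recv".
    """
    if len(simulcast_values) > 1:
        return ["more than one a=simulcast line in the media description"]
    problems = []
    for value in simulcast_values:
        tokens = value.split()
        directions = tokens[0::2]
        for direction in sorted(set(directions)):
            if directions.count(direction) > 1:
                problems.append("direction " + direction + " occurs more than once")
        for direction, stream_list in zip(directions, tokens[1::2]):
            for alternatives in stream_list.split(";"):
                for rid in alternatives.split(","):
                    # A leading "~" only marks the rid-id as initially paused.
                    rid = rid[1:] if rid.startswith("~") else rid
                    if rid not in rid_directions:
                        problems.append("rid-id " + rid + " has no a=rid line")
                    elif rid_directions[rid] != direction:
                        problems.append("rid-id " + rid + " direction mismatch")
    return problems


rids = {"1": "send", "2": "send", "3": "send", "4": "recv"}
print(validate_simulcast(["send 1;2,3 recv 4"], rids))
```

   An implementation would typically run such a check before treating
   any listed rid-id as a valid simulcast stream.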

   rid-id that have explicit dependencies [RFC5583]
   [I-D.ietf-mmusic-rid] to other rid-id (even in the same media
   description) MAY be used.

   Use of more than a single alternative simulcast format for a
   simulcast stream MAY be specified as part of the attribute parameters
   by expressing the simulcast stream as a comma-separated list of
   alternative rid-id. The order of the rid-id alternatives within a
   simulcast stream is significant; the rid-id alternatives are listed
   from (left) most preferred to (right) least preferred. For the use
   of simulcast, this overrides the normal codec preference as expressed
   by format type ordering on the "m=" line, using regular SDP rules.
   This is to enable a separation of general codec preferences and
   simulcast stream configuration preferences. However, the choice of
   which alternative to use per simulcast stream is independent, and
   there is currently no mechanism to align the choice between
   alternative rid-ids between different simulcast streams.

   A simulcast stream can use a codec defined such that the same RTP
   SSRC can change RTP payload type multiple times during a session,
   possibly even on a per-packet basis. A typical example can be a
   speech codec that makes use of Comfort Noise [RFC3389] and/or DTMF
   [RFC4733] formats.

   If RTP stream pause/resume [RFC7728] is supported, any rid-id MAY be
   prefixed by a "~" character to indicate that the corresponding
   simulcast stream is initially paused already from the start of the
   RTP session. In this case, support for RTP stream pause/resume MUST
   also be included under the same "m=" line where "a=simulcast" is
   included. All RTP payload types related to such an initially paused
   simulcast stream MUST be listed in the SDP as pause/resume capable as
   specified by [RFC7728], e.g. by using the "*" wildcard format for
   "a=rtcp-fb".
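
   The requirement that every payload type behind an initially paused
   rid-id is pause/resume capable can be illustrated with the following
   non-normative sketch. It deliberately does not parse "a=rid" or
   "a=rtcp-fb" lines; the caller is assumed to have already derived the
   two inputs from the SDP. The function name and both input layouts
   are hypothetical.

```python
def unpausable_marked_rids(simulcast_value, rid_payload_types, pause_capable_pts):
    """List rid-ids marked "~" whose payload types are not all pause capable.

    rid_payload_types: assumed mapping rid-id -> payload types it may use
        (for an rid-id without a "pt" restriction, the caller is expected
        to supply all payload types of the media description).
    pause_capable_pts: assumed set of payload types signaled as
        pause/resume capable.
    """
    offending = []
    tokens = simulcast_value.split()
    for stream_list in tokens[1::2]:
        for alternatives in stream_list.split(";"):
            for rid in alternatives.split(","):
                if not rid.startswith("~"):
                    continue  # not initially paused, nothing to check
                payload_types = set(rid_payload_types[rid[1:]])
                if not payload_types <= set(pause_capable_pts):
                    offending.append(rid[1:])
    return offending


# rid-id 3 uses payload type 99, which is not pause capable in this example.
print(unpausable_marked_rids("send 1;~2,~3", {"1": [97], "2": [98], "3": [99]}, {97, 98}))
```

   When the "*" wildcard is used for pause/resume capability, every
   payload type of the media description would be in the capable set and
   the returned list would be empty.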

   An initially paused simulcast stream in "send" direction for the
   endpoint sending the SDP MUST be considered equivalent to an
   unsolicited locally paused stream, and be handled accordingly.
   Initially paused simulcast streams are resumed as described by the
   RTP pause/resume specification. An RTP stream receiver that wishes
   to resume an unsolicited locally paused stream needs to know the SSRC
   of that stream. The SSRC of an initially paused simulcast stream can
   be obtained from an RTP stream sender RTCP Sender Report (SR)
   including both the desired SSRC as "SSRC of sender", and the rid-id
   value in an RtpStreamId RTCP SDES item [I-D.ietf-avtext-rid].

   If the endpoint sending the SDP includes a "recv" direction
   simulcast stream that is initially paused, then the remote RTP sender
   receiving the SDP SHOULD put its RTP stream in an unsolicited locally
   paused state. The simulcast stream sender does not put the stream in
   the locally paused state if there are other RTP stream receivers in
   the session that do not mark the simulcast stream as initially
   paused. However, in centralized conferencing the RTP sender usually
   does not see the SDP signalling from RTP receivers and cannot make
   this determination. The reason to require an initially paused "recv"
   stream to be considered locally paused by the remote RTP sender,
   instead of making it equivalent to implicitly sending a pause
   request, is that the pausing RTP sender cannot know which
   receiving SSRC owns the restriction when Temporary Maximum Media
   Stream Bit Rate Request (TMMBR) and Temporary Maximum Media Stream
   Bit Rate Notification (TMMBN) are used for pause/resume signaling
   (Section 5.6 of [RFC7728]) since the RTP receiver's SSRC in send
   direction is sometimes not yet known.
573 Use of the redundant audio data [RFC2198] format could be seen as a 574 form of simulcast for loss protection purposes, but is not considered 575 conflicting with the mechanisms described in this memo and MAY 576 therefore be used as any other format. In this case the "red" 577 format, rather than the carried formats, SHOULD be the one to list as 578 a simulcast stream on the "a=simulcast" line. 580 The media formats and corresponding characteristics of simulcast 581 streams SHOULD be chosen such that they are different, e.g. as 582 different SDP formats with differing "a=rtpmap" and/or "a=fmtp" 583 lines, or as differently defined RTP payload format restrictions. If 584 this difference is not required, it is RECOMMENDED to use RTP 585 duplication [RFC7104] procedures instead of simulcast. To avoid 586 complications in implementations, a single rid-id MUST NOT occur more 587 than once per "a=simulcast" line. Note that this does not eliminate 588 use of simulcast as an RTP duplication mechanism, since it is 589 possible to define multiple different rid-id that are effectively 590 equivalent. 592 5.3. Offer/Answer Use 594 Note: The inclusion of "a=simulcast" or the use of simulcast does 595 not change any of the interpretation or Offer/Answer procedures 596 for other SDP attributes, like "a=fmtp" or "a=rid". 598 5.3.1. Generating the Initial SDP Offer 600 An offerer wanting to use simulcast for a media description SHALL 601 include one "a=simulcast" attribute in that media description in the 602 offer. An offerer listing a set of receive simulcast streams and/or 603 alternative formats as rid-id in the offer MUST be prepared to 604 receive RTP streams for any of those simulcast streams and/or 605 alternative formats from the answerer. 607 5.3.2. 
Creating the SDP Answer 609 An answerer that does not understand the concept of simulcast will 610 also not know the attribute and will remove it in the SDP answer, as 611 defined in existing SDP Offer/Answer [RFC3264] procedures. Since SDP 612 session level simulcast is undefined in this memo, an answerer that 613 receives an offer with the "a=simulcast" attribute on SDP session 614 level SHALL remove it in the answer. An answerer that understands 615 the attribute but receives multiple "a=simulcast" attributes in the 616 same media description SHALL disable use of simulcast by removing all 617 "a=simulcast" lines for that media description in the answer. 619 An answerer that does understand the attribute and that wants to 620 support simulcast in an indicated direction SHALL reverse 621 directionality of the unidirectional direction parameters; "send" 622 becomes "recv" and vice versa, and include it in the answer. 624 An answerer that receives an offer with simulcast containing an 625 "a=simulcast" attribute listing alternative rid-id MAY keep all the 626 alternative rid-id in the answer, but it MAY also choose to remove 627 any non-desirable alternative rid-id in the answer. The answerer 628 MUST NOT add any alternative rid-id in send direction in the answer 629 that were not present in the offer receive direction. The answerer 630 MUST be prepared to receive any of the receive direction rid-id 631 alternatives and MAY send any of the send direction alternatives that 632 are part of the answer. 634 An answerer that receives an offer with simulcast that lists a number 635 of simulcast streams, MAY reduce the number of simulcast streams in 636 the answer, but MUST NOT add simulcast streams. 638 An answerer that receives an offer without RTP stream pause/resume 639 capability MUST NOT mark any simulcast streams as initially paused in 640 the answer. 
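The direction reversal described above can be sketched as follows. This is a minimal, non-normative Python illustration; the function name is hypothetical, and the sketch assumes the attribute value consists of whitespace-separated pairs of a direction keyword and a simulcast stream list. It performs only the reversal and does not remove streams or alternatives, which an answerer is also allowed to do.

```python
def reverse_simulcast_directions(sc_value: str) -> str:
    """Swap "send" and "recv" in the value of an a=simulcast attribute."""
    swap = {"send": "recv", "recv": "send"}
    tokens = sc_value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected <direction> <stream list> pairs")
    # Only the direction keywords (even positions) are rewritten; the
    # stream lists are copied unchanged.
    for i in range(0, len(tokens), 2):
        tokens[i] = swap[tokens[i]]
    return " ".join(tokens)


offer_value = "send 1;2 recv 3"
answer_value = reverse_simulcast_directions(offer_value)
```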
642 An RTP stream pause/resume capable answerer that receives an offer 643 with RTP stream pause/resume capability MAY mark any rid-id that 644 refer to pause/resume capable formats as initially paused in the 645 answer. 647 An answerer that receives indication in an offer of an rid-id being 648 initially paused SHOULD mark that rid-id as initially paused also in 649 the answer, regardless of direction, unless it has good reason for 650 the rid-id not being initially paused. One reason to remove an 651 initial pause in the answer compared to the offer could, for example, 652 be that all receive direction simulcast streams for a media source 653 the answerer accepts in the answer would otherwise be paused. 655 5.3.3. Offerer Processing the SDP Answer 657 An offerer that receives an answer without "a=simulcast" MUST NOT use 658 simulcast towards the answerer. An offerer that receives an answer 659 with "a=simulcast" without any rid-id in a specified direction MUST 660 NOT use simulcast in that direction. 662 An offerer that receives an answer where some rid-id alternatives are 663 kept MUST be prepared to receive any of the kept send direction rid- 664 id alternatives, and MAY send any of the kept receive direction rid- 665 id alternatives. 667 An offerer that receives an answer where some of the rid-id are 668 removed compared to the offer MAY release the corresponding resources 669 (codec, transport, etc) in its receive direction and MUST NOT send 670 any RTP packets corresponding to the removed rid-id. 672 An offerer that offered some of its rid-id as initially paused and 673 that receives an answer that does not indicate RTP stream pause/ 674 resume capability, MUST NOT initially pause any simulcast streams. 
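A minimal, non-normative Python sketch of how an offerer might apply the rules above to its send direction: the rid-id values it offered for sending are compared with those the answer lists in the receive direction. The function and variable names are hypothetical, and the rid-id lists are assumed to have been extracted from the "a=simulcast" lines already.

```python
from typing import List, Set, Tuple


def apply_answer(offered_send: List[str],
                 answer_recv: List[str]) -> Tuple[List[str], Set[str]]:
    """Return (rid-ids that may still be sent, rid-ids that must not be sent)."""
    kept = [rid for rid in offered_send if rid in answer_recv]
    removed = set(offered_send) - set(kept)
    return kept, removed


# The answerer kept rid-id 1 and 3 but removed rid-id 2.
kept, removed = apply_answer(["1", "2", "3"], ["1", "3"])

# An answer without "a=simulcast", or without any rid-id in this
# direction, leaves nothing to send as simulcast.
kept_none, removed_all = apply_answer(["1", "2"], [])
```

Resources tied to the rid-id values in the second return value are the ones the offerer may release.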
676 An offerer with RTP stream pause/resume capability that receives an 677 answer where some rid-id are marked as initially paused, SHOULD 678 initially pause those RTP streams regardless of whether they were marked as 679 initially paused also in the offer, unless it has good reason for 680 those RTP streams not being initially paused. One such reason could, 681 for example, be that the answerer would otherwise initially not 682 receive any media of that type at all. 684 5.3.4. Modifying the Session 686 Offers inside an existing session follow the same rules as for an 687 initial SDP offer, with these additions: 689 1. rid-id marked as initially paused in the offerer's send direction 690 SHALL reflect the offerer's opinion of the current pause state at 691 the time of creating the offer. This is purely informational, 692 and RTP stream pause/resume [RFC7728] signaling in the ongoing 693 session SHALL take precedence in case of any conflict or 694 ambiguity. 696 2. rid-id marked as initially paused in the offerer's receive 697 direction SHALL (as in an initial offer) reflect the offerer's 698 desired rid-id pause state. Except for the case where the 699 offerer already paused the corresponding RTP stream through RTP 700 stream pause/resume [RFC7728] signaling, this is identical to 701 the conditions at an initial offer. 703 Creation of SDP answers and processing of SDP answers inside an 704 existing session follow the same rules as described above for initial 705 SDP offer/answer. 707 Session modification restrictions in Section 6.5 of RTP payload 708 format restrictions [I-D.ietf-mmusic-rid] also apply. 710 5.4. Use with Declarative SDP 712 This document does not define the use of "a=simulcast" in declarative 713 SDP, partly motivated by use of the simulcast format identification 714 [I-D.ietf-mmusic-rid] not being defined for use in declarative SDP.
715 If concrete use cases for simulcast in declarative SDP are identified 716 in the future, the authors of this memo expect that additional 717 specifications will address such use. 719 5.5. Relating Simulcast Streams 721 Simulcast RTP streams MUST be related on RTP level through 722 RtpStreamId [I-D.ietf-avtext-rid], as specified in the SDP 723 "a=simulcast" attribute (Section 5.2) parameters. This is sufficient 724 as long as there is only a single media source per SDP media 725 description. When using BUNDLE 726 [I-D.ietf-mmusic-sdp-bundle-negotiation], where multiple SDP media 727 descriptions jointly specify a single RTP session, the SDES MID 728 identification mechanism in BUNDLE allows relating RTP streams back 729 to individual media descriptions, after which the above described 730 RtpStreamId relations can be used. Use of the RTP header extension 731 [RFC8285] for both MID and RtpStreamId identifications can be 732 important to ensure rapid initial reception, required to correctly 733 interpret and process the RTP streams. Implementers of this 734 specification MUST support the RTCP source description (SDES) item 735 method and SHOULD support the RTP header extension method to signal 736 RtpStreamId on RTP level. 738 NOTE: For the case where it is clear from SDP that RTP PT uniquely 739 maps to corresponding RtpStreamId, an RTP receiver can use RTP PT 740 to relate simulcast streams. This can sometimes enable decoding 741 even in advance of receiving RtpStreamId information in RTCP SDES 742 and/or RTP header extensions. 744 RTP streams MUST only use a single alternative rid-id at a time 745 (based on RTP timestamps), but MAY change format (and rid-id) on a 746 per-RTP packet basis. This corresponds to the existing (non- 747 simulcast) SDP offer/answer case when multiple formats are included 748 on the "m=" line in the SDP answer, enabling per-RTP packet change of 749 RTP payload type. 751 5.6.
Signaling Examples 753 These examples describe a client to video conference service, using a 754 centralized media topology with an RTP mixer. 756 +---+ +-----------+ +---+ 757 | A |<---->| |<---->| B | 758 +---+ | | +---+ 759 | Mixer | 760 +---+ | | +---+ 761 | F |<---->| |<---->| J | 762 +---+ +-----------+ +---+ 764 Figure 4: Four-party Mixer-based Conference 766 5.6.1. Single-Source Client 768 Alice is calling in to the mixer with a simulcast-enabled client 769 capable of a single media source per media type. The client can send 770 a simulcast of 2 video resolutions and frame rates: HD 1280x720p 771 30fps and thumbnail 320x180p 15fps. This is defined below using the 772 "imageattr" [RFC6236]. In this example, only the "pt" "a=rid" 773 parameter is used, effectively achieving a 1:1 mapping between 774 RtpStreamId and media formats (RTP payload types), to describe 775 simulcast stream formats. Alice's Offer: 777 v=0 778 o=alice 2362969037 2362969040 IN IP4 192.0.2.156 779 s=Simulcast Enabled Client 780 c=IN IP4 192.0.2.156 781 t=0 0 782 m=audio 49200 RTP/AVP 0 783 a=rtpmap:0 PCMU/8000 784 m=video 49300 RTP/AVP 97 98 785 a=rtpmap:97 H264/90000 786 a=rtpmap:98 H264/90000 787 a=fmtp:97 profile-level-id=42c01f;max-fs=3600;max-mbps=108000 788 a=fmtp:98 profile-level-id=42c00b;max-fs=240;max-mbps=3600 789 a=imageattr:97 send [x=1280,y=720] recv [x=1280,y=720] 790 a=imageattr:98 send [x=320,y=180] recv [x=320,y=180] 791 a=rid:1 send pt=97 792 a=rid:2 send pt=98 793 a=rid:3 recv pt=97 794 a=simulcast:send 1;2 recv 3 795 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id 797 Figure 5: Single-Source Simulcast Offer 799 The only thing in the SDP that indicates simulcast capability is the 800 line in the video media description containing the "simulcast" 801 attribute. The included "a=fmtp" and "a=imageattr" parameters 802 indicates that sent simulcast streams can differ in video resolution. 
803 The RTP header extension for RtpStreamId is offered to avoid issues 804 with the initial binding between RTP streams (SSRCs) and the 805 RtpStreamId identifying the simulcast stream and its format. 807 The Answer from the server indicates that it too is simulcast 808 capable. Should it not have been simulcast capable, the 809 "a=simulcast" line would not have been present and communication 810 would have started with the media negotiated in the SDP. Also the 811 usage of the RtpStreamId RTP header extension is accepted. 813 v=0 814 o=server 823479283 1209384938 IN IP4 192.0.2.2 815 s=Answer to Simulcast Enabled Client 816 c=IN IP4 192.0.2.43 817 t=0 0 818 m=audio 49672 RTP/AVP 0 819 a=rtpmap:0 PCMU/8000 820 m=video 49674 RTP/AVP 97 98 821 a=rtpmap:97 H264/90000 822 a=rtpmap:98 H264/90000 823 a=fmtp:97 profile-level-id=42c01f;max-fs=3600;max-mbps=108000 824 a=fmtp:98 profile-level-id=42c00b;max-fs=240;max-mbps=3600 825 a=imageattr:97 send [x=1280,y=720] recv [x=1280,y=720] 826 a=imageattr:98 send [x=320,y=180] recv [x=320,y=180] 827 a=rid:1 recv pt=97 828 a=rid:2 recv pt=98 829 a=rid:3 send pt=97 830 a=simulcast:recv 1;2 send 3 831 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id 833 Figure 6: Single-Source Simulcast Answer 835 Since the server is the simulcast media receiver, it reverses the 836 direction of the "simulcast" and "rid" attribute parameters. 838 5.6.2. Multi-Source Client 840 Fred is calling in to the same conference as in the example above 841 with a two-camera, two-display system, thus capable of handling two 842 separate media sources in each direction, where each media source is 843 simulcast-enabled in the send direction. Fred's client is restricted 844 to a single media source per media description. 846 The first two simulcast streams for the first media source use 847 different codecs, H264-SVC [RFC6190] and H264 [RFC6184]. These two 848 simulcast streams also have a temporal dependency. 
Two different 849 video codecs, VP8 [RFC7741] and H264, are offered as alternatives for 850 the third simulcast stream for the first media source. Only the 851 highest fidelity simulcast stream is sent from start, the lower 852 fidelity streams being initially paused. 854 The second media source is offered with three different simulcast 855 streams. All video streams of this second media source are loss 856 protected by RTP retransmission [RFC4588]. Also here, all but the 857 highest fidelity simulcast stream are initially paused. Note that 858 the lower resolution is more prioritized than the medium resolution 859 simulcast stream. 861 Fred's client is also using BUNDLE to send all RTP streams from all 862 media descriptions in the same RTP session on a single media 863 transport. Although using many different simulcast streams in this 864 example, the use of RtpStreamId as simulcast stream identification 865 enables use of a low number of RTP payload types. Note that the use 866 of both BUNDLE [I-D.ietf-mmusic-sdp-bundle-negotiation] and "a=rid" 867 [I-D.ietf-mmusic-rid] recommends using the RTP header extension 868 [RFC8285] for carrying these RTP stream identification fields, which 869 is consequently also included in the SDP. Note also that for 870 "a=rid", the corresponding RtpStreamId SDES attribute RTP header 871 extension is named rtp-stream-id [I-D.ietf-avtext-rid]. 
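To make the notation used in this offer easier to follow, the non-normative Python sketch below parses an "a=simulcast" attribute value into directions, simulcast streams (separated by ";"), and alternative formats (separated by ","), flagging rid-id values prefixed with "~" as initially paused. The type and function names are hypothetical; the input is the value from the first video media description of Fred's offer.

```python
from typing import Dict, List, NamedTuple


class SimulcastId(NamedTuple):
    rid: str
    paused: bool


def parse_simulcast(value: str) -> Dict[str, List[List[SimulcastId]]]:
    """Parse e.g. "send 1;2;~4,3" into {direction: [[alternatives], ...]}."""
    tokens = value.split()
    result: Dict[str, List[List[SimulcastId]]] = {}
    # Tokens alternate between a direction keyword and a stream list.
    for direction, stream_list in zip(tokens[0::2], tokens[1::2]):
        streams = []
        for alt_list in stream_list.split(";"):
            alts = []
            for item in alt_list.split(","):
                paused = item.startswith("~")
                alts.append(SimulcastId(item[1:] if paused else item, paused))
            streams.append(alts)
        result[direction] = streams
    return result


parsed = parse_simulcast("send 1;2;~4,3")
third_stream = parsed["send"][2]   # alternatives: rid-id 4 (paused) and 3
```

The list order is preserved because the order of streams and alternatives on the "a=simulcast" line carries the sender's priority.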
873 v=0 874 o=fred 238947129 823479223 IN IP6 2001:db8::c000:27d 875 s=Offer from Simulcast Enabled Multi-Source Client 876 c=IN IP6 2001:db8::c000:27d 877 t=0 0 878 a=group:BUNDLE foo bar zen 879 m=audio 49200 RTP/AVP 99 880 a=mid:foo 881 a=rtpmap:99 G722/8000 882 m=video 49600 RTP/AVPF 100 101 103 883 a=mid:bar 884 a=rtpmap:100 H264-SVC/90000 885 a=rtpmap:101 H264/90000 886 a=rtpmap:103 VP8/90000 887 a=fmtp:100 profile-level-id=42400d;max-fs=3600;max-mbps=216000; \ 888 mst-mode=NI-TC 889 a=fmtp:101 profile-level-id=42c00d;max-fs=3600;max-mbps=108000 890 a=fmtp:103 max-fs=900; max-fr=30 891 a=rid:1 send pt=100;max-width=1280;max-height=720;max-fps=60;depend=2 892 a=rid:2 send pt=101;max-width=1280;max-height=720;max-fps=30 893 a=rid:3 send pt=101;max-width=640;max-height=360 894 a=rid:4 send pt=103;max-width=640;max-height=360 895 a=depend:100 lay bar:101 896 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 897 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id 898 a=rtcp-fb:* ccm pause nowait 899 a=simulcast:send 1;2;~4,3 900 m=video 49602 RTP/AVPF 96 104 901 a=mid:zen 902 a=rtpmap:96 VP8/90000 903 a=fmtp:96 max-fs=3600; max-fr=30 904 a=rtpmap:104 rtx/90000 905 a=fmtp:104 apt=96;rtx-time=200 906 a=rid:1 send max-fs=921600;max-fps=30 907 a=rid:2 send max-fs=614400;max-fps=15 908 a=rid:3 send max-fs=230400;max-fps=30 909 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 910 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id 911 a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id 912 a=rtcp-fb:* ccm pause nowait 913 a=simulcast:send 1;~3;~2 915 Figure 7: Fred's Multi-Source Simulcast Offer 917 5.6.3. Simulcast and Redundancy 919 The example in this section looks at applying simulcast with audio 920 and video redundancy formats. The audio media description uses codec 921 and bitrate restrictions, combining it with RTP Payload for Redundant 922 Audio Data [RFC2198] for enhanced packet loss resilience. 
The video 923 media description applies both resolution and bitrate restrictions, 924 combining it with FEC in the form of Flexible FEC 925 [I-D.ietf-payload-flexible-fec-scheme] and RTP Retransmission 926 [RFC4588]. 928 The audio source is offered to be sent as two simulcast streams. The 929 first simulcast stream is encoded with Opus, restricted to 64 kbps 930 (rid-id=1), and the second simulcast stream (rid-id=2) is encoded either with 931 G.711 or with G.711 combined with LPC for redundancy. 932 In this example, stand-alone LPC is not offered as a 933 possible payload type for the second simulcast stream's RID, which 934 could, for example, be motivated by stand-alone LPC not providing sufficient quality. 936 The video source is offered to be sent as two simulcast streams, both 937 with two alternative simulcast formats. Redundancy and repair are 938 offered in the form of both Flexible FEC and RTP Retransmission. The 939 Flexible FEC is not bound to any particular RTP streams and is 940 therefore possible to use across all RTP streams that are being sent 941 as part of this media description.
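The way the "pt" parameter narrows the payload types available to a simulcast stream can be illustrated with a small, non-normative Python sketch that parses the two audio "a=rid" lines of the offer in this section. The function name and return shape are hypothetical; only the simple "id direction key=value;..." layout used in these examples is handled.

```python
from typing import Dict, List, Tuple


def parse_rid(line: str) -> Tuple[str, str, List[int], Dict[str, str]]:
    """Split an a=rid line into (rid-id, direction, payload types, restrictions)."""
    prefix = "a=rid:"
    if not line.startswith(prefix):
        raise ValueError("not an a=rid line")
    parts = line[len(prefix):].split(" ", 2)
    rid, direction = parts[0], parts[1]
    payload_types: List[int] = []
    restrictions: Dict[str, str] = {}
    if len(parts) == 3:
        for param in parts[2].split(";"):
            key, _, value = param.partition("=")
            if key == "pt":
                payload_types = [int(pt) for pt in value.split(",")]
            else:
                restrictions[key] = value
    return rid, direction, payload_types, restrictions


opus_rid = parse_rid("a=rid:1 send pt=99,102;max-br=64000")
red_rid = parse_rid("a=rid:2 send pt=100,97,101,102")
lpc_alone_allowed = 98 in red_rid[2]   # payload type 98 is LPC in this offer
```

Because payload type 98 is absent from the second line's "pt" list, stand-alone LPC cannot be used on that simulcast stream, although it remains usable inside the RED payload (payload type 100).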
943 v=0 944 o=fred 238947129 823479223 IN IP6 2001:db8::c000:27d 945 s=Offer from Simulcast Enabled Client using Redundancy 946 c=IN IP6 2001:db8::c000:27d 947 t=0 0 948 a=group:BUNDLE foo bar 949 m=audio 49200 RTP/AVP 97 98 99 100 101 102 950 a=mid:foo 951 a=rtpmap:97 G711/8000 952 a=rtpmap:98 LPC/8000 953 a=rtpmap:99 OPUS/48000/1 954 a=rtpmap:100 RED/8000/1 955 a=rtpmap:101 CN/8000 956 a=rtpmap:102 telephone-event/8000 957 a=fmtp:99 useinbandfec=1;usedtx=0 958 a=fmtp:100 97/98 959 a=fmtp:102 0-15 960 a=ptime:20 961 a=maxptime:40 962 a=rid:1 send pt=99,102;max-br=64000 963 a=rid:2 send pt=100,97,101,102 964 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 965 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id 966 a=simulcast:send 1;2 967 m=video 49600 RTP/AVPF 103 104 105 106 107 968 a=mid:bar 969 a=rtpmap:103 H264/90000 970 a=rtpmap:104 VP8/90000 971 a=rtpmap:105 rtx/90000 972 a=rtpmap:106 rtx/90000 973 a=rtpmap:107 flexfec/90000 974 a=fmtp:103 profile-level-id=42c00d;max-fs=3600;max-mbps=108000 975 a=fmtp:104 max-fs=3600; max-fr=30 976 a=fmtp:105 apt=103;rtx-time=200 977 a=fmtp:106 apt=104;rtx-time=200 978 a=fmtp:107 repair-window=2000 979 a=rid:1 send pt=103;max-width=1280;max-height=720;max-fps=30 980 a=rid:2 send pt=104;max-width=1280;max-height=720;max-fps=30 981 a=rid:3 send pt=103;max-width=640;max-height=360;max-br=300000 982 a=rid:4 send pt=104;max-width=640;max-height=360;max-br=300000 983 a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid 984 a=extmap:2 urn:ietf:params:rtp-hdrext:sdes:rtp-stream-id 985 a=extmap:3 urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id 986 a=rtcp-fb:* ccm pause nowait 987 a=simulcast:send 1,2;3,4 989 Figure 8: Simulcast and Redundancy Example 991 6. RTP Aspects 993 This section discusses what the different entities in a simulcast 994 media path can expect to happen on RTP level. This is explored from 995 source to sink by starting in an endpoint with a media source that is 996 simulcasted to an RTP middlebox. 
That RTP middlebox both sends media 997 sources to other RTP middleboxes (cascaded middleboxes) and 998 selects some simulcast format of the media source to send 999 to receiving endpoints. Different types of RTP middleboxes and their 1000 usage of the different simulcast formats result in several different 1001 behaviors. 1003 6.1. Outgoing from Endpoint with Media Source 1005 The most straightforward simulcast case is the RTP streams being 1006 emitted from the endpoint that originates a media source. When 1007 simulcast has been negotiated in the sending direction, the endpoint 1008 can transmit up to the number of RTP streams needed for the 1009 negotiated simulcast streams for that media source. Each RTP stream 1010 (SSRC) is identified by associating (Section 5.5) it with an 1011 RtpStreamId SDES item, transmitted in RTCP and possibly also as an 1012 RTP header extension. In cases where multiple media sources have 1013 been negotiated for the same RTP session and thus BUNDLE 1014 [I-D.ietf-mmusic-sdp-bundle-negotiation] is used, the MID SDES 1015 item will also be sent similarly to the RtpStreamId. 1017 Each RTP stream might not be continuously transmitted due to any of 1018 the following reasons: temporarily paused using Pause/Resume 1019 [RFC7728], sender side application logic temporarily pausing it, or 1020 lack of network resources to transmit this simulcast stream. 1021 However, all simulcast streams that have been negotiated have an active 1022 and maintained SSRC (at least in regular RTCP reports), even if no 1023 RTP packets are currently transmitted. The relation between an RTP 1024 Stream (SSRC) and a particular simulcast stream is not expected to 1025 change, except in exceptional situations such as SSRC collisions. At 1026 SSRC changes, the usage of MID and RtpStreamId should enable the 1027 receiver to correctly identify the RTP streams even after an SSRC 1028 change. 1030 6.2.
RTP Middlebox to Receiver 1032 RTP streams in a multi-party RTP session can be used in multiple 1033 different ways, when the session utilizes simulcast at least on the 1034 media source to middlebox legs. This is to a large degree due to the 1035 different RTP middlebox behaviors, but also the needs of the 1036 application. This text assumes that the RTP middlebox will select a 1037 media source and choose which simulcast stream for that media source 1038 to deliver to a specific receiver. In many cases, at most one 1039 simulcast stream per media source will be forwarded to a particular 1040 receiver at any instant in time, even if the selected simulcast 1041 stream may vary. For cases where this does not hold due to 1042 application needs, the RTP stream aspects will fall under the 1043 middlebox to middlebox case (Section 6.3). 1045 The selection of which simulcast streams to forward towards the 1046 receiver is application specific. However, in conferencing 1047 applications, active speaker selection is common. In case the number 1048 of media sources possible to forward, N, is less than the total 1049 number of media sources available in a multimedia session, the 1050 current and previous speakers (up to N in total) are often the ones 1051 forwarded. To avoid the need for media specific processing to 1052 determine the current speaker(s) in the RTP middlebox, the endpoint 1053 providing a media source may include meta data, such as the RTP 1054 Header Extension for Client-to-Mixer Audio Level Indication 1055 [RFC6464]. 1057 The possibilities for stream switching are media type specific, but 1058 for media types with significant interframe dependencies in the 1059 encoding, like most video coding, the switching needs to be made at 1060 suitable switching points in the media stream that break or 1061 otherwise deal with the dependency structure.
Even if switching 1062 points can be included periodically, it is common to use mechanisms 1063 like Full Intra Requests [RFC5104] to request switching points from 1064 the endpoint performing the encoding of the media source. 1066 Inclusion of the RtpStreamId SDES item for an SSRC in the middlebox 1067 to receiver direction should only occur when use of RtpStreamId has 1068 been negotiated in that direction. It is worth noting that one can 1069 signal multiple RtpStreamIds when simulcast signalling indicates only 1070 a single simulcast stream, allowing one to use all of the 1071 RtpStreamIds as alternatives for that simulcast stream. One reason 1072 for including the RtpStreamId in the middlebox to receiver direction 1073 for an RTP stream is to let the receiver know which restrictions 1074 apply to the currently delivered RTP stream. In case the RtpStreamId 1075 is negotiated to be used, it is important to remember that the used 1076 identifiers will be specific to each signalling session. Even if the 1077 central entity can attempt to coordinate, it is likely that the 1078 RtpStreamIds need to be translated to the leg specific values. The 1079 below cases will have as base line that RtpStreamId is not used in 1080 the mixer to receiver direction. 1082 6.2.1. Media-Switching Mixer 1084 This section discusses the behavior in cases where the RTP middlebox 1085 behaves like the Media-Switching Mixer (Section 3.6.2) in RTP 1086 Topologies [RFC7667]. The fundamental aspect here is that the media 1087 sources delivered from the middlebox will be the mixer's conceptual 1088 or functional ones. For example, one media source may be the main 1089 speaker in high resolution video, while a number of other media 1090 sources are thumbnails of each participant. 1092 The above results in that the RTP stream produced by the mixer is one 1093 that switches between a number of received incoming RTP streams for 1094 different media sources and in different simulcast versions. 
The 1095 mixer selects the media source to be sent as one of the RTP streams, 1096 and then selects among the available simulcast streams for the most 1097 appropriate one. The selection criteria include available bandwidth 1098 on the mixer to receiver path and restrictions based on the 1099 functional usage of the RTP stream delivered to the receiver. As an 1100 example of the latter, it is unnecessary to forward a full HD video 1101 to a receiver if the display area is just a thumbnail. Thus, 1102 restrictions may exist to not allow some simulcast streams to be 1103 forwarded for some of the mixer's media sources. 1105 This will result in a single RTP stream being used for each of the 1106 RTP mixer's media sources. This RTP stream is at any point in time a 1107 selection of one particular RTP stream arriving to the mixer, where 1108 the RTP header field values are rewritten to provide a consistent, 1109 single RTP stream. If the RTP mixer doesn't receive any incoming 1110 stream matched to this media source, the SSRC will not transmit, but 1111 be kept alive using RTCP. The SSRC and thus RTP stream for the 1112 mixer's media source is expected to be long term stable. It will 1113 only be changed by signalling or other disruptive events. Note that 1114 although the above talks about a single RTP stream, there can in some 1115 cases be multiple RTP streams carrying the selected simulcast stream 1116 for the originating media source, including redundancy or other 1117 auxiliary RTP streams. 1119 The mixer may communicate the identity of the originating media 1120 source to the receiver by including the CSRC field with the 1121 originating media source's SSRC value. Note that due to the 1122 possibility that the RTP mixer switches between simulcast versions of 1123 the media source, the CSRC value may change, even if the media source 1124 is kept the same. 
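The header rewriting described in this section can be sketched, in a deliberately simplified and non-normative way, as follows. The Python classes below are hypothetical: they model only the SSRC, sequence number, and CSRC list, assign consecutive outgoing sequence numbers, and ignore timestamp rewriting as well as loss or reordering on the incoming leg. They show that the outgoing SSRC stays stable while the CSRC follows whichever incoming RTP stream is currently selected.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RtpHeader:
    ssrc: int
    seq: int
    csrc: List[int] = field(default_factory=list)


class SwitchingMixerOutput:
    """One outgoing RTP stream for one of the mixer's media sources."""

    def __init__(self, out_ssrc: int, first_seq: int = 0) -> None:
        self.out_ssrc = out_ssrc
        self.next_seq = first_seq

    def forward(self, incoming: RtpHeader) -> RtpHeader:
        # Keep the mixer's own SSRC and sequence space; expose the
        # originating RTP stream's SSRC as the CSRC.
        out = RtpHeader(ssrc=self.out_ssrc, seq=self.next_seq,
                        csrc=[incoming.ssrc])
        self.next_seq = (self.next_seq + 1) % 65536  # 16-bit wrap-around
        return out


output = SwitchingMixerOutput(out_ssrc=0xAAAA0001, first_seq=65535)
first = output.forward(RtpHeader(ssrc=0x1111, seq=500))    # HD version
second = output.forward(RtpHeader(ssrc=0x2222, seq=9000))  # thumbnail version
```

In this sketch the two forwarded packets share the mixer's SSRC and have consecutive sequence numbers, while their CSRC values differ because the mixer switched between two simulcast versions of the same media source.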
1126 It is important to note that any MID SDES item from the originating 1127 media source needs to be removed and not be associated with the RTP 1128 stream's SSRC. That is, there is nothing in the signalling between 1129 the mixer and the receiver that is structured around the originating 1130 media sources, only the mixer's media sources. If such MID items were 1131 associated with the SSRC, the receiver would likely believe that 1132 there has been an SSRC collision, and that the RTP stream is spurious 1133 as it doesn't carry the identifiers used to relate it to the correct 1134 context. However, this is not true for CSRC values, as long as they 1135 are never used as SSRC. In these cases one could provide CNAME and 1136 MID as SDES items. A receiver could use this to determine which CSRC 1137 values are associated with the same originating media source. 1139 If RtpStreamIds are used in the scenario described by this section, 1140 it should be noted that the RtpStreamId on a particular SSRC will 1141 change based on the actual simulcast stream selected for switching. 1142 These RtpStreamId identifiers will be local to this leg's signalling 1143 context. In addition, the defined RtpStreamIds and their parameters 1144 need to cover all the media sources and simulcast streams received by 1145 the RTP mixer that can be switched into this media source, sent by 1146 the RTP mixer. 1148 6.2.2. Selective Forwarding Middlebox 1150 This section discusses the behavior in cases where the RTP middlebox 1151 behaves like the Selective Forwarding Middlebox (Section 3.7) in RTP 1152 Topologies [RFC7667]. Applications of this type of RTP middlebox 1153 result in each originating media source having a 1154 corresponding media source on the leg between the middlebox and the 1155 receiver.
A Selective Forwarding Middlebox (SFM) could go as far as 1156 exposing all the simulcast streams for a media source; however, this 1157 section will focus on having a single simulcast stream that can 1158 contain any of the simulcast formats. This section will assume that 1159 the SFM projection mechanism works on media source level, and maps 1160 one of the media source's simulcast streams onto one RTP stream from 1161 the SFM to the receiver. 1163 With this usage, the individual RTP stream(s) for one 1164 media source can switch between being active and paused, based on the 1165 subset of media sources the SFM wants to provide the receiver for the 1166 moment. With SFMs there is no reason to use CSRC to indicate the 1167 originating stream, as there is a one-to-one media source mapping. 1168 If the application requires knowing the simulcast version received to 1169 function well, then RtpStreamId should be negotiated on the SFM to 1170 receiver leg. Which simulcast stream is being forwarded is not 1171 made explicit unless RtpStreamId is used on the leg. 1173 Any MID SDES items being sent by the SFM to the receiver are only 1174 those agreed between the SFM and the receiver, and no MID values from 1175 the originating side of the SFM are to be forwarded. 1177 An SFM could expose corresponding RTP streams for all the media 1178 sources and their simulcast streams, and then for any media source 1179 that is to be provided forward one selected simulcast stream. 1180 However, this is not recommended as it would unnecessarily increase 1181 the number of RTP streams and require the receiver to detect 1182 switching between simulcast streams in a timely manner. The above usage requires the 1183 same SFM functionality for switching, while avoiding the 1184 uncertainties of detecting in a timely manner that an RTP stream ends.
The 1185 benefit would be that the received simulcast stream is 1186 implicitly indicated by which RTP stream is active for a media 1187 source. However, using RtpStreamId to make this explicit also 1188 exposes which alternative format is used. The conclusion is that 1189 using one RTP stream per simulcast stream is unnecessary. The issue 1190 with timely detecting end of streams, regardless of whether they are stopped 1191 temporarily or long term, is that there is no explicit indication 1192 that the transmission has intentionally been stopped. The RTCP-based 1193 Pause and Resume mechanism [RFC7728] includes a PAUSED indication 1194 that provides the last RTP sequence number transmitted prior to the 1195 pause. Due to usage, the timeliness of this solution depends on when 1196 delivery using RTCP can occur in relation to the transmission of the 1197 last RTP packet. If no explicit information is provided at all, then 1198 detection based on non-increasing RTCP SR field values and timers 1199 needs to be used to determine pause in RTP packet delivery. As a 1200 result, when the last RTP 1201 packet arrives (if it arrives), one usually cannot determine that it is the last. That it 1202 was the last is something that one learns later. 1204 6.3. RTP Middlebox to RTP Middlebox 1206 This relates to the transmission of simulcast streams between RTP 1207 middleboxes or other usages where one wants to enable the delivery of 1208 multiple simultaneous simulcast streams per media source, but the 1209 transmitting entity is not the originating endpoint. For a 1210 particular direction between middlebox A and B, this looks very 1211 similar to the originating endpoint to middlebox case on a media source basis. 1212 However, in this case there are usually multiple media sources, 1213 originating from multiple endpoints.
This can create situations 1214 where limitations in the number of simultaneously received media 1215 streams can arise, for example due to limited network 1216 bandwidth. In this case, a subset of not only the simulcast streams, 1217 but also media sources can be selected. As a result, 1218 individual RTP streams can become paused at any point and later 1219 be resumed based on various criteria. 1221 The MIDs used between A and B are the ones agreed between these two 1222 entities in signalling. The RtpStreamId values will also be 1223 provided to ensure explicit information about which simulcast stream 1224 they are. The RTP stream to MID and RtpStreamId associations should 1225 here be long term stable. 1227 7. Network Aspects 1229 Simulcast is in this memo defined as the act of sending multiple 1230 alternative encoded streams of the same underlying media source. 1231 When transmitting multiple independent streams that originate from 1232 the same source, it could potentially be done in several different 1233 ways using RTP. A general discussion on considerations for use of 1234 the different RTP multiplexing alternatives can be found in 1235 Guidelines for Multiplexing in RTP 1236 [I-D.ietf-avtcore-multiplex-guidelines]. Discussion and 1237 clarification on how to handle multiple streams in an RTP session can 1238 be found in [RFC8108]. 1240 The network aspects that are relevant for simulcast are: 1242 Quality of Service: When using simulcast it might be of interest to 1243 prioritize a particular simulcast stream, rather than applying 1244 equal treatment to all streams. For example, lower bitrate 1245 streams may be prioritized over higher bitrate streams to minimize 1246 congestion or packet losses in the low bitrate streams. Thus, 1247 there is a benefit in using a simulcast solution with good QoS 1248 support.
1250 NAT/FW Traversal: Using multiple RTP sessions incurs more cost for 1251 NAT/FW traversal unless they can re-use the same transport flow, 1252 which can be achieved by Multiplexing Negotiation Using SDP Port 1253 Numbers [I-D.ietf-mmusic-sdp-bundle-negotiation]. 1255 7.1. Bitrate Adaptation 1257 Use of multiple simulcast streams can require a significant amount of 1258 network resources. The aggregate bandwidth for all simulcast streams 1259 for a media source (and thus SDP media description) is bounded by any 1260 SDP "b=" line applicable to that media source. It is assumed that a 1261 suitable congestion control mechanism is used by the application to 1262 ensure that it doesn't cause persistent congestion. If the amount of 1263 available network resources varies during an RTP session such that it 1264 does not match what is negotiated in SDP, the bitrate used by the 1265 different simulcast streams may have to be reduced dynamically. When 1266 a simulcasting media source uses a single media transport for all of 1267 the simulcast streams, it is likely that a joint congestion control 1268 across all simulcast streams is used for that media source. Which 1269 simulcast streams to prioritize when allocating available bitrate 1270 among the simulcast streams in such adaptation SHOULD be taken from 1271 the simulcast stream order on the "a=simulcast" line and the ordering of 1272 alternative simulcast formats (Section 5.2). Simulcast streams that 1273 have pause/resume capability and that would be given such a low bitrate 1274 by the adaptation process that they are considered not really useful 1275 can be temporarily paused until the limiting condition clears. 1277 8. Limitation 1279 The chosen approach has a limitation that relates to the use of a 1280 single RTP session for all simulcast formats of a media source, which 1281 comes from sending all simulcast streams related to a media source 1282 under the same SDP media description.
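As a non-normative illustration of the bitrate adaptation described in Section 7.1, the following Python sketch distributes an available bitrate over simulcast streams in the priority order taken from the "a=simulcast" line, and pauses streams that support pause/resume when their share would be too low to be useful. This document does not define an allocation algorithm; the class SimulcastStream, the function allocate_bitrate, and the per-stream target_bps and min_useful_bps values are hypothetical names and application-chosen inputs introduced only for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SimulcastStream:
    """Hypothetical per-stream state kept by the sending application."""
    rid: str             # rid-id as listed on the "a=simulcast" line
    target_bps: int      # bitrate the encoder would use if unconstrained
    min_useful_bps: int  # assumed threshold below which the stream is not useful
    can_pause: bool      # True if pause/resume is available for this stream


def allocate_bitrate(streams: List[SimulcastStream],
                     available_bps: int) -> Dict[str, int]:
    """Greedy allocation in priority order; 0 means temporarily paused.

    `streams` is assumed to be sorted by priority, i.e. in the order the
    simulcast streams appear on the "a=simulcast" line.
    """
    allocation: Dict[str, int] = {}
    remaining = max(available_bps, 0)
    for stream in streams:
        share = min(stream.target_bps, remaining)
        if stream.can_pause and share < stream.min_useful_bps:
            # Too little bitrate to be useful: pause and leave the
            # budget for lower-priority streams.
            allocation[stream.rid] = 0
            continue
        # Streams that cannot pause keep whatever share is left.
        allocation[stream.rid] = share
        remaining -= share
    return allocation
```

With three pausable streams listed in the order low, mid, high (targets of 300, 800 and 2500 kbit/s), an available bitrate of 1 Mbit/s gives the first two streams 300 and 700 kbit/s and pauses the third until the limiting condition clears. A real implementation would take available_bps from its congestion controller and keep the aggregate within any applicable SDP "b=" limit.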
1284 It is not possible to use different simulcast streams on different 1285 media transports, limiting the possibilities to apply different QoS 1286 to different simulcast streams. When using unicast, QoS mechanisms 1287 based on individual packet marking are feasible, since they do not 1288 require separation of simulcast streams into different RTP sessions 1289 to apply different QoS. 1291 It is also not possible to separate different simulcast streams into 1292 different multicast groups to allow a multicast receiver to pick the 1293 stream it wants, rather than receive all of them. In this case, the 1294 only reasonable implementation is to use different RTP sessions for 1295 each multicast group so that reporting and other RTCP functions 1296 operate as intended. Such simulcast usage in multicast context is 1297 out of scope for the current document and would require additional 1298 specification. 1300 9. IANA Considerations 1302 This document requests to register a new media-level SDP attribute, 1303 "simulcast", in the "att-field (media level only)" registry within 1304 the SDP parameters registry, according to the procedures of [RFC4566] 1305 and [I-D.ietf-mmusic-sdp-mux-attributes]. 1307 Contact name, email: The IESG (iesg@ietf.org) 1309 Attribute name: simulcast 1311 Long-form attribute name: Simulcast stream description 1313 Charset dependent: No 1315 Attribute value: sc-value; see Section 5.1 of RFC XXXX. 1317 Purpose: Signals simulcast capability for a set of RTP streams 1319 MUX category: NORMAL 1320 Note to RFC Editor: Please replace "RFC XXXX" with the assigned 1321 number of this RFC. 1323 10. Security Considerations 1325 The simulcast capability, configuration attributes, and parameters 1326 are vulnerable to attacks in signaling. 1328 A false inclusion of the "a=simulcast" attribute may result in 1329 simultaneous transmission of multiple RTP streams that would 1330 otherwise not be generated. 
The impact is limited by the media 1331 description joint bandwidth, shared by all simulcast streams 1332 irrespective of their number. There may however be a large number of 1333 unwanted RTP streams that will impact the share of bandwidth 1334 allocated for the originally wanted RTP stream. 1336 A hostile removal of the "a=simulcast" attribute will result in 1337 simulcast not being used. 1339 Neither of the above is likely to have any major consequences, and both can 1340 be mitigated by signaling that is at least integrity protected and source 1341 authenticated, which prevents an attacker from changing it. 1343 Security considerations related to the use of "a=rid" and the 1344 RtpStreamId SDES item are covered in [I-D.ietf-mmusic-rid] and 1345 [I-D.ietf-avtext-rid]. There are no additional security concerns 1346 related to their use in this specification. 1348 11. Contributors 1350 Morgan Lindqvist and Fredrik Jansson, both from Ericsson, 1351 contributed important material to the first versions of this 1352 document. Robert Hansen and Cullen Jennings, from Cisco, Peter 1353 Thatcher, from Google, and Adam Roach, from Mozilla, contributed 1354 significantly to subsequent versions. 1356 12. Acknowledgements 1358 The authors would like to thank Bernard Aboba, Thomas Belling, Roni 1359 Even, Adam Roach, Inaki Baz Castillo, Paul Kyzivat, and Arun 1360 Arunachalam for the feedback they provided during the development of 1361 this document. 1363 13. References 1364 13.1. Normative References 1366 [I-D.ietf-avtext-rid] 1367 Roach, A., Nandakumar, S., and P. Thatcher, "RTP Stream 1368 Identifier Source Description (SDES)", draft-ietf-avtext- 1369 rid-09 (work in progress), October 2016. 1371 [I-D.ietf-mmusic-rid] 1372 Roach, A., "RTP Payload Format Restrictions", draft-ietf- 1373 mmusic-rid-15 (work in progress), May 2018. 1375 [I-D.ietf-mmusic-sdp-bundle-negotiation] 1376 Holmberg, C., Alvestrand, H., and C.
Jennings, 1377 "Negotiating Media Multiplexing Using the Session 1378 Description Protocol (SDP)", draft-ietf-mmusic-sdp-bundle- 1379 negotiation-54 (work in progress), December 2018. 1381 [I-D.ietf-mmusic-sdp-mux-attributes] 1382 Nandakumar, S., "A Framework for SDP Attributes when 1383 Multiplexing", draft-ietf-mmusic-sdp-mux-attributes-17 1384 (work in progress), February 2018. 1386 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1387 Requirement Levels", BCP 14, RFC 2119, 1388 DOI 10.17487/RFC2119, March 1997, 1389 . 1391 [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. 1392 Jacobson, "RTP: A Transport Protocol for Real-Time 1393 Applications", STD 64, RFC 3550, DOI 10.17487/RFC3550, 1394 July 2003, . 1396 [RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session 1397 Description Protocol", RFC 4566, DOI 10.17487/RFC4566, 1398 July 2006, . 1400 [RFC5234] Crocker, D., Ed. and P. Overell, "Augmented BNF for Syntax 1401 Specifications: ABNF", STD 68, RFC 5234, 1402 DOI 10.17487/RFC5234, January 2008, 1403 . 1405 [RFC7405] Kyzivat, P., "Case-Sensitive String Support in ABNF", 1406 RFC 7405, DOI 10.17487/RFC7405, December 2014, 1407 . 1409 [RFC7728] Burman, B., Akram, A., Even, R., and M. Westerlund, "RTP 1410 Stream Pause and Resume", RFC 7728, DOI 10.17487/RFC7728, 1411 February 2016, . 1413 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 1414 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 1415 May 2017, . 1417 13.2. Informative References 1419 [I-D.ietf-avtcore-multiplex-guidelines] 1420 Westerlund, M., Burman, B., Perkins, C., Alvestrand, H., 1421 and R. Even, "Guidelines for using the Multiplexing 1422 Features of RTP to Support Multiple Media Streams", draft- 1423 ietf-avtcore-multiplex-guidelines-08 (work in progress), 1424 December 2018. 1426 [I-D.ietf-payload-flexible-fec-scheme] 1427 Zanaty, M., Singh, V., Begen, A., and G. 
Mandyam, "RTP 1428 Payload Format for Flexible Forward Error Correction 1429 (FEC)", draft-ietf-payload-flexible-fec-scheme-17 (work in 1430 progress), February 2019. 1432 [RFC2198] Perkins, C., Kouvelas, I., Hodson, O., Hardman, V., 1433 Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse- 1434 Parisis, "RTP Payload for Redundant Audio Data", RFC 2198, 1435 DOI 10.17487/RFC2198, September 1997, 1436 . 1438 [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model 1439 with Session Description Protocol (SDP)", RFC 3264, 1440 DOI 10.17487/RFC3264, June 2002, 1441 . 1443 [RFC3389] Zopf, R., "Real-time Transport Protocol (RTP) Payload for 1444 Comfort Noise (CN)", RFC 3389, DOI 10.17487/RFC3389, 1445 September 2002, . 1447 [RFC4588] Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R. 1448 Hakenberg, "RTP Retransmission Payload Format", RFC 4588, 1449 DOI 10.17487/RFC4588, July 2006, 1450 . 1452 [RFC4733] Schulzrinne, H. and T. Taylor, "RTP Payload for DTMF 1453 Digits, Telephony Tones, and Telephony Signals", RFC 4733, 1454 DOI 10.17487/RFC4733, December 2006, 1455 . 1457 [RFC5104] Wenger, S., Chandra, U., Westerlund, M., and B. Burman, 1458 "Codec Control Messages in the RTP Audio-Visual Profile 1459 with Feedback (AVPF)", RFC 5104, DOI 10.17487/RFC5104, 1460 February 2008, . 1462 [RFC5109] Li, A., Ed., "RTP Payload Format for Generic Forward Error 1463 Correction", RFC 5109, DOI 10.17487/RFC5109, December 1464 2007, . 1466 [RFC5583] Schierl, T. and S. Wenger, "Signaling Media Decoding 1467 Dependency in the Session Description Protocol (SDP)", 1468 RFC 5583, DOI 10.17487/RFC5583, July 2009, 1469 . 1471 [RFC6184] Wang, Y., Even, R., Kristensen, T., and R. Jesup, "RTP 1472 Payload Format for H.264 Video", RFC 6184, 1473 DOI 10.17487/RFC6184, May 2011, 1474 . 1476 [RFC6190] Wenger, S., Wang, Y., Schierl, T., and A. Eleftheriadis, 1477 "RTP Payload Format for Scalable Video Coding", RFC 6190, 1478 DOI 10.17487/RFC6190, May 2011, 1479 . 
1481 [RFC6236] Johansson, I. and K. Jung, "Negotiation of Generic Image 1482 Attributes in the Session Description Protocol (SDP)", 1483 RFC 6236, DOI 10.17487/RFC6236, May 2011, 1484 . 1486 [RFC6464] Lennox, J., Ed., Ivov, E., and E. Marocco, "A Real-time 1487 Transport Protocol (RTP) Header Extension for Client-to- 1488 Mixer Audio Level Indication", RFC 6464, 1489 DOI 10.17487/RFC6464, December 2011, 1490 . 1492 [RFC7104] Begen, A., Cai, Y., and H. Ou, "Duplication Grouping 1493 Semantics in the Session Description Protocol", RFC 7104, 1494 DOI 10.17487/RFC7104, January 2014, 1495 . 1497 [RFC7656] Lennox, J., Gross, K., Nandakumar, S., Salgueiro, G., and 1498 B. Burman, Ed., "A Taxonomy of Semantics and Mechanisms 1499 for Real-Time Transport Protocol (RTP) Sources", RFC 7656, 1500 DOI 10.17487/RFC7656, November 2015, 1501 . 1503 [RFC7667] Westerlund, M. and S. Wenger, "RTP Topologies", RFC 7667, 1504 DOI 10.17487/RFC7667, November 2015, 1505 . 1507 [RFC7741] Westin, P., Lundin, H., Glover, M., Uberti, J., and F. 1508 Galligan, "RTP Payload Format for VP8 Video", RFC 7741, 1509 DOI 10.17487/RFC7741, March 2016, 1510 . 1512 [RFC8108] Lennox, J., Westerlund, M., Wu, Q., and C. Perkins, 1513 "Sending Multiple RTP Streams in a Single RTP Session", 1514 RFC 8108, DOI 10.17487/RFC8108, March 2017, 1515 . 1517 [RFC8285] Singer, D., Desineni, H., and R. Even, Ed., "A General 1518 Mechanism for RTP Header Extensions", RFC 8285, 1519 DOI 10.17487/RFC8285, October 2017, 1520 . 1522 Appendix A. Requirements 1524 The following requirements are met by the defined solution to support 1525 the use cases (Section 3): 1527 REQ-1: Identification: 1529 REQ-1.1: It must be possible to identify a set of simulcasted RTP 1530 streams as originating from the same media source in SDP 1531 signaling. 1533 REQ-1.2: An RTP endpoint must be capable of identifying the 1534 simulcast stream a received RTP stream is associated with, 1535 knowing the content of the SDP signalling. 
1537 REQ-2: Transport usage. The solution must work when using: 1539 REQ-2.1: Legacy SDP with separate media transports per SDP media 1540 description. 1542 REQ-2.2: Bundled [I-D.ietf-mmusic-sdp-bundle-negotiation] SDP 1543 media descriptions. 1545 REQ-3: Capability negotiation. It must be possible that: 1547 REQ-3.1: Sender can express capability of sending simulcast. 1549 REQ-3.2: Receiver can express capability of receiving simulcast. 1551 REQ-3.3: Sender can express maximum number of simulcast streams 1552 that can be provided. 1554 REQ-3.4: Receiver can express maximum number of simulcast streams 1555 that can be received. 1557 REQ-3.5: Sender can detail the characteristics of the simulcast 1558 streams that can be provided. 1560 REQ-3.6: Receiver can detail the characteristics of the simulcast 1561 streams that it prefers to receive. 1563 REQ-4: Distinguishing features. It must be possible to have 1564 different simulcast streams use different codec parameters, as can 1565 be expressed by SDP format values and RTP payload types. 1567 REQ-5: Compatibility. It must be possible to use simulcast in 1568 combination with other RTP mechanisms that generate additional RTP 1569 streams: 1571 REQ-5.1: RTP Retransmission [RFC4588]. 1573 REQ-5.2: RTP Forward Error Correction [RFC5109]. 1575 REQ-5.3: Related payload types such as audio Comfort Noise and/or 1576 DTMF. 1578 REQ-5.4: A single simulcast stream can consist of multiple RTP 1579 streams, to support codecs where a dependent stream is 1580 dependent on a set of encoded and dependent streams, each 1581 potentially carried in their own RTP stream. 1583 REQ-6: Interoperability. The solution must be possible to use in: 1585 REQ-6.1: Interworking with non-simulcast legacy clients using a 1586 single media source per media type. 1588 REQ-6.2: WebRTC environment with a single media source per SDP 1589 media description. 1591 Appendix B. 
Changes From Earlier Versions 1593 NOTE TO RFC EDITOR: Please remove this section prior to publication. 1595 B.1. Modifications Between WG Version -13 and -14 1597 o c= and t= line order corrected in SDP examples 1599 B.2. Modifications Between WG Version -12 and -13 1601 o Examples corrected to follow RID ABNF 1603 o Example Figure 7 now comments on priority for second media source. 1605 o Clarified a SHOULD limitation. 1607 o Added urn:ietf:params:rtp-hdrext:sdes:repaired-rtp-stream-id in 1608 examples with RTX. 1610 o ABNF now uses RFC 7405 to indicate case sensitivity 1612 o Various minor editorials and nits. 1614 B.3. Modifications Between WG Version -11 and -12 1616 o Modified Normative statement regarding RTP stream duplication in 1617 Section 5.2. 1619 o Clarified assumption about use of congestion control by 1620 applications. 1622 o Changed to use RFC 8174 boilerplate instead of RFC 2119. 1624 o Clarified explanation of syntax for simulcast attribute in 1625 Section 4. 1627 o Editorial clarification in Section 5.2 and 5.3.2. 1629 o Various minor editorials and nits. 1631 B.4. Modifications Between WG Version -10 and -11 1633 o Added new SDP example section on Simulcast and Redundancy, 1634 including both RED (RFC2198), RTP RTX (RFC4588), and FEC (draft- 1635 ietf-payload-flexible-fec-scheme). 1637 o Removed restriction that "related" payload formats in an RTP 1638 stream (such as CN and DTMF) must not have their own rid-id, since 1639 there is no reason to forbid this and corresponding clarification 1640 is made in draft-ietf-mmusic-rid. 1642 o Removed any mention of source-specific signaling and the reference 1643 to RFC5576, since draft-ietf-mmusic-rid is not defined for source- 1644 specific signaling. 1646 o Changed some SDP examples to use a=rid restrictions instead of 1647 a=imageattr. 1649 o Changed reference from the obsoleted RFC 5285 to RFC 8285. 1651 B.5. 
Modifications Between WG Version -09 and -10 1653 o Amended overview section with a bit more explanation on the 1654 examples, and added an rid-id alternative for one of the streams. 1656 o Removed SCID also from the Terminology section, which was 1657 forgotten in -09 when changing SCID to rid-id. 1659 B.6. Modifications Between WG Version -08 and -09 1661 o Changed SCID to rid-id, to align with draft-ietf-mmusic-rid 1662 naming. 1664 o Changed Overview to be based on examples and shortened it. 1666 o Changed semantics of initially paused rid-id in modified SDP 1667 offers from requiring it to follow actual RFC 7728 pause state to 1668 an informational offerer's opinion at the time of offer creation, 1669 not in any way overriding or amending RFC 7728 signaling. 1671 o Replaced text on ignoring all but the first of multiple 1672 "a=simulcast" lines in a media description with mandating that at 1673 most one "a=simulcast" line is included. 1675 o Clarified with a note that, for the case where it is clear from the SDP 1676 that RTP PT uniquely maps to RtpStreamId, an RTP receiver can use 1677 RTP PT to relate simulcast streams. 1679 o Moved Section 4 Requirements to become Appendix A. 1681 o Editorial corrections and clarifications. 1683 B.7. Modifications Between WG Version -07 and -08 1685 o Correcting syntax of SDP examples in section 6.6.1, as found by 1686 Inaki Baz Castillo. 1688 o Changing ABNF to only define the sc-value, not the SDP attribute 1689 itself, as suggested by Paul Kyzivat. 1691 o Changing I-D reference to newly published RFC 8108. 1693 o Adding list of modifications between -06 and -07. 1695 B.8. Modifications Between WG Version -06 and -07 1697 o A scope clarification, as a result of the discussion with Roni Even. 1699 o A reformulation of the identification requirements for simulcast 1700 streams. 1702 o Correcting the statement related to source specific signalling 1703 (RFC 5576) to address Roni Even's comment.
1705 o Update of the last paragraph in Section 6.2 regarding simulcast 1706 stream differences as well as forbidding multiple instances of the 1707 same SCID within a single a=simulcast line. 1709 o Removal of note in Section 6.4 as a result of an issue raised by Roni 1710 Even. 1712 o Use of "m=" has been changed to media description and a few other 1713 editorial improvements and clarifications. 1715 B.9. Modifications Between WG Version -05 and -06 1717 o Added section on RTP Aspects 1719 o Added a requirement (5-4) that capability exchange must be 1720 capable of handling multi-RTP-stream cases. 1722 o Added extmap attribute also on first signalling example as it is a 1723 recommended mechanism. 1725 o Clarified the definition of the simulcast attribute and how 1726 simulcast streams relate to simulcast formats and SCIDs. 1728 o Updated References list and moved around some references between 1729 informative and normative categories. 1731 o Editorial improvements and corrections. 1733 B.10. Modifications Between WG Version -04 and -05 1735 o Aligned with recent changes in draft-ietf-mmusic-rid and draft- 1736 ietf-avtext-rid. 1738 o Modified the SDP offer/answer section to follow the generally 1739 accepted structure, also adding a brief text on modifying the 1740 session that is aligned with draft-ietf-mmusic-rid. 1742 o Improved text around simulcast stream identification (as opposed 1743 to the simulcast stream itself) to consistently use the acronym 1744 SCID and defined that in the Terminology section. 1746 o Changed references for RTP-level pause/resume and VP8 payload 1747 format that are now published as RFCs. 1749 o Improved IANA registration text. 1751 o Removed unused reference to draft-ietf-payload-flexible-fec- 1752 scheme. 1754 o Editorial improvements and corrections. 1756 B.11. Modifications Between WG Version -03 and -04 1758 o Changed to only use RID identification, as was consensus during 1759 IETF 94. 1761 o ABNF improvements.
1763 o Clarified offer-answer rules for initially paused streams. 1765 o Changed references for RTP topologies and RTP taxonomy documents 1766 that are now published as RFC. 1768 o Added reference to the new RID draft in AVTEXT. 1770 o Re-structured section 6 to provide an easy reference by the 1771 updated IANA section. 1773 o Added a sub-section 7.1 with a discussion of bitrate adaptation. 1775 o Editorial improvements. 1777 B.12. Modifications Between WG Version -02 and -03 1779 o Removed text on multicast / broadcast from use cases, since it is 1780 not supported by the solution. 1782 o Removed explicit references to unified plan draft. 1784 o Added possibility to initiate simulcast streams in paused mode. 1786 o Enabled an offerer to offer multiple stream identification (pt or 1787 rid) methods and have the answerer choose which to use. 1789 o Added a preference indication also in send direction offers. 1791 o Added a section on limitations of the current proposal, including 1792 identification method specific limitations. 1794 B.13. Modifications Between WG Version -01 and -02 1796 o Relying on the new RID solution for codec constraints and 1797 configuration identification. This has resulted in changes in 1798 syntax to identify if pt or RID is used to describe the simulcast 1799 stream. 1801 o Renamed simulcast version and simulcast version alternative to 1802 simulcast stream and simulcast format respectively, and improved 1803 definitions for them. 1805 o Clarification that it is possible to switch between simulcast 1806 version alternatives, but that only a single one be used at any 1807 point in time. 1809 o Changed the definition so that ordering of simulcast formats for a 1810 specific simulcast stream do have a preference order. 1812 B.14. Modifications Between WG Version -00 and -01 1814 o No changes. Only preventing expiry. 1816 B.15. Modifications Between Individual Version -00 and WG Version -00 1818 o Added this appendix. 
1820 Authors' Addresses 1822 Bo Burman 1823 Ericsson 1824 Gronlandsgatan 31 1825 SE-164 60 Stockholm 1826 Sweden 1828 Email: bo.burman@ericsson.com 1830 Magnus Westerlund 1831 Ericsson 1832 Torshamnsgatan 23 1833 SE-164 83 Stockholm 1834 Sweden 1836 Phone: +46 10 714 82 87 1837 Email: magnus.westerlund@ericsson.com 1838 Suhas Nandakumar 1839 Cisco 1840 170 West Tasman Drive 1841 San Jose, CA 95134 1842 USA 1844 Email: snandaku@cisco.com 1846 Mo Zanaty 1847 Cisco 1848 170 West Tasman Drive 1849 San Jose, CA 95134 1850 USA 1852 Email: mzanaty@cisco.com