Network Working Group                                      M. Westerlund
Internet-Draft                                                 B. Burman
Intended status: Standards Track                                Ericsson
Expires: April 30, 2015                                    S. Nandakumar
                                                               M. Zanaty
                                                                   Cisco
                                                        October 27, 2014


                Using Simulcast in SDP and RTP Sessions
                  draft-burman-mmusic-sdp-simulcast-00

Abstract

   In some application scenarios it may be desirable to send multiple
   differently encoded versions of the same media source in independent
   RTP streams.  This is called simulcast.
   This document discusses the best way of accomplishing simulcast in
   RTP and how to signal it in SDP.  A solution is defined by making an
   extension to SDP, and using RTP/RTCP identification methods to relate
   RTP streams belonging to the same media source.  The SDP extension
   consists of a new media-level SDP attribute that expresses the
   capability to send and/or receive simulcast RTP streams.  One part of
   the RTP/RTCP identification method is included as a reference to a
   separate document, since it is also useful for other purposes.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 30, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Definitions
     2.1.  Terminology
     2.2.  Requirements Language
   3.  Use Cases
     3.1.  Reaching a Diverse Set of Receivers
     3.2.  Application Specific Media Source Handling
     3.3.  Receiver Adaptation in Multicast/Broadcast
     3.4.  Receiver Media Source Preferences
   4.  Requirements
   5.  Proposed Solution Overview
   6.  Proposed Solution
     6.1.  Simulcast Capability
       6.1.1.  Declarative Use
       6.1.2.  Offer/Answer Use
     6.2.  Relating Simulcast Versions
     6.3.  Signaling Examples
       6.3.1.  Unified Plan Client
       6.3.2.  Multi-Source Client
   7.  Network Aspects
   8.  IANA Considerations
   9.  Security Considerations
   10. Contributors
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1.  Introduction

   Most of today's multiparty video conference solutions make use of
   centralized servers to reduce the bandwidth and CPU consumption in
   the endpoints.  Those servers receive RTP streams from each
   participant and send some suitable set of possibly modified RTP
   streams to the rest of the participants, which usually have
   heterogeneous capabilities (screen size, CPU, bandwidth, codec, etc.).
   One of the biggest issues is how to perform RTP stream adaptation to
   different participants' constraints with the minimum possible impact
   on both video quality and server performance.

   Simulcast is defined in this memo as the act of simultaneously
   sending multiple different encoded streams of the same media source,
   e.g., the same video source encoded with different video encoder
   types or image resolutions.  This can be done in several ways and for
   different purposes.  This document focuses on the case where it is
   desirable to provide a media source as multiple encoded streams over
   RTP [RFC3550] towards an intermediary so that the intermediary can
   provide the wanted functionality by selecting which RTP stream to
   forward to other participants in the session, and more specifically
   how the identification and grouping of the involved RTP streams are
   done.  From an RTP perspective, simulcast is a specific application
   of the aspects discussed in RTP Multiplexing Guidelines
   [I-D.ietf-avtcore-multiplex-guidelines].

   The purpose of this document is to describe a few scenarios where the
   use of simulcast is motivated, and to propose a suitable solution for
   SDP signaling and for performing RTP simulcast.

2.  Definitions

2.1.  Terminology

   This document makes use of the terminology defined in RTP Taxonomy
   [I-D.ietf-avtext-rtp-grouping-taxonomy], RTP Topology [RFC5117] and
   RTP Topologies Update [I-D.ietf-avtcore-rtp-topologies-update].
   In addition, the following terms are used:

   RTP Mixer:  An RTP middle node, defined in [RFC5117] (Section 3.4:
      Topo-Mixer), further elaborated and extended with other topologies
      in [I-D.ietf-avtcore-rtp-topologies-update] (Sections 3.6 to 3.9).

   RTP Switch:  A common short term for the terms "switching RTP mixer",
      "source projecting middlebox", and "video switching MCU" as
      discussed in [I-D.ietf-avtcore-rtp-topologies-update].

   Simulcast version:  One encoded stream from the set of encoded
      streams that constitutes the simulcast for a single media source.

   Simulcast version alternative:  One encoded stream being encoded in
      one of possibly multiple alternative ways to create a simulcast
      version.

2.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  Use Cases

   Many use cases of simulcast as described in this document relate to a
   multi-party communication session where one or more central nodes are
   used to adapt the view of the communication session towards
   individual participants, and facilitate the media transport between
   participants.  Thus, these cases target the RTP Mixer type of
   topology.

   There are two principal approaches for an RTP Mixer to provide this
   adapted view of the communication session to each receiving
   participant:

   o  Transcoding (decoding and re-encoding) received RTP streams with
      characteristics adapted to each receiving participant.  This often
      includes mixing or composition of media sources from multiple
      participants into a mixed media source originated by the RTP
      Mixer.  The main advantage of this approach is that it achieves
      close to optimal adaptation to individual receiving participants.
      The main disadvantages are that it can be very computationally
      expensive for the RTP Mixer and that it typically also degrades
      aspects of media Quality of Experience (QoE), such as end-to-end
      delay, for the receiving participants.

   o  Switching a subset of all received RTP streams or sub-streams to
      each receiving participant, where the used subset is typically
      specific to each receiving participant.  The main advantages of
      this approach are that it is computationally cheap for the RTP
      Mixer and it has very limited impact on media QoE.  The main
      disadvantage is that it can be difficult to combine a subset of
      received RTP streams into a perfect fit to the resource situation
      of a receiving participant.

   The use of simulcast relates to the latter approach, where it is more
   important to reduce the load on the RTP Mixer and/or minimize QoE
   impact than to achieve an optimal adaptation of resource usage.

   A multicast/broadcast case where the receivers themselves select the
   most appropriate simulcast version and tune in to the right media
   transport to receive that version is also considered (Section 3.3).
   This enables large receiver populations that are heterogeneous in
   capabilities and in the use of network path bandwidth resources.

3.1.  Reaching a Diverse Set of Receivers

   The media sources provided by a sending participant potentially need
   to reach several receiving participants that differ in terms of
   available resources.  The receiver resources that typically differ
   include, but are not limited to:

   Codec:  This includes codec type (such as SDP MIME type) and can
      include codec configuration options (e.g., SDP fmtp parameters).
      A couple of codec resources that differ only in codec
      configuration will be "different" if they are somehow not
      "compatible", for example if they differ in video codec profile
      or in the transport packetization configuration.

   Sampling:  This relates to how the media source is sampled, in the
      spatial as well as in the temporal domain.  For video streams,
      spatial sampling affects image resolution and temporal sampling
      affects video frame rate.  For audio, spatial sampling relates to
      the number of audio channels and temporal sampling affects audio
      bandwidth.  This may be used to suit different rendering
      capabilities or needs at the receiving endpoints, as well as a
      method to achieve different transport capabilities, bitrates and
      eventually QoE by controlling the amount of source data.

   Bitrate:  This relates to the number of bits spent per second to
      transmit the media source as an RTP stream, which typically also
      affects the Quality of Experience (QoE) for the receiving user.

   Letting the sending participant create a simulcast of a few
   differently configured RTP streams per media source can be a good
   tradeoff when using an RTP switch as middlebox, instead of sending a
   single RTP stream and using an RTP mixer to create individual
   transcodings to each receiving participant.

   This requires that the receiving participants can be categorized in
   terms of available resources and that the sending participant can
   choose a matching configuration for a single RTP stream per category
   and media source.

   For example, assume for simplicity a set of receiving participants
   that differ only in that some have support to receive Codec A, and
   the others have support to receive Codec B.  Further assume that the
   sending participant can send both Codec A and B.  It can then reach
   all receivers by creating two simulcasted RTP streams from each media
   source; one for Codec A and one for Codec B.

   In another simple example, a set of receiving participants differ
   only in screen resolution; some are able to display video with at
   most 360p resolution and some support 720p resolution.
   A sending participant can then reach all receivers by creating a
   simulcast of RTP streams with 360p and 720p resolution for each sent
   video media source.

   In more elaborate cases, the receiving participants differ both in
   available sampling and bitrate, and maybe also codec, and it is up to
   the RTP switch to find a good trade-off in which simulcasted stream
   to choose for each intended receiver.  It is also the responsibility
   of the RTP switch to negotiate a good fit of simulcast streams with
   the sending participant.

   The maximum number of simulcasted RTP streams that can be sent is
   mainly limited by the amount of processing and uplink network
   resources available to the sending participant.

3.2.  Application Specific Media Source Handling

   The application logic that controls the communication session may
   include special handling of some media sources.  It is for example
   commonly the case that the media from a sending participant is not
   sent back to itself.

   It is also common that a currently active speaker participant is
   shown in larger size or higher quality than other participants (the
   sampling or bitrate aspects of Section 3.1).  Not sending the active
   speaker media back to itself means there is some other participant's
   media that instead has to receive special handling towards the active
   speaker; typically the previous active speaker.  This way, the
   previously active speaker is needed both in larger size (to the
   current active speaker) and in small size (to the rest of the
   participants), which can be solved with a simulcast from the
   previously active speaker to the RTP switch.

3.3.  Receiver Adaptation in Multicast/Broadcast

   When using broadcast or multicast technology to distribute real-time
   media streams to large populations of receivers, there can still be
   significant heterogeneity among the receiver population.
   This can depend on several factors:

   Network Bandwidth:  The network paths to individual receivers will
      have variations in the bandwidth, thus putting different limits on
      the supported bit-rates that can be received.

   Endpoint Capabilities:  The end point's hardware and software can
      have varying capabilities in relation to screen resolution,
      decoding capabilities, and supported media codecs.

   To handle these variations, a transmitter of real-time media may want
   to apply simulcast to a media source and provide it as a set of
   different encoded streams, enabling the receivers to select the best
   fit from this set themselves.  The end point capabilities will
   usually result in a single initial choice.  However, the network
   bandwidth can vary over time, which requires a client to continuously
   monitor its reception to determine if the received RTP streams still
   fit within the available bandwidth.  If not, another set of encoded
   streams from the ones offered in the simulcast will have to be
   chosen.

   When using IP multicast, the level of granularity that the receiver
   can select from is decided by its ability to choose different
   multicast addresses.  Thus, different simulcast versions need to be
   put on different media transports using different multicast
   addresses.  If these simulcast versions are described using SDP, they
   need to be part of different SDP media descriptions, as SDP binds to
   transport on media description level.

3.4.  Receiver Media Source Preferences

   The application logic that controls the communication session may
   allow receiving participants to apply preferences to the
   characteristics of the RTP stream they receive, for example in terms
   of the aspects listed in Section 3.1.  Sending a simulcast of RTP
   streams is one way of accommodating receivers with conflicting or
   otherwise incompatible preferences.

4.  Requirements

   The following requirements need to be met to support the use cases in
   previous sections:

   REQ-1:  Identification.  It must be possible to identify a set of
      simulcasted RTP streams as originating from the same media source:

      REQ-1.1:  In SDP signaling.

      REQ-1.2:  On RTP/RTCP level.

   REQ-2:  Transport usage.  The solution must work when using:

      REQ-2.1:  Legacy SDP with separate media transports per SDP media
         description.

      REQ-2.2:  Bundled SDP media descriptions.

   REQ-3:  Capability negotiation.  It must be possible that:

      REQ-3.1:  Sender can express capability of sending simulcast.

      REQ-3.2:  Receiver can express capability of receiving simulcast.

      REQ-3.3:  Sender can express maximum number of simulcast versions
         that can be provided.

      REQ-3.4:  Receiver can express maximum number of simulcast
         versions that can be received.

      REQ-3.5:  Sender can detail the characteristics of the simulcast
         versions that can be provided.

      REQ-3.6:  Receiver can detail the characteristics of the simulcast
         versions that it prefers to receive.

   REQ-4:  Distinguishing features.  It must be possible to have
      different simulcast versions use different codec parameters, as
      can be expressed by SDP format values and RTP payload types.

   REQ-5:  Compatibility.  It must be possible to use simulcast in
      combination with other RTP mechanisms that generate additional RTP
      streams:

      REQ-5.1:  RTP Retransmission [RFC4588].

      REQ-5.2:  RTP Forward Error Correction [RFC5109].

      REQ-5.3:  Related payload types such as audio Comfort Noise and/or
         DTMF.

   REQ-6:  Interoperability.  The solution must be possible to use in:

      REQ-6.1:  Interworking with non-simulcast legacy clients using a
         single media source per media type.

      REQ-6.2:  WebRTC "Unified Plan" environment with a single media
         source per SDP media description.

5.  Proposed Solution Overview

   The proposed solution consists of signaling simulcast capability and
   configurations in SDP [RFC4566]:

   o  An offer or answer can contain a number of simulcast versions,
      separate for send and receive directions.

   o  An offer or answer can contain multiple, alternative simulcast
      versions in the same fashion as multiple, alternative codecs can
      be offered in a media description.

   o  Currently, a single media source per SDP media description is
      assumed, which makes the solution work in a Unified Plan
      [I-D.roach-mmusic-unified-plan] context (although different from
      what is currently defined there), both with and without BUNDLE
      grouping.

   o  The codec configuration for each simulcast version is expressed in
      terms of existing SDP formats (and typically RTP payload types).
      Some codecs may rely on codec configuration based on general
      attributes that apply for all formats within a media description,
      and which could thus not be used to separate different simulcast
      versions.  This memo makes no attempt to address such
      shortcomings, but if needed instead encourages that a separate,
      general mechanism is defined for that purpose.

   o  It is possible, but not required, to use source-specific signaling
      [RFC5576] with the proposed solution.

6.  Proposed Solution

   This section further details the signaling solution outlined above
   (Section 5).

6.1.  Simulcast Capability

   Simulcast capability is expressed as a new media level SDP attribute,
   "a=simulcast".  For each desired direction (send/recv/sendrecv), the
   simulcast attribute defines a list of simulcast versions (separated
   by semicolons), each of which is a list of alternative RTP payload
   types (separated by commas) for that simulcast version.  The meaning
   of the attribute on SDP session level is undefined, and it MUST NOT
   be used at that level.
   There MUST be at most one "a=simulcast" attribute per media
   description.  The ABNF [RFC5234] for this attribute is:

   simulcast-attribute = "a=simulcast" 1*3( WSP sc-dir-list )
   sc-dir-list         = sc-dir WSP sc-fmt-list *( ";" sc-fmt-list )
   sc-dir              = "send" / "recv" / "sendrecv"
   sc-fmt-list         = sc-fmt *( "," sc-fmt )
   sc-fmt              = fmt
   ; WSP defined in [RFC5234]
   ; fmt defined in [RFC4566]

                       Figure 1: ABNF for Simulcast

   There are separate and independent sets of parameters for simulcast
   in send and receive directions.  When listing multiple directions,
   each direction MUST NOT occur more than once.

   Attribute parameters are grouped by direction and consist of a
   listing of SDP format tokens (usually corresponding to RTP payload
   types), which describe the simulcast versions to be used.  The number
   of (non-alternative, see below) formats in the list sets a limit to
   the number of supported simulcast versions in that direction.  The
   order of the listed simulcast versions in the "send" direction is not
   significant.  The order of the listed simulcast versions in the
   "recv" direction expresses which simulcast versions are preferred,
   with the leftmost being most preferred, if the number of actually
   sent simulcast versions has to be reduced for some reason.

   Formats that have explicit dependencies [RFC5583] to other formats
   (even in the same media description) MAY be listed as different
   simulcast versions.

   Alternative simulcast versions MAY be specified as part of the
   attribute parameters by expressing each simulcast version format as a
   comma-separated list of alternative values.  In this case, all
   combinations of those alternatives MUST be supported.  The order of
   the alternatives within a simulcast version is not significant; codec
   preference is expressed by format type ordering on the m-line, using
   regular SDP rules.
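
   As a rough illustration of the grammar in Figure 1, the following
   Python sketch splits an "a=simulcast" line into directions,
   simulcast versions, and alternatives.  The function name
   parse_simulcast and the returned dictionary layout are assumptions
   made for this sketch and are not defined by this memo.  The sketch
   is also more lenient than the ABNF in that it accepts runs of
   whitespace where the grammar allows a single WSP.

```python
DIRECTIONS = {"send", "recv", "sendrecv"}


def parse_simulcast(line):
    """Parse an "a=simulcast" line (hypothetical helper, not from the memo).

    Returns {direction: [[fmt, ...], ...]}: each direction maps to a list
    of simulcast versions, and each version is a list of alternative SDP
    format tokens.
    """
    prefix = "a=simulcast"
    if not line.startswith(prefix + " "):
        raise ValueError("not an a=simulcast attribute")
    tokens = line[len(prefix):].split()
    # The ABNF allows one to three (direction, format-list) pairs.
    if len(tokens) % 2 != 0 or not 2 <= len(tokens) <= 6:
        raise ValueError("expected 1 to 3 direction/format-list pairs")
    result = {}
    for direction, fmt_lists in zip(tokens[0::2], tokens[1::2]):
        if direction not in DIRECTIONS:
            raise ValueError(f"unknown direction: {direction}")
        if direction in result:
            # Each direction MUST NOT occur more than once.
            raise ValueError(f"direction listed twice: {direction}")
        # ";" separates simulcast versions, "," separates alternatives.
        versions = [version.split(",") for version in fmt_lists.split(";")]
        if any(fmt == "" for version in versions for fmt in version):
            raise ValueError("empty format token")
        result[direction] = versions
    return result


print(parse_simulcast("a=simulcast send 97;98 recv 97"))
# {'send': [['97'], ['98']], 'recv': [['97']]}
```

   The number of entries in a direction's list is then the limit on
   simulcast versions in that direction, and for "recv" the list order
   is the preference order.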

   A simulcast version can use a codec defined such that the same RTP
   SSRC can change RTP payload type multiple times during a session,
   possibly even on a per-packet basis.  A typical example can be a
   speech codec that makes use of Comfort Noise [RFC3389] and/or DTMF
   [RFC4733] formats.  In those cases, such "related" formats MUST NOT
   be listed explicitly in the attribute parameters, since they are not
   strictly simulcast versions of the media source, but rather a
   specific way of generating the RTP stream of a single simulcast
   version with varying RTP payload type.  Instead, only a single codec
   format MUST be used per simulcast version or simulcast version
   alternative (if there are such).  The codec format SHOULD be the
   codec most relevant to the media description, if possible to
   identify, for example the audio codec rather than the DTMF.  What
   codec format to choose in the case of switching between multiple
   equally "important" formats is left open, but it is assumed that in
   the presence of such strong relation it does not matter which is
   chosen.

   Use of the redundant audio data [RFC2198] format could be seen as a
   form of simulcast for loss protection purposes, but is not considered
   conflicting with the mechanisms described in this memo and MAY
   therefore be used as any other format.  In this case the "red"
   format, rather than the carried formats, SHOULD be the one to list as
   a simulcast version on the "a=simulcast" line.

      Editor's note: Consider adding the possibility to put an RTP
      stream in "paused" state [I-D.ietf-avtext-rtp-stream-pause] from
      the beginning of the session, possibly starting it at a later
      point in time by applying RTP/RTCP level procedures from that
      specification.

6.1.1.  Declarative Use

   When used as a declarative media description, a=simulcast "recv"
   direction formats indicate the configured end point's required
   capability to recognize and receive a specified set of RTP streams as
   simulcast streams.  In the same fashion, a=simulcast "send" direction
   requests the end point to send a specified set of RTP streams as
   simulcast streams.  The "sendrecv" direction combines "send" and
   "recv" requirements, using the same format values for both.

   If simulcast version alternatives are listed, it means that the
   configured end point MUST be prepared to receive any of the "recv"
   formats, and MAY send any of the "send" formats for that simulcast
   version.

6.1.2.  Offer/Answer Use

   An offerer wanting to use simulcast SHALL include the "a=simulcast"
   attribute in the offer.  An offerer that receives an answer without
   "a=simulcast" MUST NOT use simulcast towards the answerer.  An
   offerer that receives an answer with "a=simulcast" not listing a
   direction or without any formats in a specified direction MUST NOT
   use simulcast in that direction.

   An answerer that does not understand the concept of simulcast will
   also not know the attribute and will remove it in the SDP answer, as
   defined in existing SDP Offer/Answer [RFC3264] procedures.  An
   answerer that does understand the attribute and that wants to support
   simulcast in an indicated direction SHALL reverse directionality of
   the unidirectional direction parameters; "send" becomes "recv" and
   vice versa, and include it in the answer.  If the offered direction
   is "sendrecv", the answerer MAY keep it, but MAY also change it to
   "send" or "recv" to indicate that it is only interested in simulcast
   for a single direction.
   Note that, like all other use of SDP format tags for the send
   direction in Offer/Answer, format tags related to the simulcast send
   direction in an offer ("send" or "sendrecv") are placeholders that
   refer to information in the offer SDP, and the actual formats that
   will be used on the wire (including RTP Payload Format numbers)
   depend on information included in the SDP answer.

   An offerer listing a set of receive simulcast versions and/or
   alternatives in the offer MUST be prepared to receive RTP streams for
   any of those simulcast versions and/or alternatives from the
   answerer.

   An answerer that receives an offer containing an "a=simulcast"
   attribute listing alternative formats for simulcast versions MAY keep
   all the alternatives in the answer, but it MAY also choose to remove
   any non-desirable alternatives per simulcast version in the answer.
   The answerer MUST NOT add any alternatives that were not present in
   the offer.

   An answerer that receives an offer with simulcast that lists a number
   of simulcast versions MAY reduce the number of simulcast versions in
   the answer, but MUST NOT add simulcast versions.

   An offerer that receives an answer where some simulcast version
   alternatives are kept MUST be prepared to receive any of the kept
   send direction alternatives, and MAY send any of the kept receive
   direction alternatives from the answer.  This is similar to the case
   when the answer includes multiple formats on the m-line.

   An offerer that receives an answer where some of the simulcast
   versions are removed MAY release the corresponding resources (codec,
   transport, etc.) in its receive direction and MUST NOT send any RTP
   streams corresponding to the removed simulcast versions.

   The media formats and corresponding characteristics of encoded
   streams used in a simulcast SHOULD be chosen such that they are
   different.
   If this difference is not required, RTP duplication [RFC7104]
   procedures SHOULD be considered instead of simulcast.

      Note: The inclusion of "a=simulcast" or the use of simulcast does
      not change any of the interpretation or Offer/Answer procedures
      for other SDP attributes, like "a=fmtp".

6.2.  Relating Simulcast Versions

   As long as there is only a single media source per SDP media
   description, simulcast RTP streams can be related on RTP level
   through the RTP payload type, as specified in the SDP "a=simulcast"
   attribute (Section 6.1) parameters.  When using BUNDLE
   [I-D.ietf-mmusic-sdp-bundle-negotiation] to multiplex multiple SDP
   media descriptions over a single RTP session, there is an
   identification mechanism that allows relating RTP streams back to
   individual media descriptions, after which the above RTP payload type
   relation can be used.

6.3.  Signaling Examples

   These examples cover a client connecting to a video conference
   service that uses a centralized media topology with an RTP mixer.

   +---+      +-----------+      +---+
   | A |<---->|           |<---->| B |
   +---+      |           |      +---+
              |   Mixer   |
   +---+      |           |      +---+
   | F |<---->|           |<---->| J |
   +---+      +-----------+      +---+

               Figure 2: Four-party Mixer-based Conference

6.3.1.  Unified Plan Client

   Alice is calling in to the mixer with a simulcast-enabled Unified
   Plan client capable of a single media source per media type.  The
   client can send a simulcast of two video resolutions and frame rates:
   HD 1280x720p 30fps and thumbnail 320x180p 15fps.
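
   Before turning to the concrete SDP, the answerer behavior from
   Section 6.1.2 that this example exercises (reversing "send" and
   "recv", optionally pruning versions and alternatives, and never
   adding any) can be sketched in Python as follows.  The names
   build_answer and format_simulcast, the "acceptable" set used as the
   answerer's policy, and the assumption that offer and answer use the
   same format tokens are all illustrative choices of this sketch, not
   requirements of this memo.

```python
# Direction mapping applied by an answerer; keeping "sendrecv" is one of
# the permitted choices (it MAY also be narrowed to "send" or "recv").
REVERSED = {"send": "recv", "recv": "send", "sendrecv": "sendrecv"}


def build_answer(offered, acceptable):
    """Derive answer directions from offered ones (hypothetical helper).

    offered:    {direction: [[fmt, ...], ...]} as listed in the offer.
    acceptable: set of format tokens the answerer is willing to keep
                (an assumed policy input).
    Alternatives and versions are only ever removed, never added.
    """
    answer = {}
    for direction, versions in offered.items():
        kept_versions = []
        for alternatives in versions:
            kept = [fmt for fmt in alternatives if fmt in acceptable]
            if kept:  # drop a version when no alternative survives
                kept_versions.append(kept)
        if kept_versions:
            answer[REVERSED[direction]] = kept_versions
    return answer


def format_simulcast(directions):
    """Serialize directions to an attribute line, or None if empty."""
    if not directions:
        return None  # the answer then carries no a=simulcast attribute
    parts = []
    for direction, versions in directions.items():
        fmt_lists = ";".join(",".join(alts) for alts in versions)
        parts.append(f"{direction} {fmt_lists}")
    return "a=simulcast " + " ".join(parts)


offer = {"send": [["97"], ["98"]], "recv": [["97"]]}
print(format_simulcast(build_answer(offer, acceptable={"97", "98"})))
# a=simulcast recv 97;98 send 97
```

   An answerer that does not want simulcast in a given direction would
   simply leave that direction out of the result.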
Alice's Offer:

603     v=0
604     o=alice 2362969037 2362969040 IN IP4 192.0.2.156
605     s=Simulcast Enabled Unified Plan Client
606     t=0 0
607     c=IN IP4 192.0.2.156
608     m=audio 49200 RTP/AVP 0
609     a=rtpmap:0 PCMU/8000
610     m=video 49300 RTP/AVP 97 98
611     a=rtpmap:97 H264/90000
612     a=rtpmap:98 H264/90000
613     a=fmtp:97 profile-level-id=42c01f; max-fs=3600; max-mbps=108000
614     a=fmtp:98 profile-level-id=42c00b; max-fs=240; max-mbps=3600
615     a=imageattr:97 send [x=1280,y=720] recv [x=1280,y=720]
616     a=imageattr:98 send [x=320,y=180] recv [x=320,y=180]
617     a=simulcast send 97;98 recv 97

619                 Figure 3: Unified Plan Simulcast Offer

621     The only thing in the SDP that indicates simulcast capability is the
622     line in the video media description containing the "simulcast"
623     attribute.  The included format parameters indicate that the sent
624     simulcast versions can differ in video resolution and frame rate.

626     The Answer from the server indicates that it too is simulcast
627     capable.  Had it not been simulcast capable, the
628     "a=simulcast" line would not have been present and communication
629     would have started with the media negotiated in the SDP.

631     v=0
632     o=server 823479283 1209384938 IN IP4 192.0.2.2
633     s=Answer to Simulcast Enabled Unified Plan Client
634     t=0 0
635     c=IN IP4 192.0.2.43
636     m=audio 49672 RTP/AVP 0
637     a=rtpmap:0 PCMU/8000
638     m=video 49674 RTP/AVP 97 98
639     a=rtpmap:97 H264/90000
640     a=rtpmap:98 H264/90000
641     a=fmtp:97 profile-level-id=42c01f; max-fs=3600; max-mbps=108000
642     a=fmtp:98 profile-level-id=42c00b; max-fs=240; max-mbps=3600
643     a=imageattr:97 send [x=1280,y=720] recv [x=1280,y=720]
644     a=imageattr:98 send [x=320,y=180] recv [x=320,y=180]
645     a=simulcast recv 97;98 send 97

647                 Figure 4: Unified Plan Simulcast Answer

649     Since the server is the simulcast media receiver, it reverses the
650     direction of the "simulcast" attribute.

652  6.3.2.  Multi-Source Client

654     Fred is calling in to the same conference as in the example above
655     with a two-camera, two-display system, thus capable of handling two
656     separate media sources in each direction, where each media source is
657     simulcast-enabled in the send direction.  Fred's client is a Unified
658     Plan client, restricted to a single media source per media
659     description.

661     The first two simulcast versions for the first media source use
662     different codecs, H264-SVC [RFC6190] and H264 [RFC6184].  These two
663     simulcast versions also have a temporal dependency.  Two different
664     video codecs, VP8 [I-D.ietf-payload-vp8] and H264, are offered as
665     alternatives for the third simulcast version for the first media
666     source.

668     The second media source is offered with three different simulcast
669     versions.  All video streams of this second media source are loss
670     protected by RTP retransmission [RFC4588].

672     Fred's client is also using BUNDLE to send all RTP streams from all
673     media descriptions in the same RTP session on a single media
674     transport.  This example does not use enough RTP payload types to
675     risk running out of them, but for the sake of illustration it is
676     assumed that one of the payload types cannot be kept unique across
677     all media descriptions.  Therefore, the SDP makes use of the
678     mechanism (work in progress) in BUNDLE that identifies which media
679     description an RTP stream belongs to (a new RTCP SDES item and RTP
680     header extension [RFC5285] type carrying the a=mid value).  That
681     identification makes it possible to determine unambiguously, also at
682     the RTP level, which media source a stream comes from and thus what
683     the related simulcast versions are, even though two separate RTP
684     streams in the joint RTP session share an RTP payload type.
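The identification described above can be sketched in a few lines of Python. This is only an illustration, not part of the proposed solution: the function names parse_simulcast and sent_versions_by_mid are hypothetical, and the "a=simulcast" grammar used here is inferred solely from the examples in this section (";" separates simulcast versions, "," separates alternative payload types for one version) rather than taken from the formal syntax. The sketch keys each sent simulcast version on the pair (a=mid value, RTP payload type), which is the information a receiver would have once the mid is carried at the RTP level.

```python
def parse_simulcast(value):
    """Parse an "a=simulcast" value into a list of (direction, versions).

    Assumed grammar, inferred only from the examples in this section: each
    direction keyword (send, recv, sendrecv) is followed by a list in which
    ";" separates simulcast versions and "," separates alternative payload
    types for one version.
    """
    tokens = value.split()
    if len(tokens) % 2 != 0:
        raise ValueError("expected direction/list pairs: %r" % value)
    parsed = []
    for direction, fmt_list in zip(tokens[0::2], tokens[1::2]):
        if direction not in ("send", "recv", "sendrecv"):
            raise ValueError("unknown direction: %r" % direction)
        versions = [[int(pt) for pt in version.split(",")]
                    for version in fmt_list.split(";")]
        parsed.append((direction, versions))
    return parsed


def sent_versions_by_mid(sdp):
    """Map (mid, payload type) to the position of the sent simulcast version.

    The position is simply the order in which versions appear on the
    "a=simulcast" line of one media description (hypothetical convention).
    """
    sections = []  # one entry per "m=" media description
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            sections.append({"mid": None, "simulcast": []})
        elif sections and line.startswith("a=mid:"):
            sections[-1]["mid"] = line[len("a=mid:"):]
        elif sections and line.startswith("a=simulcast "):
            sections[-1]["simulcast"] = parse_simulcast(
                line[len("a=simulcast "):])

    lookup = {}
    for section in sections:
        index = 0
        for direction, versions in section["simulcast"]:
            if direction == "recv":
                continue  # only versions the offerer sends are of interest
            for alternatives in versions:
                for pt in alternatives:
                    lookup[(section["mid"], pt)] = index
                index += 1
    return lookup


# Excerpt of the two video media descriptions from Fred's offer.
OFFER = """\
m=video 49600 RTP/AVP 100 101 102 103
a=mid:bar
a=simulcast sendrecv 100;101 send 103,102
m=video 49602 RTP/AVP 96 103 97 104 105 106
a=mid:zen
a=simulcast send 97;96;103
"""

lookup = sent_versions_by_mid(OFFER)
```

With this excerpt, payload type 103 appears under two distinct keys, ("bar", 103) and ("zen", 103), so a receiver that knows the mid can tell the two media sources apart even though the payload type alone is ambiguous. The retransmission payload types 104 to 106 are not listed on the "a=simulcast" line and therefore do not appear in the lookup.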
686     v=0
687     o=fred 238947129 823479223 IN IP4 192.0.2.125
688     s=Offer from Simulcast Enabled Multi-Source Client
689     t=0 0
690     c=IN IP4 192.0.2.125
691     a=group:BUNDLE foo bar zen

693     m=audio 49200 RTP/AVP 99
694     a=mid:foo
695     a=rtpmap:99 G722/8000

697     m=video 49600 RTP/AVP 100 101 102 103
698     a=mid:bar
699     a=rtpmap:100 H264-SVC/90000
700     a=rtpmap:101 H264/90000
701     a=rtpmap:102 H264/90000
702     a=rtpmap:103 VP8/90000
703     a=fmtp:100 profile-level-id=42400d; max-fs=3600; max-mbps=108000; \
704         mst-mode=NI-TC
705     a=fmtp:101 profile-level-id=42c00d; max-fs=3600; max-mbps=54000
706     a=fmtp:102 profile-level-id=42c00d; max-fs=900; max-mbps=27000
707     a=fmtp:103 max-fs=900; max-fr=30
708     a=imageattr:100 send [x=1280,y=720] recv [x=1280,y=720]
709     a=imageattr:101 send [x=1280,y=720] recv [x=1280,y=720]
710     a=imageattr:102 send [x=640,y=360] recv [x=640,y=360]
711     a=imageattr:103 send [x=640,y=360] recv [x=640,y=360]
712     a=depend:100 lay bar:101
713     a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
714     a=simulcast sendrecv 100;101 send 103,102

716     m=video 49602 RTP/AVP 96 103 97 104 105 106
717     a=mid:zen
718     a=rtpmap:96 VP8/90000
719     a=fmtp:96 max-fs=3600; max-fr=30
720     a=rtpmap:104 rtx/90000
721     a=fmtp:104 apt=96;rtx-time=200
722     a=rtpmap:103 VP8/90000
723     a=fmtp:103 max-fs=900; max-fr=30
724     a=rtpmap:105 rtx/90000
725     a=fmtp:105 apt=103;rtx-time=200
726     a=rtpmap:97 VP8/90000
727     a=fmtp:97 max-fs=240; max-fr=15
728     a=rtpmap:106 rtx/90000
729     a=fmtp:106 apt=97;rtx-time=200
730     a=extmap:1 urn:ietf:params:rtp-hdrext:sdes:mid
731     a=simulcast send 97;96;103

733              Figure 5: Fred's Multi-Source Simulcast Offer

735        Note: Empty lines in the SDP above are added only for readability
736        and would not be present in an actual SDP.

738  7.  Network Aspects

740     Simulcast is in this memo defined as the act of sending multiple
741     alternative encoded streams of the same underlying media source.
742     When transmitting multiple independent streams that originate from
743     the same source, it could potentially be done in several different
744     ways using RTP.  A general discussion on considerations for use of
745     the different RTP multiplexing alternatives can be found in
746     Guidelines for Multiplexing in RTP
747     [I-D.ietf-avtcore-multiplex-guidelines].  Discussion and
748     clarification on how to handle multiple streams in an RTP session can
749     be found in [I-D.ietf-avtcore-rtp-multi-stream].

751     The network aspects that are relevant for simulcast are:

753     Quality of Service:  When using simulcast it might be of interest to
754        prioritize a particular simulcast version, rather than applying
755        equal treatment to all versions.  For example, lower bit-rate
756        versions may be prioritized over higher bit-rate versions to
757        minimize congestion or packet losses in the low bit-rate versions.
758        Thus, there is a benefit to using a simulcast solution that
759        supports QoS as well as possible.  By separating simulcast versions
760        into different RTP sessions and sending those RTP sessions over
761        different media transports, a simulcast version can be prioritized
762        by existing flow-based QoS mechanisms.  When using unicast, QoS
763        mechanisms based on individual packet marking are also feasible,
764        which do not require separation of simulcast versions into
765        different RTP sessions to apply different QoS.  The proposed
766        solution can be extended to support this functionality with an
767        optional mid: prefix before the RTP payload types of a simulcast
768        version, to describe simulcast across multiple media descriptions.
770     NAT/FW Traversal:  Using multiple RTP sessions will incur more cost
771        for NAT/FW traversal unless they can re-use the same transport
772        flow, which can be achieved either by multiplexing multiple RTP
773        sessions on a single lower-layer transport
774        [I-D.westerlund-avtcore-transport-multiplexing] or by negotiating
775        media multiplexing using SDP port numbers
776        [I-D.ietf-mmusic-sdp-bundle-negotiation].  If flow-based QoS with
777        any differentiation is desirable, the cost of additional
778        transport flows is likely unavoidable.

780     Multicast:  Multiple RTP sessions will be required to enable
781        combining simulcast with multicast.  Different simulcast versions
782        have to be separated into different multicast groups to allow a
783        multicast receiver to pick the version it wants, rather than
784        receive all of them.  In this case, the only reasonable
785        implementation is to use different RTP sessions for each multicast
786        group so that reporting and other RTCP functions operate as
787        intended.  The proposed solution can be extended to support this
788        functionality with an optional mid: prefix before the RTP payload
789        types of a simulcast version, to describe simulcast across
790        multiple media descriptions.

792  8.  IANA Considerations

794     This document requests registration of a new SDP attribute, "simulcast".

796     Formal registrations to be written.

798  9.  Security Considerations

800     The simulcast capability and configuration attributes and parameters
801     are vulnerable to attacks in signaling.

803     A false inclusion of the "a=simulcast" attribute may result in
804     simultaneous transmission of multiple RTP streams that would
805     otherwise not be generated.  The impact is limited by the media
806     description joint bandwidth, shared by all simulcast versions
807     irrespective of their number.  There may however be a large number of
808     unwanted RTP streams that will impact the share of the bandwidth
809     allocated for the originally wanted RTP stream.
811     A hostile removal of the "a=simulcast" attribute will result in
812     simulcast not being used.

814     Neither of the above is likely to have major consequences, and both
815     can be mitigated by signaling that is at least integrity protected
816     and source authenticated, so that an attacker cannot change it.

818  10.  Contributors

820     Morgan Lindqvist and Fredrik Jansson, both from Ericsson,
821     contributed important material to the first versions of this
822     document.  Robert Hansen, from Cisco, contributed significantly to
823     subsequent versions.

825  11.  Acknowledgements

827  12.  References

829  12.1.  Normative References

831     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
832                Requirement Levels", BCP 14, RFC 2119, March 1997.

834     [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
835                Jacobson, "RTP: A Transport Protocol for Real-Time
836                Applications", STD 64, RFC 3550, July 2003.

838     [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
839                Description Protocol", RFC 4566, July 2006.

841     [RFC5109]  Li, A., "RTP Payload Format for Generic Forward Error
842                Correction", RFC 5109, December 2007.

844     [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
845                Specifications: ABNF", STD 68, RFC 5234, January 2008.

847     [RFC7104]  Begen, A., Cai, Y., and H. Ou, "Duplication Grouping
848                Semantics in the Session Description Protocol", RFC 7104,
849                January 2014.

851  12.2.  Informative References

853     [I-D.ietf-avtcore-multiplex-guidelines]
854                Westerlund, M., Perkins, C., and H. Alvestrand,
855                "Guidelines for using the Multiplexing Features of RTP to
856                Support Multiple Media Streams", draft-ietf-avtcore-
857                multiplex-guidelines-03 (work in progress), October 2014.

859     [I-D.ietf-avtcore-rtp-multi-stream]
860                Lennox, J., Westerlund, M., Wu, W., and C. Perkins,
861                "Sending Multiple Media Streams in a Single RTP Session",
862                draft-ietf-avtcore-rtp-multi-stream-05 (work in progress),
863                July 2014.
865     [I-D.ietf-avtcore-rtp-topologies-update]
866                Westerlund, M. and S. Wenger, "RTP Topologies", draft-
867                ietf-avtcore-rtp-topologies-update-04 (work in progress),
868                August 2014.

870     [I-D.ietf-avtext-rtp-grouping-taxonomy]
871                Lennox, J., Gross, K., Nandakumar, S., and G. Salgueiro,
872                "A Taxonomy of Grouping Semantics and Mechanisms for Real-
873                Time Transport Protocol (RTP) Sources", draft-ietf-avtext-
874                rtp-grouping-taxonomy-02 (work in progress), June 2014.

876     [I-D.ietf-avtext-rtp-stream-pause]
877                Akram, A., Even, R., and M. Westerlund, "RTP Stream Pause
878                and Resume", draft-ietf-avtext-rtp-stream-pause-04 (work
879                in progress), October 2014.

881     [I-D.ietf-mmusic-sdp-bundle-negotiation]
882                Holmberg, C., Alvestrand, H., and C. Jennings,
883                "Negotiating Media Multiplexing Using the Session
884                Description Protocol (SDP)", draft-ietf-mmusic-sdp-bundle-
885                negotiation-12 (work in progress), October 2014.

887     [I-D.ietf-payload-vp8]
888                Westin, P., Lundin, H., Glover, M., Uberti, J., and F.
889                Galligan, "RTP Payload Format for VP8 Video", draft-ietf-
890                payload-vp8-13 (work in progress), October 2014.

892     [I-D.roach-mmusic-unified-plan]
893                Roach, A., Uberti, J., and M. Thomson, "A Unified Plan for
894                Using SDP with Large Numbers of Media Flows", draft-roach-
895                mmusic-unified-plan-00 (work in progress), July 2013.

897     [I-D.westerlund-avtcore-transport-multiplexing]
898                Westerlund, M. and C. Perkins, "Multiplexing Multiple RTP
899                Sessions onto a Single Lower-Layer Transport", draft-
900                westerlund-avtcore-transport-multiplexing-07 (work in
901                progress), October 2013.

903     [RFC2198]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
904                Handley, M., Bolot, J., Vega-Garcia, A., and S. Fosse-
905                Parisis, "RTP Payload for Redundant Audio Data", RFC 2198,
906                September 1997.

908     [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
909                with Session Description Protocol (SDP)", RFC 3264, June
910                2002.
912     [RFC3389]  Zopf, R., "Real-time Transport Protocol (RTP) Payload for
913                Comfort Noise (CN)", RFC 3389, September 2002.

915     [RFC4588]  Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
916                Hakenberg, "RTP Retransmission Payload Format", RFC 4588,
917                July 2006.

919     [RFC4733]  Schulzrinne, H. and T. Taylor, "RTP Payload for DTMF
920                Digits, Telephony Tones, and Telephony Signals", RFC 4733,
921                December 2006.

923     [RFC5117]  Westerlund, M. and S. Wenger, "RTP Topologies", RFC 5117,
924                January 2008.

926     [RFC5285]  Singer, D. and H. Desineni, "A General Mechanism for RTP
927                Header Extensions", RFC 5285, July 2008.

929     [RFC5576]  Lennox, J., Ott, J., and T. Schierl, "Source-Specific
930                Media Attributes in the Session Description Protocol
931                (SDP)", RFC 5576, June 2009.

933     [RFC5583]  Schierl, T. and S. Wenger, "Signaling Media Decoding
934                Dependency in the Session Description Protocol (SDP)", RFC
935                5583, July 2009.

937     [RFC6184]  Wang, Y., Even, R., Kristensen, T., and R. Jesup, "RTP
938                Payload Format for H.264 Video", RFC 6184, May 2011.

940     [RFC6190]  Wenger, S., Wang, Y., Schierl, T., and A. Eleftheriadis,
941                "RTP Payload Format for Scalable Video Coding", RFC 6190,
942                May 2011.

944     [RFC6236]  Johansson, I. and K. Jung, "Negotiation of Generic Image
945                Attributes in the Session Description Protocol (SDP)", RFC
946                6236, May 2011.

948  Authors' Addresses
949     Magnus Westerlund
950     Ericsson
951     Farogatan 6
952     SE-164 80 Kista
953     Sweden

955     Phone: +46 10 714 82 87
956     Email: magnus.westerlund@ericsson.com

958     Bo Burman
959     Ericsson
960     Kistavagen 25
961     SE-164 80 Kista
962     Sweden

964     Phone: +46 10 714 13 11
965     Email: bo.burman@ericsson.com

967     Suhas Nandakumar
968     Cisco
969     170 West Tasman Drive
970     San Jose, CA 95134
971     USA

973     Email: snandaku@cisco.com

975     Mo Zanaty
976     Cisco
977     170 West Tasman Drive
978     San Jose, CA 95134
979     USA

981     Email: mzanaty@cisco.com