Network Working Group                                      M. Westerlund
Internet-Draft                                                 B. Burman
Intended status: Standards Track                                Ericsson
Expires: January 5, 2015                                   S. Nandakumar
                                                                   Cisco
                                                            July 4, 2014


                    Using Simulcast in RTP Sessions
               draft-westerlund-avtcore-rtp-simulcast-04

Abstract

   In some application scenarios it may be desirable to send multiple
   differently encoded versions of the same media source in independent
   RTP streams.  This is called simulcast.  This document discusses the
   best way of accomplishing simulcast in RTP and how to signal it in
   SDP.
   A solution is defined by making an extension to SDP, and using
   RTP/RTCP identification methods to relate RTP streams belonging to
   the same media source.  The SDP extension consists of a new media
   level SDP attribute that expresses the capability to send and/or
   receive simulcast RTP streams.  One part of the RTP/RTCP
   identification method is included as a reference to a separate
   document, since it is also useful for other purposes.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 5, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Definitions
     2.1.  Terminology
     2.2.  Requirements Language
   3.  Use Cases
     3.1.  Reaching a Diverse Set of Receivers
     3.2.  Application Specific Media Source Handling
     3.3.  Receiver Adaptation in Multicast/Broadcast
     3.4.  Receiver Media Source Preferences
   4.  Requirements
   5.  Proposed Solution Overview
   6.  Proposed Solution
     6.1.  Simulcast Capability
       6.1.1.  Declarative Use
       6.1.2.  Offer/Answer Use
     6.2.  Relating Simulcast Versions
     6.3.  Signaling Examples
       6.3.1.  Unified Plan Client
       6.3.2.  Multi-Source Client
   7.  Network Aspects
   8.  IANA Considerations
   9.  Security Considerations
   10. Contributors
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1.  Introduction

   Most of today's multiparty video conference solutions make use of
   centralized servers to reduce the bandwidth and CPU consumption in
   the endpoints.
   Those servers receive RTP streams from each participant and send
   some suitable set of possibly modified RTP streams to the rest of
   the participants, which usually have heterogeneous capabilities
   (screen size, CPU, bandwidth, codec, etc.).  One of the biggest
   issues is how to perform RTP stream adaptation to different
   participants' constraints with the minimum possible impact on both
   video quality and server performance.

   Simulcast is defined in this memo as the act of simultaneously
   sending multiple different encoded streams of the same media source,
   e.g. the same video source encoded with different video encoder
   types or image resolutions.  This can be done in several ways and
   for different purposes.  This document focuses on the case where it
   is desirable to provide a media source as multiple encoded streams
   over RTP [RFC3550] towards an intermediary so that the intermediary
   can provide the wanted functionality by selecting which RTP stream
   to forward to other participants in the session, and more
   specifically on how the identification and grouping of the involved
   RTP streams are done.  From an RTP perspective, simulcast is a
   specific application of the aspects discussed in RTP Multiplexing
   Guidelines [I-D.ietf-avtcore-multiplex-guidelines].

   The purpose of this document is to describe a few scenarios where
   the use of simulcast is motivated, and to propose a suitable
   solution for signaling and performing RTP simulcast.

2.  Definitions

2.1.  Terminology

   This document makes use of the terminology defined in RTP Taxonomy
   [I-D.ietf-avtext-rtp-grouping-taxonomy], RTP Topology [RFC5117] and
   RTP Topologies Update [I-D.ietf-avtcore-rtp-topologies-update].
   In addition, the following terms are used:

   RTP Mixer:  An RTP middle node, defined in [RFC5117] (Section 3.4:
      Topo-Mixer), further elaborated and extended with other
      topologies in [I-D.ietf-avtcore-rtp-topologies-update] (Sections
      3.6 to 3.9).

   RTP Switch:  A common short term for the terms "switching RTP
      mixer", "source projecting middlebox", and "video switching MCU"
      as discussed in [I-D.ietf-avtcore-rtp-topologies-update].

   Simulcast version:  One encoded stream from the set of encoded
      streams that constitutes the simulcast for a single media source.

   Simulcast version alternative:  One encoded stream being encoded in
      one of possibly multiple alternative ways to create a simulcast
      version.

2.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  Use Cases

   Many use cases of simulcast as described in this document relate to
   a multi-party communication session where one or more central nodes
   are used to adapt the view of the communication session towards
   individual participants, and facilitate the media transport between
   participants.  Thus, these cases target the RTP Mixer type of
   topology.

   There are two principal approaches for an RTP Mixer to provide this
   adapted view of the communication session to each receiving
   participant:

   o  Transcoding (decoding and re-encoding) received RTP streams with
      characteristics adapted to each receiving participant.  This
      often includes mixing or composition of media sources from
      multiple participants into a mixed media source originated by the
      RTP Mixer.  The main advantage of this approach is that it
      achieves close to optimal adaptation to individual receiving
      participants.
      The main disadvantages are that it can be very computationally
      expensive to the RTP Mixer and typically also degrades media
      Quality of Experience (QoE), such as end-to-end delay, for the
      receiving participants.

   o  Switching a subset of all received RTP streams or sub-streams to
      each receiving participant, where the used subset is typically
      specific to each receiving participant.  The main advantages of
      this approach are that it is computationally cheap to the RTP
      Mixer and it has very limited impact on media QoE.  The main
      disadvantage is that it can be difficult to combine a subset of
      received RTP streams into a perfect fit to the resource situation
      of a receiving participant.

   The use of simulcast relates to the latter approach, where it is
   more important to reduce the load on the RTP Mixer and/or minimize
   QoE impact than to achieve an optimal adaptation of resource usage.

   A multicast/broadcast case where the receivers themselves select the
   most appropriate simulcast version and tune in to the right media
   transport to receive that version is also considered (Section 3.3).
   This enables large receiver populations that are heterogeneous in
   terms of capabilities and the use of network path bandwidth
   resources.

3.1.  Reaching a Diverse Set of Receivers

   The media sources provided by a sending participant potentially need
   to reach several receiving participants that differ in terms of
   available resources.  The receiver resources that typically differ
   include, but are not limited to:

   Codec:  This includes codec type (such as SDP MIME type) and can
      include codec configuration options (e.g. SDP fmtp parameters).
      Two codec resources that differ only in codec configuration are
      considered "different" if they are not "compatible", for example
      if they differ in video codec profile or in the transport
      packetization configuration.

   Sampling:  This relates to how the media source is sampled, in
      spatial as well as in temporal domain.  For video streams,
      spatial sampling affects image resolution and temporal sampling
      affects video frame rate.  For audio, spatial sampling relates to
      the number of audio channels and temporal sampling affects audio
      bandwidth.  This may be used to suit different rendering
      capabilities or needs at the receiving endpoints, as well as a
      method to achieve different transport capabilities, bitrates and
      eventually QoE by controlling the amount of source data.

   Bitrate:  This relates to the amount of bits spent per second to
      transmit the media source as an RTP stream, which typically also
      affects the Quality of Experience (QoE) for the receiving user.

   Letting the sending participant create a simulcast of a few
   differently configured RTP streams per media source can be a good
   tradeoff when using an RTP switch as middlebox, instead of sending a
   single RTP stream and using an RTP mixer to create individual
   transcodings to each receiving participant.

   This requires that the receiving participants can be categorized in
   terms of available resources and that the sending participant can
   choose a matching configuration for a single RTP stream per category
   and media source.

   For example, assume for simplicity a set of receiving participants
   that differ only in that some have support to receive Codec A, and
   the others have support to receive Codec B.  Further assume that the
   sending participant can send both Codec A and B.  It can then reach
   all receivers by creating two simulcasted RTP streams from each
   media source; one for Codec A and one for Codec B.

   In another simple example, a set of receiving participants differ
   only in screen resolution; some are able to display video with at
   most 360p resolution and some support 720p resolution.
   A sending participant can then reach all receivers by creating a
   simulcast of RTP streams with 360p and 720p resolution for each sent
   video media source.

   In more elaborate cases, the receiving participants differ both in
   available sampling and bitrate, and maybe also codec, and it is up
   to the RTP switch to find a good trade-off in which simulcasted
   stream to choose for each intended receiver.  It is also the
   responsibility of the RTP switch to negotiate a good fit of
   simulcast streams with the sending participant.

   The maximum number of simulcasted RTP streams that can be sent is
   mainly limited by the amount of processing and uplink network
   resources available to the sending participant.

3.2.  Application Specific Media Source Handling

   The application logic that controls the communication session may
   include special handling of some media sources.  It is for example
   commonly the case that the media from a sending participant is not
   sent back to itself.

   It is also common that a currently active speaker participant is
   shown in larger size or higher quality than other participants (the
   sampling or bitrate aspects of Section 3.1).  Not sending the active
   speaker media back to itself means there is some other participant's
   media that instead has to receive special handling towards the
   active speaker; typically the previous active speaker.  This way,
   the previously active speaker is needed both in larger size (to
   current active speaker) and in small size (to the rest of the
   participants), which can be solved with a simulcast from the
   previously active speaker to the RTP switch.

3.3.  Receiver Adaptation in Multicast/Broadcast

   When using broadcast or multicast technology to distribute real-time
   media streams to large populations of receivers, there can still be
   significant heterogeneity among the receiver population.
   This can depend on several factors:

   Network Bandwidth:  The network paths to individual receivers will
      have variations in the bandwidth, thus putting different limits
      on the supported bit-rates that can be received.

   Endpoint Capabilities:  The end point's hardware and software can
      have varying capabilities in relation to screen resolution,
      decoding capabilities, and supported media codecs.

   To handle these variations, a transmitter of real-time media may
   want to apply simulcast to a media source and provide it as a set of
   different encoded streams, enabling the receivers to select the best
   fit from this set themselves.  The end point capabilities will
   usually result in a single initial choice.  However, the network
   bandwidth can vary over time, which requires a client to
   continuously monitor its reception to determine if the received RTP
   streams still fit within the available bandwidth.  If not, another
   set of encoded streams from the ones offered in the simulcast will
   have to be chosen.

   When using IP multicast, the level of granularity that the receiver
   can select from is decided by its ability to choose different
   multicast addresses.  Thus, different simulcast versions need to be
   put on different media transports using different multicast
   addresses.  If these simulcast versions are described using SDP,
   they need to be part of different SDP media descriptions, as SDP
   binds to transport on media description level.

3.4.  Receiver Media Source Preferences

   The application logic that controls the communication session may
   allow receiving participants to apply preferences to the
   characteristics of the RTP stream they receive, for example in terms
   of the aspects listed in Section 3.1.  Sending a simulcast of RTP
   streams is one way of accommodating receivers with conflicting or
   otherwise incompatible preferences.

4.  Requirements

   The following requirements need to be met to support the use cases
   in previous sections:

   REQ-1:  Identification.  It must be possible to identify a set of
      simulcasted RTP streams as originating from the same media
      source:

      REQ-1.1:  In SDP signaling.

      REQ-1.2:  On RTP/RTCP level.

   REQ-2:  Transport usage.  The solution must work when using:

      REQ-2.1:  Legacy SDP with separate media transports per SDP media
         description.

      REQ-2.2:  Bundled SDP media descriptions.

   REQ-3:  Capability negotiation.  It must be possible that:

      REQ-3.1:  Sender can express capability of sending simulcast.

      REQ-3.2:  Receiver can express capability of receiving simulcast.

      REQ-3.3:  Sender can express maximum number of simulcast versions
         that can be provided.

      REQ-3.4:  Receiver can express maximum number of simulcast
         versions that can be received.

      REQ-3.5:  Sender can detail the characteristics of the simulcast
         versions that can be provided.

      REQ-3.6:  Receiver can detail the characteristics of the
         simulcast versions that it prefers to receive.

   REQ-4:  Distinguishing features.  It must be possible to have
      different simulcast versions use different codec parameters, as
      can be expressed by SDP format values and RTP payload types.

   REQ-5:  Compatibility.  It must be possible to use simulcast in
      combination with other RTP mechanisms that generate additional
      RTP streams:

      REQ-5.1:  RTP Retransmission [RFC4588].

      REQ-5.2:  RTP Forward Error Correction [RFC5109].

      REQ-5.3:  Related payload types such as audio Comfort Noise
         and/or DTMF.

   REQ-6:  Interoperability.  The solution must be possible to use in:

      REQ-6.1:  Interworking with non-simulcast legacy clients using a
         single media source per media type.

      REQ-6.2:  WebRTC "Unified Plan" environment with a single media
         source per SDP media description.

5.  Proposed Solution Overview

   The proposed solution consists of signaling simulcast capability and
   configurations in SDP [RFC4566]:

   o  An offer or answer can contain a number of simulcast versions,
      separate for send and receive directions.

   o  An offer or answer can contain multiple, alternative simulcast
      versions in the same fashion as multiple, alternative codecs can
      be offered in a media description.

   o  Currently, a single media source per SDP media description is
      assumed, which makes the solution work in a Unified Plan
      [I-D.roach-mmusic-unified-plan] context (although different from
      what is currently defined there), both with and without BUNDLE
      grouping.

   o  The codec configuration for each simulcast version is expressed
      in terms of existing SDP formats (and typically RTP payload
      types).  Some codecs may rely on codec configuration based on
      general attributes that apply for all formats within a media
      description, and which could thus not be used to separate
      different simulcast versions.  This memo makes no attempt to
      address such shortcomings, but instead encourages that a
      separate, general mechanism is defined for that purpose if
      needed.

   o  It is possible, but not required, to use source-specific
      signaling [RFC5576] with the proposed solution.

6.  Proposed Solution

   This section further details the signaling solution outlined above
   (Section 5).

6.1.  Simulcast Capability

   It is proposed that simulcast capability is defined as a media level
   SDP attribute, "a=simulcast".  The meaning of the attribute on SDP
   session level is undefined and MUST NOT be used.  There MUST be at
   most one "a=simulcast" attribute per media description.
   The ABNF [RFC5234] for this attribute is:

   simulcast-attribute = "a=simulcast" 1*3( WSP sc-dir-list )
   sc-dir-list         = sc-dir WSP sc-fmt-list *( ";" sc-fmt-list )
   sc-dir              = "send" / "recv" / "sendrecv"
   sc-fmt-list         = sc-fmt *( "," sc-fmt )
   sc-fmt              = fmt
   ; WSP defined in [RFC5234]
   ; fmt defined in [RFC4566]

                      Figure 1: ABNF for Simulcast

   There are separate and independent sets of parameters for simulcast
   in send and receive directions.  When listing multiple directions,
   each direction MUST NOT occur more than once.

   Attribute parameters are grouped by direction and consist of a
   listing of SDP format tokens (usually corresponding to RTP payload
   types), which describe the simulcast versions to be used.  The
   number of (non-alternative, see below) formats in the list sets a
   limit to the number of supported simulcast versions in that
   direction.  The order of the listed simulcast versions in the "send"
   direction is not significant.  The order of the listed simulcast
   versions in the "recv" direction expresses which simulcast versions
   are preferred, with the leftmost being most preferred, if the number
   of actually sent simulcast versions has to be reduced for some
   reason.

   Formats that have explicit dependencies [RFC5583] to other formats
   (even in the same media description) MAY be listed as different
   simulcast versions.

   Alternative simulcast versions MAY be specified as part of the
   attribute parameters by expressing each simulcast version format as
   a comma-separated list of alternative values.  In this case, all
   combinations of those alternatives MUST be supported.  The order of
   the alternatives within a simulcast version is not significant;
   codec preference is expressed by format type ordering on the m-line,
   using regular SDP rules.
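
   As an illustration only, the grammar in Figure 1 could be parsed
   along the following lines.  This is a minimal sketch, not part of
   the proposed solution: the function name parse_simulcast and the
   returned dictionary layout are assumptions made for this example,
   and format tokens are not validated against the "fmt" rule of
   [RFC4566].

```python
# Sketch of a parser for the "a=simulcast" attribute in Figure 1.
# The function name and the returned layout are illustrative assumptions.

DIRECTIONS = ("send", "recv", "sendrecv")


def parse_simulcast(line):
    """Return {direction: [[fmt, ...], ...]} for one a=simulcast line.

    The outer list holds the simulcast versions (";"-separated); each
    inner list holds the alternative formats (","-separated) of one
    simulcast version.
    """
    prefix = "a=simulcast"
    # The attribute name must be followed by WSP (space or tab).
    if not line.startswith((prefix + " ", prefix + "\t")):
        raise ValueError("not an a=simulcast attribute")
    tokens = line[len(prefix):].split()
    # 1*3( WSP sc-dir-list ): one to three (direction, format list) pairs.
    if len(tokens) not in (2, 4, 6):
        raise ValueError("expected one to three direction/format pairs")
    result = {}
    for direction, fmt_lists in zip(tokens[0::2], tokens[1::2]):
        if direction not in DIRECTIONS:
            raise ValueError("unknown direction: " + direction)
        if direction in result:
            # Each direction MUST NOT occur more than once.
            raise ValueError("duplicate direction: " + direction)
        versions = [version.split(",") for version in fmt_lists.split(";")]
        if any(fmt == "" for version in versions for fmt in version):
            raise ValueError("empty format in: " + fmt_lists)
        result[direction] = versions
    return result


if __name__ == "__main__":
    print(parse_simulcast("a=simulcast sendrecv 100;101 send 103,102"))
```

   With this layout, the number of inner lists for a direction is the
   number of simulcast versions in that direction, and an inner list
   with more than one entry carries alternatives for a single version.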

   A simulcast version can use a codec defined such that the same RTP
   SSRC can change RTP payload type multiple times during a session,
   possibly even on a per-packet basis.  A typical example can be a
   speech codec that makes use of Comfort Noise [RFC3389] and/or DTMF
   [RFC4733] formats.  In those cases, such "related" formats MUST NOT
   be listed explicitly in the attribute parameters, since they are not
   strictly simulcast versions of the media source, but rather a
   specific way of generating the RTP stream of a single simulcast
   version with varying RTP payload type.  Instead, only a single codec
   format MUST be used per simulcast version or simulcast version
   alternative (if there are such).  The codec format SHOULD be the
   codec most relevant to the media description, if possible to
   identify, for example the audio codec rather than the DTMF.  What
   codec format to choose in the case of switching between multiple
   equally "important" formats is left open, but it is assumed that in
   the presence of such strong relation it does not matter which is
   chosen.

   Use of the redundant audio data [RFC2198] format could be seen as a
   form of simulcast for loss protection purposes, but is not
   considered conflicting with the mechanisms described in this memo
   and MAY therefore be used as any other format.  In this case the
   "red" format, rather than the carried formats, SHOULD be the one to
   list as a simulcast version on the "a=simulcast" line.

      Editor's note: Consider adding the possibility to put an RTP
      stream in "paused" state [I-D.ietf-avtext-rtp-stream-pause] from
      the beginning of the session, possibly starting it at a later
      point in time by applying RTP/RTCP level procedures from that
      specification.

6.1.1.  Declarative Use

   When used as a declarative media description, a=simulcast "recv"
   direction formats indicate the configured end point's required
   capability to recognize and receive a specified set of RTP streams
   as simulcast streams.  In the same fashion, a=simulcast "send"
   direction requests the end point to send a specified set of RTP
   streams as simulcast streams.  The "sendrecv" direction combines
   "send" and "recv" requirements, using the same format values for
   both.

   If simulcast version alternatives are listed, it means that the
   configured end point MUST be prepared to receive any of the "recv"
   formats, and MAY send any of the "send" formats for that simulcast
   version.

6.1.2.  Offer/Answer Use

   An offerer wanting to use simulcast SHALL include the "a=simulcast"
   attribute in the offer.  An offerer that receives an answer without
   "a=simulcast" MUST NOT use simulcast towards the answerer.  An
   offerer that receives an answer with "a=simulcast" not listing a
   direction or without any formats in a specified direction MUST NOT
   use simulcast in that direction.

   An answerer that does not understand the concept of simulcast will
   not know the attribute either and will remove it in the SDP answer,
   as defined in existing SDP Offer/Answer [RFC3264] procedures.  An
   answerer that does understand the attribute and that wants to
   support simulcast in an indicated direction SHALL reverse
   directionality of the unidirectional direction parameters; "send"
   becomes "recv" and vice versa, and include it in the answer.  If the
   offered direction is "sendrecv", the answerer MAY keep it, but MAY
   also change it to "send" or "recv" to indicate that it is only
   interested in simulcast for a single direction.
   Note that, like all other uses of SDP format tags for the send
   direction in Offer/Answer, format tags related to the simulcast send
   direction in an offer ("send" or "sendrecv") are placeholders that
   refer to information in the offer SDP, and the actual formats that
   will be used on the wire (including RTP Payload Format numbers)
   depend on information included in the SDP answer.

   An offerer listing a set of receive simulcast versions and/or
   alternatives in the offer MUST be prepared to receive RTP streams
   for any of those simulcast versions and/or alternatives from the
   answerer.

   An answerer that receives an offer containing an "a=simulcast"
   attribute listing alternative formats for simulcast versions MAY
   keep all the alternatives in the answer, but it MAY also choose to
   remove any non-desirable alternatives per simulcast version in the
   answer.  The answerer MUST NOT add any alternatives that were not
   present in the offer.

   An answerer that receives an offer with simulcast that lists a
   number of simulcast versions MAY reduce the number of simulcast
   versions in the answer, but MUST NOT add simulcast versions.

   An offerer that receives an answer where some simulcast version
   alternatives are kept MUST be prepared to receive any of the kept
   send direction alternatives, and MAY send any of the kept receive
   direction alternatives from the answer.  This is similar to the case
   when the answer includes multiple formats on the m-line.

   An offerer that receives an answer where some of the simulcast
   versions are removed MAY release the corresponding resources (codec,
   transport, etc.) in its receive direction and MUST NOT send any RTP
   streams corresponding to the removed simulcast versions.

   The media formats and corresponding characteristics of encoded
   streams used in a simulcast SHOULD be chosen such that they are
   different.
   If this difference is not required, RTP duplication [RFC7104]
   procedures SHOULD be considered instead of simulcast.

      Note: The inclusion of "a=simulcast" or the use of simulcast does
      not change any of the interpretation or Offer/Answer procedures
      for other SDP attributes, like "a=fmtp".

6.2.  Relating Simulcast Versions

   As long as there is only a single media source per SDP media
   description, simulcast RTP streams can be related on RTP level
   through the RTP payload type, as specified in the SDP "a=simulcast"
   attribute (Section 6.1) parameters.  When BUNDLE
   [I-D.ietf-mmusic-sdp-bundle-negotiation] is used to let multiple SDP
   media descriptions specify a single RTP session, there is an
   identification mechanism that allows relating RTP streams back to
   individual media descriptions, after which the above RTP payload
   type relation can be used.

6.3.  Signaling Examples

   These examples are for a case of client to video conference service
   using a centralized media topology with an RTP mixer.

             +---+      +-----------+      +---+
             | A |<---->|           |<---->| B |
             +---+      |           |      +---+
                        |   Mixer   |
             +---+      |           |      +---+
             | F |<---->|           |<---->| J |
             +---+      +-----------+      +---+

               Figure 2: Four-party Mixer-based Conference

6.3.1.  Unified Plan Client

   Alice is calling in to the mixer with a simulcast-enabled Unified
   Plan client capable of a single media source per media type.  The
   only difference from a non-simulcast client is the capability to
   send simulcast based on video resolution [RFC6236] ("imageattr") and
   framerate (codec specific "max-mbps").
   Alice's Offer looks like:

   v=0
   o=alice 2362969037 2362969040 IN IP4 192.0.2.156
   s=Simulcast Enabled Unified Plan Client
   t=0 0
   c=IN IP4 192.0.2.156
   b=AS:665
   m=audio 49200 RTP/AVP 96 8
   b=AS:145
   a=rtpmap:96 G719/48000/2
   a=rtpmap:8 PCMA/8000
   m=video 49300 RTP/AVP 97 98
   b=AS:520
   a=rtpmap:97 H264/90000
   a=fmtp:97 profile-level-id=42c01e
   a=imageattr:97 send [x=640,y=360] [x=320,y=180] \
                  recv [x=640,y=360] [x=320,y=180]
   a=rtpmap:98 H264/90000
   a=fmtp:98 profile-level-id=42c00b; max-mbps=3600
   a=imageattr:98 send [x=320,y=180] recv [x=320,y=180]
   a=simulcast send 97;98

               Figure 3: Unified Plan Simulcast Offer

   The only thing in the SDP that indicates simulcast capability is the
   line in the video media description containing the "simulcast"
   attribute.  The included format parameters indicate that sent
   simulcast versions can differ in video resolution and framerate.

   The Answer from the server indicates that it too is simulcast
   capable.  Should it not have been simulcast capable, the
   "a=simulcast" line would not have been present and communication
   would have started with the media negotiated in the SDP.

   v=0
   o=server 823479283 1209384938 IN IP4 192.0.2.2
   s=Answer to Simulcast Enabled Unified Plan Client
   t=0 0
   c=IN IP4 192.0.2.43
   b=AS:665
   m=audio 49672 RTP/AVP 96
   b=AS:145
   a=rtpmap:96 G719/48000/2
   m=video 49674 RTP/AVP 97 98
   b=AS:520
   a=rtpmap:97 H264/90000
   a=fmtp:97 profile-level-id=42c01e
   a=imageattr:97 send [x=640,y=360] [x=320,y=180] \
                  recv [x=640,y=360] [x=320,y=180]
   a=rtpmap:98 H264/90000
   a=fmtp:98 profile-level-id=42c00b; max-mbps=3600
   a=imageattr:98 send [x=320,y=180] recv [x=320,y=180]
   a=simulcast recv 97;98

               Figure 4: Unified Plan Simulcast Answer

   Since the server is the simulcast media receiver, it reverses the
   direction of the "simulcast" attribute.

6.3.2.  Multi-Source Client

   Fred is calling in to the same conference as in the example above
   with a two-camera, two-display system, thus capable of handling two
   separate media sources in each direction, where each media source is
   simulcast-enabled in the send direction.  Fred's client is a Unified
   Plan client, restricted to a single media source per media
   description.

   The first two simulcast versions for the first media source use
   different codecs, H264-SVC [RFC6190] and H264 [RFC6184].  These two
   simulcast versions also have a temporal dependency.  Two different
   video codecs, VP8 [I-D.ietf-payload-vp8] and H264, are offered as
   alternatives for the third simulcast version for the first media
   source.

   The second media source is offered with three different simulcast
   versions.  All video streams of this second media source are loss
   protected by RTP retransmission [RFC4588].

   Fred's client is also using BUNDLE to send all RTP streams from all
   media descriptions in the same RTP session on a single media
   transport.  This example does not use enough RTP payload types to
   risk running out of them, but for the sake of illustration it is
   assumed that one of the payload types cannot be kept unique across
   all media descriptions.  Therefore, the SDP makes use of the
   mechanism (work in progress) in BUNDLE that identifies which media
   description an RTP stream belongs to (a new RTCP SDES item and RTP
   header extension [RFC5285] type carrying the a=mid value).  That
   identification makes it possible to determine unambiguously, also on
   RTP level, which media source an RTP stream belongs to and thus what
   the related simulcast versions are, even though two separate RTP
   streams in the joint RTP session share an RTP payload type.
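
   As a non-normative illustration of the lookup described above, the
   following Python sketch keys simulcast versions on the pair of
   "a=mid" value and RTP payload type.  The table contents are
   transcribed by hand from the "a=mid" and "a=simulcast" lines of
   Fred's Offer (Figure 5).  The names SIMULCAST_SEND,
   mids_using_payload_type and simulcast_version are hypothetical and
   are not defined by this document or by BUNDLE.

```python
# Hypothetical table, hand-transcribed from Fred's Offer (Figure 5).
# Key: a=mid value of a video media description.
# Value: the simulcast versions offered in the send direction; each
# version is a list of alternative RTP payload types.
SIMULCAST_SEND = {
    "bar": [[100], [101], [103, 102]],
    "zen": [[97], [96], [103]],
}


def mids_using_payload_type(payload_type):
    """Return every mid whose simulcast versions use this payload type."""
    return sorted(
        mid
        for mid, versions in SIMULCAST_SEND.items()
        if any(payload_type in alternatives for alternatives in versions)
    )


def simulcast_version(mid, payload_type):
    """Map (mid, payload type) to (mid, version index), or None if unknown."""
    for index, alternatives in enumerate(SIMULCAST_SEND.get(mid, [])):
        if payload_type in alternatives:
            return (mid, index)
    return None
```

   In this sketch, mids_using_payload_type(103) returns both "bar" and
   "zen", which is the ambiguity that the payload type alone cannot
   resolve.  Once the mid carried with the RTP stream is known,
   simulcast_version("bar", 103) and simulcast_version("zen", 103)
   resolve to different media sources.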

   v=0
   o=fred 238947129 823479223 IN IP4 192.0.2.125
   s=Offer from Simulcast Enabled Multi-Source Client
   t=0 0
   c=IN IP4 192.0.2.125
   b=AS:825
   a=group:BUNDLE foo bar zen

   m=audio 49200 RTP/AVP 98 99
   b=AS:145
   a=mid:foo
   a=rtpmap:98 G719/48000/2
   a=rtpmap:99 G722/8000

   m=video 49600 RTP/AVP 100 101 102 103
   b=AS:3500
   a=mid:bar
   a=rtpmap:100 H264-SVC/90000
   a=fmtp:100 profile-level-id=42400d; max-fs=3600; max-mbps=108000; \
              mst-mode=NI-TC
   a=imageattr:100 send [x=1280,y=720] [x=640,y=360] \
                   recv [x=1280,y=720] [x=640,y=360]
   a=rtpmap:101 H264/90000
   a=fmtp:101 profile-level-id=42c00d; max-fs=3600; max-mbps=54000
   a=depend:100 lay bar:101
   a=imageattr:101 send [x=1280,y=720] [x=640,y=360] \
                   recv [x=1280,y=720] [x=640,y=360]
   a=rtpmap:102 H264/90000
   a=fmtp:102 profile-level-id=42c00d; max-fs=900; max-mbps=27000
   a=imageattr:102 send [x=640,y=360] recv [x=640,y=360]
   a=rtpmap:103 VP8/90000
   a=fmtp:103 max-fs=900; max-fr=30
   a=imageattr:103 send [x=640,y=360] recv [x=640,y=360]
   a=rtcp-mid
   a=extmap:1 urn:ietf:params:rtp-hdrext:mid
   a=simulcast sendrecv 100;101 send 103,102
   m=video 49602 RTP/AVP 96 103 97 104 105 106
   b=AS:3500
   a=mid:zen
   a=rtpmap:96 VP8/90000
   a=fmtp:96 max-fs=3600; max-fr=30
   a=rtpmap:104 rtx/90000
   a=fmtp:104 apt=96;rtx-time=200
   a=rtpmap:103 VP8/90000
   a=fmtp:103 max-fs=900; max-fr=30
   a=rtpmap:105 rtx/90000
   a=fmtp:105 apt=103;rtx-time=200
   a=rtpmap:97 VP8/90000
   a=fmtp:97 max-fs=240; max-fr=15
   a=rtpmap:106 rtx/90000
   a=fmtp:106 apt=97;rtx-time=200
   a=rtcp-mid
   a=extmap:1 urn:ietf:params:rtp-hdrext:mid
   a=simulcast send 97;96;103

            Figure 5: Fred's Multi-Source Simulcast Offer

   Note: Empty lines in the SDP above are added only for readability
   and would not be present in an actual SDP.

7.  Network Aspects

   Simulcast is in this memo defined as the act of sending multiple
   alternative encoded streams of the same underlying media source.
   Transmitting multiple independent streams that originate from the
   same source could potentially be done in several different ways
   using RTP.  A general discussion on considerations for use of the
   different RTP multiplexing alternatives can be found in Guidelines
   for Multiplexing in RTP [I-D.ietf-avtcore-multiplex-guidelines].
   Discussion and clarification on how to handle multiple streams in an
   RTP session can be found in [I-D.ietf-avtcore-rtp-multi-stream].

   The network aspects that are relevant for simulcast are:

   Quality of Service:  When using simulcast it might be of interest to
      prioritize a particular simulcast version, rather than applying
      equal treatment to all versions.  For example, lower bit-rate
      versions may be prioritized over higher bit-rate versions to
      minimize congestion or packet losses in the low bit-rate
      versions.  Thus, it is beneficial to use a simulcast solution
      that supports QoS as well as possible.  By separating simulcast
      versions into different RTP sessions and sending those RTP
      sessions over different media transports, a simulcast version can
      be prioritized by existing flow-based QoS mechanisms.  When using
      unicast, QoS mechanisms based on individual packet marking are
      also feasible, which do not require separation of simulcast
      versions into different RTP sessions to apply different QoS.  The
      proposed solution does not support this functionality.

   NAT/FW Traversal:  Using multiple RTP sessions will incur more cost
      for NAT/FW traversal unless they can re-use the same transport
      flow, which can be achieved either by multiplexing multiple RTP
      sessions on a single lower-layer transport
      [I-D.westerlund-avtcore-transport-multiplexing] or by
      Multiplexing Negotiation Using SDP Port Numbers
      [I-D.ietf-mmusic-sdp-bundle-negotiation].  If flow-based QoS with
      any differentiation is desired, the cost of additional transport
      flows is likely unavoidable.

   Multicast:  Multiple RTP sessions will be required to enable
      combining simulcast with multicast.  Different simulcast versions
      have to be separated to different multicast groups to allow a
      multicast receiver to pick the version it wants, rather than
      receive all of them.  In this case, the only reasonable
      implementation is to use different RTP sessions for each
      multicast group so that reporting and other RTCP functions
      operate as intended.  The proposed solution does not support this
      functionality.

8.  IANA Considerations

   This document requests registration of a new SDP attribute,
   "simulcast".

   Formal registrations to be written.

9.  Security Considerations

   The simulcast capability and configuration attributes and parameters
   are vulnerable to attacks in signaling.

   A false inclusion of the "a=simulcast" attribute may result in
   simultaneous transmission of multiple RTP streams that would
   otherwise not be generated.  The impact is limited by the media
   description joint bandwidth, shared by all simulcast versions
   irrespective of their number.  There may however be a large number
   of unwanted RTP streams that will impact the share of the bandwidth
   allocated for the originally wanted RTP stream.

   A hostile removal of the "a=simulcast" attribute will result in
   simulcast not being used.

   Neither of the above is likely to have major consequences, and both
   can be mitigated by signaling that is at least integrity protected
   and source authenticated, which prevents an attacker from changing
   it.

10.  Contributors

   Morgan Lindqvist and Fredrik Jansson, both from Ericsson,
   contributed important material to the first versions of this
   document.  Mo Zanaty and Robert Hansen, both from Cisco, contributed
   significantly to subsequent versions.

11.  Acknowledgements

12.  References

12.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, July 2006.

   [RFC5109]  Li, A., "RTP Payload Format for Generic Forward Error
              Correction", RFC 5109, December 2007.

   [RFC5234]  Crocker, D. and P. Overell, "Augmented BNF for Syntax
              Specifications: ABNF", STD 68, RFC 5234, January 2008.

   [RFC7104]  Begen, A., Cai, Y., and H. Ou, "Duplication Grouping
              Semantics in the Session Description Protocol", RFC 7104,
              January 2014.

12.2.  Informative References

   [I-D.ietf-avtcore-multiplex-guidelines]
              Westerlund, M., Perkins, C., and H. Alvestrand,
              "Guidelines for using the Multiplexing Features of RTP to
              Support Multiple Media Streams",
              draft-ietf-avtcore-multiplex-guidelines-02 (work in
              progress), January 2014.

   [I-D.ietf-avtcore-rtp-multi-stream]
              Lennox, J., Westerlund, M., Wu, W., and C. Perkins,
              "Sending Multiple Media Streams in a Single RTP Session",
              draft-ietf-avtcore-rtp-multi-stream-04 (work in
              progress), May 2014.

   [I-D.ietf-avtcore-rtp-topologies-update]
              Westerlund, M. and S. Wenger, "RTP Topologies",
              draft-ietf-avtcore-rtp-topologies-update-02 (work in
              progress), May 2014.

   [I-D.ietf-avtext-rtp-grouping-taxonomy]
              Lennox, J., Gross, K., Nandakumar, S., and G. Salgueiro,
              "A Taxonomy of Grouping Semantics and Mechanisms for
              Real-Time Transport Protocol (RTP) Sources",
              draft-ietf-avtext-rtp-grouping-taxonomy-01 (work in
              progress), February 2014.

   [I-D.ietf-avtext-rtp-stream-pause]
              Akram, A., Even, R., and M. Westerlund, "RTP Media Stream
              Pause and Resume", draft-ietf-avtext-rtp-stream-pause-00
              (work in progress), May 2014.

   [I-D.ietf-mmusic-sdp-bundle-negotiation]
              Holmberg, C., Alvestrand, H., and C. Jennings,
              "Negotiating Media Multiplexing Using the Session
              Description Protocol (SDP)",
              draft-ietf-mmusic-sdp-bundle-negotiation-07 (work in
              progress), April 2014.

   [I-D.ietf-payload-vp8]
              Westin, P., Lundin, H., Glover, M., Uberti, J., and F.
              Galligan, "RTP Payload Format for VP8 Video",
              draft-ietf-payload-vp8-11 (work in progress), February
              2014.

   [I-D.roach-mmusic-unified-plan]
              Roach, A., Uberti, J., and M. Thomson, "A Unified Plan
              for Using SDP with Large Numbers of Media Flows",
              draft-roach-mmusic-unified-plan-00 (work in progress),
              July 2013.

   [I-D.westerlund-avtcore-transport-multiplexing]
              Westerlund, M. and C. Perkins, "Multiplexing Multiple RTP
              Sessions onto a Single Lower-Layer Transport",
              draft-westerlund-avtcore-transport-multiplexing-07 (work
              in progress), October 2013.

   [RFC2198]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
              Handley, M., Bolot, J., Vega-Garcia, A., and S.
              Fosse-Parisis, "RTP Payload for Redundant Audio Data",
              RFC 2198, September 1997.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264, June
              2002.

   [RFC3389]  Zopf, R., "Real-time Transport Protocol (RTP) Payload for
              Comfort Noise (CN)", RFC 3389, September 2002.

   [RFC4588]  Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
              Hakenberg, "RTP Retransmission Payload Format", RFC 4588,
              July 2006.

   [RFC4733]  Schulzrinne, H. and T. Taylor, "RTP Payload for DTMF
              Digits, Telephony Tones, and Telephony Signals", RFC
              4733, December 2006.

   [RFC5117]  Westerlund, M. and S. Wenger, "RTP Topologies", RFC 5117,
              January 2008.

   [RFC5285]  Singer, D. and H. Desineni, "A General Mechanism for RTP
              Header Extensions", RFC 5285, July 2008.

   [RFC5576]  Lennox, J., Ott, J., and T. Schierl, "Source-Specific
              Media Attributes in the Session Description Protocol
              (SDP)", RFC 5576, June 2009.

   [RFC5583]  Schierl, T. and S. Wenger, "Signaling Media Decoding
              Dependency in the Session Description Protocol (SDP)",
              RFC 5583, July 2009.

   [RFC6184]  Wang, Y., Even, R., Kristensen, T., and R. Jesup, "RTP
              Payload Format for H.264 Video", RFC 6184, May 2011.

   [RFC6190]  Wenger, S., Wang, Y., Schierl, T., and A. Eleftheriadis,
              "RTP Payload Format for Scalable Video Coding", RFC 6190,
              May 2011.

   [RFC6236]  Johansson, I. and K. Jung, "Negotiation of Generic Image
              Attributes in the Session Description Protocol (SDP)",
              RFC 6236, May 2011.

Authors' Addresses

   Magnus Westerlund
   Ericsson
   Farogatan 6
   SE-164 80 Kista
   Sweden

   Phone: +46 10 714 82 87
   Email: magnus.westerlund@ericsson.com

   Bo Burman
   Ericsson
   Farogatan 6
   SE-164 80 Kista
   Sweden

   Phone: +46 10 714 13 11
   Email: bo.burman@ericsson.com

   Suhas Nandakumar
   Cisco
   170 West Tasman Drive
   San Jose, CA 95134
   USA

   Email: snandaku@cisco.com