idnits 2.17.1

draft-ietf-avtcore-multi-media-rtp-session-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

     (Using the creation date from RFC3550, updated by this document, for
     RFC5378 checks: 1998-04-07)

     (Using the creation date from RFC3551, updated by this document, for
     RFC5378 checks: 1997-03-27)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (March 9, 2015) is 3336 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFCXXXX' is mentioned on line 457, but not defined

  == Outdated reference: A later version (-11) exists of
     draft-ietf-avtcore-rtp-multi-stream-06

  == Outdated reference: A later version (-54) exists of
     draft-ietf-mmusic-sdp-bundle-negotiation-17

  == Outdated reference: A later version (-12) exists of
     draft-ietf-avtcore-multiplex-guidelines-03

  == Outdated reference: A later version (-10) exists of
     draft-ietf-avtcore-rtp-topologies-update-06

  == Outdated reference: A later version (-08) exists of
     draft-ietf-avtext-rtp-grouping-taxonomy-06

  -- Obsolete informational reference (is this intentional?): RFC 2733
     (Obsoleted by RFC 5109)

  -- Obsolete informational reference (is this intentional?): RFC 4566
     (Obsoleted by RFC 8866)

     Summary: 0 errors (**), 0 flaws (~~), 7 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2 AVTCORE WG                                               M. Westerlund
3 Internet-Draft                                                Ericsson
4 Updates: 3550, 3551 (if approved)                           C. Perkins
5 Intended status: Standards Track                 University of Glasgow
6 Expires: September 10, 2015                                  J. Lennox
7                                                                  Vidyo
8                                                          March 9, 2015

10 Sending Multiple Types of Media in a Single RTP Session
11 draft-ietf-avtcore-multi-media-rtp-session-07

13 Abstract

15 This document specifies how an RTP session can contain RTP Streams
16 with media from multiple media types such as audio, video, and text.
17 This has been restricted by the RTP Specification, and thus this
18 document updates RFC 3550 and RFC 3551 to enable this behaviour for
19 applications that satisfy the applicability for using multiple media
20 types in a single RTP session.

22 Status of This Memo

24 This Internet-Draft is submitted in full conformance with the
25 provisions of BCP 78 and BCP 79.

27 Internet-Drafts are working documents of the Internet Engineering
28 Task Force (IETF). Note that other groups may also distribute
29 working documents as Internet-Drafts. The list of current Internet-
30 Drafts is at http://datatracker.ietf.org/drafts/current/.

32 Internet-Drafts are draft documents valid for a maximum of six months
33 and may be updated, replaced, or obsoleted by other documents at any
34 time. It is inappropriate to use Internet-Drafts as reference
35 material or to cite them other than as "work in progress."

37 This Internet-Draft will expire on September 10, 2015.

39 Copyright Notice

41 Copyright (c) 2015 IETF Trust and the persons identified as the
42 document authors. All rights reserved.

44 This document is subject to BCP 78 and the IETF Trust's Legal
45 Provisions Relating to IETF Documents
46 (http://trustee.ietf.org/license-info) in effect on the date of
47 publication of this document. Please review these documents
48 carefully, as they describe your rights and restrictions with respect
49 to this document. Code Components extracted from this document must
50 include Simplified BSD License text as described in Section 4.e of
51 the Trust Legal Provisions and are provided without warranty as
52 described in the Simplified BSD License.

54 Table of Contents

56 1. Introduction
57 2. Definitions
58 3. Motivation
59 4. Overview of Solution
60 5. Applicability
61   5.1. Usage of the RTP session
62   5.2. Signalled Support
63   5.3. Homogeneous Multi-party
64   5.4. Reduced number of Payload Types
65   5.5. Stream Differentiation
66   5.6. Non-compatible Extensions
67 6. RTP Session Specification
68   6.1. RTP Session
69   6.2. Sender Source Restrictions
70   6.3. Payload Type Applicability
71   6.4. RTCP Considerations
72 7. Extension Considerations
73   7.1. RTP Retransmission
74   7.2. Generic FEC
75 8. Signalling
76   8.1. SDP-Based Signalling
77 9. IANA Considerations
78 10. Security Considerations
79 11. Acknowledgements
80 12. References
81   12.1. Normative References
82   12.2. Informative References
83 Authors' Addresses

85 1. Introduction

87 When the Real-time Transport Protocol (RTP) [RFC3550] was designed,
88 close to 20 years ago, IP networks were different to those deployed
89 at the time of this writing. The virtually ubiquitous deployment of
90 Network Address Translators (NAT) and Firewalls has since increased
91 the cost and likelihood of communication failure when using many
92 different transport flows. Hence, there is pressure to reduce the
93 number of concurrent transport flows used by RTP applications.

95 The RTP specification recommends against sending several different
96 types of media, for example audio and video, in a single RTP session.

98 The RTP profile for Audio and Video Conferences with Minimal Control
99 (RTP/AVP) [RFC3551] mandates a similar restriction. The motivation
100 for these limitations is partly to allow lower layer Quality of
101 Service (QoS) mechanisms to be used, and partly due to limitations of
102 the RTCP timing rules that assume all media in a session to have
103 similar bandwidth. The Session Description Protocol (SDP) [RFC4566]
104 is one of the dominant signalling methods for establishing RTP
105 sessions, and has enforced this rule by not allowing multiple media
106 types for a given destination or set of ICE candidates.

108 The fact that these limitations have been in place for so long, in
109 addition to RFC 3550 being written without fully considering the use
110 of multiple media types in an RTP session, results in a number of
111 issues when allowing this behaviour. This memo updates [RFC3550] and
112 [RFC3551] with important considerations regarding applicability and
113 functionality when using multiple types of media in an RTP session,
114 including normative specification of behaviour. This memo makes no
115 changes to RTP behaviour when using multiple RTP streams with media
116 of the same type (e.g., multiple audio streams or multiple video
117 streams) in a single RTP session. Instead it relies on the
118 clarifications in [I-D.ietf-avtcore-rtp-multi-stream].

120 This memo is structured as follows. First, some basic definitions
121 are provided. This is followed by a background that discusses the
122 motivation in more detail. An overview of the solution of how to
123 provide multiple media types in one RTP session is then presented.
124 Next is the formal applicability of this specification, followed by
125 the normative specification.
This is followed by a discussion of how
126 some RTP/RTCP Extensions are expected to function in the case of
127 multiple media types in one RTP session. Then comes a specification
128 of the signalling requirements arising from this specification and a
129 look at how they are realized in SDP using BUNDLE
130 [I-D.ietf-mmusic-sdp-bundle-negotiation]. The memo ends with the
131 security considerations.

133 2. Definitions

135 Media Type: The general type of media data used by a real-time
136 application. The media type corresponds to the value used in the
137 <media> field of an SDP m= line. The media types defined at the
138 time of this writing are "audio", "video", "text", "application",
139 and "message".

141 Quality of Service (QoS): Network mechanisms that are intended to
142 ensure that the packets within a flow or with a specific marking
143 are transported with certain properties.

145 The terms Encoded Stream, Endpoint, Media Source, RTP Session, and
146 RTP Stream are used as defined in
147 [I-D.ietf-avtext-rtp-grouping-taxonomy].

149 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
150 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
151 document are to be interpreted as described in [RFC2119].

153 3. Motivation

155 The existence of NATs and Firewalls at almost all points of Internet
156 access has had implications on protocols like RTP that were designed
157 to use multiple transport flows. First of all, the NAT/FW traversal
158 solution needs to ensure that all these transport flows are
159 established. This has three consequences:

161 1. Increased delay to perform the transport flow establishment.

163 2. The more transport flows, the more state and the more resource
164 consumption in the NAT and Firewalls. When the resource
165 consumption in NAT/FWs reaches its limits, unexpected
166 behaviours usually occur.

168 3. More transport flows mean a higher risk that some transport flow
169 fails to be established, thus preventing the application from
170 communicating.

172 Using fewer transport flows reduces the risk of communication
173 failure, improves establishment behaviour, and places less load on
174 NATs and Firewalls.

176 Furthermore, we note that many RTP-using applications don't utilize
177 any network level Quality of Service (QoS) functions. Nor do they
178 expect or desire any separation in network treatment of their media
179 packets, independent of whether they are audio, video or text. When
180 an application has no such desire, it doesn't need to provide a
181 transport flow structure that simplifies flow based QoS.

183 For applications that don't require different lower-layer QoS for
184 different media types, and that have no special requirements for RTP
185 extensions or RTCP reporting, the requirement to separate different
186 media into different RTP sessions might seem unnecessary. Provided
187 the application accepts that all media flows will get similar RTCP
188 reporting, using the same RTP session for several types of media at
189 once appears a reasonable choice. The architecture ought to be
190 agnostic about the type of media being carried in an RTP session to
191 the extent possible given the constraints of the protocol.

193 4. Overview of Solution

195 The goal of the solution is to enable each RTP session to contain
196 more than just one media type. This includes having multiple RTP
197 sessions containing a given media type, for example having three
198 sessions containing both video and audio.

200 The solution is quite straightforward. The first step is to override
201 the SHOULD and SHOULD NOT language of the RTP specification
202 [RFC3550]. A similar change is needed to a sentence in Section 6 of
203 [RFC3551] that states that "different media types SHALL NOT be
204 interleaved or multiplexed within a single RTP Session".
This is
205 resolved by appropriate exception clauses, provided that this
206 specification and its applicability are followed.

208 Within an RTP session where multiple media types have been configured
209 for use, an SSRC can only send one type of media during its lifetime
210 (i.e., it can switch between different audio codecs, since those are
211 both the same type of media, but cannot switch between audio and
212 video). Different SSRCs MUST be used for the different media
213 sources, in the same way that multiple media sources of the same
214 media type already have to. The payload type will inform a receiver
215 which media type the SSRC is being used for. Thus the payload type
216 MUST be unique across all of the payload configurations, independent
217 of media type, that are used in the RTP session.

219 A few extra considerations within the RTP sessions also need to
220 be taken into account. RTCP bandwidth and regular reporting suppression
221 (RTP/AVPF and RTP/SAVPF) SHOULD be configured to reduce the impact
222 of bit-rate variations between RTP streams and media types. It is
223 also clarified how timeout calculations are to be done to avoid any
224 issues. Certain payload types like FEC also need additional rules.

226 The final important part of the solution is to use signalling to
227 ensure that agreement exists on using multiple media types in an RTP
228 session, and on how that is then configured. This memo describes
229 some existing requirements, while an external reference defines how
230 this is accomplished in SDP.

232 5. Applicability

234 This specification has limited applicability, and anyone intending to
235 use it needs to ensure that their application and usage meet the
236 criteria below.

238 5.1. Usage of the RTP session

240 Before choosing to use this specification, an application implementer
241 needs to ensure that they don't have a need for different RTP
242 sessions between the media types for some reason.
The main rule is
243 that if one expects to have equal treatment of all media packets,
244 then this specification might be suitable. The equal treatment
245 includes everything from network level up to RTCP reporting and
246 feedback. The document Guidelines for using the Multiplexing
247 Features of RTP [I-D.ietf-avtcore-multiplex-guidelines] gives more
248 detailed guidance on aspects to consider when choosing how to use RTP
249 and specifically RTP sessions.

251 There is some work in progress
252 [I-D.westerlund-avtcore-transport-multiplexing] that attempts to
253 provide a solution for RTP-using applications that need or would
254 prefer multiple RTP sessions, but do not require the
255 functionalities or behaviours that multiple transport flows give.

257 The second important consideration is the resulting behaviour when
258 media flows to be sent within a single RTP session do not have
259 similar RTCP requirements. There are limitations in the RTCP timing
260 rules, and this implies a common RTCP reporting interval across all
261 participants in a session. If an RTP session contains flows with
262 very different RTCP requirements, for example due to differences in
263 RTP stream bandwidth consumption and packet rate (such as low-rate
264 audio coupled with high-quality video), this can result in either
265 excessive or insufficient RTCP for some flows, depending on how the
266 RTCP session bandwidth, and hence reporting interval, is configured.
267 This is discussed further in Section 6.4.

269 5.2. Signalled Support

271 Usage of this specification is not compatible with anyone following
272 RFC 3550 and intending to have different RTP sessions for each media
273 type. Therefore there needs to be mutual agreement to use multiple
274 media types in one RTP session by all participants within that RTP
275 session. This agreement has to be determined using signalling in
276 most cases.

278 This requirement can be a problem for signalling solutions that can't
279 negotiate with all participants. For declarative signalling
280 solutions, mandating that the session is using multiple media types
281 in one RTP session can be a way of attempting to ensure that all
282 participants in the RTP session follow the requirement. However, for
283 signalling solutions that lack methods for enforcing that a receiver
284 supports a specific feature, this can still cause issues.

286 5.3. Homogeneous Multi-party

288 In multiparty communication scenarios it is important to separate two
289 different cases. One case is where the RTP session contains multiple
290 participants in a common RTP session. This occurs for example in Any
291 Source Multicast (ASM) and Relay (Transport Translator) topologies as
292 defined in RTP Topologies [I-D.ietf-avtcore-rtp-topologies-update].
293 It can also occur in some implementations of RTP mixers that share
294 the same SSRC/CSRC space across all participants. The second case is
295 when the RTP session is terminated in a middlebox and the other
296 participants' sources are projected or switched into each RTP session
297 and rewritten on RTP header level including SSRC mappings.

299 For the first case, with a common RTP session or at least shared
300 SSRC/CSRC values, all participants in multiparty communication are
301 REQUIRED to support multiple media types in an RTP session. A
302 participant using two or more RTP sessions towards a multiparty
303 session can't be collapsed into a single session with multiple media
304 types. The reason is that in case of multiple RTP sessions, the same
305 SSRC value can be used in both RTP sessions without any issues, but
306 when collapsed to a single session there is an SSRC collision. In
307 addition some collisions can't be represented in the multiple
308 separate RTP sessions.
For example, in a session with audio and
309 video, an SSRC value used for video will not show up in the Audio RTP
310 session at the participant using multiple RTP sessions, and thus not
311 trigger any collision handling. Thus any application using this type
312 of RTP session structure MUST have homogeneous support for multiple
313 media types in one RTP session, or be forced to insert a translator
314 node between that participant and the rest of the RTP session.

316 For the second case of separate RTP sessions for each multiparty
317 participant and a central node it is possible to have a mix of single
318 RTP session users and multiple RTP session users as long as one is
319 willing to remap the SSRCs used by a participant with multiple RTP
320 sessions into non-used values in the single RTP session SSRC space
321 for each of the participants using a single RTP session with multiple
322 media types. It can be noted that this type of implementation has to
323 understand all types of RTP/RTCP extensions being used in the RTP
324 sessions to correctly be able to translate them between the RTP
325 sessions. It might also suffer issues due to differences in
326 configured RTCP bandwidth and other parameters between the RTP
327 sessions. It can also negatively impact the possibility for loop
328 detection, as SSRC/CSRC can't be used to detect the loops; instead,
329 some other RTP stream or media source identity name space that is
330 common across all interconnected parts is needed.

332 5.4. Reduced number of Payload Types

334 An RTP session with multiple media types in it has only a single
335 7-bit Payload Type range for all its payload types. Within the 128
336 available values (only 96 or fewer if "Multiplexing RTP Data and
337 Control Packets on a Single Port" [RFC5761] is used), all the
338 different RTP payload configurations for all the media types need to
339 fit in the available space.
For most applications this will not be a
340 real problem, but the limitation exists and could be encountered.

342 5.5. Stream Differentiation

344 If network level differentiation of the RTP streams with different
345 media types is desired, using this specification can cause severe
346 limitations. All RTP streams in an RTP session, independent of the
347 media type, will be sent over the same underlying transport flow.
348 Any flow-based Quality of Service (QoS) mechanism will be unable to
349 provide differentiated treatment between different media types, e.g.
350 to prioritize audio over video. If differentiated treatment is
351 desired using flow-based QoS, separate RTP sessions over different
352 underlying transport flows need to be used.

354 Marking-based QoS schemes like DiffServ can be affected if it is the
355 network ingress that performs the marking, based on flows. Endpoint
356 marking where the network API supports marking on individual packet
357 level will be unaffected by this specification. However, there exist
358 limitations, as discussed in [I-D.ietf-dart-dscp-rtp], on how
359 different traffic classes can be applied on different packets or RTP
360 streams within a single transport flow.

362 5.6. Non-compatible Extensions

364 There exist some RTP and RTCP extensions that rely on the existence
365 of multiple RTP sessions. If the goal of using an RTP session with
366 multiple media types is to have only a single RTP session, then these
367 extensions can't be used. If one has no need to have different RTP
368 sessions for the media types but is willing to have multiple RTP
369 sessions, one for the main media transmission and one for the
370 extension, they can be used. It is to be noted that this assumes
371 that it is possible to get the extension working when the related RTP
372 session contains multiple media types.

374 Identified RTP/RTCP extensions that require multiple RTP Sessions
375 are:

377 RTP Retransmission: RTP Retransmission [RFC4588] has a session
378 multiplexed mode. It also has an SSRC multiplexed mode that can be
379 used instead. So use the mode that is suitable for the RTP
380 application.

382 XOR-Based FEC: The RTP Payload Format for Generic Forward Error
383 Correction [RFC5109] and its predecessor [RFC2733] require a
384 separate RTP session unless the FEC data is carried in RTP Payload
385 for Redundant Audio Data [RFC2198]. However, using the Generic
386 FEC with the Redundancy payload has another set of restrictions,
387 see Section 7.2.

389 Note that the Source-Specific Media Attributes [RFC5576]
390 specification defines an SDP syntax (the "FEC" semantic of the
391 "ssrc-group" attribute) to signal FEC relationships between
392 multiple RTP streams within a single RTP session. However, this
393 can't be used as the FEC repair packets need to have the same SSRC
394 value as the source packets being protected. [RFC5576] does not
395 normatively update and resolve that restriction. There is ongoing
396 work on a ULP extension to allow the use of FEC RTP streams within
397 the same RTP Session as the source stream
398 [I-D.lennox-payload-ulp-ssrc-mux].

400 6. RTP Session Specification

402 This section defines what needs to be done or avoided to make an RTP
403 session with multiple media types function without issues.

405 6.1. RTP Session

407 Section 5.2 of "RTP: A Transport Protocol for Real-Time Applications"
408 [RFC3550] states:

410 For example, in a teleconference composed of audio and video media
411 encoded separately, each medium SHOULD be carried in a separate
412 RTP session with its own destination transport address.

414 Separate audio and video streams SHOULD NOT be carried in a single
415 RTP session and demultiplexed based on the payload type or SSRC
416 fields.

418 This specification changes both of these sentences.
The first
419 sentence is changed to:

421 For example, in a teleconference composed of audio and video media
422 encoded separately, each medium SHOULD be carried in a separate
423 RTP session with its own destination transport address, unless
424 specification [RFCXXXX] is followed and the application meets the
425 applicability constraints.

427 The second sentence is changed to:

429 Separate audio and video media sources SHOULD NOT be carried in a
430 single RTP session and demultiplexed based on the payload type or
431 SSRC fields, unless multiplexed based on both SSRC and payload
432 type and usage meets what Multiple Media Types in an RTP Session
433 [RFCXXXX] specifies.

435 The second paragraph of Section 6 in RTP Profile for Audio and Video
436 Conferences with Minimal Control [RFC3551] says:

438 The payload types currently defined in this profile are assigned
439 to exactly one of three categories or media types: audio only,
440 video only and those combining audio and video. The media types
441 are marked in Tables 4 and 5 as "A", "V" and "AV", respectively.
442 Payload types of different media types SHALL NOT be interleaved or
443 multiplexed within a single RTP session, but multiple RTP sessions
444 MAY be used in parallel to send multiple media types. An RTP
445 source MAY change payload types within the same media type during
446 a session. See the section "Multiplexing RTP Sessions" of RFC
447 3550 for additional explanation.

449 The purpose of this specification is to override that existing SHALL
450 NOT under certain conditions. Thus this sentence also has to be
451 changed to allow payload types of multiple media types in the same
452 session. The above sentence is changed to:

454 Payload types of different media types SHALL NOT be interleaved or
455 multiplexed within a single RTP session unless as specified and
456 under the restriction in Multiple Media Types in an RTP Session
457 [RFCXXXX].
Multiple RTP sessions MAY be used in parallel to send
458 multiple media types.

460 RFC-Editor Note: Please replace RFCXXXX with the RFC number of this
461 specification when assigned.

463 We can now go on to discuss the five bullets that motivate the
464 previous text in Section 5.2 of the RTP Specification [RFC3550]. They
465 are repeated here for the reader's convenience:

467 1. If, say, two audio streams shared the same RTP session and the
468 same SSRC value, and one were to change encodings and thus
469 acquire a different RTP payload type, there would be no general
470 way of identifying which stream had changed encodings.

472 2. An SSRC is defined to identify a single timing and sequence
473 number space. Interleaving multiple payload types would require
474 different timing spaces if the media clock rates differ and would
475 require different sequence number spaces to tell which payload
476 type suffered packet loss.

478 3. The RTCP sender and receiver reports (see Section 6.4 of RFC
479 3550) can only describe one timing and sequence number space per
480 SSRC and do not carry a payload type field.

482 4. An RTP mixer would not be able to combine interleaved streams of
483 incompatible media into one stream.

485 5. Carrying multiple media in one RTP session precludes: the use of
486 different network paths or network resource allocations if
487 appropriate; reception of a subset of the media if desired, for
488 example just audio if video would exceed the available bandwidth;
489 and receiver implementations that use separate processes for the
490 different media, whereas using separate RTP sessions permits
491 either single- or multiple-process implementations.

493 Bullets 1 to 3 all relate to the fact that each media source has to
494 use one or more unique SSRCs to avoid these issues, as mandated below
495 (Section 6.2).
Bullet 4 can be answered by two arguments. First,
496 each SSRC will be associated with a specific media type, communicated
497 through the RTP payload type, allowing a middlebox to do media type
498 specific operations. The second argument is that in many contexts
499 blind combining without additional context is not suitable anyway.
500 Regarding bullet 5, this is an understood and explicitly stated
501 applicability limitation of the method described in this document.

503 6.2. Sender Source Restrictions

505 An SSRC in the RTP session MUST only send one media type (audio,
506 video, text etc.) during the SSRC's lifetime. The main motivation is
507 that a given SSRC has its own RTP timestamp and sequence number
508 spaces. The same way that you can't send two encoded streams of
509 audio on the same SSRC, you can't send one encoded audio and one
510 encoded video stream on the same SSRC. Each encoded stream when made
511 into an RTP stream needs to have the sole control over the sequence
512 number and timestamp space. If not, one would not be able to detect
513 packet loss for that particular encoded stream. Nor can one easily
514 determine at which clock rate a particular SSRC's timestamp will
515 increase. For additional arguments why RTP payload type based
516 multiplexing of multiple media sources doesn't work see
517 [I-D.ietf-avtcore-multiplex-guidelines].

519 6.3. Payload Type Applicability

521 Most Payload Types have a native media type; for example, an audio
522 codec naturally belongs to the audio media type. However, there exist
523 a number of RTP payload types that don't have a native media type. For
524 example, transport robustness mechanisms like RTP Retransmission
525 [RFC4588] and Generic FEC [RFC5109] inherit their media type from
526 what they protect. RTP Retransmission is explicitly bound to the
527 payload type it is protecting, and thus will inherit its media type.
However,
528 Generic FEC is an excellent example of an RTP payload type that has no
529 natural media type. The media type for what it protects is not
530 relevant as it is the recovered RTP packets that have a particular
531 media type, and thus Generic FEC is best categorized as an
532 application media type.

534 The above discussion is relevant to what limitations exist for RTP
535 payload type usage within an RTP session that has multiple media
536 types. In fact, this document (Section 7.2) suggests that Generic
537 FEC (XOR-based) as defined in RFC 5109 can actually use a
538 single media type when used with independent RTP sessions for source
539 and repair data.

541 Note that a particular SSRC carrying Generic FEC will clearly only
542 protect a specific SSRC and thus that instance is bound to the
543 SSRC's media type. For this specific case, it is possible to have
544 one be applicable to both. However, in cases when the signalling
545 is set up to enable fallback to using separate RTP sessions, then
546 using a different media type, e.g. application, than the media
547 being protected can create issues.

549 6.4. RTCP Considerations

551 Guidelines for handling RTCP when sending multiple RTP streams with
552 disparate rates in a single RTP session are outlined in
553 [I-D.ietf-avtcore-rtp-multi-stream]. These guidelines apply when
554 sending multiple types of media in a single RTP session if the
555 different types of media have different rates.

557 7. Extension Considerations

559 This section discusses the impact on some RTP/RTCP extensions due to
560 usage of multiple media types in one RTP session. Only extensions
561 where there is something worth noting have been included.

563 7.1. RTP Retransmission

565 SSRC-multiplexed RTP retransmission [RFC4588] is actually very
566 straightforward. Each retransmission RTP payload type is explicitly
567 connected to an associated payload type.
   If retransmission is only to be used with a subset of all payload
   types, this is not a problem, as it will be evident from the
   retransmission payload types which payload types have retransmission
   enabled for them.

   Session-multiplexed RTP retransmission is also possible to use,
   where a retransmission session contains the retransmissions of the
   associated payload types in the source RTP session.  The only
   difference from the previous case is if the source RTP session is
   one which contains multiple media types.  This results in the
   retransmission streams in the RTP session for the retransmission
   having multiple associated media types.

   When using SDP signalling for a multiple media type RTP session,
   i.e., BUNDLE [I-D.ietf-mmusic-sdp-bundle-negotiation], the
   session-multiplexed case does require some recommendations on how to
   signal this.  To avoid breaking the semantics of the FID grouping as
   defined by [RFC5888], each media line can only be included in one
   FID group.  FID is used by RTP retransmission to indicate the SDP
   media lines that are a source and retransmission pair.  Thus, for
   SDP using BUNDLE, each original media source (m= line) that is
   retransmitted needs a corresponding media line in the retransmission
   RTP session.  In case there are multiple media lines for
   retransmission, these media lines will form an independent BUNDLE
   group, separate from the BUNDLE group with the source streams.

   Below is an SDP example (Figure 1) which shows the grouping
   structures.  This example is not legal SDP, and only the most
   important attributes have been left in place.  Note that this SDP is
   not an initial BUNDLE offer.  As can be seen, there are two BUNDLE
   groups, one for the source RTP session and one for the
   retransmissions.  Then each of the media sources is grouped with its
   retransmission flow using FID, resulting in three more groupings.

      a=group:BUNDLE foo bar fiz
      a=group:BUNDLE zoo kelp glo
      a=group:FID foo zoo
      a=group:FID bar kelp
      a=group:FID fiz glo
      m=audio 10000 RTP/AVP 0
      a=mid:foo
      a=rtpmap:0 PCMU/8000
      m=video 10000 RTP/AVP 31
      a=mid:bar
      a=rtpmap:31 H261/90000
      m=video 10000 RTP/AVP 31
      a=mid:fiz
      a=rtpmap:31 H261/90000
      m=audio 40000 RTP/AVPF 99
      a=rtpmap:99 rtx/8000
      a=fmtp:99 apt=0;rtx-time=3000
      a=mid:zoo
      m=video 40000 RTP/AVPF 100
      a=rtpmap:100 rtx/90000
      a=fmtp:100 apt=31;rtx-time=3000
      a=mid:kelp
      m=video 40000 RTP/AVPF 100
      a=rtpmap:100 rtx/90000
      a=fmtp:100 apt=31;rtx-time=3000
      a=mid:glo

     Figure 1: SDP example of Session Multiplexed RTP Retransmission

7.2.  Generic FEC

   The RTP Payload Format for Generic Forward Error Correction
   [RFC5109], and also its predecessor [RFC2733], requires some
   considerations, and they differ depending on which configuration is
   used.

   Independent RTP Sessions, i.e., where source and repair data are
   sent in different RTP sessions:
      As this mode of configuration requires different RTP sessions,
      there has to be at least one RTP session for source data; this
      session can be one using multiple media types.  The repair
      session only needs one RTP payload type indicating repair data,
      i.e., x/ulpfec or x/parityfec depending on whether RFC 5109 or
      RFC 2733 is used.  The media type in this session is not relevant
      and can in theory be any of the defined ones.  It is RECOMMENDED
      that one uses "Application".

      If one uses SDP signalling with BUNDLE
      [I-D.ietf-mmusic-sdp-bundle-negotiation], then the RTP session
      carrying the FEC streams will be its own BUNDLE group.  The media
      line with the source stream for the FEC and the FEC stream's
      media line will be grouped using the FEC or FEC-FR [RFC5956]
      grouping.
      This is very similar to the situation that arises for RTP
      retransmission with session multiplexing, discussed above in
      Section 7.1.

   In-stream, i.e., using the RTP Payload for Redundant Audio Data
   [RFC2198] to combine repair and source data in the same packets:
      This is possible to use within a single RTP session.  However,
      the usage and configuration of the payload types can create an
      issue.  First, it might be necessary to have one payload type per
      media type for the FEC repair data payload format, i.e., one for
      audio/ulpfec and one for text/ulpfec if audio and text are
      combined in an RTP session.  Second, each combination of source
      payload and its FEC repair data has to be an explicitly
      configured payload type.  This has the potential to turn the
      limited number of available RTP payload types into a real issue.

8.  Signalling

   Establishing an RTP session with multiple media types requires
   signalling.  This signalling needs to fulfil the following
   requirements:

   1.  Ensure that any participant in the RTP session is aware that
       this is an RTP session with multiple media types.

   2.  Ensure that the payload types in use in the RTP session are
       using unique values, with no overlap between the media types.

   3.  Configure the RTP session level parameters, such as RTCP RR and
       RS bandwidth, AVPF trr-int, underlying transport, the RTCP
       extensions in use, and security parameters, commonly for the RTP
       session.

   4.  RTP and RTCP functions that can be bound to a particular media
       type SHOULD be reused when possible also for other media types,
       instead of having to be configured for multiple code-points.
       Note: In some cases one will not have a choice but to use
       multiple configurations.

8.1.
SDP-Based Signalling

   The signalling of multiple media types in one RTP session in SDP is
   specified in "Negotiating Media Multiplexing Using the Session
   Description Protocol (SDP)"
   [I-D.ietf-mmusic-sdp-bundle-negotiation].

9.  IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section is to be removed on publication as
   an RFC.

10.  Security Considerations

   Having an RTP session with multiple media types doesn't change the
   methods for securing a particular RTP session.  One possible
   difference is that the different media have often had different
   security requirements.  When combining multiple media types in one
   session, their security requirements also have to be combined by
   selecting the most demanding for each property.  Thus, having
   multiple media types can result in increased security overhead for
   some media types to ensure that all requirements are met.

   Otherwise, the recommendations for how to configure an RTP session
   do not add any requirements compared to normal RTP, except for the
   need to ensure that the participants are aware that it is a multiple
   media type session.  If that is not ensured, it can cause issues in
   the RTP session for both the unaware and the aware participants.
   Similar issues can also be produced in a normal RTP session by
   creating configurations for different end-points that don't match
   each other.

11.  Acknowledgements

   The authors would like to thank Christer Holmberg, Gunnar
   Hellstroem, and Charles Eckel for the feedback on the document.

12.  References

12.1.  Normative References

   [I-D.ietf-avtcore-rtp-multi-stream]
              Lennox, J., Westerlund, M., Wu, W., and C. Perkins,
              "Sending Multiple Media Streams in a Single RTP Session",
              draft-ietf-avtcore-rtp-multi-stream-06 (work in progress),
              October 2014.

   [I-D.ietf-mmusic-sdp-bundle-negotiation]
              Holmberg, C., Alvestrand, H., and C. Jennings,
              "Negotiating Media Multiplexing Using the Session
              Description Protocol (SDP)",
              draft-ietf-mmusic-sdp-bundle-negotiation-17 (work in
              progress), March 2015.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC3551]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
              Video Conferences with Minimal Control", STD 65, RFC 3551,
              July 2003.

12.2.  Informative References

   [I-D.ietf-avtcore-multiplex-guidelines]
              Westerlund, M., Perkins, C., and H. Alvestrand,
              "Guidelines for using the Multiplexing Features of RTP to
              Support Multiple Media Streams",
              draft-ietf-avtcore-multiplex-guidelines-03 (work in
              progress), October 2014.

   [I-D.ietf-avtcore-rtp-topologies-update]
              Westerlund, M. and S. Wenger, "RTP Topologies",
              draft-ietf-avtcore-rtp-topologies-update-06 (work in
              progress), March 2015.

   [I-D.ietf-avtext-rtp-grouping-taxonomy]
              Lennox, J., Gross, K., Nandakumar, S., and G. Salgueiro,
              "A Taxonomy of Grouping Semantics and Mechanisms for
              Real-Time Transport Protocol (RTP) Sources",
              draft-ietf-avtext-rtp-grouping-taxonomy-06 (work in
              progress), March 2015.

   [I-D.ietf-dart-dscp-rtp]
              Black, D. and P. Jones, "Differentiated Services
              (DiffServ) and Real-time Communication",
              draft-ietf-dart-dscp-rtp-10 (work in progress), November
              2014.

   [I-D.lennox-payload-ulp-ssrc-mux]
              Lennox, J., "Supporting Source-Multiplexing of the
              Real-Time Transport Protocol (RTP) Payload for Generic
              Forward Error Correction",
              draft-lennox-payload-ulp-ssrc-mux-00 (work in progress),
              February 2013.

   [I-D.westerlund-avtcore-transport-multiplexing]
              Westerlund, M. and C. Perkins, "Multiplexing Multiple RTP
              Sessions onto a Single Lower-Layer Transport",
              draft-westerlund-avtcore-transport-multiplexing-07 (work
              in progress), October 2013.

   [RFC2198]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
              Handley, M., Bolot, J., Vega-Garcia, A., and S.
              Fosse-Parisis, "RTP Payload for Redundant Audio Data",
              RFC 2198, September 1997.

   [RFC2733]  Rosenberg, J. and H. Schulzrinne, "An RTP Payload Format
              for Generic Forward Error Correction", RFC 2733, December
              1999.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, July 2006.

   [RFC4588]  Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
              Hakenberg, "RTP Retransmission Payload Format", RFC 4588,
              July 2006.

   [RFC5109]  Li, A., "RTP Payload Format for Generic Forward Error
              Correction", RFC 5109, December 2007.

   [RFC5576]  Lennox, J., Ott, J., and T. Schierl, "Source-Specific
              Media Attributes in the Session Description Protocol
              (SDP)", RFC 5576, June 2009.

   [RFC5761]  Perkins, C. and M. Westerlund, "Multiplexing RTP Data and
              Control Packets on a Single Port", RFC 5761, April 2010.

   [RFC5888]  Camarillo, G. and H. Schulzrinne, "The Session Description
              Protocol (SDP) Grouping Framework", RFC 5888, June 2010.

   [RFC5956]  Begen, A., "Forward Error Correction Grouping Semantics in
              the Session Description Protocol", RFC 5956, September
              2010.

Authors' Addresses

   Magnus Westerlund
   Ericsson
   Farogatan 6
   SE-164 80 Kista
   Sweden

   Phone: +46 10 714 82 87
   Email: magnus.westerlund@ericsson.com

   Colin Perkins
   University of Glasgow
   School of Computing Science
   Glasgow G12 8QQ
   United Kingdom

   Email: csp@csperkins.org

   Jonathan Lennox
   Vidyo, Inc.
   433 Hackensack Avenue
   Seventh Floor
   Hackensack, NJ 07601
   US

   Email: jonathan@vidyo.com