AVTCORE WG                                                 M. Westerlund
Internet-Draft                                                  Ericsson
Updates: 3550, 3551 (if approved)                             C. Perkins
Intended status: Standards Track                   University of Glasgow
Expires: April 11, 2015                                        J. Lennox
                                                                   Vidyo
                                                        October 08, 2014

        Sending Multiple Types of Media in a Single RTP Session
             draft-ietf-avtcore-multi-media-rtp-session-06

Abstract

   This document specifies how an RTP session can contain media streams
   with media from multiple media types such as audio, video, and text.
   This has been restricted by the RTP Specification, and thus this
   document updates RFC 3550 and RFC 3551 to enable this behaviour for
   applications that satisfy the applicability for using multiple media
   types in a single RTP session.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 11, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Definitions
   3.  Motivation
   4.  Overview of Solution
   5.  Applicability
     5.1.  Usage of the RTP session
     5.2.  Signalled Support
     5.3.  Homogeneous Multi-party
     5.4.  Reduced number of Payload Types
     5.5.  Stream Differentiation
     5.6.  Non-compatible Extensions
   6.  RTP Session Specification
     6.1.  RTP Session
     6.2.  Sender Source Restrictions
     6.3.  Payload Type Applicability
     6.4.  RTCP Considerations
   7.  Extension Considerations
     7.1.  RTP Retransmission
     7.2.  Generic FEC
   8.  Signalling
     8.1.  SDP-Based Signalling
   9.  IANA Considerations
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1. Introduction

   When the Real-time Transport Protocol (RTP) [RFC3550] was designed,
   close to 20 years ago, IP networks were different to those deployed
   at the time of this writing. The virtually ubiquitous deployment of
   Network Address Translators (NAT) and Firewalls has since increased
   the cost and likelihood of communication failure when using many
   different transport flows. Hence, there is pressure to reduce the
   number of concurrent transport flows used by RTP applications.

   The RTP specification recommends against sending several different
   types of media, for example audio and video, in a single RTP session.
   The RTP profile for Audio and Video Conferences with Minimal Control
   (RTP/AVP) [RFC3551] mandates a similar restriction. The motivation
   for these limitations is partly to allow lower layer Quality of
   Service (QoS) mechanisms to be used, and partly due to limitations of
   the RTCP timing rules that assume all media in a session to have
   similar bandwidth. The Session Description Protocol (SDP) [RFC4566]
   is one of the dominant signalling methods for establishing RTP
   sessions, and has enforced this rule by not allowing multiple media
   types for a given destination or set of ICE candidates.

   The fact that these limitations have been in place for so long, in
   addition to RFC 3550 being written without fully considering the use
   of multiple media types in an RTP session, results in a number of
   issues when allowing this behaviour. This memo updates [RFC3550] and
   [RFC3551] with important considerations regarding applicability and
   functionality when using multiple types of media in an RTP session,
   including normative specification of behaviour. This memo makes no
   changes to RTP behaviour when using multiple streams of media of the
   same type (e.g., multiple audio streams or multiple video streams) in
   a single RTP session.

   This memo is structured as follows. First, some basic definitions
   are provided. This is followed by a background section that
   discusses the motivation in more detail. An overview of how to
   provide multiple media types in one RTP session is then presented.
   Next comes the formal applicability of this specification, followed
   by the normative specification. This is followed by a discussion of
   how some RTP/RTCP extensions are expected to function in the case of
   multiple media types in one RTP session.
   After that comes a specification of the requirements this
   specification places on signalling, and a look at how these are
   realized in SDP using Bundle
   [I-D.ietf-mmusic-sdp-bundle-negotiation]. The memo ends with the
   security considerations.

2. Definitions

   The following terms are used with supplied definitions:

   Endpoint: A single entity sending or receiving RTP packets. It can
      be decomposed into several functional blocks, but as long as it
      behaves as a single RTP stack entity it is classified as a single
      endpoint.

   Media Stream: A sequence of RTP packets using a single SSRC that
      together carries part or all of the content of a specific Media
      Type from a specific sender source within a given RTP session.

   Media Type: Audio, video, text or application whose form and meaning
      are defined by a specific real-time application.

   QoS: Quality of Service, i.e. network mechanisms that are intended
      to ensure that the packets within a flow or with a specific
      marking are transported with certain properties.

   RTP Session: As defined by [RFC3550], the endpoints belonging to the
      same RTP Session are those that share a single SSRC space. That
      is, those endpoints can see an SSRC identifier transmitted by any
      one of the other endpoints. An endpoint can receive an SSRC
      either as SSRC or as CSRC in RTP and RTCP packets. Thus, the RTP
      Session scope is decided by the endpoints' network interconnection
      topology, in combination with RTP and RTCP forwarding strategies
      deployed by endpoints and any interconnecting middle nodes.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3. Motivation

   The existence of NATs and Firewalls at almost all points of Internet
   access has had implications for protocols like RTP that were
   designed to use multiple transport flows.
   First of all, the NAT/FW traversal solution needs to ensure that all
   these transport flows are established. This has three consequences:

   1. Increased delay to perform the transport flow establishment.

   2. The more transport flows, the more state and the more resource
      consumption in the NATs and Firewalls. When the resource
      consumption in NAT/FWs reaches its limits, unexpected behaviours
      usually occur.

   3. More transport flows means a higher risk that some transport flow
      fails to be established, thus preventing the application from
      communicating.

   Using fewer transport flows reduces the risk of communication
   failure, improves establishment behaviour, and puts less load on
   NATs and Firewalls.

   Furthermore, we note that many RTP-using applications don't utilize
   any network level Quality of Service (QoS) functions. Nor do they
   expect or desire any separation in network treatment of their media
   packets, independent of whether they are audio, video or text. When
   an application has no such desire, it doesn't need to provide a
   transport flow structure that simplifies flow based QoS.

   For applications that don't require different lower-layer QoS for
   different media types, and that have no special requirements for RTP
   extensions or RTCP reporting, the requirement to separate different
   media into different RTP sessions might seem unnecessary. Provided
   the application accepts that all media flows will get similar RTCP
   reporting, using the same RTP session for several types of media at
   once appears a reasonable choice. The architecture ought to be
   agnostic about the type of media being carried in an RTP session to
   the extent possible given the constraints of the protocol.

4. Overview of Solution

   The goal of the solution is to enable each RTP session to contain
   more than just one media type.
   This includes having multiple RTP sessions containing a given media
   type, for example having three sessions containing both video and
   audio.

   The solution is quite straightforward. The first step is to override
   the SHOULD and SHOULD NOT language of the RTP specification
   [RFC3550]. A similar change is needed to a sentence in Section 6 of
   [RFC3551] that states that "different media types SHALL NOT be
   interleaved or multiplexed within a single RTP Session". This is
   resolved by appropriate exception clauses that apply provided this
   specification and its applicability are followed.

   Within an RTP session where multiple media types have been configured
   for use, an SSRC can only send one type of media during its lifetime
   (i.e., it can switch between different audio codecs, since those are
   both the same type of media, but cannot switch between audio and
   video). Different SSRCs MUST be used for the different media
   sources, the same way multiple media sources of the same media type
   already have to do. The payload type will inform a receiver which
   media type the SSRC is being used for. Thus the payload type MUST be
   unique across all of the payload configurations, independent of
   media type, that are used in the RTP session.

   A few extra considerations within the RTP session also need to be
   taken into account. RTCP bandwidth and regular reporting suppression
   (RTP/AVPF and RTP/SAVPF) SHOULD be configured to reduce the impact of
   bit-rate variations between streams and media types. It is also
   clarified how timeout calculations are to be done to avoid any
   issues. Certain payload types like FEC also need additional rules.

   The final important part of the solution is to use signalling to
   ensure that agreement on using multiple media types in an RTP
   session exists, and to determine how that is then configured.
   This memo describes some existing requirements, while an external
   reference defines how this is accomplished in SDP.

5. Applicability

   This specification has limited applicability, and anyone intending to
   use it needs to ensure that their application and usage meet the
   criteria below.

5.1. Usage of the RTP session

   Before choosing to use this specification, an application implementer
   needs to ensure that they don't have a need for different RTP
   sessions between the media types for some reason. The main rule is
   that if one expects to have equal treatment of all media packets,
   then this specification might be suitable. The equal treatment
   includes anything from network level up to RTCP reporting and
   feedback. The document Guidelines for using the Multiplexing
   Features of RTP [I-D.ietf-avtcore-multiplex-guidelines] gives more
   detailed guidance on aspects to consider when choosing how to use RTP
   and specifically sessions. RTP-using applications that need or would
   prefer multiple RTP sessions, but do not require the functionalities
   or behaviours that multiple transport flows give, can consider using
   Multiple RTP Sessions on a Single Lower-Layer Transport
   [I-D.westerlund-avtcore-transport-multiplexing].

   The second important consideration is the resulting behaviour when
   media flows to be sent within a single RTP session do not have
   similar RTCP requirements. There are limitations in the RTCP timing
   rules, and this implies a common RTCP reporting interval across all
   participants in a session. If an RTP session contains flows with
   very different RTCP requirements, due to differences in media stream
   bandwidth consumption and packet rate (for example, low-rate audio
   coupled with high-quality video), this can result in either
   excessive or insufficient RTCP for some flows, depending on how the
   RTCP session bandwidth, and hence reporting interval, is configured.
   This is discussed further in Section 6.4.

5.2. Signalled Support

   Usage of this specification is not compatible with anyone following
   RFC 3550 and intending to have different RTP sessions for each media
   type. Therefore there needs to be mutual agreement to use multiple
   media types in one RTP session by all participants within that RTP
   session. This agreement has to be determined using signalling in
   most cases.

   This requirement can be a problem for signalling solutions that can't
   negotiate with all participants. For declarative signalling
   solutions, mandating that the session is using multiple media types
   in one RTP session can be a way of attempting to ensure that all
   participants in the RTP session follow the requirement. However, for
   signalling solutions that lack methods for enforcing that a receiver
   supports a specific feature, this can still cause issues.

5.3. Homogeneous Multi-party

   In multiparty communication scenarios it is important to separate two
   different cases. One case is where the RTP session contains multiple
   participants in a common RTP session. This occurs for example in Any
   Source Multicast (ASM) and Transport Translator topologies as defined
   in RTP Topologies [RFC5117]. It can also occur in some
   implementations of RTP mixers that share the same SSRC/CSRC space
   across all participants. The second case is when the RTP session is
   terminated in a middlebox and the other participants' sources are
   projected or switched into each RTP session and rewritten on RTP
   header level, including SSRC mappings.

   For the first case, with a common RTP session or at least shared
   SSRC/CSRC values, all participants in multiparty communication are
   REQUIRED to support multiple media types in an RTP session. A
   participant using two or more RTP sessions towards a multiparty
   session can't be collapsed into a single session with multiple media
   types.
   The reason is that in the case of multiple RTP sessions, the same
   SSRC value can be used in both RTP sessions without any issues, but
   when collapsed to a single session there is an SSRC collision. In
   addition, some collisions can't be represented in the multiple
   separate RTP sessions. For example, in a session with audio and
   video, an SSRC value used for video will not show up in the audio RTP
   session at the participant using multiple RTP sessions, and thus will
   not trigger any collision handling. Thus any application using this
   type of RTP session structure MUST have homogeneous support for
   multiple media types in one RTP session, or be forced to insert a
   translator node between that participant and the rest of the RTP
   session.

   For the second case, with separate RTP sessions for each multiparty
   participant and a central node, it is possible to have a mix of
   single RTP session users and multiple RTP session users, as long as
   one is willing to remap the SSRCs used by a participant with multiple
   RTP sessions into non-used values in the single RTP session SSRC
   space for each of the participants using a single RTP session with
   multiple media types. It can be noted that this type of
   implementation has to understand all types of RTP/RTCP extension
   being used in the RTP sessions to be able to translate them
   correctly between the RTP sessions. It might also suffer issues due
   to differences in configured RTCP bandwidth and other parameters
   between the RTP sessions. It can also negatively impact the
   possibility for loop detection, as SSRC/CSRC can't be used to detect
   the loops; instead, some other media stream identity name space that
   is common across all interconnected parts is needed.

5.4. Reduced number of Payload Types

   An RTP session with multiple media types in it has only a single
   7-bit Payload Type range for all its payload types.
   All the different RTP payload configurations for all the media types
   need to fit within the 128 available values, or within only 96 or
   fewer if "Multiplexing RTP Data and Control Packets on a Single
   Port" [RFC5761] is used. For most applications this will not be a
   real problem, but the limitation exists and could be encountered.

5.5. Stream Differentiation

   If network level differentiation of the media streams of different
   media types is desired, using this specification can cause severe
   limitations. All media streams in an RTP session, independent of the
   media type, will be sent over the same underlying transport flow.
   Any flow-based Quality of Service (QoS) mechanism will be unable to
   provide differentiated treatment between different media types, e.g.
   to prioritize audio over video. If differentiated treatment is
   desired using flow-based QoS, separate RTP sessions over different
   underlying transport flows need to be used.

   Marking-based QoS schemes like DiffServ can be affected if the
   network ingress is the one that performs markings based on flows.
   Endpoint marking, where the network API supports marking on
   individual packet level, will be unaffected by this specification.
   However, there exist limitations, as discussed in
   [I-D.ietf-avtcore-multiplex-guidelines], on how different traffic
   classes can be applied on a single RTP media stream.

5.6. Non-compatible Extensions

   There exist some RTP and RTCP extensions that rely on the existence
   of multiple RTP sessions. If the goal of using an RTP session with
   multiple media types is to have only a single RTP session, then these
   extensions can't be used. If one has no need to have different RTP
   sessions for the media types but is willing to have multiple RTP
   sessions, one for the main media transmission and one for the
   extension, they can be used.
   Note that this assumes that it is possible to get the extension
   working when the related RTP session contains multiple media types.

   Identified RTP/RTCP extensions that require multiple RTP Sessions
   are:

   RTP Retransmission: RTP Retransmission [RFC4588] has a session-
      multiplexed mode. It also has an SSRC-multiplexed mode that can
      be used instead. So use the mode that is suitable for the RTP
      application.

   XOR-Based FEC: The RTP Payload Format for Generic Forward Error
      Correction [RFC5109] and its predecessor [RFC2733] require a
      separate RTP session unless the FEC data is carried in RTP Payload
      for Redundant Audio Data [RFC2198]. However, using the Generic
      FEC with the Redundancy payload has another set of restrictions,
      see Section 7.2.

      Note that the Source-Specific Media Attributes [RFC5576]
      specification defines an SDP syntax (the "FEC" semantic of the
      "ssrc-group" attribute) to signal FEC relationships between
      multiple media streams within a single RTP session. However, this
      can't be used as the FEC repair packets need to have the same SSRC
      value as the source packets being protected. [RFC5576] does not
      normatively update and resolve that restriction. There is ongoing
      work on a ULP extension to allow FEC streams to be used within
      the same RTP Session as the source stream
      [I-D.lennox-payload-ulp-ssrc-mux].

6. RTP Session Specification

   This section defines what needs to be done or avoided to make an RTP
   session with multiple media types function without issues.

6.1. RTP Session

   Section 5.2 of "RTP: A Transport Protocol for Real-Time Applications"
   [RFC3550] states:

      For example, in a teleconference composed of audio and video media
      encoded separately, each medium SHOULD be carried in a separate
      RTP session with its own destination transport address.
      Separate audio and video streams SHOULD NOT be carried in a single
      RTP session and demultiplexed based on the payload type or SSRC
      fields.

   This specification changes both of these sentences. The first
   sentence is changed to:

      For example, in a teleconference composed of audio and video media
      encoded separately, each medium SHOULD be carried in a separate
      RTP session with its own destination transport address, unless
      specification [RFCXXXX] is followed and the application meets the
      applicability constraints.

   The second sentence is changed to:

      Separate audio and video streams SHOULD NOT be carried in a single
      RTP session and demultiplexed based on the payload type or SSRC
      fields, unless multiplexed based on both SSRC and payload type and
      usage meets what Multiple Media Types in an RTP Session [RFCXXXX]
      specifies.

   The second paragraph of Section 6 in RTP Profile for Audio and Video
   Conferences with Minimal Control [RFC3551] says:

      The payload types currently defined in this profile are assigned
      to exactly one of three categories or media types: audio only,
      video only and those combining audio and video. The media types
      are marked in Tables 4 and 5 as "A", "V" and "AV", respectively.
      Payload types of different media types SHALL NOT be interleaved or
      multiplexed within a single RTP session, but multiple RTP sessions
      MAY be used in parallel to send multiple media types. An RTP
      source MAY change payload types within the same media type during
      a session. See the section "Multiplexing RTP Sessions" of RFC
      3550 for additional explanation.

   The purpose of this specification is to override that existing SHALL
   NOT under certain conditions. Thus, this sentence also has to be
   changed to allow payload types of multiple media types in the same
   session.
   The above sentence is changed to:

      Payload types of different media types SHALL NOT be interleaved or
      multiplexed within a single RTP session unless as specified and
      under the restriction in Multiple Media Types in an RTP Session
      [RFCXXXX]. Multiple RTP sessions MAY be used in parallel to send
      multiple media types.

   RFC-Editor Note: Please replace RFCXXXX with the RFC number of this
   specification when assigned.

   We can now go on and discuss the five bullets that motivate the
   previously quoted text in Section 5.2 of the RTP Specification
   [RFC3550]. They are repeated here for the reader's convenience:

   1. If, say, two audio streams shared the same RTP session and the
      same SSRC value, and one were to change encodings and thus
      acquire a different RTP payload type, there would be no general
      way of identifying which stream had changed encodings.

   2. An SSRC is defined to identify a single timing and sequence
      number space. Interleaving multiple payload types would require
      different timing spaces if the media clock rates differ and would
      require different sequence number spaces to tell which payload
      type suffered packet loss.

   3. The RTCP sender and receiver reports (see Section 6.4 of RFC
      3550) can only describe one timing and sequence number space per
      SSRC and do not carry a payload type field.

   4. An RTP mixer would not be able to combine interleaved streams of
      incompatible media into one stream.

   5. Carrying multiple media in one RTP session precludes: the use of
      different network paths or network resource allocations if
      appropriate; reception of a subset of the media if desired, for
      example just audio if video would exceed the available bandwidth;
      and receiver implementations that use separate processes for the
      different media, whereas using separate RTP sessions permits
      either single- or multiple-process implementations.

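   As an informal illustration of bullets 2 and 3 (not part of the
   specification), the following Python sketch shows why a gap in a
   shared sequence number space cannot be attributed to a payload type,
   while one SSRC per media source makes the attribution
   straightforward. The packet tuples, the SSRC values, the payload
   type numbers 96 and 97, and the function name are all hypothetical,
   and sequence number wrap-around is ignored for brevity.

```python
from collections import defaultdict

# Hypothetical dynamic payload types, chosen only for this sketch.
AUDIO_PT, VIDEO_PT = 96, 97


def attribute_losses(packets):
    """Map each SSRC with a sequence gap to (missing_seqs, payload_type).

    `packets` is a list of (ssrc, seq, payload_type) tuples as received.
    payload_type is None when the SSRC carried more than one payload
    type, because the shared sequence number space then gives no way to
    tell which payload type the lost packet belonged to.
    """
    seqs = defaultdict(set)
    pts = defaultdict(set)
    for ssrc, seq, pt in packets:
        seqs[ssrc].add(seq)
        pts[ssrc].add(pt)

    report = {}
    for ssrc, seen in seqs.items():
        expected = set(range(min(seen), max(seen) + 1))
        missing = sorted(expected - seen)
        if missing:
            # Attribution is only possible if one payload type was seen.
            only_pt = next(iter(pts[ssrc])) if len(pts[ssrc]) == 1 else None
            report[ssrc] = (missing, only_pt)
    return report


# Audio and video interleaved on one SSRC; packet 3 is lost.
interleaved = [
    (0x1111, 1, AUDIO_PT),
    (0x1111, 2, VIDEO_PT),
    (0x1111, 4, VIDEO_PT),
    (0x1111, 5, AUDIO_PT),
]

# One SSRC per media source; video packet 2 is lost.
separate = [
    (0x1111, 1, AUDIO_PT),
    (0x1111, 2, AUDIO_PT),
    (0x2222, 1, VIDEO_PT),
    (0x2222, 3, VIDEO_PT),
]

print(attribute_losses(interleaved))
print(attribute_losses(separate))
```

   In the interleaved case the receiver only learns that sequence
   number 3 is missing on the shared SSRC; in the separate case the
   loss is tied to the SSRC that carries video.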
   Bullets 1 to 3 all relate to the fact that each media source has to
   use one or more unique SSRCs to avoid these issues, as mandated below
   (Section 6.2). Bullet 4 can be answered by two arguments. First,
   each SSRC will be associated with a specific media type, communicated
   through the RTP payload type, allowing a middlebox to do media type
   specific operations. Second, in many contexts blind combining
   without additional context is not suitable anyway. Regarding bullet
   5, these are understood and explicitly stated applicability
   limitations of the method described in this document.

6.2. Sender Source Restrictions

   An SSRC in the RTP session MUST only send one media type (audio,
   video, text etc.) during the SSRC's lifetime. The main motivation
   is that a given SSRC has its own RTP timestamp and sequence number
   spaces. The same way that you can't send two streams of encoded
   audio on the same SSRC, you can't send one audio and one video
   encoding on the same SSRC. Each media encoding, when made into an
   RTP stream, needs to have sole control over the sequence number and
   timestamp space. If not, one would not be able to detect packet loss
   for that particular stream. Nor could one easily determine the clock
   rate at which a particular SSRC's timestamp will increase. For
   additional arguments why RTP payload type based multiplexing of
   multiple media streams doesn't work, see Appendix A in
   [I-D.ietf-avtcore-multiplex-guidelines].

6.3. Payload Type Applicability

   Most Payload Types have a native media type, like an audio codec
   naturally belonging to the audio media type. However, there exist a
   number of RTP payload types that don't have a native media type. For
   example, transport robustness mechanisms like RTP Retransmission
   [RFC4588] and Generic FEC [RFC5109] inherit their media type from
   what they protect.
   RTP Retransmission is explicitly bound to the payload type it is
   protecting, and thus will inherit it. However, Generic FEC is an
   excellent example of an RTP payload type that has no natural media
   type. The media type of what it protects is not relevant, as it is
   the recovered RTP packets that have a particular media type, and thus
   Generic FEC is best categorized as an application media type.

   The above discussion is relevant to what limitations exist for RTP
   payload type usage within an RTP session that has multiple media
   types. In fact, this document (Section 7.2) suggests that Generic
   FEC (XOR-based), as defined in RFC 5109, can actually use a single
   media type when used with independent RTP sessions for source and
   repair data.

      Note that a particular SSRC carrying Generic FEC will clearly only
      protect a specific SSRC, and thus that instance is bound to the
      SSRC's media type. For this specific case, it is possible to have
      one be applicable to both. However, in cases where the signalling
      is set up to enable fallback to using separate RTP sessions,
      using a different media type, e.g. application, than the media
      being protected can create issues.

6.4. RTCP Considerations

   Guidelines for handling RTCP when sending multiple media streams with
   disparate rates in a single RTP session are outlined in
   [I-D.ietf-avtcore-rtp-multi-stream]. These guidelines apply when
   sending multiple types of media in a single RTP session if the
   different types of media have different rates.

7. Extension Considerations

   This section discusses the impact on some RTP/RTCP extensions due to
   usage of multiple media types in one RTP session. Only extensions
   where there is something worth noting have been included.

7.1. RTP Retransmission

   SSRC-multiplexed RTP retransmission [RFC4588] is actually very
   straightforward.
Each retransmission RTP payload type is explicitly connected to an
associated payload type. If retransmission is only to be used with a
subset of all payload types, this is not a problem, as it will be
evident from the retransmission payload types which payload types
have retransmission enabled for them.

Session-multiplexed RTP retransmission can also be used, where a
retransmission session contains the retransmissions of the associated
payload types in the source RTP session. The only difference from
before is that the source RTP session contains multiple media types.
Thus it is even more likely that only a subset of the source RTP
session's payload types and SSRCs are actually retransmitted.

Open Issue: Using SDP to signal retransmission with one RTP session
carrying multiple media types and one RTP session carrying the
retransmission data results in multiple m= lines being grouped using
FID, while the m= lines belonging to each respective RTP session are
grouped using BUNDLE. This usage might contradict both the FID
semantics [RFC5888] and an assumption in the RTP retransmission
specification [RFC4588].

7.2. Generic FEC

The RTP Payload Format for Generic Forward Error Correction
[RFC5109], and also its predecessor [RFC2733], require some
consideration, which differs depending on the configuration in use.

Independent RTP sessions, i.e. where source and repair data are sent
in different RTP sessions: As this mode of configuration requires
different RTP sessions, there has to be at least one RTP session for
source data; this session can be one using multiple media types. The
repair session only needs one RTP payload type indicating repair
data, i.e. x/ulpfec or x/parityfec, depending on whether RFC 5109 or
RFC 2733 is used.
The media type in this session is not relevant and can in theory be
any of the defined ones. It is RECOMMENDED that one use
"Application".

In-stream, i.e. using the RTP Payload for Redundant Audio Data
[RFC2198] to combine repair and source data in the same packets:
This can be used within a single RTP session. However, the usage and
configuration of the payload types can create an issue. First, it
might be necessary to have one payload type per media type for the
FEC repair data payload format, i.e. one for audio/ulpfec and one
for text/ulpfec if audio and text are combined in an RTP session.
Second, each combination of source payload and its FEC repair data
has to be an explicitly configured payload type. This has the
potential to turn the limited number of available RTP payload types
into a real issue.

8. Signalling

Establishing an RTP session with multiple media types requires
signalling. This signalling needs to fulfil the following
requirements:

1. Ensure that any participant in the RTP session is aware that this
   is an RTP session with multiple media types.

2. Ensure that the payload types in use in the RTP session are using
   unique values, with no overlap between the media types.

3. Configure the RTP session level parameters, such as RTCP RR and
   RS bandwidth, AVPF trr-int, underlying transport, the RTCP
   extensions in use, and security parameters, commonly for the RTP
   session.

4. RTP and RTCP functions that can be bound to a particular media
   type SHOULD be reused when possible also for other media types,
   instead of having to be configured for multiple code-points.
   Note: In some cases one will not have a choice but to use
   multiple configurations.

8.1.
SDP-Based Signalling

The signalling of multiple media types in one RTP session in SDP is
specified in "Negotiating Media Multiplexing Using the Session
Description Protocol (SDP)"
[I-D.ietf-mmusic-sdp-bundle-negotiation].

9. IANA Considerations

This document makes no request of IANA.

Note to RFC Editor: this section is to be removed on publication as
an RFC.

10. Security Considerations

Having an RTP session with multiple media types doesn't change the
methods for securing a particular RTP session. One possible
difference is that the different media have often had different
security requirements. When combining multiple media types in one
session, their security requirements also have to be combined by
selecting the most demanding for each property. Thus having multiple
media types can result in increased security overhead for some media
types to ensure that all requirements are met.

Otherwise, the recommendations for how to configure an RTP session
do not add any requirements compared to normal RTP, except for the
need to ensure that the participants are aware that it is a multiple
media type session. If that is not ensured, it can cause issues in
the RTP session for both the unaware and the aware participants.
Similar issues can also be produced in a normal RTP session by
creating configurations for different end-points that don't match
each other.

11. Acknowledgements

The authors would like to thank Christer Holmberg, Gunnar Hellstroem,
and Charles Eckel for the feedback on the document.

12. References

12.1. Normative References

[I-D.ietf-avtcore-rtp-multi-stream]
           Lennox, J., Westerlund, M., Wu, W., and C. Perkins,
           "Sending Multiple Media Streams in a Single RTP Session",
           draft-ietf-avtcore-rtp-multi-stream-05 (work in progress),
           July 2014.

[I-D.ietf-mmusic-sdp-bundle-negotiation]
           Holmberg, C., Alvestrand, H., and C. Jennings,
           "Negotiating Media Multiplexing Using the Session
           Description Protocol (SDP)",
           draft-ietf-mmusic-sdp-bundle-negotiation-11 (work in
           progress), September 2014.

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
           Jacobson, "RTP: A Transport Protocol for Real-Time
           Applications", STD 64, RFC 3550, July 2003.

[RFC3551]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
           Video Conferences with Minimal Control", STD 65, RFC 3551,
           July 2003.

12.2. Informative References

[I-D.ietf-avtcore-multiplex-guidelines]
           Westerlund, M., Perkins, C., and H. Alvestrand,
           "Guidelines for using the Multiplexing Features of RTP to
           Support Multiple Media Streams",
           draft-ietf-avtcore-multiplex-guidelines-02 (work in
           progress), January 2014.

[I-D.lennox-payload-ulp-ssrc-mux]
           Lennox, J., "Supporting Source-Multiplexing of the
           Real-Time Transport Protocol (RTP) Payload for Generic
           Forward Error Correction",
           draft-lennox-payload-ulp-ssrc-mux-00 (work in progress),
           February 2013.

[I-D.westerlund-avtcore-transport-multiplexing]
           Westerlund, M. and C. Perkins, "Multiplexing Multiple RTP
           Sessions onto a Single Lower-Layer Transport",
           draft-westerlund-avtcore-transport-multiplexing-07 (work
           in progress), October 2013.

[RFC2198]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
           Handley, M., Bolot, J., Vega-Garcia, A., and S.
           Fosse-Parisis, "RTP Payload for Redundant Audio Data",
           RFC 2198, September 1997.

[RFC2733]  Rosenberg, J. and H. Schulzrinne, "An RTP Payload Format
           for Generic Forward Error Correction", RFC 2733, December
           1999.

[RFC4566]  Handley, M., Jacobson, V., and C.
           Perkins, "SDP: Session Description Protocol", RFC 4566,
           July 2006.

[RFC4588]  Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
           Hakenberg, "RTP Retransmission Payload Format", RFC 4588,
           July 2006.

[RFC5109]  Li, A., "RTP Payload Format for Generic Forward Error
           Correction", RFC 5109, December 2007.

[RFC5117]  Westerlund, M. and S. Wenger, "RTP Topologies", RFC 5117,
           January 2008.

[RFC5576]  Lennox, J., Ott, J., and T. Schierl, "Source-Specific
           Media Attributes in the Session Description Protocol
           (SDP)", RFC 5576, June 2009.

[RFC5761]  Perkins, C. and M. Westerlund, "Multiplexing RTP Data and
           Control Packets on a Single Port", RFC 5761, April 2010.

[RFC5888]  Camarillo, G. and H. Schulzrinne, "The Session Description
           Protocol (SDP) Grouping Framework", RFC 5888, June 2010.

Authors' Addresses

Magnus Westerlund
Ericsson
Farogatan 6
SE-164 80 Kista
Sweden

Phone: +46 10 714 82 87
Email: magnus.westerlund@ericsson.com

Colin Perkins
University of Glasgow
School of Computing Science
Glasgow G12 8QQ
United Kingdom

Email: csp@csperkins.org

Jonathan Lennox
Vidyo, Inc.
433 Hackensack Avenue
Seventh Floor
Hackensack, NJ 07601
US

Email: jonathan@vidyo.com