AVTCORE WG                                                 M. Westerlund
Internet-Draft                                                  Ericsson
Updates: 3550, 3551 (if approved)                             C. Perkins
Intended status: Standards Track                   University of Glasgow
Expires: July 17, 2014                                         J. Lennox
                                                                   Vidyo
                                                        January 13, 2014


        Sending Multiple Types of Media in a Single RTP Session
             draft-ietf-avtcore-multi-media-rtp-session-04

Abstract

   This document specifies how an RTP session can contain media streams
   with media from multiple media types such as audio, video, and text.
   This has been restricted by the RTP Specification, and thus this
   document updates RFC 3550 and RFC 3551 to enable this behaviour for
   applications that satisfy the applicability criteria for using
   multiple media types in a single RTP session.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 17, 2014.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Definitions
   3.  Motivation
     3.1.  NAT and Firewalls
     3.2.  No Transport Level QoS
     3.3.  Architectural Equality
   4.  Overview of Solution
   5.  Applicability
     5.1.  Usage of the RTP session
     5.2.  Signalled Support
     5.3.  Homogeneous Multi-party
     5.4.  Reduced number of Payload Types
     5.5.  Stream Differentiation
     5.6.  Non-compatible Extensions
   6.  RTP Session Specification
     6.1.  RTP Session
     6.2.  Sender Source Restrictions
     6.3.  Payload Type Applicability
     6.4.  RTCP Considerations
   7.  Extension Considerations
     7.1.  RTP Retransmission
     7.2.  Generic FEC
   8.  Signalling
     8.1.  SDP-Based Signalling
   9.  IANA Considerations
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1.  Introduction

   When the Real-time Transport Protocol (RTP) [RFC3550] was designed,
   close to 20 years ago, IP networks were very different compared to
   the ones in 2013 when this is written.  The almost ubiquitous
   deployment of Network Address Translators (NAT) and Firewalls has
   increased the cost and likelihood of communication failure when
   using many different transport flows.  Thus there exists a pressure
   to reduce the number of concurrent transport flows.

   RTP [RFC3550] recommends against sending several different types of
   media, for example audio and video, in a single RTP session.  The RTP
   profile for Audio and Video Conferences with Minimal Control
   (RTP/AVP) [RFC3551] mandates a similar restriction.  The motivation
   for these limitations is partly to allow lower layer Quality of
   Service (QoS) mechanisms to be used, and partly due to limitations of
   the RTCP timing rules, which assume all media in a session to have
   similar bandwidth.  The Session Description Protocol (SDP) [RFC4566],
   as one of the dominant signalling methods for establishing RTP
   sessions, has enforced this rule, simply by not allowing multiple
   media types for a given receiver destination or set of ICE
   candidates, which is the most common method to determine which RTP
   session the packets are intended for.

   The fact that these limitations have been in place for so long, in
   addition to RFC 3550 being written without fully considering
   multiple media types in an RTP session, means that a number of
   considerations are needed when allowing this behaviour.  This
   document provides such considerations regarding applicability as well
   as functionality, including normative specification of behaviour.

   First, some basic definitions are provided.  This is followed by a
   background that discusses the motivation in more detail.
   An overview of the solution of how to provide multiple media types in
   one RTP session is then presented.  Next is the formal applicability
   of this specification, followed by the normative specification.  This
   is followed by a discussion of how some RTP/RTCP extensions are
   expected to function in the case of multiple media types in one RTP
   session.  The requirements this specification places on signalling
   are then given, with a look at how they are realized in SDP using
   Bundle [I-D.ietf-mmusic-sdp-bundle-negotiation].  The document ends
   with the security considerations.

2.  Definitions

   The following terms are used with supplied definitions:

   Endpoint:  A single entity sending or receiving RTP packets.  It can
      be decomposed into several functional blocks, but as long as it
      behaves as a single RTP stack entity it is classified as a single
      endpoint.

   Media Stream:  A sequence of RTP packets using a single SSRC that
      together carries part or all of the content of a specific Media
      Type from a specific sender source within a given RTP session.

   Media Type:  Audio, video, text or application whose form and meaning
      are defined by a specific real-time application.

   QoS:  Quality of Service, i.e. network mechanisms that are intended
      to ensure that the packets within a flow or with a specific
      marking are transported with certain properties.

   RTP Session:  As defined by [RFC3550], the endpoints belonging to the
      same RTP Session are those that share a single SSRC space.  That
      is, those endpoints can see an SSRC identifier transmitted by any
      one of the other endpoints.  An endpoint can receive an SSRC
      either as SSRC or as CSRC in RTP and RTCP packets.  Thus, the RTP
      Session scope is decided by the endpoints' network interconnection
      topology, in combination with RTP and RTCP forwarding strategies
      deployed by endpoints and any interconnecting middle nodes.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Motivation

   This section discusses in more detail the main motivations for
   allowing multiple media types in the same RTP session.

3.1.  NAT and Firewalls

   The existence of NATs and Firewalls at almost every Internet access
   point has had implications on protocols like RTP that were designed
   to use multiple transport flows.  First of all, the NAT/FW traversal
   solution needs to ensure that all these transport flows are
   established.  This has three consequences:

   1.  Increased delay to perform the transport flow establishment.

   2.  The more transport flows, the more state and the more resource
       consumption in the NATs and Firewalls.  When the resource
       consumption in a NAT/FW reaches its limit, unexpected behaviours
       usually occur.

   3.  More transport flows means a higher risk that some transport flow
       fails to be established, thus preventing the application from
       communicating.

   Using fewer transport flows reduces the risk of communication
   failure, improves establishment behaviour, and puts less load on
   NATs and Firewalls.

3.2.  No Transport Level QoS

   Many RTP-using applications don't utilize any network level Quality
   of Service functions.  Nor do they expect or desire any separation in
   network treatment of their media packets, independent of whether they
   are audio, video or text.  When an application has no such desire, it
   doesn't need to provide a transport flow structure that simplifies
   flow based QoS.

3.3.  Architectural Equality

   For applications that don't require different lower-layer QoS for
   different media types, and that have no special requirements for RTP
   extensions or RTCP reporting, the requirement to separate different
   media into different RTP sessions might seem unnecessary.  Provided
   the application accepts that all media flows will get similar RTCP
   reporting, using the same RTP session for several types of media at
   once appears a reasonable choice.  The architecture ought to be
   agnostic about the type of media being carried in an RTP session to
   the extent possible given the constraints of the protocol.

4.  Overview of Solution

   The goal of the solution is to enable each RTP session to contain
   more than just one media type.  This includes having multiple RTP
   sessions containing a given media type, for example having three
   sessions containing both video and audio.

   The solution is quite straightforward.  The first step is to override
   the SHOULD and SHOULD NOT language of the RTP specification
   [RFC3550].  A similar change is needed to a sentence in Section 6 of
   [RFC3551] that states that "different media types SHALL NOT be
   interleaved or multiplexed within a single RTP Session".  This is
   resolved by appropriate exception clauses that apply provided this
   specification and its applicability are followed.

   Within an RTP session where multiple media types have been configured
   for use, an SSRC can only send one type of media during its lifetime
   (i.e., it can switch between different audio codecs, since those are
   both the same type of media, but cannot switch between audio and
   video).  Different SSRCs MUST be used for the different media
   sources, in the same way that multiple media sources of the same
   media type already have to use different SSRCs.  The payload type
   will inform a receiver which media type the SSRC is being used for.
   Thus the payload type MUST be unique across all of the payload
   configurations, independent of media type, that are used in the RTP
   session.

   A few extra considerations within the RTP session also need to be
   taken into account.  RTCP bandwidth and regular reporting suppression
   (RTP/AVPF and RTP/SAVPF) SHOULD be configured to reduce the impact of
   bit-rate variations between streams and media types.  It is also
   clarified how timeout calculations are to be done to avoid any
   issues.  Certain payload types like FEC also need additional rules.

   The final important part of the solution is to use signalling to
   ensure that agreement on using multiple media types in an RTP session
   exists, and to determine how the session is then configured.  This
   memo describes some existing requirements, while an external
   reference defines how this is accomplished in SDP.

5.  Applicability

   This specification has limited applicability, and anyone intending to
   use it needs to ensure that their application and usage meet the
   criteria below.

5.1.  Usage of the RTP session

   Before choosing to use this specification, an application implementer
   needs to ensure that there is no reason to require different RTP
   sessions for the different media types.  The main rule is that if one
   expects to have equal treatment of all media packets, then this
   specification might be suitable.  The equal treatment includes
   anything from network level up to RTCP reporting and feedback.  The
   document Guidelines for using the Multiplexing Features of RTP
   [I-D.ietf-avtcore-multiplex-guidelines] gives more detailed guidance
   on aspects to consider when choosing how to use RTP and specifically
   sessions.
   RTP-using applications that need or would prefer multiple RTP
   sessions, but do not require the functionalities or behaviours that
   multiple transport flows give, can consider using Multiple RTP
   Sessions on a Single Lower-Layer Transport
   [I-D.westerlund-avtcore-transport-multiplexing].  Note that some
   difference in treatment can still be achieved, for example
   marking-based QoS, or RTCP feedback traffic for only some media
   streams.

   The second important consideration is the resulting behaviour when
   media flows to be sent within a single RTP session do not have
   similar bandwidth.  There are limitations in the RTCP timing rules,
   and this implies a common RTCP reporting interval across all
   participants in a session.  If an RTP session contains flows with
   very different bandwidths, for example low-rate audio coupled with
   high-quality video, this can result in either excessive or
   insufficient RTCP for some flows, depending on how the RTCP session
   bandwidth, and hence reporting interval, is configured.  This is
   discussed further in Section 6.4.

5.2.  Signalled Support

   Usage of this specification is not compatible with any participant
   following RFC 3550 and intending to have different RTP sessions for
   each media type.  Therefore there needs to be mutual agreement to use
   multiple media types in one RTP session by all participants within
   that RTP session.  This agreement has to be determined using
   signalling in most cases.

   This requirement can be a problem for signalling solutions that can't
   negotiate with all participants.  For declarative signalling
   solutions, mandating that the session is using multiple media types
   in one RTP session can be a way of attempting to ensure that all
   participants in the RTP session follow the requirement.
   However, for signalling solutions that lack methods for enforcing
   that a receiver supports a specific feature, this can still cause
   issues.

5.3.  Homogeneous Multi-party

   In multiparty communication scenarios it is important to separate two
   different cases.  One case is where the RTP session contains multiple
   participants in a common RTP session.  This occurs for example in Any
   Source Multicast (ASM) and Transport Translator topologies as defined
   in RTP Topologies [RFC5117].  It can also occur in some
   implementations of RTP mixers that share the same SSRC/CSRC space
   across all participants.  The second case is when the RTP session is
   terminated in a middlebox and the other participants' sources are
   projected or switched into each RTP session and rewritten on RTP
   header level, including SSRC mappings.

   For the first case, with a common RTP session or at least shared
   SSRC/CSRC values, all participants in multiparty communication are
   REQUIRED to support multiple media types in an RTP session.  A
   participant using two or more RTP sessions towards a multiparty
   session can't be collapsed into a single session with multiple media
   types.  The reason is that in the case of multiple RTP sessions, the
   same SSRC value can be used in both RTP sessions without any issues,
   but when collapsed to a single session there is an SSRC collision.
   In addition, some collisions can't be represented in the multiple
   separate RTP sessions.  For example, in a session with audio and
   video, an SSRC value used for video will not show up in the audio RTP
   session at the participant using multiple RTP sessions, and thus not
   trigger any collision handling.  Thus any application using this type
   of RTP session structure MUST have homogeneous support for multiple
   media types in one RTP session, or be forced to insert a translator
   node between that participant and the rest of the RTP session.
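
   The collision problem described for the first case can be
   illustrated with a minimal sketch.  It is not part of the normative
   text; the function name, session labels and SSRC values are all
   hypothetical and chosen only for illustration.  It shows that an
   SSRC value that is harmless while audio and video use separate RTP
   sessions becomes a collision once the sessions share one SSRC space.

```python
# Illustrative only: the SSRC values and session labels are made up.
def find_collapse_collisions(sessions):
    """Return SSRCs that appear in more than one of the given RTP sessions.

    `sessions` maps a session label (e.g. "audio") to the set of SSRCs a
    participant uses in that session.  Each RTP session has its own SSRC
    space, so a repeated value only matters once the sessions are
    collapsed into a single one.
    """
    seen = {}        # SSRC -> label of the first session that used it
    collisions = {}  # SSRC -> list of all session labels using it
    for label in sorted(sessions):
        for ssrc in sessions[label]:
            if ssrc in seen:
                collisions.setdefault(ssrc, [seen[ssrc]]).append(label)
            else:
                seen[ssrc] = label
    return collisions


separate_sessions = {
    "audio": {0x1234ABCD, 0x0BADF00D},
    "video": {0x1234ABCD, 0xCAFEBABE},
}
collisions = find_collapse_collisions(separate_sessions)
```

   Here the value 0x1234ABCD is reported for both "audio" and "video",
   which is why such a participant needs a translator node that remaps
   SSRCs rather than a simple merge of its sessions.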

   For the second case, with separate RTP sessions for each multiparty
   participant and a central node, it is possible to have a mix of
   single RTP session users and multiple RTP session users.  This
   requires that one is willing to remap the SSRCs used by a participant
   with multiple RTP sessions into unused values in the single RTP
   session SSRC space for each of the participants using a single RTP
   session with multiple media types.  Note that this type of
   implementation has to understand all types of RTP/RTCP extensions
   being used in the RTP sessions to be able to translate them correctly
   between the RTP sessions.  It can also negatively impact the
   possibility of loop detection, as SSRC/CSRC can't be used to detect
   the loops; instead some other media stream identity name space that
   is common across all interconnected parts is needed.

5.4.  Reduced number of Payload Types

   An RTP session with multiple media types in it has only a single
   7-bit Payload Type range for all its payload types.  Within the 128
   available values (only 96 or fewer if "Multiplexing RTP Data and
   Control Packets on a Single Port" [RFC5761] is used), all the
   different RTP payload configurations for all the media types need to
   fit in the available space.  For most applications this will not be a
   real problem, but the limitation exists and could be encountered.

5.5.  Stream Differentiation

   If network level differentiation of the media streams of different
   media types is desired, using this specification can cause severe
   limitations.  All media streams in an RTP session, independent of the
   media type, will be sent over the same underlying transport flow.
   Any flow-based Quality of Service (QoS) mechanism will be unable to
   provide differentiated treatment between different media types, e.g.
   to prioritize audio over video.
   If differentiated treatment is desired using flow-based QoS, separate
   RTP sessions over different underlying transport flows need to be
   used.

   Any marking-based QoS scheme like DiffServ is not affected unless a
   network ingress marks based on flows, in which case the same
   considerations as for flow-based QoS apply.

5.6.  Non-compatible Extensions

   There exist some RTP and RTCP extensions that rely on the existence
   of multiple RTP sessions.  If the goal of using an RTP session with
   multiple media types is to have only a single RTP session, then these
   extensions can't be used.  If one has no need to have different RTP
   sessions for the media types but is willing to have multiple RTP
   sessions, one for the main media transmission and one for the
   extension, they can be used.  It is to be noted that this assumes
   that it is possible to get the extension working when the related RTP
   session contains multiple media types.

   Identified RTP/RTCP extensions that require multiple RTP Sessions
   are:

   RTP Retransmission:  RTP Retransmission [RFC4588] has a
      session-multiplexed mode.  It also has an SSRC-multiplexed mode
      that can be used instead.  So use the mode that is suitable for
      the RTP application.

   XOR-Based FEC:  The RTP Payload Format for Generic Forward Error
      Correction [RFC5109] and its predecessor [RFC2733] require a
      separate RTP session unless the FEC data is carried in RTP Payload
      for Redundant Audio Data [RFC2198].  However, using the Generic
      FEC with the Redundancy payload has another set of restrictions,
      see Section 7.2.

      Note that the Source-Specific Media Attributes [RFC5576]
      specification defines an SDP syntax (the "FEC" semantic of the
      "ssrc-group" attribute) to signal FEC relationships between
      multiple media streams within a single RTP session.
      However, this can't be used as the FEC repair packets need to have
      the same SSRC value as the source packets being protected.
      [RFC5576] does not normatively update and resolve that
      restriction.  There is ongoing work on a ULP extension to allow
      FEC streams to be used within the same RTP session as the source
      stream [I-D.lennox-payload-ulp-ssrc-mux].

6.  RTP Session Specification

   This section defines what needs to be done or avoided to make an RTP
   session with multiple media types function without issues.

6.1.  RTP Session

   Section 5.2 of "RTP: A Transport Protocol for Real-Time Applications"
   [RFC3550] states:

      For example, in a teleconference composed of audio and video media
      encoded separately, each medium SHOULD be carried in a separate
      RTP session with its own destination transport address.

      Separate audio and video streams SHOULD NOT be carried in a single
      RTP session and demultiplexed based on the payload type or SSRC
      fields.

   This specification changes both of these sentences.  The first
   sentence is changed to:

      For example, in a teleconference composed of audio and video media
      encoded separately, each medium SHOULD be carried in a separate
      RTP session with its own destination transport address, unless
      specification [RFCXXXX] is followed and the application meets the
      applicability constraints.

   The second sentence is changed to:

      Separate audio and video streams SHOULD NOT be carried in a single
      RTP session and demultiplexed based on the payload type or SSRC
      fields, unless multiplexed based on both SSRC and payload type and
      usage meets what Multiple Media Types in an RTP Session [RFCXXXX]
      specifies.
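
   As an illustration only, demultiplexing based on both SSRC and
   payload type could be sketched as follows.  The class name, the
   dynamic payload type numbers and the codec names are assumptions
   made for this example and are not defined by this document; only
   payload type 0 (PCMU) is a static assignment from [RFC3551].  The
   sketch binds each SSRC to the media type of the first payload type
   it uses and rejects a later switch to another media type.

```python
# Hypothetical payload type table for one RTP session.  96-98 are example
# dynamic values; 0 is the static PCMU assignment from RFC 3551.
PAYLOAD_TYPES = {
    0: ("audio", "PCMU"),
    96: ("audio", "opus"),
    97: ("video", "VP8"),
    98: ("text", "t140"),
}


class MediaTypeDemux:
    """Bind each SSRC to the media type of the first payload type it uses."""

    def __init__(self, payload_types):
        self.payload_types = payload_types
        self.ssrc_media_type = {}  # SSRC -> media type, fixed for its lifetime

    def classify(self, ssrc, payload_type):
        """Return the media type of a packet, or raise ValueError."""
        if payload_type not in self.payload_types:
            raise ValueError("unknown payload type %d" % payload_type)
        media_type = self.payload_types[payload_type][0]
        bound = self.ssrc_media_type.setdefault(ssrc, media_type)
        if bound != media_type:
            # An SSRC may change codec, but not media type.
            raise ValueError(
                "SSRC %#010x switched from %s to %s" % (ssrc, bound, media_type))
        return media_type


demux = MediaTypeDemux(PAYLOAD_TYPES)
first = demux.classify(0x000000A1, 96)   # audio SSRC, first codec
second = demux.classify(0x000000A1, 0)   # same SSRC, other audio codec
third = demux.classify(0x000000B2, 97)   # separate SSRC for video
```

   A packet with SSRC 0x000000A1 and payload type 97 would be rejected
   by this sketch, since that SSRC is already bound to audio.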

   The second paragraph of Section 6 in RTP Profile for Audio and Video
   Conferences with Minimal Control [RFC3551] says:

      The payload types currently defined in this profile are assigned
      to exactly one of three categories or media types: audio only,
      video only and those combining audio and video.  The media types
      are marked in Tables 4 and 5 as "A", "V" and "AV", respectively.
      Payload types of different media types SHALL NOT be interleaved or
      multiplexed within a single RTP session, but multiple RTP sessions
      MAY be used in parallel to send multiple media types.  An RTP
      source MAY change payload types within the same media type during
      a session.  See the section "Multiplexing RTP Sessions" of RFC
      3550 for additional explanation.

   The purpose of this specification is to override that existing SHALL
   NOT under certain conditions.  Thus this sentence also has to be
   changed to allow payload types of multiple media types in the same
   session.  The above sentence is changed to:

      Payload types of different media types SHALL NOT be interleaved or
      multiplexed within a single RTP session unless as specified and
      under the restriction in Multiple Media Types in an RTP Session
      [RFCXXXX].  Multiple RTP sessions MAY be used in parallel to send
      multiple media types.

   RFC-Editor Note: Please replace RFCXXXX with the RFC number of this
   specification when assigned.

   We can now go on and discuss the five bullets that motivate the above
   restrictions in Section 5.2 of the RTP Specification [RFC3550].  They
   are repeated here for the reader's convenience:

   1.  If, say, two audio streams shared the same RTP session and the
       same SSRC value, and one were to change encodings and thus
       acquire a different RTP payload type, there would be no general
       way of identifying which stream had changed encodings.

   2.  An SSRC is defined to identify a single timing and sequence
       number space.
       Interleaving multiple payload types would require different
       timing spaces if the media clock rates differ and would require
       different sequence number spaces to tell which payload type
       suffered packet loss.

   3.  The RTCP sender and receiver reports (see Section 6.4 of RFC
       3550) can only describe one timing and sequence number space per
       SSRC and do not carry a payload type field.

   4.  An RTP mixer would not be able to combine interleaved streams of
       incompatible media into one stream.

   5.  Carrying multiple media in one RTP session precludes: the use of
       different network paths or network resource allocations if
       appropriate; reception of a subset of the media if desired, for
       example just audio if video would exceed the available bandwidth;
       and receiver implementations that use separate processes for the
       different media, whereas using separate RTP sessions permits
       either single- or multiple-process implementations.

   Bullets 1 to 3 are all addressed by requiring each media source to
   use one or more unique SSRCs, as mandated below (Section 6.2).
   Bullet 4 can be answered by two arguments.  First, each SSRC will be
   associated with a specific media type, communicated through the RTP
   payload type, allowing a middlebox to do media type specific
   operations.  Second, in many contexts blind combining without
   additional context is not suitable anyway.  Bullet 5 is an understood
   and explicitly stated applicability limitation of the method
   described in this document.

6.2.  Sender Source Restrictions

   An SSRC in the RTP session MUST only send one media type (audio,
   video, text etc.) during the SSRC's lifetime.  The main motivation
   is that a given SSRC has its own RTP timestamp and sequence number
   spaces.
   In the same way that you can't send two streams of encoded audio on
   the same SSRC, you can't send one audio and one video encoding on the
   same SSRC.  Each media encoding when made into an RTP stream needs to
   have sole control over the sequence number and timestamp space.  If
   not, one would not be able to detect packet loss for that particular
   stream.  Nor can one easily determine which clock rate a particular
   SSRC's timestamp will increase with.  For additional arguments why
   RTP payload type based multiplexing of multiple media streams doesn't
   work see Appendix A in [I-D.ietf-avtcore-multiplex-guidelines].

6.3.  Payload Type Applicability

   Most Payload Types have a native media type; for example, an audio
   codec naturally belongs to the audio media type.  However, there
   exist a number of RTP payload types that don't have a native media
   type.  For example, transport robustness mechanisms like RTP
   Retransmission [RFC4588] and Generic FEC [RFC5109] inherit their
   media type from what they protect.  RTP Retransmission is explicitly
   bound to the payload type it is protecting, and thus will inherit its
   media type.  However, Generic FEC is an excellent example of an RTP
   payload type that has no natural media type.  The media type for what
   it protects is not relevant as it is the recovered RTP packets that
   have a particular media type, and thus Generic FEC is best
   categorized as an application media type.

   The above discussion is relevant to what limitations exist for RTP
   payload type usage within an RTP session that has multiple media
   types.  In fact this document (Section 7.2) suggests that Generic FEC
   (XOR-based) as defined in RFC 5109 can actually use a single media
   type when used with independent RTP sessions for source and repair
   data.

      Note: a particular SSRC carrying Generic FEC will clearly only
      protect a specific SSRC, and thus that instance is bound to that
      SSRC's media type.  For this specific case, it is possible to have
      one be applicable to both.  However, in cases when the signalling
      is set up to enable fallback to using separate RTP sessions, then
      using a different media type, e.g. application, than the media
      being protected can create issues.

6.4.  RTCP Considerations

   Guidelines for handling RTCP when sending multiple media streams with
   disparate rates in a single RTP session are outlined in
   [I-D.ietf-avtcore-rtp-multi-stream].  These guidelines apply when
   sending multiple types of media in a single RTP session if the
   different types of media have different rates.

7.  Extension Considerations

   This section discusses the impact on some RTP/RTCP extensions due to
   usage of multiple media types in one RTP session.  Only extensions
   for which there is something worth noting have been included.

7.1.  RTP Retransmission

   SSRC-multiplexed RTP retransmission [RFC4588] is actually very
   straightforward.  Each retransmission RTP payload type is explicitly
   connected to an associated payload type.  If retransmission is only
   to be used with a subset of all payload types, this is not a problem,
   as it will be evident from the retransmission payload types which
   payload types have retransmission enabled for them.

   Session-multiplexed RTP retransmission is also possible to use, where
   a retransmission session contains the retransmissions of the
   associated payload types in the source RTP session.  The only
   difference from before is that the source RTP session is one which
   contains multiple media types.  Thus it is even more likely that only
   a subset of the source RTP session's payload types and SSRCs are
   actually retransmitted.
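
   The statement that the retransmission payload types make it evident
   which payload types have retransmission enabled can be sketched as
   below.  This is an illustration under assumed values: the payload
   type numbers, variable names and function name are hypothetical.
   The mapping mirrors the associated payload type ("apt") parameter
   that [RFC4588] uses to bind a retransmission payload type to an
   original one.

```python
# Hypothetical configuration: retransmission payload type -> associated
# (original) payload type.  All numbers are example values.
RTX_ASSOCIATED_PT = {
    100: 96,  # retransmission for the audio payload type 96
    101: 97,  # retransmission for the video payload type 97
}

# Media payload types configured in the session; 98 (say, text) has no
# retransmission payload type associated with it.
MEDIA_PAYLOAD_TYPES = {96, 97, 98}


def retransmission_enabled(media_pts, rtx_map):
    """Return the media payload types that some RTX payload type refers to."""
    associated = set(rtx_map.values())
    return {pt for pt in media_pts if pt in associated}


enabled = retransmission_enabled(MEDIA_PAYLOAD_TYPES, RTX_ASSOCIATED_PT)
```

   In this example only payload types 96 and 97 are covered, so a
   receiver can tell from the configuration alone that payload type 98
   is sent without retransmission.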

   Open Issue: Using SDP to signal retransmission with one RTP session
   that carries multiple media types and one RTP session for the
   retransmission data results in multiple m= lines being grouped
   using FID, while the m= lines belonging to each respective RTP
   session are grouped using BUNDLE. This usage might contradict both
   the FID semantics [RFC5888] and an assumption in the RTP
   retransmission specification [RFC4588].

7.2. Generic FEC

   The RTP Payload Format for Generic Forward Error Correction
   [RFC5109], and also its predecessor [RFC2733], requires some
   considerations, which differ depending on the configuration in use.

   Independent RTP Sessions, i.e., where source and repair data are
   sent in different RTP sessions: As this configuration requires
   separate RTP sessions, there has to be at least one RTP session for
   source data, and this session can be one using multiple media
   types. The repair session only needs one RTP payload type
   indicating repair data, i.e., x/ulpfec or x/parityfec depending on
   whether RFC 5109 or RFC 2733 is used. The media type in this
   session is not relevant and can in theory be any of the defined
   ones. It is RECOMMENDED that one uses "Application".

   In stream, using RTP Payload for Redundant Audio Data [RFC2198] to
   combine repair and source data in the same packets: This can be
   used within a single RTP session. However, the usage and
   configuration of the payload types can create an issue. First, it
   might be necessary to have one payload type per media type for the
   FEC repair data payload format, i.e., one for audio/ulpfec and one
   for text/ulpfec if audio and text are combined in an RTP session.
   Second, each combination of source payload and its FEC repair data
   has to be an explicitly configured payload type.
   This has the potential to turn the limited number of available RTP
   payload types into a real issue.

8. Signalling

   Establishing an RTP session with multiple media types requires
   signalling. This signalling needs to fulfil the following
   requirements:

   1. Ensure that any participant in the RTP session is aware that
      this is an RTP session with multiple media types.

   2. Ensure that the payload types in use in the RTP session use
      unique values, with no overlap between the media types.

   3. Configure the RTP session level parameters, such as RTCP RR and
      RS bandwidth, AVPF trr-int, underlying transport, the RTCP
      extensions in use, and security parameters, commonly for the
      RTP session.

   4. RTP and RTCP functions that can be bound to a particular media
      type SHOULD, when possible, be reused for other media types as
      well, instead of having to be configured for multiple
      code-points. Note: In some cases one will not have a choice but
      to use multiple configurations.

8.1. SDP-Based Signalling

   The signalling of multiple media types in one RTP session in SDP is
   specified in "Multiplexing Negotiation Using Session Description
   Protocol (SDP) Port Numbers"
   [I-D.ietf-mmusic-sdp-bundle-negotiation].

9. IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section is to be removed on publication as
   an RFC.

10. Security Considerations

   Having an RTP session with multiple media types does not change the
   methods for securing a particular RTP session. One possible
   difference is that the different media have often had different
   security requirements. When combining multiple media types in one
   session, their security requirements also have to be combined by
   selecting the most demanding for each property.
   Thus, having multiple media types can result in increased security
   overhead for some media types, to ensure that all requirements are
   met.

   Otherwise, the recommendations for how to configure an RTP session
   do not add any additional requirements compared to normal RTP,
   except for the need to ensure that the participants are aware that
   it is a multiple media type session. If that is not ensured, it can
   cause issues in the RTP session for both the unaware and the aware
   participants. Similar issues can also be produced in a normal RTP
   session by creating configurations for different endpoints that do
   not match each other.

11. Acknowledgements

   The authors would like to thank Christer Holmberg, Gunnar
   Hellstroem, and Charles Eckel for their feedback on the document.

12. References

12.1. Normative References

   [I-D.ietf-avtcore-rtp-multi-stream]
              Lennox, J., Westerlund, M., Wu, W., and C. Perkins,
              "Sending Multiple Media Streams in a Single RTP Session",
              draft-ietf-avtcore-rtp-multi-stream-01 (work in
              progress), July 2013.

   [I-D.ietf-mmusic-sdp-bundle-negotiation]
              Holmberg, C., Alvestrand, H., and C. Jennings,
              "Multiplexing Negotiation Using Session Description
              Protocol (SDP) Port Numbers",
              draft-ietf-mmusic-sdp-bundle-negotiation-05 (work in
              progress), October 2013.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC3551]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio
              and Video Conferences with Minimal Control", STD 65,
              RFC 3551, July 2003.

12.2. Informative References

   [I-D.ietf-avtcore-multiplex-guidelines]
              Westerlund, M., Perkins, C., and H.
              Alvestrand, "Guidelines for using the Multiplexing
              Features of RTP to Support Multiple Media Streams",
              draft-ietf-avtcore-multiplex-guidelines-01 (work in
              progress), July 2013.

   [I-D.lennox-payload-ulp-ssrc-mux]
              Lennox, J., "Supporting Source-Multiplexing of the
              Real-Time Transport Protocol (RTP) Payload for Generic
              Forward Error Correction",
              draft-lennox-payload-ulp-ssrc-mux-00 (work in progress),
              February 2013.

   [I-D.westerlund-avtcore-transport-multiplexing]
              Westerlund, M. and C. Perkins, "Multiplexing Multiple
              RTP Sessions onto a Single Lower-Layer Transport",
              draft-westerlund-avtcore-transport-multiplexing-07 (work
              in progress), October 2013.

   [RFC2198]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
              Handley, M., Bolot, J., Vega-Garcia, A., and S.
              Fosse-Parisis, "RTP Payload for Redundant Audio Data",
              RFC 2198, September 1997.

   [RFC2733]  Rosenberg, J. and H. Schulzrinne, "An RTP Payload Format
              for Generic Forward Error Correction", RFC 2733,
              December 1999.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, July 2006.

   [RFC4585]  Ott, J., Wenger, S., Sato, N., Burmeister, C., and J.
              Rey, "Extended RTP Profile for Real-time Transport
              Control Protocol (RTCP)-Based Feedback (RTP/AVPF)",
              RFC 4585, July 2006.

   [RFC4588]  Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
              Hakenberg, "RTP Retransmission Payload Format",
              RFC 4588, July 2006.

   [RFC5109]  Li, A., "RTP Payload Format for Generic Forward Error
              Correction", RFC 5109, December 2007.

   [RFC5117]  Westerlund, M. and S. Wenger, "RTP Topologies",
              RFC 5117, January 2008.

   [RFC5124]  Ott, J. and E. Carrara, "Extended Secure RTP Profile for
              Real-time Transport Control Protocol (RTCP)-Based
              Feedback (RTP/SAVPF)", RFC 5124, February 2008.

   [RFC5506]  Johansson, I. and M.
              Westerlund, "Support for Reduced-Size Real-Time
              Transport Control Protocol (RTCP): Opportunities and
              Consequences", RFC 5506, April 2009.

   [RFC5576]  Lennox, J., Ott, J., and T. Schierl, "Source-Specific
              Media Attributes in the Session Description Protocol
              (SDP)", RFC 5576, June 2009.

   [RFC5761]  Perkins, C. and M. Westerlund, "Multiplexing RTP Data
              and Control Packets on a Single Port", RFC 5761,
              April 2010.

   [RFC5888]  Camarillo, G. and H. Schulzrinne, "The Session
              Description Protocol (SDP) Grouping Framework",
              RFC 5888, June 2010.

Authors' Addresses

   Magnus Westerlund
   Ericsson
   Farogatan 6
   SE-164 80 Kista
   Sweden

   Phone: +46 10 714 82 87
   Email: magnus.westerlund@ericsson.com

   Colin Perkins
   University of Glasgow
   School of Computing Science
   Glasgow G12 8QQ
   United Kingdom

   Email: csp@csperkins.org

   Jonathan Lennox
   Vidyo, Inc.
   433 Hackensack Avenue
   Seventh Floor
   Hackensack, NJ 07601
   US

   Email: jonathan@vidyo.com