AVTCORE WG                                                 M. Westerlund
Internet-Draft                                                  Ericsson
Updates: 3550, 3551 (if approved)                             C. Perkins
Intended status: Standards Track                   University of Glasgow
Expires: April 25, 2013                                        J. Lennox
                                                                   Vidyo
                                                        October 22, 2012


                 Multiple Media Types in an RTP Session
             draft-ietf-avtcore-multi-media-rtp-session-01

Abstract

   This document specifies how an RTP session can contain media streams
   with media from multiple media types such as audio, video, and text.
   This has been restricted by the RTP Specification, and thus this
   document updates RFC 3550 and RFC 3551 to enable this behavior for
   applications that satisfy the applicability for using multiple media
   types in a single RTP session.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 25, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Definitions
     2.1.  Requirements Language
     2.2.  Terminology
   3.  Motivation
     3.1.  NAT and Firewalls
     3.2.  No Transport Level QoS
     3.3.  Architectural Equality
   4.
       Overview of Solution
   5.  Applicability
     5.1.  Usage of the RTP session
     5.2.  Signalled Support
     5.3.  Homogeneous Multi-party
     5.4.  Reduced number of Payload Types
     5.5.  Stream Differentiation
     5.6.  Non-compatible Extensions
   6.  RTP Session Specification
     6.1.  RTP Session
     6.2.  Sender Source Restrictions
     6.3.  Payload Type Applicability
     6.4.  RTCP
   7.  Extension Considerations
     7.1.  RTP Retransmission
     7.2.  Generic FEC
   8.  Signalling
     8.1.  SDP-Based Signalling
   9.  IANA Considerations
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1. Introduction

   When the Real-time Transport Protocol (RTP) [RFC3550] was designed,
   close to 20 years ago, IP networks were very different compared to
   the ones in 2012 when this is written.
   The almost ubiquitous deployment of Network Address Translators
   (NAT) and Firewalls has increased the cost and likelihood of
   communication failure when using many different transport flows.
   Thus there exists a pressure to reduce the number of concurrent
   transport flows.

   RTP [RFC3550] recommends against sending several different types of
   media, for example audio and video, in a single RTP session. The RTP
   profile for Audio and Video Conferences with Minimal Control
   (RTP/AVP) [RFC3551] mandates a similar restriction. The motivation
   for these limitations is partly to allow lower layer Quality of
   Service (QoS) mechanisms to be used, and partly due to limitations of
   the RTCP timing rules that require all media in a session to have
   similar bandwidth. The Session Description Protocol (SDP) [RFC4566],
   as one of the dominant signalling methods for establishing RTP
   sessions, has enforced this rule, simply by not allowing multiple
   media types for a given receiver destination or set of ICE
   candidates, which is the most common method to determine which RTP
   session the packets are intended for.

   The fact that these limitations have been in place for so long a
   time, in addition to RFC 3550 being written without fully considering
   multiple media types in an RTP session, does result in a number of
   considerations being needed when allowing this behavior. This
   document provides such considerations regarding applicability as well
   as functionality, including normative specification of behavior.

   First, some basic definitions are provided. This is followed by a
   background that discusses the motivation in more detail. An overview
   of the solution of how to provide multiple media types in one RTP
   session is then presented. Next is the formal applicability of this
   specification, followed by the normative specification.
   This is followed by a discussion of how some RTP/RTCP Extensions
   should function in the case of multiple media types in one RTP
   session. Then comes a specification of the requirements on
   signalling from this specification and a look at how this is
   realized in SDP using Bundle
   [I-D.ietf-mmusic-sdp-bundle-negotiation]. The document ends with the
   security considerations.

2. Definitions

2.1. Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2.2. Terminology

   The following terms are used with supplied definitions:

   Endpoint: A single entity sending or receiving RTP packets. It may
      be decomposed into several functional blocks, but as long as it
      behaves as a single RTP stack entity it is classified as a single
      endpoint.

   Media Stream: A sequence of RTP packets using a single SSRC that
      together carries part or all of the content of a specific Media
      Type from a specific sender source within a given RTP session.

   Media Type: Audio, video, text or application whose form and meaning
      are defined by a specific real-time application.

   RTP Session: As defined by [RFC3550], the endpoints belonging to the
      same RTP Session are those that share a single SSRC space. That
      is, those endpoints can see an SSRC identifier transmitted by any
      one of the other endpoints. An endpoint can receive an SSRC
      either as SSRC or as CSRC in RTP and RTCP packets. Thus, the RTP
      Session scope is decided by the endpoints' network interconnection
      topology, in combination with RTP and RTCP forwarding strategies
      deployed by endpoints and any interconnecting middle nodes.

3. Motivation

   This section discusses in more detail the main motivations why
   allowing multiple media types in the same RTP session is suitable.

3.1.
     NAT and Firewalls

   The existence of NATs and Firewalls at almost all Internet access
   points has had implications on protocols like RTP that were designed
   to use multiple transport flows. First of all, the NAT/FW traversal
   solution one uses needs to ensure that all these transport flows are
   established. This has three different impacts:

   1. Increased delay to perform the transport flow establishment.

   2. The more transport flows, the more state and the more resource
      consumption in the NATs and Firewalls. When the resource
      consumption in NAT/FWs reaches their limits, unexpected behaviors
      usually occur.

   3. More transport flows means a higher risk that some transport flow
      fails to be established, thus preventing the application from
      communicating.

   Using fewer transport flows reduces the risk of communication
   failure, improves establishment behavior, and puts less load on NATs
   and Firewalls.

3.2. No Transport Level QoS

   Many RTP-using applications don't utilize any network level Quality
   of Service functions. Nor do they expect or desire any separation in
   network treatment of their media packets, independent of whether they
   are audio, video or text. When an application has no such desire, it
   doesn't need to provide a transport flow structure that simplifies
   flow based QoS.

3.3. Architectural Equality

   For applications that don't require different lower-layer QoS for
   different media types, and that have no special requirements for RTP
   extensions or RTCP reporting, the requirement to separate different
   media into different RTP sessions may seem unnecessary. Provided the
   media flows have similar bandwidth requirements, so that the RTCP
   timing rules work, using the same RTP session for several types of
   media at once appears a reasonable choice.
   The architecture should be agnostic about the type of media being
   carried in an RTP session to the extent possible given the
   constraints of the protocol.

4. Overview of Solution

   The goal of the solution is to enable having one or more RTP
   sessions, where each RTP session may contain two or more media types.
   This includes having multiple RTP sessions containing a given media
   type, for example having three sessions each containing video and
   audio.

   The solution is quite straightforward. The first step is to override
   the SHOULD and SHOULD NOT language of the RTP specification
   [RFC3550]. A similar change is needed to a sentence in Section 6 of
   [RFC3551] that states that "different media types SHALL NOT be
   interleaved or multiplexed within a single RTP Session". This is
   resolved by appropriate exception clauses, given that this
   specification and its applicability are followed.

   Within an RTP session where multiple media types have been configured
   for use, an SSRC may send only one type of media during its lifetime
   (i.e., it can switch between different audio codecs, since those are
   both the same type of media, but cannot switch between audio and
   video). Different SSRCs must be used for the different media
   sources, the same way multiple media sources of the same media type
   already have to do. The payload type will inform a receiver which
   media type the SSRC is being used for. Thus the payload type must be
   unique across all of the payload configurations, independent of media
   type, that may be used in the RTP session.

   A few extra considerations within the RTP session also need to be
   taken into account. Configuring RTCP bandwidth and regular reporting
   suppression (AVPF and SAVPF) should be considered. Certain payload
   types like FEC also need additional rules.
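
   The payload type uniqueness rule described above can be illustrated
   with a minimal sketch. It assumes a hypothetical configuration step
   that merges per-media-type payload maps into one session-wide table;
   the function name, the map layout, and the codec names are
   illustrative and not part of this specification.

```python
def merge_payload_maps(per_media):
    """Merge per-media-type payload maps into one session-wide table.

    `per_media` is a hypothetical structure: media type -> {payload type: codec}.
    Raises ValueError if a payload type number is outside the 7-bit range
    or is claimed by more than one payload configuration.
    """
    session_table = {}
    for media_type, payload_map in per_media.items():
        for payload_type, codec in payload_map.items():
            if not 0 <= payload_type <= 127:
                raise ValueError(f"payload type {payload_type} is not a 7-bit value")
            if payload_type in session_table:
                raise ValueError(f"payload type {payload_type} is used twice")
            # The payload type alone now tells a receiver the media type.
            session_table[payload_type] = (media_type, codec)
    return session_table


# Illustrative configuration: payload type numbers are arbitrary examples.
table = merge_payload_maps({
    "audio": {96: "opus", 97: "PCMU"},
    "video": {100: "VP8"},
})
```

   A receiver can then look up table[payload_type] to learn the media
   type of the SSRC that sent a packet. A configuration that reuses a
   payload type number across audio and video is rejected up front.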

   The final important part of the solution is to use signalling to
   ensure that agreement on using multiple media types in an RTP
   session exists, and to determine how that is then configured. Thus
   this document documents some existing requirements, while an
   external reference defines how this is accomplished in SDP.

5. Applicability

   This specification has limited applicability and anyone intending to
   use it must ensure that their application and usage meet the
   criteria below.

5.1. Usage of the RTP session

   Before choosing to use this specification, an application implementer
   needs to ensure that they don't have a need for different RTP
   sessions between the media types for some reason. The main rule is
   that if one expects to have equal treatment of all media packets,
   then this specification might be suitable. The equal treatment
   includes anything from network level up to RTCP reporting and
   feedback. The document Guidelines for using the Multiplexing
   Features of RTP [I-D.westerlund-avtcore-multiplex-architecture] gives
   more detailed guidance on aspects to consider when choosing how to
   use RTP and specifically RTP sessions. RTP-using applications that
   need or would prefer multiple RTP sessions, but do not require the
   functionalities or behaviors that multiple transport flows give, can
   consider using Multiple RTP Sessions on a Single Lower-Layer
   Transport [I-D.westerlund-avtcore-transport-multiplexing].

   The second important consideration is that all media flows to be sent
   within a single RTP session need to have similar bandwidth. This is
   due to limitations of the RTCP timing rules, and the need for a
   common RTCP reporting interval across all participants in a session
   to avoid problems with premature SSRC timeouts.
   If an RTP session contains flows with very different bandwidths, for
   example low-rate audio coupled with high-quality video, this will
   result in either excessive or insufficient RTCP for some flows,
   depending on how the RTCP session bandwidth, and hence reporting
   interval, is configured. This is discussed further in Section 6.4.

5.2. Signalled Support

   Usage of this specification is not compatible with anyone following
   RFC 3550 and intending to have different RTP sessions for each media
   type. Therefore there must be mutual agreement to use multiple media
   types in one RTP session by all participants within an RTP session.
   This agreement must in most cases be determined using signalling.

   This requirement can be a problem for signalling solutions that can't
   negotiate with all participants. For declarative signalling
   solutions, mandating that the session is using multiple media types
   in one RTP session can be a way of attempting to ensure that all
   participants in the RTP session follow the requirement. However, for
   signalling solutions that lack methods for enforcing that a receiver
   supports a specific feature, this can still cause issues.

5.3. Homogeneous Multi-party

   In multiparty communication scenarios it is important to separate two
   different cases. One case is where the RTP session contains multiple
   participants in a common RTP session. This occurs for example in Any
   Source Multicast (ASM) and Transport Translator topologies as defined
   in RTP Topologies [RFC5117]. It may also occur in some
   implementations of RTP mixers that share the same SSRC/CSRC space
   across all participants. The second case is when the RTP session is
   terminated in a middlebox and the other participants' sources are
   projected or switched into each RTP session and rewritten on RTP
   header level including SSRC mappings.

   For the first case, with a common RTP session or at least shared
   SSRC/CSRC values, all participants in multiparty communication are
   required to support multiple media types in an RTP session. A
   participant using two or more RTP sessions towards a multiparty
   session can't be collapsed into a single session with multiple media
   types. The reason is that in case of multiple RTP sessions, the same
   SSRC value can be used in both RTP sessions without any issues, but
   when collapsed to a single session there is an SSRC collision. In
   addition some collisions can't be represented in the multiple
   separate RTP sessions. For example, in a session with audio and
   video, an SSRC value used for video will not show up in the Audio RTP
   session at the participant using multiple RTP sessions, and thus not
   trigger any collision handling. Thus any application using this type
   of RTP session structure must have homogeneous support for multiple
   media types in one RTP session, or be forced to insert a translator
   node between that participant and the rest of the RTP session.

   For the second case of separate RTP sessions for each multiparty
   participant and a central node it is possible to have a mix of single
   RTP session users and multiple RTP session users as long as one is
   willing to remap the SSRCs used by a participant with multiple RTP
   sessions into non-used values in the single RTP session SSRC space
   for each of the participants using a single RTP session with multiple
   media types. It can be noted that this type of implementation is
   required to understand any type of RTP/RTCP extension being used in
   the RTP sessions to correctly be able to translate them between the
   RTP sessions.

5.4. Reduced number of Payload Types

   An RTP session with multiple media types in it has only a single
   7-bit Payload Type range for all its payload types.
   All the different RTP payload configurations for all the media types
   must fit within the 128 available values, or within only 96 or fewer
   if "Multiplexing RTP Data and Control Packets on a Single Port"
   [RFC5761] is used. For most applications this will not be a real
   problem, but the limitation exists and could be encountered.

5.5. Stream Differentiation

   If network level differentiation of the media streams of different
   media types is desired, using this specification can cause severe
   limitations. All media streams in an RTP session, independent of the
   media type, will be sent over the same underlying transport flow.
   Any flow-based Quality of Service (QoS) mechanism will be unable to
   provide differentiated treatment between different media types, e.g.
   to prioritize audio over video. If that is desired, separate RTP
   sessions over different underlying transport flows need to be used.
   Any marking-based QoS scheme like DiffServ is not affected unless a
   network ingress marks based on flows.

5.6. Non-compatible Extensions

   There exist some RTP and RTCP extensions that rely on the existence
   of multiple RTP sessions. If the goal of using an RTP session with
   multiple media types is to have only a single RTP session, then these
   extensions can't be used. If one has no need to have different RTP
   sessions for the media types but is willing to have multiple RTP
   sessions, one for the main media transmission and one for the
   extension, they can be used. It should be noted that this assumes
   that it is possible to get the extension working when the related RTP
   session contains multiple media types.

   Identified RTP/RTCP extensions that require multiple RTP Sessions
   are:

   RTP Retransmission: RTP Retransmission [RFC4588] has a
      session-multiplexed mode. It also has an SSRC-multiplexed mode
      that can be used instead. So use the mode that is suitable for
      the RTP application.

   XOR-Based FEC: The RTP Payload Format for Generic Forward Error
      Correction [RFC5109] and its predecessor [RFC2733] require a
      separate RTP session unless the FEC data is carried in RTP Payload
      for Redundant Audio Data [RFC2198], which has another set of
      restrictions.

      Note that the Source-Specific Media Attributes [RFC5576]
      specification defines an SDP syntax (the "FEC" semantic of the
      "ssrc-group" attribute) to signal FEC relationships between
      multiple media streams within a single RTP session. However, this
      can't be used as the FEC repair packets are required to have the
      same SSRC value as the source packets being protected. [RFC5576]
      does not normatively update and resolve that restriction.

6. RTP Session Specification

   This section defines what needs to be done or avoided to make an RTP
   session with multiple media types function without issues.

6.1. RTP Session

   Section 5.2 of "RTP: A Transport Protocol for Real-Time Applications"
   [RFC3550] states:

      For example, in a teleconference composed of audio and video media
      encoded separately, each medium SHOULD be carried in a separate
      RTP session with its own destination transport address.

      Separate audio and video streams SHOULD NOT be carried in a single
      RTP session and demultiplexed based on the payload type or SSRC
      fields.

   This specification changes both of these sentences. The first
   sentence is changed to:

      For example, in a teleconference composed of audio and video media
      encoded separately, each medium SHOULD be carried in a separate
      RTP session with its own destination transport address, unless
      specification [RFCXXXX] is followed and the application meets the
      applicability constraints.

   The second sentence is changed to:

      Separate audio and video streams SHOULD NOT be carried in a single
      RTP session and demultiplexed based on the payload type or SSRC
      fields, unless multiplexed based on both SSRC and payload type and
      usage meets what Multiple Media Types in an RTP Session [RFCXXXX]
      specifies.

   The second paragraph of Section 6 in RTP Profile for Audio and Video
   Conferences with Minimal Control [RFC3551] says:

      The payload types currently defined in this profile are assigned
      to exactly one of three categories or media types: audio only,
      video only and those combining audio and video. The media types
      are marked in Tables 4 and 5 as "A", "V" and "AV", respectively.
      Payload types of different media types SHALL NOT be interleaved or
      multiplexed within a single RTP session, but multiple RTP sessions
      MAY be used in parallel to send multiple media types. An RTP
      source MAY change payload types within the same media type during
      a session. See the section "Multiplexing RTP Sessions" of RFC
      3550 for additional explanation.

   This specification's purpose is to violate that existing SHALL NOT
   under certain conditions. Thus this sentence must also be changed to
   allow payload types of multiple media types in the same session.
   The above sentence is changed to:

      Payload types of different media types SHALL NOT be interleaved or
      multiplexed within a single RTP session unless as specified and
      under the restrictions in Multiple Media Types in an RTP Session
      [RFCXXXX]. Multiple RTP sessions MAY be used in parallel to send
      multiple media types.

   RFC-Editor Note: Please replace RFCXXXX with the RFC number of this
   specification when assigned.

   We can now go on and discuss the five bullets in Section 5.2 of the
   RTP Specification [RFC3550] that motivate the previous restrictions.
   They are repeated here for the reader's convenience:

   1.
       If, say, two audio streams shared the same RTP session and the
       same SSRC value, and one were to change encodings and thus
       acquire a different RTP payload type, there would be no general
       way of identifying which stream had changed encodings.

   2.  An SSRC is defined to identify a single timing and sequence
       number space. Interleaving multiple payload types would require
       different timing spaces if the media clock rates differ and would
       require different sequence number spaces to tell which payload
       type suffered packet loss.

   3.  The RTCP sender and receiver reports (see Section 6.4 of RFC
       3550) can only describe one timing and sequence number space per
       SSRC and do not carry a payload type field.

   4.  An RTP mixer would not be able to combine interleaved streams of
       incompatible media into one stream.

   5.  Carrying multiple media in one RTP session precludes: the use of
       different network paths or network resource allocations if
       appropriate; reception of a subset of the media if desired, for
       example just audio if video would exceed the available bandwidth;
       and receiver implementations that use separate processes for the
       different media, whereas using separate RTP sessions permits
       either single- or multiple-process implementations.

   Bullets 1 to 3 all relate to the fact that each media source must use
   one or more unique SSRCs; this avoids these issues and is mandated
   below (Section 6.2). Bullet 4 can be answered by two arguments.
   First, each SSRC will commonly have a native media type, communicated
   through the RTP payload type, allowing a middlebox to do media type
   specific operations. The second argument is that in many contexts
   blind combining without additional context is not suitable anyway.
   Regarding bullet 5, this is an understood and explicitly stated
   applicability limitation of the method described in this document.

6.2.
     Sender Source Restrictions

   An SSRC in the RTP session MUST only send one media type (audio,
   video, text etc.) during the SSRC's lifetime. The main motivation is
   that a given SSRC has its own RTP timestamp and sequence number
   spaces. The same way that you can't send two streams of encoded
   audio on the same SSRC, you can't send one audio and one video
   encoding on the same SSRC. Each media encoding when made into an RTP
   stream needs to have the sole control over the sequence number and
   timestamp space. If not, one would not be able to detect packet loss
   for that particular stream. Nor can one easily determine which clock
   rate a particular SSRC's timestamp shall increase with.

6.3. Payload Type Applicability

   Most Payload Types have a native media type; for example, an audio
   codec naturally belongs to the audio media type. However, there
   exist a number of RTP payload types that don't have a native media
   type. For example, transport robustification mechanisms like RTP
   Retransmission [RFC4588] and Generic FEC [RFC5109] inherit their
   media type from what they protect. RTP Retransmission is explicitly
   bound to the payload type it is protecting, and thus will inherit it.
   However, Generic FEC is an excellent example of an RTP payload type
   that has no natural media type. The media type for what it protects
   is not relevant as it is the recovered RTP packets that have a
   particular media type, and thus Generic FEC is best categorized as an
   application media type.

   The above discussion is relevant to what limitations exist for RTP
   payload type usage within an RTP session that has multiple media
   types. In fact, this document (Section 7.2) suggests that Generic
   FEC (XOR-based) as defined in RFC 5109 can actually use a single
   media type when used with independent RTP sessions for source and
   repair data.
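
   The sender source restriction and the inheritance of media type by
   retransmission payload types can be sketched as a receiver-side
   check. This is an illustration under assumptions, not a required
   implementation: the payload type numbers, the two lookup tables, and
   the class name are hypothetical, and rejecting a packet is only one
   possible reaction to a violation.

```python
# Hypothetical session configuration; the numbers are arbitrary examples.
NATIVE_MEDIA = {96: "audio", 100: "video"}
# Retransmission payload type -> payload type it is bound to (assumed mapping).
RTX_ASSOCIATED = {97: 96, 101: 100}


def media_type_of(payload_type):
    """Resolve a payload type to a media type.

    A retransmission payload type inherits the media type of the payload
    type it is bound to.
    """
    payload_type = RTX_ASSOCIATED.get(payload_type, payload_type)
    return NATIVE_MEDIA[payload_type]


class SsrcMediaBinding:
    """Remembers the media type first seen on each SSRC."""

    def __init__(self):
        self._bound = {}  # SSRC -> media type

    def check(self, ssrc, payload_type):
        """Return True if the packet keeps the SSRC on one media type."""
        media = media_type_of(payload_type)
        # setdefault binds the SSRC on first sight, then returns the binding.
        return self._bound.setdefault(ssrc, media) == media
```

   With this sketch, an SSRC that switches between two audio payload
   types passes the check, while an SSRC first seen with an audio
   payload type and later with a video payload type does not.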

      Note that a particular SSRC carrying Generic FEC will clearly only
      protect a specific SSRC and thus that instance is bound to the
      SSRC's media type. For this specific case, it is possible to have
      one be applicable to both. However, in cases when the signalling
      is set up to enable fallback to using separate RTP sessions, then
      using a different media type, e.g. application, than the media
      being protected can create issues.

6.4. RTCP

   An RTP session has a single set of parameters that configure the
   session bandwidth, the RTCP sender and receiver fractions (e.g., via
   the SDP "b=RR:" and "b=RS:" lines), and the parameters of the
   RTP/AVPF profile [RFC4585] (e.g., trr-int) if that profile (or its
   secure extension, RTP/SAVPF [RFC5124]) is used. As a consequence,
   the RTCP reporting interval will be the same for every SSRC in an RTP
   session. This uniform RTCP reporting interval can result in RTCP
   reports being sent more often than is considered desirable for a
   particular media type. For example, if an audio flow is multiplexed
   with a high quality video flow where the session bandwidth is
   configured to match the video bandwidth, this can result in the RTCP
   packets having a greater bandwidth allocation than the audio data
   rate. If the reduced minimum RTCP interval described in Section 6.2
   of [RFC3550] is used in the session, which might be appropriate for
   video where rapid feedback is wanted, the audio sources could be
   required to send RTCP packets more often than they send audio data
   packets. This is clearly undesirable, and while the mismatch can be
   reduced through careful tuning of the RTCP parameters, particularly
   trr-int in RTP/AVPF sessions, it is inherent in the design of the
   RTCP timing rules, and affects all RTP sessions containing flows with
   mismatched bandwidth.
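
   The size of this mismatch can be illustrated with a short worked
   calculation. The sketch assumes the 5% RTCP bandwidth fraction
   recommended in Section 6.2 of [RFC3550] and the reduced minimum
   interval from the same section (360 divided by the session bandwidth
   in kb/s). The 2000 kb/s session bandwidth and 32 kb/s audio rate are
   made-up example figures, and the function names are illustrative.

```python
def rtcp_bandwidth_kbps(session_bw_kbps, rtcp_percent=5):
    """RTCP share of the session bandwidth (RFC 3550 recommends 5%)."""
    return session_bw_kbps * rtcp_percent / 100


def reduced_min_interval_s(session_bw_kbps):
    """Reduced minimum RTCP interval in seconds (RFC 3550, Section 6.2)."""
    return 360 / session_bw_kbps


# Example figures only: session bandwidth sized for the video flow.
session_bw_kbps = 2000
audio_rate_kbps = 32

rtcp_bw = rtcp_bandwidth_kbps(session_bw_kbps)
min_interval = reduced_min_interval_s(session_bw_kbps)

# The RTCP allocation for the whole session exceeds the audio data rate.
rtcp_exceeds_audio = rtcp_bw > audio_rate_kbps
```

   In this example the session-wide RTCP allocation is 100 kb/s, which
   is larger than the 32 kb/s audio flow, and the reduced minimum
   interval is 0.18 seconds. The allocation is shared by all SSRCs in
   the session, so the figures show the scale of the mismatch rather
   than the exact load on any single source.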
      (tbd: A future version of this draft needs to provide details of
      the extent of this problem, recommendations for how to tune the
      RTCP bandwidth fraction and trr-int, and when the mismatch is so
      great that it's better to use separate RTP sessions. The
      recommendations will likely be different for RTP/AVP and RTP/AVPF
      sessions, since trr-int offers a potential solution that is not
      suitable in legacy sessions.)

   Having multiple media types in one RTP session also results in more
   SSRCs being present in this RTP session. This increases the amount
   of cross reporting between the SSRCs. From an RTCP perspective, two
   RTP sessions with half the number of SSRCs in each will be slightly
   more efficient. If someone needs either the higher efficiency due to
   the smaller number of SSRCs or the ability to tailor RTCP usage per
   media type, they need to use independent RTP sessions.

   When it comes to handling multiple SSRCs in an RTP session, there is
   a clarification under discussion in "Real-Time Transport Protocol
   (RTP) Considerations for Endpoints Sending Multiple Media Streams"
   [I-D.lennox-avtcore-rtp-multi-stream]. When it comes to configuring
   RTCP, the need for regular periodic reporting needs to be weighed
   against any feedback or control messages being sent. Applications
   using AVPF or SAVPF are RECOMMENDED to consider setting the trr-int
   parameter to a value suitable for the application's needs, thus
   potentially reducing the need for regular reporting and releasing
   more bandwidth for feedback or control.

   Another aspect of an RTP session with multiple media types is that
   the RTCP packets, RTCP Feedback Messages, or RTCP XR metrics used
   may not be applicable to all media types.
   Instead, all RTP/RTCP endpoints need to correlate the media type of
   the SSRC being referenced in a message or packet and only use those
   that apply to that particular SSRC and its media type. Signalling
   solutions may have shortcomings when it comes to indicating that a
   particular set of RTCP reports or feedback messages only applies to
   a particular media type within an RTP session.

7. Extension Considerations

   This section discusses the impact on some RTP/RTCP extensions of
   using multiple media types in one RTP session. Only extensions where
   there is something worth noting have been included.

7.1. RTP Retransmission

   SSRC-multiplexed RTP retransmission [RFC4588] is very
   straightforward. Each retransmission RTP payload type is explicitly
   connected to an associated payload type. If retransmission is only
   to be used with a subset of all payload types, this is not a problem,
   as it will be evident from the retransmission payload types which
   payload types have retransmission enabled.

   Session-multiplexed RTP retransmission is also possible to use, where
   a retransmission session contains the retransmissions of the
   associated payload types in the source RTP session. The only
   difference from before is that the source RTP session is one that
   contains multiple media types. Thus it is even more likely that only
   a subset of the source RTP session's payload types and SSRCs are
   actually retransmitted.

   Open Issue: Using SDP to signal retransmission for one RTP session
   with multiple media types and one RTP session for the retransmission
   data will cause a situation where one will have multiple m= lines
   grouped using FID, with the ones belonging to the respective RTP
   session being grouped using BUNDLE. This usage may contradict both
   the FID semantics [RFC5888] and an assumption in the RTP
   retransmission specification [RFC4588].

7.2.
Generic FEC

   The RTP Payload Format for Generic Forward Error Correction
   [RFC5109], and also its predecessor [RFC2733], requires some
   considerations, and they differ depending on the configuration in
   use.

   Independent RTP Sessions, i.e. where source and repair data are sent
   in different RTP sessions: As this mode of configuration requires
   different RTP sessions, there must be at least one RTP session for
   source data; this session can be one using multiple media types. The
   repair session only needs one RTP payload type indicating repair
   data, i.e. x/ulpfec or x/parityfec depending on whether RFC 5109 or
   RFC 2733 is used. The media type in this session is not relevant and
   can in theory be any of the defined ones. It is recommended that one
   uses "application".

   In stream, using RTP Payload for Redundant Audio Data [RFC2198] to
   combine repair and source data in the same packets: This is possible
   to use within a single RTP session. However, the usage and
   configuration of the payload types can create an issue. First, it
   might be required to have one payload type per media type for the
   FEC repair data payload format, i.e. one for audio/ulpfec and one for
   text/ulpfec if audio and text are combined in an RTP session.
   Second, each combination of source payload and its FEC repair data
   must be an explicitly configured payload type. This has the
   potential to turn the limited number of available RTP payload types
   into a real issue.

8. Signalling

   The Signalling requirements

   Establishing an RTP session with multiple media types requires
   signalling. This signalling needs to fulfill the following
   requirements:

   1.  Ensure that any participant in the RTP session is aware that this
       is an RTP session with multiple media types.

   2.
       Ensure that the payload types in use in the RTP session are using
       unique values, with no overlap between the media types.

   3.  Configure the RTP session level parameters, such as RTCP RR and
       RS bandwidth, AVPF trr-int, underlying transport, the RTCP
       extensions in use, and security parameters, commonly for the RTP
       session.

   4.  RTP and RTCP functions that can be bound to a particular media
       type should be reused when possible also for other media types,
       instead of having to be configured for multiple code-points.
       Note: In some cases one will not have a choice but to use
       multiple configurations.

8.1. SDP-Based Signalling

   The signalling of multiple media types in one RTP session in SDP is
   specified in "Multiplexing Negotiation Using Session Description
   Protocol (SDP) Port Numbers"
   [I-D.ietf-mmusic-sdp-bundle-negotiation].

9. IANA Considerations

   This document makes no request of IANA.

   Note to RFC Editor: this section may be removed on publication as an
   RFC.

10. Security Considerations

   Having an RTP session with multiple media types doesn't change the
   methods for securing a particular RTP session. One possible
   difference is that the different media have often had different
   security requirements. When combining multiple media types in one
   session, their security requirements must also be combined by
   selecting the most demanding for each property. Thus having multiple
   media types may result in increased security overhead for some media
   types to ensure that all requirements are met.

   Otherwise, the recommendations for how to configure an RTP session
   do not add any additional requirements compared to normal RTP, except
   for the need to be able to ensure that the participants are aware
   that it is a multiple media type session. If that is not ensured, it
   can cause issues in the RTP session for both the unaware and the
   aware participants.
   Similar issues can also be produced in a normal RTP session by
   creating configurations for different endpoints that don't match
   each other.

11. Acknowledgements

   The authors would like to thank Christer Holmberg for the feedback on
   the document.

12. References

12.1. Normative References

   [I-D.ietf-mmusic-sdp-bundle-negotiation]
              Holmberg, C. and H. Alvestrand, "Multiplexing Negotiation
              Using Session Description Protocol (SDP) Port Numbers",
              draft-ietf-mmusic-sdp-bundle-negotiation-01 (work in
              progress), August 2012.

   [I-D.lennox-avtcore-rtp-multi-stream]
              Lennox, J. and M. Westerlund, "Real-Time Transport
              Protocol (RTP) Considerations for Endpoints Sending
              Multiple Media Streams",
              draft-lennox-avtcore-rtp-multi-stream-00 (work in
              progress), July 2012.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC3551]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and
              Video Conferences with Minimal Control", STD 65, RFC 3551,
              July 2003.

12.2. Informative References

   [I-D.westerlund-avtcore-multiplex-architecture]
              Westerlund, M., Burman, B., Perkins, C., and H.
              Alvestrand, "Guidelines for using the Multiplexing
              Features of RTP",
              draft-westerlund-avtcore-multiplex-architecture-02 (work
              in progress), July 2012.

   [I-D.westerlund-avtcore-transport-multiplexing]
              Westerlund, M. and C. Perkins, "Multiple RTP Sessions on a
              Single Lower-Layer Transport",
              draft-westerlund-avtcore-transport-multiplexing-03 (work
              in progress), July 2012.

   [RFC2198]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V.,
              Handley, M., Bolot, J., Vega-Garcia, A., and S.
              Fosse-Parisis, "RTP Payload for Redundant Audio Data",
              RFC 2198, September 1997.

   [RFC2733]  Rosenberg, J. and H. Schulzrinne, "An RTP Payload Format
              for Generic Forward Error Correction", RFC 2733,
              December 1999.

   [RFC4566]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
              Description Protocol", RFC 4566, July 2006.

   [RFC4585]  Ott, J., Wenger, S., Sato, N., Burmeister, C., and J. Rey,
              "Extended RTP Profile for Real-time Transport Control
              Protocol (RTCP)-Based Feedback (RTP/AVPF)", RFC 4585,
              July 2006.

   [RFC4588]  Rey, J., Leon, D., Miyazaki, A., Varsa, V., and R.
              Hakenberg, "RTP Retransmission Payload Format", RFC 4588,
              July 2006.

   [RFC5109]  Li, A., "RTP Payload Format for Generic Forward Error
              Correction", RFC 5109, December 2007.

   [RFC5117]  Westerlund, M. and S. Wenger, "RTP Topologies", RFC 5117,
              January 2008.

   [RFC5124]  Ott, J. and E. Carrara, "Extended Secure RTP Profile for
              Real-time Transport Control Protocol (RTCP)-Based Feedback
              (RTP/SAVPF)", RFC 5124, February 2008.

   [RFC5576]  Lennox, J., Ott, J., and T. Schierl, "Source-Specific
              Media Attributes in the Session Description Protocol
              (SDP)", RFC 5576, June 2009.

   [RFC5761]  Perkins, C. and M. Westerlund, "Multiplexing RTP Data and
              Control Packets on a Single Port", RFC 5761, April 2010.

   [RFC5888]  Camarillo, G. and H. Schulzrinne, "The Session Description
              Protocol (SDP) Grouping Framework", RFC 5888, June 2010.

Authors' Addresses

   Magnus Westerlund
   Ericsson
   Farogatan 6
   SE-164 80 Kista
   Sweden

   Phone: +46 10 714 82 87
   Email: magnus.westerlund@ericsson.com

   Colin Perkins
   University of Glasgow
   School of Computing Science
   Glasgow G12 8QQ
   United Kingdom

   Email: csp@csperkins.org

   Jonathan Lennox
   Vidyo, Inc.
   433 Hackensack Avenue
   Seventh Floor
   Hackensack, NJ 07601
   US

   Email: jonathan@vidyo.com