| < draft-westerlund-avtcore-multiplex-architecture-00.txt | draft-westerlund-avtcore-multiplex-architecture-01.txt > | |||
|---|---|---|---|---|
| Network Working Group M. Westerlund | Network Working Group M. Westerlund | |||
| Internet-Draft B. Burman | Internet-Draft B. Burman | |||
| Intended status: BCP Ericsson | Intended status: Informational Ericsson | |||
| Expires: April 26, 2012 C. Perkins | Expires: September 13, 2012 C. Perkins | |||
| University of Glasgow | University of Glasgow | |||
| October 24, 2011 | March 12, 2012 | |||
| RTP Multiplexing Architecture | RTP Multiplexing Architecture | |||
| draft-westerlund-avtcore-multiplex-architecture-00 | draft-westerlund-avtcore-multiplex-architecture-01 | |||
| Abstract | Abstract | |||
| RTP has always been a protocol that supports multiple participants | The Real-time Transport Protocol (RTP) is a flexible protocol usable | |||
| each sending their own media streams in an RTP session, thus relying | in a wide range of applications and network and system topologies. | |||
| on the three main multiplexing points in RTP: RTP session, SSRC and | This flexibility and the implications of different choices should be | |||
| Payload Type for their various needs. However, most usages of RTP | understood by any application developer using RTP. To facilitate | |||
| have been less complex, often with a single SSRC in each direction, | that understanding, this document contains an in-depth discussion of | |||
| with a single RTP session per media type. But the more complex | the usage of RTP's multiplexing points: the RTP session, the | |||
| usages are becoming more common, and thus guidance on how to use RTP in | Synchronization Source Identifier (SSRC), and the payload type. The | |||
| various complex cases is needed. This document analyzes a number of | focus is put on the first two, trying to give guidance and source | |||
| cases and discusses the usage of the various multiplexing points and | material for an analysis on the most suitable choices for the | |||
| the need for functionality when defining RTP/RTCP extensions that | application being designed. | |||
| utilize multiple RTP streams and multiple RTP sessions. | ||||
| Status of this Memo | Status of this Memo | |||
| This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
| provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
| working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
| Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
| This Internet-Draft will expire on April 26, 2012. | This Internet-Draft will expire on September 13, 2012. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2011 IETF Trust and the persons identified as the | Copyright (c) 2012 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
| to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
| include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
| the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
| skipping to change at page 2, line 22 | skipping to change at page 2, line 21 | |||
| Table of Contents | Table of Contents | |||
| 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 5 | 2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 2.1. Requirements Language . . . . . . . . . . . . . . . . . . 5 | 2.1. Requirements Language . . . . . . . . . . . . . . . . . . 5 | |||
| 2.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5 | 2.2. Terminology . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
| 3. RTP Multiplex Points . . . . . . . . . . . . . . . . . . . . . 6 | 3. RTP Multiplex Points . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 3.1. Session . . . . . . . . . . . . . . . . . . . . . . . . . 6 | 3.1. Session . . . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
| 3.2. SSRC . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 | 3.2. SSRC . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 | |||
| 3.3. CSRC . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 | 3.3. CSRC . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
| 3.4. Payload Type . . . . . . . . . . . . . . . . . . . . . . . 8 | 3.4. Payload Type . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
| 4. Multiple Streams Alternatives . . . . . . . . . . . . . . . . 9 | 4. Multiple Streams Alternatives . . . . . . . . . . . . . . . . 10 | |||
| 5. RTP Topologies and Issues . . . . . . . . . . . . . . . . . . 10 | 5. RTP Topologies and Issues . . . . . . . . . . . . . . . . . . 11 | |||
| 5.1. Point to Point . . . . . . . . . . . . . . . . . . . . . . 11 | 5.1. Point to Point . . . . . . . . . . . . . . . . . . . . . . 12 | |||
| 5.1.1. RTCP Reporting . . . . . . . . . . . . . . . . . . . . 11 | 5.1.1. RTCP Reporting . . . . . . . . . . . . . . . . . . . . 12 | |||
| 5.1.2. Compound RTCP Packets . . . . . . . . . . . . . . . . 12 | 5.1.2. Compound RTCP Packets . . . . . . . . . . . . . . . . 13 | |||
| 5.2. Point to Multipoint Using Multicast . . . . . . . . . . . 12 | 5.2. Point to Multipoint Using Multicast . . . . . . . . . . . 13 | |||
| 5.3. Point to Multipoint Using an RTP Translator . . . . . . . 14 | 5.3. Point to Multipoint Using an RTP Translator . . . . . . . 15 | |||
| 5.4. Point to Multipoint Using an RTP Mixer . . . . . . . . . . 15 | 5.4. Point to Multipoint Using an RTP Mixer . . . . . . . . . . 16 | |||
| 5.5. Point to Multipoint using Multiple Unicast flows . . . . . 16 | 5.5. Point to Multipoint using Multiple Unicast flows . . . . . 17 | |||
| 5.6. Decomposited End-Point . . . . . . . . . . . . . . . . . . 16 | 5.6. De-composite End-Point . . . . . . . . . . . . . . . . . . 18 | |||
| 6. Dismissing Payload Type Multiplexing . . . . . . . . . . . . . 18 | 6. Multiple Streams Discussion . . . . . . . . . . . . . . . . . 19 | |||
| 7. Multiple Streams Discussion . . . . . . . . . . . . . . . . . 20 | 6.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
| 7.1. Introduction . . . . . . . . . . . . . . . . . . . . . . . 20 | 6.2. RTP/RTCP Aspects . . . . . . . . . . . . . . . . . . . . . 19 | |||
| 7.2. RTP/RTCP Aspects . . . . . . . . . . . . . . . . . . . . . 20 | 6.2.1. The RTP Specification . . . . . . . . . . . . . . . . 19 | |||
| 7.2.1. The RTP Specification . . . . . . . . . . . . . . . . 20 | 6.2.2. Handling Varying sets of Senders . . . . . . . . . . . 22 | |||
| 7.2.2. Multiple SSRC Legacy Considerations . . . . . . . . . 22 | 6.2.3. Cross Session RTCP Requests . . . . . . . . . . . . . 22 | |||
| 7.2.3. RTP Specification Clarifications Needed . . . . . . . 23 | 6.2.4. Binding Related Sources . . . . . . . . . . . . . . . 22 | |||
| 7.2.4. Handling Varying sets of Senders . . . . . . . . . . . 23 | 6.2.5. Forward Error Correction . . . . . . . . . . . . . . . 24 | |||
| 7.2.5. Cross Session RTCP requests . . . . . . . . . . . . . 23 | 6.2.6. Transport Translator Sessions . . . . . . . . . . . . 25 | |||
| 7.2.6. Binding Related Sources . . . . . . . . . . . . . . . 23 | 6.3. Interworking . . . . . . . . . . . . . . . . . . . . . . . 25 | |||
| 7.2.7. Forward Error Correction . . . . . . . . . . . . . . . 25 | 6.3.1. Interworking Applications . . . . . . . . . . . . . . 26 | |||
| 7.2.8. Transport Translator Sessions . . . . . . . . . . . . 26 | 6.3.2. Multiple SSRC Legacy Considerations . . . . . . . . . 27 | |||
| 7.2.9. Multiple Media Types in one RTP session . . . . . . . 26 | 6.4. Signalling Aspects . . . . . . . . . . . . . . . . . . . . 28 | |||
| 7.3. Signalling Aspects . . . . . . . . . . . . . . . . . . . . 28 | 6.4.1. Session Oriented Properties . . . . . . . . . . . . . 28 | |||
| 7.3.1. Session Oriented Properties . . . . . . . . . . . . . 28 | 6.4.2. SDP Prevents Multiple Media Types . . . . . . . . . . 29 | |||
| 7.3.2. SDP Prevents Multiple Media Types . . . . . . . . . . 29 | 6.4.3. Media Stream Usage . . . . . . . . . . . . . . . . . . 29 | |||
| 7.4. Network Apsects . . . . . . . . . . . . . . . . . . . . . 29 | 6.5. Network Aspects . . . . . . . . . . . . . . . . . . . . . 30 | |||
| 7.4.1. Quality of Service . . . . . . . . . . . . . . . . . . 29 | 6.5.1. Quality of Service . . . . . . . . . . . . . . . . . . 30 | |||
| 7.4.2. NAT and Firewall Traversal . . . . . . . . . . . . . . 29 | 6.5.2. NAT and Firewall Traversal . . . . . . . . . . . . . . 31 | |||
| 7.4.3. Multicast . . . . . . . . . . . . . . . . . . . . . . 31 | 6.5.3. Multicast . . . . . . . . . . . . . . . . . . . . . . 32 | |||
| 7.4.4. Multiplexing multiple RTP Session on a Single | 6.5.4. Multiplexing multiple RTP Session on a Single | |||
| Transport . . . . . . . . . . . . . . . . . . . . . . 32 | Transport . . . . . . . . . . . . . . . . . . . . . . 33 | |||
| 7.5. Security Aspects . . . . . . . . . . . . . . . . . . . . . 32 | 6.6. Security Aspects . . . . . . . . . . . . . . . . . . . . . 33 | |||
| 7.5.1. Security Context Scope . . . . . . . . . . . . . . . . 32 | 6.6.1. Security Context Scope . . . . . . . . . . . . . . . . 33 | |||
| 7.5.2. Key-Management for Multi-party session . . . . . . . . 33 | 6.6.2. Key-Management for Multi-party session . . . . . . . . 34 | |||
| 8. Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . 33 | 6.6.3. Complexity Implications . . . . . . . . . . . . . . . 34 | |||
| 9. RTP Specification Clarifications . . . . . . . . . . . . . . . 35 | 6.7. Multiple Media Types in one RTP session . . . . . . . . . 35 | |||
| 9.1. RTCP Reporting from all SSRCs . . . . . . . . . . . . . . 35 | 7. Arch-Types . . . . . . . . . . . . . . . . . . . . . . . . . . 37 | |||
| 9.2. RTCP Self-reporting . . . . . . . . . . . . . . . . . . . 35 | 7.1. Single SSRC per Session . . . . . . . . . . . . . . . . . 37 | |||
| 9.3. Combined RTCP Packets . . . . . . . . . . . . . . . . . . 35 | 7.2. Multiple SSRCs of the Same Media Type . . . . . . . . . . 39 | |||
| 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 35 | 7.3. Multiple Sessions for one Media type . . . . . . . . . . . 40 | |||
| 11. Security Considerations . . . . . . . . . . . . . . . . . . . 36 | 7.4. Multiple Media Types in one Session . . . . . . . . . . . 41 | |||
| 12. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 36 | 7.5. Summary . . . . . . . . . . . . . . . . . . . . . . . . . 43 | |||
| 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 36 | 8. Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . 43 | |||
| 13.1. Normative References . . . . . . . . . . . . . . . . . . . 36 | 9. Proposal for Future Work . . . . . . . . . . . . . . . . . . . 44 | |||
| 13.2. Informative References . . . . . . . . . . . . . . . . . . 36 | 10. RTP Specification Clarifications . . . . . . . . . . . . . . . 45 | |||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 39 | 10.1. RTCP Reporting from all SSRCs . . . . . . . . . . . . . . 45 | |||
| 10.2. RTCP Self-reporting . . . . . . . . . . . . . . . . . . . 45 | ||||
| 10.3. Combined RTCP Packets . . . . . . . . . . . . . . . . . . 45 | ||||
| 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 46 | ||||
| 12. Security Considerations . . . . . . . . . . . . . . . . . . . 46 | ||||
| 13. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 46 | ||||
| 14. References . . . . . . . . . . . . . . . . . . . . . . . . . . 46 | ||||
| 14.1. Normative References . . . . . . . . . . . . . . . . . . . 46 | ||||
| 14.2. Informative References . . . . . . . . . . . . . . . . . . 46 | ||||
| Appendix A. Dismissing Payload Type Multiplexing . . . . . . . . 49 | ||||
| Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 51 | ||||
| 1. Introduction | 1. Introduction | |||
| This document focuses on issues of non-basic usage of RTP [RFC3550] | The Real-time Transport Protocol (RTP) [RFC3550] is a commonly used | |||
| where multiple media sources of the same media type are sent over | protocol for real-time media transport. It is a protocol that | |||
| RTP. Separation of different media types is another issue that will | provides great flexibility and can support a large set of different | |||
| be discussed in this document. The intended uses include for example | applications. RTP has several multiplexing points designed for | |||
| multiple sources from the same end-point, multiple streams from a | different purposes. These enable support of multiple media streams | |||
| single media source, multiple end-points each having a source, or an | and switching between different encoding or packetization of the | |||
| application that needs multiple representations (encodings) of a | media. By using multiple RTP sessions, sets of media streams can be | |||
| particular source. It will be shown that these uses are inter- | structured for efficient processing or identification. Thus the | |||
| related and need a common discussion to ensure consistency. In | question for any RTP application designer is how to best use the RTP | |||
| general, usage of the RTP session and media streams will be discussed | session, the SSRC and the payload type to meet the application's | |||
| in detail. | needs. | |||
| RTP is already designed for multiple participants in a communication | ||||
| session. This is not restricted to multicast, as many believe, but | ||||
| also provides functionality over unicast, using either multiple | ||||
| transport flows below RTP or a network node that re-distributes the | ||||
| RTP packets. The node can for example be a transport translator | ||||
| (relay) that forwards the packets unchanged, a translator performing | ||||
| media translation in addition to forwarding, or an RTP mixer that | ||||
| creates new conceptual sources from the received streams. In | ||||
| addition, multiple streams may occur when a single end-point has | ||||
| multiple media sources of the same media type, like multiple cameras | ||||
| or microphones that need to be sent simultaneously. | ||||
| Historically, the most common RTP use cases have been point to point | ||||
| Voice over IP (VoIP) or streaming applications, commonly with no more | ||||
| than one media source per end-point and media type (typically audio | ||||
| and video). Even in conferencing applications, especially voice | ||||
| only, the conference focus or bridge has provided a single stream | ||||
| with a mix of the other participants to each participant. It is also | ||||
| common to have individual RTP sessions between each end-point and the | ||||
| RTP mixer. | ||||
| SSRC is the RTP media stream identifier that helps to uniquely | The purpose of this document is to provide clear information about | |||
| identify media sources in RTP sessions. Even though available SSRC | the possibilities of RTP when it comes to multiplexing. The RTP | |||
| space can theoretically handle more than 4 billion simultaneous | application designer should understand the implications that come | |||
| sources, the perceived need for handling multiple SSRCs in | from a particular choice of RTP multiplexing points. The document | |||
| implementations has been small. This has resulted in an installed | will recommend against some usages as being unsuitable, in general or | |||
| legacy base that isn't fully RTP specification compliant and will | for particular purposes. | |||
| have different issues if they receive multiple SSRCs of media, either | ||||
| simultaneously or in sequence. These issues will manifest themselves | ||||
| in various ways, either by software crashes or simply in limited | ||||
| functionality, like only decoding and playing back the first or | ||||
| latest received SSRC and discarding media related to any other SSRCs. | ||||
| There have also arisen various cases where multiple SSRCs are used to | RTP was from the beginning designed for multiple participants in a | |||
| represent different aspects of what is in fact a single underlying | communication session. This is not restricted to multicast, as some | |||
| media source. A very basic case is RTP retransmission [RFC4588] | may believe, but also provides functionality over unicast, using | |||
| which has one SSRC for the original stream, and a second SSRC either | either multiple transport flows below RTP or a network node that re- | |||
| in the same session or in a different session to represent the | distributes the RTP packets. The re-distributing node can for | |||
| retransmitted packets to ensure that the monitoring functions still | example be a transport translator (relay) that forwards the packets | |||
| function. Another use case is scalable encoding, such as the RTP | unchanged, a translator performing media translation in addition to | |||
| payload format for Scalable Video Coding (SVC) [RFC6190], which has | forwarding, or an RTP mixer that creates new conceptual sources from | |||
| an operation mode named Multiple Session Transmission (MST) that uses | the received streams. In addition, multiple streams may occur when a | |||
| one SSRC in each RTP session to send one or more scalability layers. | single end-point has multiple media sources, like multiple cameras | |||
| A third example is simulcast where a single media source is encoded | or microphones that need to be sent simultaneously. | |||
| in different versions and transmitted to an RTP mixer that picks | ||||
| which version to actually distribute to a given receiver part of the | ||||
| RTP session. | ||||
| This situation has created a need for a document that discusses the | This document has been written due to increased interest in more | |||
| existing possibilities in the RTP protocol and how these can and | advanced usage of RTP, resulting in questions regarding the most | |||
| should be used in applications. A new set of applications needing | appropriate RTP usage. The limitations in some implementations, RTP/ | |||
| more advanced functionalities from RTP is also emerging on the | RTCP extensions, and signalling have also been exposed. It is | |||
| market, such as telepresence and advanced video conferencing. Thus | expected that some limitations will be addressed by updates or new | |||
| furthering the need for a more common understanding of how multiple | extensions resolving the shortcomings. The authors also hope that | |||
| streams are handled in RTP to ensure media plane interoperability. | clarification on the usefulness of some functionalities in RTP will | |||
| result in more complete implementations in the future. | ||||
| The document starts with some definitions and then goes into the | The document starts with some definitions and then goes into the | |||
| existing RTP functionalities around multiplexing. Both the desired | existing RTP functionalities around multiplexing. Both the desired | |||
| behavior and the implications of a particular behavior depend on | behavior and the implications of a particular behavior depend on | |||
| which topologies are used, which requires some consideration. This | which topologies are used, which requires some consideration. This | |||
| is followed by a discussion of some choices in multiplexing behavior | is followed by a discussion of some choices in multiplexing behavior | |||
| and their impacts. Finally, some recommendations and examples are | and their impacts. Some arch-types of RTP usage are discussed. | |||
| provided. | ||||
| Finally, some recommendations and examples are provided. | ||||
| This document is currently an individual contribution, but it is the | ||||
| intention of the authors that this should become a WG document that | ||||
| objectively describes and provides suitable recommendations for which | ||||
| there is WG consensus. Currently this document only represents the | ||||
| views of the authors. The authors gladly accept any feedback on the | ||||
| document and will be happy to discuss suitable recommendations. | ||||
| 2. Definitions | 2. Definitions | |||
| 2.1. Requirements Language | 2.1. Requirements Language | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
| document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in RFC 2119 [RFC2119]. | |||
| 2.2. Terminology | 2.2. Terminology | |||
| The following terms and abbreviations are used in this document: | The following terms and abbreviations are used in this document: | |||
| End-point: A single entity sending or receiving RTP packets. It may | End-point: A single entity sending or receiving RTP packets. It may | |||
| be decomposed into several functional blocks, but as long as it | be decomposed into several functional blocks, but as long as it | |||
| behaves as a single RTP stack entity it is classified as a single | behaves as a single RTP stack entity it is classified as a single | |||
| end-point. | end-point. | |||
| Media Stream: A sequence of RTP packets using a single SSRC that | Media Stream: A sequence of RTP packets using a single SSRC that | |||
| together carry part or all of the content of a specific Media Type | together carries part or all of the content of a specific Media | |||
| from a specific sender source within a given RTP session. | Type from a specific sender source within a given RTP session. | |||
| Media Source: The originator or source of a particular Media Stream. | ||||
| It can either be a single media capturing device such as a video | ||||
| camera, a microphone, or a specific output of a media production | ||||
| function, such as an audio mixer, or some video editing function. | ||||
| Media Aggregate: All Media Streams related to a particular Source. | Media Aggregate: All Media Streams related to a particular Source. | |||
| Media Type: Audio, video, text or data whose form and meaning are | Media Type: Audio, video, text or data whose form and meaning are | |||
| defined by a specific real-time application. | defined by a specific real-time application. | |||
| Source: The source of a particular media stream. Either a single | Multiplex: The operation of taking multiple entities as input, | |||
| media capturing device such as a video camera, or a microphone, or | aggregating them onto some common resource while keeping the | |||
| a specific output of a media production function, such as an audio | individual entities addressable such that they can later be fully | |||
| mixer, or some video editing function. | and unambiguously separated (de-multiplexed) again. | |||
| RTP Session: As defined by [RFC3550], the end-points belonging to | ||||
| the same RTP Session are those that share a single SSRC space. | ||||
| That is, those end-points can see an SSRC identifier transmitted | ||||
| by any one of the other end-points. An end-point can receive an | ||||
| SSRC either as SSRC or as CSRC in RTP and RTCP packets. Thus, the | ||||
| RTP Session scope is decided by the end-points' network | ||||
| interconnection topology, in combination with RTP and RTCP | ||||
| forwarding strategies deployed by end-points and any | ||||
| interconnecting middle nodes. | ||||
| Source: See Media Source. | ||||
| 3. RTP Multiplex Points | 3. RTP Multiplex Points | |||
| This section describes the existing RTP tools that enable | This section describes the existing RTP tools that enable | |||
| multiplexing of different media streams and RTP functionalities. | multiplexing of different media streams. | |||
| 3.1. Session | 3.1. Session | |||
| The RTP Session is the highest semantic level in RTP and contains all | The RTP Session is the highest semantic level in RTP and contains all | |||
| of the RTP functionality. | of the RTP functionality. | |||
| RTP in itself does not contain any Session identifier, but relies on | Identifier: RTP in itself does not contain any Session identifier, | |||
| the underlying transport. For example, when running RTP on top of | but relies either on the underlying transport or on the used | |||
| UDP, an RTP endpoint can identify and delimit an RTP Session from | signalling protocol, depending on the context in which the identifier | |||
| other RTP Sessions through the UDP source and destination transport | is used (e.g. transport or signalling). Due to this, a single RTP | |||
| address, consisting of network address and port number(s). Most | Session may have multiple associated identifiers belonging to | |||
| commonly only the destination address, i.e. all traffic received on a | different contexts. | |||
| particular port, is defined as belonging to a specific RTP Session. | ||||
| It is worth noting that in practice a more narrow definition of the | ||||
| transport flows that are related to a given RTP session is possible. | ||||
| An RTP session can for example be defined as one or more 5-tuples | ||||
| (Transport Protocol, Source Address, Source Port, Destination | ||||
| Address, Destination Port). Any set of identifiers of RTP and RTCP | ||||
| packet flows is sufficient to determine if the flow belongs to a | ||||
| particular session or not. | ||||
| Commonly, RTP and RTCP use separate ports and the destination | Position: Depending on underlying transport and signalling | |||
| transport address is in fact an address pair, but in the case of RTP/ | protocol. For example, when running RTP on top of UDP, an RTP | |||
| RTCP multiplex [RFC5761] there is only a single port. | endpoint can identify and delimit an RTP Session from other RTP | |||
| Sessions through the UDP source and destination transport | ||||
| address, consisting of network address and port number(s). | ||||
| Commonly, RTP and RTCP use separate ports and the destination | ||||
| transport address is in fact an address pair, but in the case | ||||
| of RTP/RTCP multiplex [RFC5761] there is only a single port. | ||||
| Another example is SDP signalling [RFC4566], where the grouping | ||||
| framework [RFC5888] uses an identifier per "m="-line. If there | ||||
| is a one-to-one mapping between "m="-line and RTP Session, that | ||||
| grouping framework identifier can identify a single RTP | ||||
| Session. | ||||
| Usage: Identify separate RTP Sessions. | ||||
| Uniqueness: Globally unique within the general communication | ||||
| context for the specific end-point. | ||||
| Inter-relation: Depending on the underlying transport and | ||||
| signalling protocol. | ||||
| Special Restrictions: None. | ||||
| A source that changes its source transport address during a session | A source that changes its source transport address during a session | |||
| must also choose a new SSRC identifier to avoid being interpreted as | must also choose a new SSRC identifier to avoid being interpreted as | |||
| a looped source. | a looped source. | |||
| The set of participants considered part of the same RTP Session is | The set of participants considered part of the same RTP Session is | |||
| defined by [RFC3550] as those that share a single SSRC space. That | defined by [RFC3550] as those that share a single SSRC space. That | |||
| is, those participants that can see an SSRC identifier transmitted by | is, those participants that can see an SSRC identifier transmitted by | |||
| any one of the other participants. A participant can receive an SSRC | any one of the other participants. A participant can receive an SSRC | |||
| either as SSRC or CSRC in RTP and RTCP packets. Thus, the RTP | either as SSRC or CSRC in RTP and RTCP packets. Thus, the RTP | |||
| Session scope is decided by the participants' network interconnection | Session scope is decided by the participants' network interconnection | |||
| topology, in combination with RTP and RTCP forwarding strategies | topology, in combination with RTP and RTCP forwarding strategies | |||
| deployed by end-points and any interconnecting middle nodes. | deployed by end-points and any interconnecting middle nodes. | |||
| 3.2. SSRC | 3.2. SSRC | |||
| The Synchronization Source (SSRC) identifier is used to identify | An RTP Session serves one or more Media Sources, each sending a Media | |||
| individual sources within an RTP Session. The SSRC number is | Stream. | |||
| globally unique within an RTP Session and all RTP implementations | ||||
| must be prepared to use procedures for SSRC collision handling, which | Identifier: Synchronization Source (SSRC), 32-bit unsigned number. | |||
| results in an SSRC number change. The SSRC number is randomly | ||||
| chosen, carried in every RTP packet header and is not dependent on | Position: In every RTP and RTCP packet header. May be present in | |||
| network address. SSRC is also used as identifier to refer to | RTCP payload. May be present in SDP signalling. | |||
| separate media streams in RTCP. | ||||
| Usage: Identify individual Media Sources within an RTP Session. | ||||
| Refer to individual Media Sources in RTCP messages and SDP | ||||
| signalling. | ||||
| Uniqueness: Randomly chosen, globally unique within an RTP | ||||
| Session and not dependent on network address. | ||||
| Inter-relation: SSRC belonging to the same synchronization | ||||
| context (originating from the same end-point), within or | ||||
| between RTP Sessions, are indicated through use of identical | ||||
| SDES CNAME items in RTCP compound packets with those SSRC as | ||||
| originating source. SDP signalling can provide explicit SSRC | ||||
| grouping [RFC5576]. When CNAME is inappropriate or | ||||
| insufficient, there exist a few other methods to relate | ||||
| different SSRC. One such case is session-based RTP | ||||
| retransmission [RFC4588]. In some cases, the same SSRC | ||||
| Identifier value is used to relate streams in two different RTP | ||||
| Sessions, such as in Multi-Session Transmission of scalable | ||||
| video [RFC6190]. | ||||
| Special Restrictions: All RTP implementations must be prepared to | ||||
| use procedures for SSRC collision handling, which results in an | ||||
| SSRC number change. A Media Source that changes its RTP Session | ||||
| identifier (e.g. source transport address) during a session must | ||||
| also choose a new SSRC identifier to avoid being interpreted as | ||||
| a looped source. Note that RTP sequence number and RTP timestamp | ||||
| are scoped by SSRC and thus independent between different SSRCs. | ||||
| A media source having an SSRC identifier can be of different types: | A media source having an SSRC identifier can be of different types: | |||
| Real: Connected to a "physical" media source, for example a camera | Real: Connected to a "physical" media source, for example a camera | |||
| or microphone. | or microphone. | |||
| Conceptual: A source with some attributed property generated by some | Conceptual: A source with some attributed property generated by some | |||
| network node, for example a filtering function in an RTP mixer | network node, for example a filtering function in an RTP mixer | |||
| that provides the most active speaker based on some criteria, or a | that provides the most active speaker based on some criteria, or a | |||
| mix representing a set of other sources. | mix representing a set of other sources. | |||
| skipping to change at page 7, line 47 ¶ | skipping to change at page 8, line 37 ¶ | |||
| anyway need a sender SSRC for use as source in RTCP reports. | anyway need a sender SSRC for use as source in RTCP reports. | |||
| Note that a "multimedia source" that generates more than one media | Note that a "multimedia source" that generates more than one media | |||
| type, e.g. a conference participant sending both audio and video, | type, e.g. a conference participant sending both audio and video, | |||
| need not (and commonly should not) use the same SSRC value across RTP | need not (and commonly should not) use the same SSRC value across RTP | |||
| sessions. RTCP Compound packets containing the CNAME SDES item are | sessions. RTCP Compound packets containing the CNAME SDES item are | |||
| the designated method to bind an SSRC to a CNAME, effectively cross- | the designated method to bind an SSRC to a CNAME, effectively cross- | |||
| correlating SSRCs within and between RTP Sessions as coming from the | correlating SSRCs within and between RTP Sessions as coming from the | |||
| same end-point. The main property attributed to SSRCs associated | same end-point. The main property attributed to SSRCs associated | |||
| with the same CNAME is that they are from a particular | with the same CNAME is that they are from a particular | |||
| synchronization context and may be synchronized at playback. There | synchronization context and may be synchronized at playback. | |||
| exist a few other methods to relate different SSRC where use of CNAME | ||||
| is inappropriate, such as session-based RTP retransmission [RFC4588]. | ||||
| Note also that RTP sequence number and RTP timestamp are scoped by | Note also that RTP sequence number and RTP timestamp are scoped by | |||
| SSRC and thus independent between different SSRCs. | SSRC and thus independent between different SSRCs. | |||
| An RTP receiver receiving a previously unseen SSRC value must | An RTP receiver receiving a previously unseen SSRC value must | |||
| interpret it as a new source. It may in fact be a previously | interpret it as a new source. It may in fact be a previously | |||
| existing source that had to change SSRC number due to an SSRC | existing source that had to change SSRC number due to an SSRC | |||
| conflict. However, the originator of the previous SSRC should have | conflict. However, the originator of the previous SSRC should have | |||
| ended the conflicting source by sending an RTCP BYE for it prior to | ended the conflicting source by sending an RTCP BYE for it prior to | |||
| starting to send with the new SSRC, so the new SSRC is anyway | starting to send with the new SSRC, so the new SSRC is anyway | |||
| effectively a new source. | effectively a new source. | |||
| Some RTP extension mechanisms already require the RTP stacks to | Some RTP extension mechanisms already require the RTP stacks to | |||
| handle additional SSRCs, like SSRC multiplexed RTP retransmission | handle additional SSRCs, like SSRC multiplexed RTP retransmission | |||
| [RFC4588]. However, that still only requires handling a single media | [RFC4588]. However, that still only requires handling a single media | |||
| decoding chain per pair of SSRCs. | decoding chain per pair of SSRCs. | |||
| 3.3. CSRC | 3.3. CSRC | |||
| The Contributing Source (CSRC) can arguably be seen as a sub-part of | The Contributing Source (CSRC) can arguably be seen as a sub-part of | |||
| a specific SSRC and thus a multiplexing point. It is optionally | a specific SSRC and thus a multiplexing point. It is optionally | |||
| included in the RTP header, shares the SSRC number space and | included in the RTP header, shares the SSRC number space and | |||
| specifies which set of SSRCs has contributed to the RTP payload. | specifies which set of SSRCs has contributed to the RTP payload. | |||
| However, even though each RTP packet and SSRC can be tagged with the | However, even though each RTP packet and SSRC can be tagged with the | |||
| contained CSRCs, the media representation of an individual CSRC is in | contained CSRCs, the media representation of an individual CSRC is in | |||
| general not possible to extract from the RTP payload since it is | general not possible to extract from the RTP payload since it is | |||
| typically the result of a media mixing (merge) operation (by an RTP | typically the result of a media mixing (merge) operation (by an RTP | |||
| mixer) on the individual media streams corresponding to the CSRC | mixer) on the individual media streams corresponding to the CSRC | |||
| identifiers. Due to these restrictions, CSRC will not be considered | identifiers. Due to these restrictions, CSRC will not be considered | |||
| a fully qualified multiplex point and will be disregarded in the rest | a fully qualified multiplex point and will be disregarded in the rest | |||
| of this document. | of this document. | |||
| 3.4. Payload Type | 3.4. Payload Type | |||
| The Payload Type number is also carried in every RTP packet header | Each Media Stream can be represented in various encoding formats. | |||
| and identifies what format the RTP payload has. The term "format" | ||||
| here includes whatever can be described by out-of-band signaling | ||||
| means for dynamic payload types, as well as the statically allocated | ||||
| payload types in [RFC3551]. In SDP the term "format" includes media | ||||
| type, RTP timestamp sampling rate, codec, codec configuration, | ||||
| payload format configurations, and various robustness mechanisms such | ||||
| as redundant encodings [RFC2198]. | ||||
| The meaning of a Payload Type definition (the number) is re-used | Identifier: Payload Type number. | |||
| between all media streams within an RTP session, when the definition | ||||
| is either static or signaled through SDP. There however do exist | ||||
| cases where each end-point have different sets of payload types due | ||||
| to SDP offer/answer. | ||||
| Although Payload Type definitions are commonly local to an RTP | Position: In every RTP header and in SDP signalling. | |||
| Session, there are some uses where Payload Type numbers need be | ||||
| unique across RTP Sessions. This is for example the case in Media | ||||
| Decoding Dependency [RFC5583] where Payload Types are used to | ||||
| describe media dependency across RTP Sessions. | ||||
| Given that multiple Payload Types are defined in an RTP Session, a | Usage: Identify a specific Media Stream encoding format. The | |||
| media sender is free to change the Payload Type on a per packet | format definition may be taken from [RFC3551] for statically | |||
| basis. One example of designed per-packet change of Payload Type is | allocated Payload Types, but should be explicitly defined in | |||
| a speech codec that makes use of generic Comfort Noise [RFC3389]. | signalling, such as SDP, both for static and dynamic Payload | |||
| Types. The term "format" here includes whatever can be | ||||
| described by out-of-band signaling means. In SDP, the term | ||||
| "format" includes media type, RTP timestamp sampling rate, | ||||
| codec, codec configuration, payload format configurations, and | ||||
| various robustness mechanisms such as redundant encodings | ||||
| [RFC2198]. | ||||
| The RTP Payload Type in RTP is designed such that only a single | Uniqueness: Scoped by sending end-point within an RTP Session. | |||
| Payload Type is valid at any time instant in the SSRC's timestamp | To avoid any potential for ambiguity, it is desirable that | |||
| time line, effectively time-multiplexing different Payload Types if | payload types are unique across all sending end-points within | |||
| any switch occurs. Even when this constraint is met, having | an RTP session, but this is often not true in practice. All | |||
| different rates on the RTP timestamp clock for the RTP Payload Types | SSRC in an RTP session sent from a single end-point share the | |||
| in use in the same RTP Session have issues such as loss of | same Payload Type definitions. The RTP Payload Type is | |||
| synchronization. Payload Type clock rate switching requires some | designed such that only a single Payload Type is valid at any | |||
| special consideration that is described in the multiple clock rates | time instant in the SSRC's RTP timestamp time line, effectively | |||
| specification [I-D.ietf-avtext-multiple-clock-rates]. | time-multiplexing different Payload Types if any change occurs. | |||
| Used Payload Type may change on a per-packet basis for an SSRC, | ||||
| for example a speech codec making use of generic Comfort Noise | ||||
| [RFC3389]. | ||||
| Inter-relation: There are some uses where Payload Type numbers | ||||
| need be unique across RTP Sessions. This is for example the | ||||
| case in Media Decoding Dependency [RFC5583] where Payload Types | ||||
| are used to describe media dependency across RTP Sessions. | ||||
| Another example is session-based RTP retransmission [RFC4588]. | ||||
| Special Restrictions: Using different RTP timestamp clock rates for | ||||
| the RTP Payload Types in use in the same RTP Session has issues | ||||
| such as loss of synchronization. Payload Type clock rate | ||||
| switching requires some special consideration that is described in | ||||
| the multiple clock rates specification | ||||
| [I-D.ietf-avtext-multiple-clock-rates]. | ||||
| If there is a true need to send multiple Payload Types for the same | If there is a true need to send multiple Payload Types for the same | |||
| SSRC that are valid for the same RTP Timestamps, then redundant | SSRC that are valid for the same RTP Timestamps, then redundant | |||
| encodings [RFC2198] can be used. Several constraints in addition to | encodings [RFC2198] can be used. Several constraints in addition to | |||
| the ones mentioned above need to be met to enable this use, one of | the ones mentioned above need to be met to enable this use, one of | |||
| which are that the combined payload sizes of the different Payload | which is that the combined payload sizes of the different Payload | |||
| Types must not exceed the transport MTU. | Types must not exceed the transport MTU. | |||
| Other aspects of RTP payload format use are described in RTP Payload | Other aspects of RTP payload format use are described in RTP Payload | |||
| HowTo [I-D.ietf-payload-rtp-howto]. | HowTo [I-D.ietf-payload-rtp-howto]. | |||
| 4. Multiple Streams Alternatives | 4. Multiple Streams Alternatives | |||
| This section reviews the alternatives to enable multi-stream | This section reviews the alternatives to enable multi-stream | |||
| handling. Let's start with describing mechanisms that could enable | handling. Let's start with describing mechanisms that could enable | |||
| multiple media streams, independent of the purpose for having | multiple media streams, independent of the purpose for having | |||
| skipping to change at page 9, line 50 ¶ | skipping to change at page 10, line 48 ¶ | |||
| within a RTP Session. | within a RTP Session. | |||
| Session Multiplexing: Using additional RTP Sessions to handle | Session Multiplexing: Using additional RTP Sessions to handle | |||
| additional Media Streams. | additional Media Streams. | |||
| Payload Type Multiplexing: Using different RTP payload types for | Payload Type Multiplexing: Using different RTP payload types for | |||
| different additional streams. | different additional streams. | |||
| Independent of the reason to use additional media streams, achieving | Independent of the reason to use additional media streams, achieving | |||
| it using payload type multiplexing is not a good choice as can be | it using payload type multiplexing is not a good choice as can be | |||
| seen in the below section (Section 6). The RTP payload type alone is | seen in Appendix A. The RTP payload type alone is not suitable | |||
| not suitable for cases where additional media streams are required. | for cases where additional media streams are required. Streams need | |||
| Streams need their own SSRCs, so that they get their own sequence | their own SSRCs, so that they get their own sequence number space. | |||
| number space. The SSRC itself is also important so that the media | The SSRC itself is also important so that the media stream can be | |||
| stream can be referenced and reported on. | referenced and reported on. | |||
| This leaves us with two choices, either using SSRC multiplexing to | This leaves us with two main choices, either using SSRC multiplexing | |||
| have multiple SSRCs from one end-point in one RTP session, or create | to have multiple SSRCs from one end-point in one RTP session, or | |||
| additional RTP sessions to hold that additional SSRC. As the below | create an additional RTP session to hold that additional SSRC. As | |||
| discussion will show, in reality we cannot choose a single one of the | the below discussion will show, in reality we cannot choose a single | |||
| two solutions. To utilize RTP well and as efficiently as possible, | one of the two solutions. To utilize RTP well and as efficiently as | |||
| both are needed. The real issue is finding the right guidance on | possible, both are needed. The real issue is finding the right | |||
| when to create RTP sessions and when additional SSRCs in an RTP | guidance on when to create RTP sessions and when additional SSRCs in | |||
| session are the right choice. | an RTP session are the right choice. | |||
| In the below discussion, please keep in mind that the reasons for | In the below discussion, please keep in mind that the reasons for | |||
| having multiple media streams vary and include but are not limited to | having multiple media streams vary and include but are not limited to | |||
| the following: | the following: | |||
| o Multiple Media Sources of the same media type | o Multiple Media Sources | |||
| o Retransmission streams | o Retransmission streams | |||
| o FEC stream | o FEC stream | |||
| o Alternative Encoding | o Alternative Encodings | |||
| o Scalability layer | o Scalability layers | |||
| Thus the choice made due to one reason may not be the choice suitable | Thus the choice made due to one reason may not be the choice suitable | |||
| for another reason. In the above list, the different items have | for another reason. In the above list, the different items have | |||
| different levels of maturity in the discussion on how to solve them. | different levels of maturity in the discussion on how to solve them. | |||
| The clearest understanding is associated with multiple media sources | The clearest understanding is associated with multiple media sources | |||
| of the same media type. However, all warrant discussion and | of the same media type. However, all warrant discussion and | |||
| clarification on how to deal with them. | clarification on how to deal with them. | |||
| 5. RTP Topologies and Issues | 5. RTP Topologies and Issues | |||
| skipping to change at page 11, line 7 ¶ | skipping to change at page 12, line 7 ¶ | |||
| attempts to highlight the important behaviors concerning RTP | attempts to highlight the important behaviors concerning RTP | |||
| multiplexing and multi-stream handling. It lists any identified | multiplexing and multi-stream handling. It lists any identified | |||
| issues regarding RTP and RTCP handling, and introduces additional | issues regarding RTP and RTCP handling, and introduces additional | |||
| topologies that are supported by RTP beyond those included in RTP | topologies that are supported by RTP beyond those included in RTP | |||
| Topologies [RFC5117]. The RTP Topologies that do not follow the RTP | Topologies [RFC5117]. The RTP Topologies that do not follow the RTP | |||
| specification or do not attempt to utilize the facilities of RTP are | specification or do not attempt to utilize the facilities of RTP are | |||
| ignored in this document. | ignored in this document. | |||
| 5.1. Point to Point | 5.1. Point to Point | |||
| This is the most basic use case with an RTP session containing of two | This is the most basic use case with an RTP session containing two | |||
| end-points. Each end-point has one or more SSRCs. | end-points. Each end-point has one or more SSRCs. | |||
| +---+ +---+ | +---+ +---+ | |||
| | A |<------->| B | | | A |<------->| B | | |||
| +---+ +---+ | +---+ +---+ | |||
| Point to Point | Figure 1: Point to Point | |||
| 5.1.1. RTCP Reporting | 5.1.1. RTCP Reporting | |||
| In cases when an end-point uses multiple SSRCs, we have found two | In cases when an end-point uses multiple SSRCs, we have found two | |||
| closely related issues. The first is whether every SSRC shall report on | closely related issues. The first is whether every SSRC shall report on | |||
| all other SSRCs, even the ones originating from the same end-point. | all other SSRCs, even the ones originating from the same end-point. | |||
| The reason for this would be ensure that no monitoring function | The reason for this would be to ensure that no monitoring function | |||
| should suspect a breakage in the RTP session. | should suspect a breakage in the RTP session. | |||
| The second issue around RTCP reporting arises when an end-point | The second issue around RTCP reporting arises when an end-point | |||
| receives one or more media streams, and when the receiving end-point | receives one or more media streams, and when the receiving end-point | |||
| itself sends multiple SSRCs in the same RTP session. As transport | itself sends multiple SSRCs in the same RTP session. As transport | |||
| statistics are gathered per end-point and shared between the nodes, | statistics are gathered per end-point and shared between the nodes, | |||
| all the end-point's SSRCs will report based on the same received data; | all the end-point's SSRCs will report based on the same received data; | |||
| the only difference will be which SSRC sends the report. This could | the only difference will be which SSRC sends the report. This could | |||
| be considered unnecessary overhead, but for consistency it might be | be considered unnecessary overhead, but for consistency it might be | |||
| simplest to always have all sending SSRCs send RTCP reports on all | simplest to always have all sending SSRCs send RTCP reports on all | |||
| media streams the end-point receives. | media streams the end-point receives. | |||
| The current RTP text is silent about sending RTCP Receiver Reports | The current RTP text is silent about sending RTCP Receiver Reports | |||
| for an endpoint's own sources, but does not preclude either sending | for an endpoint's own sources, but does not preclude either sending | |||
| or omitting them. The uncertainty in the expected behavior in those | or omitting them. The uncertainty in the expected behavior in those | |||
| cases have likely caused variations in the implementation strategy. | cases has likely caused variations in the implementation strategy. | |||
| This could cause an interoperability issue where it is not possible | This could cause an interoperability issue where it is not possible | |||
| to determine if the lack of reports are a true transport issue, or | to determine if the lack of reports is a true transport issue, or | |||
| simply a result of implementation. | simply a result of implementation. | |||
| Although this issue is valid already for the simple point to point | Although this issue is valid already for the simple point to point | |||
| case, it needs to be considered in all topologies. From the | case, it needs to be considered in all topologies. From the | |||
| perspective of an end-point, any solution needs to take into account | perspective of an end-point, any solution needs to take into account | |||
| what a particular end-point can determine without explicit | what a particular end-point can determine without explicit | |||
| information of the topology. For example, a Transport Translator | information of the topology. For example, a Transport Translator | |||
| (Relay) topology will look quite similar as point to point on an RTP | (Relay) topology will look quite similar to point to point on a | |||
| level but is different. The main difference between a point to point | transport level but is different on RTP level. Assume a first | |||
| with two SSRC being sent from the remote end-point and a Transport | scenario with two SSRC being sent from an end-point to a Transport | |||
| Translator with two single SSRC remote clients are that the RTT may | Translator, and a second scenario with two single SSRC remote end- | |||
| vary between the SSRCs (but it is not guaranteed), and that the SSRCs | points sending to the same Transport Translator. The main | |||
| may have different CNAMEs. | differences between those two scenarios are that in the second | |||
| scenario, the RTT may vary between the SSRCs (but it is not | ||||
| guaranteed), and the SSRCs may also have different CNAMEs. | ||||
| 5.1.2. Compound RTCP Packets | 5.1.2. Compound RTCP Packets | |||
| When an end-point has multiple SSRCs and it needs to send RTCP | When an end-point has multiple SSRCs and it needs to send RTCP | |||
| packets on behalf of these SSRCs, the question arises if and how RTCP | packets on behalf of these SSRCs, the question arises if and how RTCP | |||
| packets with different source SSRCs can be sent in the same compound | packets with different source SSRCs can be sent in the same compound | |||
| packet. If it is allowed, then some consideration of the | packet. If it is allowed, then some consideration of the | |||
| transmission scheduling is needed. | transmission scheduling is needed. | |||
| 5.2. Point to Multipoint Using Multicast | 5.2. Point to Multipoint Using Multicast | |||
| This section discusses the Point to Multi-point using Multicast to | This section discusses the Point to Multi-point using Multicast to | |||
| interconnect the session participants. This needs to consider both | interconnect the session participants. This needs to consider both | |||
| Any Source Multicast (ASM) and Source-Specific Multicast (SSM). | Any Source Multicast (ASM) and Source-Specific Multicast (SSM). | |||
| There are large commercial deployments of multicast for applications | ||||
| like IPTV. | ||||
| +-----+ | +-----+ | |||
| +---+ / \ +---+ | +---+ / \ +---+ | |||
| | A |----/ \---| B | | | A |----/ \---| B | | |||
| +---+ / Multi- \ +---+ | +---+ / Multi- \ +---+ | |||
| + Cast + | + Cast + | |||
| +---+ \ Network / +---+ | +---+ \ Network / +---+ | |||
| | C |----\ /---| D | | | C |----\ /---| D | | |||
| +---+ \ / +---+ | +---+ \ / +---+ | |||
| +-----+ | +-----+ | |||
| Point to Multipoint Using Any Source Multicast | Figure 2: Point to Multipoint Using Any Source Multicast | |||
| In Any Source Multicast, any of the participants can send to all the | In Any Source Multicast, any of the participants can send to all the | |||
| other participants, simply by sending a packet to the multicast | other participants, simply by sending a packet to the multicast | |||
| group. That is not possible in Source Specific Multicast [RFC4607] | group. That is not possible in Source Specific Multicast [RFC4607] | |||
| where only a single source (Distribution Source) can send to the | where only a single source (Distribution Source) can send to the | |||
| multicast group, creating a topology that looks like the one below: | multicast group, creating a topology that looks like the one below: | |||
| Source-specific | +--------+ +-----+ | |||
| +--------+ +-----+ Multicast | |Media | | | Source-specific | |||
| |Media | | | +----------------> R(1) | |Sender 1|<----->| D S | Multicast | |||
| |Sender 1|<----->| D S | | | | +--------+ | I O | +--+----------------> R(1) | |||
| +--------+ | I O | +--+ | | ||||
| | S U | | | | | | S U | | | | | |||
| +--------+ | T R | | +-----------> R(2) | | +--------+ | T R | | +-----------> R(2) | | |||
| |Media |<----->| R C |->+ +---- : | | | |Media |<----->| R C |->+ | : | | | |||
| |Sender 2| | I E | | +------> R(n-1) | | | |Sender 2| | I E | | +------> R(n-1) | | | |||
| +--------+ | B | | | | | | | +--------+ | B | | | | | | | |||
| : | U | +--+--> R(n) | | | | : | U | +--+--> R(n) | | | | |||
| : | T +-| | | | | | : | T +-| | | | | | |||
| | I | |<---------+ | | | | : | I | |<---------+ | | | | |||
| +--------+ | O |F|<---------------+ | | | +--------+ | O |F|<---------------+ | | | |||
| |Media | | N |T|<--------------------+ | | |Media | | N |T|<--------------------+ | | |||
| |Sender M|<----->| | |<-------------------------+ | |Sender M|<----->| | |<-------------------------+ | |||
| +--------+ +-----+ Unicast | +--------+ +-----+ RTCP Unicast | |||
| FT = Feedback Target | FT = Feedback Target | |||
| Transport from the Feedback Target to the Distribution | Transport from the Feedback Target to the Distribution | |||
| Source is via unicast or multicast RTCP if they are not | Source is via unicast or multicast RTCP if they are not | |||
| co-located. | co-located. | |||
| Point to Multipoint using Source Specific Multicast | Figure 3: Point to Multipoint using Source Specific Multicast | |||
| In this topology a number of Media Senders (1 to M) are allowed to | In this topology a number of Media Senders (1 to M) are allowed to | |||
| send media to the SSM group; they send media to the distribution source | send media to the SSM group; they send media to the distribution source | |||
| which then forwards the media streams to the multicast group. The | which then forwards the media streams to the multicast group. The | |||
| media streams reach the Receivers (R(1) to R(n)). The Receivers' | media streams reach the Receivers (R(1) to R(n)). The Receivers' | |||
| RTCP cannot be sent to the multicast group. To support RTCP, an RTP | RTCP cannot be sent to the multicast group. To support RTCP, an RTP | |||
| extension for SSM [RFC5760] was defined that use unicast transmission | extension for SSM [RFC5760] was defined to use unicast transmission | |||
| to send RTCP from the receivers to one or more Feedback Targets (FT). | to send RTCP from the receivers to one or more Feedback Targets (FT). | |||
| As multicast is a one to many distribution system this must be taken | As multicast is a one to many distribution system, this must be taken | |||
| into consideration. For example, the only practical method for | into consideration. For example, the only practical method for | |||
| adapting the bit-rate sent towards a given receiver is to use a set | adapting the bit-rate sent towards a given receiver for large groups | |||
| of multicast groups, where each multicast group represents a | is to use a set of multicast groups, where each multicast group | |||
| particular bit-rate. The media encoding is either scalable, where | represents a particular bit-rate. Otherwise the whole group gets | |||
| multiple layers can be combined, or simulcast where a single version | media adapted to the participant with the worst conditions. The | |||
| is selected. By either selecting or combining multicast groups, the | media encoding is either scalable, where multiple layers can be | |||
| receiver can control the bit-rate sent on the path to itself. It is | combined, or simulcast where a single version is selected. By either | |||
| also common that transport robustification is sent in its own | selecting or combining multicast groups, the receiver can control the | |||
| multicast group to allow for interworking with legacy or to support | bit-rate sent on the path to itself. It is also common that streams | |||
| different levels of protection. | that improve transport robustness are sent in their own multicast group | |||
| to allow for interworking with legacy or to support different levels | ||||
| of protection. | ||||
| The result of this is three common behaviors for RTP multicast: | The result of this is three common behaviors for RTP multicast: | |||
| 1. Use of multiple RTP sessions for the same media type. | 1. Use of multiple RTP sessions for the same media type. | |||
| 2. The need for identifying RTP sessions that are related in one of | 2. The need for identifying RTP sessions that are related in one of | |||
| several ways. | several possible ways. | |||
| 3. The need for binding related SSRCs in different RTP sessions | 3. The need for binding related SSRCs in different RTP sessions | |||
| together. | together. | |||
| This indicates that Multicast is an important consideration when | This indicates that Multicast is an important consideration when | |||
| working with the RTP multiplexing and multi stream architecture | working with the RTP multiplexing and multi stream architecture | |||
| questions. It is also important to note that so far there is no | questions. It is also important to note that so far there is no | |||
| special mode for basic behavior between multicast and unicast usages | special mode for basic behavior between multicast and unicast usages | |||
| of RTP. Yes, there are extensions targeted to deal with multicast | of RTP. Yes, there are extensions targeted to deal with multicast | |||
| specific cases but the general applicability does need to be | specific cases, but the general applicability does need to be | |||
| considered. | considered. | |||
| 5.3. Point to Multipoint Using an RTP Translator | 5.3. Point to Multipoint Using an RTP Translator | |||
| Transport Translators (Relays) are a very important consideration for | Transport Translators (Relays) are a very important consideration for | |||
| this document as they result in an RTP session situation that is very | this document as they result in an RTP session situation that is very | |||
| similar to how an ASM group RTP session would behave. | similar to how an ASM group RTP session would behave. | |||
| +---+ +------------+ +---+ | +---+ +------------+ +---+ | |||
| | A |<---->| |<---->| B | | | A |<---->| |<---->| B | | |||
| +---+ | | +---+ | +---+ | | +---+ | |||
| | Translator | | | Translator | | |||
| +---+ | | +---+ | +---+ | | +---+ | |||
| | C |<---->| |<---->| D | | | C |<---->| |<---->| D | | |||
| +---+ +------------+ +---+ | +---+ +------------+ +---+ | |||
| Transport Translator (Relay) | Figure 4: Transport Translator (Relay) | |||
| One of the most important aspects with the simple relay is that it is | One of the most important aspects with the simple relay is that it is | |||
| both easy to implement and requires a minimal amount of resources as | both easy to implement and requires a minimal amount of resources as | |||
| only transport headers are rewritten, no RTP modifications nor media | only transport headers are rewritten, no RTP modifications nor media | |||
| transcoding occur. Thus it is most likely the cheapest and most | transcoding occur. Thus it is most likely the cheapest and most | |||
| generally deployable method for multi-point sessions. The most | generally deployable method for multi-point sessions. The most | |||
| obvious downside of this basic relaying is that the translator has no | obvious downside of this basic relaying is that the translator has no | |||
| control over how many streams need to be delivered to a receiver. | control over how many streams need to be delivered to a receiver. | |||
| Nor can it simply select to deliver only certain streams, at least | Nor can it simply select to deliver only certain streams, as it | |||
| not without new RTCP extensions to coherently handle the fact that | creates session inconsistencies. If some middlebox temporarily stops | |||
| some middlebox temporarily stops a stream, preventing some receivers | a stream, this prevents some receivers from reporting on it. From | |||
| from reporting on it. This consistency problem in RTCP reporting | the sender's perspective it will look like a transport failure. | |||
| needs to be handled. | Applications having needs to stop or switch streams in the central | |||
| node should consider using an RTP mixer to avoid this issue. | ||||
| The Transport Translator does not need to have an SSRC of its own, nor | The Transport Translator does not need to have an SSRC of its own, nor | |||
| need it send any RTCP reports on the flows that passes it, but it may | need it send any RTCP reports on the flows that pass it, but it may | |||
| choose to do that. | choose to do that. | |||
| Use of a transport translator means that any of the end-points | Use of a transport translator means that any of the end-points | |||
| will receive multiple SSRCs over a single unicast transport flow from | will receive multiple SSRCs over a single unicast transport flow from | |||
| the translator. That is independent of the other end-points having | the translator. That is independent of the other end-points having | |||
| only a single or several SSRCs. End-points that have multiple SSRCs | only a single or several SSRCs. End-points that have multiple SSRCs | |||
| put further requirements on how SSRCs can be related or bound within | put further requirements on how SSRCs can be related or bound within | |||
| and across RTP sessions and how they can be identified on an | and across RTP sessions and how they can be identified on an | |||
| application level. | application level. The transport translator has a signalling | |||
| requirement that also exists in any source multicast; all of the | ||||
| participants will need to have the same RTP and payload type | ||||
| configuration. Otherwise, A could for example be using payload type | ||||
| 97 as the video codec H.264 while B thinks it is MPEG-2. It should | ||||
| be noted that SDP offer/answer [RFC3264] has issues with ensuring | ||||
| this property. | ||||
| A Media Translator can perform a large variety of media functions | A Media Translator can perform a large variety of media functions | |||
| affecting the media stream passing the translator, coming from one | affecting the media stream passing the translator, coming from one | |||
| source and destined to a particular end-point. The media stream can | source and destined to a particular end-point. The translator can | |||
| be transcoded to a different bit-rate, change to another encoder, | transcode to a different bit-rate, transcode to use another encoder, | |||
| change the packetization of the media stream, add FEC streams, or | change the packetization of the media stream, add FEC streams, or | |||
| terminate RTP retransmissions. The latter behaviors require the | terminate RTP retransmissions. The latter behaviors require the | |||
| translator to use SSRCs that only exist in a particular sub-domain of | translator to use SSRCs that only exist in a particular sub-domain of | |||
| the RTP session, and it may also create additional sessions when the | the RTP session, and it may also create additional sessions when the | |||
| mechanism applied on one side so requires. | mechanism applied on one side so requires. | |||
| 5.4. Point to Multipoint Using an RTP Mixer | 5.4. Point to Multipoint Using an RTP Mixer | |||
| The most commonly used topology in centralized conferencing is based | The most commonly used topology in centralized conferencing is based | |||
| on the RTP Mixer. The main reason for this is that it provides a | on the RTP Mixer. The main reason for this is that it provides a | |||
| skipping to change at page 15, line 44 ¶ | skipping to change at page 16, line 51 ¶ | |||
| underlying sources based on some mixer policy or control signalling. | underlying sources based on some mixer policy or control signalling. | |||
| +---+ +------------+ +---+ | +---+ +------------+ +---+ | |||
| | A |<---->| |<---->| B | | | A |<---->| |<---->| B | | |||
| +---+ | | +---+ | +---+ | | +---+ | |||
| | Mixer | | | Mixer | | |||
| +---+ | | +---+ | +---+ | | +---+ | |||
| | C |<---->| |<---->| D | | | C |<---->| |<---->| D | | |||
| +---+ +------------+ +---+ | +---+ +------------+ +---+ | |||
| RTP Mixer | Figure 5: RTP Mixer | |||
| In the case where the mixer does stream selection, an application may | In the case where the mixer does stream selection, an application may | |||
| in fact desire multiple simultaneous streams but only as many as the | in fact desire multiple simultaneous streams but only as many as the | |||
| mixer can handle. As long as the mixer and an end-point can agree on | mixer can handle. As long as the mixer and an end-point can agree on | |||
| the maximum number of streams and how the streams that are delivered | the maximum number of streams and how the streams that are delivered | |||
| are selected, this provides very good functionality. As these | are selected, this provides very good functionality. As these | |||
| streams are forwarded using the mixer's SSRCs, there are no | streams are forwarded using the mixer's SSRCs, there are no | |||
| inconsistencies within the session. | inconsistencies within the session. | |||
| 5.5. Point to Multipoint using Multiple Unicast flows | 5.5. Point to Multipoint using Multiple Unicast flows | |||
| skipping to change at page 16, line 25 ¶ | skipping to change at page 17, line 33 ¶ | |||
| | A |<---->| B | | | A |<---->| B | | |||
| +---+ +---+ | +---+ +---+ | |||
| ^ ^ | ^ ^ | |||
| \ / | \ / | |||
| \ / | \ / | |||
| v v | v v | |||
| +---+ | +---+ | |||
| | C | | | C | | |||
| +---+ | +---+ | |||
| Point to Multi-Point using Multiple Unicast Transprots | Figure 6: Point to Multi-Point using Multiple Unicast Transports | |||
| This doesn't create any additional requirements beyond the need to | This doesn't create any additional requirements beyond the need to | |||
| have multiple transport flows associated with a single RTP session. | have multiple transport flows associated with a single RTP session. | |||
| Note that an end-point may use a single local port to receive all | Note that an end-point may use a single local port to receive all | |||
| these transport flows, or it might have separate local reception | these transport flows, or it might have separate local reception | |||
| ports for each of the end-points. | ports for each of the end-points. | |||
| 5.6. Decomposited End-Point | There exists an alternative structure for establishing the above | |||
| communication scenario (Figure 6) which uses independent RTP sessions | ||||
| between each pair of peers, i.e. three different RTP sessions. | ||||
| Unless independently adapted the same RTP media stream could be sent | ||||
| in both of the RTP sessions an end-point has. The difference exists | ||||
| in the behaviors around RTCP, for example common RTCP bandwidth for | ||||
| one joint session, rather than three independent pools, and the | ||||
| awareness based on RTCP reports between the peers of how that third | ||||
| leg is doing. | ||||
| 5.6. De-composite End-Point | ||||
| There is some possibility that an RTP end-point implementation in | There is some possibility that an RTP end-point implementation in | |||
| fact resides on multiple devices, each with its own network address. | fact resides on multiple devices, each with its own network address. | |||
| A very basic use case for this would be to separate audio and video | A very basic use case for this would be to separate audio and video | |||
| processing for a particular end-point, like a conference room, into | processing for a particular end-point, like a conference room, into | |||
| one device handling the audio and another handling the video being | one device handling the audio and another handling the video, being | |||
| interconnected by some control functions allowing them to behave as a | interconnected by some control functions allowing them to behave as a | |||
| single end-point. | single end-point. | |||
| +---------------------+ | +---------------------+ | |||
| | End-point A | | | End-point A | | |||
| | Local Area Network | | | Local Area Network | | |||
| | +------------+ | | | +------------+ | | |||
| | +->| Audio |<+----\ | | +->| Audio |<+----\ | |||
| | | +------------+ | \ +------+ | | | +------------+ | \ +------+ | |||
| | | +------------+ | +-->| | | | | +------------+ | +-->| | | |||
| | +->| Video |<+--------->| B | | | +->| Video |<+--------->| B | | |||
| | | +------------+ | +-->| | | | | +------------+ | +-->| | | |||
| | | +------------+ | / +------+ | | | +------------+ | / +------+ | |||
| | +->| Control |<+----/ | | +->| Control |<+----/ | |||
| | +------------+ | | | +------------+ | | |||
| +---------------------+ | +---------------------+ | |||
| Decomposited End-Point | Figure 7: De-composite End-Point | |||
| In the above usage, let us assume that the RTP sessions are different | In the above usage, let us assume that the RTP sessions are different | |||
| for audio and video. The audio and video parts will use a common | for audio and video. The audio and video parts will use a common | |||
| CNAME and also have a common clock to ensure that synchronization and | CNAME and also have a common clock to ensure that synchronization and | |||
| clock drift handling works despite the decomposition. However, if | clock drift handling works despite the decomposition. | |||
| the audio and video were in a single RTP session then this use case | ||||
| becomes problematic. This as all transport flow receivers will need | However, if the audio and video were in a single RTP session then | |||
| to receive all the other media streams that are part of the session. | this use case becomes problematic. This as all transport flow | |||
| Thus the audio component will receive also all the video media | receivers will need to receive all the other media streams that are | |||
| streams, while the video component will receive all the audio ones, | part of the session. Thus the audio component will receive also all | |||
| thus doubling the site's bandwidth requirements from all other | the video media streams, while the video component will receive all | |||
| session participants. With a joint RTP session it also becomes | the audio ones, doubling the site's bandwidth requirements from all | |||
| other session participants. With a joint RTP session it also becomes | ||||
| evident that a given end-point, as interpreted from a CNAME | evident that a given end-point, as interpreted from a CNAME | |||
| perspective, has two sets of transport flows for receiving the | perspective, has two sets of transport flows for receiving the | |||
| streams and the decomposition isn't hidden. | streams and the decomposition is not hidden. | |||
| The requirement that can be derived from the above usage is that the | The requirement that can be derived from the above usage is that the | |||
| transport flows for each RTP session might be under common control | transport flows for each RTP session might be under common control | |||
| but still go to what looks like different end-points based on | but still go to what looks like different end-points based on | |||
| addresses and ports. A conclusion can also be reached that | addresses and ports. A conclusion can also be reached that | |||
| decomposition without using separate RTP sessions has downsides and | decomposition without using separate RTP sessions has downsides and | |||
| potential for RTP/RTCP issues. | potential for RTP/RTCP issues. | |||
| There exists another use case which might be considered as a | There exists another use case which might be considered as a | |||
| decomposited end-point. However, as will be shown this should be | de-composite end-point. However, as will be shown this should be | |||
| considered a translator instead. An example of this is when an end- | considered a translator instead. An example of this is when an end- | |||
| point A sends a media flow to B. On the path there is a device C that | point A sends a media flow to B. On the path there is a device C that | |||
| on A's behalf does something with the media streams, for example adds | on A's behalf does something with the media streams, for example adds | |||
| an RTP session with FEC information for A's media streams. C will in | an RTP session with FEC information for A's media streams. C will in | |||
| this case need to bind the new FEC streams to A's media stream by | this case need to bind the new FEC streams to A's media stream by | |||
| using the same CNAME as A. | using the same CNAME as A. | |||
| +------+ +------+ +------+ | +------+ +------+ +------+ | |||
| | | | | | | | | | | | | | | |||
| | A |------->| C |-------->| B | | | A |------->| C |-------->| B | | |||
| | | | |---FEC-->| | | | | | |---FEC-->| | | |||
| +------+ +------+ +------+ | +------+ +------+ +------+ | |||
| When Decomposition is a Translator | Figure 8: When De-composition is a Translator | |||
| This type of functionality where C does something with the media | This type of functionality where C does something with the media | |||
| stream on behalf of A is clearly covered under the media translator | stream on behalf of A is clearly covered under the media translator | |||
| definition (Section 5.3). | definition (Section 5.3). | |||
| 6. Dismissing Payload Type Multiplexing | 6. Multiple Streams Discussion | |||
| Before starting a discussion on when to use what alternative, we will | ||||
| first document a number of reasons why using the payload type as a | ||||
| multiplexing point for anything related to multiple streams is | ||||
| unsuitable and will not be considered further. | ||||
| If one attempts to use Payload type multiplexing beyond its defined | ||||
| usage, that has well known negative effects on RTP. To use Payload | ||||
| type as the single discriminator for multiple streams implies that | ||||
| all the different media streams are being sent with the same SSRC, | ||||
| thus using the same timestamp and sequence number space. This has | ||||
| many effects: | ||||
| 1. Putting restraint on RTP timestamp rate for the multiplexed | ||||
| media. For example, media streams that use different RTP | ||||
| timestamp rates cannot be combined, as the timestamp values need | ||||
| to be consistent across all multiplexed media frames. Thus | ||||
| streams are forced to use the same rate. When this is not | ||||
| possible, Payload Type multiplexing cannot be used. | ||||
| 2. Many RTP payload formats may fragment a media object over | ||||
| multiple packets, like parts of a video frame. These payload | ||||
| formats need to determine the order of the fragments to | ||||
| correctly decode them. Thus it is important to ensure that all | ||||
| fragments related to a frame or a similar media object are | ||||
| transmitted in sequence and without interruptions within the | ||||
| object. This can be solved relatively simply on the sender side | ||||
| by ensuring that the fragments of each media stream are sent in | ||||
| sequence. | ||||
| 3. Some media formats require uninterrupted sequence number space | ||||
| between media parts. These are media formats where any missing | ||||
| RTP sequence number will result in decoding failure or invoking | ||||
| of a repair mechanism within a single media context. The text/ | ||||
| T140 payload format [RFC4103] is an example of such a format. | ||||
| These formats will need a sequence numbering abstraction | ||||
| function between RTP and the individual media stream before | ||||
| being used with Payload Type multiplexing. | ||||
| 4. Sending multiple streams in the same sequence number space makes | ||||
| it impossible to determine which Payload Type and thus which | ||||
| stream a packet loss relates to. | ||||
| 5. If RTP Retransmission [RFC4588] is used and there is a loss, it | ||||
| is possible to ask for the missing packet(s) by SSRC and | ||||
| sequence number, not by Payload Type. If only some of the | ||||
| Payload Type multiplexed streams are of interest, there is no | ||||
| way of telling which missing packet(s) belong to the interesting | ||||
| stream(s) and all lost packets must be requested, wasting | ||||
| bandwidth. | ||||
| 6. The current RTCP feedback mechanisms are built around providing | ||||
| feedback on media streams based on stream ID (SSRC), packet | ||||
| (sequence numbers) and time interval (RTP Timestamps). There is | ||||
| almost never a field to indicate which Payload Type is reported, | ||||
| so sending feedback for a specific media stream is difficult | ||||
| without extending existing RTCP reporting. | ||||
| 7. The current RTCP media control messages [RFC5104] specification | ||||
| is oriented around controlling particular media flows, i.e. | ||||
| requests are done addressing a particular SSRC. Such mechanisms | ||||
| would need to be redefined to support Payload Type multiplexing. | ||||
| 8. The number of payload types is inherently limited. | ||||
| Accordingly, using Payload Type multiplexing limits the number | ||||
| of streams that can be multiplexed and does not scale. This | ||||
| limitation is exacerbated if one uses solutions like RTP and | ||||
| RTCP multiplexing [RFC5761] where a number of payload types are | ||||
| blocked due to the overlap between RTP and RTCP. | ||||
| 9. At times, there is a need to group multiplexed streams and this | ||||
| is currently possible for RTP Sessions and for SSRC, but there | ||||
| is no defined way to group Payload Types. | ||||
| 10. It is currently not possible to signal bandwidth requirements | ||||
| per media stream when using Payload Type Multiplexing. | ||||
| 11. Most existing SDP media level attributes cannot be applied on a | ||||
| per Payload Type level and would require re-definition in that | ||||
| context. | ||||
| 12. A legacy end-point that doesn't understand the indication that | ||||
| different RTP payload types are different media streams may be | ||||
| slightly confused by the large amount of possibly overlapping or | ||||
| identically defined RTP Payload Types. | ||||
| 7. Multiple Streams Discussion | ||||
| 7.1. Introduction | 6.1. Introduction | |||
| Using multiple media streams is a well supported feature of RTP. | Using multiple media streams is a well supported feature of RTP. | |||
| However, what can be unclear for most implementors or people writing | However, it can be unclear for most implementers or people writing | |||
| RTP/RTCP extensions attempting to apply multiple streams, is when it | RTP/RTCP applications or extensions attempting to apply multiple | |||
| is most appropriate to add an additional SSRC in an existing RTP | streams when it is most appropriate to add an additional SSRC in an | |||
| session and when it is better to use multiple RTP sessions. This | existing RTP session and when it is better to use multiple RTP | |||
| section tries to discuss the various considerations needed. The next | sessions. This section tries to discuss the various considerations | |||
| section then concludes with some guidelines. | needed. The next section then concludes with some guidelines. | |||
| 7.2. RTP/RTCP Aspects | 6.2. RTP/RTCP Aspects | |||
| This section discusses RTP and RTCP aspects worth considering when | This section discusses RTP and RTCP aspects worth considering when | |||
| selecting between SSRC multiplexing and Session multiplexing. | selecting between SSRC multiplexing and Session multiplexing. | |||
| 7.2.1. The RTP Specification | 6.2.1. The RTP Specification | |||
| RFC 3550 contains some recommendations and a bullet list with 5 | RFC 3550 contains some recommendations and a bullet list with 5 | |||
| arguments for different aspects of RTP multiplexing. Let's review | arguments for different aspects of RTP multiplexing. Let's review | |||
| Section 5.2 of [RFC3550], reproduced below: | Section 5.2 of [RFC3550], reproduced below: | |||
| "For efficient protocol processing, the number of multiplexing points | "For efficient protocol processing, the number of multiplexing points | |||
| should be minimized, as described in the integrated layer processing | should be minimized, as described in the integrated layer processing | |||
| design principle [ALF]. In RTP, multiplexing is provided by the | design principle [ALF]. In RTP, multiplexing is provided by the | |||
| destination transport address (network address and port number) which | destination transport address (network address and port number) which | |||
| is different for each RTP session. For example, in a teleconference | is different for each RTP session. For example, in a teleconference | |||
| skipping to change at page 21, line 37 ¶ | skipping to change at page 21, line 4 ¶ | |||
| RTP session would avoid the first three problems but not the last | RTP session would avoid the first three problems but not the last | |||
| two. | two. | |||
| On the other hand, multiplexing multiple related sources of the same | On the other hand, multiplexing multiple related sources of the same | |||
| medium in one RTP session using different SSRC values is the norm for | medium in one RTP session using different SSRC values is the norm for | |||
| multicast sessions. The problems listed above don't apply: an RTP | multicast sessions. The problems listed above don't apply: an RTP | |||
| mixer can combine multiple audio sources, for example, and the same | mixer can combine multiple audio sources, for example, and the same | |||
| treatment is applicable for all of them. It may also be appropriate | treatment is applicable for all of them. It may also be appropriate | |||
| to multiplex streams of the same medium using different SSRC values | to multiplex streams of the same medium using different SSRC values | |||
| in other scenarios where the last two problems do not apply." | in other scenarios where the last two problems do not apply." | |||
| Let's consider one argument at a time. The first is an argument for | Let's consider one argument at a time. The first is an argument for | |||
| using a different SSRC for each individual media stream, which still is | using a different SSRC for each individual media stream, which still is | |||
| very applicable. | very applicable. | |||
| The second argument is advocating against using payload type | The second argument is advocating against using payload type | |||
| multiplexing, which still stands as can be seen by the extensive | multiplexing, which still stands as can be seen by the extensive | |||
| list of issues found in Section 6. | list of issues found in Appendix A. | |||
| The third argument is yet another argument against payload type | The third argument is yet another argument against payload type | |||
| multiplexing. | multiplexing. | |||
| The fourth is an argument against multiplexing media streams that | The fourth is an argument against multiplexing media streams that | |||
| require different handling into the same session. This is to | require different handling into the same session. This is to | |||
| simplify the processing at any receiver of the media stream. If all | simplify the processing at any receiver of the media stream. If all | |||
| media streams that exist in an RTP session is of one media type and | media streams that exist in an RTP session are of one media type and | |||
| one particular purpose, there is no need for deeper inspection of the | one particular purpose, there is no need for deeper inspection of the | |||
| packets before processing them in both end-points and RTP aware | packets before processing them in both end-points and RTP aware | |||
| middle nodes. | middle nodes. | |||
| The fifth argument discusses network aspects that we will discuss | The fifth argument discusses network aspects that we will discuss | |||
| more below in Section 7.4. It also goes into aspects of | more below in Section 6.5. It also goes into aspects of | |||
| implementation, like decomposed end-points where different processes | implementation, like decomposed end-points where different processes | |||
| or inter-connected devices handle different aspects of the whole | or inter-connected devices handle different aspects of the whole | |||
| multi-media session. | multi-media session. | |||
| A summary of RFC 3550's view on multiplexing is to use unique SSRCs | A summary of RFC 3550's view on multiplexing is to use unique SSRCs | |||
| for anything that is its own media/packet stream, and secondly use | for anything that is its own media/packet stream, and secondly use | |||
| different RTP sessions for media streams that don't share media type | different RTP sessions for media streams that don't share media type | |||
| and purpose, to maximize flexibility when it comes to processing and | and purpose, to maximize flexibility when it comes to processing and | |||
| handling of the media streams. | handling of the media streams. | |||
| This mostly agrees with the discussion and recommendations in this | This mostly agrees with the discussion and recommendations in this | |||
| document. However, there has been an evolution of RTP since that | document. However, there has been an evolution of RTP since that | |||
| text was written which needs to be reflected in the discussion. | text was written which needs to be reflected in the discussion. | |||
| Additional clarifications for specific cases are also needed. | Additional clarifications for specific cases are also needed. | |||
| 7.2.2. Multiple SSRC Legacy Considerations | 6.2.1.1. Different Media Types Recommendations | |||
| When establishing RTP sessions that may contain end-points that | ||||
| aren't updated to handle multiple streams following these | ||||
| recommendations, a particular application can have issues with | ||||
| multiple SSRCs within a single session. These issues include: | ||||
| 1. Need to handle more than one stream simultaneously rather than | ||||
| replacing an already existing stream with a new one. | ||||
| 2. Be capable of decoding multiple streams simultaneously. | ||||
| 3. Be capable of rendering multiple streams simultaneously. | ||||
| RTP Session multiplexing could potentially avoid these issues if | The above quote from RTP [RFC3550] includes a strong recommendation: | |||
| there is only a single SSRC at each end-point, and in topologies | ||||
| which appear like point to point as seen from the end-point. However, | ||||
| forcing the usage of session multiplexing due to this reason would be | ||||
| a great mistake, as it is likely that a significant set of | ||||
| applications will need a combination of SSRC multiplexing of several | ||||
| media sources and session multiplexing for other aspects such as | ||||
| encoding alternatives, robustification or simply to support legacy. | ||||
| However, this issue does need consideration when deploying multiple | ||||
| media streams within an RTP session where legacy end-points may | ||||
| occur. | ||||
| 7.2.3. RTP Specification Clarifications Needed | "For example, in a teleconference composed of audio and video | |||
| media encoded separately, each medium SHOULD be carried in a | ||||
| separate RTP session with its own destination transport address." | ||||
| The RTP specification contains a few things that are potential | It has been identified in "Why RTP Sessions Should Be Content | |||
| interoperability issues when using multiple SSRCs within a session. | Neutral" [I-D.alvestrand-rtp-sess-neutral] that the above statement | |||
| These issues are described and discussed in Section 9. These should | is poorly supported by any of the motivations provided in the RTP | |||
| not be considered strong arguments against using SSRC multiplexing | specification. This document has a more detailed analysis of | |||
| when otherwise appropriate, and there are some issues we expect to be | potential issues in having multiple media types in the same RTP | |||
| solved in the near future. | session in Section 6.7. An important influence for underlying | |||
| thinking for the RTP design and likely this statement can be found in | ||||
| the academic paper by David Clark and David Tennenhouse | ||||
| "Architectural considerations for a new generation of protocols" | ||||
| [ALF]. | ||||
| 7.2.4. Handling Varying sets of Senders | 6.2.2. Handling Varying sets of Senders | |||
| Another potential issue that needs to be considered is where a | A potential issue that some application designers may need to | |||
| limited set of simultaneously active sources varies within a larger | consider is the case where the set of simultaneously active sources | |||
| set of session members. As each media decoding chain may contain | varies within a larger set of session members. As each media | |||
| state, it is important that this type of usage ensures that a | decoding chain may contain state, it is important that this type of | |||
| receiver can flush a decoding state for an inactive source and if | usage ensures that a receiver can flush a decoding state for an | |||
| that source becomes active again, it does not assume that this | inactive source and if that source becomes active again, it does not | |||
| previous state exists. | assume that this previous state exists. | |||
| This behavior might in certain applications be possible to limit to a | This behavior will cause similar issues independent of SSRC or | |||
| particular RTP Session and instead use multiple RTP sessions. But in | Session multiplexing. It might be possible in certain applications | |||
| some cases it is likely unavoidable and the most appropriate thing is | to limit the changes to a subset of communication session | |||
| to SSRC multiplex. | participants by having the subset use particular RTP Sessions. | |||
| 7.2.5. Cross Session RTCP requests | 6.2.3. Cross Session RTCP Requests | |||
| There currently exist no functionality to make truly synchronized and | There currently exists no functionality to make truly synchronized | |||
| atomic RTCP requests across multiple RTP Sessions. Instead separate | and atomic RTCP messages with some type of request semantics across | |||
| RTCP messages will have to be sent in each session. This gives SSRC | multiple RTP Sessions. Instead, separate RTCP messages will have to | |||
| multiplexed streams a slight advantage as RTCP requests for different | be sent in each session. This gives SSRC multiplexed streams a | |||
| streams in the same session can be sent in a compound RTCP packet. | slight advantage as RTCP messages for different streams in the same | |||
| Thus providing an atomic operation if different modifications of | session can be sent in a compound RTCP packet. Thus providing an | |||
| different streams are requested at the same time. | atomic operation if different modifications of different streams are | |||
| requested at the same time. | ||||
| In Session multiplexed cases, the RTCP timing rules in the sessions | In Session multiplexed cases, the RTCP timing rules in the sessions | |||
| and the transport aspects, such as packet loss and jitter, prevent a | and the transport aspects, such as packet loss and jitter, prevent a | |||
| receiver from relying on atomic operations, instead more robust and | receiver from relying on atomic operations, forcing it to use more | |||
| forgiving mechanisms need to be used. | robust and forgiving mechanisms. | |||
| 7.2.6. Binding Related Sources | 6.2.4. Binding Related Sources | |||
| A common problem in a number of various RTP extensions has been how | A common problem in a number of various RTP extensions has been how | |||
| to bind together related sources. This issue is common independent | to bind related sources together. This issue is common to SSRC | |||
| of SSRC multiplexing and Session Multiplexing, and any solution and | multiplexing and Session Multiplexing, and any solution and | |||
| recommendation to the problem should work equally well for both to | recommendation related to the problem should work equally well with | |||
| avoid creating barriers between using session multiplexing and SSRC | both methods to avoid creating barriers between using session | |||
| multiplexing. | multiplexing and SSRC multiplexing. | |||
| The current solutions don't have these properties. There exist one | The current solutions do not have these properties. There exists one | |||
| solution for grouping RTP sessions together in SDP [RFC5888] to know | solution for grouping RTP sessions together in SDP [RFC5888] to know | |||
| which RTP session contains for example the FEC data for the source | which RTP session contains for example the FEC data for the source | |||
| data in another session. However, this mechanism does not work on | data in another session. However, this mechanism does not work on | |||
| individual media flows and is thus not directly applicable to the | individual media flows and is thus not directly applicable to the | |||
| problem. The other solution is also SDP based and can group SSRCs | problem. The other solution is also SDP based and can group SSRCs | |||
| within a single RTP session [RFC5576]. Thus this mechanism can bind | within a single RTP session [RFC5576]. Thus this mechanism can bind | |||
| media streams in SSRC multiplexed cases. Both solutions have the | media streams in SSRC multiplexed cases. Both solutions have the | |||
| shortcoming of being restricted to SDP based signalling and also do | shortcoming of being restricted to SDP based signalling and also do | |||
| not work in cases where the session's dynamic properties are such | not work in cases where the session's dynamic properties are such | |||
| that it is difficult or resource consuming to keep the list of | that it is difficult or resource consuming to keep the list of | |||
| skipping to change at page 24, line 48 | skipping to change at page 23, line 46 | |||
| This will prevent also the media streams not having an actual | This will prevent also the media streams not having an actual | |||
| collision from being usable during the re-synchronization and also | collision from being usable during the re-synchronization and also | |||
| increases the time until synchronization is finalized. In addition, | increases the time until synchronization is finalized. In addition, | |||
| it requires exception handling in the SSRC generation. | it requires exception handling in the SSRC generation. | |||
| The above collision issue does not occur in case of having only one | The above collision issue does not occur in case of having only one | |||
| SSRC space across all sessions and all participants will be part of | SSRC space across all sessions and all participants will be part of | |||
| at least one session, like the base layer in layered encoding. In | at least one session, like the base layer in layered encoding. In | |||
| that case the only downside is the special behavior that needs to be | that case the only downside is the special behavior that needs to be | |||
| well defined by anyone using this. But, having an exception behavior | well defined by anyone using this. But, having an exception behavior | |||
| where the SSRC space is common across all session an that doesn't fit | where the SSRC space is common across all session is an issue as this | |||
| all the RTP extensions or payload formats present in the sessions is | behavior does not fit all the RTP extensions or payload formats. It | |||
| a issue. It is possible to create a situation where the different | is possible to create a situation where the different mechanisms | |||
| mechanisms can't be combined due to the non standard SSRC allocation | cannot be combined due to the non standard SSRC allocation behavior. | |||
| behavior. | ||||
| Existing mechanisms with known issues: | Existing mechanisms with known issues: | |||
| RTP Retransmission (RFC4588): Has two modes, one for SSRC | RTP Retransmission (RFC4588): Has two modes, one for SSRC | |||
| multiplexing and one for Session multiplexing. The session | multiplexing and one for Session multiplexing. The session | |||
| multiplexing requires the same CNAME and mandates that the same | multiplexing requires the same CNAME and mandates that the same | |||
| SSRC is used in both sessions. Using the same SSRC does work but | SSRC is used in both sessions. Using the same SSRC does work but | |||
| will potentially have issues in certain cases. In SSRC | will potentially have issues in certain cases. In SSRC | |||
| multiplexed mode the CNAME is used, and when the first | multiplexed mode the CNAME is used to bind media and | |||
| retransmission request is sent, one must not have another | retransmission streams together. However, if multiple media | |||
| retransmission request outstanding for an SSRC which don't have a | streams are sent from the same end-point in the same session this | |||
| the binding between the original SSRC and the retransmission | does not provide non-ambiguous binding. Therefore when the first | |||
| stream's SSRC. This works but creates some limitations that can | retransmission request for a media stream is sent, one must not | |||
| be avoided by a more explicit mechanism. The SDP based ssrc-group | have another retransmission request outstanding for an SSRC which | |||
| mechanism is sufficient in this case as long as the application | does not have a binding between the original SSRC and the | |||
| can rely on the signalling based solution. | retransmission stream's SSRC. This works but creates some | |||
| limitations that can be avoided by a more explicit mechanism. The | ||||
| SDP based ssrc-group mechanism is sufficient in this case as long | ||||
| as the application can rely on the signalling based solution. | ||||
| Scalable Video Coding (RFC6190): As an example of scalable coding, | Scalable Video Coding (RFC6190): As an example of scalable coding, | |||
| SVC [RFC6190] has various modes. The Multi Session Transmission | SVC [RFC6190] has various modes. The Multi Session Transmission | |||
| (MST) uses Session multiplexing to separate scalability layers. | (MST) uses Session multiplexing to separate scalability layers. | |||
| However, this specification has failed to explicit how these | However, this specification has failed to be explicit on how these | |||
| layers are bound together in cases where CNAME isn't sufficient. | layers are bound together in cases where CNAME is not sufficient. | |||
| CNAME is no longer sufficient when more than one media source | CNAME is no longer sufficient when more than one media source | |||
| occur within a session that have the same CNAME, for example due | occur within a session that has the same CNAME, for example due to | |||
| to multiple video cameras capturing the same lecture hall. This | multiple video cameras capturing the same lecture hall. This | |||
| likely implies that a single SSRC space as recommended by Section | likely implies that a single SSRC space as recommended by Section | |||
| 8.3 of RTP [RFC3550] is to be used. | 8.3 of RTP [RFC3550] is to be used. | |||
| Forward Error Correction: If some type of FEC or redundancy stream | Forward Error Correction: If some type of FEC or redundancy stream | |||
| is being sent, it needs it's own SSRC, with the exception of | is being sent, it needs its own SSRC, with the exception of | |||
| constructions like redundancy encoding [RFC2198]. Thus in case of | constructions like redundancy encoding [RFC2198]. Thus in case of | |||
| transmitting the FEC in the same session as the source data, the | transmitting the FEC in the same session as the source data, the | |||
| inter SSRC relation within a session is needed. In case of | inter SSRC relation within a session is needed. In case of | |||
| sending the redundant data in a separate session from the source, | sending the redundant data in a separate session from the source, | |||
| the SSRC in each session needs to be related. This occurs for | the SSRC in each session needs to be related. This occurs for | |||
| example in RFC5109 when using session separation of original and | example in RFC5109 when using session separation of original and | |||
| FEC data. SSRC multiplexing is not supported, only using | FEC data. SSRC multiplexing is not supported, only using | |||
| redundant encoding is supported. | redundant encoding is supported. | |||
| This issue appears to need action to harmonize and avoid future | This issue appears to need action to harmonize and avoid future | |||
| shortcomings in extension specifications. A proposed solution for | shortcomings in extension specifications. A proposed solution for | |||
| handling this issue is [I-D.westerlund-avtext-rtcp-sdes-srcname]. | handling this issue is [I-D.westerlund-avtext-rtcp-sdes-srcname]. | |||
| 7.2.7. Forward Error Correction | 6.2.5. Forward Error Correction | |||
| There exist a number of Forward Error Correction (FEC) based schemes | There exist a number of Forward Error Correction (FEC) based schemes | |||
| for how to reduce the packet loss of the original streams. Most of | for how to reduce the packet loss of the original streams. Most of | |||
| the FEC schemes will protect a single source flow. The protection is | the FEC schemes will protect a single source flow. The protection is | |||
| achieved by transmitting a certain amount of redundant information | achieved by transmitting a certain amount of redundant information | |||
| that is encoded such that it can repair one or more packet losses over | that is encoded such that it can repair one or more packet losses over | |||
| the set of packets they protect. This sequence of redundant | the set of packets they protect. This sequence of redundant | |||
| information also needs to be transmitted as its own media stream, or | information also needs to be transmitted as its own media stream, or | |||
| in some cases instead of the original media stream. Thus many of | in some cases instead of the original media stream. Thus many of | |||
| these schemes creates a need for binding the related flows as | these schemes create a need for binding the related flows as | |||
| discussed above. They also create additional flows that need to be | discussed above. They also create additional flows that need to be | |||
| transported. Looking at the history of these schemes, there are both | transported. Looking at the history of these schemes, there are both | |||
| SSRC multiplexed and Session multiplexed solutions and some schemes | SSRC multiplexed and Session multiplexed solutions and some schemes | |||
| that support both. | that support both. | |||
| Using a Session multiplexed solution provides good support for legacy | Using a Session multiplexed solution provides good support for legacy | |||
| when deploying FEC or changing the scheme used so that some set of | when deploying FEC or changing the scheme used, in the sense that it | |||
| receivers may not be able to utilize the FEC information. By placing | supports the case where some set of receivers may not be able to | |||
| it in a separate RTP session, it can easily be ignored. | utilize the FEC information. By placing it in a separate RTP | |||
| session, it can easily be ignored. | ||||
| In usages involving multicast, having the FEC information on its own | In usages involving multicast, having the FEC information on its own | |||
| multicast group and RTP session allows for flexibility, for example | multicast group and RTP session allows for flexibility, for example | |||
| when using Rapid Acquisition of Multicast Groups (RAMS) [RFC6285]. | when using Rapid Acquisition of Multicast Groups (RAMS) [RFC6285]. | |||
| During the RAMS burst where data is received over unicast and where | During the RAMS burst where data is received over unicast and where | |||
| it is possible to combine with unicast based retransmission | it is possible to combine with unicast based retransmission | |||
| [RFC4588], there is no need to burst the FEC data related to the | [RFC4588], there is no need to burst the FEC data related to the | |||
| burst of the source media streams needed to catch up with the | burst of the source media streams needed to catch up with the | |||
| multicast group. This saves bandwidth to the receiver during the | multicast group. This saves bandwidth to the receiver during the | |||
| burst, enabling quicker catch up. When the receiver has catched up | burst, enabling quicker catch up. When the receiver has caught up | |||
| and joins the multicast group(s) for the source, it can at the same | and joins the multicast group(s) for the source, it can at the same | |||
| time join the multicast group with the FEC information. Having the | time join the multicast group with the FEC information. Having the | |||
| source stream and the FEC in separate groups allows for easy | source stream and the FEC in separate groups allows for easy | |||
| separation in the Burst/Retransmission Source (BRS) without having to | separation in the Burst/Retransmission Source (BRS) without having to | |||
| individually classify packets. | individually classify packets. | |||
| 7.2.8. Transport Translator Sessions | 6.2.6. Transport Translator Sessions | |||
| A basic Transport Translator relays any incoming RTP and RTCP packets | A basic Transport Translator relays any incoming RTP and RTCP packets | |||
| to the other participants. The main difference between SSRC | to the other participants. The main difference between SSRC | |||
| multiplexing and Session multiplexing resulting from this use case is | multiplexing and Session multiplexing resulting from this use case is | |||
| that for SSRC multiplexing it is not possible for a particular | that for SSRC multiplexing it is not possible for a particular | |||
| session participant to decide to receive a subset of media streams. | session participant to decide to receive a subset of media streams. | |||
| When using separate RTP sessions for the different sets of media | When using separate RTP sessions for the different sets of media | |||
| streams, a single participant can choose to leave one of the sessions | streams, a single participant can choose to leave one of the sessions | |||
| but not the other. | but not the other. | |||
| 7.2.9. Multiple Media Types in one RTP session | 6.3. Interworking | |||
| Having different media types, like audio and video, in the same RTP | There are several different kinds of interworking, and this section | |||
| sessions is not forbidden, only recommended against as can be seen in | discusses two related ones. The interworking between different | |||
| Section 7.2.1. When using multiple media types, there are a number | applications and the implications of potentially different choices of | |||
| of considerations: | usage of RTP's multiplexing points. The second topic relates to what | |||
| limitations may have to be considered working with some legacy | ||||
| applications. | ||||
| Payload Type gives Media Type: This solution is dependent on getting | 6.3.1. Interworking Applications | |||
| the media type from the Payload Type. Thus overloading this de- | ||||
| multiplexing point in a receiver for two purposes. First for the | ||||
| main media type and determining the processing chain, then later | ||||
| for the exact configuration of the encoder and packetization. | ||||
| Payload Type field limitations: The total number of Payload Types | It is not uncommon that applications or services of similar usage, | |||
| available to use in an RTP session is fairly limited, especially | especially the ones intended for interactive communication, end up | |||
| if Multiplexing RTP Data and Control Packets on a Single Port | in a situation where one wants to interconnect two or more of these | |||
| [RFC5761] is used. For certain applications negotiating a large | applications. From an RTP perspective this could be problem free if | |||
| set of codecs and configurations may become an issue. | all the applications have made the same multiplexing choices, have | |||
| the same capabilities in number of simultaneous media streams | ||||
| combined with the same set of RTP/RTCP extensions being supported. | ||||
| Unfortunately this may not always be true. | ||||
| Don't switch media types for an SSRC: The primary reasons to avoid | In these cases one ends up in a situation where one might use a | |||
| switching from sending for example audio to sending video using | gateway to interconnect applications. This gateway then needs to | |||
| the same SSRC is the implications on a receiver. When this | change the multiplexing structure or adhere to limitations in each | |||
| happens, the processing chain in the receiver will have to switch | application. If one's goal is to make minimal amount of work in such | |||
| from one media type to another. As the different media type's | a gateway, there are some multiplexing choices that one should avoid. | |||
| entire processing chains are different and are connected to | The lowest amount of work represents solutions where one can take an | |||
| different outputs it is difficult to reuse the decoding chain, | SSRC from one RTP session in one application and forward it into | |||
| which a normal codec change likely can. Instead the entire | another RTP session. For example if one has one application that has | |||
| processing chain has to be torn down and replaced. In addition, | multiple SSRCs for one media type in one session and another | |||
| there is likely a clock rate switching problem, possibly resulting | application that instead has chosen to use multiple RTP sessions with | |||
| in synchronization loss at the point of switching media type if | only a single SSRC per end-point in each of these sessions. Then | |||
| some packet loss occurs. | mapping an SSRC from the side with one session into an RTP session is | |||
| possible. However mapping SSRC from different RTP sessions into a | ||||
| single RTP session has the potential of creating SSRC collisions, | ||||
| especially if an end-point has not generated independent random SSRC | ||||
| values in each RTP session. This issue is even more likely in a case | ||||
| where one side uses a single RTP session with multiple media types | ||||
| and the other uses different RTP sessions for different media or | ||||
| robustness mechanisms such as retransmission [RFC4588]. Then it is | ||||
| more likely or maybe even required to use the same SSRC in the | ||||
| different RTP sessions. | ||||
| RTCP Bit-rate Issues: If the media types are significantly different | In cases where the used structure is incompatible, the gateway will | |||
| in bit-rate, the RTCP bandwidth rates assigned to each source in a | need to make SSRC translation. Thus this incurs overhead and some | |||
| session can result in interesting effects, like that the RTCP bit- | potential loss of functionality. First of all, if one translates the | |||
| rate share for an audio stream is larger than the actual audio | SSRC in an RTP header then one will be forced to decrypt and re- | |||
| bit-rate. In itself this doesn't cause any conflicts, only | encrypt if one uses SRTP and thus also needs to be part of the | |||
| potentially unnecessary overhead. It is possible to avoid this | security association. Secondly, changing the SSRC also means that | |||
| using AVPF or SAVPF and setting trr-int parameter, which can bring | one needs to translate all RTCP messages. This can be more complex, | |||
| down unnecessary regular reporting while still allowing for rapid | but important so that the gateway does not end up having to terminate | |||
| feedback. | the end-to-end RTCP chain. In that case the gateway will need to be | |||
| able to take the role of a true end-point in each session, which may | ||||
| include functions such as bit-rate adaptation and correctly respond | ||||
| to whatever RTCP extensions are being used, and then translate them | ||||
| or locally respond to them. Thirdly, an SSRC translation may require | ||||
| that one changes RTP payloads; for example, an RTP retransmission | ||||
| packet contains an original sequence number that must match the | ||||
| sequence number used for the corresponding packet with the new | ||||
| SSRC. And for FEC packets this is even worse, as the original SSRC | ||||
| is included as part of the data for which FEC redundant data is | ||||
| calculated. A fourth issue is the potential for these gateways to | ||||
| block evolution of the applications by blocking unknown RTP and RTCP | ||||
| extensions that the regular application has been extended with. | ||||
| Decomposited end-points: Decomposited nodes that rely on the regular | If one uses security functions, like SRTP, they can as seen above | |||
| network to separate audio and video to different devices do not | incur both additional risk due to the gateway needing to be in | |||
| work well with this session setup. If they are forced to work, | security association between the end-points, unless the gateway is on | |||
| all media receiver parts of a decomposited end-point will receive | the transport level, and additional complexities in form of the | |||
| all media, thus doubling the bit-rate consumption for the end- | decrypt-encrypt cycles needed for each forwarded packet. SRTP, due | |||
| point. | to its keying structure, also makes it hard to move a flow from one | |||
| RTP session to another as each RTP session will have one or more | ||||
| different master keys and these must not be the same in multiple RTP | ||||
| sessions as that can result in two-time pads that completely break | ||||
| the confidentiality of the packets. | ||||
| RTP Mixers and Translators: An RTP mixer or Media Translator will | An additional issue around interworking is that for multi-party | |||
| also have to support this particular session setup, where it | applications it can be impossible to judge which different RTP | |||
| before could rely on the RTP session to determine what processing | multiplexing behaviors will be used by end-points that attempt | |||
| options should be applied to the incoming packets. | to join a session. Thus if one attempts to use a multiplexing choice | |||
| that has poor interworking, one may have to switch at a later stage | ||||
| when someone wants to participate in a multi-party session using an | ||||
| RTP application supporting only another behavior. It is likely | ||||
| difficult to implement the switch without some media disruption. | ||||
| As can be seen, there is nothing in here that prevents using a single | To summarize, certain types of applications are likely to be inter- | |||
| RTP session for multiple media types, however it does create a number | worked. Sets of applications of similar type should strive to use | |||
| of limitations and special case implementation requirements. So | the same multiplexing structure to avoid the need to make an RTP | |||
| anyone considering to use this setup should carefully review if the | session level gateway. This as it incurs complexity costs, can force | |||
| reasons for using a single RTP session is sufficient to motivate this | the gateway to be part of security associations, force SSRC | |||
| special case. | translation and even payload translation which is also a potential | |||
| hindrance to application evolution. | ||||
| 7.3. Signalling Aspects | 6.3.2. Multiple SSRC Legacy Considerations | |||
| Historically, the most common RTP use cases have been point to point | ||||
| Voice over IP (VoIP) or streaming applications, commonly with no more | ||||
| than one media source per end-point and media type (typically audio | ||||
| and video). Even in conferencing applications, especially voice | ||||
| only, the conference focus or bridge has provided a single stream | ||||
| with a mix of the other participants to each participant. It is also | ||||
| common to have individual RTP sessions between each end-point and the | ||||
| RTP mixer. | ||||
| When establishing RTP sessions that may contain end-points that | ||||
| aren't updated to handle multiple streams following these | ||||
| recommendations, a particular application can have issues with | ||||
| multiple SSRCs within a single session. These issues include: | ||||
| 1. Need to handle more than one stream simultaneously rather than | ||||
| replacing an already existing stream with a new one. | ||||
| 2. Be capable of decoding multiple streams simultaneously. | ||||
| 3. Be capable of rendering multiple streams simultaneously. | ||||
| RTP Session multiplexing could potentially avoid these issues if | ||||
| there is only a single SSRC at each end-point, and in topologies | ||||
| which appear to be point to point as seen from the end-point. However, | ||||
| forcing the usage of session multiplexing due to this reason would be | ||||
| a great mistake, as it is likely that a significant set of | ||||
| applications will need a combination of SSRC multiplexing of several | ||||
| media sources and session multiplexing for other aspects such as | ||||
| encoding alternatives, adding robustness or simply to support legacy. | ||||
| However, this issue does need consideration when deploying multiple | ||||
| media streams within an RTP session where legacy end-points may | ||||
| occur. | ||||
| 6.4. Signalling Aspects | ||||
| There exist various signalling solutions for establishing RTP | There exist various signalling solutions for establishing RTP | |||
| sessions. Many are SDP [RFC4566] based, however SDP functionality is | sessions. Many are SDP [RFC4566] based, however SDP functionality is | |||
| also dependent on the signalling protocols carrying the SDP. Where | also dependent on the signalling protocols carrying the SDP. Where | |||
| RTSP [RFC2326] and SAP [RFC2974] both use SDP in a declarative | RTSP [RFC2326] and SAP [RFC2974] both use SDP in a declarative | |||
| fashion, SIP [RFC3261] uses SDP with the additional definition of | fashion, while SIP [RFC3261] uses SDP with the additional definition | |||
| Offer/Answer [RFC3264]. The impact on signalling and especially SDP | of Offer/Answer [RFC3264]. The impact on signalling and especially | |||
| needs to be considered as it can greatly affect how to deploy a | SDP needs to be considered as it can greatly affect how to deploy a | |||
| certain multiplexing point choice. | certain multiplexing point choice. | |||
| 7.3.1. Session Oriented Properties | 6.4.1. Session Oriented Properties | |||
| One aspect of the existing signalling is that it is focused around | One aspect of the existing signalling is that it is focused around | |||
| sessions, or at least in the case of SDP the media description. | sessions, or at least in the case of SDP the media description. | |||
| There are a number of things that are signalled on a session level/ | There are a number of things that are signalled on a session level/ | |||
| media description but that are not necessarily strictly bound to an | media description but those are not necessarily strictly bound to an | |||
| RTP session and could be of interest to signal specifically for a | RTP session and could be of interest to signal specifically for a | |||
| particular media stream within the session. The following properties | particular media stream (SSRC) within the session. The following | |||
| have been identified as being potentially useful to signal not only | properties have been identified as being potentially useful to signal | |||
| on RTP session level: | not only on RTP session level: | |||
| o Bitrate/Bandwidth exists today only as an aggregate or as a | o Bitrate/Bandwidth exists today only as an aggregate or as a | |||
| common limit for any media stream | common limit for any media stream | |||
| o Which SSRC will use which RTP Payload Types | o Which SSRC will use which RTP Payload Types | |||
| Some of these issues are clearly SDP's problem rather than RTP | Some of these issues are clearly SDP's problem rather than RTP | |||
| limitations. However, if the aim is to deploy an SSRC multiplexed | limitations. However, if the aim is to deploy an SSRC multiplexed | |||
| solution that contains several sets of media streams with different | solution that contains several sets of media streams with different | |||
| properties (encoding/packetization parameter, bit-rate, etc), putting | properties (encoding/packetization parameter, bit-rate, etc), putting | |||
| each set in a different RTP session would directly enable negotiation | each set in a different RTP session would directly enable negotiation | |||
| of the parameters for each set. If insisting on SSRC multiplexing, a | of the parameters for each set. If insisting on SSRC multiplexing | |||
| number of signalling extensions are needed to clarify that there are | only, a number of signalling extensions are needed to clarify that | |||
| multiple sets of media streams with different properties and that | there are multiple sets of media streams with different properties | |||
| they shall in fact be kept different, since a single set will not | and that they shall in fact be kept different, since a single set | |||
| satisfy the applications requirements. | will not satisfy the application's requirements. | |||
| This does in fact create a strong driver to use RTP session | This does in fact create a strong driver to use RTP session | |||
| multiplexing for any case where different sets of media streams with | multiplexing for any case where different sets of media streams with | |||
| different requirements exist. | different requirements exist. | |||
| 7.3.2. SDP Prevents Multiple Media Types | 6.4.2. SDP Prevents Multiple Media Types | |||
| SDP encoded in its structure a prevention against using multiple | SDP encoded in its structure prevention against using multiple media | |||
| media types in the same RTP session. A media description in SDP can | types in the same RTP session. A media description in SDP can only | |||
| only have a single media type; audio, video, text, image, | have a single media type; audio, video, text, image, application. | |||
| application. This media type is used as the top-level media type for | This media type is used as the top-level media type for identifying | |||
| identifying the actual payload format bound to a particular payload | the actual payload format bound to a particular payload type using | |||
| type using the rtpmap attribute. Thus a high fence against using | the rtpmap attribute. Thus a high fence against using multiple media | |||
| multiple media types in the same session was created. | types in the same session was created. | |||
| There is a proposal in the MMUSIC WG for how one could allow multiple | There is an accepted WG item in the MMUSIC WG to define how multiple | |||
| media lines describe a single underlying transport | media lines describe a single underlying transport | |||
| [I-D.holmberg-mmusic-sdp-bundle-negotiation] and thus support either | [I-D.holmberg-mmusic-sdp-bundle-negotiation] and thus it becomes | |||
| one RTP session with multiple media types. There is also a solution | possible in SDP to define one RTP session with multiple media types. | |||
| for multiplexing multiple RTP sessions onto the same transport | ||||
| [I-D.westerlund-avtcore-single-transport-multiplexing]. | ||||
| 7.4. Network Apsects | 6.4.3. Media Stream Usage | |||
| Media streams being transported in RTP have some particular usage in | ||||
| an RTP application. This usage of the media stream is in many | ||||
| applications so far implicitly signalled. For example, all audio | ||||
| media streams arriving in the only audio RTP session are to be | ||||
| decoded, mixed and played out. However, in more advanced | ||||
| applications that use multiple media streams there will be more than | ||||
| a single usage or purpose among the set of media streams being sent | ||||
| or received. RTP applications will need to signal this usage | ||||
| somehow. Here the choice of SSRC multiplexing versus session | ||||
| multiplexing will have significant impact. If one uses SSRC | ||||
| multiplexing to its full extent one will have to explicitly indicate | ||||
| for each SSRC what its usage and purpose are using some signalling | ||||
| between the application instances. | ||||
| This SSRC usage signalling will have some impact on the application | ||||
| and also on any central RTP nodes. It is important in the design to | ||||
| consider the implications of the need for additional signalling | ||||
| between the nodes. One consideration is if a receiver can utilize | ||||
| the media stream at all before it has received the signalling message | ||||
| describing the media stream and its usage. Another consideration is | ||||
| that any RTP central node, like an RTP mixer or translator that | ||||
| selects, mixes or processes streams, in most cases will need to | ||||
| receive the same signalling to know how to treat media streams with | ||||
| different usage in the right fashion. | ||||
| Application designers should consider putting media streams of the | ||||
| same usage and/or receiving the same treatment in middleboxes in the | ||||
| same RTP sessions and use the RTP session as an explicit indication | ||||
| of how to deal with media streams. By having session level | ||||
| indication of usage and having different RTP sessions for different | ||||
| usages, the need for stream specific signalling can be reduced, | ||||
| especially signalling that is time critical and needs to be | ||||
| provided before the media stream is available. | ||||
| 6.5. Network Aspects | ||||
| The multiplexing choice has impact on network level mechanisms that | The multiplexing choice has impact on network level mechanisms that | |||
| need to be considered by the implementor. | need to be considered by the implementor. | |||
| 7.4.1. Quality of Service | 6.5.1. Quality of Service | |||
| When it comes to Quality of Service mechanisms, they are either flow | When it comes to Quality of Service mechanisms, they are either flow | |||
| based or marking based. RSVP [RFC2205] is an example of a flow based | based or marking based. RSVP [RFC2205] is an example of a flow based | |||
| mechanism, while Diff-Serv [RFC2474] is an example of a Marking based | mechanism, while Diff-Serv [RFC2474] is an example of a Marking based | |||
| one. For a marking based scheme, the method of multiplexing will not | one. For a marking based scheme, the method of multiplexing will not | |||
| affect the possibility to use QoS. | affect the possibility to use QoS. | |||
| However, for a flow based scheme there is a clear difference between | However, for a flow based scheme there is a clear difference between | |||
| the methods. SSRC multiplexing will result in all media streams | the methods. SSRC multiplexing will result in all media streams | |||
| being part of the same 5-tuple (protocol, source address, destination | being part of the same 5-tuple (protocol, source address, destination | |||
| address, source port, destination port) which is the most common | address, source port, destination port) which is the most common | |||
| selector for flow based QoS. Thus, separation of the level of QoS | selector for flow based QoS. Thus, separation of the level of QoS | |||
| between media streams is not possible. That is however possible for | between media streams is not possible. That is however possible for | |||
| session based multiplexing, where each different version can be in a | session based multiplexing, where each different version can be in a | |||
| different RTP session that can be sent over different 5-tuples. | different RTP session that can be sent over different 5-tuples. | |||
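The difference can be illustrated with a minimal Python sketch. It assumes a flow based QoS classifier that keys purely on the 5-tuple. The addresses, ports, SSRC values and the names FiveTuple and classify are hypothetical and are not taken from any specification.

```python
from collections import defaultdict
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """Selector assumed to be used by a flow based QoS mechanism."""
    protocol: str
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int


def classify(packets):
    """Group media streams (identified by SSRC) by the 5-tuple they use."""
    flows = defaultdict(set)
    for five_tuple, ssrc in packets:
        flows[five_tuple].add(ssrc)
    return dict(flows)


# SSRC multiplexing: two media streams in one RTP session share one 5-tuple.
shared = FiveTuple("udp", "192.0.2.1", "198.51.100.7", 5004, 6004)
ssrc_muxed = classify([(shared, 0x1111), (shared, 0x2222)])

# Session multiplexing: each RTP session is sent over its own 5-tuple.
audio = FiveTuple("udp", "192.0.2.1", "198.51.100.7", 5004, 6004)
video = FiveTuple("udp", "192.0.2.1", "198.51.100.7", 5006, 6006)
session_muxed = classify([(audio, 0x1111), (video, 0x2222)])

# Number of flows the classifier can tell apart in each case.
print(len(ssrc_muxed), len(session_muxed))
```

In the SSRC multiplexed case the classifier sees a single flow carrying both streams, so it has nothing to separate them by. In the session multiplexed case each stream is its own flow and can be given its own treatment.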
| 7.4.2. NAT and Firewall Traversal | 6.5.2. NAT and Firewall Traversal | |||
| In today's network there exist a large number of middleboxes. The | In today's network there exist a large number of middleboxes. The | |||
| ones that normally have most impact on RTP are Network Address | ones that normally have most impact on RTP are Network Address | |||
| Translators (NAT) and Firewalls (FW). | Translators (NAT) and Firewalls (FW). | |||
| Below we analyze and comment on the impact of requiring more | Below we analyze and comment on the impact of requiring more | |||
| underlying transport flows in the presence of NATs and Firewalls: | underlying transport flows in the presence of NATs and Firewalls: | |||
| End-Point Port Consumption: A given IP address only has 65536 | End-Point Port Consumption: A given IP address only has 65536 | |||
| available local ports per transport protocol for all consumers of | available local ports per transport protocol for all consumers of | |||
| skipping to change at page 30, line 46 ¶ | skipping to change at page 32, line 6 ¶ | |||
| NAT Traversal Excess Time: Making the NAT/FW traversal takes a | NAT Traversal Excess Time: Making the NAT/FW traversal takes a | |||
| certain amount of time for each flow. It also takes time in a | certain amount of time for each flow. It also takes time in a | |||
| phase of communication between accepting to communicate and the | phase of communication between accepting to communicate and the | |||
| media path being established which is fairly critical. The best | media path being established which is fairly critical. The best | |||
| case scenario for how much extra time it can take following the | case scenario for how much extra time it can take following the | |||
| specified ICE procedures are: 1.5*RTT + Ta*(Additional_Flows-1), | specified ICE procedures are: 1.5*RTT + Ta*(Additional_Flows-1), | |||
| where Ta is the pacing timer, which ICE specifies to be no smaller | where Ta is the pacing timer, which ICE specifies to be no smaller | |||
| than 20 ms. That assumes a message in one direction, and then an | than 20 ms. That assumes a message in one direction, and then an | |||
| immediate triggered check back. This as ICE first finds one | immediate triggered check back. This as ICE first finds one | |||
| candidate pair that works prior to establish multiple flows. | candidate pair that works prior to establish multiple flows. | |||
| Thus, there are no extra time until one has found a working | Thus, there is no extra time until one has found a working | |||
| candidate pair. Based on that working pair the extra time is to | candidate pair. Based on that working pair the needed extra time | |||
| in parallel establish the, in most cases 2-3, additional flows. | is to in parallel establish the, in most cases 2-3, additional | |||
| flows. | ||||
| NAT Traversal Failure Rate: Due to the need to establish more than a | NAT Traversal Failure Rate: Due to the need to establish more than a | |||
| single flow through the NAT, there is some risk that establishing | single flow through the NAT, there is some risk that establishing | |||
| the first flow succeeds but that one or more of the additional | the first flow succeeds but that one or more of the additional | |||
| flows fail. The risk that this happens is hard to quantify, but | flows fail. The risk that this happens is hard to quantify, but | |||
| it should be fairly low as one flow from the same interfaces has | it should be fairly low as one flow from the same interfaces has | |||
| just been successfully established . Thus only rare events such | just been successfully established. Thus only rare events such as | |||
| as NAT resource overload, or selecting particular port numbers | NAT resource overload, or selecting particular port numbers that | |||
| that are filtered etc, should be reasons for failure. | are filtered etc, should be reasons for failure. | |||
| Deep Packet Inspection and Multiple Streams: Firewalls differ in how | ||||
| deeply they inspect packets. There is some potential that | ||||
| deeply inspecting firewalls will have similar legacy issues with | ||||
| multiple SSRCs as some stack implementations. | ||||
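The best case estimate given under "NAT Traversal Excess Time" can be written out as a small Python sketch. The function name and the example values are hypothetical. Returning zero when no additional flows are needed is an assumption made here, based on the statement that no extra time is spent until a working candidate pair has been found.

```python
def ice_extra_time_ms(rtt_ms: float, additional_flows: int,
                      ta_ms: float = 20.0) -> float:
    """Best case extra setup time: 1.5*RTT + Ta*(Additional_Flows-1)."""
    if ta_ms < 20.0:
        # The text states that ICE specifies Ta to be no smaller than 20 ms.
        raise ValueError("Ta must be at least 20 ms")
    if additional_flows < 0:
        raise ValueError("additional_flows must not be negative")
    if additional_flows == 0:
        # Assumption: without additional flows there is no extra time.
        return 0.0
    return 1.5 * rtt_ms + ta_ms * (additional_flows - 1)


# Hypothetical example: 100 ms RTT and three additional flows.
print(ice_extra_time_ms(100.0, 3))
```

With these example values the estimate is 1.5*100 + 20*(3-1) = 190 ms on top of the time needed to find the first working candidate pair.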
| SSRC multiplexing keeps additional media streams within one RTP | SSRC multiplexing keeps additional media streams within one RTP | |||
| Session and does not introduce any additional NAT traversal | Session and does not introduce any additional NAT traversal | |||
| complexities per media stream. In contrast, the session multiplexing | complexities per media stream. In contrast, the session multiplexing | |||
| is using one RTP session per media stream. Thus additional lower | is using one RTP session per media stream. Thus additional lower | |||
| layer transport flows will be required, unless an explicit de- | layer transport flows will be required, unless an explicit de- | |||
| multiplexing layer is added between RTP and the transport protocol. | multiplexing layer is added between RTP and the transport protocol. | |||
| A proposal for how to multiplex multiple RTP sessions over the same | A proposal for how to multiplex multiple RTP sessions over the same | |||
| single lower layer transport exist in | single lower layer transport exist in | |||
| [I-D.westerlund-avtcore-single-transport-multiplexing]. | [I-D.westerlund-avtcore-single-transport-multiplexing]. | |||
| 7.4.3. Multicast | 6.5.3. Multicast | |||
| Multicast groups provides a powerful semantics for a number of real- | Multicast groups provides a powerful semantics for a number of real- | |||
| time applications, especially the ones that desire broadcast-like | time applications, especially the ones that desire broadcast-like | |||
| behaviors with one end-point transmitting to a large number of | behaviors with one end-point transmitting to a large number of | |||
| receivers, like in IPTV. But that same semantics do result in a | receivers, like in IPTV. But that same semantics do result in a | |||
| certain number of limitations. | certain number of limitations. | |||
| One limitation is that for any group, sender side adaptation to the | One limitation is that for any group, sender side adaptation to the | |||
| actual receiver properties causes a degradation for all participants | actual receiver properties causes degradation for all participants to | |||
| to what is supported by the receiver with the worst conditions among | what is supported by the receiver with the worst conditions among the | |||
| the group participants. In most cases this is not acceptable. | group participants. In most cases this is not acceptable. Instead | |||
| Instead various receiver based solutions are employed to ensure that | various receiver based solutions are employed to ensure that the | |||
| the receivers achieve best possible performance. By using scalable | receivers achieve best possible performance. By using scalable | |||
| encoding and placing each scalability layer in a different multicast | encoding and placing each scalability layer in a different multicast | |||
| group, the receiver can control the amount of traffic it receives. | group, the receiver can control the amount of traffic it receives. | |||
| To have each scalability layer on a different multicast group, one | To have each scalability layer on a different multicast group, one | |||
| RTP session per multicast group is used. | RTP session per multicast group is used. | |||
| If instead a single RTP session over multiple transports were to be | If instead a single RTP session over multiple transports were to be | |||
| deployed, i.e. multicast groups with each layer as its own SSRC, | deployed, i.e. multicast groups with each layer as its own SSRC, | |||
| then very different views of the RTP session would exist. That as | then very different views of the RTP session would exist. That as | |||
| one receiver may see only a single layer (SSRC), while another may | one receiver may see only a single layer (SSRC), while another may | |||
| see three SSRCs if it joined three multicast groups. This would | see three SSRCs if it joined three multicast groups. This would | |||
| skipping to change at page 32, line 11 ¶ | skipping to change at page 33, line 22 ¶ | |||
| able to determine if a receiver isn't reporting on a particular SSRC | able to determine if a receiver isn't reporting on a particular SSRC | |||
| due to that it is not a member of that multicast group, or because it | due to that it is not a member of that multicast group, or because it | |||
| doesn't receive it as a result of a transport failure. | doesn't receive it as a result of a transport failure. | |||
| Thus it appears easiest and most straightforward to use multiple RTP | Thus it appears easiest and most straightforward to use multiple RTP | |||
| sessions. In addition, the transport flow considerations in | sessions. In addition, the transport flow considerations in | |||
| multicast are a bit different from unicast. First of all there is no | multicast are a bit different from unicast. First of all there is no | |||
| shortage of port space, as each multicast group has its own port | shortage of port space, as each multicast group has its own port | |||
| space. | space. | |||
| 7.4.4. Multiplexing multiple RTP Session on a Single Transport | 6.5.4. Multiplexing multiple RTP Session on a Single Transport | |||
| For applications that don't need flow based QoS and would like to save | For applications that don't need flow based QoS and would like to save | |||
| ports and NAT/FW traversal costs, there is a proposal for how to | ports and NAT/FW traversal costs and where usage of multiple media | |||
| achieve multiplexing of multiple RTP sessions over the same lower | types in one RTP session is not suitable, there is a proposal for how | |||
| to achieve multiplexing of multiple RTP sessions over the same lower | ||||
| layer transport | layer transport | |||
| [I-D.westerlund-avtcore-single-transport-multiplexing]. Using such a | [I-D.westerlund-avtcore-single-transport-multiplexing]. Using such a | |||
| solution would allow session multiplexing without most of the | solution would allow session multiplexing without most of the | |||
| perceived downsides of additional RTP sessions creating a need for | perceived downsides of additional RTP sessions creating a need for | |||
| additional transport flows. | additional transport flows. | |||
| 7.5. Security Aspects | 6.6. Security Aspects | |||
| On the basic level there is no significant difference in security | On the basic level there is no significant difference in security | |||
| when having one RTP session and having multiple. However, there are | when having one RTP session and having multiple. However, there are | |||
| a few more detailed considerations that might need to be considered | a few more detailed considerations that might need to be considered | |||
| in certain usages. | in certain usages. | |||
| 7.5.1. Security Context Scope | 6.6.1. Security Context Scope | |||
| When using SRTP [RFC3711] the security context scope is important and | When using SRTP [RFC3711] the security context scope is important and | |||
| can be a necessary differentiation in some applications. As SRTP's | can be a necessary differentiation in some applications. As SRTP's | |||
| crypto suites (so far) is built around symmetric keys, the receiver | crypto suites (so far) is built around symmetric keys, the receiver | |||
| will need to have the same key as the sender. This results in that | will need to have the same key as the sender. This results in that | |||
| none in a multi-party session can be certain that a received packet | no one in a multi-party session can be certain that a received packet | |||
| really was sent by the claimed sender or by another party having | really was sent by the claimed sender or by another party having | |||
| access to the key. In most cases this is a sufficient security | access to the key. In most cases this is a sufficient security | |||
| property, but there are a few cases where this does create | property, but there are a few cases where this does create | |||
| situations. | situations. | |||
| The first case is when someone leaves a multi-party session and one | The first case is when someone leaves a multi-party session and one | |||
| wants to ensure that the party that left can no longer access the | wants to ensure that the party that left can no longer access the | |||
| media streams. This requires that everyone re-keys without | media streams. This requires that everyone re-keys without | |||
| disclosing the keys to the excluded party. | disclosing the keys to the excluded party. | |||
| skipping to change at page 33, line 11 ¶ | skipping to change at page 34, line 24 ¶ | |||
| stream can be based on the stream being encrypted with a key that | stream can be based on the stream being encrypted with a key that | |||
| user can't access without paying premium, having the key-management | user can't access without paying premium, having the key-management | |||
| limit access to the key. | limit access to the key. | |||
| In the latter case it is likely easiest from signalling, transport | In the latter case it is likely easiest from signalling, transport | |||
| (if done over multicast) and security to use a different RTP session. | (if done over multicast) and security to use a different RTP session. | |||
| That way the user(s) not intended to receive a particular stream can | That way the user(s) not intended to receive a particular stream can | |||
| easily be excluded. There is no need to have SSRC specific keys, | easily be excluded. There is no need to have SSRC specific keys, | |||
| which many of the key-management systems cannot handle. | which many of the key-management systems cannot handle. | |||
| 7.5.2. Key-Management for Multi-party session | 6.6.2. Key-Management for Multi-party session | |||
| Performing key-management for Multi-party session can be a challenge. | Performing key-management for Multi-party session can be a challenge. | |||
| This section considers some of the issues. | This section considers some of the issues. | |||
| Transport translator based session cannot use Security Description | Transport translator based session cannot use Security Description | |||
| [RFC4568] nor DTLS-SRTP [RFC5764] without an extension as each end- | [RFC4568] nor DTLS-SRTP [RFC5764] without an extension as each end- | |||
| point provides it's set of keys. In centralized conference, the | point provides its set of keys. In centralized conference, the | |||
| signalling counterpart is a conference server and the media plane | signalling counterpart is a conference server and the media plane | |||
| unicast counterpart (to which DTLS messages would be sent) is the | unicast counterpart (to which DTLS messages would be sent) is the | |||
| translator. Thus an extension like Encrypted Key Transport | translator. Thus an extension like Encrypted Key Transport | |||
| [I-D.ietf-avt-srtp-ekt] are needed or a MIKEY [RFC3830] based | [I-D.ietf-avt-srtp-ekt] is needed or a MIKEY [RFC3830] based solution | |||
| solution that allows for keying all session participants with the | that allows for keying all session participants with the same master | |||
| same master key. | key. | |||
| Keying of multicast transported SRTP face similar challenges as the | Keying of multicast transported SRTP face similar challenges as the | |||
| transport translator case. | transport translator case. | |||
| 6.6.3. Complexity Implications | ||||
| The usage of security functions can surface complexity implications | ||||
| of the choice of multiplexing and topology. This becomes especially | ||||
| evident in RTP topologies having any type of middlebox that processes | ||||
| or modifies RTP/RTCP packets. While there is very small overhead for | ||||
| an unsecured RTP translator or mixer to rewrite an SSRC value in the | ||||
| RTP packet, the cost of doing so when using cryptographic security | ||||
| functions is higher. For example if using SRTP [RFC3711], the actual | ||||
| security context and exact crypto key are determined by the SSRC | ||||
| field value. If one changes it, the encryption and the | ||||
| authentication tag computation must be performed using another | ||||
| key. Thus changing the SSRC | ||||
| value implies a decryption using the old SSRC and its security | ||||
| context followed by an encryption using the new one. | ||||
| There exist many valid cases where a middlebox will be forced to | ||||
| perform such cryptographic operations due to the intended purpose of | ||||
| the middlebox. For example, a media transcoding RTP translator cannot | ||||
| avoid performing these operations as it will produce a different | ||||
| payload compared to the input. However, there exist some cases where | ||||
| another topology and/or multiplexing choice could avoid the | ||||
| complexities. | ||||
| 6.7. Multiple Media Types in one RTP session | ||||
| Having different media types, like audio and video, in the same RTP | ||||
| session is not forbidden, only recommended against, as discussed | ||||
| earlier in Section 6.2.1.1. When using multiple media types, there | ||||
| are a number of considerations: | ||||
| Payload Type gives Media Type: This solution is dependent on getting | ||||
| the media type from the Payload Type. This overloads the | ||||
| de-multiplexing point in a receiver, making it serve two purposes: | ||||
| first to provide the main media type and determine the | ||||
| processing chain, then later to give the exact configuration of the | ||||
| encoder and packetization. | ||||
| Payload Type field limitations: The total number of Payload Types | ||||
| available to use in an RTP session is fairly limited, especially | ||||
| if Multiplexing RTP Data and Control Packets on a Single Port | ||||
| [RFC5761] is used. For certain applications negotiating a large | ||||
| set of codecs and configurations this may become an issue. | ||||
| An SSRC cannot use two clock rates simultaneously: The used RTP | ||||
| clock rate for an SSRC is determined from the payload type. As | ||||
| discussed in Appendix A it is not possible to simultaneously use | ||||
| two different clock rates for the same SSRC. Even switching clock | ||||
| rate once has potential issues if packet loss occurs at the same | ||||
| time. Different media types commonly have different clock rates, | ||||
| which prevents, or creates issues with, using two different media | ||||
| types for the same SSRC. | ||||
| Do not switch media types for an SSRC: The primary reason to avoid | ||||
| switching from sending for example audio to sending video using | ||||
| the same SSRC is the implications for a receiver. When this | ||||
| happens, the processing chain in the receiver will have to switch | ||||
| from one media type to another. As the processing chains for | ||||
| different media types are entirely different and are connected to | ||||
| different outputs, it is difficult to reuse the decoding chain, | ||||
| which a normal codec change likely can. Instead the entire | ||||
| processing chain has to be torn down and replaced. In addition, | ||||
| there is likely a clock rate switching problem, possibly resulting | ||||
| in synchronization loss at the point of switching media type if | ||||
| some packet loss occurs. So this is a behavior that shall be | ||||
| avoided. | ||||
| RTCP Bit-rate Issues: If the media types are significantly different | ||||
| in bit-rate, the RTCP bandwidth rates assigned to each source in a | ||||
| session can result in interesting effects, like that the RTCP bit- | ||||
| rate share for an audio stream is larger than the actual audio | ||||
| bit-rate. In itself this doesn't cause any conflicts, only | ||||
| potentially unnecessary overhead. It is possible to avoid this | ||||
| using AVPF or SAVPF and setting the trr-int parameter, which can bring | ||||
| down unnecessary regular reporting while still allowing for rapid | ||||
| feedback. | ||||
| De-composite end-points: De-composite nodes that rely on the regular | ||||
| network to separate audio and video to different devices do not | ||||
| work well with this session setup. If they are forced to work, | ||||
| all media receiver parts of a de-composite end-point will receive | ||||
| all media, thus doubling the bit-rate consumption for the end- | ||||
| point. | ||||
| Flow based QoS Separation: Flow based QoS mechanisms will see all | ||||
| the media streams in the RTP session as part of a single flow. | ||||
| Therefore there is no possibility to provide separated QoS | ||||
| behavior for the different media types or flows. | ||||
| RTP Mixers and Translators: An RTP mixer or Media Translator will | ||||
| also have to support this particular session setup, where it | ||||
| previously could rely on the RTP session to determine what processing | ||||
| options should be applied to the incoming packets. | ||||
| Legacy Implementations: The use of multiple media types has the | ||||
| potential for even larger issues with legacy implementations than | ||||
| single media type SSRC multiplexing due to the occurrence of | ||||
| multiple media types among the payload type configurations. | ||||
| As can be seen, there is nothing in here that prevents using a single | ||||
| RTP session for multiple media types. However, it does create a number | ||||
| of limitations and special case implementation requirements. So | ||||
| anyone considering using this setup should carefully review if the | ||||
| reasons for using a single RTP session are sufficient to motivate the | ||||
| needed special handling. | ||||
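As an illustration of the first considerations above, the following Python sketch shows a receiver that derives both media type and clock rate from the payload type and refuses a media type switch on an existing SSRC. The payload type table, the SSRC values and the class name MediaTypeTracker are hypothetical. A real application would take the mapping from its negotiated signalling.

```python
# Hypothetical payload type table: payload type -> (media type, clock rate).
PAYLOAD_TYPES = {
    96: ("audio", 48000),
    97: ("video", 90000),
    98: ("video", 90000),
}


class MediaTypeTracker:
    """Track the media type first seen for each SSRC."""

    def __init__(self, payload_types):
        self._payload_types = payload_types
        self._media_type_by_ssrc = {}

    def on_packet(self, ssrc, payload_type):
        """Return (media type, clock rate), rejecting media type switches."""
        media_type, clock_rate = self._payload_types[payload_type]
        known = self._media_type_by_ssrc.setdefault(ssrc, media_type)
        if known != media_type:
            raise ValueError(
                f"SSRC {ssrc:#x} switched from {known} to {media_type}")
        return media_type, clock_rate


tracker = MediaTypeTracker(PAYLOAD_TYPES)
print(tracker.on_packet(0xA, 96))  # audio stream
print(tracker.on_packet(0xB, 97))  # video stream
print(tracker.on_packet(0xB, 98))  # codec change within video is accepted
```

The payload type lookup here serves two purposes at once, selecting the processing chain and the configuration within it, which is the overloading described above.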
| 7. Arch-Types | ||||
| This section discusses some arch-types of how RTP multiplexing can be | ||||
| used in applications to achieve certain goals and a summary of their | ||||
| implications. For each arch-type there is discussion of benefits and | ||||
| downsides. | ||||
| 7.1. Single SSRC per Session | ||||
| In this arch-type each end-point in a point-to-point session has only | ||||
| a single SSRC, thus the RTP session contains only two SSRCs, one | ||||
| local and one remote. This session can be used either | ||||
| unidirectionally, i.e. with only a single media stream, or | ||||
| bi-directionally, i.e. with both end-points having one media | ||||
| stream each. If the application needs | ||||
| additional media flows between the end-points, they will have to | ||||
| establish additional RTP sessions. | ||||
| The Pros: | ||||
| 1. This arch-type has great legacy interoperability potential as it | ||||
| will not tax any RTP stack implementations. | ||||
| 2. The signalling has good possibilities to negotiate and describe | ||||
| the exact formats and bit-rates for each media stream, especially | ||||
| using today's tools in SDP. | ||||
| 3. It does not matter if usage or purpose of the media stream is | ||||
| signalled on media stream level or session level as there is no | ||||
| difference. | ||||
| 4. It is possible to control security association per RTP session | ||||
| with current key-management. | ||||
| The Cons: | ||||
| a. The number of required RTP sessions cannot really be higher, | ||||
| which has the implications: | ||||
| * Linear growth of the amount of NAT/FW state with number of | ||||
| media streams. | ||||
| * Increased delay and resource consumption from NAT/FW | ||||
| traversal. | ||||
| * Likely larger signalling message and signalling processing | ||||
| requirement due to the amount of session related information. | ||||
| * Higher potential for a single media stream to fail during | ||||
| transport between the end-points. | ||||
| b. When the number of RTP sessions grows, the amount of explicit | ||||
| state for relating media streams also grows, linearly or possibly | ||||
| exponentially, depending on how the application needs to relate | ||||
| media streams. | ||||
| c. The port consumption may become a problem for centralized | ||||
| services, where the central node's port consumption grows rapidly | ||||
| with the number of sessions. | ||||
| d. For applications where the media streams are highly dynamic in | ||||
| their usage, i.e. entering and leaving, the amount of signalling | ||||
| can grow high. Issues with the timely establishment of | ||||
| additional RTP sessions can also arise. | ||||
| e. A need for cross session RTCP requests is likely to exist and may | ||||
| cause issues. | ||||
| f. If the same SSRC value is reused in multiple RTP sessions rather | ||||
| than being randomly chosen, interworking with applications that | ||||
| use another multiplexing structure than this application will | ||||
| have issues and require SSRC translation. | ||||
| g. Cannot be used with Any Source Multicast (ASM) as one cannot | ||||
| guarantee that only two end-points participate as packet senders. | ||||
| Using SSM, it is possible to restrict to these requirements if no | ||||
| RTCP feedback is used. | ||||
| h. For most security mechanisms, each RTP session or transport flow | ||||
| requires individual key-management and security association | ||||
| establishment thus increasing the overhead. | ||||
| i. Does not support multi-party sessions within a session. Instead | ||||
| each multi-party participant will require an individual RTP | ||||
| session to a given end-point, even if a central node is used. | ||||
| RTP applications that need to inter-work with legacy RTP | ||||
| applications, like VoIP and video conferencing, can potentially | ||||
| benefit from this structure. However, a large number of media | ||||
| descriptions in SDP can also run into issues with existing | ||||
| implementations. For any application needing a larger number of | ||||
| media flows, the overhead can become very significant. This | ||||
| structure is also not suitable for multi-party sessions, as any given | ||||
| media stream from each participant, although having the same usage in the | ||||
| application, must have its own RTP session. In addition, the dynamic | ||||
| behavior that can arise in multi-party applications can tax the | ||||
| signalling system and make timely media establishment more difficult. | ||||
| 7.2. Multiple SSRCs of the Same Media Type | ||||
| In this arch-type, each RTP session serves only a single media type. | ||||
| The RTP session can contain multiple media streams, either from a | ||||
| single end-point or due to multiple end-points. This commonly | ||||
| creates a low number of RTP sessions, typically only two, one for | ||||
| audio and one for video, with a corresponding need for two listening | ||||
| ports when using RTP and RTCP multiplexing. | ||||
| The Pros: | ||||
| 1. Low number of RTP sessions needed compared to single SSRC case. | ||||
| This implies: | ||||
| * Reduced NAT/FW state | ||||
| * Lower NAT/FW Traversal Cost in both processing and delay. | ||||
| 2. Allows for early de-multiplexing in the processing chain in RTP | ||||
| applications where all media streams of the same type have the | ||||
| same usage in the application. | ||||
| 3. Works well with media type de-composite end-points. | ||||
| 4. Enables Flow-based QoS with different prioritization between | ||||
| media types. | ||||
| 5. For applications with dynamic usage of media streams, i.e. they | ||||
| come and go frequently, having much of the state associated with | ||||
| the RTP session rather than an individual SSRC can avoid the need | ||||
| for in-session signalling of meta-information about each SSRC. | ||||
| 6. Low overhead for security association establishment. | ||||
| The Cons: | ||||
| a. May have some need for cross session RTCP requests for things | ||||
| that affect both media types in an asynchronous way. | ||||
| b. Some potential for concern with legacy implementations that do | ||||
| not support the RTP specification fully when it comes to handling | ||||
| multiple SSRCs per end-point. | ||||
| c. Will not be able to control security association for sets of | ||||
| media streams within the same media type with today's key- | ||||
| management mechanisms, only between SDP media descriptions. | ||||
| For RTP applications where all media streams of the same media type | ||||
| share the same usage, this structure provides efficiency gains in the | ||||
| amount of network state used and provides more fate sharing with other | ||||
| media flows of the same type. At the same time, it still | ||||
| maintains almost all functionality when it comes to negotiation | ||||
| in the signalling of the properties for the individual media type and | ||||
| also enables flow based QoS prioritization between media types. It | ||||
| handles multi-party sessions well, independently of multicast or | ||||
| centralized transport distribution, as additional sources can | ||||
| dynamically enter and leave the session. | ||||
| 7.3. Multiple Sessions for one Media Type | ||||
| In this arch-type one goes one step further than in the above | ||||
| (Section 7.2) by using multiple RTP sessions also for a single media | ||||
| type. The main reason for going in this direction is that the RTP | ||||
| application needs separation of the media streams due to their usage. | ||||
| Some typical reasons for going to this arch-type are scalability over | ||||
| multicast, simulcast, need for extended QoS prioritization of media | ||||
| streams due to their usage in the application, or the need for fine | ||||
| granular signalling using today's tools. | ||||
| The Pros: | ||||
| 1. More suitable for Multicast usage where receivers can | ||||
| individually select which RTP sessions they want to participate | ||||
| in, assuming each RTP session has its own multicast group. | ||||
| 2. Detailed indication of the application's usage of the media | ||||
| stream, where multiple different usages exist. | ||||
| 3. Less need for SSRC specific explicit signalling for each media | ||||
| stream and thus reduced need for explicit and timely signalling. | ||||
| 4. Enables detailed QoS prioritization for flow based mechanisms. | ||||
| 5. Works well with de-composite end-points. | ||||
| 6. Handles dynamic usage of media streams well. | ||||
| 7. For transport translator based multi-party sessions, this | ||||
| structure allows for improved control of which type of media | ||||
| streams an end-point receives. | ||||
| 8. The scope for who is included in a security association can be | ||||
| structured around the different RTP sessions, thus enabling such | ||||
| functionality with existing key-management. | ||||
| The Cons: | ||||
| a. Increases the number of RTP sessions compared to Multiple SSRCs | ||||
| of the Same Media Type. | ||||
| b. Increased amount of session configuration state. | ||||
| c. May need synchronized cross-session RTCP requests and require | ||||
| some consideration due to this. | ||||
| d. For media streams that are part of scalability, simulcast or | ||||
| transport robustness, it will be necessary to bind sources, and | ||||
| the binding must support multiple RTP sessions. | ||||
| e. Some potential for concern with legacy implementations that do | ||||
| not support the RTP specification fully when it comes to handling | ||||
| multiple SSRCs per end-point. | ||||
| f. Higher overhead for security association establishment. | ||||
| g. If the application needs finer control than on media type level | ||||
| over which session participants are included in different | ||||
| sets of security associations, most of today's key-management | ||||
| will have difficulties establishing such a session. | ||||
| For more complex RTP applications that have several different usages | ||||
| for media streams of the same media type and/or use scalability or | ||||
| simulcast, this solution can enable those functions at the cost of | ||||
| increased overhead associated with the additional sessions. This | ||||
| type of structure is suitable for more advanced applications as well | ||||
| as multicast based applications requiring differentiation to | ||||
| different participants. | ||||
| 7.4. Multiple Media Types in one Session | ||||
| This arch-type is to use a single RTP session for multiple different | ||||
| media types, like audio and video, and possibly also transport | ||||
| robustness mechanisms like FEC or Retransmission. Each media stream | ||||
| will use its own SSRC, and a given SSRC value from a particular end- | ||||
| point will never be used for more than a single media type. | ||||
| The Pros: | ||||
| 1. Single RTP session which implies: | ||||
| * Minimal NAT/FW state. | ||||
| * Minimal NAT/FW Traversal Cost. | ||||
| * Fate-sharing for all media flows. | ||||
| 2. Enables separation of the different media types based on the | ||||
| payload types so media type specific end-point or central | ||||
| processing can still be supported despite single session. | ||||
| 3. Can handle dynamic allocations of media streams well on an RTP | ||||
| level. Depends on the application's needs for explicit | ||||
| indication of the stream usage and how timely that can be | ||||
| signalled. | ||||
| 4. Minimal overhead for security association establishment. | ||||
| The Cons: | ||||
| a. Not suitable for interworking with other applications that use | ||||
| individual RTP sessions per media type or multiple sessions for a | ||||
| single media type, due to high risk of forced SSRC translation. | ||||
| b. Negotiation of bandwidth for the different media types is | ||||
| currently not possible in SDP. This requires SDP extensions to | ||||
| enable payload or source specific bandwidth. Likely to be a | ||||
| problem due to media type asymmetry in required bandwidth. | ||||
| c. Enforces higher bandwidth and processing on de-composite end- | ||||
| points. | ||||
| d. Flow based QoS cannot provide separate treatment to some media | ||||
| streams compared to others in the single RTP session. | ||||
| e. If there is significant asymmetry between the media streams' RTCP | ||||
| reporting needs, there are some challenges in configuration and | ||||
| usage to avoid wasting RTCP reporting on the media stream that | ||||
| does not need that frequent reporting. | ||||
| f. Not suitable for applications where some receivers want to | ||||
| receive only a subset of the media streams, especially if | ||||
| multicast or transport translator is being used. | ||||
| g. Additional concern with legacy implementations that do not | ||||
| support the RTP specification fully when it comes to handling | ||||
| multiple SSRCs per end-point, as multiple simultaneous media | ||||
| types also need to be handled. | ||||
| h. If the application needs finer control over which session | ||||
| participants are included in different sets of security | ||||
| associations, most key-management will have difficulties | ||||
| establishing such a session. | ||||
| The analysis in this document and the considerations in Section 6.7 | ||||
| imply that this is suitable only in a restricted set of use cases. | ||||
| The aspect in the above list that can be most difficult to judge long | ||||
| term is likely the potential need for interworking with other | ||||
| applications and services. | ||||
| 7.5. Summary | ||||
| There are some clear relations between these arch-types. Both the | ||||
| "single SSRC per RTP session" and the "multiple media types in one | ||||
| session" are cases which require full explicit signalling of the | ||||
| media stream relations. However, they operate on two different | ||||
| levels where the first primarily enables session level binding, and | ||||
| the second needs to do it all on SSRC level. From another | ||||
| perspective, the two solutions are the two extreme points when it | ||||
| comes to the number of RTP sessions required. | ||||
| The two other arch-types "Multiple SSRCs of the Same Media Type" and | ||||
| "Multiple Sessions for one Media Type" are examples of two other | ||||
| cases that first of all allow for some implicit mapping of the role | ||||
| or usage of the media streams based on which RTP session they appear | ||||
| in. They thus potentially allow for less signalling and in particular | ||||
| reduced need for real-time signalling in dynamic sessions. They also | ||||
| represent points in between the first two when it comes to the number of | ||||
| RTP sessions established, i.e. representing an attempt to reduce the | ||||
| number of sessions as much as possible without compromising the | ||||
| functionality the session provides both on network level and on | ||||
| signalling level. | ||||
| 8. Guidelines | 8. Guidelines | |||
| This section contains a number of recommendations for implementors or | This section contains a number of recommendations for implementors or | |||
| specification writers when it comes to handling multi-stream. | specification writers when it comes to handling multi-stream. | |||
| Don't Require the same SSRC across Sessions: As discussed in | Do not Require the same SSRC across Sessions: As discussed in | |||
| Section 7.2.6 there exist drawbacks in using the same SSRC in | Section 6.2.4 there exist drawbacks in using the same SSRC in | |||
| multiple RTP sessions as a mechanism to bind related media streams | multiple RTP sessions as a mechanism to bind related media streams | |||
| together. Instead a mechanism to explicitly signal the relation | together. It is instead recommended that a mechanism to | |||
| SHOULD be used, either in RTP/RTCP or in the used signalling | explicitly signal the relation is used, either in RTP/RTCP or in | |||
| mechanism that establish the RTP session(s). | the used signalling mechanism that establishes the RTP session(s). | |||
| Use SSRC multiplexing for additional Media Sources: In the cases an | Use SSRC multiplexing for additional Media Sources: In the cases an | |||
| RTP end-point needs to transmit additional media source(s) of the | RTP end-point needs to transmit additional media source(s) of the | |||
| same media type and purpose in the application it is RECOMMENDED | same media type and purpose in the application, it is recommended | |||
| to send them as additional SSRCs in the same RTP session. For | to send them as additional SSRCs in the same RTP session. For | |||
| example a telepresence room where there are three cameras, and | example a tele-presence room where there are three cameras, and | |||
| each camera captures 2 persons sitting at the table, sending each | each camera captures 2 persons sitting at the table, sending each | |||
| camera as its own SSRC within a single RTP session is recommended. | camera as its own SSRC within a single RTP session is recommended. | |||
| Use additional RTP sessions for streams with different purposes: | Use additional RTP sessions for streams with different purposes: | |||
| When media streams have different purpose or processing | When media streams have different purpose or processing | |||
| requirements it is RECOMMENDED that the different types of streams | requirements it is recommended that the different types of streams | |||
| are put in different RTP sessions. | are put in different RTP sessions. | |||
| When using Session Multiplexing use grouping: When using Session | When using Session Multiplexing use grouping: When using Session | |||
| Multiplexing solutions it is RECOMMENDED to explicitly group | Multiplexing solutions, it is recommended to explicitly group | |||
| the involved RTP sessions using the signalling mechanism, for | the involved RTP sessions using the signalling mechanism, for | |||
| example The Session Description Protocol (SDP) Grouping Framework. | example The Session Description Protocol (SDP) Grouping Framework. | |||
| [RFC5888] | [RFC5888], using some appropriate grouping semantics. | |||
| RTP/RTCP Extensions May Support SSRC and Session Multiplexing: When | RTP/RTCP Extensions May Support SSRC and Session Multiplexing: When | |||
| defining an RTP or RTCP extension, the creator needs to consider | defining an RTP or RTCP extension, the creator needs to consider | |||
| if this extension is applicable in both SSRC multiplexed and | if this extension is applicable in both SSRC multiplexed and | |||
| Session multiplexed usages. If it is, then any generic extensions | Session multiplexed usages. Any extension intended to be generic | |||
| are RECOMMENDED to support both. Applications that are not as | is recommended to support both. Applications that are not as | |||
| generally applicable will have to consider if interoperability is | generally applicable will have to consider if interoperability is | |||
| better served by defining a single solution or providing both | better served by defining a single solution or providing both | |||
| options. | options. | |||
| Transport Support Extensions: When defining new RTP/RTCP extensions | Transport Support Extensions: When defining new RTP/RTCP extensions | |||
| intended for transport support, like the retransmission or FEC | intended for transport support, like the retransmission or FEC | |||
| mechanisms, they are RECOMMENDED to include support for both SSRC | mechanisms, they are recommended to include support for both SSRC | |||
| and Session multiplexing so that application developers can choose | and Session multiplexing so that application developers can choose | |||
| freely from the set of mechanisms without concerning themselves | freely from the set of mechanisms without concerning themselves | |||
| with if a particular solution only supports one of the | with which of the multiplexing choices a particular solution | |||
| multiplexing choices. | supports. | |||
| This discussion and guidelines points out that a small set of | 9. Proposal for Future Work | |||
| The above discussion and guidelines indicate that a small set of | ||||
| extension mechanisms could greatly improve the situation when it | extension mechanisms could greatly improve the situation when it | |||
| comes to using multiple streams independently of Session multiplexing | comes to using multiple streams independently of Session multiplexing | |||
| or SSRC multiplexing. These extensions are: | or SSRC multiplexing. These extensions are: | |||
| Media Source Identification: A Media source identification that can | Media Source Identification: A Media source identification that can | |||
| be used to bind together media streams that are related to the | be used to bind together media streams that are related to the | |||
| same media source. A proposal | same media source. A proposal | |||
| [I-D.westerlund-avtext-rtcp-sdes-srcname] exist for a new SDES | [I-D.westerlund-avtext-rtcp-sdes-srcname] exist for a new SDES | |||
| item SRCNAME that also can be used with the a=ssrc SDP attribute | item SRCNAME that also can be used with the a=ssrc SDP attribute | |||
| to provide signalling layer binding information. | to provide signalling layer binding information. | |||
| SSRC limiations within RTP sessions: By providing a signalling | SSRC limitations within RTP sessions: By providing a signalling | |||
| solution that allows the signalling peers to explicitly express | solution that allows the signalling peers to explicitly express | |||
| both support and limitations on how many simultaneous media | both support and limitations on how many simultaneous media | |||
| streams an end-point can handle within a given RTP Session. That | streams an end-point can handle within a given RTP Session. That | |||
| ensures that usage of SSRC multiplexing occurs when supported and | ensures that usage of SSRC multiplexing occurs when supported and | |||
| without overloading an end-point. This extension is proposed in | without overloading an end-point. This extension is proposed in | |||
| [I-D.westerlund-avtcore-max-ssrc]. | [I-D.westerlund-avtcore-max-ssrc]. | |||
| 9. RTP Specification Clarifications | 10. RTP Specification Clarifications | |||
| This section describes a number of clarifications to the RTP | This section describes a number of clarifications to the RTP | |||
| specifications that are likely necessary for aligned behavior when | specifications that are likely necessary for aligned behavior when | |||
| RTP sessions contains more SSRCs than one local and one remote. | RTP sessions contain more SSRCs than one local and one remote. | |||
| 9.1. RTCP Reporting from all SSRCs | 10.1. RTCP Reporting from all SSRCs | |||
| When one have multiple SSRC in an RTP node, then all these SSRC must | When one have multiple SSRC in an RTP node, all these SSRC must send | |||
| send RTCP SR or RR as long as the SSRC exist. It is not sufficient | RTCP SR or RR as long as the SSRC exist. It is not sufficient that | |||
| that only one SSRC in the node sends report blocks on the incoming | only one SSRC in the node sends report blocks on the incoming RTP | |||
| RTP streams. The reason for this is that a third party monitor may | streams. The reason for this is that a third party monitor may not | |||
| not necessarily be able to determine that all these SSRC are in fact | necessarily be able to determine that all these SSRC are in fact co- | |||
| co-located and originate from the same stack instance that gather | located and originate from the same stack instance that gather report | |||
| report data. | data. | |||
| 9.2. RTCP Self-reporting | 10.2. RTCP Self-reporting | |||
| For any RTP node that sends more than one SSRC, there exist the | For any RTP node that sends more than one SSRC, there is the question | |||
| question if SSRC1 needs to report its reception of SSRC2 and vice | if SSRC1 needs to report its reception of SSRC2 and vice versa. The | |||
| versa. The reason that they in fact need to report on all other | reason that they in fact need to report on all other local streams as | |||
| local streams as being received is report consistency. A third party | being received is report consistency. A third party monitor that | |||
| monitor that considers the full matrix of media streams and all known | considers the full matrix of media streams and all known SSRC reports | |||
| SSRC reports on these media streams would detect a gap in the reports | on these media streams would detect a gap in the reports which could | |||
| which could be a transport issue unless identified as in fact being | be a transport issue unless identified as in fact being sources from | |||
| sources from same node. | same node. | |||
| 9.3. Combined RTCP Packets | 10.3. Combined RTCP Packets | |||
| When a node contains multiple SSRCs, it is questionable if an RTCP | When a node contains multiple SSRCs, it is questionable if an RTCP | |||
| compound packet can only contain RTCP packets from a single SSRC or | compound packet can only contain RTCP packets from a single SSRC or | |||
| if multiple SSRCs can include their packets in a joint compound | if multiple SSRCs can include their packets in a joint compound | |||
| packet. The high level question is a matter for any receiver | packet. The high level question is a matter for any receiver | |||
| processing on what to expect. In addition to that question there is | processing on what to expect. In addition to that question there is | |||
| the issue of how to use the RTCP timer rules in these cases, as the | the issue of how to use the RTCP timer rules in these cases, as the | |||
| existing rules are focused on determining when a single SSRC can | existing rules are focused on determining when a single SSRC can | |||
| send. | send. | |||
| 10. IANA Considerations | 11. IANA Considerations | |||
| This document makes no request of IANA. | This document makes no request of IANA. | |||
| Note to RFC Editor: this section may be removed on publication as an | Note to RFC Editor: this section may be removed on publication as an | |||
| RFC. | RFC. | |||
| 11. Security Considerations | 12. Security Considerations | |||
| 12. Acknowledgements | There is discussion of the security implications of choosing SSRC vs | |||
| Session multiplexing in Section 6.6. | ||||
| 13. References | 13. Acknowledgements | |||
| 13.1. Normative References | The authors would like to thank Harald Alvestrand for providing | |||
| input into the discussion regarding multiple media types in a single | ||||
| RTP session. | ||||
| 14. References | ||||
| 14.1. Normative References | ||||
| [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
| [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. | [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. | |||
| Jacobson, "RTP: A Transport Protocol for Real-Time | Jacobson, "RTP: A Transport Protocol for Real-Time | |||
| Applications", STD 64, RFC 3550, July 2003. | Applications", STD 64, RFC 3550, July 2003. | |||
| 13.2. Informative References | 14.2. Informative References | |||
| [ALF] Clark, D. and D. Tennenhouse, "Architectural | [ALF] Clark, D. and D. Tennenhouse, "Architectural | |||
| Considerations for a New Generation of Protocols", SIGCOMM | Considerations for a New Generation of Protocols", SIGCOMM | |||
| Symposium on Communications Architectures and | Symposium on Communications Architectures and | |||
| Protocols (Philadelphia, Pennsylvania), pp. 200--208, IEEE | Protocols (Philadelphia, Pennsylvania), pp. 200--208, IEEE | |||
| Computer Communications Review, Vol. 20(4), | Computer Communications Review, Vol. 20(4), | |||
| September 1990. | September 1990. | |||
| [I-D.alvestrand-rtp-sess-neutral] | ||||
| Alvestrand, H., "Why RTP Sessions Should Be Content | ||||
| Neutral", draft-alvestrand-rtp-sess-neutral-00 (work in | ||||
| progress), December 2011. | ||||
| [I-D.holmberg-mmusic-sdp-bundle-negotiation] | [I-D.holmberg-mmusic-sdp-bundle-negotiation] | |||
| Holmberg, C. and H. Alvestrand, "Multiplexing Negotiation | Holmberg, C. and H. Alvestrand, "Multiplexing Negotiation | |||
| Using Session Description Protocol (SDP) Port Numbers", | Using Session Description Protocol (SDP) Port Numbers", | |||
| draft-holmberg-mmusic-sdp-bundle-negotiation-00 (work in | draft-holmberg-mmusic-sdp-bundle-negotiation-00 (work in | |||
| progress), October 2011. | progress), October 2011. | |||
| [I-D.ietf-avt-srtp-ekt] | [I-D.ietf-avt-srtp-ekt] | |||
| McGrew, D., Andreasen, F., Wing, D., and K. Fischer, | Wing, D., McGrew, D., and K. Fischer, "Encrypted Key | |||
| "Encrypted Key Transport for Secure RTP", | Transport for Secure RTP", draft-ietf-avt-srtp-ekt-03 | |||
| draft-ietf-avt-srtp-ekt-02 (work in progress), March 2011. | (work in progress), October 2011. | |||
| [I-D.ietf-avtext-multiple-clock-rates] | [I-D.ietf-avtext-multiple-clock-rates] | |||
| Petit-Huguenin, M., "Support for multiple clock rates in | Petit-Huguenin, M., "Support for multiple clock rates in | |||
| an RTP session", draft-ietf-avtext-multiple-clock-rates-01 | an RTP session", draft-ietf-avtext-multiple-clock-rates-02 | |||
| (work in progress), July 2011. | (work in progress), January 2012. | |||
| [I-D.ietf-payload-rtp-howto] | [I-D.ietf-payload-rtp-howto] | |||
| Westerlund, M., "How to Write an RTP Payload Format", | Westerlund, M., "How to Write an RTP Payload Format", | |||
| draft-ietf-payload-rtp-howto-01 (work in progress), | draft-ietf-payload-rtp-howto-01 (work in progress), | |||
| July 2011. | July 2011. | |||
| [I-D.westerlund-avtcore-max-ssrc] | [I-D.westerlund-avtcore-max-ssrc] | |||
| Westerlund, M., Burman, B., and F. Jansson, "Multiple | Westerlund, M., Burman, B., and F. Jansson, "Multiple | |||
| Synchronization sources (SSRC) in RTP Session Signaling", | Synchronization sources (SSRC) in RTP Session Signaling", | |||
| draft-westerlund-avtcore-max-ssrc (work in progress), | draft-westerlund-avtcore-max-ssrc (work in progress), | |||
| skipping to change at page 39, line 23 ¶ | skipping to change at page 49, line 45 ¶ | |||
| Protocol (SDP) Grouping Framework", RFC 5888, June 2010. | Protocol (SDP) Grouping Framework", RFC 5888, June 2010. | |||
| [RFC6190] Wenger, S., Wang, Y., Schierl, T., and A. Eleftheriadis, | [RFC6190] Wenger, S., Wang, Y., Schierl, T., and A. Eleftheriadis, | |||
| "RTP Payload Format for Scalable Video Coding", RFC 6190, | "RTP Payload Format for Scalable Video Coding", RFC 6190, | |||
| May 2011. | May 2011. | |||
| [RFC6285] Ver Steeg, B., Begen, A., Van Caenegem, T., and Z. Vax, | [RFC6285] Ver Steeg, B., Begen, A., Van Caenegem, T., and Z. Vax, | |||
| "Unicast-Based Rapid Acquisition of Multicast RTP | "Unicast-Based Rapid Acquisition of Multicast RTP | |||
| Sessions", RFC 6285, June 2011. | Sessions", RFC 6285, June 2011. | |||
| Appendix A. Dismissing Payload Type Multiplexing | ||||
| This section documents a number of reasons why using the payload type | ||||
| as a multiplexing point for most things related to multiple streams | ||||
| is unsuitable. If one attempts to use Payload type multiplexing | ||||
| beyond its defined usage, that has well known negative effects on | ||||
| RTP. To use Payload type as the single discriminator for multiple | ||||
| streams implies that all the different media streams are being sent | ||||
| with the same SSRC, thus using the same timestamp and sequence number | ||||
| space. This has many effects: | ||||
| 1. Putting restraint on RTP timestamp rate for the multiplexed | ||||
| media. For example, media streams that use different RTP | ||||
| timestamp rates cannot be combined, as the timestamp values need | ||||
| to be consistent across all multiplexed media frames. Thus | ||||
| streams are forced to use the same rate. When this is not | ||||
| possible, Payload Type multiplexing cannot be used. | ||||
| 2. Many RTP payload formats may fragment a media object over | ||||
| multiple packets, like parts of a video frame. These payload | ||||
| formats need to determine the order of the fragments to | ||||
| correctly decode them. Thus it is important to ensure that all | ||||
| fragments related to a frame or a similar media object are | ||||
| transmitted in sequence and without interruptions within the | ||||
| object. This can be solved relatively simply on the sender side | ||||
| by ensuring that the fragments of each media stream are sent in | ||||
| sequence. | ||||
| 3. Some media formats require uninterrupted sequence number space | ||||
| between media parts. These are media formats where any missing | ||||
| RTP sequence number will result in decoding failure or invoking | ||||
| of a repair mechanism within a single media context. The text/ | ||||
| T140 payload format [RFC4103] is an example of such a format. | ||||
| These formats will need a sequence numbering abstraction | ||||
| function between RTP and the individual media stream before | ||||
| being used with Payload Type multiplexing. | ||||
| 4. Sending multiple streams in the same sequence number space makes | ||||
| it impossible to determine which Payload Type and thus which | ||||
| stream a packet loss relates to. | ||||
| 5. If RTP Retransmission [RFC4588] is used and there is a loss, it | ||||
| is possible to ask for the missing packet(s) by SSRC and | ||||
| sequence number, not by Payload Type. If only some of the | ||||
| Payload Type multiplexed streams are of interest, there is no | ||||
| way of telling which missing packet(s) belong to the interesting | ||||
| stream(s) and all lost packets must be requested, wasting | ||||
| bandwidth. | ||||
| 6. The current RTCP feedback mechanisms are built around providing | ||||
| feedback on media streams based on stream ID (SSRC), packet | ||||
| (sequence numbers) and time interval (RTP Timestamps). There is | ||||
| almost never a field to indicate which Payload Type is reported, | ||||
| so sending feedback for a specific media stream is difficult | ||||
| without extending existing RTCP reporting. | ||||
| 7. The current RTCP media control messages [RFC5104] specification | ||||
| is oriented around controlling particular media flows, i.e. | ||||
| requests are done addressing a particular SSRC. Such mechanisms | ||||
| would need to be redefined to support Payload Type multiplexing. | ||||
| 8. The number of payload types is inherently limited. | ||||
| Accordingly, using Payload Type multiplexing limits the number | ||||
| of streams that can be multiplexed and does not scale. This | ||||
| limitation is exacerbated if one uses solutions like RTP and | ||||
| RTCP multiplexing [RFC5761] where a number of payload types are | ||||
| blocked due to the overlap between RTP and RTCP. | ||||
| 9. At times, there is a need to group multiplexed streams and this | ||||
| is currently possible for RTP Sessions and for SSRC, but there | ||||
| is no defined way to group Payload Types. | ||||
| 10. It is currently not possible to signal bandwidth requirements | ||||
| per media stream when using Payload Type Multiplexing. | ||||
| 11. Most existing SDP media level attributes cannot be applied on a | ||||
| per Payload Type level and would require re-definition in that | ||||
| context. | ||||
| 12. A legacy end-point that doesn't understand the indication that | ||||
| different RTP payload types are different media streams may be | ||||
| slightly confused by the large amount of possibly overlapping or | ||||
| identically defined RTP Payload Types. | ||||
| Authors' Addresses | Authors' Addresses | |||
| Magnus Westerlund | Magnus Westerlund | |||
| Ericsson | Ericsson | |||
| Farogatan 6 | Farogatan 6 | |||
| SE-164 80 Kista | SE-164 80 Kista | |||
| Sweden | Sweden | |||
| Phone: +46 10 714 82 87 | Phone: +46 10 714 82 87 | |||
| Email: magnus.westerlund@ericsson.com | Email: magnus.westerlund@ericsson.com | |||
| End of changes. 159 change blocks. | ||||
| 615 lines changed or deleted | 1219 lines changed or added | |||
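
The report consistency argument in the "RTCP Self-reporting" section above can be illustrated with a minimal Python sketch of what a third party monitor might do when it considers the full matrix of reporters and sources. Everything here is an assumption made for illustration: the function name missing_reports, the small integer SSRC values, the (reporter, source) tuple representation, and the premise that every SSRC is an active sender. None of it comes from the draft or from an RTP library.

```python
from itertools import permutations


def missing_reports(ssrcs, report_blocks):
    """Return the (reporter, source) pairs for which no report block was seen.

    The monitor expects one report block for every ordered pair of
    distinct SSRCs, since it cannot tell which SSRCs are co-located.
    """
    expected = set(permutations(ssrcs, 2))
    return sorted(expected - set(report_blocks))


# Hypothetical session: SSRC 1 and SSRC 2 live in the same node,
# SSRC 3 lives in a different node.
ssrcs = [1, 2, 3]

# The node hosting SSRC 1 and 2 does not report on its own local streams.
without_self_reports = [(1, 3), (2, 3), (3, 1), (3, 2)]

# The same node also reports SSRC 1 <-> SSRC 2 as being received.
with_self_reports = without_self_reports + [(1, 2), (2, 1)]

gaps_without = missing_reports(ssrcs, without_self_reports)
gaps_with = missing_reports(ssrcs, with_self_reports)
print(gaps_without)
print(gaps_with)
```

In this sketch the monitor sees two unexplained gaps when the co-located SSRCs do not report on each other, and none when they do. A real monitor would also have to handle SSRCs that only receive, which this example ignores.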
This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ | ||||
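
Items 4 and 5 of Appendix A above argue that a shared sequence number space hides which stream a lost packet belonged to. The following minimal Python sketch contrasts the two cases under simplifying assumptions: the helper find_gaps, the payload type values 96 and 97, the SSRC values, and the packet lists are all hypothetical, and sequence number wrap-around is ignored.

```python
def find_gaps(seqs):
    """Return sequence numbers missing between the lowest and highest seen.

    Wrap-around of the 16-bit RTP sequence number is not handled.
    """
    seen = set(seqs)
    return [s for s in range(min(seen), max(seen) + 1) if s not in seen]


# Payload Type multiplexing: one SSRC, so one shared sequence number space.
# Each received packet is (sequence_number, payload_type).
pt_muxed = [(1, 96), (2, 97), (5, 96), (6, 97)]
lost_shared = find_gaps(seq for seq, _pt in pt_muxed)

# The receiver knows that packets 3 and 4 are missing, but the payload type
# was carried in the lost packets, so it cannot tell which stream they
# belonged to.

# SSRC multiplexing: each stream has its own SSRC and sequence number space,
# so every gap is attributed to exactly one stream.
ssrc_muxed = {0x1111: [1, 2, 3], 0x2222: [1, 3]}
lost_per_ssrc = {ssrc: find_gaps(seqs) for ssrc, seqs in ssrc_muxed.items()}

print(lost_shared)
print(lost_per_ssrc)
```

With per-SSRC sequence numbers, a receiver interested only in the stream with SSRC 0x2222 can request just that stream's missing packet. In the payload type multiplexed case it would have to request both lost packets.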