Network Working Group                                         D. Grondal
Internet-Draft                                                 B. Burman
Intended status: Standards Track                           M. Westerlund
Expires: April 26, 2012                                      Ericsson AB
                                                        October 24, 2011

                     Media Stream Selection (MESS)
             draft-westerlund-dispatch-stream-selection-00

Abstract

   This document describes how media stream selection can be achieved in
   both a conferencing scenario and peer to peer communication.  To
   allow endpoints to select specific media streams, all available media
   in the session must be identifiable and there is a need for messages
   that can be securely transported between endpoints and network nodes.
   This document also describes a way to distribute the identification
   information to all participating endpoints.  The necessary messages
   can potentially be mapped onto several different encodings, and this
   document proposes one mapping that uses an extended version of the
   Binary Floor Control Protocol.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 26, 2012.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.

Grondal, et al.          Expires April 26, 2012                 [Page 1]
Internet-Draft                    MESS                      October 2011

   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
     1.2.  Requirements Language
   2.  Motivation
   3.  Use Cases for MESS
     3.1.  Include Media Content
     3.2.  Exclude Media Content
     3.3.  Substitute Media Content
     3.4.  Reset Media Content
     3.5.  Reset All
   4.  Media Information
     4.1.  Unique Media ID
     4.2.  Distribution of Media Information
     4.3.  Publishing Media Information from Endpoints
     4.4.  Publishing Media Information from Conference Nodes
     4.5.  Receiving Media Information
     4.6.  RTP Media Transport
     4.7.  SDP Media Description
       4.7.1.  Point to Point Communication
       4.7.2.  Conferencing Scenario
   5.  MESS Requests
     5.1.  Transport
     5.2.  BFCP Extensions
       5.2.1.  OPERATION
       5.2.2.  MEDIA-IDENTIFICATION
       5.2.3.  CHANNEL-IDENTIFICATION
     5.3.  Defined Messages
       5.3.1.  MediaSelectionAck
       5.3.2.  Include
       5.3.3.  Exclude
       5.3.4.  Substitute
       5.3.5.  Reset
       5.3.6.  Reset All
   6.  MESS Responses
   7.  RTP Implications
   8.  Examples
     8.1.  Client joins a conference
   9.  IANA Considerations
   10. Security Considerations
   11. Acknowledgements
   12. References
     12.1.  Normative References
     12.2.  Informative References
   Authors' Addresses

1.  Introduction

   Multimedia conferencing is becoming more and more important.  The
   setup of a multimedia conference is well defined, using for example
   SIP and SDP.  However, as SIP/SDP is used for session setup, it
   leaves little or no dynamic control over what media content to
   receive from other participants during the session.  This document
   targets this weakness and describes functionality that grants
   receiving endpoints the capability to dynamically select what
   information and media content are received from other participating
   clients.

1.1.
Terminology

   These terms are commonly used throughout the document:

   Media Content:  Media being sent from one specific media capture
      device, such as a microphone for audio media, or a video camera
      for video media.

   Endpoint:  A device that handles media and either originates a
      number of media content, terminates a number of media content, or
      does some combination of both.  As an example, an RTP Mixer is
      considered an endpoint, while a simple RTP Translator that simply
      forwards all input streams is not.

1.2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

2.  Motivation

   In a communication scenario where one or more endpoints offer more
   than one media content, but where a receiving endpoint cannot handle
   all media content simultaneously, there may be a need for that
   endpoint to actively and dynamically select what content to receive
   during an ongoing conference.  A typical scenario would be a video
   conference where some endpoints have multiple cameras capturing
   different aspects of a room and a receiving endpoint can only render
   one video stream due to e.g. hardware limitations.

   Today, the only way to solve this is to have an RTP mixer handle the
   conference and let it choose one of the streams based on some
   criteria.  It is up to the RTP mixer implementation which stream to
   choose, but a common criterion is some type of speaker activity.

   It would be possible to let the receiving endpoints choose which
   media content(s) to receive, given that endpoints publish information
   about what media content is available to all other endpoints and that
   a protocol exists to request specific media content from other
   endpoints.  This functionality is what Media Stream Selection (MESS),
   described in this document, targets.
   It describes how to generate and distribute media content information
   in both conferencing scenarios and point to point sessions.  It also
   describes how to set up a control channel to send messages between
   endpoints and finally defines a set of messages that can be used to
   handle media content requests.

3.  Use Cases for MESS

   This section presents some typical use cases targeted by MESS.  The
   scenario is an endpoint participating in a conference, receiving
   media from a centralized conference node.  It is assumed that all
   participating endpoints have published information about what media
   content they are offering.  More media is available from other
   participants in the conference than the receiving endpoint in the use
   case can present simultaneously, and the conference node implements
   some policy for selecting which media to forward.

3.1.  Include Media Content

   An endpoint selects what content to receive from another endpoint
   based on that endpoint's published media content information.  An
   endpoint can make new decisions about what content to receive
   dynamically at any time during the session.

3.2.  Exclude Media Content

   An endpoint wishes to stop receiving content from another endpoint
   e.g. due to low quality or other reasons.  The set of excluded media
   during a session is subject to change and an endpoint can make new
   decisions to exclude content dynamically at any time during the
   session.

3.3.  Substitute Media Content

   An endpoint renders received media and wants to replace the received
   media with some other available media content.  It can be seen as an
   atomic combination of the two use cases above, first excluding one
   media content and effectively replacing it by including another.  An
   endpoint can make new substitute decisions dynamically at any time
   during the session.

3.4.
Reset Media Content

   An endpoint no longer has any specific wish to always include or
   always exclude a certain content, but wants to return the decision
   whether or not to forward streams to the conference node.  An
   endpoint can reset any previously included or excluded stream at any
   time during the session.  At the beginning of the session, all media
   streams SHALL have a state corresponding to being reset and thus be
   under the conference node policy control.

3.5.  Reset All

   An endpoint wishes to remove all previous decisions about included
   and excluded media.  This method is a shortcut to avoid the repeated
   reset messages described in Section 3.4.

4.  Media Information

   To be able to identify the available media content, each content must
   be given a unique media ID.  The given ID must also be distributed to
   all participating endpoints.  The following sections describe how to
   generate such IDs and how to distribute them.

   The text in SDP Media Description (Section 4.7) describes the
   specific case where media description is signaled with SDP [RFC4566],
   but other signaling methods MAY be used, in which case the mapping to
   SDP-specific lines and attributes does not apply and other mandatory
   mappings SHOULD instead be defined in a separate RFC.

   The text in Section 4.6 describes the specific case when RTP
   [RFC3550] is used for media transport.  Other media transports MAY be
   used, in which case the mapping to RTP does not apply and other
   mandatory mappings SHOULD instead be defined in a separate RFC.

4.1.  Unique Media ID

   To request specific media content, all involved endpoints need to
   agree on how to uniquely identify different content with a unique
   media ID.  No particular algorithm for generating unique media IDs is
   specified, as it will depend on which media transport is used.
   The main requirements on such an algorithm are that media IDs are
   unique among all communicating endpoints and that all endpoints share
   the same definitions of what media streams are identified by what
   media IDs.

4.2.  Distribution of Media Information

   Assuming all available media content from all communicating endpoints
   is associated with some kind of media ID, those media IDs need to be
   distributed to endpoints wishing to actively control what content to
   receive.  There might also be other interesting per-media information
   that needs distribution, e.g. names or descriptions of individual
   media content to aid selection.

4.3.  Publishing Media Information from Endpoints

   Endpoints wishing to join a session are responsible for sending
   information about the media content they will make available to the
   other party or parties.  This is done by generating media IDs, or
   other sufficiently unique identification that can be used for
   generation of media IDs, for all transmitted media content.
   Depending on the capabilities of the signaling protocol used, an
   endpoint can also have the opportunity to convey information other
   than the media ID, e.g. an explicit description or name of the media
   content.

4.4.  Publishing Media Information from Conference Nodes

   The SIP Event Package for Conference State [RFC4575] defines an XML
   schema used for distribution of conference information.  The schema
   defines elements (among others) for users, endpoints and media.  The
   defined <media> element contains a media ID attribute.  This
   attribute SHALL be used to carry generated media IDs.  This means
   that a media ID only needs to be unique within an endpoint context,
   and referring clients MUST use user information, endpoint
   information, and media ID together to uniquely identify media
   content.  User and endpoint information are relevant in a scenario
   covering multiple users and/or endpoints (e.g.
   where a middle node is responsible for forwarding requests or making
   decisions about media content selection), but may be redundant for a
   point to point scenario.

   Any description or naming of individual media content published by
   endpoints (as described in the previous section) SHOULD be included
   in the XML as the body of <display-text>, which is another sub
   element of <media>.  There may exist alternative ways to obtain
   naming and description information, but this will in general depend
   on what is supported by the media description protocol in use.

4.5.  Receiving Media Information

   Reception of media content information depends on the context in
   which the endpoint exists.  In a conferencing scenario, the
   distribution of media content information is in general different
   from the distribution in a point to point session, which must be
   taken into account when defining use of MESS with media description
   protocols.

4.6.  RTP Media Transport

   When RTP is used for transmission of media content, a single RTP
   session can transfer a number of different media content.  In that
   case every received data packet must carry an identifier, or
   something that can be used as an identifier, to separate individual
   content.  Without such an identifier it is not possible to
   demultiplex incoming packets correctly.  Other transmission protocols
   pose similar problems when multiplexing.

   In the case of RTP, SSRC could be used as the sole identifier, but to
   avoid changing the ID if the SSRC changes (e.g. due to an SSRC
   collision), an identifier that is related to, but not dependent on,
   SSRC is the best choice.

   RFC 4575 [RFC4575] defines an element <src-id>, a sub element of
   <media>, that MUST be used to carry the SSRC selected for the
   corresponding media content.  This enables an endpoint to do a
   reverse look-up of the media ID on incoming packets using SSRC, or
   CSRC in case media streams are aggregated by an RTP mixer.

4.7.
SDP Media Description

   This section applies when SDP media description is used with RTP
   Media Transport.  MESS MAY be used with other media transports in
   SDP, but that is out of scope for this document and SHOULD instead be
   described in a separate RFC.

   The generated RTP media IDs MUST be included as ssrc attributes
   (described in Source-Specific SDP Attributes [RFC5576]).

   Assuming a single media in an SDP media block, using an i-line (as
   described in SDP [RFC4566]) is sufficient to name an individual media
   content.  If a media block carries information about multiple SSRCs,
   this method is not enough to name all the different media content.
   For this purpose a new source-specific attribute is proposed
   (previously mentioned in
   draft-lennox-mmusic-sdp-source-selection-02).

   a=ssrc:<ssrc-id> information:<description>

      The new, optional, source-specific attribute has syntax and
      semantics identical to the i-line in SDP, except that it is
      specified per SSRC.  It provides a textual description of the
      media content represented by the SSRC included in the attribute
      declaration.

   In the case of RTP, an intercepting node in the network could be
   responsible for generating media descriptions upon reception of the
   actual RTP stream.  However, such a solution suffers from the fact
   that not all media may be sent to that node at all times.  This would
   delay media description creation until the intercepting node has
   received RTP packets from all media sources.

   In cases where a Media Gateway and its controller are separate
   entities (see e.g. Media Gateway Control Protocol [RFC3435]), such as
   in the 3GPP IMS split architecture where an MRFP and an MRFC exchange
   SDP information, e.g. through H.248 or SIP, the MRFC receives the SIP
   INVITE with SDP from participants and therefore also information
   about what SSRCs the endpoint intends to use.
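
   As an illustration of how a node that receives the SDP could collect
   the announced SSRCs together with their optional "information"
   descriptions, the following minimal Python sketch parses
   source-specific attribute lines of the general RFC 5576 form
   "a=ssrc:<ssrc-id> <attribute>:<value>".  The helper name
   ssrc_attributes, the sample SDP, and the description texts are
   hypothetical and only serve the example; the exact syntax of the
   proposed "information" attribute is assumed to follow that general
   form.

```python
import re

# General RFC 5576 form: a=ssrc:<ssrc-id> <attribute>[:<value>]
SSRC_LINE = re.compile(r"^a=ssrc:(\d+) ([^:\s]+)(?::(.*))?$")


def ssrc_attributes(sdp):
    """Map each SSRC announced in an SDP body to its source-specific attributes."""
    sources = {}
    for line in sdp.splitlines():
        match = SSRC_LINE.match(line.strip())
        if match:
            ssrc, name, value = match.groups()
            sources.setdefault(int(ssrc), {})[name] = value or ""
    return sources


# Hypothetical SDP fragment offering two video sources in one media block.
EXAMPLE_SDP = """\
v=0
o=alice 2890844526 2890844526 IN IP4 host.example.com
s=-
c=IN IP4 192.0.2.1
t=0 0
m=video 49170 RTP/AVP 96
a=rtpmap:96 H264/90000
a=ssrc:11111 cname:alice@example.com
a=ssrc:11111 information:Room overview camera
a=ssrc:22222 cname:alice@example.com
a=ssrc:22222 information:Whiteboard camera
"""

offered = ssrc_attributes(EXAMPLE_SDP)
```

   In this sketch the keys of "offered" are the SSRCs available for
   request, and the "information" entry, when present, is the text a
   conference node could publish as the per-media description.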
   The MRFP will see incoming SSRCs in the actual RTP streams, but not
   before media traffic has occurred.  The MRFC is also responsible for
   publishing the conference XML data [RFC4575], e.g. as a body in SIP
   NOTIFY to SUBSCRIBE'd endpoints.  In short, the MRFC, or any other
   node acting as Conference AS, has the best information for generating
   and distributing media IDs and is chosen as the responsible node.

   There is no big difference in a call-out conferencing scenario where
   a conferencing node calls out to invited participants.  The initial
   SDP will hold information about the capabilities of the network node,
   and responding endpoints provide answer SDPs with a media description
   (including SSRC) of their intended/offered media.

   In a distributed conference with several involved Conferencing ASes,
   and also if the 3GPP IMS split architecture is not used, the protocol
   to transfer media ID and SSRC information between Conferencing ASes /
   MRFCs is out of scope for this document.

   A conference node SHOULD try to locate information from endpoints
   that names or describes individual media content in the SDP, and
   include the information in the body of the per-media <display-text>
   tag.  The information SHOULD be taken from the following sources, in
   this order, falling back to the next source when a more preferred one
   is missing:

   1.  The value from an "information" SSRC attribute described above

   2.  The value from an i-line within the media block

   3.  The value field of a label attribute [RFC4574] within the media
       block

   4.  The value from an i-line at the SDP session level

   Other sources of information MAY be used, MAY be more preferred, and
   the <display-text> MAY also be empty.  The receiving client MAY e.g.
   use the content to amend originating user/endpoint information
   presented to the receiving user with the media content specific
   information.

4.7.1.  Point to Point Communication

   In point to point communication, endpoints could publish SSRC
   information using SDP in request and response.
   This is e.g. valid for the SDP in both the SIP INVITE and the
   corresponding 200 OK, or in any provisional responses.  The list of
   published SSRCs is the list of offered media content available for
   request.  Also, the SDP can be searched for the information attribute
   described in Section 4.7 to extract information about naming of media
   content.

4.7.2.  Conferencing Scenario

   In a conferencing scenario, the media content information is
   distributed using an XML body following the schema defined in the
   Conference package [RFC4575], e.g. carried by a SIP NOTIFY.  For use
   with SIP, once a client has SUBSCRIBEd for conference information, it
   SHOULD be prepared to receive SIP NOTIFYs.  If the SIP NOTIFY carries
   this type of XML, the receiving endpoint can extract information
   about media IDs and media content descriptions by finding all <media>
   elements in the received XML.  This produces a list of the available
   media IDs that are valid to request, together with their
   corresponding SSRC values.

5.  MESS Requests

   To request media streams, a communication channel between the
   endpoint and the node in control of the media streams needs to be set
   up.  This document describes use of SIP/SDP for this purpose, but
   other methods MAY be used and SHOULD then be described in a separate
   RFC.  The basic requirements on the communication channel used for
   MESS are reliable transmission and a near real time response.

5.1.  Transport

   The Binary Floor Control Protocol (BFCP) is described in RFC 4582
   [RFC4582].  BFCP is a protocol that is likely to already be supported
   by conference-aware nodes and clients.  This makes it easy to extend
   existing implementations to handle any newly defined message.  It
   also uses a reliable transport.  Floor control is closely related to
   media stream selection, so BFCP is regarded as a feasible choice.

   All MESS messages defined in this document are extensions to the
   existing messages described in BFCP [RFC4582].
   This means that they are not dependent upon any other message and can
   be implemented separately from legacy messages.

   The legacy floor control functionality of BFCP requires additional
   protocols to handle floor creation.  That is not needed by MESS and
   is thus out of scope for this document.  A possible way is described
   in SDP for BFCP [RFC4583].

5.2.  BFCP Extensions

   BFCP [RFC4582] defines 13 primitives.  Implementing MESS as an
   extension to BFCP requires this set of primitives to be extended with
   two more: "MediaSelection", having a value of 32, and
   "MediaSelectionAck", having a value of 33.  MESS uses the same common
   header, referred to as COMMON-HEADER, as defined in BFCP [RFC4582].
   The attributes also follow the same pattern as described in that RFC,
   i.e. they are in the format Type-Length-Value.

   +-------+-------------------+-------------------------+
   | Value | Primitive         | Direction               |
   +-------+-------------------+-------------------------+
   |   32  | MediaSelection    | FloorParticipant -> FCS |
   |   33  | MediaSelectionAck | FCS -> FloorParticipant |
   +-------+-------------------+-------------------------+

   FCS = Floor Control Server

                  Table 1: Media Selection Primitives

   In addition to these new primitives, MESS also defines a set of new
   attributes.

   +------+------------------------+-------------+
   | Type | Attribute              | Format      |
   +------+------------------------+-------------+
   |  32  | OPERATION              | Unsigned16  |
   |  33  | MEDIA-IDENTIFICATION   | Grouped     |
   |  34  | CHANNEL-IDENTIFICATION | OctetString |
   +------+------------------------+-------------+

                  Table 2: Media Selection Attributes

5.2.1.  OPERATION

   The following is the format of the OPERATION attribute.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 0 0 0 0|M|0 0 0 0 0 1 0 0|         Operation id          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Operation id:  This field contains a 16-bit value that identifies an
      operation to be performed.  The entries defined in this document
      are Include, Exclude, Substitute, Reset, and Reset All.

   +-------+------------+
   | Value | Operation  |
   +-------+------------+
   |   0   | Include    |
   |   1   | Exclude    |
   |   2   | Substitute |
   |   3   | Reset      |
   |   4   | Reset All  |
   +-------+------------+

                       Table 3: MESS Operations

5.2.2.  MEDIA-IDENTIFICATION

   The MEDIA-IDENTIFICATION attribute is a grouped attribute consisting
   of a header, referred to as MEDIA-IDENTIFICATION-HEADER, with
   identification type information, followed by a sequence of other
   MEDIA-IDENTIFICATION attributes.

   The following is the format of the MEDIA-IDENTIFICATION-HEADER.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 0 0 0 1|M|    Length     |    ID Type    |               |
   +-------------+-+---------------+---------------+               |
   |                                                               |
   /                           Media ID                            /
   /                                          +--------------------+
   |                                          |      Padding       |
   +------------------------------------------+--------------------+

   The ID Type field is an 8-bit field describing the type of media ID.
   The types defined in this document are:

   +-------+----------+
   | Value | Type     |
   +-------+----------+
   |   0   | User     |
   |   1   | Endpoint |
   |   2   | Media    |
   +-------+----------+

               Table 4: MESS Media Identification Types

   The following describes the format of the grouped attribute.  The
   Media ID field will contain different information based on the ID
   Type.
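
   As a concrete illustration of the Type-Length-Value layout used by
   the attributes in Section 5.2, the fixed-size OPERATION attribute of
   Section 5.2.1 could be encoded and decoded as in the following Python
   sketch.  The function names are hypothetical, and setting the M bit
   by default is an assumption of the example, since this document does
   not state its value.

```python
import struct

OPERATION_TYPE = 32  # attribute type from Table 2
OPERATIONS = {"Include": 0, "Exclude": 1, "Substitute": 2,
              "Reset": 3, "Reset All": 4}  # values from Table 3


def encode_operation(name, mandatory=True):
    """Encode OPERATION: 7-bit type, M bit, 8-bit length (4), 16-bit operation id."""
    first_octet = (OPERATION_TYPE << 1) | (1 if mandatory else 0)
    return struct.pack("!BBH", first_octet, 4, OPERATIONS[name])


def decode_operation(data):
    """Return (operation name, M bit) from the first four octets of data."""
    first_octet, length, op_id = struct.unpack("!BBH", data[:4])
    if first_octet >> 1 != OPERATION_TYPE or length != 4:
        raise ValueError("not an OPERATION attribute")
    names = {value: name for name, value in OPERATIONS.items()}
    return names[op_id], bool(first_octet & 1)


encoded = encode_operation("Substitute")
```

   The attribute always occupies exactly one 32-bit word, so no padding
   is needed for it.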
   The Media ID field in MEDIA-IDENTIFICATION attributes of type "User"
   is only allowed to hold MEDIA-IDENTIFICATION attributes of type
   "Endpoint", and the Media ID field in MEDIA-IDENTIFICATION attributes
   of type "Endpoint" is only allowed to hold MEDIA-IDENTIFICATION
   attributes of type "Media".  The Media ID field in
   MEDIA-IDENTIFICATION attributes of type "Media" holds the actual
   media ID number.  This allows expression of tree-like
   identifications, with an attribute of type "User" as the root node
   and attributes of type "Endpoint" below it, each containing only
   attributes of type "Media".  The following expresses this
   relationship in ABNF [RFC5234] syntax.

   MEDIA-IDENTIFICATION        = (USER-SUB-IDENTIFICATION /
                                  ENDPOINT-SUB-IDENTIFICATION /
                                  MEDIA-SUB-IDENTIFICATION)
   USER-SUB-IDENTIFICATION     = (MEDIA-IDENTIFICATION-HEADER)
                                 [ENDPOINT-SUB-IDENTIFICATION]
   ENDPOINT-SUB-IDENTIFICATION = (MEDIA-IDENTIFICATION-HEADER)
                                 [MEDIA-SUB-IDENTIFICATION]
   MEDIA-SUB-IDENTIFICATION    = (MEDIA-IDENTIFICATION-HEADER)

                ABNF for MEDIA-IDENTIFICATION attribute

5.2.3.  CHANNEL-IDENTIFICATION

   The following is a description of the CHANNEL-IDENTIFICATION
   attribute.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 0 0 1 0|M|    Length     |                               |
   +-------------+-+---------------+                               |
   |                                                               |
   /                          Channel Id                           /
   /                                          +--------------------+
   |                                          |      Padding       |
   +------------------------------------------+--------------------+

   This attribute is used to identify a specific channel to/from an
   endpoint.

5.3.  Defined Messages

   MESS defines 5 messages used to control what media content to
   receive.

   Floor participants MAY use the messages in this clause without having
   obtained a floor, and floor servers MAY accept the messages from
   participants not owning the floor.
   When floor control is bypassed in this way, the FLOOR-ID SHALL be
   ignored by receivers of this message implementing this RFC, and
   senders implementing this RFC SHALL set it to 0.

   If a floor chair requires a floor participant to own the floor before
   using the messages of this clause, they SHALL both follow regular
   BFCP floor control procedures as defined in BFCP [RFC4582].  For
   example, a floor participant not allowed to access the floor will
   receive a BFCP Error message containing Error Code 5 (Not
   authorized).

   When a floor control server implementing this RFC sends a BFCP
   SUPPORTED-PRIMITIVES attribute, the codes for the messages defined in
   this clause MUST be included in the Primitives list.

   Extension attributes that may be defined in the future are referred
   to as EXTENSION-ATTRIBUTE in the ABNF, as was done in Section 5.3 of
   BFCP [RFC4582].

5.3.1.  MediaSelectionAck

   All MediaSelection messages MUST be replied to with a
   MediaSelectionAck.  The format of the MediaSelectionAck is as
   follows:

   MediaSelectionAck = (COMMON-HEADER)
                       *[EXTENSION-ATTRIBUTE]

   The COMMON-HEADER of such a message MUST contain the transaction id
   of the acknowledged message.

5.3.2.  Include

   MESS Include messages are sent as BFCP messages with primitive
   "MediaSelection" and the OPERATION attribute set to value "Include".
   Then follows a list of media identifications representing media
   streams that are always to be included from now on.  Since there
   might be more than one transport channel between the requesting node
   and the receiving node, the message MAY also contain information
   about which transport channel to use, a channel ID.  In case RTP is
   used as transport, this channel ID SHOULD be a combination of SSRC
   and RTP session identification.  If the channel ID is missing, there
   are no restrictions on the transport used and any transport channel
   MAY be used to deliver the stream.
   Other transports are out of scope for this document but need a
   similar identification possibility.

   Requests to Include an already included media SHALL be ignored.  Note
   that the message is defined in a way that makes it additive, and
   identifications for previously included media SHOULD NOT be included
   for every new request.

   A receiver of an include message MUST respond with a
   MediaSelectionAck containing the same transaction id.

   Include = (COMMON-HEADER)
             1*(FLOOR-ID)
             (OPERATION)
             1*(MEDIA-IDENTIFICATION)
             1*(CHANNEL-IDENTIFICATION)
             *[EXTENSION-ATTRIBUTE]

5.3.3.  Exclude

   MESS Exclude messages are sent as BFCP messages with primitive
   "MediaSelection" and the OPERATION attribute set to value "Exclude".
   Then follows a list of media identifications representing media
   streams that are always to be excluded from now on.

   Requests to Exclude an already excluded media SHALL be ignored.  Note
   that the message is defined in a way that makes it additive, and
   identifications for previously excluded media SHOULD NOT be included
   for every new request.

   The exclude message MAY contain an optional channel ID that limits
   the exclusion to that channel, so that the excluded stream might
   still be sent using any other transport channel if available.  If the
   channel ID is missing in the exclude message, the exclusion concerns
   every channel between an endpoint and a sender.

   Exclude = (COMMON-HEADER)
             1*(FLOOR-ID)
             (OPERATION)
             1*(MEDIA-IDENTIFICATION)
             1*(CHANNEL-IDENTIFICATION)
             *[EXTENSION-ATTRIBUTE]

   A receiver of an exclude message MUST respond with a
   MediaSelectionAck containing the same transaction id.

5.3.4.  Substitute

   MESS Substitute messages are sent as BFCP messages with primitive
   "MediaSelection" and the OPERATION attribute set to "Substitute".

   Then follows a list of pairs of tuples called MEDIA-TUPLE.  A
   MEDIA-TUPLE contains a MEDIA-IDENTIFICATION and an optional
   CHANNEL-IDENTIFICATION.
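
   The Include message of Section 5.3.2 can be sketched in Python as
   follows.  The sketch assumes the BFCP common header and attribute
   header layouts of RFC 4582 (version 1, payload length counted in
   4-octet units, attribute length excluding padding, FLOOR-ID as
   attribute type 2).  It further assumes, purely for illustration, that
   a MEDIA-IDENTIFICATION of type "Media" carries the media ID as a
   32-bit unsigned integer, that the M bit is set, and that
   CHANNEL-IDENTIFICATION may be omitted as the prose of Section 5.3.2
   allows; none of these choices is fixed by this document, and all
   function names are hypothetical.

```python
import struct

MEDIA_SELECTION = 32  # primitive value from Table 1
FLOOR_ID = 2          # BFCP FLOOR-ID attribute type (RFC 4582)
OPERATION, MEDIA_IDENTIFICATION, CHANNEL_IDENTIFICATION = 32, 33, 34
OP_INCLUDE = 0        # Table 3
ID_TYPE_MEDIA = 2     # Table 4


def tlv(attr_type, value, mandatory=True):
    """One attribute: 7-bit type, M bit, length octet, value, padded to 32 bits."""
    length = 2 + len(value)  # header plus value, padding not counted
    body = struct.pack("!BB", (attr_type << 1) | int(mandatory), length) + value
    return body + b"\x00" * (-len(body) % 4)


def media_identification(media_id):
    # Assumption: leaf of ID Type "Media" with a 32-bit media ID number.
    return tlv(MEDIA_IDENTIFICATION, struct.pack("!BI", ID_TYPE_MEDIA, media_id))


def include_message(conference_id, transaction_id, user_id, media_ids,
                    channel_id=None):
    """Build a MediaSelection message whose OPERATION is Include."""
    attrs = tlv(FLOOR_ID, struct.pack("!H", 0))  # floor control bypassed: 0
    attrs += tlv(OPERATION, struct.pack("!H", OP_INCLUDE))
    for media_id in media_ids:
        attrs += media_identification(media_id)
    if channel_id is not None:
        attrs += tlv(CHANNEL_IDENTIFICATION, channel_id)  # opaque octets
    header = struct.pack("!BBHIHH", 1 << 5, MEDIA_SELECTION, len(attrs) // 4,
                         conference_id, transaction_id, user_id)
    return header + attrs


message = include_message(conference_id=1, transaction_id=7, user_id=42,
                          media_ids=[5])
```

   An Exclude message would differ only in the operation id.  The
   receiver is expected to answer with a MediaSelectionAck whose common
   header carries the same transaction id (7 in this example).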
The following is a formal description of MEDIA-TUPLE. MEDIA-TUPLE = (MEDIA IDENTIFICATION) 1*(CHANNEL-IDENTIFICATION) The following is a formal description of the Substitute message. Substitute = (COMMON-HEADER) 1*(FLOOR-ID) (OPERATION) 1*(MEDIA-TUPLE MEDIA-TUPLE) *[EXTENSION-ATTRIBUTE] In the list of pairs of MEDIA-TUPLEs, the pair MUST be interpret as follows. The first MEDIA-TUPLE defines the media stream, and possibly a transport channel, that should be replaced and the second MEDIA-TUPLE defines the media stream, and optionally a transport channel, to use as a replacement for the first MEDIA-TUPLE. Note that the included MEDIA-INDENTIFICATIONs typically need to be of type USER-SUB-IDENTIFICATION, since they in general do not refer to media from the same user, but other addressing MAY be sufficient. Since CHANNEL-IDENTIFICATION is optional and might be missing for any MEDIA-TUPLE in the above description, such a missing attribute should be interpreted as follows. No Channel ID in any tuple: All media occurrences should be replaced using the already used channels. This is the same as an atomic version of a message series containing an exclude message and an include message without CHANNEL-IDENTIFICATION attributes. Channel ID in the replaced media tuple: Replace the identified media only on the identified channel. This is the same as an atomic version of a message series containing an exclude message with a CHANNEL-IDENTIFICATION attribute and an include message without CHANNEL-IDENTIFICATION attribute. Channel Id present in the replacing media tuple: Replace all occurrences of an identified media with the replacing media stream using the identified channel. This is the same as an atomic version of an exclude message without CHANNEL-IDENTIFICATION attribute followed by an include message with a CHANNEL- IDENTIFICATION. Grondal, et al. 
Channel ID present in both media tuples: Replace the identified media on the identified channel with the replacing media, using the channel identified for the replacement. This is the same as an atomic version of an Exclude message followed by an Include message, both holding a CHANNEL-IDENTIFICATION attribute.

A receiver of a Substitute message MUST respond with a MediaSelectionAck containing the same transaction ID.

5.3.5. Reset

MESS Reset messages are sent as BFCP messages with primitive "Media Selection" and the OPERATION attribute set to "Reset". The message carries a list of MEDIA-IDENTIFICATION attributes to be reset. It does not matter whether the media described by a MEDIA-IDENTIFICATION has previously been excluded, included, or neither. The result at the floor control is always the same: the media associated with the received ID is no longer subject to explicit inclusion or exclusion.

Requests to Reset media that is already reset SHALL be ignored.

A receiver of a Reset message MUST respond with a MediaSelectionAck containing the same transaction ID.

   Reset = (COMMON-HEADER)
           1*(FLOOR-ID)
           (OPERATION)
           1*(MEDIA-IDENTIFICATION)
           *[EXTENSION-ATTRIBUTE]

5.3.6. Reset All

MESS Reset All messages are sent as BFCP messages with primitive "Media Selection" and the OPERATION attribute set to "Reset All". The message carries no MEDIA-IDENTIFICATION or CHANNEL-IDENTIFICATION attributes. It is equivalent to a MESS Reset message that includes the MEDIA-IDENTIFICATION attributes of all streams previously specified in "Include", in "Exclude", or as the second MEDIA-IDENTIFICATION attribute in "Substitute", effectively releasing all existing media streams from being subject to inclusion or exclusion. This operation can fully reset the inclusion/exclusion state even if the requesting endpoint has lost track of which restrictions it previously requested.

   Reset All = (COMMON-HEADER)
               1*(FLOOR-ID)
               (OPERATION)
               *[EXTENSION-ATTRIBUTE]
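The equivalences stated for Substitute and Reset All can be sketched in a few lines. This is an illustration under stated assumptions, not a normative algorithm: the MediaTuple and Request shapes and the function names decompose_substitute and reset_all_targets are hypothetical, and a channel of None stands for an absent CHANNEL-IDENTIFICATION attribute.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical in-memory shapes, used only for this illustration.
MediaTuple = Tuple[str, Optional[int]]  # (media id, channel id or None)
Request = Tuple[str, List[MediaTuple]]  # (operation name, media tuples)


def decompose_substitute(replaced: MediaTuple,
                         replacing: MediaTuple) -> List[Request]:
    """Express one Substitute pair as an Exclude step plus an Include step.

    Each step carries a channel only if its own tuple carried one, which
    reproduces the four CHANNEL-IDENTIFICATION cases of Section 5.3.4.
    A real implementation would have to apply both steps atomically.
    """
    return [("Exclude", [replaced]), ("Include", [replacing])]


def reset_all_targets(history: List[Request]) -> Set[str]:
    """Media ids that a Reset All releases, given the earlier requests."""
    targets: Set[str] = set()
    for operation, tuples in history:
        if operation in ("Include", "Exclude"):
            targets.update(media_id for media_id, _ in tuples)
        elif operation == "Substitute":
            # Tuples alternate (replaced, replacing); only the second
            # tuple of each pair counts towards Reset All.
            targets.update(media_id for media_id, _ in tuples[1::2])
    return targets


steps = decompose_substitute(("bob-cam", 2), ("alice-cam", None))
history = [
    ("Include", [("carol-cam", 1)]),
    ("Substitute", [("bob-cam", 2), ("alice-cam", None)]),
]
released = reset_all_targets(history)
```

In this example the replaced tuple carries a channel and the replacing tuple does not, which corresponds to the "Channel ID in the replaced media tuple" case.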
A receiver of a Reset All message MUST respond with a MediaSelectionAck containing the same transaction ID.

6. MESS Responses

This document defines an acknowledgement response (Section 5.3.1) as well as an error message with several different error reasons. BFCP [RFC4582] defines attributes for error handling. The BFCP Error message in Section 5.3.13 of BFCP [RFC4582] SHALL also be used for error reporting applicable to this document. Table 5 of BFCP [RFC4582] defines 9 error codes used in floor control. This document defines the following additional error codes that MAY be used in MESS responses:

   +--------+-------------------------------------+
   | Value  | Meaning                             |
   +--------+-------------------------------------+
   | 16     | Media does not exist                |
   | 17     | Endpoint does not exist             |
   | 18     | Cannot include media                |
   | 20     | Cannot exclude media                |
   | 21     | Cannot substitute media             |
   +--------+-------------------------------------+

   Table 5: Media Selection Error Codes

The exact reason for the failure MAY be included as UTF-8 encoded text in the "Error specific details" field of the BFCP ERROR-CODE attribute. The ERROR-INFO attribute MAY also be used.

7. RTP Implications

RTP is a widely used protocol for transferring media. Using MESS when media transport is handled by RTP might affect how RTCP reports must be handled when media is excluded.

If RTP Transport Translators [RFC5117] exist between endpoints and are able to adjust their forwarding rules based on the signalling defined in this document, RTCP reporting may become inconsistent for excluded media content. How this should be handled is out of scope for this document.

The operations described in MESS are consistent with the operation of RTP mixers and of direct endpoint-to-endpoint topologies.

8.
Examples

Note that the SDP in the examples below is not complete. Only relevant parts have been included.

8.1. Client joins a conference

A client joins a conference by sending an SDP according to the following:

   s=MESS Example Client
   m=audio 49200 RTP/AVP 96
   a=rtpmap:96 G719/48000/2
   a=ssrc:521923924 cname:alice@foo.example.com
   a=mid:1
   m=video 49300 RTP/AVP 96
   a=rtpmap:96 H264/90000
   a=ssrc:834753488 cname:alice@foo.example.com
   a=ssrc:834753488 information:"Alice cam"
   a=label:main video
   a=mid:2
   a=content:main
   m=application 50000 TCP/BFCP *
   a=setup:passive
   a=connection:new

In this SDP Alice explicitly names her video stream "Alice cam" by using the new attribute defined in this document. This information is associated with a specific SSRC.

A conferencing node in the network then sends the following SIP NOTIFY sample body to subscribed clients.

   Alice connected Alice cam Video 834753488 sendrecv

Any subscribing endpoint that receives this information can now actively request that the "Alice cam" media from sip:alice@example.com be explicitly included in the received media streams. This is done by sending an Include message as defined in this document (some fields are not encoded, for clarity):
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 1|0 0 0 0 0|0 0 1 0 0 0 0 0|        Payload Length         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Conference ID                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Transaction ID        |            User ID            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 0 0 0 0 1 0|M|0 0 0 0 0 1 0 0|           Floor ID            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 0 0 0 0|1|0 0 0 0 0 1 0 0|0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 0 0 0 1|1|0 0 0 1 0 1 0 1|0 0 0 0 0 0 0 0|               |
   +-------------+-+---------------+---------------+               |
   |                                                               |
   /                     sip:alice@example.com                     /
   /                                          +--------------------+
   |                                          |      Padding       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 0 0 0 1|1|    Length     |0 0 0 0 0 0 0 1|               |
   +-------------+-+---------------+---------------+               |
   |                                                               |
   /          sip:4kfk4j392jsu@example.com;grid=433kj4j3u          /
   /                                          +--------------------+
   |                                          |      Padding       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 0 0 0 1|1|    Length     |0 0 0 0 0 0 1 1|               |
   +-------------+-+---------------+---------------+               |
   |                                                               |
   /                               1                               /
   /                                          +--------------------+
   |                                          |      Padding       |
   +------------------------------------------+--------------------+
   |0 1 0 0 0 1 0|1|    Length     |                               |
   +-------------+-+---------------+                               |
   |                Channel Id                +--------------------+
   |                                          |      Padding       |
   +------------------------------------------+--------------------+

The receiver of this message MUST send an acknowledgement using the same transaction ID as soon as possible.

9. IANA Considerations

Following the guidelines in SDP [RFC4566], in SDP Grouping Framework [RFC5888] and in RTP [RFC3550], the IANA is requested to register:
A new source-specific attribute named "information" as defined in Section 4.3.

Add the following entries to the BFCP [RFC4582] registry:

o Primitives from Table 1

o Attributes from Table 2

o Error Codes from Table 5

Start a new registry for this document with:

o Operations from Table 3

o Media Identification Types from Table 4

10. Security Considerations

When using MESS there is a potential risk of exposing client behavior to other participants. Consider the case where multiple endpoints participate in a conference and media transport is done using RTP. If the network between endpoints contains one or more RTP translators, then even if MESS communication is strictly between floor server and floor participant, the RTCP traffic to and from endpoints could expose information about endpoints excluding other endpoints. For example, previously received RTCP traffic that is replaced with no traffic could reveal that an endpoint is excluding other endpoints.

11. Acknowledgements

The authors thank Jonathan Lennox and Henning Schulzrinne for their proposal of a source-specific information attribute in the expired Internet-Draft draft-lennox-mmusic-sdp-source-selection-02.

12. References

12.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

[RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", STD 64, RFC 3550, July 2003.

[RFC4566] Handley, M., Jacobson, V., and C. Perkins, "SDP: Session Description Protocol", RFC 4566, July 2006.

[RFC4575] Rosenberg, J., Schulzrinne, H., and O. Levin, "A Session Initiation Protocol (SIP) Event Package for Conference State", RFC 4575, August 2006.

[RFC4582] Camarillo, G., Ott, J., and K.
Drage, "The Binary Floor Control Protocol (BFCP)", RFC 4582, November 2006.

[RFC5234] Crocker, D. and P. Overell, "Augmented BNF for Syntax Specifications: ABNF", STD 68, RFC 5234, January 2008.

[RFC5576] Lennox, J., Ott, J., and T. Schierl, "Source-Specific Media Attributes in the Session Description Protocol (SDP)", RFC 5576, June 2009.

12.2. Informative References

[RFC3435] Andreasen, F. and B. Foster, "Media Gateway Control Protocol (MGCP) Version 1.0", RFC 3435, January 2003.

[RFC4574] Levin, O. and G. Camarillo, "The Session Description Protocol (SDP) Label Attribute", RFC 4574, August 2006.

[RFC4583] Camarillo, G., "Session Description Protocol (SDP) Format for Binary Floor Control Protocol (BFCP) Streams", RFC 4583, November 2006.

[RFC5117] Westerlund, M. and S. Wenger, "RTP Topologies", RFC 5117, January 2008.

[RFC5888] Camarillo, G. and H. Schulzrinne, "The Session Description Protocol (SDP) Grouping Framework", RFC 5888, June 2010.

Authors' Addresses

Daniel Grondal
Ericsson AB
Farogatan 6
SE - 164 80 Kista, Sweden

Phone: +46107147505
Fax: +46107175550
Email: daniel.grondal@ericsson.com
URI: www.ericsson.com

Bo Burman
Ericsson AB
Farogatan 6
SE - 164 90 Kista, Sweden

Phone: +46107141311
Fax: +46107175550
Email: bo.burman@ericsson.com
URI: www.ericsson.com

Magnus Westerlund
Ericsson AB
Farogatan 6
SE- Kista 164 90, Sweden

Phone: +46107148287
Fax:
Email: magnus.westerlund@ericsson.com
URI: www.ericsson.com