mmusic                                                        Kutscher
Internet-Draft                                                     Ott
Expires: April 26, 2004                                        Bormann
                                              TZI, Universitaet Bremen
                                                      October 27, 2003

            Session Description and Capability Negotiation
                    draft-ietf-mmusic-sdpng-07.txt

Status of this Memo

This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt.

The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html.

This Internet-Draft will expire on April 26, 2004.

Copyright Notice

Copyright (C) The Internet Society (2003). All Rights Reserved.

Abstract

This document defines a language for describing multimedia sessions with respect to configuration parameters and capabilities of end-systems.

This document is a product of the Multiparty Multimedia Session Control (MMUSIC) working group of the Internet Engineering Task Force. Comments are solicited and should be addressed to the working group's mailing list at mmusic@ietf.org and/or the authors.

Document Revision

   $Revision: 6.18 $

Kutscher, et al.         Expires April 26, 2004                 [Page 1]

Internet-Draft                   SDPng                      October 2003

Table of Contents

   1.    Introduction
   2.    Terminology and System Model
   3.    Overview
   3.1   Outline of the Negotiation Process
   3.2   Capability Types
   3.3   Application-specific Vocabulary
   4.    SDPng Syntax
   4.1   SDPng Base Syntax
   4.2   Capabilities
   4.2.1 Tokens
   4.2.2 Token Sets
   4.2.3 Numerical Values
   4.2.4 Numerical Ranges
   4.2.5 Sample SDPng cap Element
   4.2.6 Referencing Capability Elements
   4.3   Definitions
   4.4   Configurations
   4.5   Constraints
   4.6   Session Information
   4.7   Summary of SDPng XML-Syntax
   5.    Specification of the Capability Negotiation
   5.1   Offer/Answer
   5.2   RFC2533 Negotiation
   5.2.1 Translating SDPng to RFC 2533 Expressions
   5.2.2 Applying RFC 2533 Canonicalization
   5.2.3 Integrating Feature Sets into SDPng
   5.2.4 Processing Negotiation Results
   6.    IANA Considerations
   7.    Open Issues
   8.    Acknowledgements
         References
         Authors' Addresses
   A.    Formal Syntax Specifications
   A.1   SDPng Base DTD
   A.2   SDPng XML-Schema Specification
   B.    Sample Package Definitions
   B.1   Sample RTP Package Definition
   B.2   Sample Audio Package Definition
   B.3   Sample Video Package Definition
   C.    Sample SDPng Description
   D.    Use of SDPng in Conjunction with other IETF Signaling Protocols
   D.1   The Session Announcement Protocol (SAP)
   D.2   Session Initiation Protocol (SIP)
   D.3   Real-Time Streaming Protocol (RTSP)
   D.4   Media Gateway Control Protocol (MEGACOP)
   E.    Change History
         Intellectual Property and Copyright Statements

1. Introduction

Multiparty multimedia conferencing is one of the applications that require dynamic interchange of end-system capabilities and the negotiation of a parameter set that is appropriate for all sending and receiving end-systems in a conference. For some applications, e.g. for loosely coupled conferences or for broadcast scenarios, it may be sufficient to simply have session parameters be fixed by the initiator of a conference. In such a scenario no negotiation is required because only those participants with media tools that support the predefined settings can join a media session and/or a conference.

This approach is applicable for conferences that are announced some time ahead of the actual start date of the conference. Potential participants can check the availability of media tools in advance and tools such as session directories can configure media tools upon startup.
This procedure, however, fails to work for conferences initiated spontaneously, including Internet phone calls or ad-hoc multiparty conferences. Fixed settings for parameters such as media types, their encoding etc. can easily inhibit the initiation of conferences, for example in situations where a caller insists on a fixed audio encoding that is not available at the callee's end-system.

To allow for spontaneous conferences, the process of defining a conference's parameter set must therefore be performed either at conference start (for closed conferences) or potentially even repeatedly every time a new participant joins an active conference.

The latter approach may not be appropriate for every type of conference without applying certain policies: For conferences with TV-broadcast or lecture characteristics (one main active source) it is usually not desired to re-negotiate parameters every time a new participant with an exotic configuration joins because it may inconvenience existing participants or even exclude the main source from media sessions. Conferences with equal "rights" for participants that are open for new participants, on the other hand, would need a different model of dynamic capability negotiation, for example a telephone call that is extended to a three-party conference at some time during the session.

SDP [2] allows multimedia sessions to be specified (i.e. conferences; "session" as used here is not to be confused with "RTP session"!) by providing general information about the session as a whole and specifications for all the media streams (RTP sessions and others) to be used to exchange information within the multimedia session.

Currently, media descriptions in SDP are used for two purposes:
   o  to describe session parameters for announcements and invitations (the original purpose of SDP) and

   o  to describe the capabilities of a system and possibly provide a choice between a number of alternatives (which SDP was not designed for).

A distinction between these two "sets of semantics" is only made implicitly.

This document is based upon a set of requirements specified in a companion document [1]. In the following, we first introduce a model for session description and capability negotiation as well as the basic terms used throughout this specification (Section 2). In Section 3, we provide an overview of options for capability negotiation. Next, we outline the concepts underlying SDPng and introduce the syntactical components step by step in Section 4. Appendix A provides formal specifications of SDPng such as XML DTD and Schema definitions, Appendix D describes the usage of SDPng in conjunction with IETF control protocols for multimedia communication and Appendix E lists the change history.

2. Terminology and System Model

Any (computer) system has, at a time, a number of rather fixed hardware as well as software resources. These resources ultimately define the limitations on what can be captured, displayed, rendered, replayed, etc. with this particular device. We term features enabled and restricted by these resources "system capabilities".

   Example: System capabilities may include: a limitation of the screen resolution for true color by the graphics board; available audio hardware or software may offer only certain media encodings (e.g. G.711 and G.723.1 but not GSM); and CPU processing power and quality of implementation may constrain the possible video encoding algorithms.

In multiparty multimedia conferences, participants employ different "components" in conducting the conference.
   Example: In lecture multicast conferences one component might be the voice transmission for the lecturer, another the transmission of video pictures showing the lecturer and the third the transmission of presentation material.

Depending on system capabilities, user preferences and other technical and political constraints, different configurations can be chosen to accomplish the use of these components in a conference.

Each component can be characterized at least by (a) its intended use (i.e. the function it shall provide) and (b) one or more possible ways to realize this function. Each way of realizing a particular function is referred to as a "configuration".

   Example: A conference component's intended use may be to make transparencies of a presentation visible to the audience on the Mbone. This can be achieved either by a video camera capturing the image and transmitting a video stream via some video tool or by loading a copy of the slides into a distributed electronic white-board. For each of these cases, additional parameters may exist, variations of which lead to additional configurations (see below).

Two configurations are considered different regardless of whether they employ entirely different mechanisms and protocols (as in the previous example) or they choose the same and differ only in a single parameter.

   Example: In case of video transmission, a JPEG-based still image protocol may be used, H.261 encoded CIF images could be sent, as could H.261 encoded QCIF images. All three cases constitute different configurations. Of course there are many more detailed protocol parameters.

Each component's configurations are limited by the participating system's capabilities. In addition, the intended use of a component may constrain the possible configurations further to a subset suitable for the particular component's purpose.
   Example: In a system for highly interactive audio communication the component responsible for audio may decide not to use the available G.723.1 audio codec to avoid the additional latency but only use G.711. This would be reflected in this component only showing configurations based upon G.711. Still, multiple configurations are possible, e.g. depending on the use of A-law or u-Law, packetization and redundancy parameters, etc.

In modeling multimedia sessions, we distinguish two types of configurations:

   o  potential configurations (a set of any number of configurations per component) indicating a system's functional capabilities as constrained by the intended use of the various components;

   o  actual configurations (exactly one per instance of a component) reflecting the mode of operation of this component's particular instantiation.

   Example: The potential configuration of the aforementioned video component may indicate support for JPEG, H.261/CIF, and H.261/QCIF. A particular instantiation for a video conference may use the actual configuration of H.261/CIF for exchanging video streams.

In summary, the key terms of this model are:

   o  A multimedia session (streaming or conference) consists of one or more conference components for multimedia "interaction".

   o  A component describes a particular type of interaction (e.g. audio conversation, slide presentation) that can be realized by means of different applications (possibly using different protocols).

   o  A configuration is a set of parameters that are required to implement a certain variation (realization) of a certain component. There are actual and potential configurations.

      *  Potential configurations describe possible configurations that are supported by an end-system.

      *  An actual configuration is an "instantiation" of one of the potential configurations, i.e. a decision how to realize a certain component.
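The relationship between components, potential configurations and actual configurations can be sketched as a small data model. The following Python fragment is only an illustration of the terms above; the class names, field names and the "instantiate" method are hypothetical and are not part of SDPng.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Configuration:
    """One way of realizing a component, e.g. H.261 with CIF images."""
    name: str
    params: tuple = ()  # (key, value) pairs; contents are application-specific


@dataclass
class Component:
    """A type of interaction (audio conversation, slide presentation, ...)."""
    purpose: str
    potential: list = field(default_factory=list)  # any number per component
    actual: Optional[Configuration] = None          # exactly one per instance

    def instantiate(self, choice: Configuration) -> None:
        # An actual configuration is an instantiation of a potential one.
        if choice not in self.potential:
            raise ValueError(f"{choice.name} is not a potential configuration")
        self.actual = choice


# The video component from the example: three potential configurations,
# of which H.261/CIF becomes the actual configuration.
video = Component(
    purpose="video",
    potential=[
        Configuration("JPEG"),
        Configuration("H.261/CIF", (("format", "CIF"),)),
        Configuration("H.261/QCIF", (("format", "QCIF"),)),
    ],
)
video.instantiate(video.potential[1])
print(video.actual.name)
```

Treating two configurations as different whenever any parameter differs falls out of the value-based equality of the frozen dataclass.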
In less abstract words, potential configurations describe what a system can do ("capabilities") and actual configurations describe how a system is configured to operate at a certain point in time (media stream spec).

To decide on a certain actual configuration, a negotiation process needs to take place between the involved peers:

   1.  to determine which potential configuration(s) they have in common, and

   2.  to select one configuration from this shared set of potential configurations to be used for information exchange (e.g. based upon preferences, external constraints, etc.).

Note that the meaning of the term "actual configuration" is highly application-specific. For example, for audio transport using RTP, an actual configuration is equivalent to a payload format (potentially plus format parameters), whereas for other applications it may be a MIME type.

In SAP-based [8] session announcements on the Mbone, for which SDP was originally developed, the negotiation procedure is non-existent. Instead, the announcement contains the media stream description sent out (i.e. the actual configurations), which implicitly describes what a receiver must understand to participate.

In point-to-point scenarios, the negotiation procedure is typically carried out implicitly: each party informs the other about what it can receive and the respective sender chooses from this set a configuration that it can transmit.

Capability negotiation must not only work for 2-party conferences but is also required for multi-party conferences. Especially for the latter case it is required that the process to determine the subset of allowable potential configurations is deterministic to reduce the number of required round trips before a session can be established. For instance, in order to be used with SIP, the capability negotiation is required to work with the offer/answer model that is used for session initiation with SIP -- limiting the negotiation to exactly one round trip.
The requirements for the SDPng specification, subdivided into general requirements and requirements for session descriptions, potential and actual configurations as well as negotiation rules, are captured in a companion document [1].

The following list explains some terms used in this document:

Actual Configuration
   An actual configuration is an "instantiation" of one of the potential configurations, i.e. a decision how to realize a certain component.

Component
   A component describes a particular type of interaction (e.g. audio conversation, slide presentation) that can be realized by means of different applications (possibly using different protocols).

Package
   A package is an application-specific data schema for expressing potential and actual configurations. For example, an audio package specifies the data schema for audio codecs.

Potential Configuration
   Potential configurations describe possible configurations that are supported by an end-system ("capabilities").

3. Overview

SDPng is a description language for both potential configurations (i.e. capabilities) of participants in multimedia conferences and for actual configurations (i.e. final specifications of parameters).

Capability negotiation is the process of generating a usable set of potential configurations and finally an actual configuration from a set of potential configurations provided by each potential participant in a multimedia conference.

SDPng itself is an application-independent framework that defines a description syntax and processing rules that are applied to the capability negotiation process. The rules specify how to process two or more capability descriptions in general in order to obtain an interworking configuration.
A capability description for an endpoint is a set of individual capabilities, each of which provides a fixed type, e.g., a numeric value or a list value. The set of types and the corresponding negotiation rules are defined in this memo.

In the following, we provide an overview of the negotiation process in Section 3.1 and describe the different capability types and the corresponding negotiation rules in Section 3.2.

3.1 Outline of the Negotiation Process

SDPng supports the specification of endpoint capabilities and defines a negotiation process: In a negotiation process, capability descriptions are exchanged between participants. These descriptions are processed in a "collapsing" step which results in a set of commonly supported potential configurations. In a second step, the final actual configuration is determined that is used for a conference.

This section specifies the usage of SDPng for capability negotiation. It defines the collapsing algorithm and the procedures for exchanging SDPng documents in a negotiation phase.

The description language and the rules for the negotiation phase that are defined here are (in general) independent of the means by which descriptions are conveyed during a negotiation phase (a reliable transport service with causal ordering is assumed).

There are however properties and requirements of call signaling protocols that have been considered to allow for a seamless integration of the negotiation into the call setup process. For example, in order to be usable with SIP, it must be possible to negotiate the conference configuration within the two-way-handshake of the call setup phase. In order to use SDPng instead of SDP according to the offer/answer model defined in [13] it must be possible to determine an actual configuration in a single request/response cycle.
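The "collapsing" step can be sketched in a few lines of Python, using the per-type processing rules that Section 3.2 lists (identity for tokens, common subset for token lists, equality for numbers, common subrange for numerical ranges). This is a minimal sketch under stated assumptions, not the normative algorithm of Section 5: the in-memory representation (str, frozenset, number, (min, max) tuple), the function names, and the treatment of a feature missing on one side as a failed match are all assumptions made for illustration.

```python
class NoMatch(Exception):
    """Raised when two feature values have nothing in common."""


def collapse_feature(a, b):
    """Apply the processing rule that matches the value type."""
    if isinstance(a, frozenset) and isinstance(b, frozenset):
        common = a & b                      # token lists: common subset
        if not common:
            raise NoMatch
        return common
    if isinstance(a, tuple) and isinstance(b, tuple):
        # numerical ranges: common subrange; None marks an open bound
        lo = max((x for x in (a[0], b[0]) if x is not None), default=None)
        hi = min((x for x in (a[1], b[1]) if x is not None), default=None)
        if lo is not None and hi is not None and lo > hi:
            raise NoMatch
        return (lo, hi)
    if isinstance(a, str) and isinstance(b, str) and a == b:
        return a                            # tokens: identity
    if isinstance(a, (int, float)) and isinstance(b, (int, float)) and a == b:
        return a                            # numbers: equality
    raise NoMatch


def collapse(cap_a, cap_b, optional=frozenset()):
    """Return the features both sides support, or None if a mandatory
    feature fails. A failed optional feature is simply left out."""
    result = {}
    for name in sorted(cap_a.keys() | cap_b.keys()):
        try:
            if name not in cap_a or name not in cap_b:
                raise NoMatch(name)         # assumption: missing == no match
            result[name] = collapse_feature(cap_a[name], cap_b[name])
        except NoMatch:
            if name not in optional:
                return None
    return result


# Hypothetical audio capabilities of two endpoints A and B.
cap_a = {"encoding": "PCMU", "sampling": frozenset({"8000", "16000"}),
         "channels": frozenset({"1", "2"}), "bitrate": (None, 64000),
         "vad": "true"}
cap_b = {"encoding": "PCMU", "sampling": frozenset({"8000"}),
         "channels": frozenset({"1"}), "bitrate": (16000, 128000)}

common = collapse(cap_a, cap_b, optional={"vad"})
print(common)
```

Selecting the actual configuration (the second step) would then pick concrete values from the collapsed result; how that choice is made is application-specific and is not shown here.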
Conceptually, the negotiation process comprises the following individual steps (considering two parties, A and B, where A tries to invite B to a conference). Please note that this describes the steps of the negotiation process conceptually -- it does not specify requirements for implementations. Specific procedures that MUST be followed by implementations are given below.

   1.  A determines its potential configurations for the components that should be used in the conference (e.g. "interactive audio" and "shared whiteboard") and sends a corresponding SDPng instance to B. This SDPng instance is denoted "CAP(A)".

   2.  B receives A's SDPng instance and analyzes the set of components in the description. For each component that B wishes to support it generates a list of potential configurations corresponding to B's capabilities, denoted "CAP(B)".

   3.  B applies the collapsing function and obtains a list of potential configurations that both A and B can support, denoted "CAP(A)xCAP(B) = CAP(AB)".

   4.  B sends CAP(B) to A.

   5.  A also applies the collapsing function and obtains "CAP(AB)". At this step, both A and B know the capabilities of each other and the potential configurations that both can support.

   6.  In order to obtain an actual configuration from the potential configuration that has been obtained, both participants have to pick a subset of the potential configurations that should actually be used in the conference and generate the actual configuration. It should be noted that it depends on the specific application whether each component must be assigned exactly one actual configuration or whether it is allowed to list multiple actual configurations. In this model we assume that A selects the actual configuration, denoted CFG(AB).

   7.  A augments CFG(AB) with the transport parameters it intends to use, e.g., on which endpoint addresses A wishes to receive data, obtaining CFG_T(A). A sends CFG_T(A) to B.

   8.
       B receives CFG_T(A) and adds its own transport parameters, resulting in CFG_T(AB). CFG_T(AB) contains the selected actual configurations and the transport parameters of both A and B (plus any other SDPng data, e.g., meta-information on the conference). CFG_T(AB) is the complete conference description.

Both A and B now have the following information:

   CAP(A)     A's supported potential configurations

   CAP(B)     B's supported potential configurations

   CAP(AB)    The set of potential configurations supported by both A and B.

   CFG(AB)    The set of actual configurations to be used.

   CFG_T(AB)  The set of actual configurations to be used augmented with all required parameters.

Note that the model presented here results in four SDPng messages. As an optimization, this procedure can be abbreviated to two exchanges by including the transport (and other) parameters into the potential configurations. A embeds its desired transport parameters into the list of potential configurations and B also sends all required parameters in the response together with B's potential configurations. Both A and B can then derive CFG_T(AB). Transport parameters are usually not negotiable; therefore they have to be distinguished from other configuration information.

The SDPng capability negotiation process is specified in Section 5.

3.2 Capability Types

The capability negotiation process relies on a fixed set of processing rules for different types of capabilities. The following types are defined:

   1.  Tokens (text strings)

       Example: PCMU

       Processing rule: Ascertain identity

   2.  Token lists

       Example: 8000 16000

       Processing rule: Determine common subset

   3.  Numbers

       Example:

       Processing rule: Ascertain equality

   4.
       Numerical ranges

       Example:

       Processing rule: Determine common subrange

SDPng distinguishes between optional and mandatory capability definitions, with different processing rules for the negotiation process.

Optional definitions are used for capabilities that can be provided by an entity but do not have to be supported by all participants. For example, an audio codec could provide optional codec parameters. The use of these parameters needs to be declared by a session description, but if the parameter is not understood by all implementations, a session can be established nevertheless. As a result, the failure of a single processing step for a definition that has been marked as "optional" does not lead to a failure of the capability negotiation as a whole.

A mandatory capability on the other hand has to be supported by all participants. For example, the specification of an audio codec for an audio capability is mandatory, and for obtaining an interoperable configuration, all participants must support the same audio codec or set of audio codecs.

In addition to capabilities, an SDPng description can also provide parameters that are not negotiable, e.g., transport parameters. In SDPng, there is a distinction between capability definitions (that are subject to a negotiation process) and parameters that are specified by each participant.

In a description of alternative configurations for a specific component, capabilities and parameters can be referred to in order to describe the configurations.

3.3 Application-specific Vocabulary

While the SDPng specification defines the fundamental definition types, processing rules and the syntax definition for SDPng descriptions, it does not define any application-specific vocabulary. Application-specific vocabulary is defined in SDPng packages. An SDPng package defines a schema for application-specific capability and parameter descriptions.
Based on the description types specified by the SDPng base specification, a package definition specifies the capability and parameter definitions allowed for a specific application, the types of definitions and additional attributes, e.g., whether a definition element is optional with respect to the capability negotiation or not.

The SDPng base specification does define some fundamental requirements for definition elements that are specified in package definitions, for example XML attributes for elements. Appendix A.2 provides an XML Schema definition that specifies some base types that are to be used for package definitions.

In order to allow for an application-independent processing of SDPng description documents, SDPng descriptions are standalone, i.e., the package definition is not required to process a corresponding SDPng document. All information, e.g., the type of definitions and additional attributes, is contained in the SDPng document itself. An SDPng document can thus be processed without access to the package definition.

4. SDPng Syntax

This section specifies the SDPng base syntax. An SDPng description is an XML document consisting of up to five parts:

   Capabilities

   Definitions

   Configurations

   Constraints

   Session Information

The Capabilities section provides a list of individual capabilities. In a capability negotiation process, these capabilities are matched against corresponding definitions of other participants' capability descriptions. This section MUST be present in any SDPng description.

The Definitions section provides definitions of commonly used parameters for later referencing. This section is OPTIONAL for SDPng descriptions.

The Configurations section provides the description of the different conference components (applications in a conference). Each component description can provide a list of alternative configurations.
This section MUST be present in any SDPng description.

The Constraints section provides constraints on combinations of configurations. This section is OPTIONAL for SDPng descriptions.

The Session Information section provides meta information on the conferences and on individual components. This section is OPTIONAL for SDPng documents.

4.1 SDPng Base Syntax

An SDPng description is an XML document. The document root element MUST be an element of type "sdpng". The XML vocabulary for the SDPng base specification resides in the XML namespace "http://www.iana.org/sdpng". The root element of an SDPng description MUST define an XML default namespace "http://www.iana.org/sdpng". In addition, the "sdpng" element MUST map the namespace prefix "sdpng" to the namespace name "http://www.iana.org/sdpng". The "sdpng" element type provides the child elements "cap", "def", "cfg", "constraints", and "info" for the different sections of the SDPng description. The default namespace is also applied to these elements. The encoding of the XML document MUST be UTF-8 (RFC2279, [16]).

The following figure depicts the overall SDPng document structure.

   <sdpng xmlns="http://www.iana.org/sdpng"
          xmlns:sdpng="http://www.iana.org/sdpng">
     <cap>
       [...]
     </cap>
     <def>
       [...]
     </def>
     <cfg>
       [...]
     </cfg>
     <constraints>
       [...]
     </constraints>
     <info>
       [...]
     </info>
   </sdpng>

Appendix A.1 provides an XML DTD that defines a corresponding document type. Note that the elements for the optional sections "Definitions", "Constraints", and "Session-Level Information" are OPTIONAL.

Application-specific vocabulary resides in its own namespace. For each namespace name of an SDPng package, a namespace prefix MUST be declared in the start tag of the "sdpng" element. The following figure depicts the declaration of namespace prefixes for two package namespaces:

   [...]

4.2 Capabilities

A section for capability descriptions is an XML element that can provide a list of child elements.
The element type is called "cap" (in the "sdpng" namespace). Each child element represents an individual capability.

Each capability element MUST provide an attribute "name". The value of this attribute SHOULD be composed of a prefix (representing a namespace-name) and a unique name for the corresponding capability within that namespace. The namespace-name designates a namespace for the source of the capability definition, e.g., for the participant of a conference. If a prefix is specified, it MUST be separated by a colon (':') from the name. The namespace MUST be declared in the respective element or in ancestor elements, e.g., the root "sdpng" element.

The following figure depicts a capability element inside a "cap" element. Note that the child elements of "audio:codec" and the other sections of the SDPng description are not shown.

   [One or more feature elements]
   [...]

Each capability element provides a set of features. Each feature is represented by a child element. The element types are defined in package definitions. XML Namespaces are used to disambiguate element types and to allow for extensibility.

Each feature element can provide a "range" of values -- not only a single value. For example, a feature element can specify a set of supported alternative values for a given property, e.g., for the sampling rate of an audio codec. SDPng provides two different ways for representing "value ranges": A feature element can specify a set of tokens or a numerical range.

Each feature that is represented by an XML feature element has a well-defined type that is specified in the package definition. The type determines the representation of the element values so that type information is encoded implicitly in the description document.

Each feature element MAY provide an attribute "status". If this attribute is present it MUST provide one of the following values:
   opt: This element describes an optional feature (as described by Section 3.2).

The different feature types (as described in Section 3.2) are represented as described in the following sections. Section 4.2.5 provides a complete example.

4.2.1 Tokens

Token elements provide a single token as element content. The token is of type Nmtoken (name token) as defined by [9].

The following example depicts a feature element of type token.

   PCMU

Boolean values SHOULD be represented as token elements with a value of either "true" or "false".

4.2.2 Token Sets

Token set elements provide a token list as element content. The token list is of type Nmtokens (name tokens) as defined by [9].

The following example depicts a feature element of type token set.

   8000 16000

4.2.3 Numerical Values

Elements for numbers provide an attribute "val" with a numerical value.

The following example depicts a feature element of type numerical value.

4.2.4 Numerical Ranges

Elements for numerical ranges can provide an attribute "min" and an attribute "max". Both attributes provide a numerical value. At least one of these attributes MUST be present.

The following example depicts a feature element of type numerical range.

4.2.5 Sample SDPng cap Element

   PCMU 1 2 8000 16000 true [...]

Capability elements MAY also provide elements from different XML namespaces. For example, a video-codec capability MAY be described with elements declaring general video capabilities, and this element MAY provide a list of additional codec-specific feature elements, as depicted in the following example:

   H.263+ QCIF foo bar [...]

4.2.6 Referencing Capability Elements

The capability elements of a "cap" element can be referenced in later sections of the SDPng document.
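As a rough illustration of the feature syntax in Sections 4.2.1 to 4.2.4, the following Python sketch reads a "cap" element and infers each feature's type from its form: "min"/"max" attributes indicate a numerical range, a "val" attribute a number, and element content a token or token set. The sample document is hypothetical: the "audio" package namespace name and the feature element names ("encoding", "channels", "sampling", "bitrate", "ptime") are assumptions, since element types are defined by packages, and the function names are illustrative only. Only the base namespace name is taken from Section 4.1.

```python
import xml.etree.ElementTree as ET

SDPNG_NS = "http://www.iana.org/sdpng"        # base namespace, Section 4.1
AUDIO_NS = "http://www.iana.org/sdpng/audio"  # assumed package namespace

SAMPLE = f"""
<sdpng xmlns="{SDPNG_NS}" xmlns:sdpng="{SDPNG_NS}" xmlns:audio="{AUDIO_NS}">
  <cap>
    <audio:codec name="audio:pcmu">
      <audio:encoding>PCMU</audio:encoding>
      <audio:channels>1 2</audio:channels>
      <audio:sampling>8000 16000</audio:sampling>
      <audio:bitrate min="16000" max="64000"/>
      <audio:ptime val="20" status="opt"/>
    </audio:codec>
  </cap>
  <cfg/>
</sdpng>
""".strip()


def classify(feature):
    """Infer (kind, value) from the syntax of one feature element."""
    if "min" in feature.attrib or "max" in feature.attrib:
        lo, hi = feature.get("min"), feature.get("max")
        return "range", (float(lo) if lo else None, float(hi) if hi else None)
    if "val" in feature.attrib:
        return "number", float(feature.get("val"))
    tokens = (feature.text or "").split()
    # A token set with a single member looks like a token; only the
    # package definition could tell the two apart.
    if len(tokens) == 1:
        return "token", tokens[0]
    return "token-set", frozenset(tokens)


def read_capabilities(xml_text):
    """Map capability name -> {feature name: (kind, value, is_optional)}."""
    root = ET.fromstring(xml_text)
    cap = root.find(f"{{{SDPNG_NS}}}cap")
    result = {}
    for capability in cap:
        features = {}
        for feature in capability:
            local = feature.tag.split("}", 1)[-1]  # strip "{namespace}"
            kind, value = classify(feature)
            features[local] = (kind, value, feature.get("status") == "opt")
        result[capability.get("name")] = features
    return result


caps = read_capabilities(SAMPLE)
for feature_name, info in caps["audio:pcmu"].items():
    print(feature_name, info)
```

Because the type is inferred from the document alone, this mirrors the statement that an SDPng document can be processed without access to the package definition, with the single-token ambiguity noted in the comment as the visible limit of that approach.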
The fundamental model is that capability elements specify individual capabilities (without transport and other non-negotiable parameters) and that these elements are later augmented in the Definitions and Configurations sections.

When referencing a capability element, e.g., the element video:codec, the same element name (general identifier) is used. The referencing element MUST provide an attribute "ref", and the value of this attribute SHOULD provide the value of the attribute "name" of the referenced element. The referencing element MAY also provide additional feature elements (that have not been provided by the referenced capability element). The referencing element MAY also provide feature elements that have already been provided by the referenced element. The referencing element MAY provide an attribute "name".

The semantics of a reference are defined in the corresponding sections where references to definitions are used, i.e., in Section 4.3 and in Section 4.4. Section 5.2.4 provides implementation requirements for dealing with references to capability elements after a capability negotiation process.

4.3 Definitions

The Definitions section is an optional section that can provide definitions of fixed parameters that are not negotiable, such as transport parameters.

An SDPng description document MAY provide a "def" element that can provide a set of definitions as child elements. Each child element of a "def" element provides an element type specified in a package definition. Such child elements are referred to as "definition elements".

Definition elements can provide a set of child elements, each of which specifies a specific configuration value. Syntactically, these child elements MUST be "feature elements" as specified in Section 4.2. Child elements of a definition element MUST be of type Token or of type Numerical Value.

A definition element MUST provide an attribute "name" that is used to specify a unique name in the scope of the current SDPng description.
A definition element MAY provide an attribute "ref" that is used to reference a capability element as specified in Section 4.2.

The following example depicts a def element with one definition element of type "rtp:udp". This element is used to specify fixed parameters of an RTP session -- the allowable parameters would have been specified in a corresponding SDPng RTP package.

   [...] [...] ::1 9456 1 [...] [...] [...]

A definition element SHOULD reference a capability element provided in the "cap" element, as depicted in the example. In the example, the definition named "rtp-cfg1" provides RTP transport parameters and references the RTP capability named "rtp:rtpudpip6".

The semantics of referencing the capability element are as follows:

o  An implementation MUST process the newly defined element by adopting the individual feature elements of the referenced capability element.

o  For feature elements that are present in both the capability element and the definition element, the feature elements of the definition element take precedence over the feature elements of the capability element.

Please note the implementation requirements for dealing with references to capability elements after a capability negotiation process provided in Section 5.2.4.

4.4 Configurations

The Configurations section lists all the components that constitute the multimedia application (IP telephone call, real-time streaming application, multi-player gaming session, etc.). For each of these components, the actual configurations are given.

An SDPng document MUST provide a "cfg" element that represents the Configurations section. The "cfg" element provides one or more "component" elements, each of which describes the alternative configurations for one component. The "cfg" element SHOULD provide at least one "component" element.
Each "component" element MUST provide an attribute "name" that identifies the component uniquely in the scope of the SDPng description.

Each "component" element MUST provide one or more "alt" elements, each of which describes an alternative configuration for the component. Each "alt" element MUST provide an attribute "name" that provides a unique identification for the alternative in the scope of the SDPng description. In addition, each "alt" element MUST also provide an attribute "media" for specifying the media type for this particular alternative. Currently defined values for this attribute are "audio", "video", "application", "data", and "control". The semantics of these values are described in [2].

Each "alt" element MUST provide one or more XML elements that describe the configuration parameters for the particular alternative configuration. The elements are defined by SDPng package specifications, and definitions from different packages can be mixed. The type of the elements and their order is application dependent.

Each definition element that is contained in an "alt" element SHOULD provide an attribute "ref". The "ref" attribute is used to specify a reference to a capability element (from a "cap" section) or to a definition element (from a "def" section). The value of a "ref" attribute MUST provide the value of a "name" attribute of an existing capability or definition element. A definition element MAY provide child elements (for the specification of additional feature and configuration parameters), but it MAY also be an empty element.

The semantics of referencing the capability element are as follows:

o  An implementation MUST process the newly defined element by adopting the individual feature elements of the referenced capability or definition element.
o  For feature elements that are present in both the capability/definition element and the current definition element, the feature elements of the current definition element take precedence over the feature elements of the referenced element.

Please note the implementation requirements for dealing with references to capability elements after a capability negotiation process provided in Section 5.2.4.

The following example depicts the description of a single configuration for a component named "interactive-audio". The description of the configuration references the "avp:pcmu" audio codec definition from the "cap" element and the "rtp-cfg1" RTP session definition from the "def" element. In this example, both elements of the "alt" element are empty elements that adopt the specified values from the referenced elements.

   [...] [...] ::1 9456 1 [...] [...]

4.5 Constraints

The Constraints section allows expressing constraints on the combination of configurations that apply across different components.

The "constraints" element of an SDPng description is OPTIONAL. The usage of constraints will be specified in a separate document.

4.6 Session Information

The Session Information section is represented by an "info" element and is intended for meta information on the conference itself and on the individual components. The "info" element is OPTIONAL and, if it is present, it MAY provide a list of information elements. The element types are specified in package definitions.

4.7 Summary of SDPng XML-Syntax

The SDPng base specification defines the following XML element types that reside in the SDPng namespace designated by the namespace name "http://www.iana.org/sdpng":

o  sdpng

o  cap

o  def

o  cfg

o  component

o  alt
o  constraints

o  info

Appendix A.1 provides an XML DTD that specifies the content model of the SDPng base elements.

5. Specification of the Capability Negotiation

The SDPng specification defines the syntax and the semantics of capability descriptions. The algorithms that are used for processing descriptions and for comparing capability descriptions from different participants are application specific.

In this section, we specify two alternative algorithms for implementations: a model that is based on the SDP offer/answer scheme (Section 5.1) and a model that is based on the feature matching algorithm that is specified in RFC 2533 [15] (Section 5.2).

5.1 Offer/Answer

The offer/answer model allows communicating peers to determine a (common) mode of operation to exchange media streams in a single round-trip.

Basically, the offerer proposes a set of components, providing one or more alternatives ("potential configurations") for each of these. From this offer, the answerer learns which components may be used and which configurations are applicable to realize these components. The answerer indicates which components it supports (e.g., on receiving an offer including audio and video, it may disallow the video session and go with an audio-only conversation) and also provides possible configurations to implement those components. Along with the media types and codec parameters, offerer and answerer specify which transport addresses to use and, in case of RTP, which payload types they want to use for sending.

Offerer and answerer agree on a common set of media streams ("components") and on a possible set of codecs for each of these ("configurations") as well as the transport addresses and other parameters to be used. However, they do not fix a certain configuration (unless only a single one is exchanged in each direction).
Instead, for each selected media stream, either peer may choose and dynamically switch to any of the configurations indicated by the other side in the respective offer or answer.

For using SDPng with the offer/answer model (RFC 3264), the basic rules defined in RFC 3264 for generating offers and answers apply. The following considerations specifically apply when using offer/answer with SDPng (instead of SDP) documents:

o  For each component to be used, all necessary parameters MUST be given for at least one configuration per component, i.e. transport addresses and payload formats MUST be specified along with the capabilities.

o  Matching of components is done based upon their identification in the session part of the SDPng document using predefined identifiers for certain session types. For simple sessions, where applications can implicitly derive the semantics of the offered components, no such explicit mapping is necessary. In this case, i.e. if the entire "" element or the respective elements in the "" element are absent, the order of appearance in the SDPng document is relevant as it is with SDP.

o  For each component, the answerer performs a capability matching process as per the application's requirements. For all components that are acceptable, the answerer determines whether or not to accept the offer. If the answerer decides to accept the offer for a certain component, it MUST accept at least one of the potential configurations for the respective component. It SHOULD indicate this by setting the "status" attribute of the component and of the selected configuration(s) to "active" (but it MAY also omit the status attribute in both cases). It is RECOMMENDED that the answerer select exactly one configuration for each component as "active".

o  The answerer MAY refuse individual configurations for a component from the offer in two ways.
   If the configuration shall not be used at all during a session, e.g. because the answerer does not support it or because the answerer does not want to use this configuration at all, the answerer MUST set the "status" attribute of the respective configuration to "unused". In this case, the answerer MAY omit all the elements contained in the respective configuration's elements. This is equivalent to setting the port parameter to "0" in SDP.

   If a configuration shall be accepted (i.e. the respective capability shall be indicated) but no media session shall be instantiated (not even on hold!), the answerer MUST set the "status" attribute of the respective configuration to "available" and omit all media-session-specific parameters from the configuration.

o  The answerer MAY refuse entire components that the offerer has included in two ways.

   If a component shall not be used at all during a session -- e.g. because the answerer does not support any of the configurations listed or because the answerer does not want to use this component at all -- the answerer MUST set the "status" attribute of the respective component to "unused". In this case, the answerer MAY omit all the elements contained in the respective component elements. This is equivalent to setting the port parameter to "0" in SDP.

   If a component shall be accepted (i.e. the respective capability shall be indicated) but no media session shall be instantiated (not even on hold!), the answerer MUST set the "status" attribute of the respective component to "available" and omit all media-session-specific parameters from all acceptable configurations for the respective component.

o  For each component, the alternative potential configurations MUST be listed in the order of preference. Within a configuration, alternatives (e.g. different codecs) MUST also be listed in the order of preference.
   The considerations of RFC 3264 to simplify arriving at symmetric codec use apply.

   If a component shall be put on hold, the status attribute of the component MUST be set to "sendonly", "recvonly", or "inactive", as appropriate. In this case, the status attributes of all the contained configurations that were previously active MUST be set to indicate "sendonly", "recvonly", or "inactive", as appropriate. The rules from RFC 3264 for putting media streams on hold SHALL apply.

5.2 RFC2533 Negotiation

SDPng potential configurations can be processed using the RFC 2533 algorithm as defined in [15]. This involves the following steps: translating SDPng capability descriptions to RFC 2533 feature set expressions; applying the RFC 2533 feature match algorithm; and integrating the resulting feature set expressions into the SDPng selection of conference configurations.

5.2.1 Translating SDPng to RFC 2533 Expressions

SDPng capability descriptions can be translated to RFC 2533 feature sets in a straightforward way, because SDPng uses a subset of the mechanisms provided by RFC 2533 with a different syntax. Each capability is represented as an XML element with a set of child elements. We first describe how to translate a single capability element into an RFC 2533 feature set, and then consider the combination of multiple capability elements.

Basically, all attributes of an SDPng capability element and its child elements MUST be transformed to an RFC 2533 expression, whereas each child element MUST be translated to a feature predicate. The resulting feature predicates are combined using the '&' (AND) operator. The name attributes MUST NOT be considered. Each predicate MUST be enclosed in parentheses ('(', ')'). The value or value range of each feature element is taken as a feature predicate value. Each feature element name is directly adopted as a feature tag, including the namespace name.
The SDPng data types map to RFC 2533 feature types as follows:

Token
   A token MUST be directly adopted as an RFC 2533 token.

Token set
   A token set MUST be adopted as an RFC 2533 set (a comma-separated token list inside square brackets, such as "video:channels=[1,2]").

Number
   A single number in a "val" attribute of a feature element of type number MUST be adopted as an RFC 2533 number.

Numerical Ranges
   A numerical range MUST be transformed to a feature set expression with two feature predicates that are combined using the "&" (AND) operator. The first predicate specifies the lower limit and the second predicate specifies the upper limit. For example, a "bitrate" element with a "min" attribute of 64 and a "max" attribute of 128 would be transformed to the following feature set:

      (& (bitrate>=64) (bitrate<=128))

   A numerical range without a lower limit MUST be transformed to a corresponding predicate with a '<=' operator and a numerical range without an upper limit MUST be transformed to a corresponding predicate with a '>=' operator. For example, a "bitrate" element with only a "max" attribute of 128 would be transformed to the following feature set:

      (bitrate<=128)

The following sample SDPng potential configuration would be transformed as follows:

Original SDPng expression:

   QCIF foo bar

Transforming feature elements to feature predicates:

   (& (video:resolution=QCIF)
      (video:frame-rate<=24)
      (h263plus:A=foo)
      (h263plus:B=bar))

RFC 2533 uses the syntax rules of RFC 2506 [17] for feature tags. Note that in the example above, the namespace name is not used for feature tags; instead, we use the namespace prefix (for abbreviation). Note, however, that implementations MUST replace the namespace prefix of SDPng elements with the namespace name when performing the translation to an RFC 2533 expression.
The following figure depicts a corresponding expression for the previous example:

   (& (http://www.iana.org/sdpng/video:resolution=QCIF)
      (http://www.iana.org/sdpng/video:frame-rate<=24)
      (http://www.example.com/h263plus:A=foo)
      (http://www.example.com/h263plus:B=bar))

For this example, we assume that the prefix "video" has been assigned to the namespace name "http://www.iana.org/sdpng/video" and that the prefix "h263plus" has been assigned to the namespace name "http://www.example.com/h263plus". In the following examples, we will use the abbreviated form (using the namespace prefix only).

Multiple independent capability elements MUST each be transformed using the specification above and then combined into a single RFC 2533 feature set by connecting the individual feature sets using the '|' (OR) operator. For example, the following sample SDPng potential configuration would be transformed as follows:

   PCMU 1 2 8000 16000 QCIF foo bar

Transforming feature elements to feature predicates:

   (| (& (video:encoding=PCMU)
         (video:channels=[1,2])
         (video:sampling=[8000,16000]))
      (& (video:resolution=QCIF)
         (video:frame-rate<=24)
         (h263plus:A=foo)
         (h263plus:B=bar)) )

5.2.2 Applying RFC 2533 Canonicalization

After transforming different SDPng capability descriptions from different participants into their equivalent RFC 2533 form, the following steps MUST be performed to calculate the common subset of capabilities:

1. The individual feature sets MUST be combined into a single expression by creating a conjunction of the feature sets, i.e., the feature sets MUST be connected by the '&' (AND) operator.

2. The resulting expression MUST be reduced to disjunctive normal form, i.e., the canonical form as specified by RFC 2533 [15].
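The translation rules of Section 5.2.1 and the conjunction of step 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of the specification: the XML layout of the sample (an "h263-qcif" capability whose feature elements are direct children), all function names, and the way the feature type is inferred from the element's shape are hypothetical. A real implementation would take the feature type from the package definition. Reduction to disjunctive normal form (step 2) is not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the element layout is an assumption for illustration.
SAMPLE_CAP = """
<cap xmlns="http://www.iana.org/sdpng"
     xmlns:video="http://www.iana.org/sdpng/video"
     xmlns:h263plus="http://www.example.com/h263plus">
  <video:codec name="h263-qcif">
    <video:resolution>QCIF</video:resolution>
    <video:frame-rate max="24"/>
    <h263plus:A>foo</h263plus:A>
    <h263plus:B>bar</h263plus:B>
  </video:codec>
</cap>
"""


def feature_tag(elem):
    """Build the feature tag using the namespace name, not the prefix."""
    # ElementTree expands "prefix:local" to "{namespace-name}local".
    if elem.tag.startswith("{"):
        namespace, local = elem.tag[1:].split("}", 1)
        return f"{namespace}:{local}"
    return elem.tag


def feature_predicate(elem):
    """Translate one feature element to a feature predicate.

    Assumption: the type is inferred from the element's shape
    ("val" -> number, "min"/"max" -> range, otherwise token or token set).
    """
    tag = feature_tag(elem)
    if "val" in elem.attrib:
        return f"({tag}={elem.attrib['val']})"
    if "min" in elem.attrib or "max" in elem.attrib:
        bounds = []
        if "min" in elem.attrib:
            bounds.append(f"({tag}>={elem.attrib['min']})")
        if "max" in elem.attrib:
            bounds.append(f"({tag}<={elem.attrib['max']})")
        return bounds[0] if len(bounds) == 1 else f"(& {' '.join(bounds)})"
    tokens = (elem.text or "").split()
    if len(tokens) == 1:
        return f"({tag}={tokens[0]})"
    return f"({tag}=[{','.join(tokens)}])"


def capability_to_feature_set(capability):
    """AND the predicates of all feature elements; "name" is ignored."""
    return "(& " + " ".join(feature_predicate(f) for f in capability) + ")"


def cap_to_feature_set(cap):
    """OR the feature sets of all capability elements in a "cap" element."""
    sets = [capability_to_feature_set(c) for c in cap]
    return sets[0] if len(sets) == 1 else "(| " + " ".join(sets) + ")"


def conjunction(*feature_sets):
    """Step 1 of Section 5.2.2: connect feature sets with '&'."""
    return "(& " + " ".join(feature_sets) + ")"


if __name__ == "__main__":
    print(cap_to_feature_set(ET.fromstring(SAMPLE_CAP)))
```

Because ElementTree already resolves prefixes to namespace names while parsing, the replacement of the prefix by the namespace name comes at no extra cost in this sketch.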
5.2.3 Integrating Feature Sets into SDPng

A feature set that has been created by combining multiple independent feature sets and by reducing the result to canonical form does not indicate directly which of the capability elements belong to the common subset of capabilities.

SDPng uses the following approach: After a "collapsing process" that has determined the commonly supported capabilities, the resulting RFC 2533 expression is compared to the original SDPng capability description. For this purpose, each SDPng capability element is transformed to an RFC 2533 expression and matched against the negotiation result (by constructing a conjunction of the two feature sets). If the resulting canonical disjunctive form is non-empty, the respective capability element represents a commonly supported capability and can be adopted for the conference configuration. A future version of this document will specify how to adopt individual values from the negotiation result for the SDPng capability element.

The following steps MUST be performed to determine whether an individual capability element (e.g., from one of the contributing SDPng capability descriptions) belongs to the result feature set. Let R be the result feature set obtained from the canonicalization as specified in Section 5.2.2.

1. For each capability element, generate the equivalent RFC 2533 feature set by applying the steps specified in Section 5.2.1. Let C be the resulting feature set.

2. Combine R and C into a single feature set by building a conjunction of the two feature sets (& R C). Let the result be the feature set T.

3. Reduce T to disjunctive normal form by applying the canonicalization as defined in RFC 2533 [15].

4. If the remaining disjunction is non-empty, the constraints specified by the capability element (the origin of C) can be satisfied by R, i.e., C represents a commonly supported capability.
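The four steps above can be sketched in Python on a simplified data model. This is an illustration under assumptions, not the RFC 2533 canonicalization itself: a feature set is assumed to be already in disjunctive normal form, represented as a list of conjunctions; each conjunction is a dict that maps a feature tag to either a frozenset of tokens or a (lower, upper) numeric range. All names are hypothetical, and negation and other RFC 2533 constructs are not covered.

```python
import math


def token_set(*tokens):
    """Constraint: the feature takes one of the given tokens."""
    return frozenset(tokens)


def num_range(lower=-math.inf, upper=math.inf):
    """Constraint: lower <= feature <= upper."""
    return (lower, upper)


def intersect(a, b):
    """Intersect two constraints on the same tag; None means contradiction."""
    if isinstance(a, frozenset) and isinstance(b, frozenset):
        common = a & b
        return common if common else None
    if isinstance(a, tuple) and isinstance(b, tuple):
        lower, upper = max(a[0], b[0]), min(a[1], b[1])
        return (lower, upper) if lower <= upper else None
    return None  # assumption: mixed constraint types never match


def merge(conj_a, conj_b):
    """AND two conjunctions; None if they contradict each other."""
    merged = dict(conj_a)
    for tag, constraint in conj_b.items():
        if tag in merged:
            constraint = intersect(merged[tag], constraint)
            if constraint is None:
                return None
        merged[tag] = constraint
    return merged


def conjunction(dnf_a, dnf_b):
    """Steps 2 and 3: build (& A B) and keep only satisfiable terms."""
    result = []
    for conj_a in dnf_a:
        for conj_b in dnf_b:
            merged = merge(conj_a, conj_b)
            if merged is not None:
                result.append(merged)
    return result


def is_commonly_supported(result_set, capability):
    """Step 4: C is commonly supported if (& R C) is non-empty."""
    return bool(conjunction(result_set, [capability]))


# Hypothetical capabilities of two participants.
pcmu = {"audio:encoding": token_set("PCMU"),
        "audio:channels": token_set("1", "2"),
        "audio:sampling": token_set("8000", "16000")}
gsm = {"audio:encoding": token_set("GSM"),
       "audio:channels": token_set("1"),
       "audio:sampling": token_set("8000")}
peer_pcmu = {"audio:encoding": token_set("PCMU"),
             "audio:channels": token_set("1"),
             "audio:sampling": token_set("8000")}

R = conjunction([pcmu, gsm], [peer_pcmu])  # result of Section 5.2.2
```

With these inputs, R contains a single term (mono PCMU at 8000 Hz), so the "pcmu" capability is kept while the "gsm" capability would be removed from the "cap" element.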
5.2.4 Processing Negotiation Results

The capability negotiation results in an updated list of capability elements of the SDPng "cap" element. The capability elements describe the commonly supported capabilities. Capabilities that are not supported by all end-systems have been removed.

Definition elements (inside the SDPng "def" element) and configuration descriptions (inside the SDPng "alt" element) that reference capability elements that have been removed after the negotiation process MUST be removed as well.

Configuration descriptions (inside the SDPng "alt" element) that reference non-existing definition elements (inside the SDPng "def" element) MUST also be removed.

6. IANA Considerations

The IANA should set up a registry for XML namespaces for SDPng and SDPng package definitions.

The SDP parameter registry (http://www.iana.org/assignments/sdp-parameters) should be converted to SDPng package definitions.

7. Open Issues

Revise usage of terminology (potential configuration, actual configuration)

Do we need an explicit mechanism to declare the used packages? E.g.,

Data model for audio package: sampling-rate vs. RTP clock rate

Bib. references: distinguish normative and informational

A registry (reuse of SDP mechanisms and names etc.) needs to be set up (IANA considerations).

8. Acknowledgements

The authors would like to thank Teodora Guenkova, Goran Petrovic and Markus Nosse for their feedback and detailed comments.

References

[1]  Kutscher, D., Ott, J., Bormann, C. and I. Curcio, "Requirements for Session Description and Capability Negotiation", Internet Draft draft-ietf-mmusic-sdpng-req-01.txt, April 2001.

[2]  Handley, M. and V.
Jacobson, "SDP: Session Description Protocol", RFC 2327, April 1998.

[3]  Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, "RTP: A Transport Protocol for Real-Time Applications", RFC 3550, July 2003.

[4]  Schulzrinne, H. and S. Casner, "RTP Profile for Audio and Video Conferences with Minimal Control", RFC 3551, July 2003.

[5]  Perkins, C., Kouvelas, I., Hodson, O., Hardman, V., Handley, M., Bolot, J., Vega-Garcia, A. and S. Fosse-Parisis, "RTP Payload for Redundant Audio Data", RFC 2198, September 1997.

[6]  Rosenberg, J. and H. Schulzrinne, "An RTP Payload Format for Generic Forward Error Correction", RFC 2733, December 1999.

[7]  Perkins, C. and O. Hodson, "Options for Repair of Streaming Media", RFC 2354, June 1998.

[8]  Handley, M., Perkins, C. and E. Whelan, "Session Announcement Protocol", RFC 2974, October 2000.

[9]  World Wide Web Consortium (W3C), "Extensible Markup Language (XML) 1.0 (Second Edition)", Status W3C Recommendation, Version http://www.w3.org/TR/2000/REC-xml-20001006, October 2000.

[10] World Wide Web Consortium (W3C), "Namespaces in XML", Status W3C Recommendation, Version http://www.w3.org/TR/1999/REC-xml-names-19990114, January 1999.

[11] World Wide Web Consortium (W3C), "XML Schema Part 1: Structures", Version http://www.w3.org/TR/2001/REC-xmlschema-1-20010502/, Status W3C Recommendation, May 2001.

[12] World Wide Web Consortium (W3C), "XML Schema Part 2: Datatypes", Version http://www.w3.org/TR/2001/REC-xmlschema-2-20010502/, Status W3C Recommendation, May 2001.

[13] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with SDP", RFC 3264, June 2002.

[14] Hollenbeck, S., Rose, M. and L. Masinter, "Guidelines for the Use of Extensible Markup Language (XML) within IETF Protocols", BCP 70, RFC 3470, January 2003.

[15] Klyne, G., "A Syntax for Describing Media Feature Sets", RFC 2533, March 1999.
[16] Yergeau, F., "UTF-8, a transformation format of ISO 10646", RFC 2279, January 1998.

[17] Holtman, K., Mutz, A. and T. Hardie, "Media Feature Tag Registration Procedure", BCP 31, RFC 2506, March 1999.

Authors' Addresses

Dirk Kutscher
TZI, Universitaet Bremen
Bibliothekstr. 1
Bremen 28359
Germany

Phone: +49.421.218-7595, sip:dku@tzi.org
Fax: +49.421.218-7000
EMail: dku@tzi.uni-bremen.de

Joerg Ott
TZI, Universitaet Bremen
Bibliothekstr. 1
Bremen 28359
Germany

Phone: +49.421.201-7028, sip:jo@tzi.org
Fax: +49.421.218-7000
EMail: jo@tzi.uni-bremen.de

Carsten Bormann
TZI, Universitaet Bremen
Bibliothekstr. 1
Bremen 28359
Germany

Phone: +49.421.218-7024, sip:cabo@tzi.org
Fax: +49.421.218-7000
EMail: cabo@tzi.org

Appendix A. Formal Syntax Specifications

A.1 SDPng Base DTD

The following DTD specifies the SDPng base syntax. DTDs are not XML-Namespace aware, therefore the following DTD is for informational purposes only. Moreover, the content models for the element types "cap" and "def" have to be empty in this DTD as the specific element types for the allowed child elements are not defined by the base specification but by independent package definitions. Common requirements for these element types such as the "name" attribute cannot be expressed with XML DTDs.

A.2 SDPng XML-Schema Specification

Appendix B. Sample Package Definitions

B.1 Sample RTP Package Definition
B.2 Sample Audio Package Definition

B.3 Sample Video Package Definition

Appendix C. Sample SDPng Description

   PCMU 1 8000 GSM 1 8000 G723 1 8000 DVI4 1 8000 11025 16000 22050
   LPC 1 8000 G722 1 8000 L16 1 2 44100 QCELP 1 8000 CN 1 8000
   MPA 1 32000 44100 48000 G728 1 8000 G728 1 8000 G726-40 1 8000
   G726-32 1 8000 G726-24 1 8000 G726-16 1 8000 G729D 1 8000
   G729E 1 8000 GSM-EFR 1 8000 L8 1 2 8000 16000 RED 1 2 8000 16000
   RED 1 var CelB 4 6 8 12 16 20 24 30 IP6 ::1 9546 0

Appendix D. Use of SDPng in Conjunction with other IETF Signaling Protocols

This appendix is included temporarily and for informational purposes only. Ultimately, it is up to each existing and evolving application protocol to specify its use of SDPng.

The SDPng model provides the notion of Components to indicate the intended types of collaboration between the users in e.g. a teleconferencing scenario.

Three different abstractions are defined that are used for describing the properties of a specific Component:

o  a Capability refers to the fact that one of the involved parties supports one particular way of exchanging media -- defined in terms of transport, codec, and other parameters -- as part of the media session.

o  a Potential Configuration denotes a set of matching Capabilities from all those involved parties that may be used for one particular Component.
o  an Actual Configuration indicates the Potential Configuration(s) and the associated media session parameters which was/were chosen by the involved parties to instantiate a certain Component.

As mentioned before, this abstract notion of the interactions between a number of communicating systems needs to be mapped to the application scenarios of SDPng in conjunction with the various IETF signaling protocols, including (but not limited to) SAP, SIP, RTSP, and MEGACO.

In general, this section provides recommendations and possible scenarios for the use of SDPng within specific protocols and applications. It does not specify normative requirements.

D.1 The Session Announcement Protocol (SAP)

SAP is used to disseminate a previously created (and typically fixed) session description to a potentially large audience. An interested member of the audience will use the SDPng description contained in SAP to join the announced media sessions.

This means that a SAP announcement contains the Actual Configurations of all Components that are part of the overall teleconference or broadcast.

A SAP announcement may contain multiple Actual Configurations for the same Component. In this case, the "same" (i.e. semantically equivalent) media data from one configuration must be available from each of the Actual Configurations.

Each receiver of a SAP announcement with SDPng compares its locally stored Capabilities to realize a certain Component against the Actual Configurations contained in the announcement. If the intersection yields one or more Potential Configurations for the receiver, it chooses the one it considers the best fit. If the intersection is empty, the receiver cannot participate in the announced session.
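The receiver-side comparison described above can be sketched in Python as follows. This is only an illustration under assumptions: an Actual Configuration is modeled as a dict of feature values, a local Capability as a dict that maps a feature name to the set of supported values, and "best fit" is simplified to "first usable configuration in announcement order". The function names and the sample data are hypothetical.

```python
def matches(config, capability):
    """True if the configuration satisfies every feature of the capability.

    Features of the configuration that the capability does not mention
    (e.g. transport addresses) are ignored in this sketch.
    """
    return all(config.get(feature) in allowed
               for feature, allowed in capability.items())


def select_configuration(actual_configs, local_capabilities):
    """Return the first announced configuration the receiver can realize.

    Returns None if the intersection is empty, i.e. the receiver cannot
    participate in the announced session.
    """
    for config in actual_configs:
        if any(matches(config, cap) for cap in local_capabilities):
            return config
    return None


# Hypothetical announcement with two Actual Configurations for one Component.
announced = [
    {"encoding": "GSM", "sampling": "8000", "port": "9546"},
    {"encoding": "PCMU", "sampling": "8000", "port": "9548"},
]
# Hypothetical local Capabilities of the receiver.
local = [{"encoding": {"PCMU"}, "sampling": {"8000", "16000"}}]

chosen = select_configuration(announced, local)
```

A receiver with a different policy (for example, preferring its own capability order over the announcement order) would only need to change the iteration order in "select_configuration".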
SAP may be substituted by HTTP (in the general case, at least), SMTP, NNTP, or other IETF protocols suitable for conveying a media description from one entity to one or more others without the intent of further negotiation of the session parameters.

SAP makes extensive use of the SDP session level attributes to provide a (limited) set of descriptive metadata for the session, including scheduling and subject information. Quite a bit of this information is application-specific and is therefore not defined in the baseline SDPng spec.

D.2 Session Initiation Protocol (SIP)

SIP is used to establish and modify multimedia sessions, and SDPng may be carried at least in SIP INVITE, ACK and UPDATE messages as well as in a number of responses. From dealing with legacy SDP (and its essential non-suitability for capability negotiation), a particular use and interpretation of SDP has been defined for SIP, generalized in the offer/answer model documented in RFC 3264.

One of the important flexibilities introduced by SIP's usage of SDP is that a sender can change dynamically between all codecs that a receiver has indicated support (and has provided an address) for. Codec changes are not signaled out-of-band but only indicated by the payload type within the media stream. From this arises one important consequence for the conceptual view of a Component within SDPng.

There is no clear distinction between Potential and Actual Configurations. There need not be a single Actual Configuration chosen at setup time within the SIP signaling. Instead, a number of Potential Configurations is signaled in SIP (with all transport parameters required for carrying media streams) and the Actual Configuration is only identified by the payload type which is actually being transmitted at any point in time.

Note that since SDPng does not distinguish between Potential and
Expires April 26, 2004 [Page 53] Internet-Draft SDPng October 2003 Actual Configurations at the syntax, this has no implications on the SDPng signaling itself. SIP relies on an "offer/answer" model for the exchange of capability and configuration information. Either the caller or the callee sends an initial session description that is processed by the other side and returned. For capability negotiation, this means that the negotiation follows a two-stage-process: The "offerer" sends its capability description to the receiver. The receiver processes the offerers capabilities and his own capabilities and generates a result capability description that is sent back to the offerer. Both sides now know the commonly supported configurations and can initiate the media sessions. Because of this strict "offer/answer" model, the offerer must already send complete configurations (i.e. include transport addresses) along with the capability descriptions. The answer must also contain complete configuration parameters. The following figure shows, how SDPng content can be used in an INVITE request with a correspong 200 OK message. Simple description document with only one alternative: F1 INVITE A -> B INVITE sip:B@example.com SIP/2.0 Via: SIP/2.0/UDP hostA.example.com:5060 From: A To: B Call-ID: 1234@hostA.example.com CSeq: 1 INVITE Contact: Content-Type: application/sdpng Content-Length: 685 Kutscher, et al. Expires April 26, 2004 [Page 54] Internet-Draft SDPng October 2003 PCMU 1 8000 GSM 1 8000 192.168.47.11 51400 0 3 ================================================== F2 (100 Trying) B -> A SIP/2.0 100 Trying Via: SIP/2.0/UDP hostA.example.com:5060 From: A To: B Call-ID: 1234@hostA.example.com CSeq: 1 INVITE Content-Length: 0 Kutscher, et al. 
Expires April 26, 2004 [Page 55] Internet-Draft SDPng October 2003 ================================================== F3 180 Ringing B -> A SIP/2.0 180 Ringing Via: SIP/2.0/UDP hostA.example.com:5060 From: A To: B ;tag=987654 Call-ID: 1234@hostA.example.com CSeq: 1 INVITE Content-Length: 0 ================================================== F4 200 OK B -> A SIP/2.0 200 OK Via: SIP/2.0/UDP hostA.example.com:5060 From: A To: B ;tag=987654 Call-ID: 1234@hostA.example.com CSeq: 1 INVITE Contact: Content-Type: application/sdpng Content-Length: 479 PCMU 1 8000 GSM 1 Kutscher, et al. Expires April 26, 2004 [Page 56] Internet-Draft SDPng October 2003 8000 192.168.47.12 60006 3 ================================================== ACK from A to B omitted In the INVITE message, A sends B a description document that specifies exactly one component with two alternatives (the PCMU and GSM audio streams). The alternatives make reference to the capability section where the two codec types are defined. All required transport parameters all already contained in the respective descriptions. The element contains a definition for the RTP media sessions so that this needs not be repeated in the configuration of the single component. Note that the semantics of the component is not explicitly specified (in an element) but rather implied. In the 200 OK message, B sends an updated description document to A. B supports the payload format that A has offered and adds his own transport parameters to the configuration information, specifying the endpoint address where B wants to receive media data. In order to disambiguate its transport configurations from A's, B sets the attribute "endpoint" to the value "B". 
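As an illustration of the answer step in this exchange, the following Python sketch builds an answer from an offer. It is only a sketch under stated assumptions: the dictionary layout, the function name `build_answer`, the endpoint label "A" for the offerer, and the choice to keep the offerer's transport entries alongside the answerer's are hypothetical and are not taken from the SDPng syntax. The addresses, ports, and payload types are those shown in the messages above.

```python
from typing import Dict, List, Set


def build_answer(
    offer: List[Dict],
    supported_encodings: Set[str],
    endpoint: str,
    address: str,
    port: int,
) -> List[Dict]:
    """Keep the offered alternatives the answerer supports and add its
    own transport parameters, labelled with its endpoint name."""
    used_names = {t["endpoint"] for alt in offer for t in alt["transports"]}
    if endpoint in used_names:
        # Each contributing party must use a unique endpoint name.
        raise ValueError("endpoint name already in use: " + endpoint)

    own_transport = {"endpoint": endpoint, "address": address, "port": port}
    answer = []
    for alt in offer:
        if alt["encoding"] in supported_encodings:
            answer.append({
                "encoding": alt["encoding"],
                "payload_type": alt["payload_type"],
                # Assumption: the offerer's entries are retained; a new
                # list is built so the offer itself is not modified.
                "transports": alt["transports"] + [dict(own_transport)],
            })
    return answer


# Offer from A: one component with two alternatives (PCMU and GSM).
a_transport = {"endpoint": "A", "address": "192.168.47.11", "port": 51400}
offer = [
    {"encoding": "PCMU", "payload_type": 0, "transports": [dict(a_transport)]},
    {"encoding": "GSM", "payload_type": 3, "transports": [dict(a_transport)]},
]

# B supports GSM only and wants to receive media on 192.168.47.12:60006.
answer = build_answer(offer, {"GSM"}, "B", "192.168.47.12", 60006)
for alternative in answer:
    print(alternative["encoding"], alternative["transports"])
```

The same endpoint value is attached to every element the answerer adds, so either side can tell which transport parameters belong to which party.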
The specific value of the "endpoint" attribute is not important. The only requirements are that a party that contributes to the session description must use a unique name for the endpoint attribute, and that a contributing party must use the same value for the endpoint attributes of all elements it adds to the session description.

D.3 Real-Time Streaming Protocol (RTSP)

In contrast to SIP, RTSP has, from its intended usage, a clear distinction between offering a set of Potential Configurations (by the server) and choosing one out of these (by the client). However, there is no capability negotiation process involved: the server provides a complete SDPng document describing all Components making up a presentation and includes detailed codec and transport parameters for each of them. The client may only pick one of the alternatives for each of the offered Components but has no further option to negotiate parameters in depth.

Where some additional exchange is necessary (e.g. for the client's transport addresses and security parameters), the respective parameters are not encoded in SDPng; instead, additional RTSP header fields and parameters are used for this purpose. Hence, SDPng is only used to describe alternatives to gain access to streaming media out of which the client has to choose. No interaction takes place at the SDPng level.

   C->M: DESCRIBE rtsp://foo/audio-play RTSP/1.0
         CSeq: 1

   M->C: RTSP/1.0 200 OK
         CSeq: 1
         Content-Type: application/sdp
         Content-Length: ...

         PCMU 1 8000 GSM 1 8000 192.168.47.11 51400 0 3

   C->M: SETUP rtsp://foo/audio-play RTSP/1.0
         CSeq: 2
         Transport: RTP/AVP;unicast;client_port=8000-8001

   M->C: RTSP/1.0 200 OK
         CSeq: 2
         Transport: RTP/AVP;unicast;client_port=8000-8001;
                    server_port=51400-51401
         Session: 12345678

To be continued with PLAY and, after the audio track has completed, finished with TEARDOWN.

D.4 Media Gateway Control Protocol (MEGACO)

The MEGACO architecture also follows the SDPng model of a clear separation between Potential and Actual Configurations. Upon startup, a Media Gateway (MG) will "register" with its Media Gateway Controller (MGC) and the latter will audit the MG for its Capabilities. Those will be provided as Potential Configurations, possibly with extensive Constraints specifications. Whenever a media path needs to be set up by the MGC between two MGs or an MG needs to be reconfigured internally, the MGC will use (updated) Actual Configurations.

Details and examples to be defined in a separate document.

Appendix E. Change History

draft-ietf-mmusic-sdpng-07.txt

* New document structure:
  1. Intro
  2. Terminology and System Model
  3. Overview
  4. SDPng Syntax Specification
  5. Negotiation Process
* Changes to Section 3: Describe all concepts
* Section 4 provides complete specification
* Changed XML syntax: Represent tokens and token list as element content (not attributes)
* a new element "def" for reusable definitions
* Adapted section 5 accordingly
* Sample DTD, schema definition and sample SDPng document in appendix
* Updated section 5.1 (Offer/Answer)
* Updated appendix D (Use of SDPng in conjunction with other IETF Signaling Protocols)

draft-ietf-mmusic-sdpng-06.txt

* Removed section on capability negotiation algorithm and section on formal specification.
  Added Section 3.
* Removed specification of concrete XML syntax from Section 4. Added requirements and theoretic considerations.
* Added clarification of term "actual configuration" in Section 2.
* Changed "profile" to "package".
* Added a list of terms with explanation at the end of Section 2.
* Removed audio and RTP packages from appendix.
* Added a section "Syntax Definition".
* Added Section 5 ("Specification of the Capability Negotiation").

draft-ietf-mmusic-sdpng-05.txt

* Moved audio and RTP packages to appendix.
* Moved section "Use of SDPng in conjunction with other IETF Signaling Protocols" to appendix.
* Changed mechanism for references to definitions: Definition elements provide an attribute "ref" that can be used to reference existing definitions. Removed other mechanisms for referencing (attributes "format" and "transport", element type "use").
* Corrections to schema definitions and examples

draft-ietf-mmusic-sdpng-04.txt

* New section on capability negotiation.
* New section on referencing definitions.
* New section on properties.
* New section on definition groups.

draft-ietf-mmusic-sdpng-03.txt

* Extension of the SDPng schema (use of Xlinks etc.)
* Clarification in the text
* Fixed examples
* Added example libraries as appendices
* More details on usage with SIP, including examples.

draft-ietf-mmusic-sdpng-02.txt

* Added a section on formal specification mechanisms.

draft-ietf-mmusic-sdpng-01.txt

* renamed section "Syntax Proposal" to "Syntax Definition Mechanisms". More text on DTD vs. schema. Edited the example description.
* updated example definitions in section "Definitions" and "Components & Configurations"
* section "Session Attributes" replaces section "Session"
* new appendix on audio codec definitions

Intellectual Property Statement

The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification can be obtained from the IETF Secretariat.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director.

Full Copyright Statement

Copyright (C) The Internet Society (2003). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works.
However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assignees.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgment

Funding for the RFC Editor function is currently provided by the Internet Society.