Network Working Group                                          Kutscher
Internet-Draft                                                      Ott
Expires: May 25, 2001                                           Bormann
                                               TZI, Universitaet Bremen
                                                      November 24, 2000


    Requirements for Session Description and Capability Negotiation
                 draft-kutscher-mmusic-sdpng-req-01.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on May 25, 2001.

Copyright Notice

   Copyright (C) The Internet Society (2000). All Rights Reserved.

Abstract

   This document defines some terminology and lists a set of
   requirements that are relevant for a framework for session
   description and endpoint capability negotiation in multiparty
   multimedia conferencing scenarios.

   This document is intended for discussion in the Multiparty
   Multimedia Session Control (MMUSIC) working group of the Internet
   Engineering Task Force. Comments are solicited and should be
   addressed to the working group's mailing list at confctrl@isi.edu
   and/or the authors.

Table of Contents

   1.      Introduction
   2.      Terminology and System Model
   3.      General Requirements
   3.1     Simplicity
   3.2     Extensibility
   3.3     Firewall Friendliness
   3.4     Security
   3.5     Text encoding
   3.6     Session vs. Media Description
   3.7     Mapping (of a Subset) to SDP
   4.      Session Description Requirements
   4.1     Media Description
   4.1.1   Medium Type
   4.1.2   Media Stream Packetization
   4.1.3   Transport
   4.1.4   QoS
   4.1.5   Resource Utilization
   4.1.6   Dependencies
   4.1.7   Other parameters (media-specific)
   4.1.8   Naming Hierarchy and/or Scoping
   5.      Requirements for Capability Description and Negotiation
   5.1     Capability Constraints
   5.2     Processing Rules
   6.      Remarks
   7.      SDPng: A Strawman Proposal
   7.1     Conceptual Outline
   7.1.1   Definitions
   7.1.2   Components & Configurations
   7.1.3   Constraints
   7.1.4   Session
   7.2     Syntax Proposal
   7.3     Mappings
           References
           Authors' Addresses
           Full Copyright Statement

1. Introduction

   Multiparty multimedia conferencing is one application that requires
   the dynamic interchange of end system capabilities and the
   negotiation of a parameter set that is appropriate for all sending
   and receiving end systems in a conference. For some applications,
   e.g. for loosely coupled conferences, it may be sufficient to simply
   have session parameters be fixed by the initiator of a conference.
   In such a scenario no negotiation is required because only those
   participants with media tools that support the predefined settings
   can join a media session and/or a conference.

   This approach is applicable for conferences that are announced some
   time ahead of the actual start date of the conference. Potential
   participants can check the availability of media tools in advance
   and tools like session directories can configure media tools on
   startup.
   This procedure, however, fails to work for conferences
   initiated spontaneously, like Internet phone calls or ad-hoc
   multiparty conferences. Fixed settings for parameters like media
   types, their encoding etc. can easily inhibit the initiation of
   conferences, for example in situations where a caller insists on a
   fixed audio encoding that is not available at the callee's end
   system.

   To allow for spontaneous conferences, the process of defining a
   conference's parameter set must therefore be performed either at
   conference start (for closed conferences) or potentially even
   repeatedly, every time a new participant joins an active
   conference. The latter approach may not be appropriate for every
   type of conference without applying certain policies: For
   conferences with TV-broadcast or lecture characteristics (one main
   active source) it is usually not desired to re-negotiate parameters
   every time a new participant with an exotic configuration joins,
   because it may inconvenience existing participants or even exclude
   the main source from media sessions. On the other hand, conferences
   with equal "rights" for participants that are open for new
   participants would need a different model of dynamic capability
   negotiation, for example a telephone call that is extended to a
   three-party conference at some time during the session.

   SDP [1] allows multimedia sessions to be specified (i.e.
   conferences; "session" as used here is not to be confused with
   "RTP session"!) by providing general information about the session
   as a whole and specifications for all the media streams (RTP
   sessions and others) to be used to exchange information within the
   multimedia session.

   Currently, media descriptions in SDP are used for two purposes:

   o  to describe session parameters for announcements and invitations
      (the original purpose of SDP)

   o  to describe the capabilities of a system (and possibly provide a
      choice between a number of alternatives). Note that SDP was not
      designed to facilitate this.

   A distinction between these two "sets of semantics" is only made
   implicitly.

   In the following we first introduce a model for session description
   and capability negotiation and define some terms that are later used
   to express some requirements. Note that this list of requirements is
   possibly incomplete. The purpose of this document is to initiate the
   development of a session description and capability negotiation
   framework.

2. Terminology and System Model

   Any (computer) system has, at any given time, a number of rather
   fixed hardware as well as software resources. These resources
   ultimately define the limitations on what can be captured,
   displayed, rendered, replayed, etc. with this particular device. We
   term features enabled and restricted by these resources "system
   capabilities".

      Example: System capabilities may include: a limitation of the
      screen resolution for true color by the graphics board; available
      audio hardware or software may offer only certain media encodings
      (e.g. G.711 and G.723.1 but not GSM); and CPU processing power
      and quality of implementation may constrain the possible video
      encoding algorithms.

   In multiparty multimedia conferences, participants employ different
   "components" in conducting the conference.

      Example: In lecture multicast conferences one component might be
      the voice transmission for the lecturer, another the transmission
      of video pictures showing the lecturer and the third the
      transmission of presentation material.

   Depending on system capabilities, user preferences and other
   technical and political constraints, different configurations can be
   chosen to accomplish the "deployment" of these components.

   Each component can be characterized at least by (a) its intended use
   (i.e. the function it shall provide) and (b) one or more possible
   ways to realize this function. Each way of realizing a particular
   function is referred to as a "configuration".

      Example: A conference component's intended use may be to make
      transparencies of a presentation visible to the audience on the
      Mbone. This can be achieved either by a video camera capturing
      the image and transmitting a video stream via some video tool or
      by loading a copy of the slides into a distributed electronic
      whiteboard. For each of these cases, additional parameters may
      exist, variations of which lead to additional configurations (see
      below).

   Two configurations are considered different regardless of whether
   they employ entirely different mechanisms and protocols (as in the
   previous example) or they choose the same and differ only in a
   single parameter.

      Example: In case of video transmission, a JPEG-based still image
      protocol may be used, H.261 encoded CIF images could be sent as
      could H.261 encoded QCIF images. All three cases constitute
      different configurations. Of course there are many more detailed
      protocol parameters.

   Each component's configurations are limited by the participating
   system's capabilities. In addition, the intended use of a component
   may constrain the possible configurations further to a subset
   suitable for the particular component's purpose.

      Example: In a system for highly interactive audio communication
      the component responsible for audio may decide not to use the
      available G.723.1 audio codec to avoid the additional latency but
      only use G.711.
      This would be reflected in this component only
      showing configurations based upon G.711. Still, multiple
      configurations are possible, e.g. depending on the use of A-law
      or u-law, packetization and redundancy parameters, etc.

   In this system model, we distinguish two types of configurations:

   o  potential configurations
      (a set of any number of configurations per component) indicating
      a system's functional capabilities as constrained by the intended
      use of the various components;

   o  actual configurations
      (exactly one per instance of a component) reflecting the mode of
      operation of this component's particular instantiation.

      Example: The potential configuration of the aforementioned video
      component may indicate support for JPEG, H.261/CIF, and
      H.261/QCIF. A particular instantiation for a video conference
      may use the actual configuration of H.261/CIF for exchanging
      video streams.

   In summary, the key terms of this model are:

   o  A multimedia session (streaming or conference) consists of one or
      more conference components for multimedia "interaction".

   o  A component describes a particular type of interaction (e.g.
      audio conversation, slide presentation) that can be realized by
      means of different applications (possibly using different
      protocols).

   o  A configuration is a set of parameters that are required to
      implement a certain variation (realization) of a certain
      component. There are actual and potential configurations.

      *  Potential configurations describe possible configurations that
         are supported by an end system.

      *  An actual configuration is an "instantiation" of one of the
         potential configurations, i.e. a decision how to realize a
         certain component.
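
   The terms summarized above can be made concrete with a small data
   model. The following Python sketch is illustrative only and is not
   part of any SDPng proposal; the class names (Configuration,
   Component) and the instantiate() method are hypothetical names
   chosen for this example. It reuses the video example from this
   section and enforces only the one rule stated above, namely that an
   actual configuration is an instantiation of one of the potential
   configurations.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Configuration:
    """One way of realizing a component, e.g. H.261 with CIF images."""
    name: str


@dataclass
class Component:
    """A type of interaction (e.g. video) and its configurations."""
    purpose: str
    potential: frozenset                    # what the end system can do
    actual: Optional[Configuration] = None  # how it currently operates

    def instantiate(self, config: Configuration) -> None:
        # An actual configuration must be one of the potential ones.
        if config not in self.potential:
            raise ValueError(f"{config.name} is not a potential configuration")
        self.actual = config


# The video example from the text: three potential configurations,
# of which H.261/CIF becomes the actual configuration.
jpeg = Configuration("JPEG")
h261_cif = Configuration("H.261/CIF")
h261_qcif = Configuration("H.261/QCIF")

video = Component("video", frozenset({jpeg, h261_cif, h261_qcif}))
video.instantiate(h261_cif)
print(video.actual.name)  # prints "H.261/CIF"
```

   Modelling the potential configurations as a set reflects that two
   configurations differing in a single parameter are still distinct
   members.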

   In less abstract words, potential configurations describe what a
   system can do ("capabilities") and actual configurations describe
   how a system is configured to operate at a certain point in time
   (media stream spec).

   To decide on a certain actual configuration, a negotiation process
   needs to take place between the involved peers:

   1.  to determine which potential configuration(s) they have in
       common, and

   2.  to select one configuration from this shared set of common
       potential configurations to be used for information exchange
       (e.g. based upon preferences, external constraints, etc.).

   In SAP [11]-based session announcements on the Mbone, for which SDP
   was originally developed, the negotiation procedure is non-existent.
   Instead, the announcement contains the media stream description sent
   out (i.e. the actual configurations), which implicitly describes
   what a receiver must understand to participate.

   In point-to-point scenarios, the negotiation procedure is typically
   carried out implicitly: each party informs the other about what it
   can receive and the respective sender chooses from this set a
   configuration that it can transmit.

   Capability negotiation must not only work for 2-party conferences
   but is also required for multi-party conferences. Especially for the
   latter case it is required that the process of determining the
   subset of allowable potential configurations is deterministic to
   reduce the number of required round trips before a session can be
   established.

   In the following, we elaborate on requirements for an SDPng
   specification, subdivided into general requirements and requirements
   for session descriptions, potential and actual configurations as
   well as negotiation rules.

3. General Requirements

   Note that the order in which these requirements are presented does
   not imply their relative importance.

3.1 Simplicity

   The SDPng syntax shall be simple to parse and the protocol rules
   shall be easy to implement.

3.2 Extensibility

   SDPng shall be extensible in a backward compatible fashion.
   Extensions should be doable without modifying the SDPng
   specification itself. The spec should preclude two independent
   extensions from clashing with each other (e.g. in the naming of
   attributes).

   Along with extensibility comes the requirement to identify certain
   extensions as mandatory in a given context and others as optional.

3.3 Firewall Friendliness

   It should be theoretically possible for firewalls (and other network
   infrastructure elements) to process announcements etc. that contain
   SDPng content. The concrete procedures have to be defined but if
   possible the processing of the SDPng content should be doable
   without interpretation of the textual descriptions.

3.4 Security

   SDPng should allow independent security attributes for parts of a
   session description. In particular, signing and/or encrypting parts
   of a session description should be supported.

3.5 Text encoding

   A concise text representation is desirable in order to enhance
   portability and allow for simple implementations. At run time, the
   size of encoded packets should be minimized, as should processing
   overhead.

   A language that allows specifications to be formally validated is
   desirable.

   A tendency to use XML as basis for the specification language has
   been expressed repeatedly.

3.6 Session vs. Media Description

   In many application scenarios (particularly with SIP and
   MEGACO/H.248), only media descriptions are needed and there is no
   need for session description parameters. SDPng should make
   parameter sets optional where it is conceivable that not all
   applications will need them.

3.7 Mapping (of a Subset) to SDP

   It shall be possible to translate a subset of SDPng into a standard
   SDP session description to enable a certain minimal degree of
   interoperability between SDP-based and SDPng-based systems.
   However, as SDPng will provide enhanced functionality compared to
   SDP, a full mapping to SDP is not possible.

   Note: Backwards compatibility to the SDP syntax has been discussed
   and it was found that this is not a goal for SDPng, as it is felt
   that RFC 2327 is too limiting.

   Since several flavors of SDP have been developed (e.g., the MEGACO
   WG uses certain non-SDP enhancements) it needs to be discussed which
   of these flavors need to be considered for some kind of mapping.

4. Session Description Requirements

   For now, we only consider requirements for media (stream)
   descriptions.

4.1 Media Description

   It must be possible to express the following information with SDPng:

4.1.1 Medium Type

   Payload types and format parameters for audio and video are
   well-defined and the basic semantics are clear (as defined in
   RFC1889 [2] and RFC2327 [1]).

   Format descriptions for text and whiteboard are currently only
   defined in the context of specific applications; this is probably
   going to change in the future (not an SDPng work item).

   Codecs and format parameters that are non-standard (in the sense of
   not being defined as a standard payload type) can be accommodated by
   using dynamic payload type mappings. This is a crucial feature of
   SDP that needs to be preserved for RTP applications.

   Current SDP only provides a= (a=fmtp) as means to specify codec
   parameters but actually gives little support on how to do this.
   Schemes for expressing more sophisticated parameters (e.g.
   supporting nesting) may be necessary. Nevertheless, it is imperative
   to keep the overall structure of a codec description manageable.
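
   To illustrate the point about dynamic payload type mappings and
   a=fmtp, the following Python sketch collects the rtpmap and fmtp
   attributes of a media description. It is a minimal sketch, not a
   complete SDP parser; the function name parse_media_attributes and
   the dictionary layout are assumptions made for this example, and the
   sample lines (a redundant audio payload mapped to dynamic payload
   type 121) are illustrative only. Note that the fmtp value can only
   be kept as an opaque string, because SDP does not define its
   internal structure.

```python
def parse_media_attributes(sdp_lines):
    """Collect rtpmap and fmtp attributes, keyed by payload type."""
    formats = {}
    for line in sdp_lines:
        if line.startswith("a=rtpmap:"):
            # a=rtpmap:<payload type> <encoding>/<clock rate>[/<params>]
            pt, mapping = line[len("a=rtpmap:"):].split(" ", 1)
            encoding, clock_rate, *rest = mapping.split("/")
            entry = formats.setdefault(int(pt), {})
            entry["encoding"] = encoding
            entry["clock_rate"] = int(clock_rate)
            if rest:
                entry["encoding_params"] = rest[0]
        elif line.startswith("a=fmtp:"):
            # a=fmtp:<format> <format specific parameters>
            pt, params = line[len("a=fmtp:"):].split(" ", 1)
            # The parameter string is format specific; keep it opaque.
            formats.setdefault(int(pt), {})["fmtp"] = params
    return formats


# Illustrative input: payload type 121 is mapped dynamically.
example = [
    "m=audio 12345 RTP/AVP 121 0 5",
    "a=rtpmap:121 red/8000/1",
    "a=fmtp:121 0/5",
]
formats = parse_media_attributes(example)
print(formats[121]["encoding"], formats[121]["fmtp"])  # prints "red 0/5"
```

   Any structure inside the fmtp string (here a list of payload types)
   has to be known to the application in advance, which is the kind of
   limitation a more structured parameter scheme would address.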

   Note that there is a conflict between the desire to be able to use
   any old SDP and translate it into SDPng and the desire to have a
   useful structure in the SDPng data.

4.1.2 Media Stream Packetization

   SDPng needs to be able to take care of more sophisticated payload
   descriptions than simple payload type assignment. Audio/video
   redundancy coding schemes need to be supported as need other
   mechanisms for FEC (RFC 2733 [7]) and media stream repair (RFC 2354
   [8]). Also, layered coding schemes need to be supported.

   Finally, a separation of the media encoding scheme, the
   packetization format, and possible repair schemes (and their
   respective parameters) is required.

4.1.3 Transport

   Since session descriptions are not only used to describe sessions
   that use IPv4/RTP for media transport it must be possible to specify
   different transport protocols (and their corresponding mandatory
   parameters). This means SDPng must support different address
   formats (IPv4, IPv6, E.164, NSAP, ...), multiplexing schemes (e.g.
   to identify a channel on a TDM link), and different transport
   protocol stacks (RTP/UDP/IP, RTP/AAL5/ATM, ...). Potential further
   parameters and interdependencies for multiplexed transports should
   be considered.

   Additionally the requirement for expressing multiple addresses per
   actual configuration (layered coding support) has emerged, as well
   as the requirement for expressing multiple addresses per potential
   configuration (one port per payload type to simplify processing at
   the receiver). (A motivation has been provided by
   draft-camarillo-sip-sdp-00.txt [9].)

   In multi-unicast scenarios it must be possible to specify more than
   one transport address for a single media stream in an actual
   configuration, i.e. by specifying address lists.
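
   The address list requirement can be pictured with a small data
   structure. The Python sketch below is a hypothetical illustration,
   not proposed SDPng syntax: the names TransportAddress and
   MediaTransport are invented for this example, only the IPv4 and IPv6
   address formats are validated (other formats such as E.164 or NSAP
   would need their own checks), and the addresses are taken from
   documentation ranges.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TransportAddress:
    network: str   # address format, e.g. "IPv4" or "IPv6"
    address: str
    port: int

    def __post_init__(self):
        # Only IP address formats are checked in this sketch.
        if self.network in ("IPv4", "IPv6"):
            version = ipaddress.ip_address(self.address).version
            if f"IPv{version}" != self.network:
                raise ValueError(f"{self.address} is not an {self.network} address")


@dataclass
class MediaTransport:
    stack: str                                     # e.g. "RTP/UDP/IP"
    addresses: list = field(default_factory=list)  # one entry per receiver


# A multi-unicast audio stream sent to two receivers.
audio = MediaTransport(
    stack="RTP/UDP/IP",
    addresses=[
        TransportAddress("IPv4", "192.0.2.10", 49170),
        TransportAddress("IPv6", "2001:db8::20", 49170),
    ],
)
for addr in audio.addresses:
    print(addr.network, addr.address, addr.port)
```

   Keeping the address format explicit in each entry allows a single
   media stream to mix address families without changing the rest of
   the description.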

   In "broadcast"- or "lecture"-like sessions source filters might be
   needed that allow receivers to verify the source and apply filters
   in multicast sessions. Similarly, for SSM, the transport address
   includes a (Sender,Group) pair of IP addresses.

4.1.4 QoS

   QoS parameters for different protocol domains (e.g. traffic
   specification and flow specification or TOS bits for IP QoS) need to
   be specified. draft-ietf-mmusic-sdp-qos-00.txt [10] has provided a
   proposal for a syntax that can be used with SDP to describe network
   and security preconditions that have to be met in order to establish
   a session.

4.1.5 Resource Utilization

   A requirement debated (but not yet agreed upon) was whether abstract
   terms should be found to describe resource requirements (in terms of
   CPU cycles, DSPs, etc.).

4.1.6 Dependencies

   Certain codecs may depend on other resources being available (e.g. a
   G.723.1 audio codec may need a DTMF codec as well while a G.711
   codec does not). Such interdependencies need to be expressed.

4.1.7 Other parameters (media-specific)

   Extension mechanisms that allow arbitrary other parameters of media
   codecs and formats to be described are mandatory. It is possibly
   required to distinguish between mandatory and optional extension
   parameters.

   In particular, it must be possible to introduce new (optional)
   parameters for a payload format and have old implementations still
   parse the parameters correctly.

4.1.8 Naming Hierarchy and/or Scoping

   Parameter names should be constructed in a way to avoid clashes and
   thereby simplify independent development of e.g. codec parameter
   descriptions in different groups.

5. Requirements for Capability Description and Negotiation

5.1 Capability Constraints

   Capability negotiation is used to obtain a session description (an
   actual configuration) that is compatible with the different end
   system capabilities and user preferences of the potential
   participants of a conference.

   A media capability description is the same as a potential
   configuration, as it contains a set of allowable configurations for
   different components that could be used to implement the
   corresponding component. A capability description should allow
   specifying a number of interdependencies among capabilities.
   Traditional SDP only supports alternative capabilities and the
   specification implicitly assumed that all capabilities could be
   combined and basically used at the same time (looking at the pure
   session description, at least).

   Processing power, hardware, link, or other resources may preclude
   the simultaneous use of certain configurations and/or limit the
   number of simultaneous instantiations of one or more configurations.
   This has led to a need to express in more detail constraints on
   combinations of configurations including the following constraints:

   o  grouping capabilities (-> capability set);

   o  expressing simultaneous capability sets;

   o  expressing alternative capability sets; and

   o  constraining the number of uses of a certain capability (set).

   It needs to be carefully investigated how much more sophistication
   (if any) than simply listing alternatives needs to go into a base
   specification of SDPng (and which extension mechanisms for certain
   applications or for future revisions should be allowed).

   Examples are known where complex capability descriptions are
   available but are simply not used (at least not at the level of
   sophistication that would be possible).
   This strongly calls for
   keeping requirements on capability constraints rather modest (KISS).

5.2 Processing Rules

   The processing of potential configurations includes the process of
   "collapsing" sets of potential configurations offered by
   participants, i.e. the computation of the intersection of these
   potential configurations.

   The processing (i.e. collapsing, forwarding etc.) of different
   potential configurations in order to find a compatible subset must
   work without having to know the semantics of the individual
   parameters. This is a key requirement for extensibility.

   Additionally it must be possible to make use of different
   negotiation policies in order to reflect different conference types.
   For example in a lecture-style conference the policy might be to
   ensure that a capability collapsing process does not yield an actual
   configuration that excludes the main source (i.e. the lecturer and
   her end system) from the conference.

   Preferences may also be considered in the negotiation process. This
   may need to be considered at the SDPng level (e.g. to express
   preferences, priorities).

   Of course, the negotiation of configurations must not only work in
   peer-to-peer conference scenarios but also be usable in multiparty
   scenarios.

   Negotiation of capabilities should take no longer than two or three
   message exchanges. The description format must enable such
   efficiency.

   In order to allow for concise capability specification it will
   probably be required to group descriptions of, say, codecs and to
   establish a kind of hierarchy that allows a certain attribute or
   parameter to be attached to a whole group of codecs.

   It might then also be required to have a naming scheme that allows
   definitions to be named so that they can later be referenced in
   subsequent definitions.
   This is useful in situations where some
   definition extends a previous definition by just one parameter or in
   situations where codecs are combined, for example for expressing
   redundancy or layered codings. Different models of re-use are
   conceivable.

6. Remarks

   Explicitly addressing the issue of capability negotiation when
   drafting the new session description language generates new sets of
   requirements, some of which might conflict with other important
   goals, such as simplicity, conciseness and SDP-compatibility.

   However, we think that it's worthwhile to sketch a reasonably
   complete and powerful solution first and then later develop a
   migration path from today's technology instead of imposing
   limitations at the outset to minimize the possibly necessary
   changes.

7. SDPng: A Strawman Proposal

   This section outlines a proposed solution for describing
   capabilities that meets most of the above requirements. Note that
   at this early point in time not all of the details are completely
   filled in; rather, the focus is on the concepts of such a capability
   description and negotiation language.

7.1 Conceptual Outline

   Our concept for the description language follows the system model
   introduced in the beginning of this document. We use a rather
   abstract language to avoid misinterpretations due to different
   intuitive understanding of terms as far as possible.

   PLEASE NOTE that the examples in the following are given for
   illustrative purposes only; they are not meant to be syntactically
   complete or consistent. For a more realistic example refer to the
   end of this section.

   Our concept of a capability description language addresses various
   pieces of a full description of system and application capabilities
   in four separate "sections":

   o  Definitions (elementary and compound)

   o  Potential or Actual Configurations

   o  Constraints

   o  Session attributes

7.1.1 Definitions

   The definition section specifies a number of "entities" that are
   later referenced to avoid repetitions in more complex specifications
   and allow for a concise representation. Entities are identified by
   an "id" by which they may be referenced. Entities may be elementary
   or compound (i.e. combinations of elementary entities).

   Elementary entities do not reference other entities. Each
   elementary entity only consists of one or more attributes and their
   values. Default values specified in the definition section may be
   overridden in descriptions for potential (and later actual)
   configurations.

   For the moment, elementary entities are defined for media types
   (i.e. codecs) and for media transports. For each transport and for
   each codec to be used, the respective attributes need to be defined.

   This definition may be either within the "Definition" section itself
   or in an external document (similar to the audio-visual profile or
   an IANA registry that define payload types and media stream
   identifiers).

   Examples for elementary entities include "{media=audio, coding=PCM,
   compression=ulaw, rate=8000}" to be identified by id="PCMU" and
   "{transport=UDP, framing=RTP, network=IPv4, ...}" to be identified
   by id="AVP".

   Compound entities combine a number of elementary and/or other
   compound entities for more complex descriptions. This mechanism can
   be used for simple standard configurations such as G.711 over
   RTP/AVP as well as to express more complex coding schemes including
   e.g. FEC schemes, redundancy coding, and layered coding.
Again,
643 such definitions may be standardized and externalized so that there
644 is no need to repeat them in every specification.

646 An example of a redundant audio payload format (following RFC 2198)
647 could be "{media=audio, coding=rfc2198, primary=ref:PCMU,
648 secondary=ref:GSM, pattern=1:2, pt=97}" referred to by
649 id="G711-Red". Standard uncompressed IP telephony audio could be
650 "{transport=ref:AVP, codec=ref:PCMU}" identified by id="IPTEL-UNC".

652 Both types of entities may have default values specified along with
653 them for each attribute. Some of these default values may be
654 overridden so that a codec definition can easily be re-used in a
655 different context (e.g. by specifying a different sampling rate)
656 without the need for a large number of base specifications.

658 The approach taken here allows commonly used definitions, simple as
659 well as more complex ones, to be made available in an
660 extensible set of reference documents. Care should be taken, though,
661 not to make the external references too complex and thus require too
662 much a priori knowledge in a protocol engine implementing SDPng.

664 Note: For negotiation between endpoints, it may be helpful to define
665 two modes of operation: explicit and implicit. Implicit
666 specifications may refer to externally defined entities to minimize
667 traffic volume; explicit specifications would list all external
668 definitions used in a description in the "Definitions" section.

670 7.1.2 Components & Configurations

672 The "Configurations" section contains all the components that
673 constitute the multimedia conference, IP telephone call, etc. For
674 each of these components, the potential and, later, the actual
675 configurations are given. Potential configurations are used during
676 capability exchange and/or negotiation; actual configurations are
677 used to configure media streams after negotiation or in session
678 announcements (e.g. via SAP).
A potential and the actual
679 configuration of a component may be identical.

681 Each component has an identifier ("id") so that it can be referred
682 to, e.g. to associate semantics with a particular media stream. For
683 such a component, any number of configurations may be given, with
684 each configuration describing an alternative way to realize the
685 functionality of the respective component.

687 Each configuration (potential as well as actual) is identified by an
688 "id". A configuration combines one or more (elementary and/or
689 compound) entities from the "Definitions" section to describe a
690 potential or an actual configuration. Within the specification of
691 the configuration, default values from the referenced entities may
692 be overridden.

694 For example, an IP telephone call may require just a single
695 component id=interactive-audio with two possible ways of
696 implementing it. The first configuration, id=1
697 "{ref=IPTEL-UNC}", uses the definition without modification; the
698 other uses redundancy coding with PCMU as both primary and secondary
699 encoding: id=2 "{codec=ref:G711-Red;secondary=PCMU, transport=ref:AVP}".
700 Typically, transport address parameters such as the port number
701 would also be provided but are omitted here for brevity.

703 When, during or after the negotiation phase, an actual configuration
704 is chosen out of a number of alternative potential configurations, the
705 actual configuration may refer to the potential configuration just
706 by its "id", possibly allowing for some parameter modifications.
707 Alternatively, the full actual configuration may be given.

709 If, from the above example, potential configuration #1 is chosen,
710 this could be expressed either in short form as "config=ref:1" or
711 fully specified as id=1 "{ref=IPTEL-UNC}".
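The entity and configuration model of Sections 7.1.1 and 7.1.2 can be illustrated with a minimal Python sketch. It is not part of the proposal: the dictionary layout, the "ref:" string convention, and the resolve() helper are assumptions made for illustration, and the attributes of the "GSM" entity are hypothetical because the text above references that entity without defining it.

```python
# Illustrative model of the "Definitions" section (layout is an assumption).
DEFINITIONS = {
    # Elementary entities: attributes and values only, no references.
    "PCMU": {"media": "audio", "coding": "PCM", "compression": "ulaw", "rate": 8000},
    "GSM": {"media": "audio", "coding": "GSM", "rate": 8000},  # hypothetical attributes
    "AVP": {"transport": "UDP", "framing": "RTP", "network": "IPv4"},
    # Compound entities: refer to other entities by id.
    "G711-Red": {"media": "audio", "coding": "rfc2198", "primary": "ref:PCMU",
                 "secondary": "ref:GSM", "pattern": "1:2", "pt": 97},
    "IPTEL-UNC": {"transport": "ref:AVP", "codec": "ref:PCMU"},
}


def resolve(entity_id, definitions, overrides=None):
    """Return a fully expanded copy of an entity.

    Overrides replace default values first; any "ref:<id>" value is then
    expanded recursively. Cyclic references are not detected in this sketch.
    """
    merged = dict(definitions[entity_id])
    merged.update(overrides or {})
    expanded = {}
    for key, value in merged.items():
        if isinstance(value, str) and value.startswith("ref:"):
            expanded[key] = resolve(value[4:], definitions)
        else:
            expanded[key] = value
    return expanded


# Component "interactive-audio" with its two potential configurations.
POTENTIAL = {
    "1": resolve("IPTEL-UNC", DEFINITIONS),
    "2": {
        "codec": resolve("G711-Red", DEFINITIONS, {"secondary": "ref:PCMU"}),
        "transport": resolve("AVP", DEFINITIONS),
    },
}

# Short form of the actual configuration: refer to the potential one by id.
actual = POTENTIAL["1"]
```

Applying overrides to a copy leaves the shared definitions untouched, so the same base definition can be re-used in several configurations, which is the purpose of default values described above.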
713 7.1.3 Constraints

715 Definitions specify media, transport, and other capabilities;
716 configurations indicate which combinations of these could be used to
717 provide the desired functionality in a certain setting.

719 There may, however, be further constraints within a system (such as
720 CPU cycles, available DSPs, dedicated hardware, etc.) that limit
721 which of these configurations can be instantiated in parallel (and
722 how many instances of these may exist). We deliberately do not
723 couple this aspect of system resource limitations to the various
724 application semantics as the constraints exist across application
725 boundaries. Also, in many cases, expressing such constraints is
726 simply not necessary (as many uses of the current SDP show), so
727 additional baggage can be avoided where this is not needed.

729 Therefore, we introduce a "Constraints" section to contain these
730 additional limitations. Constraints refer to potential
731 configurations and to entity definitions and use simple
732 logic to express mutual exclusion, limit the number of
733 instantiations, and allow only certain combinations.

735 By default, the "Constraints" section is empty (or missing), which
736 means that no further restrictions apply.

738 7.1.4 Session

740 The "Session" section is used to describe general parameters of the
741 communication relationship to be invoked or modified. It contains
742 most (if not all) of the general parameters of SDP (and thus will
743 easily be usable with SAP for session announcements).

745 In addition to the session description parameters, the "Session"
746 section also ties the various components to certain semantics. If,
747 in current SDP, two audio streams are specified (possibly even
748 using the same codecs), there is little way to differentiate
749 between their uses (e.g. live audio from an event broadcast vs. the
750 commentary from the TV studio).
752 This section also allows different media streams to be tied together
753 or a more elaborate description of alternatives to be provided (e.g.
754 subtitles or not, which language, etc.).

756 Further uses are envisaged but need to be defined.

758 7.2 Syntax Proposal

760 In order to allow session descriptions to be validated
761 and to allow for structured extensibility, it
762 is proposed to rely on a syntax framework that provides concepts as
763 well as concrete procedures for document validation and for extending
764 the set of allowed syntax elements.

766 SGML/XML technologies allow for the preparation of Document Type
767 Definitions (DTDs) that can define the allowed content models for
768 the elements of conforming documents. Documents can be formally
769 validated against a given DTD to check their conformance and
770 correctness. For XML, mechanisms have been defined that allow for
771 structured extensibility of a model of allowed syntax: XML Namespaces
772 and XML Schema.

774 XML Schema allows the document content to be constrained, e.g.
776 for documents that contain structured data, and also provides the
777 possibility that document instances can be conformant to several XML
778 Schema definitions at the same time, while allowing Schema
779 validators to check the conformance of these documents.

781 Extensions of the session description language, say to
782 express the parameters of a new media type, would require the
783 creation of a corresponding XML Schema definition that contains the
784 specification of element types that can be used to describe
785 configurations of components for the new media type. Session
786 description documents have to reference the non-standard Schema
787 module, thus enabling parsers and validators to identify the
788 elements of the new extension module and to either ignore them (if
789 they are not supported) or to consider them for processing the
790 session/capability description.
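The "identify and either ignore or process" behaviour described above can be sketched with XML namespaces and Python's standard xml.etree.ElementTree module. The namespace URIs and the element names (cfg, codec, hologram) are hypothetical placeholders, not names defined by this draft, and the sketch performs no Schema validation; it only shows how a parser can tell baseline elements from extension elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; the draft does not define any.
BASE_NS = "urn:example:sdpng:base"
SUPPORTED = {BASE_NS}

DOC = """\
<cfg xmlns="urn:example:sdpng:base" xmlns:x="urn:example:sdpng:newmedia">
  <codec id="PCMU" rate="8000"/>
  <x:hologram id="H1" depth="3"/>
</cfg>
"""


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, _, local = tag[1:].partition("}")
        return namespace, local
    return "", tag


def supported_children(root, supported=SUPPORTED):
    """Return local names of child elements from supported namespaces only."""
    kept = []
    for child in root:
        namespace, local = split_tag(child.tag)
        if namespace in supported:
            kept.append(local)  # process baseline or known extension element
        # elements from unknown namespaces are silently ignored
    return kept


root = ET.fromstring(DOC)
print(supported_children(root))  # ['codec']
```

An endpoint that later gains support for the extension would only need to add the extension namespace to its supported set; the document itself stays unchanged.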
792 It is important to note that the functionality of validating
793 capability and session description documents is not necessarily
794 required to generate or process them. For example, end-points could
795 be configured to understand only those parts of description
796 documents that conform to the baseline specification and
797 simply ignore extensions they cannot support. The usage of XML and
798 XML Schema is thus rather motivated by the need to allow for
799 extensions being defined and added to the language in a structured
800 way that does not preclude applications from
801 identifying and processing the extension elements they might support.
802 The baseline specification of XML Schema definitions and profiles must
803 be well-defined and targeted to the set of parameters that are
804 relevant for the protocols and algorithms of the Internet Multimedia
805 Conferencing Architecture, i.e. transport over RTP/UDP/IP, the audio
806 video profile of RFC 1890, etc.

808 The example below shows how the definition of codecs,
809 transport-variants and configuration of components could be
810 realized. Please note that this is not a complete example and that
811 identifiers have been chosen arbitrarily.

813 818 823 828 829 830 832 833

835 The example also does not include specifications of XML Schema
836 definitions or references to such definitions. These will be provided
837 in the next version of this draft.

839 A real-world capability description would likely be shorter than the
840 presented example because the codec and transport definitions can be
841 factored out into profile definition documents that would only be
842 referenced in capability description documents.

844 7.3 Mappings

846 In particular, a mapping to SDP needs to be defined that allows
847 final session descriptions (i.e. the result of capability
848 negotiation processes) to be translated into SDP documents. In
849 principle, this can be done in a rather schematic fashion.
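As a rough illustration of how schematic such a mapping could be, the following Python sketch renders one negotiated audio configuration as the media-level lines of an SDP description (RFC 2327 syntax). The configuration layout, the to_sdp_media() helper, and the "name" attribute are assumptions made for this sketch; the port, address, and payload type are example values supplied by the caller, with payload type 0 being the static assignment for PCMU in RFC 1890.

```python
# Mapping from the "network" attribute to the SDP address type (assumed naming).
ADDRESS_TYPES = {"IPv4": "IP4", "IPv6": "IP6"}


def to_sdp_media(config, address, port, payload_type):
    """Render one actual configuration as SDP m=, c= and a=rtpmap lines."""
    codec = config["codec"]
    transport = config["transport"]
    if transport.get("framing") != "RTP" or transport.get("transport") != "UDP":
        raise ValueError("only RTP over UDP is mapped in this sketch")
    address_type = ADDRESS_TYPES[transport["network"]]
    return [
        f"m={codec['media']} {port} RTP/AVP {payload_type}",
        f"c=IN {address_type} {address}",
        f"a=rtpmap:{payload_type} {codec['name']}/{codec['rate']}",
    ]


# Hypothetical result of a negotiation: PCMU audio over RTP/AVP.
ACTUAL = {
    "codec": {"name": "PCMU", "media": "audio", "rate": 8000},
    "transport": {"transport": "UDP", "framing": "RTP", "network": "IPv4"},
}

for line in to_sdp_media(ACTUAL, "192.0.2.1", 49170, 0):
    print(line)
# m=audio 49170 RTP/AVP 0
# c=IN IP4 192.0.2.1
# a=rtpmap:0 PCMU/8000
```

Session-level lines (v=, o=, s=, t=) would come from the "Session" section and are left out here; constraints and alternatives have no SDP counterpart, which is why only final descriptions are mapped.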
851 Furthermore, to accommodate SIP-H.323 gateways, a mapping from SDPng
852 to H.245 needs to be specified at some point.

854 References

856 [1] Handley, M. and V. Jacobson, "SDP: Session Description
857 Protocol", RFC 2327, April 1998.

859 [2] Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson,
860 "RTP: A Transport Protocol for Real-Time Applications", RFC
861 1889, January 1996.

863 [3] Schulzrinne, H., "RTP Profile for Audio and Video Conferences
864 with Minimal Control", RFC 1890, January 1996.

866 [4] Perkins, C., Kouvelas, I., Hodson, O., Hardman, V., Handley,
867 M., Bolot, J., Vega-Garcia, A. and S. Fosse-Parisis, "RTP
868 Payload for Redundant Audio Data", RFC 2198, September 1997.

870 [5] Klyne, G., "A Syntax for Describing Media Feature Sets", RFC
871 2533, March 1999.

873 [6] Klyne, G., "Protocol-independent Content Negotiation
874 Framework", RFC 2703, September 1999.

876 [7] Rosenberg, J. and H. Schulzrinne, "An RTP Payload Format for
877 Generic Forward Error Correction", RFC 2733, December 1999.

879 [8] Perkins, C. and O. Hodson, "Options for Repair of Streaming
880 Media", RFC 2354, June 1998.

882 [9] Camarillo, G., Holler, J. and G. AP Eriksson, "SDP media
883 alignment in SIP", Internet Draft
884 draft-camarillo-sip-sdp-00.txt, June 2000.

886 [10] Rosenberg, J., Schulzrinne, H. and S. Donovan, "Establishing
887 QoS and Security Preconditions for SDP Sessions", Internet
888 Draft draft-ietf-mmusic-sdp-qos-00.txt, June 1999.

890 [11] Handley, M., Perkins, C. and E. Whelan, "Session Announcement
891 Protocol", Internet Draft draft-ietf-mmusic-sap-v2-06.txt,
892 March 2000.

894 [12] Kumar, R. and M. Mostafa, "Conventions for the use of the
895 Session Description Protocol (SDP) for ATM Bearer
896 Connections", Internet Draft
897 draft-rajeshkumar-mmusic-sdp-atm-02.txt, July 2000.

899 [13] Quinn, B., "SDP Source-Filters", Internet Draft
900 draft-ietf-mmusic-sdp-srcfilter-00.txt, May 2000.
902 [14] Beser, B., "Codec Capabilities Attribute for SDP", Internet
903 Draft draft-beser-mmusic-capabilities-00.txt, March 2000.

905 [15] Casner, S., "SDP Bandwidth Modifiers for RTCP Bandwidth",
906 Internet Draft draft-ietf-avt-rtcp-bw-01.txt, March 2000.

908 Authors' Addresses

910 Dirk Kutscher
911 TZI, Universitaet Bremen
912 Bibliothekstr. 1
913 Bremen 28359
914 Germany

916 Phone: +49.421.218-7595
917 Fax: +49.421.218-7000
918 EMail: dku@tzi.uni-bremen.de

920 Joerg Ott
921 TZI, Universitaet Bremen
922 Bibliothekstr. 1
923 Bremen 28359
924 Germany

926 Phone: +49.421.201-7028
927 Fax: +49.421.218-7000
928 EMail: jo@tzi.uni-bremen.de

930 Carsten Bormann
931 TZI, Universitaet Bremen
932 Bibliothekstr. 1
933 Bremen 28359
934 Germany

936 Phone: +49.421.218-7024
937 Fax: +49.421.218-7000
938 EMail: cabo@tzi.org

940 Full Copyright Statement

942 Copyright (C) The Internet Society (2000). All Rights Reserved.

944 This document and translations of it may be copied and furnished to
945 others, and derivative works that comment on or otherwise explain it
946 or assist in its implementation may be prepared, copied, published
947 and distributed, in whole or in part, without restriction of any
948 kind, provided that the above copyright notice and this paragraph
949 are included on all such copies and derivative works. However, this
950 document itself may not be modified in any way, such as by removing
951 the copyright notice or references to the Internet Society or other
952 Internet organizations, except as needed for the purpose of
953 developing Internet standards in which case the procedures for
954 copyrights defined in the Internet Standards process must be
955 followed, or as required to translate it into languages other than
956 English.

958 The limited permissions granted above are perpetual and will not be
959 revoked by the Internet Society or its successors or assigns.
961 This document and the information contained herein is provided on an
962 "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
963 TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
964 BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
965 HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
966 MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

968 Acknowledgement

970 Funding for the RFC editor function is currently provided by the
971 Internet Society.