XCON WG                                                      C. Jennings
Internet-Draft                                             Cisco Systems
Expires: August 9, 2004                                         B. Rosen
                                                                 Marconi
                                                        February 9, 2004

                      Media Mixer Control for XCON
                  draft-jennings-xcon-media-control-00

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 9, 2004.

Copyright Notice

   Copyright (C) The Internet Society (2004). All Rights Reserved.

Abstract

   Conference mixers have many controls that change how the media is
   combined for each participant in the conference. There is a need to
   describe these controls to the clients connected to a centralized
   conference so that the clients can render a user interface and allow
   the user to manipulate them.

   This work is very early and far from complete. This draft sketches
   the outline of a solution for consideration. It is being discussed on
   the xcon@ietf.org mailing list.

Table of Contents

   1.     Conventions
   2.     Introduction to the Problem
   2.1    Non Problems
   3.     Overview
   3.1    Semantic information in a Conference
   3.2    The Protocol
   3.3    Templates
   3.4    Parameters
   3.5    Controls
   3.6    Roles
   4.     Introductory Example
   4.1    Simple Audio
   5.     Names and terminology
   5.1    Templates
   5.2    Participants
   5.3    Streams
   5.3.1  Stream Types
   5.3.2  Stream URLs
   5.3.3  Stream Priority
   5.4    Roles
   5.5    Controls
   5.6    Parameters
   6.     Solution
   6.1    Templates
   6.1.1  Parameters
   6.1.2  Roles
   6.1.3  Streams
   6.1.4  Controls
   6.1.5  Conference State
   6.1.6  Transport Protocol
   6.2    Controls
   6.2.1  Requirements
   6.2.2  Strings
   6.2.3  Integer
   6.2.4  Boolean
   6.2.5  Selection
   6.2.6  Multiple Selection
   6.2.7  Frame
   7.     Examples
   7.1    Audio Video Presentation
   8.     Template Registry
   9.     Comparison to other solutions
   10.    CPCP vs. MPCP vs. CCP vs. MCP
   11.    IANA
   12.    Security
   13.    Acknowledgments
          Normative References
          Informative References
          Authors' Addresses
          Intellectual Property and Copyright Statements

1. Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [1].

2. Introduction to the Problem

   This work tries to solve the problem of allowing a conference
   participant to manipulate the media flow in a mixer. It defines a
   protocol between the end user's software manipulating the conference
   and the centralized conference mixer. This needs to be rich enough
   for a mixer to express what information it wants from a client, yet
   simple enough to allow the client to render a useful user interface
   to the user. This work takes into account that real mixers have
   constraints on what media flows are possible and that UIs have
   buttons, knobs, etc. that users manipulate. The goal is for a
   conferencing end point made by one vendor to work with mixers or
   conference systems made by another vendor.

2.1 Non Problems

   There are several topics that are completely internal to the
   conference systems and are out of scope for this work. These
   include:

      How the focus manipulates the mixer.

      How one describes what a mixer is capable of doing.

3. Overview

   When a conference is created, it is instantiated from a template. The
   template describes what controls are available for the client to
   manipulate the media. The conference also describes roles that the
   client can take on, such as Moderator. The template can have
   parameters that are set when it is instantiated to allow one template
   to describe variations of similar flow models.
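
   As a rough illustration of the instantiation step described above,
   the following Python sketch models a template whose parameters are
   fixed when a conference is created. It is not part of the protocol
   in this draft: the class names (Parameter, Template, Conference),
   the instantiate function, and the "basic-audio" template with a
   max_participants range of 2 to 128 are all assumptions chosen for
   illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Parameter:
    """Hypothetical template variable that is fixed at instantiation."""
    name: str
    min_value: int
    max_value: int


@dataclass(frozen=True)
class Template:
    """Hypothetical in-memory model: parameters, roles, and controls."""
    name: str
    parameters: tuple = ()
    roles: tuple = ("participant",)
    controls: tuple = ()


@dataclass
class Conference:
    """A conference created from a template with concrete values."""
    template: Template
    values: dict = field(default_factory=dict)


def instantiate(template, **values):
    """Create a conference; reject unknown, missing, or out-of-range values."""
    known = {p.name: p for p in template.parameters}
    for name, value in values.items():
        if name not in known:
            raise ValueError(f"unknown parameter: {name}")
        p = known[name]
        if not p.min_value <= value <= p.max_value:
            raise ValueError(
                f"{name}={value} outside {p.min_value}..{p.max_value}")
    missing = set(known) - set(values)
    if missing:
        raise ValueError(f"missing parameters: {sorted(missing)}")
    return Conference(template=template, values=dict(values))


# Illustrative template: one integer parameter and one control.
basic_audio = Template(
    name="basic-audio",
    parameters=(Parameter("max_participants", 2, 128),),
    controls=("mute",),
)
conference = instantiate(basic_audio, max_participants=10)
```

   In this sketch, one template definition covers a family of
   conferences that differ only in their parameter values, and a value
   outside the declared range is rejected before the conference exists.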

   This document describes the templates and ways for the client to
   understand and manipulate the media in the conference. It allows for
   the following:

      A conference consists of several participants and multiple streams
      of media flowing between the participants and the mixer.

      Sidebars are mini conferences that are just like conferences
      except that a sidebar cannot itself contain sidebars.

      Clients can discover the template chosen for use in a conference,
      and the values of the parameters set for the conference.

      Clients can discover the available streams in a conference.

      Clients can send media on a participant stream and receive media
      on a mixer stream.

      Clients can discover the Participants in a conference and their
      roles (this is more conference policy than media policy).

      Clients can join a conference as a participant and assume a
      particular role.

      Conferences, Streams, and Participants can have controls that
      manipulate the media sent and received.

      The role of the participant will control what view of the
      conference they have and which media streams they can manipulate.

3.1 Semantic information in a Conference

   The conference has a list of Participants. Each Participant has a set
   of Controls that he can manipulate. Each conference has a list of
   sidebars. Each conference has a list of Streams. Each Stream has
   attributes such as name, type, priority, and a list of contributing
   participants.

3.2 The Protocol

   The protocol between the client and the conference server allows the
   client to get the semantic information in the conference, find out
   when it changes, and make changes to it. It's probably something like
   XCAP. [TODO add ref]

3.3 Templates

   Templates define a model for the reception, manipulation and
   transmission of streams.
   A template provides enough information that the client can
   intelligently render a useful GUI to the end user to manipulate the
   model. There is a registry of well known templates, but a conference
   server can define new ones. A convener can find all the templates a
   conference server supports and select one to use when creating the
   conference.

   A template for a very basic audio conference, for example, may
   indicate that there is one audio stream for each participant, and one
   output mixer stream named "primary". Each participant in the stream
   has a single binary control called "Mute". There is only one Role
   that can be used, called "participant".

3.4 Parameters

   Parameters are variables in the template that are set when the
   conference is created. For example, in the audio conference, the
   maximum number of participants might be a parameter. If the value
   is set to 10 when the conference is instantiated, then up to 10
   participant streams can be accepted into the mixer. The template can
   indicate the valid range for the maximum number of participants,
   perhaps from 2 to 128.

3.5 Controls

   Controls are variables participants may manipulate to control the
   media streams of the conference. Conferences can have controls,
   participants in a conference can have controls, and streams in a
   conference can have controls. Controls can also be implicitly created
   by stream action, for example a selector control based on the loudest
   speaker. Controls have a name and a value. Controls are defined in
   the template.

3.6 Roles

   Participants in a conference can take on different Roles that change
   what controls they may manipulate. The template defines what Roles
   are available for the client. The moderator (which itself is a role)
   can change the role of a particular participant.

4. Introductory Example

4.1 Simple Audio

   The client selects the basic audio template that looks like:

   [XML listing not preserved in this copy]

   The client retrieves this template and uses it to create a conference
   where it sets the max-participants to 10. Alice and Bob join this
   conference and the conference server tells Bob about the state of the
   conference media. There is only one role, "participant". Each
   participant contributes one input stream. There is also an output
   stream per participant. There is a single control, called mute, for
   each participant.

   After Alice and Bob have joined, the conference server informs Bob
   that the current state of the conference is as shown in the XML
   below.

   [XML listing not preserved in this copy; the surviving element
   values are 10, 0, and 0]

   There are two participants, Alice and Bob, who both contribute input
   streams and receive Mix streams, and neither is muted.

   Bob's client decides to change the Mute state for its audio stream
   and sends the following to the conference server to change the state
   of the conference.

   [XML listing not preserved in this copy; the surviving element
   value is 1]

   A key part of this is that Bob's client may have known about this
   basic audio template and what the semantics of the "mute" control
   implied. The client may have connected this up with a button of the
   client's that was labeled mute. On the other hand, Bob's client may
   not have known anything about this template and simply rendered a
   button on the screen and labeled it "mute" with no idea what this
   would do. A third client may not have been able to deal with the
   control at all and may have just ignored it. Clearly the user
   interface can be better if the client understands the semantics of
   what the template means, but the user interface is still functional
   when the client does not.

5. Names and terminology

5.1 Templates

   Templates contain a list of streams, roles for participants,
   parameters that need to be set, and controls for the conference.

5.2 Participants

   Participants are the logical user entities participating in a
   conference.

5.3 Streams

   A stream is a named stream of media. An example is a simple audio
   conference with 6 participants and a mixer that mixes the loudest
   three. Each participant contributes an input stream. There is a
   single logical output stream, but every participant gets a "custom"
   version of this stream, because, in normal mixers, each participant
   can hear all inputs except his own. This is commonly referred to as
   "mix-minus". If the output stream also has a control (mute), the
   output streams for each participant may also vary depending on the
   state of the control.

   Streams all have a type, a name, a direction (in or out), one or more
   URLs, and a priority. The URL is the source or sink of the stream.
   The priority indicates how important this particular stream is to the
   conference and the type indicates the type of media carried in this
   stream.

   Streams have types. These correspond to the major MIME types of the
   media they send.

5.3.1 Stream Types

5.3.1.1 Audio

   Streams originate as participant contributions (dir="in") that are
   mixed using some kind of algorithm. Intermediate streams may be
   created, which are subsequently mixed with other streams yielding
   streams which are sent to participants (dir="out"). Controls
   commonly available on audio streams include input or output faders
   (volume controls), stereo balance, and mute.

5.3.1.2 Video

   Streams originate as participant contributions (dir="in") that are
   combined with some kind of algorithm.
   Intermediate streams may be created, which are subsequently combined
   with other streams yielding streams which are sent to participants
   (dir="out"). Controls commonly available on video streams might
   include selectors for choosing a tiling format, selectors for which
   input streams appear on output tiles, and video mutes.

5.3.1.3 Text

   Streams originate as participant contributions (dir="in") (Instant
   Messages). Messages from all participants are combined using some
   algorithm. Intermediate streams may be created, which are
   subsequently combined with other text streams yielding streams which
   are sent to participants (dir="out").

5.3.1.4 Application

   At a minimal level, this consists of a URL that defines the
   application. Many systems will simply update an http URL that fetches
   an HTML page that shows the current presentation.

5.3.2 Stream URLs

   Streams have URLs that specify the source or sink of the stream.
   These would typically be a SIP, H.323 or XMPP URL.

5.3.3 Stream Priority

   Streams have a priority from 0 to 1. Zero indicates that a client, by
   default, should not play/display this stream unless the user
   specifically requests it. A priority of 1 indicates that, by default,
   the client should render this stream and should warn the user if it
   cannot. Other values only define an ordering, and clients should
   attempt to use their resources to display the higher priority streams
   before the lower.

5.4 Roles

   Roles are defined as part of Conference Policy but are used here so
   that the Media Policy can define separate streams and controls
   depending on role. Roles are defined in the template. Some
   templates may allow a participant to take on more than one role at a
   time. Each template must define a role named "participant", which is
   the default role. "Moderator" is a typical role, as is
   "Floor-Holder", but templates do not intrinsically define or require
   such roles.

5.5 Controls

   Controls manipulate the state of the conference while it is
   instantiated. All controls have a name, a type, a current value and
   permissions that indicate whether or not the current client can
   modify them. They may also have, optionally, a min and max value.

   A control can be defined as being part of a role. In that case, all
   participants who assume that role have an instance of the control. A
   control may also be defined as part of a stream, in which case all
   contributors of that stream (dir="in") have an instance of the
   control, or all sinks of the stream (dir="out") have an instance of
   the control. There can be global controls, which are available to
   all participants. Implicit controls extract values from streams (or
   other controls), such as choosing video inputs based on loudest
   speakers.

5.6 Parameters

   Parameters are variables that modify the function of the template.
   They are fixed when the conference is instantiated. Parameters allow
   a single template definition to describe a range of possible mixer
   capabilities.

   Parameters have a name, a type, a value and, optionally, a min and
   max value.

6. Solution

6.1 Templates

   A template is an XML document. The template definition includes a
   name, which is a string, for example: