idnits 2.17.1 

draft-mahy-xcon-media-policy-control-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Line 241 has weird spacing: '...ue type  regis...'

  == Line 492 has weird spacing: '...ence to  the...'

  == Line 495 has weird spacing: '...ence to  the...'

  == Line 504 has weird spacing: '...ence to  the...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (February 16, 2004) is 7374 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Possible downref: Normative reference to a draft: ref. '2' 

  -- Possible downref: Normative reference to a draft: ref. '3' 

  == Outdated reference: A later version (-02) exists of
     draft-koskelainen-xcon-xcap-cpcp-usage-00

  -- Possible downref: Normative reference to a draft: ref. '4' 

  ** Obsolete normative reference: RFC 2616 (ref. '5') (Obsoleted by RFC
     7230, RFC 7231, RFC 7232, RFC 7233, RFC 7234, RFC 7235)

  -- Possible downref: Non-RFC (?) normative reference: ref. '6'

  -- Possible downref: Non-RFC (?) normative reference: ref. '7'

  == Outdated reference: A later version (-26) exists of
     draft-ietf-mmusic-sdp-new-13

  ** Obsolete normative reference: RFC 3388 (ref. '11') (Obsoleted by RFC
     5888)

  -- Possible downref: Non-RFC (?) normative reference: ref. '12'

  == Outdated reference: A later version (-05) exists of
     draft-ietf-sipping-conferencing-framework-00

  == Outdated reference: A later version (-01) exists of
     draft-ietf-sipping-conferencing-requirements-00

  == Outdated reference: A later version (-12) exists of
     draft-ietf-sipping-conference-package-00

  -- No information found for draft-koskelainen-xcon-floor-control-reqs - is
     the name correct?


     Summary: 3 errors (**), 0 flaws (~~), 11 warnings (==), 9 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	XCON BOF                                                         R. Mahy
3	Internet-Draft                                                 N. Ismail
4	Expires: August 16, 2004                             Cisco Systems, Inc.
5	                                                       February 16, 2004

7	  Media Policy Manipulation in the Conference Policy Control Protocol
8	              draft-mahy-xcon-media-policy-control-01.txt

10	Status of this Memo

12	   This document is an Internet-Draft and is in full conformance with
13	   all provisions of Section 10 of RFC2026.

15	   Internet-Drafts are working documents of the Internet Engineering
16	   Task Force (IETF), its areas, and its working groups. Note that other
17	   groups may also distribute working documents as Internet-Drafts.

19	   Internet-Drafts are draft documents valid for a maximum of six months
20	   and may be updated, replaced, or obsoleted by other documents at any
21	   time. It is inappropriate to use Internet-Drafts as reference
22	   material or to cite them other than as "work in progress."

24	   The list of current Internet-Drafts can be accessed at http://
25	   www.ietf.org/ietf/1id-abstracts.txt.

27	   The list of Internet-Draft Shadow Directories can be accessed at
28	   http://www.ietf.org/shadow.html.

30	   This Internet-Draft will expire on August 16, 2004.

32	Copyright Notice

34	   Copyright (C) The Internet Society (2004). All Rights Reserved.

36	Abstract

38	   The SIP conferencing framework defines a model for tightly-coupled
39	   conferencing signaled via the Session Initiation Protocol (SIP), in
40	   which a Conference Policy Control Protocol is used to manipulate
41	   policies relevant to a specific conference, such as conference
42	   membership policy, authorization policy, and media layout. This
43	   document describes a logical model, which can apply to any session
44	   setup protocol, to describe media processing in a tightly-coupled
45	   conference. It also defines specific protocol semantics and a
46	   specific syntax to manipulate that model.

48	Table of Contents

50	   1.  Conventions  . . . . . . . . . . . . . . . . . . . . . . . . .  3
51	   2.  Overview . . . . . . . . . . . . . . . . . . . . . . . . . . .  3
52	   2.1 Streams  . . . . . . . . . . . . . . . . . . . . . . . . . . .  4
53	   2.2 Groups and Bundles . . . . . . . . . . . . . . . . . . . . . .  5
54	   2.3 Operators  . . . . . . . . . . . . . . . . . . . . . . . . . .  5
55	   2.4 Collections  . . . . . . . . . . . . . . . . . . . . . . . . .  6
56	   2.5 Using these Elements . . . . . . . . . . . . . . . . . . . . .  8
57	   3.  Some Standard Operators  . . . . . . . . . . . . . . . . . . . 10
58	   4.  More about Collections . . . . . . . . . . . . . . . . . . . . 14
59	   4.1 The Basic Audio Collection . . . . . . . . . . . . . . . . . . 15
60	   4.2 Basic Video MP Collection  . . . . . . . . . . . . . . . . . . 16
61	   4.3 Basic Audio Collection with Floor Control  . . . . . . . . . . 17
62	   4.4 Basic Video Collection with Floor Control  . . . . . . . . . . 18
63	   4.5 Sidebar Audio Collection . . . . . . . . . . . . . . . . . . . 19
64	   5.  Semantics  . . . . . . . . . . . . . . . . . . . . . . . . . . 21
65	   5.1 Transactions . . . . . . . . . . . . . . . . . . . . . . . . . 21
66	   5.2 Client Behavior  . . . . . . . . . . . . . . . . . . . . . . . 21
67	   5.3 Server Behavior  . . . . . . . . . . . . . . . . . . . . . . . 22
68	   5.4 Notifications of media policy changes  . . . . . . . . . . . . 23
69	   6.  Formal Syntax  . . . . . . . . . . . . . . . . . . . . . . . . 23
70	   7.  Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
71	   8.  Security Considerations  . . . . . . . . . . . . . . . . . . . 31
72	   9.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 31
73	   10. Acknowledgments  . . . . . . . . . . . . . . . . . . . . . . . 31
74	       Normative References . . . . . . . . . . . . . . . . . . . . . 31
75	       Informational References . . . . . . . . . . . . . . . . . . . 32
76	       Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 33
77	   A.  Standard Tile Order  . . . . . . . . . . . . . . . . . . . . . 33
78	       Intellectual Property and Copyright Statements . . . . . . . . 34

80	1. Conventions

82	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
83	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
84	   document are to be interpreted as described in RFC 2119 [1].

86	2. Overview

88	   The SIP conferencing framework [13] defines a model for
89	   tightly-coupled conferences set up via SIP [8], in which a Conference
90	   Policy Control Protocol is used to manipulate policies which are
91	   relevant to a specific conference instance, such as conference
92	   membership policy, authorization policy, and media layout.  (As
93	   discussed later, the bulk of this model is applicable to
94	   tightly-coupled conferences accessed using almost any session setup
95	   protocol.) While the conference policy control protocol provides many
96	   non-media specific functions [4] such as membership policy and
97	   authorization policy, this document specifically addresses
98	   requirements [3] to manipulate the way in which media in such a
99	   conference is selected, combined, and modified. It defines a logical
100	   model of media processing using a "media topology graph". By
101	   manipulating the graph, authorized users can change the media
102	   processing behavior of the mixers associated with a specific
103	   conference.

105	   Here we briefly summarize the terminology used in the SIP
106	   conferencing framework in protocol-independent terms. Each
107	   "conference" is an instance of a multi-media conversation which has a
108	   unique protocol-specific identifier.  Other (optional) identifiers
109	   can represent a conference-factory (an identifier which creates new
110	   conferences when contacted).  Conferences can contain
111	   sub-conferences, which have a unique identifier within the
112	   conference, and optionally a unique, protocol-specific, external
113	   identifier as well.  Each conference identifier is managed by a
114	   logical role called a focus, which manages session state for all
115	   sessions in the conference.  The focus is responsible for
116	   coordinating media combining through logical mixers.  Mixers perform
117	   the actual selection and combination operations.  A logical
118	   Conference Policy server manages creation and deletion of
119	   conferences, authorization, conference longevity, and the media
120	   layout or topology.  In addition, the focus can use protocol-specific
121	   notification mechanisms to provide access to a basic roster and
122	   changes in media or non-media aspects of conference policy.  Finally,
123	   the conference policy may be configured such that mixers use the
124	   information returned dynamically by Floor control server(s) to affect
125	   media selection.

127	   A media topology graph is a loop-free graph which consists of
128	   individual media streams, logical groups of media streams, and
129	   functions or "operations" performed on those streams. These elements
130	   are typically associated with a specific subconference.  A
131	   subconference simply defines a context which allows different groups
132	   of users to share a media topology and participant roster with a
133	   subset of the participants in a conference.  Subconferences are
134	   defined in the conferencing framework, and are typically used to
135	   enable conferencing sidebars. For convenience purposes,
136	   subgraphs--called collections--of connected operators can be defined,
137	   instantiated, and manipulated just like individual elements. These
138	   elements and their properties are described below.

140	2.1 Streams

142	   In the beginning there were Streams. These are the actual media
143	   streams sent and/or received by or on behalf of conference
144	   participants. Media streams are typically established when conference
145	   participants join a conference and are described by the SDP [9]
146	   media lines in the offer/answer [10] exchange between the
147	   participants and the focus, or the analogous exchange in other
148	   protocols (e.g., H.245 [12] logical channel establishment).  Within
149	   the media topology graph, each stream is described by a media type,
150	   a direction, and at least one identifier. The media types initially
151	   considered are audio, video, and text. (Other media types can also
152	   be considered in the future.) The direction "in" corresponds to
153	   streams originating from the conference participants to the
154	   conference, and "out" for streams originating from the conference and
155	   terminating at the conference participant. Stream identifiers can be
156	   network identifiers or aliases. Network identifiers consist of an
157	   address family (IPv4 or IPv6), an IP address, and a port number.

159	   Aliases can also be created for any of the streams, either
160	   automatically or manually. One such automatic alias
161	   consists of a participant identifier and a media stream instance (for
162	   example, in SDP, either the media stream identification "mid" as
163	   specified in RFC3388 [11] or the position of the media line
164	   describing the stream in SDP). Another set of aliases can be
165	   created automatically when per-media-line i-lines (description
166	   lines) appear in the SDP.
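
As an illustration only, the stream description above can be sketched as a small data structure. The sketch is written in Python; the class names, the field names, and the "participant#instance" alias format are assumptions made for this example and are not defined by this document.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class NetworkId:
    """Network identifier: address family, IP address, and port number."""
    family: str   # "IPv4" or "IPv6"
    address: str
    port: int


def automatic_alias(participant: str, instance: str) -> str:
    """Build an alias from a participant identifier and a media stream
    instance (for example an SDP "mid"). The "#" separator is hypothetical."""
    return f"{participant}#{instance}"


@dataclass
class Stream:
    media_type: str                        # "audio", "video", or "text"
    direction: str                         # "in" or "out"
    network_id: Optional[NetworkId] = None
    aliases: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.direction not in ("in", "out"):
            raise ValueError("direction must be 'in' or 'out'")
        # Each stream is described by at least one identifier.
        if self.network_id is None and not self.aliases:
            raise ValueError("a stream needs at least one identifier")

    def identifiers(self) -> List[str]:
        """Return every identifier (network identifier first, then aliases)."""
        ids: List[str] = []
        if self.network_id is not None:
            n = self.network_id
            ids.append(f"{n.family}:{n.address}:{n.port}")
        ids.extend(self.aliases)
        return ids
```

A client could then refer to the same stream either by its network identifier or by any of its aliases when connecting it to a group.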

168	   Conference Policy servers provide clients with lists of stream
169	   descriptions as part of protocol-specific notification mechanisms
170	   such as the SIP conference package [15] and in response to inventory
171	   requests as specified in Section 5.3. Clients use the stream
172	   identifier that is part of a stream description to associate and
173	   connect (or disconnect) a specific stream with a specific group.
174	   (Stream identifiers also play an important role in the naming of the
175	   logical internal streams which make up the "bundles" described later
176	   in this section.)

178	      Editor's Note: The distinction between external streams and
179	      internal (logical) streams may be confusing.  If this becomes a
180	      problem, one or both terms will be renamed.

182	2.2 Groups and Bundles

184	   Media groups (hereafter just "groups") are created automatically by
185	   servers within the context of a sub-conference as specified in
186	   Section 5.3 and have a media type and a direction.   Input groups
187	   take individual streams and aggregate them into a bundle of named
188	   streams.  Likewise, output groups accept a bundle of named streams,
189	   and distribute these as appropriate to individual output streams.
190	   One motivation for naming streams in a bundle is described shortly.
191	   Also, the process used to distribute output streams is described in
192	   the server behavior section.  Groups do not connect directly to other
193	   groups.

195	   Bundles are a logical concept which represents a set of individually
196	   tagged (named) logical streams.  Input bundles contain tags which
197	   describe which identifier or participant is contributing to a logical
198	   stream.  Output bundles contain tags which describe which identifiers
199	   or participants should receive a logical stream.  This distinction
200	   allows participants to receive different streams even when their
201	   logical description of the topology is the same.  For example, in
202	   most audio conferences participants do not hear their own input.
203	   Most output bundles also contain a default logical stream.
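
The role of tags in an output bundle can be sketched as follows. This is a minimal Python illustration, not the distribution procedure of the server behavior section: the "default" tag name, the function name, and the rule that a receiver without a tagged stream falls back to the default stream are assumptions made for this example.

```python
from typing import Dict, Iterable

DEFAULT_TAG = "default"  # hypothetical tag for the default logical stream


def distribute(output_bundle: Dict[str, str],
               receivers: Iterable[str]) -> Dict[str, str]:
    """Pick, for each receiver, the logical stream tagged for that receiver,
    or the default logical stream when no tagged stream exists."""
    delivered: Dict[str, str] = {}
    for receiver in receivers:
        if receiver in output_bundle:
            delivered[receiver] = output_bundle[receiver]
        elif DEFAULT_TAG in output_bundle:
            delivered[receiver] = output_bundle[DEFAULT_TAG]
    return delivered


# Two speakers each hear a mix without their own input; a listener who is
# not contributing receives the default stream.
bundle = {DEFAULT_TAG: "mix(A,B)", "A": "mix(B)", "B": "mix(A)"}
result = distribute(bundle, ["A", "B", "C"])
```

Under these assumptions, participants A and B receive different streams even though their logical description of the topology is the same.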

205	2.3 Operators

207	   Next are Operators. Operators are basic elements that perform simple
208	   media operations. They select among media streams, combine streams,
209	   or perform other media processing. Each operator has a type, one or
210	   more inputs, one logical output, and an optional set of parameters.
211	   The type uniquely identifies the operator and specifies the media
212	   service offered.

214	   Selection operators typically accept an input bundle and generate an
215	   ordered set of names of logical streams.  These sets can be further
216	   manipulated by other operators, but typically they are used as input
217	   to a mixing or combining operator.  Mixing operators typically
218	   receive an input bundle and an ordered list and generate an output
219	   bundle.  At least one mixer that can switch the orientation of the
220	   streams (from input to output) must be present in the topology graph.
221	   Other types of mixers may receive one or more output bundles, perform
222	   the appropriate content manipulation, and return a bundle which
223	   preserves the sense of the original tags.

225	   For example, the simplest type of mixer is a promiscuous media mux.
226	   It receives an input bundle and generates a bundle consisting of a
227	   single default stream (all of the original streams appended to each
228	   other).  In another simple variation, for each named input stream in
229	   the input bundle, a media mux generates a named output stream in the
230	   output bundle which contains all of the input except that of the
231	   sender.  Most mixing operations actually combine input streams
232	   in some media-specific way (for example: tiling for video).  Other
233	   types of operators can provide other arbitrary media or set
234	   manipulations such as adjust volume, cross-fade, etc.  Operators
235	   cannot connect directly to input or output streams. Each type of
236	   operator defines the semantics of the operation and any parameters.
237	   Parameters define aspects of the operator's function that can differ
238	   from one instance of the operator to another.

240	   This document defines a set of standard operators (see Section 3).
241	   Each standard operator has a unique type  registered with IANA and an
242	   XML schema describing the operator. Server implementations can
243	   support any of the set of standard operators. As well, implementors
244	   can define their own operators and operator types.  Clients can
245	   discover which operators are supported by making inventory requests
246	   to the Server.  Authorized clients can then instantiate operators
247	   using the method specified in Section 5.2.

249	2.4 Collections

251	   Finally there are Collections. Collections are subgraphs created by
252	   connecting different operators together. Each collection can provide
253	   a specific, potentially sophisticated, media service.  Like
254	   operators, a collection has a type that uniquely identifies it and
255	   specifies its function.  Each collection has one or more inputs, one
256	   logical output and an optional set of parameters.  As with operators,
257	   this specification defines a set of standard collections that offer
258	   the most common mixing and switching media functions available.  Each
259	   standard collection has a unique type that will be registered with
260	   IANA and an XML schema describing the collection. Server
261	   implementations can support any of the set of standard collections
262	   and they can also define their own proprietary collections. Each
263	   newly defined collection needs a unique type and a published XML
264	   schema. Clients can make inventory requests to Servers to get the set
265	   of collections supported by the server. Clients can then instantiate
266	   collections using the method specified in Section 5.2. Clients can
267	   also make their own collections to provide new media services by
268	   using the method specified in Section 4.

270	   Below follows an example diagram of a media topology graph for a
271	   simple audio conference using the default audio collection.

273	                                 Input Streams

275	                            A     B     C     D     E

277	                            |     |     |     |     |
278	                            |     |     |     |     |
279	                            v     v     v     v     v
280	                     +----------------------------------+
281	                     |                                  |
282	                     |  Subconference 0 (Main conf)     |
283	                     |  Audio Input Group               |
284	                     |                                  |
285	                     +----------------------------------+
286	                                   ||
287	                                   \/
288	               .............................................
289	               :          ||                ||             :
290	               :   Input  ||       Input    ||             :
291	               :   Bundle ||       Bundle   ||             :
292	               :          ||                \/             :
293	               :          ||       +-------------+         :
294	               :          ||       |  Speaker    |         :
295	               :          ||       |  Selection  |         :
296	               :          ||       |  Operator   |         :
297	       Default :          ||       |             |         :
298	       Audio   :          ||       +-------------+         :
299	       Collec- :          ||          /                    :
300	        tion   :          ||         /  Ordered List of    :
301	               :          \/        /          Speakers    :
302	               :      +---------------+                    :
303	               :      |  Audio        |                    :
304	               :      |  MixMinus     |                    :
305	               :      |  Operator     |                    :
306	               :      |               |                    :
307	               :      +---------------+                    :
308	               :          ||                               :
309	               :          || Output Bundle                 :
310	               :          \/                               :
311	               .............................................
312	                                    ||
313	                                    \/
314	                      +----------------------------------+
315	                      |                                  |
316	                      |  Subconference 0 (Main conf)     |
317	                      |  Audio Output Group              |
318	                      |                                  |
319	                      +----------------------------------+
320	                             |     |     |     |     |
321	                             |     |     |     |     |
322	                             v     v     v     v     v

324	                             A     B     C     D     E

326	                                  Output Streams

328	2.5 Using these Elements

330	   This document defines numerous standard operators (in Section 3) to
331	   facilitate interoperability. Implementors are free to extend this
332	   list of operators, and an IANA registration process is defined for
333	   this purpose. Note that specific conference servers may (MAY) support
334	   as few or as many operators as they choose; however, each conference
335	   server needs to (MUST) support at least one standard collection per
336	   media type (these are defined in Section 4) which the conference
337	   server is capable of handling.

339	   Media manipulation is generally media-specific. When a subconference
340	   is created, an input group and an output group are automatically
341	   created for each media type supported by the conference server, and a
342	   specific collection can be instantiated (again, for each media type).
343	   Once instantiated, collections are simply a subgraph of operators
344	   connected in some specific way. The resulting graph can be modified,
345	   attached, detached, and deleted without affecting the collection from
346	   which the graph was copied. Note also that more than one collection
347	   can be incorporated into the topology graph for a given subconference
348	   and media type.

350	   Manipulating the topology graph for a tightly-coupled conference
351	   enables a number of useful features, many of which are described in
352	   the XCON scenarios [16] and SIP conferencing high-level requirements
353	   [14] documents.

355	   For example, noisy participants can be "muted" from a conference by
356	   disconnecting their audio from the appropriate input group.
357	   Participants can be moved to a sidebar by disconnecting their media
358	   streams (some or all of them) and reconnecting them to the input and
359	   output groups created for the corresponding subconference.
360	   Interaction with floor control [17] is coordinated by including an
361	   operator which selects only media streams corresponding to
362	   participants who have the appropriate floor. The resulting logical
363	   output stream or group of streams can be connected to a suitable
364	   filtering, mixing, or combining operator (for example tiling for
365	   video).
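
The mute and sidebar manipulations described above amount to connect and disconnect operations on groups. The following Python sketch illustrates this under stated assumptions: the Group class, its method names, and the helper functions are hypothetical and are not the protocol operations defined by this document.

```python
from typing import Set


class Group:
    """A media group within a subconference, with a media type and direction."""

    def __init__(self, subconference: int, media_type: str, direction: str):
        self.subconference = subconference
        self.media_type = media_type
        self.direction = direction
        self.streams: Set[str] = set()   # identifiers of connected streams

    def connect(self, stream_id: str) -> None:
        self.streams.add(stream_id)

    def disconnect(self, stream_id: str) -> None:
        self.streams.discard(stream_id)


def mute(audio_input_group: Group, stream_id: str) -> None:
    """Mute a participant by disconnecting their audio from the input group."""
    audio_input_group.disconnect(stream_id)


def move_to_sidebar(stream_id: str, main: Group, sidebar: Group) -> None:
    """Move one stream from a main conference group to the sidebar group."""
    main.disconnect(stream_id)
    sidebar.connect(stream_id)


main_in = Group(0, "audio", "in")
sidebar_in = Group(1, "audio", "in")
for participant in ("A", "B", "C"):
    main_in.connect(participant)

mute(main_in, "B")                        # B no longer contributes audio
move_to_sidebar("C", main_in, sidebar_in)  # C now contributes to the sidebar
```

A complete move to a sidebar would repeat the last step for the corresponding output groups and for each media type being moved.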

367	   Obviously, authorization is required to allow manipulation of media
368	   topology by multiple parties (participants and non-participants
369	   alike). The effects of manipulating the media topology graph can
370	   range from simple, benign changes which only affect the participant
371	   requesting the change, to complete failure of the conference. Clearly
372	   no one-size-fits-all policy can be applied.  However it is useful to
373	   recognize several different categories or severities of impact.

375	   o  connecting and disconnecting your own streams to a group

377	   o  connecting and disconnecting another participant's streams

379	   o  creating subconferences

381	   o  instantiating arbitrary operators or collections

383	   o  connecting and disconnecting operators and collections to your own
384	      groups

386	   o  connecting and disconnecting operators and collections which
387	      affect an existing conference or subconference

389	   The rest of the functions of the Conference Policy Control Protocol
390	   (CPCP for brevity) are mostly orthogonal to media manipulation and so
391	   they are described in a separate document [4]. However it is
392	   important to mention the interaction between the media
393	   topology-specific and other aspects of the policy. Conferences and
394	   subconferences can be created and deleted by CPCP. Although these
395	   operations are not topology dependent, the media topology changes
396	   automatically to reflect them. Also, one participant may wish
397	   to invite several other participants to a subconference (sidebar),
398	   but the initiating participant may not have permission to change the
399	   stream connection properties of all of the participants. In this
400	   case, the initiator places the participant in a pending state. This
401	   informs the participant that the initiator would like the participant
402	   to join the sidebar. Then the participant (or an agent acting on his
403	   or her behalf) either makes the requested change to the media
404	   topology by connecting his or her streams to the appropriate groups
405	   (a media topology task), or removes himself or herself from the
406	   pending list (a non-media related task). Finally, in many cases
407	   authorized users can set authorization policy related to a variety of
408	   aspects of conference policy. While setting these policies is
409	   non-media related, many uses of these policies do affect the media
410	   topology. Note that because of this separation, it is possible to
411	   produce an implementation of CPCP which runs on two separate servers,
412	   one responsible for media topology and the other responsible for the
413	   balance of conference policy functions.

415	3. Some Standard Operators

417	   This section specifies a set of operators that are needed to provide
418	   the most common media processing operations used in conferencing
419	   today. Each operator performs a specific function. Each type of
420	   operator is registered with IANA and has an XML Schema [7] that
421	   defines how to use the operator. Server implementations are free to
422	   support any number of these operators (or none of them) as well as
423	   define their own operators.

425	   The operators described below are logical operators which are useful
426	   for describing conference features. Implementations may use any
427	   internal representation which generates externally identical
428	   functionality.  The formal syntax for using these operators is
429	   described in Section 6.

431	   The "audioSelectSpeakers" operator takes an audio input bundle and
432	   generates an ordered list of names of streams.  This list is ordered
433	   by the priority for including them in an audio mix.  No specific
434	   algorithm is specified for selecting which speakers are the "best",
435	   but commercial implementations typically use a combination of last,
436	   loudest, and longest speakers. The actual list of selected speakers
437	   is dynamically calculated by a conference mixer.  This definition is
438	   intentionally generic so that most implementations can offer this
439	   operator.

441	   The "audioMixMinus" operator takes an audio input bundle and an
442	   ordered list of names of streams and generates an audio output
443	   bundle.  It selects the first <n> of the streams from the ordered
444	   list, where <n> is an implementation-specific integer.  The output
445	   bundle contains a default stream (which mixes all <n> logical
446	   streams) and one logical stream for each stream present in the
447	   original input bundle which contains a mix of all <n> logical streams
448	   except for input streams corresponding to the same participant as
449	   that output stream.  In general this property of a mixer is called an
450	   exclusive property because it causes each participant's input to be
451	   excluded from that participant's own output.  With these two
452	   operators, you can build the default audio collection described in
453	   Section 4.1 and illustrated in the figure in Section 2.4.
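
The behavior of audioMixMinus can be sketched in Python as follows. This is an illustration under simplifying assumptions, not a normative algorithm: audio is modeled as lists of integer samples, mixing is modeled as sample-wise addition, each stream name is assumed to identify one participant, and the function names and the "default" tag are hypothetical.

```python
from typing import Dict, List


def mix(streams: List[List[int]]) -> List[int]:
    """Mix sample lists by addition; shorter lists are padded with silence."""
    length = max((len(samples) for samples in streams), default=0)
    mixed = [0] * length
    for samples in streams:
        for i, value in enumerate(samples):
            mixed[i] += value
    return mixed


def audio_mix_minus(input_bundle: Dict[str, List[int]],
                    ordered_names: List[str],
                    n: int,
                    default_tag: str = "default") -> Dict[str, List[int]]:
    """Return an output bundle with a default mix of the first n streams in
    the ordered list, plus one stream per input that omits that input."""
    selected = [name for name in ordered_names if name in input_bundle][:n]
    output = {default_tag: mix([input_bundle[name] for name in selected])}
    for name in input_bundle:
        # Exclusive property: leave the participant's own input out.
        others = [input_bundle[s] for s in selected if s != name]
        output[name] = mix(others)
    return output


bundle = {"A": [1, 1], "B": [2, 2], "C": [4, 4]}
out = audio_mix_minus(bundle, ["C", "A", "B"], n=2)
```

In this example only C and A are selected, so B (who is not in the mix) hears the same audio as the default stream, while A hears only C and C hears only A.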

455	   The "allParticipantsSet" operator takes an input bundle and generates
456	   an unordered list of all the stream names which could conceivably
457	   contribute to that bundle.

459	   The "videoSelectSpeakers" operator takes an audio input bundle (to
460	   determine who is speaking) and generates an ordered list of names of
461	   streams.  This list is ordered by the priority for including any
462	   corresponding video streams in a video mix.  Note that at a given
463	   instant the output of videoSelectSpeakers and audioSelectSpeakers may
464	   be different.  For example, video speaker selection algorithms
465	   typically delay their selection to avoid swapping speakers in the
466	   presence of noise such as coughs.

468	   The "setIntersection" operator takes an (optionally) ordered list and
469	   an unordered list and generates a new list in the same order as the
470	   first list.  The new list contains the intersection of the members of
471	   the two lists.
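
A minimal Python sketch of the setIntersection semantics follows; the function name and the use of plain lists of stream names are assumptions made for illustration.

```python
from typing import Iterable, List


def set_intersection(first: List[str], unordered: Iterable[str]) -> List[str]:
    """Keep the members of the first list that also appear in the unordered
    list, preserving the order of the first list."""
    members = set(unordered)
    return [name for name in first if name in members]


# Speakers ranked by priority, restricted to those who also send video.
ranked_speakers = ["C", "A", "B"]
video_senders = ["A", "B", "D"]
selected = set_intersection(ranked_speakers, video_senders)
```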

473	   The "streamMux" operator takes an input bundle and an ordered list of
474	   streams, and generates an output bundle where each output stream
475	   contains at least <n> and at most <m> of the input streams muxed in
476	   priority order.  (<n> and <m> are attributes which specify the
477	   minimum and maximum number of streams respectively).  This operator
478	   also takes an attribute which indicates if the operator should
479	   include input streams corresponding to the output stream's
480	   participant.  With these additional four operators you can build the
481	   default multipoint video collection described in Section 4.2.  A
482	   client using these operators directly to create the same effect would
483	   follow these steps. (Note that in most cases the correct "connector"
484	   to use is implicit from the direction and type of the connection.)

486	   1.  Instantiate a streamMux operator with the following parameters:
487	       n=1, m=1, exclusive=true.

489	   2.  Instantiate an allParticipants operator, a setIntersection
490	       operator, and a videoSelectSpeakers operator.

492	   3.  Connect the video input group for this conference to the
493	       allParticipants operator

495	   4.  Connect the audio input group for this conference to the
496	       videoSelectSpeakers operator

498	   5.  Connect the allParticipants operator to the "unordered" input of
499	       the setIntersection operator

501	   6.  Connect the videoSelectSpeakers operator to the "ordered" input
502	       of the setIntersection operator

504	   7.  Connect the video input group for this conference to the
505	       streamMux operator

507	   8.  Connect the (output of the) setIntersection operator to the
508	       streamMux operator

510	   9.  Connect the streamMux operator to the video output group for this
511	       conference
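
   A minimal Python sketch of how a streamMux configured as in step 1
   (n=1, m=1, exclusive=true) might pick streams for each output.  The
   function name, the data layout (a mapping from participant name to
   stream name), and the choice to raise an error when fewer than n
   streams are available are all assumptions; this document does not
   define that last case.

```python
from typing import Dict, List


def stream_mux(streams: Dict[str, str], priority: List[str],
               n: int = 1, m: int = 1,
               exclusive: bool = True) -> Dict[str, List[str]]:
    """For every participant, choose at most m input streams in
    priority order.  With exclusive=True a participant never receives
    their own stream."""
    outputs: Dict[str, List[str]] = {}
    for viewer in streams:
        candidates = [streams[name] for name in priority
                      if name in streams
                      and not (exclusive and name == viewer)]
        chosen = candidates[:m]
        if len(chosen) < n:
            # Assumption: treat "fewer than n" as an error.
            raise ValueError(
                f"fewer than {n} streams available for {viewer}")
        outputs[viewer] = chosen
    return outputs


video = {"alice": "v-alice", "bob": "v-bob", "carol": "v-carol"}
print(stream_mux(video, ["bob", "alice", "carol"]))
# prints {'alice': ['v-bob'], 'bob': ['v-alice'], 'carol': ['v-bob']}
```

   Here "bob" is the highest-priority speaker, so everyone else sees
   bob while bob sees the next speaker in the list.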

513	   The "selectFloorHolders" operator takes an input bundle and a
514	   mandatory attribute which names the floor, and generates an unordered
515	   list of names of streams which have been granted the named floor.
516	   With this additional operator you can build the floor controlled
517	   audio collection in Section 4.3 and the floor controlled video
518	   collection in Section 4.4.

520	   The "volume" operator takes an audio bundle and generates an audio
521	   bundle which has been adjusted to modify the volume of all streams
522	   according to the attributes provided.  Either a qualitative or
523	   quantitative attribute can be provided.  The quantitative attribute
524	   is an integer percentage of the input volume.  The
525	   qualitative attributes are "normal", "soft", "softer", "very soft",
526	   "loud", "louder", and "very loud".

528	   The "audioMix" operator takes in one or more output bundles and
529	   generates a new output bundle.  This operator preserves tags.  In
530	   other words, the output bundle contains streams for each member in
531	   the intersection of the participants in the input bundles. With these
532	   additional two operators, you can build the audio sidebar collection
533	   in Section 4.5 which addresses both sidebar and coaching scenarios.

535	   The "tile" operator takes at least one input video bundle and an
536	   ordered list of names of streams.  It generates a video output bundle
537	   where each output stream consists of tiled windows with a fixed
538	   orientation and in priority order as described in Appendix A.  One
539	   attribute to this operator selects the number of tiles, and another
540	   selects if the tile operator is an exclusive or non-exclusive mix.
541	   If an exclusive operator is chosen, whenever a tile would display the
542	   input of the current participant the next video source is selected
543	   instead from the ordered list.  Bundles can be connected to a
544	   specific tile of the tile operator. For example, tile 4 may be
545	   connected to a bundle which shows one of the current floor holders,
546	   or to a stream corresponding to a named participant in an input
547	   bundle. With this additional operator, you can build a fixed tile
548	   continuous presence video layout.
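
   The exclusive selection rule and per-tile connections can be
   sketched as follows.  This is an illustrative Python sketch only:
   the function name, the 1-based "pinned" mapping, leaving unfilled
   tiles empty (None), and keeping pinned sources out of the remaining
   tiles are assumptions that this document does not state.

```python
from typing import Dict, List, Optional


def tile_layout(priority: List[str], tiles: int, viewer: str,
                exclusive: bool = True,
                pinned: Optional[Dict[int, str]] = None
                ) -> List[Optional[str]]:
    """Fill `tiles` windows for one viewer in priority order.
    `pinned` maps a 1-based tile number to a fixed source."""
    pinned = pinned or {}
    candidates = iter(
        name for name in priority
        if not (exclusive and name == viewer)
        and name not in pinned.values())
    layout: List[Optional[str]] = []
    for number in range(1, tiles + 1):
        if number in pinned:
            layout.append(pinned[number])
        else:
            layout.append(next(candidates, None))
    return layout


order = ["alice", "bob", "carol", "dave"]
print(tile_layout(order, tiles=3, viewer="alice"))
# prints ['bob', 'carol', 'dave']
print(tile_layout(order, tiles=3, viewer="alice", pinned={2: "erin"}))
# prints ['bob', 'erin', 'carol']
```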

550	      Is there any way to do this with one input bundle and set or list
551	      manipulation?  Possibly use weighted lists or position-based
552	      manipulation?  We should be able to use setSubtraction and/or
553	      subSets to enable this functionality.

555	   The "autotile" operator dynamically selects a number of tiles between
556	   a minimum and maximum number of streams and incorporates them in a
557	   tiled layout automatically. Like the tile operator, this operator can
558	   be exclusive or non-exclusive and specific bundles may be connected
559	   to specific tiles. With this additional operator, you can build an
560	   automatically tiled continuous presence video layout.

562	   In addition to those operators just listed, future versions of this
563	   document will contain additional standard operators.  Some other
564	   operators for consideration are listed below.

566	   o  textMux

568	   o  textMuxExclusive

570	   o  explicitList

572	   o  explicitWeightedList

574	   o  sortSet

576	   o  setIntersection

578	   o  setAddition

580	   o  setSubtraction

582	   o  subSet

584	   o  volumeWeighted

586	   o  smilLayout (apply a W3C SMIL stylesheet)

588	   o  textStylesheet

590	   o  xsltLayout

592	   o  selectExplicitParticipants

594	   o  containsContributor

596	   o  doesNotContainContributor

598	   o  crossFade

600	   o  invertSet

602	   o  playUrl

604	   o  selectLast

606	   o  selectLoudest
607	   o  selectLongest

609	   o  stereo2mono

611	   o  pan

613	   o  text2speech

615	   o  speech2text

617	   o  speech2gesture

619	   o  speech2signlanguage

621	4. More about Collections

623	   To create a new collection, a client defines a list of "connectors"
624	   which form the interface between the collection and external graphs.
625	   These connectors are strongly typed as input or output bundles or
626	   sets, and may be further restricted to media type. Then the
627	   "interior" subgraph is created by connecting operators and these
628	   connectors to each other. It is even possible to make use of existing
629	   collections inside a collection, although this makes loop detection
630	   more difficult for the server. Once a new collection is defined, the
631	   XML description is stored on the conference policy server as a
632	   collection template. These are stored in a context completely removed
633	   from individual conferences. Templates persist until they are
634	   removed.

636	   Collections are instantiated just like operators. In some cases
637	   however, the conference policy server may hide the internal structure
638	   of a collection. Also, some conference policy servers may choose to
639	   implement only collections (individual operators cannot be
640	   instantiated). Conference policy servers MUST implement at least one
641	   standard collection for each media type they support. Of course they
642	   MAY implement as many other standard or vendor-specific collections
643	   as desired.

645	   Below we list some of these standard collections.  For each
646	   collection we give a short textual description and describe the media
647	   topology subgraph which describes the behavior of that collection.

649	   o  basicAudioCollection (see Section 4.1)

651	   o  basicMpVideoCollection (see Section 4.2)

653	   o  sidebarAudioCollection (see Section 4.5)
654	   o  audioStreamSelectionCollection

656	   o  videoStreamSelectionCollection

658	   o  basicTextCollection

660	   o  textWithStylesheetCollection

662	   o  smilLayoutVideoCollection

664	   o  stereoAudioCollection

666	   And a subset of these collections which are floor control enabled...

668	   o  audioWithFloorControlCollection (see Section 4.3)

670	   o  mpVideoWithFloorControlCollection (see Section 4.4)

672	   o  audioStreamSelectionWithFloorControlCollection

674	   o  videoStreamSelectionWithFloorControlCollection

676	   o  textWithFloorControlCollection

678	   o  textWithStylesheetWithFloorControlCollection

680	4.1 The Basic Audio Collection

682	   <connectionTemplate name="basicAudioCollection">
683	     <connectors>
684	       <connector name="input" type="bundle"
685	           media="audio" direction="in"/>
686	       <connector name="output" type="bundle"
687	           media="audio" direction="out"/>
688	     </connectors>
689	     <operators>
690	       <operator type="audioSelectSpeakers"/>
691	       <operator type="audioMixMinus"/>
692	     </operators>
693	     <connections>
694	       <connection>
695	         <from element="connector" name="input"/>
696	         <to element="operator" type="audioSelectSpeakers"/>
697	       </connection>
698	       <connection>
699	         <from element="connector" name="input"/>
700	         <to element="operator" type="audioMixMinus"/>

702	       </connection>
703	       <connection>
704	         <from element="operator" type="audioSelectSpeakers"/>
705	         <to element="operator" type="audioMixMinus"/>
706	       </connection>
707	       <connection>
708	         <from element="operator" type="audioMixMinus"/>
709	         <to element="connector" name="output"/>
710	       </connection>
711	     </connections>
712	   </connectionTemplate>

714	4.2 Basic Video MP Collection

716	   <connectionTemplate name="basicMpVideoCollection">
717	     <connectors>
718	       <connector name="in.audio" type="bundle"
719	           media="audio" direction="in"/>
720	       <connector name="in.video" type="bundle"
721	           media="video" direction="in"/>
722	       <connector name="output" type="bundle"
723	           media="video" direction="out"/>
724	     </connectors>
725	     <operators>
726	       <operator type="allParticipants"/>
727	       <operator type="videoSelectSpeakers"/>
728	       <operator type="setIntersection"/>
729	       <operator type="streamMux" n="1" m="1" exclusive="true"/>
730	     </operators>
731	     <connections>
732	       <connection>
733	         <from element="connector" name="in.audio"/>
734	         <to element="operator" type="videoSelectSpeakers"/>
735	       </connection>
736	       <connection>
737	         <from element="connector" name="in.video"/>
738	         <to element="operator" type="allParticipants"/>
739	       </connection>
740	       <connection>
741	         <from element="connector" name="in.video"/>
742	         <to element="operator" type="streamMux"/>
743	       </connection>
744	       <connection>
745	         <from element="operator" type="videoSelectSpeakers"/>
746	         <to element="operator" type="setIntersection"
747	             port="ordered"/>
748	       </connection>
749	       <connection>
750	         <from element="operator" type="allParticipants"/>
751	         <to element="operator" type="setIntersection"
752	             port="unordered"/>
753	       </connection>
754	       <connection>
755	         <from element="operator" type="setIntersection"/>
756	         <to element="operator" type="streamMux"/>
757	       </connection>
758	       <connection>
759	         <from element="operator" type="streamMux"/>
760	         <to element="connector" name="output"/>
761	       </connection>
762	     </connections>
763	   </connectionTemplate>

765	4.3 Basic Audio Collection with Floor Control

767	      OPEN ISSUE: How do we pass parameters (like the name of the floor)
768	      into the interior of a collection?

770	   <connectionTemplate name="audioWithFloorControlCollection">
771	     <connectors>
772	       <connector name="input" type="bundle"
773	           media="audio" direction="in"/>
774	          <parameter name="floor" value="$floor"/>
775	       <connector name="output" type="bundle"
776	           media="audio" direction="out"/>
777	     </connectors>
778	     <operators>
779	       <operator type="audioSelectSpeakers"/>
780	       <operator type="selectFloorHolders" floor="$floor"/>
781	       <operator type="setIntersection"/>
782	       <operator type="audioMixMinus"/>
783	     </operators>
784	     <connections>
785	       <connection>
786	         <from element="connector" name="input"/>
787	         <to element="operator" type="audioSelectSpeakers"/>
788	       </connection>
789	       <connection>
790	         <from element="connector" name="input"/>
791	         <to element="operator" type="selectFloorHolders"/>

793	       </connection>
794	       <connection>
795	         <from element="connector" name="input"/>
796	         <to element="operator" type="audioMixMinus"/>
797	       </connection>
798	       <connection>
799	         <from element="operator" type="audioSelectSpeakers"/>
800	         <to element="operator" type="setIntersection"
801	             port="ordered"/>
802	       </connection>
803	       <connection>
804	         <from element="operator" type="selectFloorHolders"/>
805	         <to element="operator" type="setIntersection"
806	             port="unordered"/>
807	       </connection>
808	       <connection>
809	         <from element="operator" type="setIntersection"/>
810	         <to element="operator" type="audioMixMinus"/>
811	       </connection>
812	       <connection>
813	         <from element="operator" type="audioMixMinus"/>
814	         <to element="connector" name="output"/>
815	       </connection>
816	     </connections>
817	   </connectionTemplate>

819	4.4 Basic Video Collection with Floor Control

821	   <connectionTemplate name="mpVideoWithFloorControlCollection">
822	     <connectors>
823	       <connector name="in.audio" type="bundle"
824	           media="audio" direction="in"/>
825	       <connector name="in.video" type="bundle"
826	           media="video" direction="in"/>
827	          <parameter name="floor" value="$floor"/>
828	       <connector name="output" type="bundle"
829	           media="video" direction="out"/>
830	     </connectors>
831	     <operators>
832	       <operator type="allParticipants"/>
833	       <operator type="selectFloorHolders" floor="$floor"/>
834	       <operator type="videoSelectSpeakers"/>
835	       <operator type="setIntersection" instance="1"/>
836	       <operator type="setIntersection" instance="2"/>
837	       <operator type="streamMux" n="1" m="1" exclusive="true"/>

839	     </operators>
840	     <connections>
841	       <connection>
842	         <from element="connector" name="in.audio"/>
843	         <to element="operator" type="videoSelectSpeakers"/>
844	       </connection>
845	       <connection>
846	         <from element="connector" name="in.video"/>
847	         <to element="operator" type="allParticipants"/>
848	       </connection>
849	       <connection>
850	         <from element="connector" name="in.video"/>
851	         <to element="operator" type="streamMux"/>
852	       </connection>
853	       <connection>
854	         <from element="operator" type="videoSelectSpeakers"/>
855	         <to element="operator" type="setIntersection"
856	             port="ordered" instance="1"/>
857	       </connection>
858	       <connection>
859	         <from element="operator" type="allParticipants"/>
860	         <to element="operator" type="setIntersection" instance="2"/>
861	       </connection>
862	       <connection>
863	         <from element="operator" type="selectFloorHolders"/>
864	         <to element="operator" type="setIntersection" instance="2"/>
865	       </connection>
866	       <connection>
867	         <from element="operator" type="setIntersection" instance="2"/>
868	         <to element="operator" type="setIntersection"
869	             port="unordered" instance="1"/>
870	       </connection>
871	       <connection>
872	         <from element="operator" type="setIntersection" instance="1"/>
873	         <to element="operator" type="streamMux"/>
874	       </connection>
875	       <connection>
876	         <from element="operator" type="streamMux"/>
877	         <to element="connector" name="output"/>
878	       </connection>
879	     </connections>
880	   </connectionTemplate>

882	4.5 Sidebar Audio Collection
883	   <connectionTemplate name="sidebarAudioCollection">
884	     <connectors>
885	       <connector name="in.thisconf" type="bundle"
886	           media="audio" direction="in"/>
887	       <connector name="in.mainconf" type="bundle"
888	           media="audio" direction="in"/>
889	          <parameter name="volume" value="$vol"/>
890	       <connector name="output" type="bundle"
891	           media="audio" direction="out"/>
892	     </connectors>
893	     <operators>
894	       <operator type="volume" level="$vol"/>
895	       <operator type="audioSelectSpeakers"/>
896	       <operator type="audioMixMinus"/>
897	       <operator type="audioMix"/>
898	     </operators>
899	     <connections>
900	       <connection>
901	         <from element="connector" name="in.thisconf"/>
902	         <to element="operator" type="audioSelectSpeakers"/>
903	       </connection>
904	       <connection>
905	         <from element="connector" name="in.mainconf"/>
906	         <to element="operator" type="volume"/>
907	       </connection>
908	       <connection>
909	         <from element="operator" type="audioSelectSpeakers"/>
910	         <to element="operator" type="audioMixMinus"/>
911	       </connection>
912	       <connection>
913	         <from element="operator" type="audioMixMinus"/>
914	         <to element="operator" type="audioMix"/>
915	       </connection>
916	       <connection>
917	         <from element="operator" type="volume"/>
918	         <to element="operator" type="audioMix"/>
919	       </connection>
920	       <connection>
921	         <from element="operator" type="audioMix"/>
922	         <to element="connector" name="output"/>
923	       </connection>
924	     </connections>
925	   </connectionTemplate>

927	5. Semantics

929	5.1 Transactions

931	   Manipulations of a "live" media topology graph are performed as
932	   transactions. This ensures that the media graph transitions from one
933	   consistent state to another. It should never be in a partially
934	   connected or disconnected state. Loop detection is always performed
935	   by the server before a transaction is accepted.
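
   This document does not say how a server detects loops.  One common
   approach, sketched here in Python under that caveat, is a
   depth-first search over the directed connections.  The function
   name and the (from, to) tuple representation are assumptions; the
   node names are borrowed from Section 4.1.

```python
from typing import Dict, Iterable, List, Tuple


def has_loop(connections: Iterable[Tuple[str, str]]) -> bool:
    """Return True if the directed (from, to) connections form a cycle."""
    graph: Dict[str, List[str]] = {}
    for src, dst in connections:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    UNSEEN, ACTIVE, DONE = 0, 1, 2
    state = {node: UNSEEN for node in graph}

    def visit(node: str) -> bool:
        state[node] = ACTIVE
        for nxt in graph[node]:
            if state[nxt] == ACTIVE:
                return True          # back edge: a loop exists
            if state[nxt] == UNSEEN and visit(nxt):
                return True
        state[node] = DONE
        return False

    return any(state[node] == UNSEEN and visit(node) for node in graph)


basic_audio = [("input", "audioSelectSpeakers"),
               ("input", "audioMixMinus"),
               ("audioSelectSpeakers", "audioMixMinus"),
               ("audioMixMinus", "output")]
print(has_loop(basic_audio))
# prints False
print(has_loop(basic_audio + [("audioMixMinus", "audioSelectSpeakers")]))
# prints True
```

   A server following this approach would run the check on the graph
   as it would look after the transaction, and reject the transaction
   if the check returns True.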

937	   Note that operators are automatically deleted unless they have at
938	   least one input connection and at least one output connection. As a
939	   result, a transaction which instantiates an operator must also give
940	   it at least one input and one output connection; otherwise the
941	   operator would be deleted and adding it would have no effect.
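
   The automatic deletion rule can be sketched in Python as a pruning
   pass.  The function name and data layout are hypothetical.  The
   sketch also assumes that removing an operator removes its
   connections, which can leave further operators without an input or
   an output, so the pass repeats until nothing changes.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str]   # (from, to)


def prune_operators(operators: Set[str],
                    connections: List[Edge]
                    ) -> Tuple[Set[str], List[Edge]]:
    """Drop operators lacking an input or an output connection."""
    operators = set(operators)
    connections = list(connections)
    while True:
        has_input = {dst for _, dst in connections}
        has_output = {src for src, _ in connections}
        dangling = {op for op in operators
                    if op not in has_input or op not in has_output}
        if not dangling:
            return operators, connections
        operators -= dangling
        connections = [(src, dst) for src, dst in connections
                       if src not in dangling and dst not in dangling]


ops, conns = prune_operators(
    {"audioMix", "volume"},
    [("in", "audioMix"), ("audioMix", "out"), ("in", "volume")])
print(sorted(ops), conns)
# prints ['audioMix'] [('in', 'audioMix'), ('audioMix', 'out')]
```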

943	   A transaction encloses one or more topology graph manipulations which
944	   must all succeed or all fail. Within the transaction, individual
945	   steps consist of either creating or instantiating elements or
946	   connecting them together.  Note that there is an important
947	   distinction between groups/aliases and collections/operators.
948	   Groups and aliases are created (they don't exist before they are
949	   created), while collections and operators are instantiated (a copy of
950	   the original is placed in the media topology graph).
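
   The all-or-nothing property can be sketched in Python by applying
   every step to a scratch copy and publishing the copy only if all
   steps succeed.  This is one possible approach, not a mechanism
   defined by this document; the names and the dictionary-based graph
   are hypothetical.

```python
import copy
from typing import Callable, Dict, List, Tuple

Graph = Dict[str, list]
Step = Callable[[Graph], None]


def apply_transaction(graph: Graph,
                      steps: List[Step]) -> Tuple[Graph, bool]:
    """Return (new_graph, True) if every step succeeds, otherwise
    (original_graph, False).  The original is never modified."""
    scratch = copy.deepcopy(graph)
    try:
        for step in steps:
            step(scratch)
    except Exception:
        return graph, False
    return scratch, True


def add_mixer(g: Graph) -> None:
    g["operators"].append("audioMix")


def failing_step(g: Graph) -> None:
    raise ValueError("simulated failure")


live = {"operators": [], "connections": []}
print(apply_transaction(live, [add_mixer]))
# prints ({'operators': ['audioMix'], 'connections': []}, True)
print(apply_transaction(live, [add_mixer, failing_step]))
# prints ({'operators': [], 'connections': []}, False)
```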

952	   While nearly any RPC-style protocol could be used to express media
953	   policy transactions, this document describes an XCAP [2] profile for
954	   manipulating media policy. XCAP is a usage of HTTP [5] which uses
955	   XPath [6] to address fragments of an XML document in the Request URI.
956	   Two XML schemas are defined--one for managing collections for later
957	   use, and another for real-time manipulation of media policy graphs.

959	      Note that support for transactions is currently an open issue in
960	      XCAP.

962	5.2 Client Behavior

964	   To query the media policy for a particular conference, a client
965	   merely fetches the media policy document (or document fragment) of
966	   interest.  In some cases the document will be filtered to remove
967	   hidden or private information.  Similarly, if the client is
968	   authorized, it can view the internal structure of a collection
969	   template by just fetching its definition document. When filtered, a
970	   collection template may just describe the connectors associated with
971	   it and a textual description.

973	   A client connects a stream to a group merely by writing the stream
974	   into the appropriate group structure in the target conference or
975	   subconference. Likewise a client disconnects a stream by deleting the
976	   stream from the appropriate group structure. The client permissions
977	   determine if this request fails, requires confirmation from the
978	   affected target, or succeeds immediately. Since a stream can only
979	   exist in one group at a time, if a write operation succeeds and the
980	   stream is already connected it results in a reassignment rather than
981	   the same stream in multiple groups.
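
   The reassignment rule can be sketched in Python.  The function
   name, the mapping of group names to sets of streams, and the group
   names themselves are hypothetical; authorization and confirmation
   are left out.

```python
from typing import Dict, Set


def connect_stream(groups: Dict[str, Set[str]],
                   stream: str, target: str) -> None:
    """Write `stream` into `target`, removing it from any group that
    already holds it, so it is never in two groups at once."""
    for members in groups.values():
        members.discard(stream)
    groups.setdefault(target, set()).add(stream)


groups = {"main.audio.in": {"alice", "bob"}, "sidebar-1.audio.in": set()}
connect_stream(groups, "alice", "sidebar-1.audio.in")
print(groups)
# prints {'main.audio.in': {'bob'}, 'sidebar-1.audio.in': {'alice'}}
```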

983	   To instantiate a new operator or collection, just append an XML
984	   fragment of code which describes the parameters for that operator to
985	   the appropriate XPath (the operators or collections XPath). To make a
986	   connection, just append the appropriate XML fragment describing that
987	   connection to the connections XPath. Deleting the node at an XPath
988	   removes the operator, collection, or connection. Once a connection is
989	   removed, one or more operators may be automatically deleted.
990	   Likewise, when an operator is deleted, all its connections are
991	   deleted as well. These simple mechanisms allow authorized clients to
992	   perform arbitrary manipulations of the media topology.

994	   Finally, to create a new collection, the client writes an XML
995	   description of the collection into the collectionTemplates XPath.

997	5.3 Server Behavior

999	   Servers must maintain a list of all operator and collection types
1000	   that can be used by Clients within a conference. Servers must return
1001	   such a list to all authorized Clients in response to inventory
1002	   queries. For operators and collections that have parameters, a list
1003	   of acceptable parameter values must also be specified for each
1004	   parameter.

1006	   For each transaction received by the Server it must proceed with the
1007	   steps that follow. For each request within the transaction the Server
1008	   must verify that the party initiating the request is authorized to
1009	   initiate this specific request in the context of the sub-conference
1010	   specified within the request. If the initiator is not authorized, the
1011	   Server must not execute any part of the transaction and return the
1012	   appropriate "Authorization Failure" response to the initiator. For
1013	   example, suppose user A requests to connect the input audio stream of
1014	   user B to group X in sub-conference "sidebar-1" and the output audio
1015	   stream of user B to group Y in sub-conference "sidebar-1". The Server
1016	   must verify that user A is authorized to manipulate the media policy
1017	   of user B and is authorized to manipulate "sidebar-1".

1019	   For each request the Server must verify that any changes in the media
1020	   policy of any participant as a result of the execution of the request
1021	   are authorized by the conference policy. If any party is not
1022	   authorized for the media policy changes that result from the
1023	   execution of any request within the transaction then the server must
1024	   not execute any part of the transaction and return the appropriate
1025	   "Authorization Failure" response to the initiator. In the example
1026	   used in the previous point, the Server must verify that user B is
1027	   authorized to join "sidebar-1".

1029	   The Server should verify that all requests to instantiate, create
1030	   and/or connect elements conform to the XML schema and
1031	   descriptions of the elements. If any request does not conform to the
1032	   XML schema of the elements that it is operating on then the Server
1033	   must not execute any part of the transaction and return the
1034	   appropriate "XML Schema Error" response to the initiator. For
1035	   example, an operator that takes one video input bundle cannot be
1036	   connected to an audio bundle.

1038	   The Server should verify that all the relevant mixers have enough
1039	   resources to perform the actual media processing required as a result
1040	   of the execution of the transaction. If not enough resources are
1041	   available the Server must not execute any part of the transaction and
1042	   return the appropriate "No Available Resources" response to the
1043	   initiator. Note that resources needed for trans-coding and
1044	   trans-rating should be accounted for. Editor's Note: More details and
1045	   some examples need to be provided to explain this section and
1046	   specifically the last bullet.
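
   The checks in this section share one pattern: if any request in the
   transaction fails a check, no part of the transaction is executed
   and a single response is returned.  A rough Python sketch of that
   pattern follows.  The response strings come from this section; the
   function name and the boolean request fields are hypothetical
   stand-ins for the real authorization, schema, and resource logic.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Request = Dict[str, bool]
Check = Tuple[Callable[[Request], bool], str]

# Hypothetical request fields, checked in the order this section
# lists them.
CHECKS: List[Check] = [
    (lambda r: r["initiator_authorized"], "Authorization Failure"),
    (lambda r: r["policy_allows_change"], "Authorization Failure"),
    (lambda r: r["schema_valid"], "XML Schema Error"),
    (lambda r: r["resources_available"], "No Available Resources"),
]


def first_failure(requests: Sequence[Request]) -> Optional[str]:
    """Return the response for the first failed check, or None if
    every request in the transaction passes every check."""
    for passes, response in CHECKS:
        for request in requests:
            if not passes(request):
                return response
    return None


ok = {"initiator_authorized": True, "policy_allows_change": True,
      "schema_valid": True, "resources_available": True}
bad_schema = dict(ok, schema_valid=False)
print(first_failure([ok, ok]))
# prints None
print(first_failure([ok, bad_schema]))
# prints XML Schema Error
```

   A return value of None means the transaction may be executed; any
   other value is sent to the initiator and the graph is left
   untouched.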

1048	5.4 Notifications of media policy changes

1050	   Media topology changes should result in an appropriate
1051	   protocol-specific notification to those (authorized) parties who have
1052	   requested (subscribed for) them. In the case of SIP, this
1053	   notification will be a notification from the SIP conference package,
1054	   but will send an application/media-policy+xml MIME type in the
1055	   notification body in addition to, or instead of, the basic roster
1056	   information normally provided by that event package. Note that the
1057	   protocol should allow hidden transactions for which no notifications
1058	   will be sent as a result of the media policy change.

1060	   Editor's Note: Need to describe how pending operations are handled
1061	   with notifications.

1063	6. Formal Syntax

1065	   Below is an XCAP encoding (using XML Schema) for media-topology
1066	   manipulation of an active conference (or subconference):

1068	   <?xml version="1.0" encoding="UTF-8"?>
1069	   <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
1070	      <xs:element name="media-policy">
1071	         <xs:complexType>
1072	            <xs:sequence>
1073	               <xs:element maxOccurs="1" minOccurs="0"
1074	                           name="groups" type="groupsType"/>
1075	               <xs:element maxOccurs="1" minOccurs="0"
1076	                           name="collections" type="collectionsType"/>
1077	               <xs:element maxOccurs="1" minOccurs="0"
1078	                           name="operators" type="operatorsType"/>
1079	               <xs:element maxOccurs="1" minOccurs="0"
1080	                           name="connections" type="connectionsType"/>
1081	            </xs:sequence>
1082	         </xs:complexType>
1083	      </xs:element>
1084	      <xs:complexType name="groupsType">
1085	         <xs:sequence>
1086	            <xs:element maxOccurs="unbounded" minOccurs="0"
1087	                        name="group" type="groupType"/>
1088	         </xs:sequence>
1089	      </xs:complexType>
1090	      <xs:complexType name="groupType">
1091	         <xs:sequence>
1092	            <xs:element maxOccurs="unbounded" minOccurs="0"
1093	                        name="stream" type="streamType"/>
1094	         </xs:sequence>
1095	         <xs:attribute name="direction" use="required">
1096	            <xs:simpleType>
1097	               <xs:restriction base="xs:string">
1098	                  <xs:enumeration value="in"/>
1099	                  <xs:enumeration value="out"/>
1100	               </xs:restriction>
1101	            </xs:simpleType>
1102	         </xs:attribute>
1103	         <xs:attribute name="media" use="required">
1104	            <xs:simpleType>
1105	               <xs:restriction base="xs:string">
1106	                  <xs:enumeration value="audio"/>
1107	                  <xs:enumeration value="video"/>
1108	                  <xs:enumeration value="text"/>
1109	                  <!-- add extensibility of the media type? -->
1110	               </xs:restriction>
1111	            </xs:simpleType>
1112	         </xs:attribute>
1113	      </xs:complexType>
1114	      <xs:complexType name="streamType">
1115	         <xs:sequence>
1116	            <xs:element maxOccurs="unbounded" minOccurs="0"
1117	                        name="alias" type="xs:string"/>
1118	         </xs:sequence>
1119	         <xs:attribute name="fam" use="required">
1120	            <xs:simpleType>
1121	               <xs:restriction base="xs:string">
1122	                  <xs:enumeration value="ipv4"/>
1123	                  <xs:enumeration value="ipv6"/>
1124	               </xs:restriction>
1125	            </xs:simpleType>
1126	         </xs:attribute>
1127	         <xs:attribute name="addr" type="xs:string" use="required"/>
1128	         <xs:attribute name="proto" type="xs:string" use="optional"/>
1129	         <xs:attribute name="sock" type="xs:integer" use="optional"/>
1130	         <xs:attribute name="demux" type="xs:string"/>
1131	      </xs:complexType>
1132	      <xs:complexType name="collectionsType">
1133	         <xs:sequence>
1134	            <xs:element maxOccurs="unbounded" minOccurs="0"
1135	                        name="collection" type="operatorType"/>
1136	         </xs:sequence>
1137	      </xs:complexType>
1138	      <xs:complexType name="operatorsType">
1139	         <xs:sequence>
1140	            <xs:element maxOccurs="unbounded" minOccurs="0"
1141	                        name="operator" type="operatorType"/>
1142	         </xs:sequence>
1143	      </xs:complexType>
1144	      <xs:complexType name="operatorType">
1145	         <xs:attribute name="type" type="xs:string" use="required"/>
1146	         <xs:anyAttribute namespace="##other" processContents="lax"/>
1147	      </xs:complexType>
1148	      <xs:complexType name="connectionsType">
1149	         <xs:sequence>
1150	            <xs:element maxOccurs="unbounded" minOccurs="0"
1151	                        name="connection" type="connectionType"/>
1152	         </xs:sequence>
1153	      </xs:complexType>
1154	      <xs:complexType name="connectionType">
1155	         <xs:sequence>
1156	            <xs:element maxOccurs="1" minOccurs="1"
1157	                        name="from" type="connectType"/>
1158	            <xs:element maxOccurs="1" minOccurs="1"
1159	                        name="to" type="connectType"/>
1160	         </xs:sequence>
1161	      </xs:complexType>
1162	      <xs:complexType name="connectType">
1163	         <xs:attribute name="element" use="required">
1164	            <xs:simpleType>
1165	               <xs:restriction base="xs:string">
1166	                  <xs:enumeration value="group"/>
1167	                  <xs:enumeration value="collection"/>
1168	                  <xs:enumeration value="operator"/>
1169	               </xs:restriction>
1170	            </xs:simpleType>
1171	         </xs:attribute>
1172	         <xs:attribute name="type" type="xs:string" use="optional"/>
1173	         <xs:attribute name="conf" type="xs:string" use="optional"/>
1174	         <xs:attribute name="media" use="optional">
1175	            <xs:simpleType>
1176	               <xs:restriction base="xs:string">
1177	                  <xs:enumeration value="audio"/>
1178	                  <xs:enumeration value="video"/>
1179	                  <xs:enumeration value="text"/>
1180	               </xs:restriction>
1181	            </xs:simpleType>
1182	         </xs:attribute>
1183	         <xs:attribute name="direction" use="optional">
1184	            <xs:simpleType>
1185	               <xs:restriction base="xs:string">
1186	                  <xs:enumeration value="in"/>
1187	                  <xs:enumeration value="out"/>
1188	               </xs:restriction>
1189	            </xs:simpleType>
1190	         </xs:attribute>
1191	         <xs:attribute name="port" type="xs:string" use="optional"/>
1192	         <xs:attribute name="instance" type="xs:string" use="optional"/>
1193	      </xs:complexType>
1194	   </xs:schema>
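
   As a rough illustration of the "groups" part of this schema, the
   following Python sketch builds a small instance document with the
   standard xml.etree.ElementTree module.  The addresses, the "udp"
   value, the alias names, and the reading of "sock" as a port number
   are assumptions made for the example; the sketch does not validate
   the result against the schema.

```python
import xml.etree.ElementTree as ET


def build_policy(streams):
    """Build a <media-policy> element holding one audio input group.
    `streams` is a list of (address, port, alias) tuples."""
    policy = ET.Element("media-policy")
    groups = ET.SubElement(policy, "groups")
    group = ET.SubElement(groups, "group",
                          {"direction": "in", "media": "audio"})
    for addr, port, alias in streams:
        stream = ET.SubElement(group, "stream",
                               {"fam": "ipv4", "addr": addr,
                                "proto": "udp", "sock": str(port)})
        ET.SubElement(stream, "alias").text = alias
    return policy


doc = build_policy([("192.0.2.10", 49170, "alice"),
                    ("192.0.2.11", 49172, "bob")])
print(ET.tostring(doc, encoding="unicode"))
```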

1196	   And here is an XML schema for describing collection templates:

1198	   <?xml version="1.0" encoding="UTF-8"?>
1199	   <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
1200	      <xs:element name="collectionTemplates">
1201	         <xs:complexType>
1202	            <xs:sequence>
1203	               <xs:element maxOccurs="1" minOccurs="0"
1204	                   name="connectors" type="connectorsType"/>
1205	               <xs:element maxOccurs="1" minOccurs="0"
1206	                   name="collections" type="collectionsType"/>
1207	               <xs:element maxOccurs="1" minOccurs="0"
1208	                   name="operators" type="operatorsType"/>
1209	               <xs:element maxOccurs="1" minOccurs="0"
1210	                   name="connections" type="connectionsType"/>

1212	            </xs:sequence>
1213	            <xs:attribute name="name" type="xs:string"/>
1214	         </xs:complexType>
1215	      </xs:element>
1216	      <xs:complexType name="connectorsType">
1217	         <xs:sequence>
1218	            <xs:element maxOccurs="unbounded" minOccurs="0"
1219	                name="connector" type="connectorType"/>
1220	         </xs:sequence>
1221	      </xs:complexType>
1222	      <xs:complexType name="connectorType">
1223	         <xs:attribute name="name" type="xs:string" use="required"/>
1224	         <xs:attribute name="type" use="required">
1225	            <xs:simpleType>
1226	               <xs:restriction base="xs:string">
1227	                  <xs:enumeration value="bundle"/>
1228	                  <xs:enumeration value="set"/>
1229	               </xs:restriction>
1230	            </xs:simpleType>
1231	         </xs:attribute>
1232	         <xs:attribute name="direction" use="required">
1233	            <xs:simpleType>
1234	               <xs:restriction base="xs:string">
1235	                  <xs:enumeration value="in"/>
1236	                  <xs:enumeration value="out"/>
1237	               </xs:restriction>
1238	            </xs:simpleType>
1239	         </xs:attribute>
1240	      </xs:complexType>
1241	      <xs:complexType name="collectionsType">
1242	         <xs:sequence>
1243	            <xs:element maxOccurs="unbounded" minOccurs="0"
1244	                name="collection" type="operatorType"/>
1245	         </xs:sequence>
1246	      </xs:complexType>
1247	      <xs:complexType name="operatorsType">
1248	         <xs:sequence>
1249	            <xs:element maxOccurs="unbounded" minOccurs="0"
1250	                name="operator" type="operatorType"/>
1251	         </xs:sequence>
1252	      </xs:complexType>
1253	      <xs:complexType name="operatorType">
1254	         <xs:attribute name="type" type="xs:string" use="required"/>
1255	         <xs:anyAttribute namespace="##other" processContents="lax"/>
1256	      </xs:complexType>
1257	      <xs:complexType name="connectionsType">
1258	         <xs:sequence>
1259	            <xs:element maxOccurs="unbounded" minOccurs="0"
1260	                name="connection" type="connectionType"/>
1261	         </xs:sequence>
1262	      </xs:complexType>
1263	      <xs:complexType name="connectionType">
1264	         <xs:sequence>
1265	            <xs:element maxOccurs="1" minOccurs="1"
1266	                name="to" type="connectType"/>
1267	            <xs:element maxOccurs="1" minOccurs="1"
1268	                name="from" type="connectType"/>
1269	         </xs:sequence>
1270	      </xs:complexType>
1271	      <xs:complexType name="connectType">
1272	         <xs:attribute name="element" use="required">
1273	            <xs:simpleType>
1274	               <xs:restriction base="xs:string">
1275	                  <xs:enumeration value="connector"/>
1276	                  <xs:enumeration value="collection"/>
1277	                  <xs:enumeration value="operator"/>
1278	               </xs:restriction>
1279	            </xs:simpleType>
1280	         </xs:attribute>
1281	         <xs:attribute name="type" type="xs:string" use="optional"/>
1282	         <xs:attribute name="conf" type="xs:string" use="optional"/>
1283	         <xs:attribute name="media" use="optional">
1284	            <xs:simpleType>
1285	               <xs:restriction base="xs:string">
1286	                  <xs:enumeration value="audio"/>
1287	                  <xs:enumeration value="video"/>
1288	                  <xs:enumeration value="text"/>
1289	               </xs:restriction>
1290	            </xs:simpleType>
1291	         </xs:attribute>
1292	         <xs:attribute name="direction" use="optional">
1293	            <xs:simpleType>
1294	               <xs:restriction base="xs:string">
1295	                  <xs:enumeration value="in"/>
1296	                  <xs:enumeration value="out"/>
1297	               </xs:restriction>
1298	            </xs:simpleType>
1299	         </xs:attribute>
1300	         <xs:attribute name="port" type="xs:string" use="optional"/>
1301	         <xs:attribute name="instance" type="xs:string" use="optional"/>
1302	      </xs:complexType>
1303	   </xs:schema>

1305	7. Examples

1307	   Below is a diagram which shows a sample media topology (with streams,
1308	   collections, and groups) for an audio and video conference with an
1309	   audio sidebar.

1311	        Audio and Video Conference with one Audio Sidebar

1313	           (streams)              (streams)                (streams)

1315	       A B   D E F  H  J      A   C D   F G H I            B   E   J
1316	       | |   | | |  |  |      |   | |   | | | |            |   |   |
1317	       | |   | | |  |  |      |   | |   | | | |            |   |   |
1318	       V V   V V V  V  V      V   V V   V V V V            V   V   V
1319	     +------------------+   +------------------+   +-------------------+
1320	     | Main Video In    |   | Main Audio In    |   | Sidebar Audio In  |
1321	     | (group)          |   | (group)          |   | (group)           |
1322	     +------------------+   +------------------+   +-------------------+
1323	              ||            //        ||                      ||
1324	              ||           //         ||          +------+    ||
1325	              ||          //          ||          |+----+|    ||
1326	              ||         //           ||          ||    ||    ||
1327	              \/        //            \/          ||    \/    \/
1328	     ...................V.   ...................  ||  ..................
1329	     :                   :   :                 :  ||  :                :
1330	     :                   :   :                 :  ||  :                :
1331	     :   vendor          :   :   standard      :  ||  :   standard     :
1332	     :   defined         :   :   conference    :  ||  :   sidebar      :
1333	     :   video           :   :   audio         :  ||  :   audio        :
1334	     :   collection      :   :   collection    :  ||  :   collection   :
1335	     :                   :   :                 :  ||  :                :
1336	     :                   :   :                 :  ||  :                :
1337	     .....................   ...................  ||  ..................
1338	               ||                     ||     ||   ||          ||
1339	               ||                     ||     |+---+|          ||
1340	               ||                     ||     +-----+          ||
1341	               \/                     \/                      \/
1342	     +------------------+   +------------------+   +-------------------+
1343	     | Main Video Out   |   | Main Audio Out   |   | Sidebar Audio Out |
1344	     | (group)          |   | (group)          |   | (group)           |
1345	     +------------------+   +------------------+   +-------------------+
1346	       | | | | | |  |  |       |  | |  | | | |             |   |   |
1347	       | | | | | |  |  |       |  | |  | | | |             |   |   |
1348	       V V V V V V  V  V       V  V V  V V V V             V   V   V
1349	       A B C D E F  H  J       A  C D  F G H I             B   E   J

1351	           (streams)              (streams)                (streams)

1353	   Below are the media topology description documents for the
1354	   combined audio/video conference shown in the figure above.  The
1355	   first media topology is for the main conference, and the second is
1356	   for the subconference used by the audio sidebar.  Specific streams
1357	   are omitted for brevity.

1359	   <media-topology>
1360	     <groups>
1361	       <group dir="in" media="audio"/>
1362	       <group dir="out" media="audio"/>
1363	       <group dir="in" media="video"/>
1364	       <group dir="out" media="video"/>
1365	     </groups>
1366	     <collections>
1367	       <collection type="basicAudioCollection"/>
1368	       <collection type="example.com.videoCollection" size="7"/>
1369	     </collections>
1370	     <connections>
1371	       <connection>
1372	         <from element="group" direction="in" media="audio"/>
1373	         <to element="collection" type="basicAudioCollection"/>
1374	       </connection>
1375	       <connection>
1376	         <from element="group" direction="in" media="video"/>
1377	         <to element="collection" type="example.com.videoCollection"/>
1378	       </connection>
1379	       <connection>
1380	         <from element="collection" type="basicAudioCollection"/>
1381	         <to element="group" direction="out" media="audio"/>
1382	       </connection>
1383	       <connection>
1384	         <from element="collection" type="example.com.videoCollection"/>
1385	         <to element="group" direction="out" media="video"/>
1386	       </connection>
1387	     </connections>
1388	   </media-topology>
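
   The following is a minimal, non-normative sketch of how a client
   might sanity-check a topology document such as the one above.  It
   assumes that a connection endpoint with element="group" names a
   group by its direction and media, and that an endpoint with
   element="collection" names a collection by its type.  The function
   name dangling_endpoints is hypothetical and is not defined by this
   document.

```python
import xml.etree.ElementTree as ET

# Main-conference topology, copied from the example above.
MAIN_TOPOLOGY = """
<media-topology>
  <groups>
    <group dir="in" media="audio"/>
    <group dir="out" media="audio"/>
    <group dir="in" media="video"/>
    <group dir="out" media="video"/>
  </groups>
  <collections>
    <collection type="basicAudioCollection"/>
    <collection type="example.com.videoCollection" size="7"/>
  </collections>
  <connections>
    <connection>
      <from element="group" direction="in" media="audio"/>
      <to element="collection" type="basicAudioCollection"/>
    </connection>
    <connection>
      <from element="group" direction="in" media="video"/>
      <to element="collection" type="example.com.videoCollection"/>
    </connection>
    <connection>
      <from element="collection" type="basicAudioCollection"/>
      <to element="group" direction="out" media="audio"/>
    </connection>
    <connection>
      <from element="collection" type="example.com.videoCollection"/>
      <to element="group" direction="out" media="video"/>
    </connection>
  </connections>
</media-topology>
"""


def dangling_endpoints(xml_text):
    """Return (tag, attributes) for endpoints naming no declared element.

    Assumption: groups are matched on (direction, media) and
    collections on type; endpoints that carry a conf attribute point
    at another conference and are skipped.
    """
    root = ET.fromstring(xml_text)
    groups = {(g.get("dir"), g.get("media")) for g in root.iter("group")}
    collections = {c.get("type") for c in root.iter("collection")}
    dangling = []
    for connection in root.iter("connection"):
        for end in connection:  # the <from> and <to> children
            if end.get("conf") is not None:
                continue
            kind = end.get("element")
            if kind == "group":
                ok = (end.get("direction"), end.get("media")) in groups
            elif kind == "collection":
                ok = end.get("type") in collections
            else:
                ok = False
            if not ok:
                dangling.append((end.tag, dict(end.attrib)))
    return dangling


print(dangling_endpoints(MAIN_TOPOLOGY))
```

   An empty result means every endpoint in the document refers to a
   group or collection declared in the same document.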

1390	   Below is the media topology description document for the
1391	   subconference.  Note that conf=".." refers to the parent of the
1392	   current conference.

1394	   <media-topology>
1395	     <groups>
1396	       <group dir="in" media="audio"/>
1397	       <group dir="out" media="audio"/>
1398	     </groups>
1399	     <collections>
1400	       <collection type="sidebarAudioCollection" volume="soft"/>

1402	     </collections>
1403	     <connections>
1404	       <connection>
1405	         <from element="group" direction="in" media="audio"/>
1406	         <to element="collection" type="sidebarAudioCollection"
1407	             port="in.thisconf"/>
1408	       </connection>
1409	       <connection>
1410	         <from element="group" direction="out" media="audio" conf=".."/>
1411	         <to element="collection" type="sidebarAudioCollection"
1412	             port="in.mainconf"/>
1413	       </connection>
1414	       <connection>
1415	         <from element="collection" type="sidebarAudioCollection"/>
1416	         <to element="group" direction="out" media="audio"/>
1417	       </connection>
1418	     </connections>
1419	   </media-topology>
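
   As a non-normative illustration of the conf=".." convention, the
   sketch below lists each connection in the subconference document
   together with the conference that owns each endpoint.  The
   conference labels "sidebar" and "main", the helper names, and the
   printed label format are assumptions made for this example only;
   this document does not define them.

```python
import xml.etree.ElementTree as ET

# Subconference topology, copied from the example above.
SIDEBAR_TOPOLOGY = """
<media-topology>
  <groups>
    <group dir="in" media="audio"/>
    <group dir="out" media="audio"/>
  </groups>
  <collections>
    <collection type="sidebarAudioCollection" volume="soft"/>
  </collections>
  <connections>
    <connection>
      <from element="group" direction="in" media="audio"/>
      <to element="collection" type="sidebarAudioCollection"
          port="in.thisconf"/>
    </connection>
    <connection>
      <from element="group" direction="out" media="audio" conf=".."/>
      <to element="collection" type="sidebarAudioCollection"
          port="in.mainconf"/>
    </connection>
    <connection>
      <from element="collection" type="sidebarAudioCollection"/>
      <to element="group" direction="out" media="audio"/>
    </connection>
  </connections>
</media-topology>
"""


def endpoint_label(end, this_conf, parent_conf):
    """Label an endpoint as '<conference>:<element>[<port>]'."""
    conf = end.get("conf")
    if conf is None:
        owner = this_conf      # no conf attribute: current conference
    elif conf == "..":
        owner = parent_conf    # conf="..": parent of current conference
    else:
        owner = conf           # any other value is passed through as-is
    if end.get("element") == "group":
        name = "group({} {})".format(end.get("direction"), end.get("media"))
    else:
        name = end.get("type")
    port = end.get("port")
    if port:
        name += "[" + port + "]"
    return owner + ":" + name


def list_connections(xml_text, this_conf, parent_conf):
    """Return (from_label, to_label) pairs in document order."""
    root = ET.fromstring(xml_text)
    return [
        (endpoint_label(c.find("from"), this_conf, parent_conf),
         endpoint_label(c.find("to"), this_conf, parent_conf))
        for c in root.iter("connection")
    ]


for source, sink in list_connections(SIDEBAR_TOPOLOGY, "sidebar", "main"):
    print(source, "->", sink)
```

   The second connection is the only one whose source belongs to the
   parent conference: the main conference's outgoing audio group feeds
   the "in.mainconf" port of the sidebar collection.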

1421	8. Security Considerations

1423	   Much needs to be written here. Authorization rules will be discussed
1424	   in Section 5.3. Privacy and filtering rules will be discussed there
1425	   as well.

1427	9. IANA Considerations

1429	   This document defines an IANA registry of Media Operators, and
1430	   another of Media Collections.

1432	10. Acknowledgments

1434	   This work was the result of discussions among the SIP Conferencing
1435	   Design Team.

1437	Normative References

1439	   [1]   Bradner, S., "Key words for use in RFCs to Indicate Requirement
1440	         Levels", BCP 14, RFC 2119, March 1997.

1442	   [2]   Rosenberg, J., "The Extensible Markup Language (XML)
1443	         Configuration Access Protocol (XCAP)",
1444	         draft-rosenberg-simple-xcap-00 (work in progress), May 2003.

1446	   [3]   Even, R., Levin, O. and N. Ismail, "Conferencing media policy
1447	         Requirements", draft-even-xcon-media-policy-requirements-00.txt
1448	         (work in progress), June 2003.

1450	   [4]   Koskelainen, P. and H. Khartabil, "XCAP Usage for Conference
1451	         Policy Manipulation",
1452	         draft-koskelainen-xcon-xcap-cpcp-usage-00.txt (work in
1453	         progress), June 2003.

1455	   [5]   Fielding, R., Gettys, J., Mogul, J., Nielsen, H., Masinter, L.,
1456	         Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol --
1457	         HTTP/1.1", RFC 2616, June 1999.

1459	   [6]   Clark, J. and S. DeRose, "XML Path Language (XPath) Version
1460	         1.0", W3C Recommendation xpath, November 1999,
1461	         <http://www.w3.org/TR/xpath>.

1463	   [7]   Thompson, H., Beech, D., Maloney, M. and N. Mendelsohn, "XML
1464	         Schema Part 1: Structures", W3C REC-xmlschema-1, May 2001,
1465	         <http://www.w3.org/TR/xmlschema-1/>.

1467	   [8]   Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
1468	         Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP:
1469	         Session Initiation Protocol", RFC 3261, June 2002.

1471	   [9]   Jacobson, V., Perkins, C. and M. Handley, "SDP: Session
1472	         Description Protocol", draft-ietf-mmusic-sdp-new-13 (work in
1473	         progress), May 2003.

1475	   [10]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with
1476	         Session Description Protocol (SDP)", RFC 3264, June 2002.

1478	   [11]  Camarillo, G., Eriksson, G., Holler, J. and H. Schulzrinne,
1479	         "Grouping of Media Lines in the Session Description Protocol
1480	         (SDP)", RFC 3388, December 2002.

1482	   [12]  International Telecommunication Union, "CONTROL PROTOCOL FOR
1483	         MULTIMEDIA COMMUNICATION", ITU Recommendation H.245, 1998.

1485	Informative References

1487	   [13]  Rosenberg, J., "A Framework for Conferencing with the Session
1488	         Initiation Protocol",
1489	         draft-ietf-sipping-conferencing-framework-00 (work in
1490	         progress), May 2003.

1492	   [14]  Levin, O. and R. Even, "High Level Requirements for Tightly
1493	         Coupled SIP Conferencing",
1494	         draft-ietf-sipping-conferencing-requirements-00 (work in
1495	         progress), April 2003.

1497	   [15]  Rosenberg, J. and H. Schulzrinne, "A Session Initiation
1498	         Protocol (SIP) Event Package for Conference State",
1499	         draft-ietf-sipping-conference-package-00 (work in progress),
1500	         June 2002.

1502	   [16]  Even, R. and N. Ismail, "Conferencing Scenarios",
1503	         draft-even-xcon-conference-scenarios-00.txt (work in progress),
1504	         June 2003.

1506	   [17]  Koskelainen, P., "Floor Control Requirements",
1507	         draft-koskelainen-xcon-floor-control-reqs-00.txt (work in
1508	         progress), June 2003.

1510	Authors' Addresses

1512	   Rohan Mahy
1513	   Cisco Systems, Inc.
1514	   5617 Scotts Valley Drive
1515	   Scotts Valley, CA  95066
1516	   USA

1518	   EMail: rohan@cisco.com

1520	   Nermeen Ismail
1521	   Cisco Systems, Inc.
1522	   170 W Tasman Dr
1523	   San Jose, CA  95134
1524	   USA

1526	   EMail: nismail@cisco.com

1528	Appendix A. Standard Tile Order

1529	Intellectual Property Statement

1531	   The IETF takes no position regarding the validity or scope of any
1532	   intellectual property or other rights that might be claimed to
1533	   pertain to the implementation or use of the technology described in
1534	   this document or the extent to which any license under such rights
1535	   might or might not be available; neither does it represent that it
1536	   has made any effort to identify any such rights. Information on the
1537	   IETF's procedures with respect to rights in standards-track and
1538	   standards-related documentation can be found in BCP-11. Copies of
1539	   claims of rights made available for publication and any assurances of
1540	   licenses to be made available, or the result of an attempt made to
1541	   obtain a general license or permission for the use of such
1542	   proprietary rights by implementors or users of this specification can
1543	   be obtained from the IETF Secretariat.

1545	   The IETF invites any interested party to bring to its attention any
1546	   copyrights, patents or patent applications, or other proprietary
1547	   rights which may cover technology that may be required to practice
1548	   this standard. Please address the information to the IETF Executive
1549	   Director.

1551	Full Copyright Statement

1553	   Copyright (C) The Internet Society (2004). All Rights Reserved.

1555	   This document and translations of it may be copied and furnished to
1556	   others, and derivative works that comment on or otherwise explain it
1557	   or assist in its implementation may be prepared, copied, published
1558	   and distributed, in whole or in part, without restriction of any
1559	   kind, provided that the above copyright notice and this paragraph are
1560	   included on all such copies and derivative works. However, this
1561	   document itself may not be modified in any way, such as by removing
1562	   the copyright notice or references to the Internet Society or other
1563	   Internet organizations, except as needed for the purpose of
1564	   developing Internet standards in which case the procedures for
1565	   copyrights defined in the Internet Standards process must be
1566	   followed, or as required to translate it into languages other than
1567	   English.

1569	   The limited permissions granted above are perpetual and will not be
1570	   revoked by the Internet Society or its successors or assignees.

1572	   This document and the information contained herein is provided on an
1573	   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
1574	   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
1575	   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
1576	   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
1577	   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1579	Acknowledgement

1581	   Funding for the RFC Editor function is currently provided by the
1582	   Internet Society.