idnits 2.17.1 draft-rosenberg-sipping-conferencing-framework-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document seems to lack a 1id_guidelines paragraph about the list of Shadow Directories. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** There is 1 instance of too long lines in the document, the longest one being 1 character in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- == The "Author's Address" (or "Authors' Addresses") section title is misspelled. -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (October 28, 2002) is 7850 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: 'TBD' on line 501 -- Obsolete informational reference (is this intentional?): RFC 1889 (ref. '2') (Obsoleted by RFC 3550) -- Obsolete informational reference (is this intentional?): RFC 3265 (ref. '4') (Obsoleted by RFC 6665) -- Obsolete informational reference (is this intentional?): RFC 2396 (ref. '7') (Obsoleted by RFC 3986) -- Obsolete informational reference (is this intentional?): RFC 3015 (ref. '9') (Obsoleted by RFC 3525) Summary: 4 errors (**), 0 flaws (~~), 2 warnings (==), 7 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet Engineering Task Force SIPPING WG 3 Internet Draft J. Rosenberg 4 dynamicsoft 5 draft-rosenberg-sipping-conferencing-framework-00.txt 6 October 28, 2002 7 Expires: April 2003 9 A Framework for Conferencing with the Session Initiation Protocol 11 STATUS OF THIS MEMO 13 This document is an Internet-Draft and is in full conformance with 14 all provisions of Section 10 of RFC2026. 16 Internet-Drafts are working documents of the Internet Engineering 17 Task Force (IETF), its areas, and its working groups. Note that 18 other groups may also distribute working documents as Internet- 19 Drafts. 21 Internet-Drafts are draft documents valid for a maximum of six months 22 and may be updated, replaced, or obsoleted by other documents at any 23 time. It is inappropriate to use Internet-Drafts as reference 24 material or to cite them other than as "work in progress". 
26 The list of current Internet-Drafts can be accessed at 27 http://www.ietf.org/ietf/1id-abstracts.txt 29 To view the list Internet-Draft Shadow Directories, see 30 http://www.ietf.org/shadow.html. 32 Abstract 34 The Session Initiation Protocol (SIP) supports the initiation, 35 modification, and termination of media sessions between user agents. 36 These sessions are managed by SIP dialogs, which represent a SIP 37 relationship between a pair of user agents. Because dialogs are 38 between pairs of user agents, SIP's usage for two-party 39 communications (such as a phone call), is obvious. Communications 40 sessions with multiple participants, generally known as conferencing, 41 is more complicated. This document defines a framework for how such 42 conferencing can occur. This framework describes the overall 43 architecture, terminology, and protocol components needed for multi- 44 party conferencing. 46 Table of Contents 48 1 Introduction ........................................ 3 49 2 Terminology ......................................... 3 50 3 Basic Architecture .................................. 7 51 4 Usage of URIs ....................................... 11 52 5 Functions of the Elements ........................... 12 53 5.1 Focus ............................................... 12 54 5.2 Conference Policy Server ............................ 13 55 5.3 Mixers .............................................. 14 56 5.4 Media Policy Server ................................. 14 57 5.5 Conference Notification Service ..................... 15 58 5.6 Participants ........................................ 16 59 5.7 Conference Policy ................................... 16 60 5.8 Media Policy ........................................ 17 61 6 Physical Realization ................................ 17 62 6.1 Centralized Server .................................. 17 63 6.2 Endpoint Server ..................................... 
17 64 6.3 Media Server Component .............................. 18 65 6.4 Distributed Mixing .................................. 21 66 6.5 Cascaded Mixers ..................................... 22 67 7 Common Operations ................................... 22 68 7.1 Creating Conferences ................................ 22 69 7.2 Adding Participants ................................. 25 70 7.3 Removing Participants ............................... 27 71 7.4 Approving Policy Changes ............................ 27 72 7.5 Creating Sidebars ................................... 28 73 8 Security Considerations ............................. 28 74 9 Contributors ........................................ 29 75 10 Authors Addresses ................................... 29 76 11 Normative References ................................ 29 77 12 Informative References .............................. 29 79 1 Introduction 81 The Session Initiation Protocol (SIP) [1] supports the initiation, 82 modification, and termination of media sessions between user agents. 83 These sessions are managed by SIP dialogs, which represent a SIP 84 relationship between a pair of user agents. Because dialogs are 85 between pairs of user agents, SIP's usage for two-party 86 communications (such as a phone call), is obvious. Communications 87 sessions with multiple participants, however, are more complicated. 88 SIP can support many models of multi-party communications. One, 89 referred to as loosely coupled conferences, makes use of multicast 90 media groups. In the loosely coupled model, there is no signaling 91 relationship between participants in the conference. There is no 92 central point of control or conference server. Participation is 93 gradually learned through control information that is passed as part 94 of the conference (using the Real Time Control Protocol (RTCP) [2], 95 for example). 
Loosely coupled conferences are easily supported in SIP
96     by using multicast addresses within its session descriptions.

98     In another model, referred to as fully distributed multiparty
99     conferencing, each participant maintains a signaling relationship
100    with each other participant, using SIP. There is no central point of
101    control; it is completely distributed amongst the participants. SIP
102    does not yet support this model.

104    In a third model, sometimes referred to as the tightly coupled
105    conference, there is a central point of control. Each participant
106    connects to this central point. It provides a variety of conference
107    functions, and may perform media mixing functions as well.
108    Tightly coupled conferences are not directly addressed by the SIP
109    specification, although basic ones are possible without any
110    additional protocol support.

112    This document is one of a series of specifications that discuss
113    tightly coupled conferences. Here, we present the overall framework
114    for tightly coupled conferencing, referred to simply as
115    "conferencing" from this point forward. This framework presents a
116    general architectural model for these conferences, presents
117    terminology used to discuss such conferences, and describes the sets
118    of protocols involved in a conference. The aim of the framework is to
119    meet the general requirements for conferencing that are outlined in
120    [3].

122 2 Terminology

124         Conference: Sadly, conference is an overused term which has
125              different meanings in different contexts. In SIP, a
126              conference is an instance of a multi-party conversation.

128              Within the context of this specification, a conference is
129              always a tightly coupled conference.

131         Loosely Coupled Conference: A loosely coupled conference is a
132              conference without coordinated signaling relationships
133              amongst participants. Loosely coupled conferences use
134              multicast for distribution of conference memberships.

136         Tightly Coupled Conference: A tightly coupled conference is a
137              conference in which a single user agent, referred to as a
138              focus, maintains a dialog with each participant. The focus
139              plays the role of the centralized manager of the
140              conference, and is addressed by a conference URI.

142         Focus: The focus is a SIP user agent that is addressed by a
143              conference URI. The focus maintains a SIP signaling
144              relationship with each participant in the conference. The
145              focus is responsible for ensuring, in some way, that each
146              participant receives the media that make up the conference.
147              The focus also implements conference policies. The focus is
148              a logical role.

150         Conference URI: A URI, usually a SIP URI, which identifies the
151              focus of a conference.

153         Participants: The set of user agents, each identified by a URI,
154              which are connected to the focus for a particular
155              conference.

157         Conference Notification Service: A conference notification
158              service is a logical function provided by the focus. The
159              focus can act as a notifier [4], accepting subscriptions to
160              the conference state, and notifying subscribers about
161              changes to that state. The state includes the state
162              maintained by the focus itself, the conference policy, and
163              the media policy.

165         Conference Policy Server: A conference policy server is a
166              logical function which can store and manipulate rules
167              associated with participation in a conference. These rules
168              include directives on the lifespan of the conference, who
169              can and cannot join the conference, definitions of roles
170              available in the conference and the responsibilities
171              associated with those roles, and policies on who is allowed
172              to request which roles. The conference policy server is a
173              logical role.

175         Media Policy Server: A media policy server is a logical function
176              which can store and manipulate rules associated with the
177              media distribution of the conference.
These rules can 178 specify which participants receive media from which other 179 participants, and the ways in which that media is combined 180 for each participant. In the case of audio, these rules can 181 include the relative volumes at which each participant is 182 mixed. In the case of video, these rules can indicate 183 whether the video is tiled, whether the video indicates the 184 loudest speaker, and so on. 186 Conference Policy: The set of rules manipulated by the 187 conference policy server. 189 Conference Policy Control Protocol: The client-server protocol 190 used by clients to manipulate the conference policy. 192 Media Policy: The set of rules manipulated by the media policy 193 server. The media policy is used by the focus to determine 194 the mixing characteristics for the conference. 196 Media Policy Control Protocol: The client-server protocol used 197 by clients to manipulate the media policy. 199 Mixer: As defined in the Real Time Transport Protocol [2], a 200 mixer receives a set of media streams, and combines their 201 media in a type-specific manner, redistributing the result 202 to each participant. We use the term here to include 203 combining of non-RTP media streams as well, such as instant 204 messaging sessions [5]. 206 Basic Conference: A basic conference is one where there is no 207 conference policy server, media policy server, or 208 conference subscription server - only a focus. 210 Basic Participant: A basic participant is a participant in a 211 conference that is not aware that it is actually in a 212 conference. As far as the UA is concerned, it is a point- 213 to-point call. 215 Cascaded Conference: A conference in which a participant is the 216 focus of another conference. 218 Complex Conference: A complex conference includes at least one 219 of a conference policy server, media policy server, or 220 conference subscription server, in addition to the focus. 

222         Complex Participant: A complex participant is a participant in a
223              conference that has learned, through automated means, that
224              it is in a conference, and that can use a conference policy
225              control protocol, media policy control protocol, or
226              conference subscription, to implement advanced
227              functionality.

229         Conference Server: A conference server is a physical server
230              which contains, at a minimum, the focus. It may also
231              include a media policy server, a conference policy server,
232              and a mixer.

234         Singleton: In this context, a singleton is a conference
235              participant that is not a focus. A singleton represents a
236              single user in a conference.

238         Conference Topology: The conference topology is a graph that
239              defines the connectivity amongst participants connected
240              through conferences. Each node in the graph represents a
241              user agent, whether it is a focus or a singleton. Each leaf
242              node in the tree represents a singleton, and an internal
243              node represents a focus. An edge between two nodes implies
244              that there is a SIP dialog between them. Ideally,
245              conference topologies are trees, not arbitrary graphs.

247         Conversation Space: For each conference URI, there is a unique
248              conversation space. The conversation space is defined as
249              the set of singletons in the conference topology associated
250              with that URI. The conference topology associated with a
251              conference URI is the one that is constructed by starting
252              with the focus for that URI. Under normal circumstances,
253              the singletons in a conversation space will all
254              receive each other's media.

256         Instant Conference: A conference in which the focus is
257              constructed the instant the first INVITE for a URI is
258              received, and then destroyed when the last participant
259              has left.

261         Mass Invitation: A conference policy control protocol request to
262              invite a large number of users into the conference.

264         Mass Ejection: A conference policy control protocol request to
265              remove a large number of users from the conference.

267         Sidebar: A sidebar appears to the users as a "conference within
268              the conference". It is a discussion amongst a subset of the
269              participants, not heard by the remaining participants in
270              the conference.

272         Anonymous Participant: An anonymous participant is one that is
273              known to other participants (through the conference
274              notification service), but whose identity is being
275              withheld.

277         Invisible Participant: An invisible participant is one that is
278              not known to other participants in the conference. They may
279              be known to the moderator, depending on conference policy.

281 3 Basic Architecture

283    A SIP conference is represented by a URI. This URI identifies the
284    focus, which is the user agent at the center of the conference. Any
285    participant that is involved in the conference is connected to the
286    focus by a SIP dialog. The result is a star topology, shown in Figure
287    1.

289    The focus has access to a conference policy and media policy, an
290    instance of which exists for each focus. In a basic SIP conference,
291    these policies are administratively defined.

293    Users join the conference by sending an INVITE to the conference URI.
294    As long as the conference policy allows, the INVITE is accepted by
295    the focus and the user is brought into the conference. Users can
296    leave the conference by sending a BYE, as they would in a normal
297    call. Indeed, a participant in a basic conference does not need to
298    know that the focus is anything other than a normal SIP user agent.

300    Similarly, the focus can terminate a dialog with a participant,
301    should the conference policy change to indicate that the participant
302    is no longer allowed in the conference. A focus can also initiate an
303    INVITE, should the conference policy indicate that the focus needs to
304    bring a participant into the conference.
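
   The join and leave behavior described above can be sketched as a
   small model. This is only an illustration under stated assumptions,
   not part of the framework: the class name Focus, the is_allowed
   policy hook, and the choice of a 403 response for a rejected INVITE
   are hypothetical, and no real SIP stack is involved.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Focus:
    """Toy focus: hub of a star topology, one dialog per participant."""
    conference_uri: str
    # Hypothetical policy hook: True if this URI may join the conference.
    is_allowed: Callable[[str], bool]
    # Participant URI -> dialog state (assumed representation).
    dialogs: Dict[str, str] = field(default_factory=dict)

    def on_invite(self, participant_uri: str) -> int:
        """Handle an INVITE sent to the conference URI; return a status code."""
        if not self.is_allowed(participant_uri):
            return 403  # assumed rejection code; the draft does not pick one
        self.dialogs[participant_uri] = "confirmed"
        return 200

    def on_bye(self, participant_uri: str) -> None:
        """A participant leaves exactly as it would end a normal call."""
        self.dialogs.pop(participant_uri, None)


focus = Focus(
    "sip:conf1@example.com",
    is_allowed=lambda uri: uri != "sip:mallory@example.com",
)
focus.on_invite("sip:alice@example.com")
focus.on_invite("sip:mallory@example.com")
print(sorted(focus.dialogs))
```

   Nothing in the model is conference-specific from the participant's
   point of view, which mirrors the observation that a basic
   participant sees an ordinary point-to-point call.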

306    The focus is responsible for making sure that the media streams which
307    constitute the conference are available to the participants in the
308    conference. It does that through the use of one or more mixers, each
309    of which combines a number of input media streams to produce one or
310    more output media streams. The focus uses the media policy to
311    determine the proper configuration of the mixers.

313    With these basic capabilities, a large number of common conferencing
314    applications can be built. None of them require any extensions to
315    SIP; they merely require that the focus is aware of its role and
316    responsibilities in maintaining the conference. However, basic
317    conferences do not allow for the participants to control the way in
318    which the conference operates.

320                            +-----------+
321                            |           |
322                            |           |
323                            |Participant|
324                            |           |
325                            |           |
326                            +-----------+
327                                  |
328                                  |SIP
329                                  |Dialog
330                                  |
331                                  |
332    +-----------+           +-----------+            +-----------+
333    |           |           |           |            |           |
334    |           |           |           |            |           |
335    |Participant|-----------|   Focus   |------------|Participant|
336    |           |    SIP    |           |    SIP     |           |
337    |           |   Dialog  |           |   Dialog   |           |
338    +-----------+           +-----------+            +-----------+
339                                  |
340                                  |
341                                  |SIP
342                                  |Dialog
343                                  |
344                                  |
345                            +-----------+
346                            |           |
347                            |           |
348                            |Participant|
349                            |           |
350                            |           |
351                            +-----------+

353                    Figure 1: Basic SIP Conference

355    A complex SIP conference is one in which additional interfaces are
356    exposed, allowing for a richer set of controls and information on the
357    conference. In particular, a complex SIP conference can include a
358    conference policy server and a media policy server, and the focus can
359    expose a conference notification service. The model for these
360    conferences is shown in Figure 2. This figure shows the view from one
361    participant. The conference now encompasses an additional set of
362    functions. In addition to maintaining the dialog with the focus, the
363    participant now has access to these other functions.
It can, using a 364 conference event package [6], SUBSCRIBE to the conference URI, and be 365 connected to the conference notification service provided by the 366 focus. Through this package, it can learn about changes in 367 participants (effectively, the state of the dialogs), the media 368 policy, and the conference policy. 370 The participant can also communicate with the conference policy 371 server, using a conference policy control protocol. This is a 372 strictly client-server transactional protocol. This protocol might 373 not be a protocol at all; it can be performed using a web interface. 374 In this case, no standardized protocols or policies are needed. 375 However, the web interface can only be manipulated by humans, not 376 automata. For this reason, the participant can use a protocol 377 designed specifically for this purpose. 379 The participant can also communicate with the media policy server, 380 using a media policy control protocol. This is a strictly client- 381 server transactional operation. This can also be through a web 382 interface, or through an explicit protocol. 384 The focus will access the media and conference policies. There is a 385 tight coupling between these policies and the focus. Not only does it 386 need read access to these policies, but it needs to know when they 387 have changed. Such changes might result in SIP signaling (for 388 example, the ejection of a user from the conference using BYE), and 389 most changes will require a notification to be sent to subscribers to 390 the conference notification service. 392 The conference policy and media policy servers need not be available 393 in any particular conference. Even when available, they need not be 394 used by all participants. A participant in a conference that does not 395 access any of these functions, and which doesn't even know that the 396 focus is a focus, is called a basic participant. 
A conference
397    participant that can discover and access these additional functions is
398    a complex participant. Any conference can include basic and complex
399    participants.

401    The interfaces between (1) the focus and the media policy, (2) the
402    focus and the conference policy, (3) the conference policy server and
403    the conference policy, and (4) the media policy server and the media
404    policy are not subject to standardization at the time of this
405    writing. They are intended primarily to show the logical roles
406         Conference    .....................................
407         Policy        .    +-----------+                  .
408         Control       .    |           |                  .
409         Protocol      .    |Participant|                  .
410       +------------------->|  Policy   |                  .
411       |               .    |  Server   |                  .
412       |               .    |           | \                .
413       |       Media   .    +-----------+  \               .
414       |       Policy  .    +-----------+   \  //-----\\   .
415       |       Control .    |           |    > ||     ||   .
416       |       Protocol.    |   Media   |      \\-----//   .
417       |     +------------->|   Policy  |      |       |   .
418       |     |         .    |   Server  |----> |Conference .
419       |     |         .    |           |      |       |   .
420       |     |         .    +-----------+      |   &   |   .
421       |     |         .                       |       |   .
422       |     |         .                       | Media |   .
423    +-----------+      .    +-----------+      | Policy|   .
424    |           |      .    |           |      \      //   .
425    |           |      .    |           |       \-----/    .
426    |Participant|<--------->|   Focus   |            |     .
427    |           | SIP  .    |           |            |     .
428    |           |Dialog.    |           |<-----------+     .
429    +-----------+      .    |...........|                  .
430              ^        .    | Conference|                  .
431              |        .    |Notification                  .
432              +------------>|  Service  |                  .
433           Subscription.    +-----------+                  .
434                       .                                   .
435                       .                                   .
436                       .                                   .
437                       .                                   .
438                       .....................................

440                                    Conference
441                                    Functions

443                   Figure 2: Complex SIP Conference
444    to encourage clarity in the requirements and to allow individual
445    implementations the flexibility to compose a conferencing system in a
446    scalable and robust manner.

448 4 Usage of URIs

450    It is fundamental to this framework that a conference is uniquely
451    identified by a URI, and that this URI identifies the focus which is
452    responsible for the conference. This URI is always a SIP or SIPS URI.
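
   As a minimal sketch of the one property that can be checked from the
   URI text alone, the function below tests for the sip or sips scheme.
   The function name is hypothetical. It deliberately says nothing about
   whether the URI identifies a focus; that cannot be learned from the
   URI itself.

```python
def has_conference_uri_scheme(uri: str) -> bool:
    """Return True if the URI uses the sip or sips scheme.

    This checks only the scheme. It cannot tell a focus apart from a
    user or a gateway interface.
    """
    scheme, sep, rest = uri.partition(":")
    # URI schemes are case-insensitive, so normalize before comparing.
    return sep == ":" and rest != "" and scheme.lower() in ("sip", "sips")


print(has_conference_uri_scheme("sip:discussion-on-dogs@example.com"))
print(has_conference_uri_scheme("http://example.com/conference"))
```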

454    The conference URI is opaque to any participants which might use it.
455    There is no way to look at the URI, and know for certain whether it
456    identifies a focus, as opposed to a user or an interface on a PSTN
457    gateway. This is in line with the general philosophy of URI usage
458    [7]. However, contextual information surrounding the URI (for
459    example, SIP header parameters) may indicate that the URI represents
460    a conference.

462    The conference URI can represent a long-lived conference or interest
463    group, such as "sip:discussion-on-dogs@example.com". The focus
464    identified by this URI would always exist, and always be managing the
465    conference for whatever participants are currently joined. The
466    conference URI can also represent an "instant" conference, for
467    example, "sip:a8sd9998as-9s8daa@example.com". An instant conference
468    is one where the focus is instantiated when the first INVITE for it
469    arrives, and then destroyed when the last participant leaves. Both of
470    these represent variations in the policies implemented by the focus,
471    and cannot be determined from inspection of the URI.

473    Ideally, a conference URI is never constructed or guessed by a user.
474    Rather, conference URIs are learned through many mechanisms. A
475    conference URI can be emailed or sent in an instant message. A
476    conference URI can be linked on a web page. A conference URI can be
477    obtained from a conference policy control protocol, which can be used
478    to create conferences and the policies associated with them.

480    To determine that a SIP URI does represent a focus, standard
481    techniques for URI capability discovery can be used. First, a
482    participant can send an OPTIONS to a SIP URI, and if it represents a
483    focus, the response will indicate such [TBD]. The response will also
484    indicate whether or not the focus has implemented the subscription
485    notification service.
This is known by the presence of an Allow
486    header in the response, indicating support for the SUBSCRIBE method,
487    along with an Allow-Events header, indicating support for the
488    conferencing package. A second method for determining that a URI
489    represents a focus is through a refresh request. The Allow and
490    Allow-Events headers, along with the caller preferences specification
491    [8] can indicate the same information that would be learned through
492    an OPTIONS query.

494    The other functions in a conference are also represented by URIs. If
495    the conference policy and media policy servers are implemented
496    through web pages, these servers are regular HTTP URIs. If they are
497    accessed using an explicit protocol, they are the URIs defined for
498    those protocols.

500    Starting with the conference URI, the URIs for the other logical
501    entities in the conference can be learned using [TBD].

503         OPEN ISSUE: I suppose we cannot say more until the protocol
504         work is done. But, we have a requirement here - that there
505         be a way to learn these URIs starting only with the
506         conference URI.

508 5 Functions of the Elements

510    This section gives a more detailed description of the functions
511    typically implemented in each of the elements.

513 5.1 Focus

515    As its name implies, the focus is the center of the conference. All
516    participants in the conference are connected to it using a SIP
517    dialog. The focus is responsible for maintaining the dialogs
518    connected to it. It ensures that the dialogs are connected to a set
519    of participants who are allowed to participate in the conference, as
520    defined by the conference policy. The focus also uses SIP to
521    manipulate the media sessions, in order to make sure each participant
522    obtains all the media for the conference. To do that, the focus makes
523    use of the services of a mixer.

525    When a focus receives an INVITE, it checks the conference policy.
The 526 conference policy might indicate that this participant is not allowed 527 to join, in which case the call can be rejected. It might indicate 528 that another participant, acting as a moderator, needs to approve 529 this new participant. In that case, the INVITE might be parked on a 530 music-on-hold server, or a 183 response might be sent to indicate 531 progress. A notification, using the conference notification service, 532 would be sent to the moderator. The moderator then has the ability to 533 manipulate the policies using the conference policy control protocol. 534 If the policies are changed to allow this new participant, the focus 535 can accept the INVITE (or unpark it from the music-on-hold server). 536 The interpretation of the conference policy by the focus is, itself, 537 a matter of local policy, and not subject to standardization. 539 If a participant manipulated the conference policy to indicate that a 540 certain other participant was no longer allowed in the conference, 541 the focus would send a BYE to that other participant to remove them. 542 This is often referred to as "ejecting" a user from the conference. 543 The process of ejecting fundamentally constitutes these two steps - 544 the establishment of the policy through the conference policy 545 protocol, and the implementation of that policy (using a BYE) by the 546 focus. 548 Similarly, if a participant manipulated the conference policy to 549 indicate that a number of users need to be added to the conference, 550 the focus would send an INVITE to those participants. This is often 551 referred to as the "mass invitation" function. As with ejection, it 552 is fundamentally composed of the policy functions that specify the 553 participants which should be present, and the implementation of those 554 functions using SIP. A policy request to add a set of users might not 555 require an INVITE to execute it; those users might already be 556 participants in the conference. 
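
   Ejection and mass invitation share one shape: a policy change
   followed by the SIP requests needed to make the set of dialogs match
   the policy. A minimal sketch of that second step follows. The
   function name and the representation of the policy as two sets of
   URIs are assumptions made for illustration; the framework leaves the
   focus's interpretation of policy to local implementation.

```python
from typing import List, Set, Tuple


def reconcile(
    connected: Set[str], required: Set[str], banned: Set[str]
) -> List[Tuple[str, str]]:
    """Return the (method, URI) requests a focus would send.

    connected: URIs that currently have a dialog with the focus.
    required:  URIs the policy says should be present.
    banned:    URIs the policy no longer allows.
    """
    # Ejection: a BYE for every connected participant that is now banned.
    actions = [("BYE", uri) for uri in sorted(connected & banned)]
    # Invitation: an INVITE only for required users not already connected.
    actions += [("INVITE", uri) for uri in sorted(required - connected - banned)]
    return actions


requests = reconcile(
    connected={"sip:alice@example.com", "sip:bob@example.com"},
    required={"sip:alice@example.com", "sip:carol@example.com"},
    banned={"sip:bob@example.com"},
)
for method, uri in requests:
    print(method, uri)
```

   In this run no INVITE is generated for alice, since she is already a
   participant, which matches the observation that a request to add
   users does not always require an INVITE.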
558 A similar model exists for media policy. If the media policy 559 indicates that a participant should not receive any video, the focus 560 might implement that policy by sending a re-INVITE, removing the 561 media stream to that participant. Alternatively, if the video is 562 being centrally mixed, it could inform the mixer to send a black 563 screen to that participant. The means by which the policy is 564 implemented are not subject to specification. 566 5.2 Conference Policy Server 568 The conference policy server allows clients to manipulate and 569 interact with the conference policy. The conference policy is used by 570 the focus to make authorization decisions and guide its overall 571 behavior. Logically speaking, there is a one-to-one mapping between a 572 conference policy and a focus. 574 The conference policy is represented by a URI. There is a unique 575 conference policy for each focus. The conference policy URI points to 576 a conference policy server which can manipulate that conference 577 policy. A conference policy server also has a "top level" URI which 578 can be used to access functions that are independent of any 579 conference. Perhaps the most important of these functions is the 580 creation of a new conference. This will result in the construction of 581 a new conference URI, which can then be used to join the conference 582 itself. 584 The conference policy server is accessed using a client-server 585 transactional protocol. The client can be a participant in the 586 conference, or it can be a third party. Access control lists for who 587 can modify a conference policy are themselves part of the conference 588 policy. The conference policy server also allows clients to create 589 new conferences. This would result in the instantiation of a focus 590 (and therefore, a conference URI associated with that focus), a 591 conference policy, and a media policy. 
The conference policy server 592 will also have rules about who can create conferences. 594 The conference policy also includes per-participant policies that 595 specify how the focus is to handle a particular participant. These 596 include whether or not the participant is anonymous, for example. 598 5.3 Mixers 600 A mixer is responsible for combining the media streams that make up 601 the conference, and generating one or more output streams that are 602 distributed to recipients (which could be participants or other 603 mixers). The combination process is specific to the media type, and 604 is directed by the focus, under the guidance of the rules described 605 in the media policy. 607 A mixer is not aware of a "conference" as an entity, per se. A mixer 608 receives media streams as inputs, and based on directions provided by 609 the focus, generates media streams as outputs. There is no grouping 610 of media streams beyond the policies that describe the ways in which 611 the streams are mixed. 613 A mixer is always under the control of a focus. The focus is 614 responsible for interpreting the media policy, and then installing 615 the appropriate rules in the mixer. If the focus is directly 616 controlling a mixer, the mixer can either be co-resident with the 617 focus, or can be controlled through a protocol like Megaco [9]. 619 However, a focus need not directly control a mixer. Rather, a focus 620 can delegate the mixing to the participants, each of which has their 621 own mixer. This is described in Section 6.4. 623 5.4 Media Policy Server 625 The media policy server is similar to the conference policy server. 626 It is accessed using a transactional client-server protocol. It 627 manipulates a media policy, identified by a URI. The focus has the 628 responsibility of acting on that media policy, implementing it 629 through direct or indirect control of mixers. 
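
   To make the focus-to-mixer relationship concrete, here is a sketch of
   one rule a focus might install: each participant receives an
   equal-weight mix of every other participant's audio. The function
   name and the use of plain lists of float samples are assumptions for
   illustration. A real mixer operates on decoded RTP audio and is
   considerably more involved.

```python
from typing import Dict, List


def mix_equal_weight(inputs: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """For each participant, average the samples of all other participants."""
    outputs: Dict[str, List[float]] = {}
    for listener in inputs:
        # A participant does not receive its own audio back.
        others = [samples for src, samples in inputs.items() if src != listener]
        if not others:
            outputs[listener] = []
            continue
        # Mix only as many samples as every input stream can supply.
        length = min(len(samples) for samples in others)
        outputs[listener] = [
            sum(samples[i] for samples in others) / len(others)
            for i in range(length)
        ]
    return outputs


streams = {
    "sip:alice@example.com": [1.0, 1.0],
    "sip:bob@example.com": [0.0, 2.0],
    "sip:carol@example.com": [2.0, 0.0],
}
for uri, mixed in mix_equal_weight(streams).items():
    print(uri, mixed)
```

   The mixer function has no notion of a conference; it sees only input
   streams and a rule, consistent with the description of mixers in
   Section 5.3.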

   The media policy describes the way in which the set of inputs to the
   mixer are combined to generate the set of outputs. Media policies
   can span media types. In other words, the policy on how one media
   stream is mixed can be based on characteristics of other media
   streams.

   Media policies can be based on any quantifiable characteristic of
   the media stream (its source, volume, codecs, speaking/silence,
   etc.), and they can be based on internal or external variables
   accessible by the media policy.

   The media policy server is responsible for reconciliation of
   potentially conflicting requests regarding the media policy for the
   conference.

   The client of the media policy protocol can be any entity interested
   in manipulating media policies. Clearly, participants might be
   interested in manipulating them. A participant might want to raise
   or lower the volume for one of the other participants it is hearing.
   Or, a participant might want to switch from a tiled video view, to
   just viewing the active speaker. A client of the media policy
   protocol could also be another server whose job is to determine the
   media policy. As an example, a floor control server is responsible
   for determining which participant(s) in a conference are allowed to
   speak at any given time, based on participant requests and access
   rules. The floor control server would act as a client of the media
   policy server, and inform the media policy server about who is
   allowed to speak.

   The client of the media policy protocol could also be another media
   policy server, as described in Section 6.4.

   Some examples of media policies include:

      o  The video output is the picture of the loudest speaker (video
         follows audio).

      o  The audio from each participant will be mixed with equal
         weight, and distributed to all other participants.

      o  The audio and video that is distributed is the one selected by
         the floor control server.

5.5 Conference Notification Service

   The focus can provide a conference notification service. In this
   role, it acts as a notifier, as defined in RFC 3265 [4]. It accepts
   subscriptions from clients for the conference URI, and generates
   notifications to them as the state of the conference changes.

   This state is composed of three separate pieces. The first is the
   state of the focus, the second is the conference policy, and the
   third is the media policy.

   The state of the focus includes the participants connected to the
   focus, and information about the dialogs associated with them. As
   new participants join, this state would change, allowing subscribers
   to learn about them. Similarly, when someone leaves, this state also
   changes, allowing subscribers to learn about this fact.

   The state of the conference policy includes the set of participants
   that are allowed, or not allowed, to join the conference, and the
   set of participants who are to be explicitly added to the
   conference. It includes the roles which are assigned to each
   participant, such as whether they are a moderator. If there was a
   change in role, for example, a new moderator was selected, the focus
   would inform subscribers.

   The state of the media policy includes the media streams being
   received by each participant, the audio or video modalities, and so
   on.

5.6 Participants

   A participant in a conference is any SIP user agent that has a
   dialog with the focus. This SIP user agent can be a PC application,
   a SIP hardphone, or a PSTN gateway. It can also be another focus. A
   conference which has a participant that is the focus of another
   conference is called a cascaded conference.
   Cascaded conferences can also be used to provide scalable
   conferences where there are regional sub-conferences, each of which
   is connected to the main conference. A conference topology refers to
   a graph which shows each focus and each participant as a vertex,
   with a connection between each participant and its focus.

5.7 Conference Policy

   The conference policy contains the rules that guide the operation of
   the focus. These rules can be simple, such as an access list that
   defines the set of allowed participants in a conference. The rules
   can also be incredibly complex, specifying time-of-day based rules
   on participation conditional on the presence of other participants.
   It is important to understand that there is no restriction on the
   type of rules that can be encapsulated in a conference policy.

   However, there does exist a protocol means by which a client can
   request a change in the conference policy. This is done by
   communicating with the conference policy server, which manipulates
   the conference policy. By the nature of conference policies, not all
   aspects of the policy can be manipulated with the conference policy
   control protocol. It is the responsibility of the conference policy
   server to reconcile the various requests with the conference policy.

5.8 Media Policy

   The media policy contains the rules that guide the operation of the
   mixer. The focus implements these rules through its interaction with
   the mixer. These rules can be simple (mix all media from all
   participants), or they can be incredibly complex. It is important to
   understand that there is no restriction on the type of rules that
   can be encapsulated in a media policy.

   However, there does exist a protocol means by which a client can
   request a change in the media policy. This is done by communicating
   with the media policy server, which manipulates the media policy.
   By the nature of media policies, not all aspects of the policy can
   be manipulated with the media policy control protocol. It is the
   responsibility of the media policy server to reconcile the various
   requests with the media policy.

6 Physical Realization

   In this section, we present several physical instantiations of these
   components, to show how these basic functions can be combined to
   solve a variety of problems.

6.1 Centralized Server

   In the simplest realization of this framework, there is a single
   physical server in the network which implements the focus, the
   conference policy server, the media policy server, and the mixer.
   This is the classic "one box" solution, shown in Figure 3.

      [Figure: one conference server containing the media policy
      server, conference notification server, conference policy
      server, focus, and mixer. Each participant has a SIP dialog
      with the focus and an RTP stream to the mixer.]

              Figure 3: Centralized server architecture

6.2 Endpoint Server

   Another important model is that of a locally-mixed ad-hoc
   conference. In this scenario, two users (A and B) are in a regular
   point-to-point call. One of the participants (A) decides to
   conference in a third participant, C. To do this, A begins acting as
   a focus. Its existing dialog with B becomes the first dialog
   attached to the focus. A would re-INVITE B on that dialog, changing
   its Contact URI to a new value which identifies the focus. In
   essence, A "mutates" from a single-user UA to a focus plus a
   single-user UA, and in the process of such a mutation, its URI
   changes. Then, the focus makes an outbound INVITE to C. When C
   accepts, A mixes the media from B and C together, redistributing the
   results. The mixed media is also played locally. Figure 4 shows a
   diagram of this transition.

      [Figure: before the transition, UA A and UA B share a SIP dialog
      and an RTP stream. After the transition, A contains a focus,
      media policy, conference policy, and mixer, connected by an
      internal interface to A's own participant function, and has a
      SIP dialog and an RTP stream with each of B and C.]

         Figure 4: Transition from two-party call to conference

   It is important to note that the external interfaces in this model,
   between A and B, and between A and C, are exactly the same as those
   that would be used in a centralized server model. A could also
   include a media policy server and a conference notification service,
   allowing the participants to have access to them if they so desired.
   Just because the focus is co-resident with a participant does not
   mean any aspect of the behaviors and external interfaces will
   change.

6.3 Media Server Component

   In this model, shown in Figure 5, each conference involves two
   centralized servers. One of these servers, referred to as the
   "application server", owns and manages the conference and media
   policies, and maintains a dialog with each participant. As a result,
   it represents the focus seen by all participants in a conference.
   However, this server doesn't provide any media support. To perform
   the actual media mixing function, it makes use of a second server,
   called the "mixing server". This server includes a focus, but has no
   conference policy server or conference notification service. It has
   a default conference policy, which accepts all invitations from the
   top-level focus. Its media policy server accepts any controls made
   by the application server. The focus in the application server uses
   third party call control to connect the media streams of each user
   to the mixing server, as needed. If the focus in the application
   server receives a media policy control command from a client, it
   delegates that to the mixing server by issuing the same media policy
   control command to it.

      [Figure: an application server (focus, conference policy, media
      policy, notification) is connected to a conferencing component
      (focus, media policy, mixer) by links labeled "SIP", "Conf.
      Proto", and "Media Proto". Each participant has a SIP dialog
      with the application server and an RTP stream to the
      conferencing component.]

                Figure 5: Media server component model

   This model allows for the mixing server to be used as a resource for
   a variety of different conferencing applications. This is because it
   is unaware of any conference or media policies; it is merely a
   "slave" to the top-level server, doing whatever it asks. This is
   consistent with the SIP Application Server Component Model [10].

6.4 Distributed Mixing

   In a distributed mixed conference, there is still a centralized
   server which implements the focus, conference policy server, and
   media policy server. However, there is no centralized mixer. Rather,
   there is a mixer in each endpoint, along with a media policy server.
   The focus distributes the media by using third party call control
   [11] to move a media stream between each participant and each other
   participant. As a result, if there are N participants in the
   conference, there will be a single dialog between each participant
   and the focus, but the session description associated with that
   dialog will be constructed to allow media to be distributed amongst
   the participants. This is shown in Figure 6.

   There are several ways in which the media can be distributed to each
   participant for mixing. In a multi-unicast model, each participant
   sends a copy of its media to each other participant. In this case,
   the session description manages N-1 media streams. In a multicast
   model, each participant joins a common multicast group, and each
   participant sends a single copy of its media stream to that group.
   The underlying multicast infrastructure then distributes the media,
   so that each participant gets a copy. In a single-source multicast
   model (SSM), each participant sends its media stream to a central
   point, using unicast. The central point then redistributes the media
   to all participants using multicast. The focus is responsible for
   selecting the modality of media distribution, and for handling any
   hybrids that would be necessitated by clients with mixed
   capabilities.

   When a new participant joins or is added, the focus will perform the
   necessary third party call control to distribute the media from the
   new participant to all the other participants, and vice versa.

   The central conference server also includes a media policy server.
   Of course, the central conference server cannot implement any of the
   media policies directly. Rather, it would delegate the
   implementation to the media policy servers co-resident with a
   participant.
   As an example, if a participant decides to switch the overall
   conference mode from "video follows audio" to "tiled video", they
   would communicate with the central media policy server. This media
   policy server, in turn, would communicate with the media policy
   servers co-resident with each participant, using the same media
   policy control protocol, and instruct them to use "tiled video".

   This model requires additional functionality in user agents, which
   may or may not be present. The participants, therefore, must be able
   to advertise this capability to the focus.

6.5 Cascaded Mixers

   In very large conferences, it may not be possible to have a single
   mixer that can handle all of the media. A solution to this is to use
   cascaded mixers. In this architecture, there is a centralized focus,
   but the mixing function is implemented by a multiplicity of mixers,
   scattered throughout the network. Each participant is connected to
   one, and only one, of the mixers. The focus uses some kind of
   control protocol (such as Megaco [9]) to connect the mixers
   together, so that all of the participants can hear each other.

   This architecture is shown in Figure 7.

7 Common Operations

   There are a large number of ways in which users can interact with a
   conference. They can join, leave, set policies, approve members, and
   so on. This section is meant as an overview of the basic primitives,
   summarizing how they operate. More detailed examples with complete
   call flows can be found in [12].

7.1 Creating Conferences

   There are many ways in which a conference can be created.
   Ultimately, all of them result in the establishment of a conference
   URI which identifies a focus. In all cases, a conference URI must be
   created by the focus itself, or an element which is responsible for
   managing URIs that are used by the focus.
   Otherwise, the uniqueness of conference URIs could not be
   guaranteed.

      [Figure: a central conference server (focus, media policy
      server, conference policy server) holds one dialog with each of
      three participants. Each participant contains a mixer and a
      media policy server, and media flows directly between the
      participants.]

   Figure 6: Dialog and media streams in a distributed mixed conference

      [Figure: a single focus holds a SIP dialog with each
      participant and uses a control protocol to direct several
      mixers. Media flows between the mixers, and between each
      participant and the mixer it is connected to.]

                      Figure 7: Cascaded Mixers

   Using the conference policy control protocol, a client can instruct
   the conference policy server to create a new conference. The result
   of this operation is a conference URI, which is returned to the
   client.

   Another way to obtain a conference URI is simply to guess one. In an
   instant conferencing server, there is an effectively unlimited
   number of conference URIs which can be used. Each of them is a valid
   conference URI, since it identifies a focus, and an INVITE sent to
   it will join the user into that conference. As a result, a client
   can simply choose one of them at random, so long as it is configured
   with the domain portion of the URI and any naming conventions in use
   by the instant conferencing server.

      OPEN ISSUE: Do we need to specify standards for this?

   The previous two approaches are used to obtain conference URIs for
   focuses that are hosted within centralized servers. Creation of
   conferences where the focus resides in an endpoint operates
   differently. There, the endpoint itself creates the conference URI,
   and hands it out to other endpoints which are to be the
   participants. What differs from case to case is how the endpoint
   decides to create a conference.

   One important case is the ad-hoc conference described in Section
   6.2. There, an endpoint unilaterally decides to create the
   conference based on local policy. The dialogs that were connected to
   the UA are migrated to the endpoint-hosted focus, using a re-INVITE
   to pass the conference URI to the newly joined participants.

   Alternatively, one UA can ask another UA to create an
   endpoint-hosted conference. This is accomplished with the SIP Join
   header [13].
   The UA which receives the Join header in an invitation may need to
   create a new conference URI (a new one is not needed if the dialog
   that is being joined is already part of a conference). The
   conference URI is then handed to the recently joined participants
   through a re-INVITE.

7.2 Adding Participants

   There are two modes for adding participants to a conference: first
   party additions, and third party additions. In a first party
   addition, the participant that wishes to join makes a direct attempt
   to join. In a third party addition, some other participant takes
   action with the aim of causing a third party to be added to the
   conference.

   First party additions are trivially accomplished with a standard
   INVITE. A participant can send an INVITE request to the conference
   URI, and if the conference policy allows them to join, they are
   added to the conference.

   If a UA does not know the conference URI, but has learned about a
   dialog which is connected to a conference (by using the dialog event
   package, for example [14]), the UA can join the conference by using
   the Join header to join the dialog.

   Third party additions can be done in one of several ways. The first
   approach is for the user to ask the third party to send an INVITE to
   the conference URI. This can be done automatically through the usage
   of REFER [15]. The participant would send a REFER request to the
   third party. The Refer-To header field in that request would contain
   the conference URI. There are also countless non-automated means for
   asking a participant to send an INVITE to the conference URI. For
   example, a user can send an instant message [16] to the third party,
   containing an HTML document which asks the user to click on a
   hyperlink to join the conference:

      Hey, would you like to join
      the conference now?
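
   The automated REFER-based addition described above can be sketched
   as follows. This is an illustration only, not a complete SIP
   implementation: the function name build_refer, the host
   client.example.com, and all URIs and identifiers in the usage note
   are hypothetical, and a real user agent would generate the Via
   branch, tag, and Call-ID itself and might add further header fields.
   The sketch shows only where the conference URI is carried, namely in
   the Refer-To header field.

```python
def build_refer(referrer: str, third_party: str, conference_uri: str,
                call_id: str, branch: str, tag: str) -> str:
    """Build a minimal, hypothetical REFER request as text.

    The request asks `third_party` to send an INVITE to
    `conference_uri`, which is carried in the Refer-To header field.
    """
    lines = [
        f"REFER {third_party} SIP/2.0",
        # Hypothetical sending host; the branch is supplied by the caller.
        f"Via: SIP/2.0/UDP client.example.com;branch={branch}",
        "Max-Forwards: 70",
        f"From: <{referrer}>;tag={tag}",
        f"To: <{third_party}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 REFER",
        f"Contact: <{referrer}>",
        # The conference URI the third party is asked to INVITE.
        f"Refer-To: <{conference_uri}>",
        "Content-Length: 0",
    ]
    # SIP uses CRLF line endings; an empty line ends the header section.
    return "\r\n".join(lines) + "\r\n\r\n"
```

   For instance, build_refer("sip:alice@example.com",
   "sip:carol@example.com", "sip:conf123@conf.example.com",
   call_id="abc123@client.example.com", branch="z9hG4bKdemo1",
   tag="1928301774") yields a request whose Refer-To header field names
   the (hypothetical) conference URI. If the third party accepts the
   request, it sends an ordinary INVITE to that URI and is subject to
   the conference policy like any other first party addition.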

   The second approach for third party additions is for the participant
   to ask the focus to add the third party to the conference. In this
   case, however, a REFER cannot be used. REFER would have the effect
   of telling the focus to send an INVITE to the new potential
   participant. However, just sending this INVITE is not sufficient for
   adding the new member. In more complex realizations, such as the
   distributed mixing scenario of Section 6.4, a multiplicity of
   invitations will need to be sent. This would require the focus to
   attach additional meaning to REFER; it would have to be interpreted
   as a request to add a participant to the conference. However, it is
   fundamental to the concept of REFER that the recipient not attach
   specific application semantics to it. Therefore, it cannot be used.
   Rather, the user would use the conference policy control protocol to
   request that the focus add the new participant. The conference
   policy control protocol can also be used to add a multiplicity of
   new users. This is referred to as mass invitation.

   In many cases, a new participant will not wish to join the
   conference unless they can join with a particular set of policies.
   As an example, a participant may want to join anonymously, so that
   other participants know that someone has joined, but not who. To
   accomplish this, the conference policy control protocol is used to
   establish these policies prior to the generation or acceptance of an
   invitation to the conference. For example, if a user wishes to join
   a conference with a known conference URI, the user would obtain the
   URI for the conference policy, manipulate the policy to set
   themselves as an anonymous participant, and then actually join the
   conference by sending an INVITE request to the conference URI.

      OPEN ISSUE: Will this always work?
      Are there cases where the conference policy cannot be
      manipulated until the INVITE has been sent? This would require
      a preconditions-style solution.

7.3 Removing Participants

   As with additions, there are two modalities for departures: first
   person (in which a user explicitly leaves), and third person, where
   they are removed by a different user.

   First person departures are trivially accomplished by terminating
   the dialog that the participant is using to connect to the focus.

   Third person departures can be done in one of two ways. First, a
   user can make use of the REFER method to instruct the third party to
   send a BYE to the conference server on the dialog that connects them
   to the focus. This requires the user to have knowledge of the dialog
   identifiers used by that participant. The second mechanism, which is
   much cleaner, is to use the conference policy control protocol to
   inform the focus that the participant is explicitly barred from the
   conference. This will cause the focus to eject the user, sending
   them a BYE in addition to whatever other signaling is needed to
   remove them. The conference policy control protocol can also be used
   to remove a large number of users. This is generally referred to as
   mass ejection.

7.4 Approving Policy Changes

   A conference policy for a particular conference may designate one or
   more users as moderators for some set of media policy or conference
   policy change requests. This means that those moderators need to
   approve the specific policy change. Typically, moderators are used
   to approve member additions and removals. However, the framework
   allows for moderators to be associated with any policy change that
   can be made.

   The general model to support moderator approval is through the
   conference notification service. The moderator subscribes to the
   notification service.
   They are authenticated by the focus, which determines that they are
   a moderator for the conference. Whenever a policy change request is
   made by a client that requires moderator approval, the policy change
   is not actually committed. Rather, it is marked as pending by the
   conference policy server. Any moderators for that specific policy
   request who are subscribed to the conference notification service
   will receive a notification of the pending change. The moderators,
   using the conference policy control protocol, can approve the
   specific change. This commits the new policy. All participants are
   then notified of the new policy through the notification service.

7.5 Creating Sidebars

   A sidebar is a "conference within a conference", allowing a subset
   of the participants to converse amongst themselves. Frequently,
   participants in a sidebar will still receive media from the main
   conference, but "in the background". For audio, this may mean that
   the volume of the media is reduced, for example.

   There are two ways to represent a sidebar in this framework. The
   first is to treat it as a specific kind of media policy. It is a
   media policy which would request that sidebar participants be "in
   the foreground", and others "in the background". There are no
   additional dialogs or conferences established. The media policy
   control protocol would allow a user to explicitly request sidebars.
   The server would alert users (through the notification service) that
   they have been invited to the sidebar. They would use the media
   policy control protocol to approve their participation in it.

   An alternative view is that a sidebar truly is a conference within a
   conference, and would be implemented that way. There would be a new
   conference URI associated with the sidebar.
   Standard techniques would be used to add users to the sidebar,
   approve their membership, and so on. The sidebar would itself be a
   participant in the main conference. Users would continue to receive
   their media stream only through the main conference. They would have
   a dialog with the sidebar focus, but no media would be exchanged on
   this dialog.

      OPEN ISSUE: It is still unclear as to which model is
      preferable. We should pick one.

8 Security Considerations

   Conferences frequently require security features in order to
   properly operate. The conference policy may dictate that only
   certain participants can join, or that certain participants can
   create new policies. Generally speaking, conference applications are
   very concerned about authorization decisions. Mechanisms for
   establishing and enforcing such authorization rules are a central
   concept throughout this document.

   Of course, authorization rules require authentication. Normal SIP
   authentication mechanisms should suffice for the conference
   authorization mechanisms described here.

9 Contributors

   This document is the result of discussions amongst the conferencing
   design team. The members of this team include:

      Brian Rosen
      Rohan Mahy
      Henning Schulzrinne
      Orit Levin
      Roni Even
      Tom Taylor
      Petri Koskelainen
      Nermeen Ismail
      Andy Zmolek
      Joerg Ott
      Dan Petrie

10 Authors' Addresses

   Jonathan Rosenberg
   dynamicsoft
   72 Eagle Rock Avenue
   First Floor
   East Hanover, NJ 07936
   email: jdrosen@dynamicsoft.com

11 Normative References

12 Informative References

   [1] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J.
   Peterson, R. Sparks, M. Handley, and E. Schooler, "SIP: session
   initiation protocol," RFC 3261, Internet Engineering Task Force,
   June 2002.

   [2] H. Schulzrinne, S. Casner, R. Frederick, and V.
   Jacobson, "RTP: a transport protocol for real-time applications,"
   RFC 1889, Internet Engineering Task Force, Jan. 1996.

   [3] O. Levin et al., "Requirements for tightly coupled SIP
   conferencing," Internet Draft, Internet Engineering Task Force, July
   2002. Work in progress.

   [4] A. B. Roach, "Session initiation protocol (SIP)-specific event
   notification," RFC 3265, Internet Engineering Task Force, June 2002.

   [5] B. Campbell and J. Rosenberg, "Instant message sessions in
   simple," Internet Draft, Internet Engineering Task Force, Oct. 2002.
   Work in progress.

   [6] J. Rosenberg and H. Schulzrinne, "A session initiation protocol
   (SIP) event package for conference state," Internet Draft, Internet
   Engineering Task Force, June 2002. Work in progress.

   [7] T. Berners-Lee, R. Fielding, and L. Masinter, "Uniform resource
   identifiers (URI): generic syntax," RFC 2396, Internet Engineering
   Task Force, Aug. 1998.

   [8] H. Schulzrinne and J. Rosenberg, "Session initiation protocol
   (SIP) caller preferences and callee capabilities," Internet Draft,
   Internet Engineering Task Force, July 2002. Work in progress.

   [9] F. Cuervo, N. Greene, A. Rayhan, C. Huitema, B. Rosen, and J.
   Segers, "Megaco protocol version 1.0," RFC 3015, Internet
   Engineering Task Force, Nov. 2000.

   [10] J. Rosenberg, P. Mataga, and H. Schulzrinne, "An application
   server component architecture for SIP," Internet Draft, Internet
   Engineering Task Force, Mar. 2001. Work in progress.

   [11] J. Rosenberg, J. Peterson, H. Schulzrinne, and G. Camarillo,
   "Best current practices for third party call control in the session
   initiation protocol," Internet Draft, Internet Engineering Task
   Force, June 2002. Work in progress.

   [12] A. Johnston and O. Levin, "Session initiation call control -
   conferencing for user agents," Internet Draft, Internet Engineering
   Task Force, Oct. 2002.
   Work in progress.

   [13] R. Mahy and D. Petrie, "The session initiation protocol (SIP)
   join header," Internet Draft, Internet Engineering Task Force, Oct.
   2002. Work in progress.

   [14] J. Rosenberg and H. Schulzrinne, "A session initiation protocol
   (SIP) event package for dialog state," Internet Draft, Internet
   Engineering Task Force, June 2002. Work in progress.

   [15] R. Sparks, "The SIP refer method," Internet Draft, Internet
   Engineering Task Force, July 2002. Work in progress.

   [16] B. Campbell and J. Rosenberg, "Session initiation protocol
   extension for instant messaging," Internet Draft, Internet
   Engineering Task Force, Sept. 2002. Work in progress.

Full Copyright Statement

   Copyright (c) The Internet Society (2002). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.