SIPPING WG                                                       R. Mahy
Internet-Draft                                               Plantronics
Intended status: Informational                               B. Campbell
Expires: September 5, 2007                                     R. Sparks
                                                        Estacado Systems
                                                            J. Rosenberg
                                                           Cisco Systems
                                                               D. Petrie
                                                                  SIP EZ
                                                        A. Johnston, Ed.
                                                                   Avaya
                                                           March 4, 2007


     A Call Control and Multi-party usage framework for the Session
                       Initiation Protocol (SIP)
                   draft-ietf-sipping-cc-framework-07

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 5, 2007.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   This document defines a framework and requirements for multi-party
   usage of SIP.  To enable discussion of multi-party features and
   applications we define an abstract call model for describing the
   media relationships required by many of these.  The model and actions
   described here are specifically chosen to be independent of the SIP
   signaling and/or mixing approach chosen to actually set up the media
   relationships.  In addition to its dialog manipulation aspect, this
   framework includes requirements for communicating related information
   and events such as conference and session state, and session history.
   This framework also describes other goals that embody the spirit of
   SIP applications as used on the Internet.

Table of Contents

   1.  Motivation and Background
   2.  Key Concepts
     2.1.  "Conversation Space" Model
     2.2.  Comparison with Related Definitions
     2.3.  Signaling Models
     2.4.  Mixing Models
       2.4.1.  Tightly Coupled
       2.4.2.  Loosely Coupled
     2.5.  Conveying Information and Events
     2.6.  Componentization and Decomposition
       2.6.1.  Media Intermediaries
       2.6.2.  Mixer
       2.6.3.  Transcoder
       2.6.4.  Media Relay
       2.6.5.  Queue Server
       2.6.6.  Parking Place
       2.6.7.  Announcements and Voice Dialogs
     2.7.  Use of URIs
       2.7.1.  Naming Users in SIP
       2.7.2.  Naming Services with SIP URIs
     2.8.  Invoker Independence
     2.9.  Billing issues
   3.  Catalog of call control actions and sample features
     3.1.  Early Dialog Actions
       3.1.1.  Remote Answer
       3.1.2.  Remote Forward or Put
       3.1.3.  Remote Busy or Error Out
     3.2.  Single Dialog Actions
       3.2.1.  Remote Dial
       3.2.2.  Remote On and Off Hold
       3.2.3.  Remote Hangup
     3.3.  Multi-dialog actions
       3.3.1.  Transfer
       3.3.2.  Take
       3.3.3.  Add
       3.3.4.  Local Join
       3.3.5.  Insert
       3.3.6.  Split
       3.3.7.  Near-fork
       3.3.8.  Far fork
   4.  Security Considerations
   5.  IANA Considerations
   6.  Appendix A: Example Features
     6.1.  Implementation of these features
       6.1.1.  Call Park
       6.1.2.  Call Pickup
       6.1.3.  Music on Hold
       6.1.4.  Call Monitoring
       6.1.5.  Barge-in
       6.1.6.  Intercom
       6.1.7.  Speakerphone paging
       6.1.8.  Distinctive ring
       6.1.9.  Voice message screening
       6.1.10. Single Line Extension/Multiple Line Appearance
       6.1.11. Click-to-dial
       6.1.12. Pre-paid calling
       6.1.13. Voice Portal
   7.  Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Motivation and Background

   The Session Initiation Protocol [1] (SIP) was defined for the
   initiation, maintenance, and termination of sessions or calls between
   one or more users.
   However, despite its origins as a large-scale multiparty conferencing
   protocol, SIP is used today primarily for point-to-point calls.  This
   two-party configuration is the focus of the SIP specification and
   most of its extensions.

   This document defines a framework and requirements for multi-party
   usage of SIP.  Most multi-party operations manipulate SIP session
   dialogs (also known as call legs) or SIP conference media policy to
   cause participants in a conversation to perceive specific media
   relationships.  In other protocols that deal with the concept of
   calls, this manipulation is known as call control.  In addition to
   its dialog or policy manipulation aspect, "call control" also
   includes communicating information and events related to manipulating
   calls, including information and events dealing with session state
   and history, conference state, user state, and even message state.

   Based on input from the SIP community, the authors compiled the
   following set of goals for SIP call control and multiparty
   applications:
   o  Define Primitives, Not Services.  Allow for a handful of robust
      yet simple mechanisms that can be combined to deliver features and
      services.  Throughout this document we refer to these simple
      mechanisms as "primitives".  Primitives should be sufficiently
      robust that when they are combined they can be used to build many
      services.  However, the goal is not to define a provably
      complete set of primitives.  Note that while the IETF will NOT
      standardize behavior or services, it may define example services
      for informational purposes, as in service examples [6].
   o  Participant oriented.  The primitives should be designed to
      provide services that are oriented around the experience of the
      participants.  The authors observe that end users of features and
      services usually don't care how a media relationship is set up.
      Their ultimate experience is based only on the resulting media and
      other externally visible characteristics.
   o  Signaling Model independent: Support both a central control and a
      peer-to-peer feature invocation model (and combinations of the
      two).  Baseline SIP already supports a centralized control model
      described in 3pcc [7], and the SIP community has expressed a great
      deal of interest in peer-to-peer or distributed call control using
      primitives such as those defined in REFER [8], Replaces [9], and
      Join [10].
   o  Mixing Model independent: The bulk of interesting multiparty
      applications involve mixing or combining media from multiple
      participants.  This mixing can be performed by one or more of the
      participants, or by a centralized mixing resource.  The experience
      of the participants should not depend on the mixing model used.
      While most examples in this document refer to audio mixing, the
      framework applies to any media type.  In this context a "mixer"
      refers to combining media in an appropriate, media-specific way.
      This is consistent with the model described in the SIP
      conferencing framework.
   o  Invoker oriented.  Only the user who invokes a feature or a
      service needs to know exactly which service is invoked or why.
      This is good because it allows new services to be created without
      requiring new primitives from all the participants; and it allows
      for much simpler feature authorization policies, for example, when
      participation spans organizational boundaries.  As discussed in
      Section 2.8, this also avoids exponential state explosion when
      combining features.  The invoker only has to manage a user
      interface or API to prevent local feature interactions.  All the
      other participants simply need to manage the feature interactions
      of a much smaller number of primitives.
   o  Primitives make full use of URIs.
      URIs are a very powerful mechanism for describing users and
      services.  They represent a plentiful resource that can be
      extremely expressive and easily routed, translated, and
      manipulated--even across organizational boundaries.  URIs can
      contain special parameters and informational headers that need
      only be relevant to the owner of the namespace (domain) of the
      URI.  Just as a user who selects an http: URL need not understand
      the significance and organization of the web site it references, a
      user may encounter a SIP URL that translates into an email-style
      group alias, that plays a pre-recorded message, or runs some
      complex call-handling logic.  Note that while this may seem to
      conflict with the previous goal, both goals can be satisfied by
      the same model.
   o  Make use of SIP headers and SIP event packages to provide SIP
      entities with information about their environment.  These should
      include information about the status/handling of dialogs on
      other user agents, information about the history of other contacts
      attempted prior to the current contact, the status of
      participants, the status of conferences, user presence
      information, and the status of messages.
   o  Encourage service decomposition, and design to make use of
      standard components using well-defined, simple interfaces.  Sample
      components include a SIP mixer, recording service, announcement
      server, and voice dialog server.  (This is not an exhaustive
      list.)
   o  Include authentication, authorization, policy, logging, and
      accounting mechanisms to allow these primitives to be used safely
      among mutually untrusted participants.  Some of these mechanisms
      may be used to assist in billing, but no specific billing system
      will be endorsed.
   o  Permit graceful fallback to baseline SIP.  Definitions for new SIP
      call control extensions/primitives must describe a graceful way to
      fall back to baseline SIP behavior.
      Support for one primitive must not imply support for another
      primitive.
   o  There is no desire or goal to reinvent traditional models, such as
      the model used by the [H.450] family of protocols, [JTAPI], or the
      [CSTA] call model, as these other models do not share the design
      goals presented in this document.

2.  Key Concepts

2.1.  "Conversation Space" Model

   This document introduces the concept of an abstract "conversation
   space" (essentially a set of participants who believe they are all
   communicating among one another).  Each conversation space contains
   one or more participants.

   Participants are SIP User Agents that send original media to or
   terminate and receive media from other members of the conversation
   space.  Logically, every participant in the conversation space has
   access to all the media generated in that space (this is strictly
   true if all participants share a common media type).  A SIP User
   Agent that does not contribute or consume any media is NOT a
   participant; nor is a user agent that merely forwards, transcodes,
   mixes, or selects media originating elsewhere in the conversation
   space.  [Note that a conversation space consists of zero or more SIP
   calls or SIP conferences.  A conversation space is similar to the
   definition of a "call" in some other call models.]

   Participants may represent human users or non-human users (referred
   to as robots or automatons in this document).  Some participants may
   be hidden within a conversation space.  Some examples of hidden
   participants include: robots that generate tones, images, or
   announcements during a conference to announce users arriving and
   departing, a human call center supervisor monitoring a conversation
   between a trainee and a customer, and robots that record media for
   training or archival purposes.

   Participants may also be active or passive.
   Active participants are expected to be intelligent enough to leave a
   conversation space when they no longer desire to participate.  (An
   attentive human participant is obviously active.)  Some robotic
   participants (such as a voice messaging system, an instant messaging
   agent, or a voice dialog system) may be active participants if they
   can leave the conversation space when there is no human interaction.
   Other robots (for example our tone generating robot from the previous
   example) are passive participants.  A human participant "on-hold" is
   passive.

   A conversation space can be drawn as a "bubble" or oval, or written
   as a "set" in curly or square brace notation.  Each set, oval, or
   "bubble" represents a conversation space.  Hidden participants are
   shown in lowercase letters.

      { A , B }            [ A , B ]

        .-.                 .---.
       /   \               /     \
      /  A  \             / A   b \
     (       )           (         )
      \  B  /             \ C   D /
       \   /               \     /
        '-'                 '---'

2.2.  Comparison with Related Definitions

   In SIP, a call is "an informal term that refers to some communication
   between peers, generally set up for the purposes of a multimedia
   conversation."  Obviously we cannot discuss normative behavior based
   on such an intentionally vague definition.  The concept of a
   conversation space is needed because the SIP definition of call is
   not sufficiently precise for the purpose of describing the user
   experience of multiparty features.

   Do any other definitions convey the correct meaning?  SIP and SDP [5]
   both define a conference as "a multimedia session identified by a
   common session description."  A session is defined as "a set of
   multimedia senders and receivers and the data streams flowing from
   senders to receivers."  Both of these definitions are heavily
   oriented toward multicast sessions with little differentiation among
   participants.  As such, neither is particularly useful for our
   purposes.
   In fact, the definition of "call" in some call models is more similar
   to our definition of a conversation space.

   Some examples of the relationship between conversation spaces, SIP
   dialogs, and SIP sessions are listed below.  In each example, a human
   user will perceive that there is a single call.
   o  A simple two-party call is a single conversation space, a single
      session, and a single dialog.
   o  A locally mixed three-way call is two sessions and two dialogs.
      It is also a single conversation space.
   o  A simple dial-in audio conference is a single conversation space,
      but is represented by as many dialogs and sessions as there are
      human participants.
   o  A multicast conference is a single conversation space, a single
      session, and as many dialogs as participants.

2.3.  Signaling Models

   Obviously, to make changes to a conversation space, you must be able
   to use SIP signaling to cause these changes.  Specifically there must
   be a way to manipulate SIP dialogs (call legs) to move participants
   into and out of conversation spaces.  Although this is not as
   obvious, there also must be a way to manipulate SIP dialogs to
   include non-participant user agents that are otherwise involved in a
   conversation space (for example, B2BUAs, 3pcc controllers, mixers,
   transcoders, translators, or relays).

   Implementations may set up the media relationships described in the
   conversation space model using the approach described in 3pcc [7].
   The 3pcc approach relies on only the following 3 primitive
   operations:
   o  Create a new dialog (INVITE)
   o  Modify a dialog (reINVITE)
   o  Destroy a dialog (BYE)

   The main advantage of the 3pcc approach is that it only requires very
   basic SIP support from end systems to support call control features.
   As such, third-party call control is a natural way to handle protocol
   conversion and mid-call features.
   It also has the advantage and disadvantage that new features can/must
   be implemented in one place only (the controller), and neither
   requires enhanced client functionality, nor takes advantage of it.

   In addition, a peer-to-peer approach is discussed at length in this
   draft.  The primary drawback of the peer-to-peer model is additional
   complexity in the end system and in the authentication and management
   models.  The benefits of the peer-to-peer model include:
   o  state remains at the edges
   o  call signaling need only go through participants involved (there
      are no additional points of failure)
   o  peers can take advantage of end-to-end message integrity or
      encryption
   o  setup time is shorter (fewer messages and round trips are
      required)

   The peer-to-peer approach relies on additional "primitive"
   operations, some of which are identified here.
   o  Replace an existing dialog
   o  Join a new dialog with an existing dialog
   o  Support SIP conference policy control
   o  Locally perform media forking (multi-unicast)
   o  Ask another UA to send a request on your behalf

   Many of the features, primitives, and actions described in this
   document also require some type of media mixing, combining, or
   selection as described in the next section.

2.4.  Mixing Models

   SIP permits a variety of mixing models, which are discussed here
   briefly.  This topic is discussed more thoroughly in the SIP
   conferencing framework [15] and cc-conferencing [19].  SIP supports
   both tightly-coupled and loosely-coupled conferencing, although more
   sophisticated behavior is available in tightly-coupled conferences.
   In a tightly-coupled conference, a single SIP user agent (called the
   focus) has a direct dialog relationship with each participant (and
   may control non-participant user agents as well).  In a
   loosely-coupled conference there are no coordinated signaling
   relationships among the participants.
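
   As a rough illustration of how the conferencing models named in this
   framework differ in signaling cost, the following Python sketch
   counts the SIP dialogs needed for n participants.  The function name
   and the model labels are hypothetical, chosen for this example only.
   The counts follow the descriptions given in this document: a locally
   mixed three-way call uses two dialogs, a dial-in conference uses one
   dialog per participant, and a full mesh needs a pairwise relationship
   between every two participants.

```python
def signaling_relationships(model: str, n: int) -> int:
    """Return the number of SIP dialogs for n participants.

    The model labels are illustrative assumptions, not protocol terms:
      "end_system_mixing": one participant mixes locally (Section 2.4.1.1)
      "centralized":       a focus/mixer serves everyone (Section 2.4.1.2)
      "full_mesh":         full distributed unicast (Section 2.4.2.2)
    """
    if n < 2:
        raise ValueError("a conference needs at least two participants")
    if model == "end_system_mixing":
        # The mixing participant holds one dialog to each other participant.
        return n - 1
    if model == "centralized":
        # The focus holds one dialog per participant.
        return n
    if model == "full_mesh":
        # One pairwise relationship for every pair: grows roughly as n squared.
        return n * (n - 1) // 2
    raise ValueError(f"unknown model: {model}")
```

   For example, signaling_relationships("full_mesh", 10) is 45, while
   the centralized model needs only 10 dialogs for the same group, which
   is one way to see why the full mesh is described later as scaling
   poorly.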

   For brevity, only the two most popular conferencing models (local and
   centralized mixing) are discussed in detail in this document.
   Applications of the conversation spaces model to loosely-coupled
   multicast and distributed full unicast mesh conferences are left as
   an exercise for the reader.  Note that a distributed full mesh
   conference can be used for basic conferences, but does not easily
   allow for more complex conferencing actions like splitting, merging,
   and sidebars.

   Call control features should be designed to allow a mixer (local or
   centralized) to decide when to reduce a conference back to a 2-party
   call, or drop all the participants (for example if only two
   automatons are communicating).  The actual heuristics used to release
   calls are beyond the scope of this document, but may depend on
   properties in the conversation space, such as the number of active,
   passive, or hidden participants; and the send-only, receive-only, or
   send-and-receive orientation of various participants.

2.4.1.  Tightly Coupled

   Tightly coupled conferences utilize a central point for signaling and
   authentication known as a focus [15].  The actual media can be
   centrally mixed or distributed.

2.4.1.1.  (Single) End System Mixing

   The first model we call "end system mixing".  In this model, user A
   calls user B, and they have a conversation.  At some point later, A
   decides to conference in user C.  To do this, A calls C, using a
   completely separate SIP call.  This call uses a different Call-ID,
   different tags, etc.  There is no call set up directly between B and
   C.  No SIP extension or external signaling is needed.  A merely
   decides to locally join two dialogs.

      B     C
       \   /
        \ /
         A

   A receives media streams from both B and C, and mixes them.  A sends
   a stream containing A's and C's streams to B, and a stream containing
   A's and B's streams to C.
   Basically, user A handles both signaling and media mixing.

2.4.1.2.  Centralized Mixing

   In a centralized mixing model, all participants have a pairwise SIP
   and media relationship with the mixer.  Common applications of
   centralized mixing include ad-hoc conferences and scheduled dial-in
   or dial-out conferences.  In the figure below, the mixer M receives
   and sends media to participants A, B, C, D, and E.

      B     C
       \   /
        \ /
         M --- A
        / \
       /   \
      D     E

2.4.1.3.  Centralized Signaling, Distributed Media

   In this conferencing model, there is a centralized controller, as in
   the dial-in and dial-out cases.  However, the centralized server
   handles signaling only.  The media is still sent directly between
   participants, using either multicast or multi-unicast.  Multi-unicast
   is when a user sends multiple packets (one for each recipient,
   addressed to that recipient).  This is referred to as a
   "Decentralized Multipoint Conference" in [H.323].  Full mesh media
   with centralized mixing is another approach.

2.4.2.  Loosely Coupled

   In these models, there is no point of central control of SIP
   signaling.  As in the "Centralized Signaling, Distributed Media" case
   above, all endpoints send media to all other endpoints.  Consequently
   every endpoint mixes its own media from all the other sources, and
   sends its own media to every other participant.

2.4.2.1.  Large-Scale Multicast Conferences

   Large-scale multicast conferences were the original motivation for
   both the Session Description Protocol SDP [5] and SIP.  In a
   large-scale multicast conference, one or more multicast addresses are
   allocated to the conference.  Each participant joins those multicast
   groups, and sends their media to those groups.  Signaling is not sent
   to the multicast groups.  The sole purpose of the signaling is to
   inform participants of which multicast groups to join.
   Large-scale multicast conferences are usually pre-arranged, with
   specific start and stop times.  However, multicast conferences do not
   need to be pre-arranged, so long as a mechanism exists to dynamically
   obtain a multicast address.

2.4.2.2.  Full Distributed Unicast Conferencing

   In this conferencing model, each participant has both a pairwise
   media relationship and a pairwise SIP relationship with every other
   participant (a full mesh).  This model requires a mechanism to
   maintain a consistent view of distributed state across the group.
   This is a classic hard problem in computer science.  Also, this model
   does not scale well for large numbers of participants, because for n
   participants the number of media and SIP relationships is
   approximately n-squared.  As a result, this model is not generally
   available in commercial implementations; rather, it is primarily the
   topic of research or experimental implementations.  Note that this
   model assumes peer-to-peer signaling.

2.5.  Conveying Information and Events

   Participants should have access to information about the other
   participants in a conversation space, so that this information can be
   rendered to a human user or processed by an automaton.  Although some
   of this information may be available from the Request-URI or To,
   From, Contact, or other SIP headers, another mechanism of reporting
   this information is necessary.

   Many applications are driven by knowledge about the progress of calls
   and conferences.  In general these types of events allow for the
   construction of distributed applications, where the application
   requires information on session dialog and conference state, but is
   not necessarily co-resident with an endpoint user agent or conference
   server.  For example, a focus involved in a conversation space may
   wish to provide URLs for conference status, and/or conference/floor
   control.
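
   The idea of an application that is driven by conference state without
   being co-resident with the focus can be sketched as a plain
   subscribe/notify pattern.  The Python below is an in-memory
   illustration only: the class name, method names, and the shape of the
   state dictionary are assumptions made for this example, and nothing
   here sends or parses SIP messages.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A handler receives the conference URI and a state snapshot (hypothetical shape).
Handler = Callable[[str, dict], None]


class ConferenceStateNotifier:
    """Toy stand-in for an entity that reports conference state changes."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, conference_uri: str, handler: Handler) -> None:
        # Register interest in one conference, identified by its URI.
        self._subscribers[conference_uri].append(handler)

    def notify(self, conference_uri: str, state: dict) -> int:
        # Deliver the new state to every subscriber; return how many were told.
        handlers = self._subscribers.get(conference_uri, [])
        for handler in handlers:
            handler(conference_uri, state)
        return len(handlers)
```

   A remote application would register a handler for a conference URI
   and react to each snapshot it receives, for instance by rendering the
   participant list to a human user.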

   The SIP Events [4] architecture defines general mechanisms for
   subscription to and notification of events within SIP networks.  It
   introduces the notion of a package that is a specific "instantiation"
   of the events mechanism for a well-defined set of events.

   Event packages are needed to provide the status of a user's session
   dialogs, provide the status of conferences and their participants,
   provide user presence information, provide the status of
   registrations, and provide the status of a user's messages.  While
   this is not an exhaustive list, these are sufficient to enable the
   sample features described in this document.

   The conference event package [12] allows users to subscribe to
   information about an entire tightly-coupled SIP conference.
   Notifications convey information about the participants such as: the
   SIP URL identifying each user, their status in the space (active,
   declined, departed), URLs to invoke other features (such as sidebar
   conversations), links to other relevant information (such as floor
   control policies), and if floor control policies are in place, the
   user's floor control status.  For conversation spaces created from
   cascaded conferences, conversation state can be gathered from
   relevant foci and merged into a cohesive set of state.

   The session dialog package [11] provides information about all the
   dialogs the target user is maintaining, what conversations the user
   is participating in, and how these are correlated.  Likewise the
   registration package [13] provides notifications when contacts have
   changed for a specific address-of-record.  The combination of these
   allows a user agent to learn about all conversations occurring for
   the entire registered contact set for an address-of-record.

   Note that user presence in SIP [14] has a close relationship with
   these latter two event packages.
   It is fundamental to the presence model that the information used to
   obtain user presence is constructed from any number of different
   input sources.  Examples of other such sources include calendaring
   information and uploads of presence documents.  These two packages
   can be considered another mechanism that allows a presence agent to
   determine the presence state of the user.  Specifically, a user
   presence server can act as a subscriber for the session dialog and
   registration packages to obtain additional information that can be
   used to construct a presence document.

   The multi-party architecture may also need to provide a mechanism to
   get information about the status/handling of a dialog (for example,
   information about the history of other contacts attempted prior to
   the current contact).  Finally, the architecture should provide ample
   opportunities to present informational URIs that relate to calls,
   conversations, or dialogs in some way.  For example, consider the SIP
   Call-Info header, or Contact headers returned in a 300-class
   response.  Frequently additional information about a call or dialog
   can be fetched via non-SIP URIs.  For example, consider a web page
   for package tracking when calling a delivery company, or a web page
   with related documentation when joining a dial-in conference.  The
   use of URIs in the multiparty framework is discussed in more detail
   in Section 2.7.

   Finally the interaction of SIP with stimulus-signaling-based
   applications, that allow a user agent to interact with an application
   without knowledge of the semantics of that application, is discussed
   in the SIP application interaction framework [16].  Stimulus
   signaling can occur to a user interface running locally with the
   client, or to a remote user interface, through media streams.
   Stimulus signaling encompasses a wide range of mechanisms, ranging
   from clicking on hyperlinks, to pressing buttons, to traditional Dual
   Tone Multi Frequency (DTMF) input. In all cases, stimulus signaling
   is supported through the use of markup languages, which play a key
   role in that framework.

2.6. Componentization and Decomposition

   This framework proposes a decomposed component architecture with a
   very loose coupling of services and components. This means that a
   service (such as a conferencing server or an auto-attendant) need not
   be implemented as an actual server. Rather, these services can be
   built by combining a few basic components in straightforward or
   arbitrarily complex ways.

   Since the components are easily deployed on separate boxes, by
   separate vendors, or even with separate providers, we achieve a
   separation of function that allows each piece to be developed in
   complete isolation. We can also reuse existing components for new
   applications. This allows rapid service creation, and the ability
   for services to be distributed across organizational domains anywhere
   in the Internet.

   For many of these components it is also desirable to discover their
   capabilities, for example querying the ability of a mixer to host a
   10-dialog conference, or to reserve resources for a specific time.
   These actions could be provided in the form of URLs, provided there
   is an a priori means of understanding their semantics. For example,
   if there is a published dictionary of operations and a way to query
   the service for the available operations and the associated URLs, the
   URL can be the interface for providing these service operations.
   This concept is described in more detail in the context of dialog
   operations in section

2.6.1. Media Intermediaries

   Media Intermediaries are not participants in any conversation space,
   although an entity that is also a media translator may also have a
   co-located participant component (for example a mixer that also
   announces the arrival of a new participant; the announcement portion
   is a participant, but the mixer itself is not). Media intermediaries
   should be as transparent as possible to the end users, offering a
   useful, fundamental service without getting in the way of new
   features implemented by participants. Some common media
   intermediaries are described below.

2.6.2. Mixer

   A SIP mixer is a component that combines media from all dialogs in
   the same conversation in a media-specific way. For example, the
   default combining for an audio conference might be an N-1
   configuration, while a text mixer might interleave text messages on a
   per-line basis. More details about how to manipulate the media
   policy used by mixers are being discussed in the XCON Working Group.

2.6.3. Transcoder

   A transcoder translates media from one encoding or format to another
   (for example, GSM voice to G.711, MPEG2 to H.261, or text/html to
   text/plain), or from one media type to another (for example text to
   speech). Transcoding is discussed more thoroughly in SIP transcoding
   services invocation [17].

2.6.4. Media Relay

   A media relay terminates media and simply forwards it to a new
   destination without changing the content in any way. Sometimes media
   relays are used to provide source IP address anonymity, to facilitate
   middlebox traversal, or to provide a trusted entity where media can
   be forcefully disconnected.

2.6.5. Queue Server

   A queue server is a location where calls can be entered into one of
   several FIFO (first-in, first-out) queues. A queue server would
   subscribe to the presence of groups or individuals who are interested
   in its queues.
   When detecting that a user is available to service a queue, the
   server redirects or transfers the last call in the relevant queue to
   the available user. On a queue-by-queue basis, authorized users
   could also subscribe to the call state (dialog information) of calls
   within a queue. Authorized users could use this information to
   effectively pluck (take) a call out of the queue (for example by
   sending an INVITE with a Replaces header to one of the user agents in
   the queue).

2.6.6. Parking Place

   A parking place is a location where calls can be terminated
   temporarily and then retrieved later. While a call is "parked", it
   can receive media "on-hold" such as music, announcements, or
   advertisements. Such a service could be further decomposed such that
   announcements or music are handled by a separate component.

2.6.7. Announcements and Voice Dialogs

   An announcement server is a server that can play digitized media
   (frequently audio), such as music or recorded speech. These servers
   are typically accessible via SIP, HTTP, or RTSP. A convention for
   specifying announcements in SIP URIs is described in [24]. An
   analogous service is a recording service that stores digitized media;
   the same server could easily provide both.

   A "voice dialog" is a model of spoken interactive behavior between a
   human and an automaton that can include synthesized speech, digitized
   audio, recognition of spoken and DTMF key input, recording of spoken
   input, and interaction with call control. Voice dialogs frequently
   consist of forms or menus. Forms present information and gather
   input; menus offer choices of what to do next.

   Spoken dialogs are a basic building block of applications that use
   voice.
   Consider, for example, that a voice mail system, the conference-id
   and passcode collection system for a conferencing system, and
   complicated voice portal applications all require a voice dialog
   component.

2.6.7.1. Text-to-Speech and Automatic Speech Recognition

   Text-to-Speech (TTS) is a service that converts text into digitized
   audio. TTS is frequently integrated into other applications, but
   when separated as a component, it provides greater opportunity for
   broad reuse. Automatic Speech Recognition (ASR) is a service that
   attempts to decipher digitized speech based on a proposed grammar.
   Like TTS, ASR services can be embedded, or exposed so that many
   applications can take advantage of such services. A standardized
   (decomposed) interface to access standalone TTS and ASR services is
   currently being developed in the SPEECHSC Working Group.

2.6.7.2. VoiceXML

   [VoiceXML] is a W3C recommendation that was designed to give authors
   control over the spoken dialog between users and applications. The
   application and user take turns speaking: the application prompts the
   user, and the user in turn responds. Its major goal is to bring the
   advantages of web-based development and content delivery to
   interactive voice response applications. We believe that VoiceXML
   represents the ideal partner for SIP in the development of
   distributed IVR servers. VoiceXML is an XML-based scripting language
   for describing IVR services at an abstract level. VoiceXML supports
   DTMF recognition, speech recognition, text-to-speech, and playing out
   of recorded media files. The results of the data collected from the
   user are passed to a controlling entity through an HTTP POST
   operation. The controller can then return another script, or
   terminate the interaction with the IVR server.

   A VoiceXML server also need not be implemented as a monolithic
   server.
   Below is a diagram of a VoiceXML browser that is split into media and
   non-media handling parts. The VoiceXML interpreter handles SIP
   dialog state and state within a VoiceXML document, and sends requests
   to the media component over another protocol.

                     +-------------+
                     |             |
                     |  VoiceXML   |
                     | Interpreter |
                     | (signaling) |
                     +-------------+
                       ^          ^
                       |          |
                   SIP |          | RTSP
                       |          |
                       |          |
                       v          v
          +-------------+        +-------------+
          |             |        |             |
          |   SIP UA    |  RTP   | RTSP Server |
          |             |<------>|   (media)   |
          |             |        |             |
          +-------------+        +-------------+

                Figure : Decomposed VoiceXML Server

2.7. Use of URIs

   All naming in SIP uses URIs. URIs in SIP are used in a plethora of
   contexts: the Request-URI; Contact, To, From, and *-Info headers;
   application/uri bodies; and embedded in email, web pages, instant
   messages, and ENUM records. The Request-URI identifies the user or
   service that the call is destined for.

   SIP URIs embedded in informational SIP headers, SIP bodies, and
   non-SIP content can also specify methods, special parameters,
   headers, and even bodies. For example:

   sip:bob@b.example.com;method=REFER?Refer-To=http://example.com/~alice

   Throughout this draft we discuss call control primitive operations.
   One of the biggest problems is defining how these operations may be
   invoked. There are a number of ways to do this. One way is to
   define the primitives in the protocol itself such that SIP methods
   (for example REFER) or SIP headers (for example Replaces) indicate a
   specific call control action. Another way to invoke call control
   primitives is to define a specific Request-URI naming convention.
   Either these conventions must be shared between the client (the
   invoker) and the server, or published by or on behalf of the server.
   The former involves defining URL construction techniques (e.g. URL
   parameters and/or token conventions) as proposed in [24].
   The latter technique usually involves discovering the URI via a SIP
   event package, a web page, a business card, or an Instant Message.
   Yet another means to acquire the URLs is to define a dictionary of
   primitives with well-defined semantics and provide a means to query
   the named primitives and corresponding URLs that may be invoked on
   the service or dialogs.

2.7.1. Naming Users in SIP

   An address-of-record, or public SIP address, is a SIP (or SIPS) URI
   that points to a domain with a location server that can map the URI
   to a set of Contact URIs where the user might be available.
   Typically the Contact URIs are populated via registration.

   Address of Record               Contacts

   sip:bob@biloxi.example.com  ->  sip:bob@babylon.biloxi.example.com:5060
                                   sip:bbrown@mailbox.provider.example.net
                                   sip:+1.408.555.6789@mobile.example.net

   Callee Capabilities [20] defines a set of additional parameters to
   the Contact header that define the characteristics of the user agent
   at the specified URI. For example, there is a mobility parameter
   that indicates whether the UA is fixed or mobile. When a user agent
   registers, it places these parameters in the Contact headers to
   characterize the URIs it is registering. This allows a proxy for
   that domain to have information about the contact addresses for that
   user.

   When a caller sends a request, it can optionally request Caller
   Preferences [21], by including the Accept-Contact and Reject-Contact
   headers that request certain handling by the proxy in the target
   domain. These headers contain preferences that describe the set of
   desired URIs to which the caller would like their request routed.
   The proxy in the target domain matches these preferences with the
   Contact characteristics originally registered by the target user.
   The target user can also choose to run arbitrarily complex "Find-me"
   feature logic on a proxy in the target domain.
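
   The matching step described above can be illustrated with a small
   sketch. This is a deliberately simplified model, not the matching
   algorithm specified by Caller Preferences [21]: it assumes feature
   parameters are plain key/value pairs, treats a preference as matched
   only when every listed feature is present with the same value, and
   ignores scoring and the "require" and "explicit" modifiers. The
   names Contact, matches, and select_targets are hypothetical, and the
   feature values attached to the example contacts are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class Contact:
    """A registered Contact URI plus the feature parameters it advertised."""
    uri: str
    features: dict = field(default_factory=dict)


def matches(contact, predicate):
    """True if the contact advertises every feature in the predicate with
    the same value (hypothetical, simplified matching rule)."""
    return all(contact.features.get(k) == v for k, v in predicate.items())


def select_targets(contacts, accept=None, reject=None):
    """Drop contacts that match the caller's Reject-Contact preferences,
    then list contacts that match Accept-Contact ahead of the rest."""
    accept = accept or {}
    reject = reject or {}
    allowed = [c for c in contacts if not (reject and matches(c, reject))]
    # sorted() is stable, so registration order is kept within each group.
    return sorted(allowed, key=lambda c: not matches(c, accept))


# Contacts registered for sip:bob@biloxi.example.com (values illustrative).
contacts = [
    Contact("sip:bob@babylon.biloxi.example.com:5060", {"mobility": "fixed"}),
    Contact("sip:bbrown@mailbox.provider.example.net", {"automata": True}),
    Contact("sip:+1.408.555.6789@mobile.example.net", {"mobility": "mobile"}),
]

# The caller prefers a mobile device and does not want to reach an automaton.
targets = select_targets(
    contacts, accept={"mobility": "mobile"}, reject={"automata": True}
)
for contact in targets:
    print(contact.uri)
```

   In this sketch the caller can exclude locations outright, while the
   ordering among the remaining contacts is only a hint; a real proxy
   would combine it with the callee's own policy.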

   There is a strong asymmetry in how preferences for callers and
   callees can be presented to the network. While a caller takes an
   active role by initiating the request, the callee takes a passive
   role in waiting for requests. This motivates the use of
   callee-supplied scripts and caller preferences included in the call
   request. This asymmetry is also reflected in the appropriate
   relationship between caller and callee preferences. A server for a
   callee should respect the wishes of the caller to avoid certain
   locations, while the preference among locations has to be the
   callee's choice, as it determines where, for example, the phone rings
   and whether the callee incurs mobile telephone charges for incoming
   calls.

   SIP User Agent implementations are encouraged to make intelligent
   decisions based on the type of participants (active/passive, hidden,
   human/robot) in a conversation space. This information is conveyed
   via the session dialog package or in a parameter of an appropriate
   SIP header. For example, a music on hold service may take the
   sensible approach that if there are two or more unhidden
   participants, it should not provide hold music; or that it will not
   send hold music to robots.

   Multiple participants in the same conversation space may represent
   the same human user. For example, the user may use one participant
   for video, chat, and whiteboard media on a PC and another for audio
   media on a SIP phone. In this case, the address-of-record is the
   same for both user agents, but the Contacts are different. In
   addition, human users may add robot participants that act on their
   behalf (for example a call recording service, or a calendar
   reminder). Call Control features in SIP should continue to function
   as expected in such an environment.

2.7.2. Naming Services with SIP URIs

   A critical piece of defining a session level service that can be
   accessed by SIP is defining the naming of the resources within that
   service. This point cannot be overstated.

   In the context of SIP control of application components, we take
   advantage of the fact that the standard SIP URI has a user part.
   Most services may be thought of as user automatons that participate
   in SIP sessions. It naturally follows that the user address, or the
   left-hand-side of the URI, should be utilized as a service indicator.

   For example, media servers commonly offer multiple services at a
   single host address. Use of the user part as a service indicator
   enables service consumers to direct their requests without ambiguity.
   It has the added benefit of enabling media services to register their
   availability with SIP Registrars just as any "real" SIP user would.
   This maintains consistency and provides enhanced flexibility in the
   deployment of media services in the network.

   There has been much discussion about the potential for confusion if
   media services URIs are not readily distinguishable from other types
   of SIP UAs. The use of a service namespace provides a mechanism to
   unambiguously identify standard interfaces while not constraining the
   development of private or experimental services.

   In SIP, the Request-URI identifies the user or service that the call
   is destined for. The great advantage of using URIs (specifically,
   the SIP Request-URI) as a service identifier comes from the
   combination of two facts. First, unlike in the PSTN, where the
   namespace (dialable telephone numbers) is limited, URIs come from an
   infinite space. They are plentiful, and they are free. Secondly,
   the primary function of SIP is call routing through manipulations of
   the Request-URI. In the traditional SIP application, this URI
   represents people.
   However, the URI can also represent services, as we propose here.
   This means we can apply the routing services SIP provides to routing
   of calls to services. As a result, the problem of service invocation
   and service location becomes a routing problem, for which SIP
   provides a scalable and flexible solution. Since there is such a
   vast namespace of services, we can explicitly name each service in a
   finely granular way. This allows the distribution of services across
   the network. For further discussion about services and SIP URIs, see
   RFC 3087 [22].

   Consider a conferencing service in which we have separated the names
   of ad-hoc conferences from scheduled conferences. We can program
   proxies to route calls for ad-hoc conferences to one set of servers,
   and calls for scheduled ones to another, possibly even in a different
   provider. In fact, since each conference itself is given a URI, we
   can distribute conferences across servers, and easily guarantee that
   calls for the same conference always get routed to the same server.
   This is in stark contrast to conferences in the telephone network,
   where the equivalent of the URI (the phone number) is scarce. An
   entire conferencing provider generally has one or two numbers.
   Conference IDs must be obtained through IVR interactions with the
   caller, or through a human attendant. This makes it difficult to
   distribute conferences across servers all over the network, since the
   PSTN routing only knows about the dialed number.

   For more examples, consider the URI conventions of RFC 4240 [24] for
   media servers and RFC 4458 [25] for voicemail and IVR systems.

   In practical applications, it is important that an invoker not apply
   its own semantic rules to URIs it did not create. Instead, it should
   allow any arbitrary string to be provisioned, and map the string to
   the desired behavior.
   The administrator of a service may choose to provision specific
   conventions or mnemonic strings, but the application should not
   require it. In any large installation, the system owner is likely to
   have pre-existing rules for mnemonic URIs, and any attempt by an
   application to define its own rules may create a conflict.
   Implementations should allow an arbitrary mix of URLs from these
   schemes, or any other scheme that renders valid SIP URIs, to be
   provisioned, rather than enforce only one particular scheme.

   As we have shown, SIP URIs represent an ideal, flexible mechanism for
   describing and naming service resources, be they queues, conferences,
   voice dialogs, announcements, voicemail treatments, or phone
   features.

2.8. Invoker Independence

   With functional signaling, only the invoker of features in SIP needs
   to know exactly which feature they are invoking. One of the primary
   benefits of this approach is that combinations of functional features
   work in SIP call control without requiring complex feature
   interaction matrices. For example, let us examine the combination of
   a "transfer" of a call that is "conferenced".

   Alice calls Bob. Alice silently "conferences in" her robotic
   assistant Albert as a hidden party. Bob transfers Alice to Carol.
   If Bob asks Alice to Replace her leg with a new one to Carol, then
   both Alice and Albert should be communicating with Carol
   (transparently).

   Using the peer-to-peer model, this combination of features works fine
   if A is doing local mixing (Alice replaces Bob's dialog with
   Carol's), or if A is using a central mixer (the mixer replaces Bob's
   dialog with Carol's). A clever implementation using the 3pcc model
   can generate similar results.

   New extensions to the SIP Call Control Framework should attempt to
   preserve this property.

2.9. Billing issues

   Billing in the PSTN is typically based on who initiated a call.
   At the moment, billing in a SIP network is neither consistent with
   itself, nor with the PSTN. (A billing model for SIP should allow for
   both PSTN-style billing, and non-PSTN billing.) The example below
   demonstrates one such inconsistency.

   Alice places a call to Bob. Alice then blind transfers Bob to Carol
   through a PSTN gateway. In current usage of REFER, Bob may be billed
   for a call he did not initiate (although his UA originated the
   outgoing dialog). This is not necessarily a terrible thing, but it
   demonstrates a security concern (Bob must have appropriate local
   policy to prevent fraud). Also, Alice may wish to pay for Bob's
   session with Carol. There should be a way to signal this in SIP.

   Likewise, a Replacement call may maintain the same billing
   relationship as a Replaced call, so if Alice first calls Carol, then
   asks Bob to Replace this call, Alice may continue to receive a bill.

   Further work in SIP billing should define a way to set or discover
   the direction of billing.

3. Catalog of call control actions and sample features

   Call control actions can be categorized by the dialogs upon which
   they operate. The actions may involve a single or multiple dialogs.
   These dialogs can be early or established. Multiple dialogs may be
   related in a conversation space to form a conference or other
   interesting media topologies.

   It should be noted that it is desirable to provide a means by which a
   party can discover the actions that may be performed on a dialog.
   The interested party may be independent of or related to the dialogs.
   One means of accomplishing this is through the ability to define and
   obtain URLs for these actions as described in section .

   Below are listed several call control "actions" that establish or
   modify dialogs and relate the participants in a conversation space.

   The names of the actions listed are for descriptive purposes only
   (they are not normative). This list of actions is not meant to be
   exhaustive.

   In the examples, all actions are initiated by the user "Alice"
   represented by UA "A".

3.1. Early Dialog Actions

   The following are a set of actions that may be performed on a single
   early dialog. These actions can be thought of as a set of remote
   control operations. For example, an automaton might perform the
   operation on behalf of a user. Alternatively, a user might use the
   remote control in the form of an application to perform the action on
   the early dialog of a UA that may be out of reach. All of these
   actions correspond to telling the UA how to respond to a request to
   establish an early dialog. These actions provide useful
   functionality for PDA, PC, and server-based applications that desire
   the ability to control a UA. A proposed mechanism for this type of
   functionality is described in Remote Call Control [23].

3.1.1. Remote Answer

   A dialog is in some early dialog state such as 180 Ringing. It may
   be desirable to tell the UA to answer the dialog. That is, tell it to
   send a 200 OK response to establish the dialog.

3.1.2. Remote Forward or Put

   It may be desirable to tell the UA to respond with a 3xx class
   response to forward an early dialog to another UA.

3.1.3. Remote Busy or Error Out

   It may be desirable to instruct the UA to send an error response such
   as 486 Busy Here.

3.2. Single Dialog Actions

   There is another useful set of actions that operate on a single
   established dialog. These operations are useful in building
   productivity applications for aiding users to control their phone.
   An example is a CRM application that sets up calls for a user,
   eliminating the need for the user to actually enter an address.
   These operations can also be thought of as remote control actions. A
   proposed mechanism for this type of functionality is described in
   Remote Call Control [23].

3.2.1. Remote Dial

   This action instructs the UA to initiate a dialog. This action can
   be performed using the REFER method.

3.2.2. Remote On and Off Hold

   This action instructs the UA to put an established dialog on hold.
   Though this operation can conceptually be performed with the REFER
   method, there are no semantics defined as to what the referred party
   should do with the SDP. There is no way to distinguish between the
   desire to go on or off hold.

3.2.3. Remote Hangup

   This action instructs the UA to terminate an early or established
   dialog. A REFER request with the following Refer-To URI and
   Target-Dialog header field [26] performs this action. Note: this
   example does not show the full set of header fields.

      REFER sip:carol@client.chicago.net SIP/2.0
      Refer-To: sip:bob@babylon.biloxi.example.com;method=BYE
      Target-Dialog: 13413098;local-tag=879738;remote-tag=023214

3.3. Multi-dialog actions

   These actions apply to a set of related dialogs.

3.3.1. Transfer

   The conversation space changes as follows:

         before          after
         { A , B }  -->  { C , B }

   A replaces itself with C.

   To make this happen using the peer-to-peer approach, "A" would send
   two SIP requests.
   A shorthand for those requests is shown below:

      REFER B Refer-To:C
      BYE B

   To make this happen instead using the 3pcc approach, the controller
   sends requests represented by the shorthand below:

      INVITE C (w/SDP of B)
      reINVITE B (w/SDP of C)
      BYE A

   Features enabled by this action:
   o  blind transfer
   o  transfer to a central mixer (some type of conference or forking)
   o  transfer to park server (park)
   o  transfer to music on hold or announcement server
   o  transfer to a "queue"
   o  transfer to a service (such as Voice Dialogs service)
   o  transition from local mixer to central mixer

   This action is frequently referred to as "completing an attended
   transfer". It is described in more detail in cc-transfer [18].

3.3.2. Take

   The conversation space changes as follows:

         { B , C }  -->  { B , A }

   A forcibly replaces C with itself. In most uses of this primitive, A
   is just "un-replacing" itself.

   Using the peer-to-peer approach, "A" sends:

      INVITE B Replaces:

   Using the 3pcc approach (all requests sent from controller):

      INVITE A (w/SDP of B)
      reINVITE B (w/SDP of A)
      BYE C

   Features enabled by this action:
   o  transferee completes an attended transfer
   o  retrieve from central mixer (not recommended)
   o  retrieve from music on hold or park
   o  retrieve from queue
   o  call center take
   o  voice portal resuming ownership of a call it originated
   o  answering-machine style screening (pickup)
   o  pickup of a ringing call (i.e. early dialog)

   Note that pickup of a ringing call has some interesting additional
   requirements. First, it is an early dialog as opposed to an
   established dialog. Second, the party that is to pick up the call may
   wish to do so only while it is an early dialog.
   That is, in the race condition where the ringing UA accepts just
   before it receives signaling from the party wishing to take the call,
   the taking party wishes to yield or cancel the take. The goal is to
   avoid yanking an answered call from the called party.

   This action is described in Replaces [9] and in cc-transfer [18].

3.3.3. Add

   Note that the following four actions are described in cc-conferencing
   [19].

   This is merely adding a participant to a SIP conference. The
   conversation space changes as follows:

         { A , B }  -->  { A, B, C }

   A adds C to the conversation.

   Using the peer-to-peer approach, adding a party using local mixing
   requires no signaling. To transition from a 2-party call or a
   locally mixed conference to central mixing, A could send the
   following requests:

      REFER B Refer-To: conference-URI
      INVITE conference-URI
      BYE B

   To add a party to a conference:

      REFER C Refer-To: conference-URI
   or
      REFER conference-URI Refer-To: C

   Using the 3pcc approach to transition to centrally mixed, the
   controller would send:

      INVITE mixer leg 1 (w/SDP of A)
      INVITE mixer leg 2 (w/SDP of B)
      INVITE C (late SDP)
      reINVITE A (w/SDP of mixer leg 1)
      reINVITE B (w/SDP of mixer leg 2)
      INVITE mixer leg 3 (w/SDP of C)

   To add a party to a SIP conference:

      INVITE C (late SDP)
      INVITE conference-URI (w/SDP of C)

   Features enabled:
   o  standard conference feature
   o  call recording
   o  answering-machine style screening (screening)

3.3.4. Local Join

   The conversation space changes like this:

         { A, B} , {A, C}  -->  {A, B, C}

   or like this

         { A, B} , {C, D}  -->  {A, B, C, D}

   A takes two conversation spaces and joins them together into a single
   space.

   Using the peer-to-peer approach, A can mix locally, or REFER the
   participants of both conversation spaces to the same central mixer
   (as in 5.3).

   For the 3pcc approach, the call flows for inserting participants, and
   joining and splitting conversation spaces are tedious yet
   straightforward, so these are left as an exercise for the reader.

   Features enabled:
   o  standard conference feature
   o  leaving a sidebar to rejoin a larger conference

3.3.5. Insert

   The conversation space changes like this:

         { B , C }  -->  {A, B, C }

   A inserts itself into a conversation space.

   A proposed mechanism for signaling this using the peer-to-peer
   approach is to send a new header in an INVITE with "joining"
   semantics. For example:

      INVITE B Join:

   If B accepted the INVITE, B would accept responsibility to set up the
   dialogs and mixing necessary (for example: to mix locally or to
   transfer the participants to a central mixer).

   Features enabled:
   o  barge-in
   o  call center monitoring
   o  call recording

3.3.6. Split

         { A, B, C, D }  -->  { A, B } , { C, D }

   If using a central conference with peer-to-peer:

      REFER C Refer-To: conference-URI (new URI)
      REFER D Refer-To: conference-URI (new URI)
      BYE C
      BYE D

   Features enabled:
   o  sidebar conversations during a larger conference

3.3.7. Near-fork

   A participates in two conversation spaces simultaneously:

         { A, B }  -->  { B , A } & { A , C }

   A is a participant in two conversation spaces such that A sends the
   same media to both spaces, and renders media from both spaces,
   presumably by mixing or rendering the media from both. We can define
   that A is the "anchor" point for both forks, each of which is a
   separate conversation space. This action is a purely local
   implementation matter (it requires no special signaling).
   Local features such as switching calls between the background and
   foreground are possible using this media relationship.

3.3.8. Far fork

   The conversation space diagram...

         { A, B }  -->  { A , B } & { B , C }

   A requests B to be the "anchor" of two conversation spaces. This is
   easily set up by creating a conference with two sub-conferences and
   setting the media policy appropriately such that B is a participant
   in both. Media forking can also be set up using 3pcc as described in
   Section 5.1 of RFC3264 [3] (an offer/answer model for SDP). The
   session descriptions for forking are quite complex. Controllers
   should verify that endpoints can handle forked media, for example
   using prior configuration.

   Features enabled:
   o  barge-in
   o  voice portal services
   o  whisper
   o  hotword detection
   o  sending DTMF somewhere else

4. Security Considerations

   Call Control primitives provide a powerful set of features that can
   be dangerous in the hands of an attacker. To complicate matters,
   call control primitives are likely to be automatically authorized
   without direct human oversight.

   The class of attacks that are possible using these tools includes the
   ability to eavesdrop on calls, disconnect calls, redirect calls,
   render irritating content (including ringing) at a user agent, cause
   an action that has billing consequences, subvert billing
   (theft-of-service), and obtain private information. Call control
   extensions must take extra care to describe how these attacks will be
   prevented.

   We can also make some general observations about authorization and
   trust with respect to call control. The security model is
   dramatically dependent on the signaling model chosen (see section
   3.2).

   Let us first examine the security model used in the 3pcc approach.

   All signaling goes through the controller, which is a trusted entity.
Traditional SIP authentication and hop-by-hop encryption and message
integrity work fine in this environment, but end-to-end encryption
and message integrity may not be possible.

When using the peer-to-peer approach, call control actions and
primitives can be legitimately initiated by a) an existing
participant in the conversation space, b) a former participant in the
conversation space, or c) an entity trusted by one of the
participants. For example, a participant always initiates a
transfer; a retrieve from Park (a take) is initiated on behalf of a
former participant; and a barge-in (insert or far-fork) is initiated
by a trusted entity (an operator, for example).

Authenticating requests by an existing participant or a trusted
entity can be done with baseline SIP mechanisms. In the case of
features initiated by a former participant, these should be protected
against replay attacks by using a unique name or identifier per
invocation. The Replaces header exhibits this behavior as a
by-product of its operation (once a Replaces operation is successful,
the dialog being Replaced no longer exists). For other requests, a
"one-time" Request-URI may be provided to the feature invoker.

To authorize call control primitives that trigger special behavior
(such as an INVITE with Replaces or Join semantics), the receiving
user agent may have trouble finding appropriate credentials with
which to challenge or authorize the request, as the sender may be
completely unknown to the receiver, except through the introduction
of a third party. These credentials need to be passed transitively
in some way or fetched in an event body, for example.

5. IANA Considerations

This document requires no action by IANA.

6. Appendix A: Example Features

Primitives are defined in terms of their ability to provide features.
These example features should require an amply robust set of services
to demonstrate a useful set of primitives. They are described here
briefly. Note that the descriptions of these features are
non-normative. Some of these features are used as examples in
section 6 to demonstrate how some features may require certain media
relationships. Note also that this document describes a mixture of
features originating in the world of telephones and features that are
clearly Internet oriented.

Example Feature Definitions:

Call Waiting - Alice is in a call, then receives another call. Alice
can place the first call on hold, and talk with the other caller.
She can typically switch back and forth between the callers.

Blind Transfer - Alice is in a conversation with Bob. Alice asks Bob
to contact Carol, but makes no attempt to contact Carol
independently. In many implementations, Alice does not verify Bob's
success or failure in contacting Carol.

Attended Transfer - The transferring party establishes a session with
the transfer target before completing the transfer.

Consultative Transfer - The transferring party establishes a session
with the target and mixes both sessions together so that all three
parties can participate, then disconnects, leaving the transferee and
transfer target with an active session.

Conference Call - Three or more active, visible participants in the
same conversation space.

Call Park - A call participant parks a call (essentially puts the
call on hold), and then retrieves it at a later time (typically from
another location).

Call Pickup - A party picks up a call that was ringing at another
location. One variation allows the caller to choose which location;
another variation just picks up any call in that user's "pickup
group".
Music on Hold - When Alice places a call with Bob on hold, her UA
replaces its audio with streaming content such as music,
announcements, or advertisements.

Call Monitoring - A call center supervisor joins an in-progress call
for monitoring purposes.

Barge-in - Carol interrupts Alice, who has a call in progress with
Bob. In some variations, Alice forcibly joins a new conversation
with Carol; in other variations, all three parties are placed in the
same conversation (basically a 3-way conference).

Hotline - Alice picks up a phone and is immediately connected to the
technical support hotline, for example.

Auto Answer - Calls to a certain address or location answer
immediately via a speakerphone.

Intercom - Alice typically presses a button on a phone that
immediately connects to another user or phone and causes that phone
to play her voice over its speaker. Some variations immediately set
up two-way communications; other variations require another button to
be pressed to enable a two-way conversation.

Speakerphone paging - Alice calls the paging address and speaks. Her
voice is played on the speaker of every idle phone in a preconfigured
group of phones.

Speed dial - Alice dials an abbreviated number, or enters an alias,
or presses a special speed dial button representing Bob. Her action
is interpreted as if she specified the full address of Bob.

Call Return - Alice calls Bob. Bob misses the call or is disconnected
before he is finished talking to Alice. Bob invokes Call Return,
which calls Alice, even if Alice did not provide her real identity or
location to Bob.

Inbound Call Screening - Alice doesn't want to receive calls from
Matt. Inbound Screening prevents Matt from disturbing Alice. In
some variations this works even if Matt hides his identity.
Outbound Call Screening - Alice is paged and unknowingly calls a PSTN
pay-service telephone number in the Caribbean, but local policy
blocks her call, and possibly informs her why.

Call Forwarding - Before a dialog is accepted it is redirected to
another location, for example, because the originally intended
recipient is busy, does not answer, is disconnected from the network,
or has configured all requests to go somewhere else.

Message Waiting - Bob calls Alice while she is away from her phone.
When she returns, a visible or audible indicator conveys that someone
has left her a voicemail message. The message waiting indication may
also convey how many messages are waiting, from whom, what time, and
other useful pieces of information.

Do Not Disturb - Alice selects the Do Not Disturb option. Calls to
her either ring briefly or not at all and are forwarded elsewhere.
Some variations allow specially authorized callers to override this
feature and ring Alice anyway.

Distinctive ring - Incoming calls have different ring cadences or
sample sounds depending on the From party, the To party, or other
factors.

Automatic Callback - Alice calls Bob, but Bob is busy. Alice would
like Bob to call her automatically when he is available. When Bob
hangs up, Alice's phone rings. When Alice answers, Bob's phone
rings. Bob answers and they talk.

Find-Me - Alice sets up complicated rules for how she can be reached
(possibly using CPL [27], presence [14], or other factors). When Bob
calls Alice, his call is eventually routed to a temporary Contact
where Alice happens to be available.

Whispered call waiting - Alice is in a conversation with Bob. Carol
calls Alice. Either Carol can "whisper" to Alice directly ("Can you
get lunch in 15 minutes?"), or an automaton whispers to Alice
informing her that Carol is trying to reach her.
Voice message screening - Bob calls Alice. Alice is screening her
calls, so Bob hears Alice's voicemail greeting. Alice can hear Bob
leave his message. If she decides to talk to Bob, she can take the
call back from the voicemail system; otherwise she can let Bob leave
a message. This emulates the behavior of a home telephone answering
machine.

Presence-Enabled Conferencing - Alice wants to set up a conference
call with Bob and Cathy when they all happen to be available (rather
than scheduling a predefined time). The server providing the
application monitors their status, and calls all three when they are
all "online", not idle, and not in another call.

IM Conference Alerts - A user receives a notification as an Instant
Message whenever someone joins a conference they are also in.

Single Line Extension/Multiple Line Appearance - A group of phones
are all treated as "extensions" of a single line. A call for one
rings them all. As soon as one answers, the others stop ringing. If
any extension is actively in a conversation, another extension can
"pick up" and immediately join the conversation. This emulates the
behavior of a home telephone line with multiple phones.

Click-to-dial - Alice looks in her company directory for Bob. When
she finds Bob, she clicks on a URL to call him. Her phone rings (or
possibly answers automatically), and when she answers, Bob's phone
rings.

Pre-paid calling - Alice pays for a certain currency or unit amount
of calling value. When she places a call, she provides her account
number somehow. If her account runs out of calling value during a
call, her call is disconnected or redirected to a service where she
can purchase more calling value.

Voice Portal - A service that allows users to access a portal site
using spoken dialog interaction.
For example, Alice needs to schedule a working dinner with her
co-worker Carol. Alice uses a voice portal to check Carol's flight
schedule, find a restaurant near her hotel, make a reservation, get
directions there, and page Carol with this information.

6.1. Implementation of these features

Example Features:

Call Hold                   [service-examples]
Call Waiting                Local implementation
Blind Transfer              [cc-transfer]
Attended Transfer           [cc-transfer]
Consultative Transfer       [cc-transfer]
Conference Call             [conf-models]
Call Park                   [cc-framework]/[service-examples]
Call Pickup                 [cc-framework]/[service-examples]
Music on Hold               [cc-framework]/[service-examples]
Call Monitoring             [cc-framework]
Barge-in                    [cc-framework]/Insert or Far-Fork
Hotline                     Local implementation
Auto Answer                 [sip-answermode]
Speed dial                  Local implementation
Intercom                    [cc-framework]/[sip-answermode]
Speakerphone paging         [cc-framework]/Speed dial + Auto Answer
Call Return                 Proxy feature
Inbound Call Screening      Proxy or Local implementation
Outbound Call Screening     Proxy feature
Call Forwarding             Proxy or Local implementation
Message Waiting             [msg-waiting]
Do Not Disturb              [presence]
Distinctive ring            [cc-framework]/Proxy or Local implementation
Automatic Callback          two person presence-based conference
Find-Me                     Proxy service based on presence
Whispered call waiting      Local implementation
Voice Message Screening     [cc-framework]
Presence-based Conferencing call when presence = available
IM Conference Alerts        subscribe to conference status
Single Line Extension       [cc-framework]
Multiple Appearances        [cc-framework]
Click-to-dial               [service-examples]
Pre-Paid Calling            [cc-framework]
Voice Portal                [cc-framework]

6.1.1.
Call Park

Call park requires the ability to put a dialog some place, advertise
it to users in a pickup group, and uniquely identify it by a means
that can be communicated (including by human voice). The dialog can
be held locally on the UA parking the dialog or alternatively
transferred to the park service for the pickup group. The parked
dialog then needs to be labeled (e.g., orbit 12) in a way that can be
communicated to the party that is to pick up the call. The UAs in
the pickup group discover the parked dialog(s) via the dialog package
from the park service. If the dialog is parked locally, the park
service merely aggregates the parked call states from the set of UAs
in the pickup group.

6.1.2. Call Pickup

There are two different features that are called call pickup. The
first is the pickup of a parked dialog. The UA from which the dialog
is to be picked up subscribes to the session dialog state of the park
service or the UA that has locally parked the dialog. Dialogs that
are parked should be labeled with an identifier. The labels are used
by the UA to allow the user to indicate which dialog is to be picked
up. The UA picking up the call invokes the URL in the call state
that is labeled as replace-remote.

The other call pickup feature involves picking up an early dialog
(typically ringing). This feature uses some of the same primitives
as the pickup of a parked call. The call state of the ringing UA is
advertised using the dialog package. The UA that is to pick up the
early dialog subscribes either directly to the ringing UA or to a
service aggregating the states for UAs in the pickup group. The call
state identifies early dialogs. The UA uses the call state(s) to
help the user choose which early dialog is to be picked up. The UA
then invokes the URL in the call state labeled as replace-remote.
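
As a rough illustration of the pickup step, the following sketch (in
Python, and not part of this framework) builds the kind of URL that a
call state notification might carry for this purpose: a SIP URI with
an escaped Replaces header, in the style defined by the Replaces
specification [9]. The function names, addresses, and tag values are
hypothetical, and the exact form of the URL labeled replace-remote is
an assumption, since this document does not define it.

```python
from urllib.parse import quote


def replaces_value(call_id, to_tag, from_tag, early_only=False):
    """Format a Replaces header field value for the dialog to be taken over.

    Assumption: to_tag and from_tag are given from the point of view of
    the UA that will receive the INVITE (its local and remote tags).
    """
    value = f"{call_id};to-tag={to_tag};from-tag={from_tag}"
    if early_only:
        # Ask that the replacement happen only while the dialog is early.
        value += ";early-only"
    return value


def pickup_uri(target_uri, call_id, to_tag, from_tag, early_only=False):
    """Embed the Replaces value in target_uri as an escaped URI header."""
    value = replaces_value(call_id, to_tag, from_tag, early_only)
    # safe="" escapes '@', ';' and '=' so the value survives inside a URI.
    return f"{target_uri}?Replaces={quote(value, safe='')}"


if __name__ == "__main__":
    # Hypothetical dialog identifiers, as they might be learned from a
    # dialog state notification.
    print(pickup_uri("sips:bob@example.org",
                     "425928@bobster.example.org", "7743", "6472"))
```

A UA picking up a call would then send an INVITE to the resulting URI,
so that the Replaces header field is carried in that request. Whether
the target is the park service, the parking UA, or the original caller
depends on which dialog is being replaced.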

6.1.3. Music on Hold

Music on hold can be implemented in a number of ways. One way is to
transfer the held call to a holding service. When the UA wishes to
take the call off hold, it basically performs a take on the call from
the holding service. This involves subscribing to call state on the
holding service and then invoking the URL in the call state labeled
as replace-remote.

Alternatively, music on hold can be performed as a local mixing
operation. The UA holding the call can mix in the music from the
music service via RTP (i.e., an additional dialog) or RTSP or other
streaming media source. This approach is simpler from a protocol
perspective (i.e., the held dialog does not move, so there is less
chance of losing it); however, it does use more LAN bandwidth and
resources on the UA.

6.1.4. Call Monitoring

Call monitoring is a Join operation. The monitoring UA sends an
INVITE with a Join header identifying the dialog it wants to listen
to. It is able to discover the dialog via the dialog state on the
monitored UA. The monitoring UA sends SDP in the INVITE that
indicates receive-only media. As the UA is monitoring only, it does
not matter whether the UA indicates it wishes the send stream to be
mixed or point-to-point.

6.1.5. Barge-in

Barge-in works the same as call monitoring, except that the UA
barging in must indicate that its send media stream is to be mixed so
that all of the other parties can hear the stream from the UA barging
in.

6.1.6. Intercom

The UA initiates a dialog using INVITE and the Answer-Mode: Auto
header field as described in [28]. The called UA accepts the INVITE
with a 200 OK and automatically enables the speakerphone.

Alternatively, this can be a local decision for the UA to answer
based upon called party identification.

6.1.7.
Speakerphone paging

Speakerphone paging can be implemented using either multicast or
through a simple multipoint mixer. In the multicast solution, the
paging UA sends a multicast INVITE with send-only media in the SDP
(see also RFC 3264). The automatic answer and enabling of the
speakerphone is a locally configured decision on the paged UAs. The
paging UA sends RTP via the multicast address indicated in the SDP.

The multipoint solution is accomplished by sending an INVITE to the
multipoint mixer. The mixer is configured to automatically answer
the dialog. The paging UA then sends REFER requests for each of the
UAs that are to become paging speakers (the UA is likely to send out
a single REFER that is parallel forked by the proxy server). The UAs
performing as paging speakers are configured to automatically answer
based upon caller identification (e.g., To field, URI, or Referred-To
headers).

Finally, as a third option, the user agent can send a mass-invitation
request to a conference server, which would create a conference and
send INVITEs containing the Answer-Mode: Auto header field to all
user agents in the paging group.

6.1.8. Distinctive ring

The target UA either makes a local decision based on information in
an incoming INVITE (To, From, Contact, Request-URI) or trusts an
Alert-Info header provided by the caller or inserted by a trusted
proxy. In the latter case, the UA fetches the content described in
the URI (typically via HTTP) and renders it to the user.

6.1.9. Voice message screening

At first, this is the same as call monitoring. In this case the
voicemail service is one of the UAs. The UA screening the message
monitors the call on the voicemail service, and also subscribes to
dialog information.
If the user screening their messages decides to answer, they perform
a Take from the voicemail system (for example, send an INVITE with
Replaces to the UA leaving the message).

6.1.10. Single Line Extension/Multiple Line Appearance

Incoming calls ring all the extensions through basic parallel
forking. Each extension subscribes to dialog events from each other
extension. While one user has an active call, any other UA extension
can insert itself into that conversation (it already knows the dialog
information) in the same way as barge-in.

Standardization work to allow line appearance numbers to be
coordinated across a group of UAs is currently underway.

6.1.11. Click-to-dial

The application or server that hosts the click-to-dial application
captures the URL to be dialed and can set up the call using 3pcc or
can send a REFER request to the UA that is to dial the address. As
users sometimes change their mind or wish to give up listening to a
ringing or voicemail-answered phone, this application illustrates the
need to also have the ability to remotely hang up a call.

6.1.12. Pre-paid calling

For prepaid calling, the user's media always passes through a device
that is trusted by the pre-paid provider. This may be the other
endpoint (for example a PSTN gateway). In either case, an
intermediary proxy or B2BUA can periodically verify the amount of
time available on the pre-paid account, and use the session-timer
extension to cause the trusted endpoint (gateway) or intermediary
(media relay) to send a re-INVITE before that time runs out. During
the re-INVITE, the SIP intermediary can re-verify the account and
insert another session-timer header.

Note that while most pre-paid systems on the PSTN use an IVR to
collect the account number and destination, this isn't strictly
necessary for a SIP-originated prepaid call.
SIP requests and SIP URIs are sufficiently expressive to convey the
final destination, the provider of the prepaid service, the location
from which the user is calling, and the prepaid account they want to
use. If a pre-paid IVR is used, the mechanism described below (Voice
Portals) can be combined as well.

6.1.13. Voice Portal

A voice portal is essentially a complex collection of voice dialogs
used to access interesting content. One of the most desirable call
control features of a Voice Portal is the ability to start a new
outgoing call from within the context of the Portal (to make a
restaurant reservation, or return a voicemail message, for example).
Once the new call is over, the user should be able to return to the
Portal by pressing a special key, using some DTMF sequence (e.g., a
very long pound or hash tone), or by speaking a hotword (e.g., "Main
Menu").

In order to accomplish this, the Voice Portal starts with the
following media relationship:

{ User , Voice Portal }

The user then asks to make an outgoing call. The Voice Portal asks
the User to perform a Far-Fork. In other words, the Voice Portal
wants the following media relationship:

{ Target , User } & { User , Voice Portal }

The Voice Portal is now just listening for a hotword or the
appropriate DTMF. As soon as the user indicates they are done, the
Voice Portal Takes the call from the old Target, and we are back to
the original media relationship.

This feature can also be used by the account number and phone number
collection menu in a pre-paid calling service. A user can press a
DTMF sequence that presents them with the appropriate menu again.

7. Informative References

[1]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
     Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP:
     Session Initiation Protocol", RFC 3261, June 2002.

[2]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
     Levels", BCP 14, RFC 2119, March 1997.

[3]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with
     Session Description Protocol (SDP)", RFC 3264, June 2002.

[4]  Roach, A., "Session Initiation Protocol (SIP)-Specific Event
     Notification", RFC 3265, June 2002.

[5]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
     Description Protocol", RFC 4566, July 2006.

[6]  Johnston, A., "Session Initiation Protocol Service Examples",
     draft-ietf-sipping-service-examples-12 (work in progress),
     January 2007.

[7]  Rosenberg, J., Peterson, J., Schulzrinne, H., and G. Camarillo,
     "Best Current Practices for Third Party Call Control (3pcc) in
     the Session Initiation Protocol (SIP)", BCP 85, RFC 3725,
     April 2004.

[8]  Sparks, R., "The Session Initiation Protocol (SIP) Refer
     Method", RFC 3515, April 2003.

[9]  Mahy, R., Biggs, B., and R. Dean, "The Session Initiation
     Protocol (SIP) "Replaces" Header", RFC 3891, September 2004.

[10] Mahy, R. and D. Petrie, "The Session Initiation Protocol (SIP)
     "Join" Header", RFC 3911, October 2004.

[11] Rosenberg, J., Schulzrinne, H., and R. Mahy, "An INVITE-
     Initiated Dialog Event Package for the Session Initiation
     Protocol (SIP)", RFC 4235, November 2005.

[12] Rosenberg, J., Schulzrinne, H., and O. Levin, "A Session
     Initiation Protocol (SIP) Event Package for Conference State",
     RFC 4575, August 2006.

[13] Rosenberg, J., "A Session Initiation Protocol (SIP) Event
     Package for Registrations", RFC 3680, March 2004.

[14] Rosenberg, J., "A Presence Event Package for the Session
     Initiation Protocol (SIP)", RFC 3856, August 2004.

[15] Rosenberg, J., "A Framework for Conferencing with the Session
     Initiation Protocol (SIP)", RFC 4353, February 2006.

[16] Rosenberg, J., "A Framework for Application Interaction in the
     Session Initiation Protocol (SIP)",
     draft-ietf-sipping-app-interaction-framework-05 (work in
     progress), July 2005.

[17] Camarillo, G., "Framework for Transcoding with the Session
     Initiation Protocol (SIP)",
     draft-ietf-sipping-transc-framework-05 (work in progress),
     December 2006.

[18] Sparks, R., "Session Initiation Protocol Call Control -
     Transfer", draft-ietf-sipping-cc-transfer-07 (work in
     progress), October 2006.

[19] Johnston, A. and O. Levin, "Session Initiation Protocol (SIP)
     Call Control - Conferencing for User Agents", BCP 119,
     RFC 4579, August 2006.

[20] Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Indicating
     User Agent Capabilities in the Session Initiation Protocol
     (SIP)", RFC 3840, August 2004.

[21] Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Caller
     Preferences for the Session Initiation Protocol (SIP)",
     RFC 3841, August 2004.

[22] Campbell, B. and R. Sparks, "Control of Service Context using
     SIP Request-URI", RFC 3087, April 2001.

[23] Jennings, C. and R. Mahy, "Remote Call Control in the Session
     Initiation Protocol (SIP) using the REFER method and the
     session-oriented dialog package", draft-mahy-sip-remote-cc-04
     (work in progress), October 2006.

[24] Burger, E., Van Dyke, J., and A. Spitzer, "Basic Network Media
     Services with SIP", RFC 4240, December 2005.

[25] Jennings, C., Audet, F., and J. Elwell, "Session Initiation
     Protocol (SIP) URIs for Applications such as Voicemail and
     Interactive Voice Response (IVR)", RFC 4458, April 2006.

[26] Rosenberg, J., "Request Authorization through Dialog
     Identification in the Session Initiation Protocol (SIP)",
     RFC 4538, June 2006.

[27] Lennox, J., Wu, X., and H.
     Schulzrinne, "Call Processing Language (CPL): A Language for
     User Control of Internet Telephony Services", RFC 3880,
     October 2004.

[28] Willis, D. and A. Allen, "Requesting Answering Modes for the
     Session Initiation Protocol (SIP)",
     draft-ietf-sip-answermode-01 (work in progress), May 2006.

Authors' Addresses

Rohan Mahy
Plantronics
345 Encinal Street
Santa Cruz, CA
USA

Email: rohan@ekabal.com

Ben Campbell
Estacado Systems

Email: ben@nostrum.com

Robert Sparks
Estacado Systems

Email: rjsparks@nostrum.com

Jonathan Rosenberg
Cisco Systems

Email: jdrosen@cisco.com

Dan Petrie
SIP EZ

Email: dpetrie@sipez.com

Alan Johnston (editor)
Avaya

Email: alan@sipstation.com

Full Copyright Statement

Copyright (C) The IETF Trust (2007).

This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.

This document and the information contained herein are provided on an
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.