idnits 2.17.1 draft-ietf-sipping-app-interaction-framework-04.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1.a on line 16. -- Found old boilerplate from RFC 3978, Section 5.5 on line 1744. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1721. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1728. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1734. ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure Acknowledgement. ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate instead of verbatim RFC 3978 boilerplate. After 6 May 2005, submission of drafts without verbatim RFC 3978 boilerplate is not accepted. The following non-3978 patterns matched text found in the document. That text should be removed or replaced: This document is an Internet-Draft and is subject to all provisions of Section 3 of RFC 3667. By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79. 
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 6 instances of too long lines in the document, the longest one being 4 characters in excess of 72. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 795: '...ription). As such, user agents SHOULD...' RFC 2119 keyword, line 850: '... the application MAY push presentation...' RFC 2119 keyword, line 860: '... the application MAY push presentation...' RFC 2119 keyword, line 880: '... An application MUST NOT attempt to p...' RFC 2119 keyword, line 883: '...t an application MUST NOT push a user ...' (49 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (February 16, 2005) is 7007 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) ** Obsolete normative reference: RFC 3265 (ref. '3') (Obsoleted by RFC 6665) -- Possible downref: Non-RFC (?) normative reference: ref. '4' == Outdated reference: A later version (-08) exists of draft-ietf-sipping-kpml-07 == Outdated reference: A later version (-15) exists of draft-ietf-sip-gruu-02 == Outdated reference: A later version (-06) exists of draft-ietf-sip-identity-03 == Outdated reference: A later version (-05) exists of draft-ietf-sipping-conferencing-framework-03 == Outdated reference: A later version (-06) exists of draft-ietf-sipping-dialog-package-05 -- Obsolete informational reference (is this intentional?): RFC 2833 (ref. '17') (Obsoleted by RFC 4733, RFC 4734) Summary: 8 errors (**), 0 flaws (~~), 7 warnings (==), 9 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 SIPPING J. Rosenberg 2 Internet-Draft Cisco Systems 3 Expires: August 17, 2005 February 16, 2005 5 A Framework for Application Interaction in the Session Initiation 6 Protocol (SIP) 7 draft-ietf-sipping-app-interaction-framework-04 9 Status of this Memo 11 This document is an Internet-Draft and is subject to all provisions 12 of section 3 of RFC 3667. By submitting this Internet-Draft, each 13 author represents that any applicable patent or other IPR claims of 14 which he or she is aware have been or will be disclosed, and any of 15 which he or she become aware will be disclosed, in accordance with 16 RFC 3668. 18 Internet-Drafts are working documents of the Internet Engineering 19 Task Force (IETF), its areas, and its working groups. 
Note that 20 other groups may also distribute working documents as 21 Internet-Drafts. 23 Internet-Drafts are draft documents valid for a maximum of six months 24 and may be updated, replaced, or obsoleted by other documents at any 25 time. It is inappropriate to use Internet-Drafts as reference 26 material or to cite them other than as "work in progress." 28 The list of current Internet-Drafts can be accessed at 29 http://www.ietf.org/ietf/1id-abstracts.txt. 31 The list of Internet-Draft Shadow Directories can be accessed at 32 http://www.ietf.org/shadow.html. 34 This Internet-Draft will expire on August 17, 2005. 36 Copyright Notice 38 Copyright (C) The Internet Society (2005). 40 Abstract 42 This document describes a framework for the interaction between users 43 and Session Initiation Protocol (SIP) based applications, and defines 44 a new Refer-To header field parameter and option tag in support of 45 that framework. By interacting with applications, users can guide 46 the way in which they operate. The focus of this framework is 47 stimulus signaling, which allows a user agent to interact with an 48 application without knowledge of the semantics of that application. 50 Stimulus signaling can occur to a user interface running locally with 51 the client, or to a remote user interface, through media streams. 52 Stimulus signaling encompasses a wide range of mechanisms, ranging 53 from clicking on hyperlinks, to pressing buttons, to traditional Dual 54 Tone Multi Frequency (DTMF) input. In all cases, stimulus signaling 55 is supported through the use of markup languages, which play a key 56 role in this framework. 58 Table of Contents 60 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 61 2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 4 62 3. A Model for Application Interaction . . . . . . . . . . . . . 7 63 3.1 Functional vs. Stimulus . . . . . . . . . . . . . . . . . 9 64 3.2 Real-Time vs. Non-Real Time . . . . . . . . . . . 
. . . . 9 65 3.3 Client-Local vs. Client-Remote . . . . . . . . . . . . . . 10 66 3.4 Presentation Capable vs. Presentation Free . . . . . . . . 11 67 4. Interaction Scenarios on Telephones . . . . . . . . . . . . . 11 68 4.1 Client Remote . . . . . . . . . . . . . . . . . . . . . . 12 69 4.2 Client Local . . . . . . . . . . . . . . . . . . . . . . . 12 70 4.3 Flip-Flop . . . . . . . . . . . . . . . . . . . . . . . . 12 71 5. Framework Overview . . . . . . . . . . . . . . . . . . . . . . 13 72 6. Deployment Topologies . . . . . . . . . . . . . . . . . . . . 15 73 6.1 Third Party Application . . . . . . . . . . . . . . . . . 16 74 6.2 Co-Resident Application . . . . . . . . . . . . . . . . . 16 75 6.3 Third Party Application and User Device Proxy . . . . . . 17 76 6.4 Proxy Application . . . . . . . . . . . . . . . . . . . . 18 77 7. Application Behavior . . . . . . . . . . . . . . . . . . . . . 19 78 7.1 Client Local Interfaces . . . . . . . . . . . . . . . . . 19 79 7.1.1 Discovering Capabilities . . . . . . . . . . . . . . . 19 80 7.1.2 Pushing an Initial Interface Component . . . . . . . . 20 81 7.1.3 Updating an Interface Component . . . . . . . . . . . 22 82 7.1.4 Terminating an Interface Component . . . . . . . . . . 22 83 7.2 Client Remote Interfaces . . . . . . . . . . . . . . . . . 23 84 7.2.1 Originating and Terminating Applications . . . . . . . 23 85 7.2.2 Intermediary Applications . . . . . . . . . . . . . . 24 86 8. User Agent Behavior . . . . . . . . . . . . . . . . . . . . . 24 87 8.1 Advertising Capabilities . . . . . . . . . . . . . . . . . 24 88 8.2 Receiving User Interface Components . . . . . . . . . . . 25 89 8.3 Mapping User Input to User Interface Components . . . . . 26 90 8.4 Receiving Updates to User Interface Components . . . . . . 27 91 8.5 Terminating a User Interface Component . . . . . . . . . . 27 92 9. Inter-Application Feature Interaction . . . . . . . . . . . . 28 93 9.1 Client Local UI . . . . . . . . . . . . . . . . . . . . . 
28 94 9.2 Client-Remote UI . . . . . . . . . . . . . . . . . . . . . 29 95 10. Intra Application Feature Interaction . . . . . . . . . . . 30 96 11. Example Call Flow . . . . . . . . . . . . . . . . . . . . . 30 97 12. Security Considerations . . . . . . . . . . . . . . . . . . 35 98 13. IANA Considerations . . . . . . . . . . . . . . . . . . . . 36 99 13.1 SIP Option Tag . . . . . . . . . . . . . . . . . . . . . . 36 100 13.2 Header Field Parameter . . . . . . . . . . . . . . . . . . 36 101 14. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 36 102 15. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 36 103 16. References . . . . . . . . . . . . . . . . . . . . . . . . . 37 104 16.1 Normative References . . . . . . . . . . . . . . . . . . . . 37 105 16.2 Informative References . . . . . . . . . . . . . . . . . . . 37 106 Author's Address . . . . . . . . . . . . . . . . . . . . . . . 38 107 Intellectual Property and Copyright Statements . . . . . . . . 39 109 1. Introduction 111 The Session Initiation Protocol (SIP) [1] provides the ability for 112 users to initiate, manage, and terminate communications sessions. 113 Frequently, these sessions will involve a SIP application. A SIP 114 application is defined as a program running on a SIP-based element 115 (such as a proxy or user agent) that provides some value-added 116 function to a user or system administrator. Examples of SIP 117 applications include pre-paid calling card calls, conferencing, and 118 presence-based [11] call routing. 120 In order for most applications to properly function, they need input 121 from the user to guide their operation. As an example, a pre-paid 122 calling card application requires the user to input their calling 123 card number, their PIN code, and the destination number they wish to 124 reach. The process by which a user provides input to an application 125 is called "application interaction". 
127 Application interaction can be either functional or stimulus. 128 Functional interaction requires the user device to understand the 129 semantics of the application, whereas stimulus interaction does not. 130 Stimulus signaling allows for applications to be built without 131 requiring modifications to the user device. Stimulus interaction is 132 the subject of this framework. The framework provides a model for 133 how users interact with applications through user interfaces, and how 134 user interfaces and applications can be distributed throughout a 135 network. This model is then used to describe how applications can 136 instantiate and manage user interfaces. 138 This document also defines a new SIP Refer-To header field parameter 139 and a new SIP option tag indicating support for that parameter. 141 2. Definitions 143 SIP Application: A SIP application is defined as a program running on 144 a SIP-based element (such as a proxy or user agent) that provides 145 some value-added function to a user or system administrator. 146 Examples of SIP applications include pre-paid calling card calls, 147 conferencing, and presence-based [11] call routing. 149 Application Interaction: The process by which a user provides input 150 to an application. 152 Real-Time Application Interaction: Application interaction that takes 153 place while an application instance is executing. For example, 154 when a user enters their PIN number into a pre-paid calling card 155 application, this is real-time application interaction. 157 Non-Real Time Application Interaction: Application interaction that 158 takes place asynchronously with the execution of the application. 159 Generally, non-real time application interaction is accomplished 160 through provisioning. 162 Functional Application Interaction: Application interaction is 163 functional when the user device has an understanding of the 164 semantics of the interaction with the application. 
166 Stimulus Application Interaction: Application interaction is 167 considered to be stimulus when the user device has no 168 understanding of the semantics of the interaction with the 169 application. 171 User Interface (UI): The user interface provides the user with 172 context in order to make decisions about what they want. The user 173 interacts with the device, which conveys the user input to the 174 user interface. The user interface interprets the information, 175 and passes it to the application. 177 User Interface Component: A piece of user interface which operates 178 independently of other pieces of the user interface. For example, 179 a user might have two separate web interfaces to a pre-paid 180 calling card application - one for hanging up and making another 181 call, and another for entering the username and PIN. 183 User Device: The software or hardware system that the user directly 184 interacts with in order to communicate with the application. An 185 example of a user device is a telephone. Another example is a PC 186 with a web browser. 188 User Device Proxy: A software or hardware system that a user 189 indirectly interacts through in order to communicate with the 190 application. This indirection can be through a network. An 191 example is a gateway from IP to the Public Switched Telephone 192 Network (PSTN). It acts as a user device proxy, acting on behalf of 193 the user on the circuit network. 195 User Input: The "raw" information passed from a user to a user 196 interface. Examples of user input include a spoken word or a 197 click on a hyperlink. 199 Client-Local User Interface: A user interface which is co-resident 200 with the user device. 202 Client-Remote User Interface: A user interface which executes 203 remotely from the user device. In this case, a standardized 204 interface is needed between the user device and the user 205 interface. Typically, this is done through media sessions - 206 audio, video, or application sharing. 
208 Markup Language: A markup language describes a logical flow of 209 presentation of information to the user, collection of information 210 from the user, and transmission of that information to an 211 application. 213 Media Interaction: A means of separating a user and a user interface 214 by connecting them with media streams. 216 Interactive Voice Response (IVR): An IVR is a type of user interface 217 that allows users to speak commands to the application, and hear 218 responses to those commands prompting for more information. 220 Prompt-and-Collect: The basic primitive of an IVR user interface. 221 The user is presented with a voice option, and the user speaks 222 their choice. 224 Barge-In: The act of entering information into an IVR user interface 225 prior to the completion of a prompt requesting that information. 227 Focus: A user interface component has focus when user input is 228 fed to it, as opposed to any other user interface 229 components. This is not to be confused with the term focus within 230 the SIP conferencing framework, which refers to the center user 231 agent in a conference [13]. 233 Focus Determination: The process by which the user device determines 234 which user interface component will receive the user input. 236 Focusless Device: A user device which has no ability to perform focus 237 determination. An example of a focusless device is a telephone 238 with a keypad. 240 Presentation Capable UI: A user interface which can prompt the user 241 with input, collect results, and then prompt the user with new 242 information based on those results. 244 Presentation Free UI: A user interface which cannot prompt the user 245 with information. 247 Feature Interaction: A class of problems which result when multiple 248 applications or application components are trying to provide 249 services to a user at the same time. 251 Inter-Application Feature Interaction: Feature interactions that 252 occur between applications. 
254 DTMF: Dual-Tone Multi-Frequency. DTMF refers to a class of tones 255 generated by circuit switched telephony devices when the user 256 presses a key on the keypad. As a result, DTMF and keypad input 257 are often used synonymously, when in fact one of them (DTMF) is 258 merely a means of conveying the other (the keypad input) to a 259 client-remote user interface (the switch, for example). 261 Application Instance: A single execution path of a SIP application. 263 Originating Application: A SIP application which acts as a UAC, 264 making a call on behalf of the user. 266 Terminating Application: A SIP application which acts as a UAS, 267 answering a call generated by a user. IVR applications are 268 terminating applications. 270 Intermediary Application: A SIP application which is neither the 271 caller nor callee, but rather, a third party involved in a call. 273 3. A Model for Application Interaction 275 +---+ +---+ +---+ +---+ 276 | | | | | | | | 277 | | | U | | U | | A | 278 | | Input | s | Input | s | Results | p | 279 | | ---------> | e | ---------> | e | ----------> | p | 280 | U | | r | | r | | l | 281 | s | | | | | | i | 282 | e | | D | | I | | c | 283 | r | Output | e | Output | f | Update | a | 284 | | <--------- | v | <--------- | a | <.......... | t | 285 | | | i | | c | | i | 286 | | | c | | e | | o | 287 | | | e | | | | n | 288 | | | | | | | | 289 +---+ +---+ +---+ +---+ 291 Figure 1: Model for Real-Time Interactions 293 Figure 1 presents a general model for how users interact with 294 applications. Generally, users interact with a user interface 295 through a user device. A user device can be a telephone, or it can 296 be a PC with a web browser. Its role is to pass the user input from 297 the user, to the user interface. The user interface provides the 298 user with context in order to make decisions about what they want. 299 The user interacts with the device, causing information to be passed 300 from the device to the user interface. 
The user interface interprets 301 the information, and passes it as a user interface event to the 302 application. The application may be able to modify the user 303 interface based on this event. Whether or not this is possible 304 depends on the type of user interface. 306 User interfaces are fundamentally about rendering and interpretation. 307 Rendering refers to the way in which the user is provided context. 309 This can be through hyperlinks, images, sounds, videos, text, and so 310 on. Interpretation refers to the way in which the user interface 311 takes the "raw" data provided by the user, and returns the result to 312 the application as a meaningful event, abstracted from the 313 particulars of the user interface. As an example, consider a 314 pre-paid calling card application. The user interface worries about 315 details such as what prompt the user is provided, whether the voice 316 is male or female, and so on. It is concerned with recognizing the 317 speech that the user provides, in order to obtain the desired 318 information. In this case, the desired information is the calling 319 card number, the PIN code, and the destination number. The 320 application needs that data, and it doesn't matter to the application 321 whether it was collected using a male prompt or a female one. 323 User interfaces generally have real-time requirements towards the 324 user. That is, when a user interacts with the user interface, the 325 user interface needs to react quickly, and that change needs to be 326 propagated to the user right away. However, the interface between 327 the user interface and the application need not be that fast. Faster 328 is better, but the user interface itself can frequently compensate 329 for long latencies there. In the case of a pre-paid calling card 330 application, when the user is prompted to enter their PIN, the prompt 331 should generally stop immediately once the first digit of the PIN is 332 entered. 
This is referred to as barge-in. After the user-interface 333 collects the rest of the PIN, it can tell the user to "please wait 334 while processing". The PIN can then be gradually transmitted to the 335 application. In this example, the user interface has compensated for 336 a slow UI to application interface by asking the user to wait. 338 The separation between user interface and application is absolutely 339 fundamental to the entire framework provided in this document. Its 340 importance cannot be overstated. 342 With this basic model, we can begin to taxonomize the types of 343 systems that can be built. 345 3.1 Functional vs. Stimulus 347 The first way to taxonomize the system is to consider the interface 348 between the UI and the application. There are two fundamentally 349 different models for this interface. In a functional interface, the 350 user interface has detailed knowledge about the application, and is, 351 in fact, specific to the application. The interface between the two 352 components is through a functional protocol, capable of representing 353 the semantics which can be exposed through the user interface. 354 Because the user interface has knowledge of the application, it can 355 be optimally designed for that application. As a result, functional 356 user interfaces are almost always the most user friendly, the fastest 357 and the most responsive. However, in order to allow interoperability 358 between user devices and applications, the details of the functional 359 protocols need to be specified in standards. This slows down 360 innovation and limits the scope of applications that can be built. 362 An alternative is a stimulus interface. In a stimulus interface, the 363 user interface is generic; totally ignorant of the details of the 364 application. Indeed, the application may pass instructions to the 365 user interface describing how it should operate. 
The user interface 366 translates user input into "stimulus" - which are data understood 367 only by the application, and not by the user interface. Because they 368 are generic, and because they require communications with the 369 application in order to change the way in which they render 370 information to the user, stimulus user interfaces are usually slower, 371 less user friendly, and less responsive than a functional 372 counterpart. However, they allow for substantial innovation in 373 applications, since no standardization activity is needed to build a 374 new application, as long as it can interact with the user within the 375 confines of the user interface mechanism. The web is an example of a 376 stimulus user interface to applications. 378 In SIP systems, functional interfaces are provided by extending the 379 SIP protocol to provide the needed functionality. For example, the 380 SIP caller preferences specification [14] provides a functional 381 interface that allows a user to request applications to route the 382 call to specific types of user agents. Functional interfaces are 383 important, but are not the subject of this framework. The primary 384 goal of this framework is to address the role of stimulus interfaces 385 to SIP applications. 387 3.2 Real-Time vs. Non-Real Time 389 Application interaction systems can also be real-time or 390 non-real-time. Non-real-time interaction allows the user to enter 391 information about application operation asynchronously with its 392 invocation. Frequently, this is done through provisioning systems. 394 As an example, a user can set up the forwarding number for a 395 call-forward on no-answer application using a web page. Real-time 396 interaction requires the user to interact with the application at the 397 time of its invocation. 399 3.3 Client-Local vs. 
Client-Remote 401 Another axis in the taxonomization is whether the user interface is 402 co-resident with the user device (which we refer to as a client-local 403 user interface), or the user interface runs in a host separated from 404 the client (which we refer to as a client-remote user interface). In 405 a client-remote user interface, there exists some kind of protocol 406 between the client device and the UI that allows the client to 407 interact with the user interface over a network. 409 The most important way to separate the UI and the client device is 410 through media interaction. In media interaction, the interface 411 between the user and the user interface is through media - audio, 412 video, messaging, and so on. This is the classic mode of operation 413 for VoiceXML [4], where the user interface (also referred to as the 414 voice browser) runs on a platform in the network. Users communicate 415 with the voice browser through the telephone network (or using a SIP 416 session). The voice browser interacts with the application using 417 HTTP to convey the information collected from the user. 419 In the case of a client-local user interface, the user interface runs 420 co-located with the user device. The interface between them is 421 through the software that interprets the user's input and passes it 422 to the user interface. The classic example of this is the web. In 423 the web, the user interface is a web browser, and the interface is 424 defined by the HTML document that it's rendering. The user interacts 425 directly with the user interface running in the browser. The results 426 of that user interface are sent to the application (running on the 427 web server) using HTTP. 429 It is important to note that whether or not the user interface is 430 local or remote (in the case of media interaction) is not a property 431 of the modality of the interface, but rather a property of the 432 system. 
As an example, it is possible for a web-based user interface 433 to be provided with a client-remote user interface. In such a 434 scenario, video and application sharing media sessions can be used 435 between the user and the user interface. The user interface, still 436 guided by HTML, now runs "in the network", remote from the client. 437 Similarly, a VoiceXML document can be interpreted locally by a client 438 device, with no media streams at all. Indeed, the VoiceXML document 439 can be rendered using text, rather than media, with no impact on the 440 interface between the user interface and the application. 442 It is also important to note that systems can be hybrid. In a hybrid 443 user interface, some aspects of it (usually those associated with a 444 particular modality) run locally, and others run remotely. 446 3.4 Presentation Capable vs. Presentation Free 448 A user interface can be capable of presenting information to the user 449 (a presentation capable UI), or it can be capable only of collecting 450 user input (a presentation free UI). These are very different types 451 of user interfaces. A presentation capable UI can provide the user 452 with feedback after every input, providing the context for collecting 453 the next input. As a result, presentation capable user interfaces 454 require an update to the information provided to the user after each 455 input. The web is a classic example of this. After every input 456 (i.e., a click), the browser provides the input to the application 457 and fetches the next page to render. In a presentation free user 458 interface, this is not the case. Since the user is not provided with 459 feedback, these user interfaces tend to merely collect information as 460 it is entered, and pass it to the application. 462 Another difference is that a presentation-free user interface cannot 463 support the concept of a focus. 
As a result, if multiple 464 applications wish to gather input from the user, there is no way for 465 the user to select which application the input is destined for. The 466 input provided to applications through presentation-free user 467 interfaces is more of a broadcast or notification operation, as a 468 result. 470 4. Interaction Scenarios on Telephones 472 In this section, we apply the model of Section 3 to telephones. 474 In a traditional telephone, the user interface consists of a 12-key 475 keypad, a speaker, and a microphone. Indeed, from here forward, the 476 term "telephone" is used to represent any device that meets, at a 477 minimum, the characteristics described in the previous sentence. 478 Circuit-switched telephony applications are almost universally 479 client-remote user interfaces. In the Public Switched Telephone 480 Network (PSTN), there is usually a circuit interface between the user 481 and the user interface. The user input from the keypad is conveyed 482 using Dual-Tone Multi-Frequency (DTMF), and the microphone input as 483 Pulse Code Modulated (PCM) encoded voice. 485 In an IP-based system, there is more variability in how the system 486 can be instantiated. Both client-remote and client-local user 487 interfaces to a telephone can be provided. 489 In this framework, a PSTN gateway can be considered a User Device 490 Proxy. It is a proxy for the user because it can provide, to a user 491 interface on an IP network, input taken from a user on a circuit 492 switched telephone. The gateway may be able to run a client-local 493 user interface, just as an IP telephone might. 495 4.1 Client Remote 497 The most obvious instantiation is the "classic" circuit-switched 498 telephony model. In that model, the user interface runs remotely 499 from the client. The interface between the user and the user 500 interface is through media, set up by SIP and carried over the Real 501 Time Transport Protocol (RTP) [16]. 
The microphone input can be 502 carried using any suitable voice encoding algorithm. The keypad 503 input can be conveyed in one of two ways. The first is to convert 504 the keypad input to DTMF, and then convey that DTMF using a suitable 505 encoding algorithm for it (such as PCMU). An alternative, and 506 generally the preferred approach, is to transmit the keypad input 507 using RFC 2833 [17], which provides an encoding mechanism for 508 carrying keypad input within RTP. 510 In this classic model, the user interface would run on a server in 511 the IP network. It would perform speech recognition and DTMF 512 recognition to derive the user intent, feed it through the user 513 interface, and provide the result to an application. 515 4.2 Client Local 517 An alternative model is for the entire user interface to reside on 518 the telephone. The user interface can be a VoiceXML browser, running 519 speech recognition on the microphone input, and feeding the keypad 520 input directly into the script. As discussed above, the VoiceXML 521 script could be rendered using text instead of voice, if the 522 telephone had a textual display. 524 For simpler phones without a display, the user interface can be 525 described by a Keypad Markup Language request document [7]. As the 526 user enters digits in the keypad, they are passed to the user 527 interface, which generates user interface events that can be 528 transported to the application. 530 4.3 Flip-Flop 532 A middle-ground approach is to flip back and forth between a 533 client-local and client-remote user interface. Many voice 534 applications are of the type which listen to the media stream and 535 wait for some specific trigger that kicks off a more complex user 536 interaction. The long pound in a pre-paid calling card application 537 is one example. Another example is a conference recording 538 application, where the user can press a key at some point in the call 539 to begin recording. 
      When the key is pressed, the user hears a
 540  whisper to inform them that recording has started.

 542  The ideal way to support such an application is to install a
 543  client-local user interface component that waits for the trigger to
 544  kick off the real interaction. Once the trigger is received, the
 545  application connects the user to a client-remote user interface that
 546  can play announcements, collect more information, and so on.

 548  The benefit of flip-flopping between a client-local and client-remote
 549  user interface is cost. The client-local user interface will
 550  eliminate the need to send media streams into the network just to
 551  wait for the user to press the pound key on the keypad.

 553  The Keypad Markup Language (KPML) was designed to support exactly
 554  this kind of need [7]. It models the keypad on a phone, and allows
 555  an application to be informed when any sequence of keys has been
 556  pressed. However, KPML has no presentation component. Since user
 557  interfaces generally require a response to user input, the
 558  presentation will need to be done using a client-remote user
 559  interface that gets instantiated as a result of the trigger.

 561  It is tempting to use a hybrid model, where a prompt-and-collect
 562  application is implemented by using a client-remote user interface
 563  that plays the prompts, and a client-local user interface, described
 564  by KPML, that collects digits. However, this only complicates the
 565  application. Firstly, the keypad input will be sent to both the
 566  media stream and the KPML user interface. This requires the
 567  application to sort out which user inputs are duplicates, a process
 568  that is very complicated. Secondly, the primary benefit of KPML is
 569  to avoid having a media stream towards a user interface. However,
 570  there is already a media stream for the prompting, so there are no
 571  real savings.

 573  5.
      Framework Overview

 575  In this framework, we use the term "SIP application" to refer to a
 576  broad set of functionality. A SIP application is a program running
 577  on a SIP-based element (such as a proxy or user agent) that provides
 578  some value-added function to a user or system administrator. SIP
 579  applications can execute on behalf of a caller, a called party, or a
 580  multitude of users at once.

 582  Each application has a number of instances that are executing at any
 583  given time. An instance represents a single execution path for an
 584  application. Each instance has a well defined lifecycle. It is
 585  established as a result of some event. That event can be a SIP
 586  event, such as the reception of a SIP INVITE request, or it can be a
 587  non-SIP event, such as a web form post or even a timer. Application
 588  instances also have a specific end time. Some instances have a
 589  lifetime that is coupled with a SIP transaction or dialog. For
 590  example, a proxy application might begin when an INVITE arrives, and
 591  terminate when the call is answered. Other applications have a
 592  lifetime that spans multiple dialogs or transactions. For example, a
 593  conferencing application instance may exist so long as there are any
 594  dialogs connected to it. When the last dialog terminates, the
 595  application instance terminates. Other applications have a lifetime
 596  that is completely decoupled from SIP events.

 598  It is fundamental to the framework described here that multiple
 599  application instances may interact with a user during a single SIP
 600  transaction or dialog. Each instance may be for the same
 601  application, or different applications. Each of the applications may
 602  be completely independent, in that they may be owned by different
 603  providers, and may not be aware of each other's existence.
      Similarly,
 604  there may be application instances interacting with the caller, and
 605  instances interacting with the callee, both within the same
 606  transaction or dialog.

 608  The first step in the interaction with the user is to instantiate one
 609  or more user interface components for the application instance. A
 610  user interface component is a single piece of the user interface that
 611  is defined by a logical flow that is not synchronously coupled with
 612  any other component. In other words, each component runs more or
 613  less independently.

 615  A user interface component can be instantiated in one of the user
 616  agents in a dialog (for a client-local user interface), or within a
 617  network element (for a client-remote user interface). If a
 618  client-local user interface is to be used, the application needs to
 619  determine whether or not the user agent is capable of supporting a
 620  client-local user interface, and in what format. In this framework,
 621  all client-local user interface components are described by a markup
 622  language. A markup language describes a logical flow of presentation
 623  of information to the user, collection of information from the user,
 624  and transmission of that information to an application. Examples of
 625  markup languages include HTML, WML, VoiceXML, and the Keypad Markup
 626  Language (KPML) [7].

 628  Unlike an application instance, which has a very flexible lifetime, a
 629  user interface component has a very fixed lifetime. A user interface
 630  component is always associated with a dialog. The user interface
 631  component can be created at any point after the dialog (or early
 632  dialog) is created. However, the user interface component terminates
 633  when the dialog terminates. The user interface component can be
 634  terminated earlier by the user agent, and possibly by the
 635  application, but its lifetime never exceeds that of its associated
 636  dialog.
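
The lifetime rule in the paragraph above can be illustrated with a small sketch. It is not part of the framework: the class and method names (ComponentRegistry, dialog_created, and so on) and the use of a (Call-ID, local tag, remote tag) tuple as the dialog key are assumptions made only for illustration. The sketch shows a component being created only once a dialog exists, being removable early, and never outliving its dialog.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical dialog key: (Call-ID, local tag, remote tag).
DialogId = Tuple[str, str, str]


@dataclass(eq=False)  # compare components by identity, not by field values
class UIComponent:
    markup: str          # illustrative label only, e.g. "kpml" or "html"
    active: bool = True


@dataclass
class ComponentRegistry:
    """Tracks which user interface components belong to which dialog."""
    by_dialog: Dict[DialogId, List[UIComponent]] = field(default_factory=dict)

    def dialog_created(self, dialog: DialogId) -> None:
        # Components may be created any time after the (early) dialog exists.
        self.by_dialog.setdefault(dialog, [])

    def add_component(self, dialog: DialogId, markup: str) -> UIComponent:
        if dialog not in self.by_dialog:
            raise KeyError("a component is always associated with a dialog")
        component = UIComponent(markup)
        self.by_dialog[dialog].append(component)
        return component

    def remove_component(self, dialog: DialogId, component: UIComponent) -> None:
        # Early termination by the user agent (or possibly the application).
        self.by_dialog[dialog].remove(component)
        component.active = False

    def dialog_terminated(self, dialog: DialogId) -> None:
        # No component outlives its associated dialog.
        for component in self.by_dialog.pop(dialog, []):
            component.active = False
```

Keying the registry by dialog makes the "terminates when the dialog terminates" rule a single operation, which is the property the paragraph stresses.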
 638  There are two ways to create a client local interface component. For
 639  interface components that are presentation capable, the application
 640  sends a REFER [6] request to the user agent. The Refer-To header
 641  field contains an HTTP URI that points to the markup for the user
 642  interface. For interface components that are presentation free (such
 643  as those defined by KPML), the application sends a SUBSCRIBE request
 644  to the user agent. The body of the SUBSCRIBE request contains a
 645  filter, which, in this case, is the markup that defines when
 646  information is to be sent to the application in a NOTIFY.

 648  If a user interface component is to be instantiated in the network,
 649  there is no need to determine the capabilities of the device on which
 650  the user interface is instantiated. Presumably, it is on a device on
 651  which the application knows a UI can be created. However, the
 652  application does need to connect the user device to the user
 653  interface. This will require manipulation of media streams in order
 654  to establish that connection.

 656  The interface between the user interface component and the
 657  application depends on the type of user interface. For presentation
 658  capable user interfaces, such as those described by HTML and
 659  VoiceXML, HTTP form POST operations are used. For presentation free
 660  user interfaces, a SIP NOTIFY is used. The differing needs and
 661  capabilities of these two user interfaces, as described in Section
 662  3.4, are what drive the different choices for the interactions.
 663  Since presentation capable user interfaces require an update to the
 664  presentation every time user data is entered, they are a good match
 665  for HTTP. Since presentation free user interfaces merely transmit
 666  user input to the application, a NOTIFY is more appropriate.

 668  Indeed, for presentation free user interfaces, there are two
 669  different modalities of operation. The first is called "one shot".
 670  In the one-shot modality, the markup waits for a user to enter some
 671  information, and when they do, reports this event to the application.
 672  The application then does something, and the markup is no longer
 673  used. In the other modality, called "monitor", the markup stays
 674  permanently resident, and reports information back to an application
 675  until termination of the associated dialog.

 677  6. Deployment Topologies

 679  This section presents some of the network topologies in which this
 680  framework can be instantiated.

 682  6.1 Third Party Application

 684                      +-------------+
 685                  /---| Application |
 686                 /    +-------------+
 687                /
 688         SUB/  /   REFER/
 689         NOT  /    HTTP
 690             /
 691  +--------+    SIP (INVITE)    +-----+
 692  |   UI   A--------------------X     |
 693  |........|                    | SIP |
 694  |  User  |        RTP         | UA  |
 695  | Device B--------------------Y     |
 696  +--------+                    +-----+

 698  Figure 2: Third Party Topology

 700  In this topology, the application that is interested in interacting
 701  with the users exists outside of the SIP dialog between the user
 702  agents. In that case, the application learns about the initiation
 703  and termination of the dialog, along with the dialog identifiers,
 704  through some out of band means. One such possibility is the dialog
 705  event package [15]. Dialog information is only revealed to trusted
 706  parties, so the application would need to be trusted by one of the
 707  users in order to obtain this information.

 709  At any point during the dialog, the application can instantiate user
 710  interface components on the user device of the caller or callee. It
 711  can do this either using SUBSCRIBE or REFER, depending on the type of
 712  user interface (presentation capable or presentation free).
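
The choice between SUBSCRIBE and REFER, and between the two presentation free modalities, can be summarized in a short sketch. The enum and function names below are hypothetical and exist only for illustration; the mapping itself follows the text of Section 5.

```python
from enum import Enum


class UIKind(Enum):
    PRESENTATION_CAPABLE = "presentation capable"
    PRESENTATION_FREE = "presentation free"


class Mode(Enum):
    ONE_SHOT = "one shot"
    MONITOR = "monitor"


# (request that installs the component, mechanism that reports user input)
_MECHANISMS = {
    UIKind.PRESENTATION_CAPABLE: ("REFER", "HTTP POST"),
    UIKind.PRESENTATION_FREE: ("SUBSCRIBE", "NOTIFY"),
}


def install_method(kind: UIKind) -> str:
    """SIP request the application sends to create the component."""
    return _MECHANISMS[kind][0]


def report_mechanism(kind: UIKind) -> str:
    """How user input travels from the component back to the application."""
    return _MECHANISMS[kind][1]


def markup_in_use(mode: Mode, reports_sent: int, dialog_active: bool) -> bool:
    """Whether a presentation free markup is still being used."""
    if not dialog_active:
        return False             # nothing outlives the associated dialog
    if mode is Mode.ONE_SHOT:
        return reports_sent == 0  # retired after the first report
    return True                   # monitor: resident until the dialog ends
```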
 714  6.2 Co-Resident Application

 716  +--------+    SIP (INVITE)    +-----+
 717  |  User  A--------------------X SIP |
 718  | Device |        RTP         | UA  |
 719  |........B--------------------Y     |
 720  |        |      SUB/NOT       | App)|
 721  |   UI   A'-------------------X'    |
 722  +--------+     REFER/HTTP     +-----+

 724  Figure 3: Co-Resident Topology

 726  In this deployment topology, the application is co-resident with one
 727  of the user agents (the one on the right in the picture above). This
 728  application can install client-local user interface components on the
 729  other user agent, which is acting as the user device. These
 730  components can be installed using either SUBSCRIBE, for presentation
 731  free user interfaces, or REFER, for presentation capable ones. This
 732  situation typically arises when the application wishes to install UI
 733  components on a presentation capable user interface. If the only
 734  user input is via keypad input, the framework is not needed per se,
 735  because the UA/application will receive the input via RFC 2833 in the
 736  RTP stream.

 738  If the application resides in the called party, it is called a
 739  terminating application. If it resides in the calling party, it is
 740  called an originating application.

 742  This kind of topology is common in protocol converter and gateway
 743  applications.

 745  6.3 Third Party Application and User Device Proxy

 747                                           +-------------+
 748                                       /---| Application |
 749                                      /    +-------------+
 750                                     /
 751                              SUB/  /   REFER/
 752                              NOT  /    HTTP
 753                                  /
 754  +-----+        SIP         +---M----+        SIP         +-----+
 755  |     V--------------------C        A--------------------X     |
 756  | SIP |                    |   UI   |                    | SIP |
 757  | UAa |        RTP         |        |        RTP         | UAb |
 758  |     W--------------------D        B--------------------Y     |
 759  +-----+                    +--------+                    +-----+
 760   User                        User
 761   Device                      Device
 762                               Proxy

 764  Figure 4: User Device Proxy Topology

 766  In this deployment topology, there is a third party application as in
 767  Section 6.1.
      However, instead of installing a user interface
 768  component on the end user device, the component is installed in an
 769  intermediate device, known as a User Device Proxy. From the
 770  perspective of the actual user device (on the left), the User Device
 771  Proxy is a client remote user interface. As such, media, typically
 772  transported using RTP (including RFC 2833 for carrying user input),
 773  is sent from the user device to the client remote user interface on
 774  the User Device Proxy. As far as the application is concerned, it is
 775  installing what it thinks is a client local user interface on the
 776  user device, but it happens to be on a user device proxy which looks
 777  like the user device to the application.

 779  The user device proxy will need to terminate and re-originate both
 780  signaling (SIP) and media traffic towards the actual peer in the
 781  conversation. The User Device Proxy is a media relay in the
 782  terminology of RFC 3550 [16]. The User Device Proxy will need to
 783  monitor the media streams associated with each dialog, in order to
 784  convert user input received in the media stream to events reported to
 785  the user interface. This can pose a challenge in multi-media
 786  systems, where it may be unclear on which media stream the user input
 787  is being sent. As discussed in RFC 3264 [18], if a user agent has a
 788  single media source and is supporting multiple streams, it is
 789  supposed to send that source to all streams. In cases where there
 790  are multiple sources, the mapping is a matter of local policy. In
 791  the absence of a way to explicitly identify or request which sources
 792  map to which streams, the user device proxy will need to do the best
 793  job it can. This specification RECOMMENDS that the User Device Proxy
 794  monitor the first stream (defined in terms of ordering of media
 795  sessions within a session description).
      As such, user agents SHOULD
 796  send their user input on the first stream, absent a policy to direct
 797  it otherwise.

 799  6.4 Proxy Application

 800                        +----------+
 801           SUB/NOT      |   App    |      SUB/NOT
 802       +--------------->|          |<-----------------+
 803       |   REFER/HTTP   |..........|    REFER/HTTP    |
 804       |                |   SIP    |                  |
 805       |                |  Proxy   |                  |
 806       |                +----------+                  |
 807       V                   ^    |                     V
 808  +----------+             |    |                +----------+
 809  |    UI    |   INVITE    |    |     INVITE     |    UI    |
 810  |          |-------------+    +--------------->|          |
 811  |..........|                                   |..........|
 812  |   SIP    |...................................|   SIP    |
 813  |    UA    |                                   |    UA    |
 814  +----------+                RTP                +----------+
 815  User Device                                    User Device

 817  Figure 5: Proxy Application Topology

 819  In this topology, the application is co-resident with a transaction
 820  stateful, record-routing proxy server on the call path between two
 821  user devices. The application uses SUBSCRIBE or REFER to install
 822  user interface components on one or both user devices.

 824  This topology is common in routing applications, such as a
 825  web-assisted call routing application.

 827  7. Application Behavior

 829  The behavior of an application within this framework depends on
 830  whether it seeks to use a client-local or client-remote user
 831  interface.

 833  7.1 Client Local Interfaces

 835  One key component of this framework is support for client local user
 836  interfaces.

 838  7.1.1 Discovering Capabilities

 840  A client local user interface can only be instantiated on a user
 841  agent if the user agent supports that type of user interface
 842  component. Support for client local user interface components is
 843  declared by both the UAC and a UAS in its Accept, Allow, Contact and
 844  Allow-Events header fields of dialog-initiating requests and
 845  responses.
      If the Allow header field indicates support for the SIP
 846  SUBSCRIBE method, and the Allow-Events header field indicates support
 847  for the kpml package [7], and the Supported header field indicates
 848  that its Contact URI is a GRUU [8], it means that the UA can
 849  instantiate presentation free user interface components. In this
 850  case, the application MAY push presentation free user interface
 851  components according to the rules of Section 7.1.2. The specific
 852  markup languages that can be supported are indicated in the Accept
 853  header field.

 855  If the Allow header field indicates support for the SIP REFER method,
 856  the Supported header field indicates support for the "refer-context"
 857  extension described below, and the Contact header field contains UA
 858  capabilities [5] that indicate support for the HTTP URI scheme, it
 859  means that the UA supports presentation capable user interface
 860  components. In this case, the application MAY push presentation
 861  capable user interface components to the client according to the
 862  rules of Section 7.1.2. The specific markups that are supported are
 863  indicated in the Accept header field.

 865  A third party application that is not present on the call path will
 866  not be privy to these headers in the dialog requests that pass by.
 867  As such, it will need to obtain this capability information in other
 868  ways. One way is through the registration event package [19], which
 869  can contain user agent capability information provided in REGISTER
 870  requests [5].

 872  7.1.2 Pushing an Initial Interface Component

 874  Generally, we anticipate that interface components will need to be
 875  created at various different points in a SIP session. Clearly, they
 876  will need to be pushed during session setup, or after the session is
 877  established. A user interface component is always associated with a
 878  specific dialog, however.
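
The capability checks of Section 7.1.1 can be sketched as follows. This is a simplified illustration, not a SIP parser: it assumes the header field values have already been extracted into a plain dictionary keyed by header field name, that list-valued fields are comma separated, and that the Contact capability appears as a quoted "schemes" parameter. The function names are hypothetical.

```python
import re
from typing import Dict, List, Set


def _tokens(value: str) -> Set[str]:
    """Split a comma-separated header value into lower-cased tokens."""
    return {item.strip().lower() for item in value.split(",") if item.strip()}


def _contact_schemes(contact: str) -> Set[str]:
    """Extract the quoted 'schemes' parameter from a Contact value, if any."""
    match = re.search(r';\s*schemes\s*=\s*"([^"]*)"', contact, re.IGNORECASE)
    return _tokens(match.group(1)) if match else set()


def supports_presentation_free(headers: Dict[str, str]) -> bool:
    # SUBSCRIBE in Allow, kpml in Allow-Events, and a GRUU Contact
    # (signalled here by the "gruu" option tag in Supported).
    return (
        "subscribe" in _tokens(headers.get("Allow", ""))
        and "kpml" in _tokens(headers.get("Allow-Events", ""))
        and "gruu" in _tokens(headers.get("Supported", ""))
    )


def supports_presentation_capable(headers: Dict[str, str]) -> bool:
    # REFER in Allow, "refer-context" in Supported, and the http URI
    # scheme among the Contact capabilities.
    return (
        "refer" in _tokens(headers.get("Allow", ""))
        and "refer-context" in _tokens(headers.get("Supported", ""))
        and "http" in _contact_schemes(headers.get("Contact", ""))
    )


def supported_markups(headers: Dict[str, str]) -> List[str]:
    """Markup (MIME) types listed in the Accept header field, sorted."""
    return sorted(_tokens(headers.get("Accept", "")))
```

An application on the call path would run these checks against the dialog-initiating request or response; a third party application would apply the same logic to capability data obtained by other means.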
 880  An application MUST NOT attempt to push a user interface component to
 881  a user agent until it has determined that the user agent has the
 882  necessary capabilities and a dialog has been created. In the case of
 883  a UAC, this means that an application MUST NOT push a user interface
 884  component for an INVITE initiated dialog until the application has
 885  seen a request confirming the receipt of a dialog-creating response.
 886  This could be an ACK for a 200 OK, or a PRACK for a provisional
 887  response [2]. For SUBSCRIBE initiated dialogs, it MUST NOT push a
 888  user interface component until the application has seen a 200 OK to
 889  the NOTIFY request. For a user interface component on a UAS, the
 890  application MUST NOT push a user interface component for an INVITE
 891  initiated dialog until it has seen a dialog-creating response from
 892  the UAS. For a SUBSCRIBE initiated dialog, it MUST NOT push a user
 893  interface component until it has seen a NOTIFY request from the
 894  notifier.

 896  To create a presentation capable UI component on the UA, the
 897  application sends a REFER request to the UA. This REFER MUST be sent
 898  to the Globally Routable UA URI (GRUU) [8] advertised by that UA in
 899  the Contact header field of the dialog initiating request or response
 900  sent by that UA. Note that this REFER request creates a separate
 901  dialog between the application and the UA. The Refer-To header field
 902  of the REFER request MUST contain an HTTP URI that references the
 903  markup document to be fetched.

 905  Furthermore, it is essential for the REFER request to be correlated
 906  with the dialog to which the user interface component will be
 907  associated. This is necessary for authorization and for terminating
 908  the user interface components when the dialog terminates. To provide
 909  this context, this specification defines the "context" header field
 910  parameter as an extension to the Refer-To header field.
      The grammar
 911  for this header field parameter is:

 913  refer-to-ctxt = "context" EQUAL DQUOTE local-tag "," remote-tag
 914                  "," callid DQUOTE ; callid defined in RFC 3261
 915                  ;; NOTE: any DQUOTEs inside callid MUST be escaped
 916                  ;; using quoted pair
 917  local-tag     = token
 918  remote-tag    = token

 920  Refer-To = ("Refer-To" / "r") HCOLON ( name-addr / addr-spec ) *
 921             (SEMI (generic-param / refer-to-ctxt))

 923  The application MUST include the context header field parameter in
 924  the REFER request. The remote-tag MUST be set to the remote tag of
 925  the dialog as seen by the user device. The local-tag MUST be set to
 926  the local tag of the dialog as seen by the user device. The callid
 927  MUST be set to the Call-ID of the dialog as seen by the device.
 928  Since the callid grammar allows it to contain double quotes, any such
 929  double quotes MUST be represented with a quoted pair.

 931  Since the "context" parameter in the Refer-To header field must be
 932  understood by the UA to process the request, this specification
 933  defines a new SIP option tag, "refer-context". A REFER request
 934  generated by an application MUST include a Require header field with
 935  this option tag value. Fortunately, the application will know ahead
 936  of time whether this extension is supported, as discussed in Section
 937  7.1.1.

 939  To create a presentation free user interface component, the
 940  application sends a SUBSCRIBE request to the UA. The SUBSCRIBE MUST
 941  be sent to the GRUU advertised by the UA. This SUBSCRIBE request
 942  creates a separate dialog. The SUBSCRIBE request MUST use the KPML
 944  [7] event package. The Event header field MUST contain parameters
 945  which identify the particular dialog that the interface component is
 946  being instantiated against. The body of the SUBSCRIBE request
 947  contains the markup document that defines the conditions under which
 948  the application wishes to be notified of user input.
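
As an illustration of the "context" parameter grammar and the Require rule above, the following sketch builds the two REFER header fields that carry the correlation. The function names, and the choice to also escape backslashes in the callid, are assumptions made for this example; the text above only requires that double quotes be represented with a quoted pair. The tags and Call-ID are supplied as seen by the user device.

```python
from typing import Dict


def quote_callid(call_id: str) -> str:
    """Escape a Call-ID for use inside the quoted context value.

    Double quotes become quoted pairs, as the grammar note requires.
    Escaping backslashes as well is an added assumption, so that the
    quoted value can be unescaped unambiguously.
    """
    return call_id.replace("\\", "\\\\").replace('"', '\\"')


def context_param(local_tag: str, remote_tag: str, call_id: str) -> str:
    """Build context="local-tag,remote-tag,callid"."""
    return 'context="{},{},{}"'.format(
        local_tag, remote_tag, quote_callid(call_id)
    )


def refer_headers(markup_uri: str, local_tag: str, remote_tag: str,
                  call_id: str) -> Dict[str, str]:
    """Header fields a REFER needs to push a presentation capable component."""
    if not markup_uri.lower().startswith("http://"):
        raise ValueError("Refer-To must carry an HTTP URI")
    return {
        "Refer-To": "<{}>;{}".format(
            markup_uri, context_param(local_tag, remote_tag, call_id)
        ),
        # The UA must understand the context parameter to process the REFER.
        "Require": "refer-context",
    }
```

For example, refer_headers("http://app.example.com/ui.html", "a1", "b2", "c3@host") yields a Refer-To value of <http://app.example.com/ui.html>;context="a1,b2,c3@host"; the URI and tag values here are made up.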
 950  In both cases, the REFER or SUBSCRIBE request SHOULD include a
 951  display name in the From header field which identifies the name of
 952  the application. For example, a prepaid calling card might include a
 953  From header field which looks like:

 955  From: "Prepaid Calling Card"

 957  Any of the SIP identity assertion mechanisms that have been defined,
 958  such as [10] and [12], are applicable to these requests as well.

 960  7.1.3 Updating an Interface Component

 962  Once a user interface component has been created on a client, it can
 963  be updated. The means for updating it depends on the type of UI
 964  component.

 966  Presentation capable UI components are updated using techniques
 967  already in place for those markups. In particular, user input will
 968  cause an HTTP POST operation to push the user input to the
 969  application. The result of the POST operation is a new markup that
 970  the UI is supposed to use. This allows the UI to be updated in response
 971  to user action. Some markups, such as HTML, provide the ability to
 972  force a refresh after a certain period of time, so that the UI can be
 973  updated without user input. Those mechanisms can be used here as
 974  well. However, there is no support for an asynchronous push of an
 975  updated UI component from the application to the user agent. A new
 976  REFER request to the same GRUU would create a new UI component rather
 977  than updating any components already in place.

 979  For presentation free UI, the story is different. The application
 980  MAY update the filter at any time by generating a SUBSCRIBE refresh
 981  with the new filter. The UA will immediately begin using this new
 982  filter.

 984  7.1.4 Terminating an Interface Component

 986  User interface components have a well defined lifetime. They are
 987  created when the component is first pushed to the client. User
 988  interface components are always associated with the SIP dialog on
 989  which they were pushed.
      As such, their lifetime is bounded by the
 990  lifetime of the dialog. When the dialog ends, so does the interface
 991  component.

 993  However, there are some cases where the application would like to
 994  terminate the user interface component before its natural termination
 995  point. For presentation capable user interfaces, this is not
 996  possible. For presentation free user interfaces, the application MAY
 997  terminate the component by sending a SUBSCRIBE with Expires equal to
 998  zero. This terminates the subscription, which removes the UI
 999  component.

1001  A client can remove a UI component at any time. For presentation
1002  capable UI, this is analogous to the user dismissing the web form
1003  window. There is no mechanism provided for reporting this kind of
1004  event to the application. The application MUST be prepared to time
1005  out, and never receive input from a user. The duration of this
1006  timeout is application dependent. For presentation free user
1007  interfaces, the UA can explicitly terminate the subscription. This
1008  will result in the generation of a NOTIFY with a Subscription-State
1009  header field equal to "terminated".

1011  7.2 Client Remote Interfaces

1013  As an alternative to, or in conjunction with, client local user
1014  interfaces, an application can make use of client remote user
1015  interfaces. These user interfaces can execute co-resident with the
1016  application itself (in which case no standardized interfaces between
1017  the UI and the application need to be used), or they can run
1018  separately. This framework assumes that the user interface runs on a
1019  host that has a sufficient trust relationship with the application.
1020  As such, the means for instantiating the user interface is not
1021  considered here.

1023  The primary issue is to connect the user device to the remote user
1024  interface. Doing so requires the manipulation of media streams
1025  between the client and the user interface.
      Such manipulation can
1026  only be done by user agents. There are two types of user agent
1027  applications within this framework - originating/terminating
1028  applications, and intermediary applications.

1030  7.2.1 Originating and Terminating Applications

1032  Originating and terminating applications are applications which are
1033  themselves the originator or the final recipient of a SIP invitation.
1034  They are "pure" user agent applications - not back-to-back user
1035  agents. The classic example of such an application is an interactive
1036  voice response (IVR) application, which is typically a terminating
1037  application. It is a terminating application because the user
1038  explicitly calls it; i.e., it is the actual called party. An example
1039  of an originating application is a wakeup call application, which
1040  calls a user at a specified time in order to wake them up.

1042  Because originating and terminating applications are a natural
1043  termination point of the dialog, manipulation of the media session by
1044  the application is trivial. Traditional SIP techniques for adding
1045  and removing media streams, modifying codecs, and changing the
1046  address of the recipient of the media streams, can be applied.
1047  Similarly, the application can directly authenticate itself to the
1048  user through S/MIME, since it is the peer UA in the dialog.

1050  7.2.2 Intermediary Applications

1052  Intermediary applications are, at the same time, more common than
1053  originating/terminating applications, and more complex. Intermediary
1054  applications are applications that are neither the actual caller nor
1055  called party. Rather, they represent a "third party" that wishes to
1056  interact with the user. The classic example is the ubiquitous
1057  pre-paid calling card application.
1059  In order for the intermediary application to add a client remote user
1060  interface, it needs to manipulate the media streams of the user agent
1061  to terminate on that user interface. This also introduces a
1062  fundamental feature interaction issue. Since the intermediary
1063  application is not an actual participant in the call, the user will
1064  need to interact with both the intermediary application and its peer
1065  in the dialog. Doing both at the same time is complicated, and is
1066  discussed in more detail in Section 9.

1068  8. User Agent Behavior

1070  8.1 Advertising Capabilities

1072  In order to participate in applications that make use of stimulus
1073  interfaces, a user agent needs to advertise its interaction
1074  capabilities.

1076  If a user agent supports presentation capable user interfaces, it
1077  MUST support the REFER method, along with the "context" extension
1078  defined here. It MUST include, in all dialog initiating requests and
1079  responses, an Allow header field that includes the REFER method
1080  and a Supported header field that includes the value
1081  "refer-context". Furthermore, the UA MUST support the SIP user agent
1082  capabilities specification [5]. The UA MUST be capable of being
1083  REFER'd to an HTTP URI. It MUST include, in the Contact header field
1084  of its dialog initiating requests and responses, a "schemes" Contact
1085  header field parameter that includes the http URI scheme. The UA MUST
1086  include, in all dialog initiating requests and responses, an Accept
1087  header field listing all of those markups supported by the UA. It is
1088  RECOMMENDED that all user agents that support presentation capable
1089  user interfaces support HTML.

1091  If a user agent supports presentation free user interfaces, it MUST
1092  support the SUBSCRIBE [3] method. It MUST support the KPML [7] event
1093  package.
      It MUST include, in all dialog initiating requests and
1094  responses, an Allow header field that includes the SUBSCRIBE method.
1095  It MUST include, in all dialog initiating requests and responses, an
1096  Allow-Events header field that lists the KPML event package. The UA
1097  MUST include, in all dialog initiating requests and responses, an
1098  Accept header field listing those event filters it supports. At a
1099  minimum, a UA MUST support the "application/kpml-request+xml" MIME
1100  type.

1102  For either presentation free or presentation capable user interfaces,
1103  the user agent MUST support the GRUU [8] specification. The Contact
1104  header field in all dialog initiating requests and responses MUST
1105  contain a GRUU. The UA MUST include a Supported header field which
1106  contains the "gruu" option tag.

1108  Because these headers are examined by proxies which may be executing
1109  applications, a UA that wishes to support client local user
1110  interfaces should not encrypt them.

1112  8.2 Receiving User Interface Components

1114  Once the UA has created a dialog (in either the early or confirmed
1115  states), it MUST be prepared to receive a SUBSCRIBE or REFER request
1116  against its GRUU. If the UA receives such a request prior to the
1117  establishment of a dialog, the UA MUST reject the request.

1119  A user agent SHOULD attempt to authenticate the sender of the
1120  request. The sender will generally be an application, and therefore
1121  the user agent is unlikely to ever have a shared secret with it,
1122  making digest authentication useless. However, authenticated
1123  identities can be obtained through other means, such as [10].

1125  A user agent MAY have pre-defined authorization policies which permit
1126  applications which have authenticated themselves with a particular
1127  identity, to push user interface components. If such a set of
1128  policies is present, it is checked first. If the application is
1129  authorized, processing proceeds.
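
The advertising rules of Section 8.1 can be collected into one sketch that produces the header field values a user agent would add to its dialog initiating requests and responses. The function name and its parameters are hypothetical, the GRUU is passed in as an opaque URI, and a real UA's Allow and Supported header fields would list further methods and option tags; only the entries this framework requires are shown.

```python
from typing import Dict, List

KPML_REQUEST_TYPE = "application/kpml-request+xml"


def advertised_headers(gruu: str, markups: List[str],
                       presentation_free: bool) -> Dict[str, str]:
    """Header fields advertising client local UI support.

    markups: MIME types of presentation capable markups the UA can render
             (an empty list means no presentation capable support).
    presentation_free: whether the UA supports KPML subscriptions.
    """
    if not markups and not presentation_free:
        raise ValueError("no client local user interface support to advertise")

    allow: List[str] = []
    supported = ["gruu"]             # required for either kind of interface
    accept = list(markups)
    contact = "<{}>".format(gruu)    # the Contact MUST contain a GRUU
    headers: Dict[str, str] = {}

    if markups:                      # presentation capable
        allow.append("REFER")
        supported.append("refer-context")
        contact += ';schemes="http"'
    if presentation_free:
        allow.append("SUBSCRIBE")
        headers["Allow-Events"] = "kpml"
        if KPML_REQUEST_TYPE not in accept:
            accept.append(KPML_REQUEST_TYPE)

    headers["Allow"] = ", ".join(allow)
    headers["Supported"] = ", ".join(supported)
    headers["Contact"] = contact
    headers["Accept"] = ", ".join(accept)
    return headers
```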
1131  If the application has authenticated itself, but it is not explicitly
1132  authorized or blocked, this specification RECOMMENDS that the
1133  application be automatically authorized if it can prove that it was
1134  either on the call path, or is trusted by one of the elements on the
1135  call path. An application proves this to the user agent by
1136  presenting it with the dialog identifiers in the SUBSCRIBE or REFER
1137  request. In the case of SUBSCRIBE, those identifiers are present in
1138  the Event header field [7]. In the case of REFER, those identifiers
1139  are present in the "context" parameter of the Refer-To header field.

1141  Because the dialog identifiers serve as a tool for authorization,
1142  a user agent compliant to this framework SHOULD use dialog
1143  identifiers that are cryptographically random, with at least 128 bits
1144  of randomness. It is recommended that this randomness be split
1145  between the Call-ID and From header field tag in the case of a UAC.

1147  Furthermore, to ensure that only applications resident in or trusted
1148  by on-path elements can instantiate a user interface component, a
1149  user agent compliant to this specification SHOULD use the sips URI
1150  scheme for all dialogs it initiates. This will guarantee secure
1151  links between all of the elements on the signaling path.

1153  If the dialog was not established with a sips URI, or the user agent
1154  did not choose cryptographically random dialog identifiers, then the
1155  application MUST NOT automatically be authorized, even if it
1156  presented valid dialog identifiers. A user agent MAY apply any other
1157  policies in addition to (but not instead of) the ones specified here
1158  in order to authorize the creation of the user interface component.
1159  One such mechanism would be to prompt the user, informing them of the
1160  identity of the application and the dialog it is associated with.
   If an authorization policy requires user interaction, the user agent
   SHOULD respond to the SUBSCRIBE or REFER request with a 202. In the
   case of SUBSCRIBE, if authorization is not granted, the user agent
   SHOULD generate a NOTIFY to terminate the subscription. In the case
   of REFER, the user agent MUST NOT act upon the URI in the Refer-To
   header field until user authorization has been obtained.

   If an application does not present a valid dialog identifier in its
   REFER or SUBSCRIBE request, the user agent MUST reject the request
   with a 403 response.

   If a REFER request to an HTTP URI is authorized, the UA executes the
   URI and fetches the content to be rendered to the user. This
   instantiates a presentation capable user interface component. If a
   SUBSCRIBE is authorized, a presentation free user interface
   component is instantiated.

8.3 Mapping User Input to User Interface Components

   Once the user interface components are instantiated, the user agent
   must direct user input to the appropriate component. In the case of
   presentation capable user interfaces, this process is known as focus
   selection. It is done by means that are specific to the user
   interface on the device. In the case of a PC, for example, the
   window manager would allow the user to select the appropriate user
   interface component that their input is directed to.

   For presentation free user interfaces, the situation is more
   complicated. In some cases, the device may support a mechanism that
   allows the user to select a "line", and thus the associated dialog.
   Any user input on the keypad while this line is selected is fed to
   the user interface components associated with that dialog.

   Otherwise, for client local user interfaces, the user input is
   assumed to be associated with all user interface components.
   For client remote user interfaces, the user device converts the user
   input to media, typically conveyed using RFC 2833, and sends this to
   the client remote user interface. This user interface then needs to
   map user input from potentially many media streams into user
   interface events. The process for doing this is described in Section
   6.3.

8.4 Receiving Updates to User Interface Components

   For presentation capable user interfaces, updates to the user
   interface occur in ways specific to that user interface component.
   In the case of HTML, for example, the document can tell the client to
   fetch a new document periodically. However, this framework does not
   provide any additional machinery to asynchronously push a new user
   interface component to the client.

   For presentation free user interfaces, an application can push an
   update to a component by sending a SUBSCRIBE refresh with a new
   filter. The user agent will process these according to the rules of
   the event package.

8.5 Terminating a User Interface Component

   Termination of a presentation capable user interface component is a
   trivial procedure. The user agent merely dismisses the window (or
   equivalent). The fact that the component is dismissed is not
   communicated to the application. As such, it is purely a local
   matter.

   In the case of a presentation free user interface, the user might
   wish to cease interacting with the application. However, most
   presentation free user interfaces will not have a way for the user to
   signal this through the device. If such a mechanism does exist, the
   UA SHOULD generate a NOTIFY request with a Subscription-State equal
   to "terminated" and a reason of "rejected". This tells the
   application that the component has been removed, and that it should
   not attempt to re-subscribe.

9. Inter-Application Feature Interaction

   The inter-application feature interaction problem is inherent to
   stimulus signaling. Whenever there are multiple applications, there
   are multiple user interfaces. The system has to determine to which
   user interface any particular input is destined. That question is
   the essence of the inter-application feature interaction problem.

   Inter-application feature interaction is not an easy problem to
   resolve. For now, we consider separately the issues for client-local
   and client-remote user interface components.

9.1 Client Local UI

   When the user interface itself resides locally on the client device,
   the feature interaction problem is actually much simpler. The end
   device knows explicitly about each application, and therefore can
   present the user with each one separately. When the user provides
   input, the client device can determine to which user interface the
   input is destined. The user interface to which input is destined is
   referred to as the application in focus, and the means by which the
   focused application is selected is called focus determination.

   Generally speaking, focus determination is purely a local operation.
   In the PC universe, focus determination is provided by window
   managers. An application does not know about focus; it merely
   receives the user input that has been targeted to it when it is in
   focus. This basic concept applies to SIP-based applications as well.

   Focus determination will frequently be trivial, depending on the user
   interface type. Consider a user who makes a call from a PC. The
   call passes through a pre-paid calling card application, and a call
   recording application. Both of these wish to interact with the user.
   Both push an HTML-based user interface to the user. On the PC, each
   user interface would appear as a separate window.
   The user interacts
   with the call recording application by selecting its window, and with
   the pre-paid calling card application by selecting its window. Focus
   determination is literally provided by the PC window manager. It is
   clear to which application the user input is targeted.

   As another example, consider the same two applications, but on a
   "smart phone" that has a set of buttons, and next to each button, an
   LCD display that can provide the user with an option. This user
   interface can be represented using the Wireless Markup Language
   (WML).

   The phone would allocate some number of buttons to each application.
   The pre-paid calling card would get one button for its "hangup"
   command, and the recording application would get one for its
   "start/stop" command. The user can easily determine which
   application to interact with by pressing the appropriate button.
   Pressing a button determines focus and provides user input, both at
   the same time.

   Unfortunately, not all devices will have these advanced displays. A
   PSTN gateway, or a basic IP telephone, may only have a 12-key keypad.
   The user interfaces for these devices are provided through the Keypad
   Markup Language (KPML). Considering once again the feature
   interaction case above, the pre-paid calling card application and the
   call recording application would both pass a KPML document to the
   device. When the user presses a button on the keypad, to which
   document does the input apply? The device does not allow the user to
   select. A device where the user cannot provide focus is called a
   focusless device. This is quite a hard problem to solve. This
   framework does not make any explicit normative recommendation, but
   concludes that the best option is to send the input to both user
   interfaces unless the markup in one interface has indicated that it
   should be suppressed from others.
   This is a sensible choice by
   analogy: it is exactly what the existing circuit switched telephone
   network will do. It is an explicit non-goal to provide a better
   mechanism for feature interaction resolution than the PSTN on devices
   which have the same user interface as they do on the PSTN. Devices
   with better displays, such as PCs or screen phones, can benefit from
   the capabilities of this framework, allowing the user to determine
   which application they are interacting with.

   Indeed, when a user provides input on a focusless device, the input
   must be passed to all client local user interfaces, and to all client
   remote user interfaces, unless the markup tells the UI to suppress
   the media. In the case of KPML, key events are passed to remote user
   interfaces by encoding them in RFC 2833 [17]. Of course, since a
   client cannot determine if a media stream terminates in a remote user
   interface or not, these key events are passed in all audio media
   streams unless the KPML request document is used to suppress them.

9.2 Client-Remote UI

   When the user interfaces run remotely, the determination of focus can
   be much harder. There are many architectures that can be
   deployed to handle the interaction. None are ideal. However, all
   are beyond the scope of this specification.

10. Intra Application Feature Interaction

   An application can instantiate a multiplicity of user interface
   components. For example, a single application can instantiate two
   separate HTML components and one WML component. Furthermore, an
   application can instantiate both client local and client remote user
   interfaces.

   The feature interaction issues between these components within the
   same application are less severe.
   If an application has multiple
   client user interface components, their interaction is resolved
   identically to the inter-application case - through focus
   determination. However, the problems in focusless user devices (such
   as a keypad on a telephone) generally won't exist, since the
   application can generate user interfaces which do not overlap in
   their usage of an input.

   The real issue is that the optimal user experience frequently
   requires some kind of coupling between the differing user interface
   components. This is a classic problem in multi-modal user
   interfaces, such as those described by Speech Application Language
   Tags (SALT). As an example, consider a user interface where a user
   can either press a labeled button to make a selection, or listen to a
   prompt, and speak the desired selection. Ideally, when the user
   presses the button, the prompt should cease immediately, since both
   of them were targeted at collecting the same information in parallel.
   Such interactions are best handled by markups which natively support
   such interactions, such as SALT, and thus require no explicit support
   from this framework.

11. Example Call Flow

   This section shows the operation of a call recording application.
   This application allows a user to record the media in their call by
   clicking on a button in a web form. The application uses a
   presentation capable user interface component that is pushed to the
   caller.

          A                  Recording App                  B
          |(1) INVITE              |                        |
          |----------------------->|                        |
          |                        |(2) INVITE              |
          |                        |----------------------->|
          |                        |(3) 200 OK              |
          |                        |<-----------------------|
          |(4) 200 OK              |                        |
          |<-----------------------|                        |
          |(5) ACK                 |                        |
          |----------------------->|                        |
          |                        |(6) ACK                 |
          |                        |----------------------->|
          |(7) REFER               |                        |
          |<-----------------------|                        |
          |(8) 200 OK              |                        |
          |----------------------->|                        |
          |(9) NOTIFY              |                        |
          |----------------------->|                        |
          |(10) 200 OK             |                        |
          |<-----------------------|                        |
          |(11) HTTP GET           |                        |
          |----------------------->|                        |
          |(12) 200 OK             |                        |
          |<-----------------------|                        |
          |(13) NOTIFY             |                        |
          |----------------------->|                        |
          |(14) 200 OK             |                        |
          |<-----------------------|                        |
          |(15) HTTP POST          |                        |
          |----------------------->|                        |
          |(16) 200 OK             |                        |
          |<-----------------------|                        |

                                Figure 8

   First, the caller, A, sends an INVITE to set up a call (message 1).
   Since the caller supports the framework, and can handle presentation
   capable user interface components, it includes the Supported header
   field indicating that the GRUU extension and the REFER context
   extension are understood, Allow indicating that REFER is understood,
   and a Contact header field that includes the "schemes" header field
   parameter.

   INVITE sips:B@example.com SIP/2.0
   Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8
   From: Caller ;tag=kkaz-
   To: Callee
   Call-ID: faif9ahhs9dd8==-sd98ajzz@host.example.com
   CSeq: 1 INVITE
   Max-Forwards: 70
   Supported: gruu, refer-context
   Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER
   Contact: ;schemes="http,sip,sips"
   Content-Length: ...
   Content-Type: application/sdp

   --SDP not shown--

   The proxy acts as a recording server, and forwards the INVITE to the
   called party (message 2):

   INVITE sips:B@pc.example.com SIP/2.0
   Record-Route:
   Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK97sh
   Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8
   From: Caller ;tag=kkaz-
   To: Callee
   Call-ID: faif9ahhs9dd8==-sd98ajzz@host.example.com
   CSeq: 1 INVITE
   Max-Forwards: 69
   Supported: gruu, refer-context
   Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER
   Contact: ;schemes="http,sip,sips"
   Content-Length: ...
   Content-Type: application/sdp

   --SDP not shown--

   B accepts the call with a 200 OK (message 3). It does not support
   the framework, and so the various header fields are not present.

   SIP/2.0 200 OK
   Record-Route:
   Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK97sh
   Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8
   From: Caller ;tag=kkaz-
   To: Callee ;tag=7777
   Call-ID: faif9ahhs9dd8==-sd98ajzz@host.example.com
   CSeq: 1 INVITE
   Contact:
   Content-Length: ...
   Content-Type: application/sdp

   --SDP not shown--

   This 200 OK is passed back to the caller (message 4):

   SIP/2.0 200 OK
   Record-Route:
   Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8
   From: Caller ;tag=kkaz-
   To: Callee ;tag=7777
   Call-ID: faif9ahhs9dd8==-sd98ajzz@host.example.com
   CSeq: 1 INVITE
   Contact:
   Content-Length: ...
   Content-Type: application/sdp

   --SDP not shown--

   The caller generates an ACK (message 5).

   ACK sips:B@pc.example.com SIP/2.0
   Route:
   Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz9
   From: Caller ;tag=kkaz-
   To: Callee ;tag=7777
   Call-ID: faif9ahhs9dd8==-sd98ajzz@host.example.com
   CSeq: 1 ACK

   The ACK is forwarded to the called party (message 6).

   ACK sips:B@pc.example.com SIP/2.0
   Via: SIP/2.0/TLS app.example.com;branch=z9hG4bKh7s
   Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz9
   From: Caller ;tag=kkaz-
   To: Callee ;tag=7777
   Call-ID: faif9ahhs9dd8==-sd98ajzz@host.example.com
   CSeq: 1 ACK

   Now, the application decides to push a user interface component to
   user A. So, it sends it a REFER request (message 7):

   REFER sips:bad998asd8asd0000a@example.com SIP/2.0
   Refer-To: https://app.example.com/script.pl
     ;context="kkaz-,7777,faif9ahhs9dd8==-sd98ajzz@host.example.com"
   Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK9zh6
   Max-Forwards: 70
   From: Recorder Application ;tag=jhgf
   To: Caller
   Call-ID: 66676776767@app.example.com
   CSeq: 1 REFER
   Event: refer
   Contact:

   The REFER is answered by a 200 OK (message 8).

   SIP/2.0 200 OK
   Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK9zh6
   From: Recorder Application ;tag=jhgf
   To: Caller ;tag=pqoew
   Call-ID: 66676776767@app.example.com
   Supported: gruu, refer-context
   Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER
   Contact: ;schemes="http,sip,sips"
   CSeq: 1 REFER

   User A sends a NOTIFY (message 9):

   NOTIFY sips:app.example.com SIP/2.0
   Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9320394238995
   To: Recorder Application ;tag=jhgf
   From: Caller ;tag=pqoew
   Call-ID: 66676776767@app.example.com
   CSeq: 1 NOTIFY
   Max-Forwards: 70
   Event: refer;id=93809824
   Subscription-State: active;expires=3600
   Contact: ;schemes="http,sip,sips"
   Content-Type: message/sipfrag;version=2.0
   Content-Length: 20

   SIP/2.0 100 Trying

   The recording server responds with a 200 OK (message 10):

   SIP/2.0 200 OK
   Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9320394238995
   To: Recorder Application ;tag=jhgf
   From: Caller ;tag=pqoew
   Call-ID: 66676776767@app.example.com
   CSeq: 1 NOTIFY

   The
   REFER request contained a "context" Refer-To header field
   parameter with a valid dialog identifier. Furthermore, all of the
   signaling was over TLS and the dialog identifiers contain sufficient
   randomness. As such, the caller, A, automatically authorizes the
   application. It then acts on the Refer-To URI, fetching the script
   from app.example.com (message 11). The response, message 12,
   contains a web application that the user can click on to enable
   recording. Because the client executed the URL in the Refer-To, it
   generates another NOTIFY to the application, informing it of the
   successful response (message 13). This is answered with a 200 OK
   (message 14). When the user clicks on the link (message 15), the
   results are posted to the server, and an updated display is provided
   (message 16).

12. Security Considerations

   There are many security considerations associated with this
   framework. It allows applications in the network to instantiate user
   interface components on a client device. Such instantiations need to
   be from authenticated applications, and also need to be authorized to
   place a UI into the client. Indeed, the stronger requirement is
   authorization. It is not so important to know the name of the
   provider of the application, but rather, that the provider is
   authorized to instantiate components.

   This specification defines specific authorization techniques and
   requirements. Automatic authorization is granted if the application
   can prove that it is on the call path, or is trusted by an element on
   the call path. As documented above, this can be accomplished by the
   use of cryptographically random dialog identifiers and the usage of
   sips for message confidentiality. It is RECOMMENDED that sips be
   implemented by user agents compliant to this specification. This
   does not represent a change from the requirements in RFC 3261.
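
   The automatic authorization rule summarized above can be sketched as
   follows. This is a minimal illustration, not a definitive
   implementation. The names Dialog, new_dialog_identifiers,
   parse_context and decide are hypothetical, as is the table of
   dialogs keyed by identifier. The ordering of the "context" value
   (local tag, remote tag, Call-ID) is an assumption read off message 7
   of the example call flow, and the even 64/64 split of random bits is
   one possible way to meet the 128-bit recommendation.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Dialog:
    used_sips: bool    # the dialog was established with a sips URI
    random_ids: bool   # its identifiers were cryptographically random


def new_dialog_identifiers(host):
    """Return (call_id, from_tag) for a UAC, 64 random bits in each."""
    call_id = f"{secrets.token_hex(8)}@{host}"
    from_tag = secrets.token_hex(8)
    return call_id, from_tag


def parse_context(value):
    """Split a "context" value into (local_tag, remote_tag, call_id).

    Returns None when the value does not have three non-empty parts.
    The field order is an assumption taken from the example REFER.
    """
    parts = value.split(",", 2)
    if len(parts) != 3 or not all(parts):
        return None
    return tuple(parts)


def decide(context, dialogs):
    """Return "403", "other-policy", or "authorized".

    context -- value of the "context" parameter, or None if absent
    dialogs -- hypothetical dict mapping identifier tuples to Dialog
    """
    key = parse_context(context) if context else None
    dialog = dialogs.get(key) if key else None
    if dialog is None:
        # No valid dialog identifier was presented: reject with 403.
        return "403"
    if not (dialog.used_sips and dialog.random_ids):
        # Automatic authorization MUST NOT be granted; some other
        # policy, such as prompting the user, would have to decide.
        return "other-policy"
    return "authorized"
```

   With the identifiers from the example call flow stored as a dialog
   that used sips and random identifiers, decide() returns "authorized"
   for the "context" value carried in message 7.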

13. IANA Considerations

13.1 SIP Option Tag

   This specification registers a new SIP option tag, as per the
   guidelines in Section 27.1 of RFC 3261 [1].

   Name: refer-context

   Description: This option tag is used to identify the REFER extension
      that defines the "context" parameter of the Refer-To header field.

13.2 Header Field Parameter

   This specification defines a new header field parameter, as per the
   registry created by [9]. The required information is as follows:

   Header field in which the parameter can appear: Refer-To

   Name of the parameter: context

   RFC Reference: RFC XXXX [[NOTE TO IANA: Please replace XXXX with the
      RFC number of this specification.]]

14. Contributors

   This document was produced as a result of discussions amongst the
   application interaction design team. All members of this team
   contributed significantly to the ideas embodied in this document.
   The members of this team were:

   Eric Burger
   Cullen Jennings
   Robert Fairlie-Cuninghame

15. Acknowledgements

   The authors would like to thank Martin Dolly and Rohan Mahy for their
   input and comments. Thanks to Allison Mankin for her support of this
   work.

16. References

16.1 Normative References

   [1]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
        Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP:
        Session Initiation Protocol", RFC 3261, June 2002.

   [2]  Rosenberg, J. and H. Schulzrinne, "Reliability of Provisional
        Responses in Session Initiation Protocol (SIP)", RFC 3262, June
        2002.

   [3]  Roach, A., "Session Initiation Protocol (SIP)-Specific Event
        Notification", RFC 3265, June 2002.

   [4]  McGlashan, S., Lucas, B., Porter, B., Rehor, K., Burnett, D.,
        Carter, J., Ferrans, J. and A. Hunt, "Voice Extensible Markup
        Language (VoiceXML) Version 2.0", W3C CR CR-voicexml20-20030220,
        February 2003.

   [5]  Rosenberg, J., Schulzrinne, H. and P. Kyzivat, "Indicating User
        Agent Capabilities in the Session Initiation Protocol (SIP)",
        RFC 3840, August 2004.

   [6]  Sparks, R., "The Session Initiation Protocol (SIP) Refer
        Method", RFC 3515, April 2003.

   [7]  Burger, E., "A Session Initiation Protocol (SIP) Event Package
        for Key Press Stimulus (KPML)", draft-ietf-sipping-kpml-07
        (work in progress), December 2004.

   [8]  Rosenberg, J., "Obtaining and Using Globally Routable User Agent
        (UA) URIs (GRUU) in the Session Initiation Protocol (SIP)",
        draft-ietf-sip-gruu-02 (work in progress), July 2004.

   [9]  Camarillo, G., "The Internet Assigned Number Authority (IANA)
        Header Field Parameter Registry for the Session Initiation
        Protocol (SIP)", BCP 98, RFC 3968, December 2004.

16.2 Informative References

   [10] Peterson, J., "Enhancements for Authenticated Identity
        Management in the Session Initiation Protocol (SIP)",
        draft-ietf-sip-identity-03 (work in progress), September 2004.

   [11] Day, M., Rosenberg, J. and H. Sugano, "A Model for Presence and
        Instant Messaging", RFC 2778, February 2000.

   [12] Jennings, C., Peterson, J. and M. Watson, "Private Extensions
        to the Session Initiation Protocol (SIP) for Asserted Identity
        within Trusted Networks", RFC 3325, November 2002.

   [13] Rosenberg, J., "A Framework for Conferencing with the Session
        Initiation Protocol",
        draft-ietf-sipping-conferencing-framework-03 (work in
        progress), October 2004.

   [14] Rosenberg, J., Schulzrinne, H. and P. Kyzivat, "Caller
        Preferences for the Session Initiation Protocol (SIP)", RFC
        3841, August 2004.

   [15] Rosenberg, J., "An INVITE Initiated Dialog Event Package for
        the Session Initiation Protocol (SIP)",
        draft-ietf-sipping-dialog-package-05 (work in progress),
        November 2004.

   [16] Schulzrinne, H., Casner, S., Frederick, R. and V.
        Jacobson, "RTP: A Transport Protocol for Real-Time
        Applications", RFC 3550, July 2003.

   [17] Schulzrinne, H. and S. Petrack, "RTP Payload for DTMF Digits,
        Telephony Tones and Telephony Signals", RFC 2833, May 2000.

   [18] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with
        Session Description Protocol (SDP)", RFC 3264, June 2002.

   [19] Rosenberg, J., "A Session Initiation Protocol (SIP) Event
        Package for Registrations", RFC 3680, March 2004.

Author's Address

   Jonathan Rosenberg
   Cisco Systems
   600 Lanidex Plaza
   Parsippany, NJ 07054
   US

   Phone: +1 973 952-5000
   EMail: jdrosen@cisco.com
   URI:   http://www.jdrosen.net

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2005). This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.