idnits 2.17.1 draft-ietf-sipping-app-interaction-framework-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3667, Section 5.1 on line 14. -- Found old boilerplate from RFC 3978, Section 5.5 on line 1734. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1711. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1718. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1724. ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line 1740), which is fine, but *also* found old RFC 2026, Section 10.4C, paragraph 1 text on line 36. ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure Acknowledgement -- however, there's a paragraph with a matching beginning. Boilerplate error? ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate instead of verbatim RFC 3978 boilerplate. After 6 May 2005, submission of drafts without verbatim RFC 3978 boilerplate is not accepted. The following non-3978 patterns matched text found in the document. That text should be removed or replaced: By submitting this Internet-Draft, I certify that any applicable patent or other IPR claims of which I am aware have been disclosed, or will be disclosed, and any of which I become aware will be disclosed, in accordance with RFC 3668. 
Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 6 instances of too long lines in the document, the longest one being 4 characters in excess of 72. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 785: '...ription). As such, user agents SHOULD...' RFC 2119 keyword, line 840: '... the application MAY push presentation...' RFC 2119 keyword, line 850: '... the application MAY push presentation...' RFC 2119 keyword, line 870: '... An application MUST NOT attempt to p...' RFC 2119 keyword, line 873: '...t an application MUST NOT push a user ...' (49 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (October 24, 2004) is 7123 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) ** Obsolete normative reference: RFC 3265 (ref. '3') (Obsoleted by RFC 6665) -- Possible downref: Non-RFC (?) normative reference: ref. '4' == Outdated reference: A later version (-08) exists of draft-ietf-sipping-kpml-04 == Outdated reference: A later version (-15) exists of draft-ietf-sip-gruu-02 == Outdated reference: A later version (-06) exists of draft-ietf-sip-identity-03 == Outdated reference: A later version (-05) exists of draft-ietf-sipping-conferencing-framework-02 == Outdated reference: A later version (-06) exists of draft-ietf-sipping-dialog-package-04 -- Obsolete informational reference (is this intentional?): RFC 2833 (ref. '17') (Obsoleted by RFC 4733, RFC 4734) Summary: 9 errors (**), 0 flaws (~~), 7 warnings (==), 9 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 SIPPING J. Rosenberg 2 Internet-Draft Cisco Systems 3 Expires: April 24, 2005 October 24, 2004 5 A Framework for Application Interaction in the Session Initiation 6 Protocol (SIP) 7 draft-ietf-sipping-app-interaction-framework-03 9 Status of this Memo 11 By submitting this Internet-Draft, I certify that any applicable 12 patent or other IPR claims of which I am aware have been disclosed, 13 and any of which I become aware will be disclosed, in accordance with 14 RFC 3668. 16 Internet-Drafts are working documents of the Internet Engineering 17 Task Force (IETF), its areas, and its working groups. Note that 18 other groups may also distribute working documents as 19 Internet-Drafts. 
21 Internet-Drafts are draft documents valid for a maximum of six months 22 and may be updated, replaced, or obsoleted by other documents at any 23 time. It is inappropriate to use Internet-Drafts as reference 24 material or to cite them other than as "work in progress." 26 The list of current Internet-Drafts can be accessed at 27 http://www.ietf.org/ietf/1id-abstracts.txt. 29 The list of Internet-Draft Shadow Directories can be accessed at 30 http://www.ietf.org/shadow.html. 32 This Internet-Draft will expire on April 24, 2005. 34 Copyright Notice 36 Copyright (C) The Internet Society (2004). All Rights Reserved. 38 Abstract 40 This document describes a framework for the interaction between users 41 and Session Initiation Protocol (SIP) based applications. By 42 interacting with applications, users can guide the way in which they 43 operate. The focus of this framework is stimulus signaling, which 44 allows a user agent to interact with an application without knowledge 45 of the semantics of that application. Stimulus signaling can occur 46 to a user interface running locally with the client, or to a remote 47 user interface, through media streams. Stimulus signaling 48 encompasses a wide range of mechanisms, ranging from clicking on 49 hyperlinks, to pressing buttons, to traditional Dual Tone Multi 50 Frequency (DTMF) input. In all cases, stimulus signaling is 51 supported through the use of markup languages, which play a key role 52 in this framework. 54 Table of Contents 56 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 57 2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . . 4 58 3. A Model for Application Interaction . . . . . . . . . . . . . 7 59 3.1 Functional vs. Stimulus . . . . . . . . . . . . . . . . . 8 60 3.2 Real-Time vs. Non-Real Time . . . . . . . . . . . . . . . 9 61 3.3 Client-Local vs. Client-Remote . . . . . . . . . . . . . . 9 62 3.4 Presentation Capable vs. Presentation Free . . . . . . . . 10 63 4. 
Interaction Scenarios on Telephones . . . . . . . . . . . . . 11 64 4.1 Client Remote . . . . . . . . . . . . . . . . . . . . . . 11 65 4.2 Client Local . . . . . . . . . . . . . . . . . . . . . . . 12 66 4.3 Flip-Flop . . . . . . . . . . . . . . . . . . . . . . . . 12 67 5. Framework Overview . . . . . . . . . . . . . . . . . . . . . . 13 68 6. Deployment Topologies . . . . . . . . . . . . . . . . . . . . 15 69 6.1 Third Party Application . . . . . . . . . . . . . . . . . 16 70 6.2 Co-Resident Application . . . . . . . . . . . . . . . . . 16 71 6.3 Third Party Application and User Device Proxy . . . . . . 17 72 6.4 Proxy Application . . . . . . . . . . . . . . . . . . . . 18 73 7. Application Behavior . . . . . . . . . . . . . . . . . . . . . 19 74 7.1 Client Local Interfaces . . . . . . . . . . . . . . . . . 19 75 7.1.1 Discovering Capabilities . . . . . . . . . . . . . . . 19 76 7.1.2 Pushing an Initial Interface Component . . . . . . . . 20 77 7.1.3 Updating an Interface Component . . . . . . . . . . . 22 78 7.1.4 Terminating an Interface Component . . . . . . . . . . 22 79 7.2 Client Remote Interfaces . . . . . . . . . . . . . . . . . 23 80 7.2.1 Originating and Terminating Applications . . . . . . . 23 81 7.2.2 Intermediary Applications . . . . . . . . . . . . . . 24 82 8. User Agent Behavior . . . . . . . . . . . . . . . . . . . . . 24 83 8.1 Advertising Capabilities . . . . . . . . . . . . . . . . . 24 84 8.2 Receiving User Interface Components . . . . . . . . . . . 25 85 8.3 Mapping User Input to User Interface Components . . . . . 26 86 8.4 Receiving Updates to User Interface Components . . . . . . 27 87 8.5 Terminating a User Interface Component . . . . . . . . . . 27 88 9. Inter-Application Feature Interaction . . . . . . . . . . . . 28 89 9.1 Client Local UI . . . . . . . . . . . . . . . . . . . . . 28 90 9.2 Client-Remote UI . . . . . . . . . . . . . . . . . . . . . 29 91 10. Intra Application Feature Interaction . . . . . . . . . . . 29 92 11. 
Example Call Flow . . . . . . . . . . . . . . . . . . . . . 30 93 12. Security Considerations . . . . . . . . . . . . . . . . . . 35 94 13. IANA Considerations . . . . . . . . . . . . . . . . . . . . 36 95 13.1 SIP Option Tag . . . . . . . . . . . . . . . . . . . . . . 36 96 13.2 Header Field Parameter . . . . . . . . . . . . . . . . . . 36 98 14. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 36 99 15. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 36 100 16. References . . . . . . . . . . . . . . . . . . . . . . . . . 37 101 16.1 Normative References . . . . . . . . . . . . . . . . . . . . 37 102 16.2 Informative References . . . . . . . . . . . . . . . . . . . 37 103 Author's Address . . . . . . . . . . . . . . . . . . . . . . . 38 104 Intellectual Property and Copyright Statements . . . . . . . . 39 106 1. Introduction 108 The Session Initiation Protocol (SIP) [1] provides the ability for 109 users to initiate, manage, and terminate communications sessions. 110 Frequently, these sessions will involve a SIP application. A SIP 111 application is defined as a program running on a SIP-based element 112 (such as a proxy or user agent) that provides some value-added 113 function to a user or system administrator. Examples of SIP 114 applications include pre-paid calling card calls, conferencing, and 115 presence-based [11] call routing. 117 In order for most applications to properly function, they need input 118 from the user to guide their operation. As an example, a pre-paid 119 calling card application requires the user to input their calling 120 card number, their PIN code, and the destination number they wish to 121 reach. The process by which a user provides input to an application 122 is called "application interaction". 124 Application interaction can be either functional or stimulus. 125 Functional interaction requires the user device to understand the 126 semantics of the application, whereas stimulus interaction does not. 
127 Stimulus signaling allows for applications to be built without 128 requiring modifications to the user device. Stimulus interaction is 129 the subject of this framework. The framework provides a model for 130 how users interact with applications through user interfaces, and how 131 user interfaces and applications can be distributed throughout a 132 network. This model is then used to describe how applications can 133 instantiate and manage user interfaces. 135 2. Definitions 137 SIP Application: A SIP application is defined as a program running on 138 a SIP-based element (such as a proxy or user agent) that provides 139 some value-added function to a user or system administrator. 140 Examples of SIP applications include pre-paid calling card calls, 141 conferencing, and presence-based [11] call routing. 143 Application Interaction: The process by which a user provides input 144 to an application. 146 Real-Time Application Interaction: Application interaction that takes 147 place while an application instance is executing. For example, 148 when a user enters their PIN number into a pre-paid calling card 149 application, this is real-time application interaction. 151 Non-Real Time Application Interaction: Application interaction that 152 takes place asynchronously with the execution of the application. 153 Generally, non-real time application interaction is accomplished 154 through provisioning. 156 Functional Application Interaction: Application interaction is 157 functional when the user device has an understanding of the 158 semantics of the interaction with the application. 160 Stimulus Application Interaction: Application interaction is 161 considered to be stimulus when the user device has no 162 understanding of the semantics of the interaction with the 163 application. 165 User Interface (UI): The user interface provides the user with 166 context in order to make decisions about what they want. 
The user 167 interacts with the device, which conveys the user input to the 168 user interface. The user interface interprets the information, 169 and passes it to the application. 171 User Interface Component: A piece of user interface which operates 172 independently of other pieces of the user interface. For example, 173 a user might have two separate web interfaces to a pre-paid 174 calling card application - one for hanging up and making another 175 call, and another for entering the username and PIN. 177 User Device: The software or hardware system that the user directly 178 interacts with in order to communicate with the application. An 179 example of a user device is a telephone. Another example is a PC 180 with a web browser. 182 User Device Proxy: A software or hardware system that a user 183 indirectly interacts through in order to communicate with the 184 application. This indirection can be through a network. An 185 example is a gateway from IP to the Public Switched Telephone 186 Network (PSTN). It acts as a user device proxy, acting on behalf of 187 the user on the circuit network. 189 User Input: The "raw" information passed from a user to a user 190 interface. Examples of user input include a spoken word or a 191 click on a hyperlink. 193 Client-Local User Interface: A user interface which is co-resident 194 with the user device. 196 Client-Remote User Interface: A user interface which executes 197 remotely from the user device. In this case, a standardized 198 interface is needed between the user device and the user 199 interface. Typically, this is done through media sessions - 200 audio, video, or application sharing. 202 Media Interaction: A means of separating a user and a user interface 203 by connecting them with media streams. 205 Interactive Voice Response (IVR): An IVR is a type of user interface 206 that allows users to speak commands to the application, and hear 207 responses to those commands prompting for more information. 
209 Prompt-and-Collect: The basic primitive of an IVR user interface. 210 The user is presented with a voice option, and the user speaks 211 their choice. 213 Barge-In: In an IVR user interface, a user is prompted to enter some 214 information. With some prompts, the user may enter the requested 215 information before the prompt completes. In that case, the prompt 216 ceases. The act of entering the information before completion of 217 the prompt is referred to as barge-in. 219 Focus: A user interface component has focus when user input is 220 provided to it, as opposed to any other user interface 221 components. This is not to be confused with the term focus within 222 the SIP conferencing framework, which refers to the center user 223 agent in a conference [13]. 225 Focus Determination: The process by which the user device determines 226 which user interface component will receive the user input. 228 Focusless User Interface: A user interface which has no ability to 229 perform focus determination. An example of a focusless user 230 interface is a keypad on a telephone. 232 Presentation Capable UI: A user interface which can prompt the user 233 with input, collect results, and then prompt the user with new 234 information based on those results. 236 Presentation Free UI: A user interface which cannot prompt the user 237 with information. 239 Feature Interaction: A class of problems which result when multiple 240 applications or application components are trying to provide 241 services to a user at the same time. 243 Inter-Application Feature Interaction: Feature interactions that 244 occur between applications. 246 DTMF: Dual-Tone Multi-Frequency. DTMF refers to a class of tones 247 generated by circuit switched telephony devices when the user 248 presses a key on the keypad. 
As a result, DTMF and keypad input 249 are often used synonymously, when in fact one of them (DTMF) is 250 merely a means of conveying the other (the keypad input) to a 251 client-remote user interface (the switch, for example). 253 Application Instance: A single execution path of a SIP application. 255 Originating Application: A SIP application which acts as a UAC, 256 making a call on behalf of the user. 258 Terminating Application: A SIP application which acts as a UAS, 259 answering a call generated by a user. IVR applications are 260 terminating applications. 262 Intermediary Application: A SIP application which is neither the 263 caller nor callee, but rather, a third party involved in a call. 265 3. A Model for Application Interaction 267 +---+ +---+ +---+ +---+ 268 | | | | | | | | 269 | | | U | | U | | A | 270 | | Input | s | Input | s | Results | p | 271 | | ---------> | e | ---------> | e | ----------> | p | 272 | U | | r | | r | | l | 273 | s | | | | | | i | 274 | e | | D | | I | | c | 275 | r | Output | e | Output | f | Update | a | 276 | | <--------- | v | <--------- | a | <.......... | t | 277 | | | i | | c | | i | 278 | | | c | | e | | o | 279 | | | e | | | | n | 280 | | | | | | | | 281 +---+ +---+ +---+ +---+ 283 Figure 1: Model for Real-Time Interactions 285 Figure 1 presents a general model for how users interact with 286 applications. Generally, users interact with a user interface 287 through a user device. A user device can be a telephone, or it can 288 be a PC with a web browser. Its role is to pass the user input from 289 the user, to the user interface. The user interface provides the 290 user with context in order to make decisions about what they want. 291 The user interacts with the device, causing information to be passed 292 from the device to the user interface. The user interface interprets 293 the information, and passes it as a user interface event to the 294 application. 
The application may be able to modify the user 295 interface based on this event. Whether or not this is possible 296 depends on the type of user interface. 298 User interfaces are fundamentally about rendering and interpretation. 299 Rendering refers to the way in which the user is provided context. 300 This can be through hyperlinks, images, sounds, videos, text, and so 301 on. Interpretation refers to the way in which the user interface 302 takes the "raw" data provided by the user, and returns the result to 303 the application as a meaningful event, abstracted from the 304 particulars of the user interface. As an example, consider a 305 pre-paid calling card application. The user interface worries about 306 details such as what prompt the user is provided, whether the voice 307 is male or female, and so on. It is concerned with recognizing the 308 speech that the user provides, in order to obtain the desired 309 information. In this case, the desired information is the calling 310 card number, the PIN code, and the destination number. The 311 application needs that data, and it doesn't matter to the application 312 whether it was collected using a male prompt or a female one. 314 User interfaces generally have real-time requirements towards the 315 user. That is, when a user interacts with the user interface, the 316 user interface needs to react quickly, and that change needs to be 317 propagated to the user right away. However, the interface between 318 the user interface and the application need not be that fast. Faster 319 is better, but the user interface itself can frequently compensate 320 for long latencies there. In the case of a pre-paid calling card 321 application, when the user is prompted to enter their PIN, the prompt 322 should generally stop immediately once the first digit of the PIN is 323 entered. This is referred to as barge-in. 
After the user-interface 324 collects the rest of the PIN, it can tell the user to "please wait 325 while processing". The PIN can then be gradually transmitted to the 326 application. In this example, the user interface has compensated for 327 a slow UI to application interface by asking the user to wait. 329 The separation between user interface and application is absolutely 330 fundamental to the entire framework provided in this document. Its 331 importance cannot be overstated. 333 With this basic model, we can begin to taxonomize the types of 334 systems that can be built. 336 3.1 Functional vs. Stimulus 338 The first way to taxonomize the system is to consider the interface 339 between the UI and the application. There are two fundamentally 340 different models for this interface. In a functional interface, the 341 user interface has detailed knowledge about the application, and is, 342 in fact, specific to the application. The interface between the two 343 components is through a functional protocol, capable of representing 344 the semantics which can be exposed through the user interface. 345 Because the user interface has knowledge of the application, it can 346 be optimally designed for that application. As a result, functional 347 user interfaces are almost always the most user friendly, the fastest 348 and the most responsive. However, in order to allow interoperability 349 between user devices and applications, the details of the functional 350 protocols need to be specified in standards. This slows down 351 innovation and limits the scope of applications that can be built. 353 An alternative is a stimulus interface. In a stimulus interface, the 354 user interface is generic; totally ignorant of the details of the 355 application. Indeed, the application may pass instructions to the 356 user interface describing how it should operate. 
The user interface 357 translates user input into "stimulus" - which are data understood 358 only by the application, and not by the user interface. Because they 359 are generic, and because they require communications with the 360 application in order to change the way in which they render 361 information to the user, stimulus user interfaces are usually slower, 362 less user friendly, and less responsive than a functional 363 counterpart. However, they allow for substantial innovation in 364 applications, since no standardization activity is needed to build a 365 new application, as long as it can interact with the user within the 366 confines of the user interface mechanism. The web is an example of a 367 stimulus user interface to applications. 369 In SIP systems, functional interfaces are provided by extending the 370 SIP protocol to provide the needed functionality. For example, the 371 SIP caller preferences specification [14] provides a functional 372 interface that allows a user to request applications to route the 373 call to specific types of user agents. Functional interfaces are 374 important, but are not the subject of this framework. The primary 375 goal of this framework is to address the role of stimulus interfaces 376 to SIP applications. 378 3.2 Real-Time vs. Non-Real Time 380 Application interaction systems can also be real-time or 381 non-real-time. Non-real-time interaction allows the user to enter 382 information about application operation asynchronously with its 383 invocation. Frequently, this is done through provisioning systems. 384 As an example, a user can set up the forwarding number for a 385 call-forward on no-answer application using a web page. Real-time 386 interaction requires the user to interact with the application at the 387 time of its invocation. 389 3.3 Client-Local vs. 
Client-Remote 391 Another axis in the taxonomization is whether the user interface is 392 co-resident with the user device (which we refer to as a client-local 393 user interface), or the user interface runs in a host separated from 394 the client (which we refer to as a client-remote user interface). In 395 a client-remote user interface, there exists some kind of protocol 396 between the client device and the UI that allows the client to 397 interact with the user interface over a network. 399 The most important way to separate the UI and the client device is 400 through media interaction. In media interaction, the interface 401 between the user and the user interface is through media - audio, 402 video, messaging, and so on. This is the classic mode of operation 403 for VoiceXML [4], where the user interface (also referred to as the 404 voice browser) runs on a platform in the network. Users communicate 405 with the voice browser through the telephone network (or using a SIP 406 session). The voice browser interacts with the application using 407 HTTP to convey the information collected from the user. 409 In the case of a client-local user interface, the user interface runs 410 co-located with the user device. The interface between them is 411 through the software that interprets the user's input and passes it 412 to the user interface. The classic example of this is the web. In 413 the web, the user interface is a web browser, and the interface is 414 defined by the HTML document that it's rendering. The user interacts 415 directly with the user interface running in the browser. The results 416 of that user interface are sent to the application (running on the 417 web server) using HTTP. 419 It is important to note that whether or not the user interface is 420 local or remote (in the case of media interaction) is not a property 421 of the modality of the interface, but rather a property of the 422 system. 
As an example, it is possible for a web-based user interface 423 to be provided with a client-remote user interface. In such a 424 scenario, video and application sharing media sessions can be used 425 between the user and the user interface. The user interface, still 426 guided by HTML, now runs "in the network", remote from the client. 427 Similarly, a VoiceXML document can be interpreted locally by a client 428 device, with no media streams at all. Indeed, the VoiceXML document 429 can be rendered using text, rather than media, with no impact on the 430 interface between the user interface and the application. 432 It is also important to note that systems can be hybrid. In a hybrid 433 user interface, some aspects of it (usually those associated with a 434 particular modality) run locally, and others run remotely. 436 3.4 Presentation Capable vs. Presentation Free 438 A user interface can be capable of presenting information to the user 439 (a presentation capable UI), or it can be capable only of collecting 440 user input (a presentation free UI). These are very different types 441 of user interfaces. A presentation capable UI can provide the user 442 with feedback after every input, providing the context for collecting 443 the next input. As a result, presentation capable user interfaces 444 require an update to the information provided to the user after each 445 input. The web is a classic example of this. After every input 446 (i.e., a click), the browser provides the input to the application 447 and fetches the next page to render. In a presentation free user 448 interface, this is not the case. Since the user is not provided with 449 feedback, these user interfaces tend to merely collect information as 450 it is entered, and pass it to the application. 452 Another difference is that a presentation-free user interface cannot 453 support the concept of a focus. 
As a result, if multiple 454 applications wish to gather input from the user, there is no way for 455 the user to select which application the input is destined for. The 456 input provided to applications through presentation-free user 457 interfaces is more of a broadcast or notification operation, as a 458 result. 460 4. Interaction Scenarios on Telephones 462 In this section, we apply the model of Section 3 to telephones. 464 In a traditional telephone, the user interface consists of a 12-key 465 keypad, a speaker, and a microphone. Indeed, from here forward, the 466 term "telephone" is used to represent any device that meets, at a 467 minimum, the characteristics described in the previous sentence. 468 Circuit-switched telephony applications are almost universally 469 client-remote user interfaces. In the Public Switched Telephone 470 Network (PSTN), there is usually a circuit interface between the user 471 and the user interface. The user input from the keypad is conveyed 472 using Dual-Tone Multi-Frequency (DTMF), and the microphone input as 473 Pulse Code Modulated (PCM) encoded voice. 475 In an IP-based system, there is more variability in how the system 476 can be instantiated. Both client-remote and client-local user 477 interfaces to a telephone can be provided. 479 In this framework, a PSTN gateway can be considered a User Device 480 Proxy. It is a proxy for the user because it can provide, to a user 481 interface on an IP network, input taken from a user on a circuit 482 switched telephone. The gateway may be able to run a client-local 483 user interface, just as an IP telephone might. 485 4.1 Client Remote 487 The most obvious instantiation is the "classic" circuit-switched 488 telephony model. In that model, the user interface runs remotely 489 from the client. The interface between the user and the user 490 interface is through media, set up by SIP and carried over the Real 491 Time Transport Protocol (RTP) [16]. 
The microphone input can be 492 carried using any suitable voice encoding algorithm. The keypad 493 input can be conveyed in one of two ways. The first is to convert 494 the keypad input to DTMF, and then convey that DTMF using a suitable 495 encoding algorithm for it (such as PCMU). An alternative, and 496 generally the preferred approach, is to transmit the keypad input 497 using RFC 2833 [17], which provides an encoding mechanism for 498 carrying keypad input within RTP. 500 In this classic model, the user interface would run on a server in 501 the IP network. It would perform speech recognition and DTMF 502 recognition to derive the user intent, feed it through the user 503 interface, and provide the result to an application. 505 4.2 Client Local 507 An alternative model is for the entire user interface to reside on 508 the telephone. The user interface can be a VoiceXML browser, running 509 speech recognition on the microphone input, and feeding the keypad 510 input directly into the script. As discussed above, the VoiceXML 511 script could be rendered using text instead of voice, if the 512 telephone had a textual display. 514 For simpler phones without a display, the user interface can be 515 described by a Keypad Markup Language request document [7]. As the 516 user enters digits in the keypad, they are passed to the user 517 interface, which generates user interface events that can be 518 transported to the application. 520 4.3 Flip-Flop 522 A middle-ground approach is to flip back and forth between a 523 client-local and client-remote user interface. Many voice 524 applications are of the type which listen to the media stream and 525 wait for some specific trigger that kicks off a more complex user 526 interaction. The long pound in a pre-paid calling card application 527 is one example. Another example is a conference recording 528 application, where the user can press a key at some point in the call 529 to begin recording. 
   When the key is pressed, the user hears a whisper to inform them that
   recording has started.

   The ideal way to support such an application is to install a
   client-local user interface component that waits for the trigger to
   kick off the real interaction. Once the trigger is received, the
   application connects the user to a client-remote user interface that
   can play announcements, collect more information, and so on.

   The benefit of flip-flopping between a client-local and client-remote
   user interface is cost. The client-local user interface will
   eliminate the need to send media streams into the network just to
   wait for the user to press the pound key on the keypad.

   The Keypad Markup Language (KPML) was designed to support exactly
   this kind of need [7]. It models the keypad on a phone, and allows
   an application to be informed when any sequence of keys has been
   pressed. However, KPML has no presentation component. Since user
   interfaces generally require a response to user input, the
   presentation will need to be done using a client-remote user
   interface that gets instantiated as a result of the trigger.

   It is tempting to use a hybrid model, where a prompt-and-collect
   application is implemented by using a client-remote user interface
   that plays the prompts, and a client-local user interface, described
   by KPML, that collects digits. However, this only complicates the
   application. Firstly, the keypad input will be sent to both the
   media stream and the KPML user interface. This requires the
   application to sort out which user inputs are duplicates, a process
   that is very complicated. Secondly, the primary benefit of KPML is
   to avoid having a media stream towards a user interface. However,
   there is already a media stream for the prompting, so there are no
   real savings.

5.
Framework Overview

   In this framework, we use the term "SIP application" to refer to a
   broad set of functionality. A SIP application is a program running
   on a SIP-based element (such as a proxy or user agent) that provides
   some value-added function to a user or system administrator. SIP
   applications can execute on behalf of a caller, a called party, or a
   multitude of users at once.

   Each application has a number of instances that are executing at any
   given time. An instance represents a single execution path for an
   application. Each instance has a well defined lifecycle. It is
   established as a result of some event. That event can be a SIP
   event, such as the reception of a SIP INVITE request, or it can be a
   non-SIP event, such as a web form post or even a timer. Application
   instances also have a specific end time. Some instances have a
   lifetime that is coupled with a SIP transaction or dialog. For
   example, a proxy application might begin when an INVITE arrives, and
   terminate when the call is answered. Other applications have a
   lifetime that spans multiple dialogs or transactions. For example, a
   conferencing application instance may exist so long as there are any
   dialogs connected to it. When the last dialog terminates, the
   application instance terminates. Other applications have a lifetime
   that is completely decoupled from SIP events.

   It is fundamental to the framework described here that multiple
   application instances may interact with a user during a single SIP
   transaction or dialog. Each instance may be for the same
   application, or different applications. Each of the applications may
   be completely independent, in that they may be owned by different
   providers, and may not be aware of each other's existence.
   Similarly, there may be application instances interacting with the
   caller, and instances interacting with the callee, both within the
   same transaction or dialog.

   The first step in the interaction with the user is to instantiate one
   or more user interface components for the application instance. A
   user interface component is a single piece of the user interface that
   is defined by a logical flow that is not synchronously coupled with
   any other component. In other words, each component runs more or
   less independently.

   A user interface component can be instantiated in one of the user
   agents in a dialog (for a client-local user interface), or within a
   network element (for a client-remote user interface). If a
   client-local user interface is to be used, the application needs to
   determine whether or not the user agent is capable of supporting a
   client-local user interface, and in what format. In this framework,
   all client-local user interface components are described by a markup
   language. A markup language describes a logical flow of presentation
   of information to the user, collection of information from the user,
   and transmission of that information to an application. Examples of
   markup languages include HTML, WML, VoiceXML, and the Keypad Markup
   Language (KPML) [7].

   Unlike an application instance, which has very flexible lifetimes, a
   user interface component has a very fixed lifetime. A user interface
   component is always associated with a dialog. The user interface
   component can be created at any point after the dialog (or early
   dialog) is created. However, the user interface component terminates
   when the dialog terminates. The user interface component can be
   terminated earlier by the user agent, and possibly by the
   application, but its lifetime never exceeds that of its associated
   dialog.

   There are two ways to create a client local interface component. For
   interface components that are presentation capable, the application
   sends a REFER [6] request to the user agent. The Refer-To header
   field contains an HTTP URI that points to the markup for the user
   interface. For interface components that are presentation free (such
   as those defined by KPML), the application sends a SUBSCRIBE request
   to the user agent. The body of the SUBSCRIBE request contains a
   filter, which, in this case, is the markup that defines when
   information is to be sent to the application in a NOTIFY.

   If a user interface component is to be instantiated in the network,
   there is no need to determine the capabilities of the device on which
   the user interface is instantiated. Presumably, it is on a device on
   which the application knows a UI can be created. However, the
   application does need to connect the user device to the user
   interface. This will require manipulation of media streams in order
   to establish that connection.

   The interface between the user interface component and the
   application depends on the type of user interface. For presentation
   capable user interfaces, such as those described by HTML and
   VoiceXML, HTTP form POST operations are used. For presentation free
   user interfaces, a SIP NOTIFY is used. The differing needs and
   capabilities of these two user interfaces, as described in Section
   3.4, are what drive the different choices for the interactions.
   Since presentation capable user interfaces require an update to the
   presentation every time user data is entered, they are a good match
   for HTTP. Since presentation free user interfaces merely transmit
   user input to the application, a NOTIFY is more appropriate.

   Indeed, for presentation free user interfaces, there are two
   different modalities of operation. The first is called "one shot".
   In the one-shot role, the markup waits for a user to enter some
   information, and when they do, reports this event to the application.
   The application then does something, and the markup is no longer
   used. In the other modality, called "monitor", the markup stays
   permanently resident, and reports information back to an application
   until termination of the associated dialog.

6.  Deployment Topologies

   This section presents some of the network topologies in which this
   framework can be instantiated.

6.1  Third Party Application

                          +-------------+
                      /---| Application |
                     /    +-------------+
                    /
           SUB/    /  REFER/
            NOT   /   HTTP
                 /
      +--------+    SIP (INVITE)    +-----+
      |   UI   A--------------------X     |
      |........|                    | SIP |
      |  User  |        RTP         | UA  |
      | Device B--------------------Y     |
      +--------+                    +-----+

                  Figure 2: Third Party Topology

   In this topology, the application that is interested in interacting
   with the users exists outside of the SIP dialog between the user
   agents. In that case, the application learns about the initiation
   and termination of the dialog, along with the dialog identifiers,
   through some out of band means. One such possibility is the dialog
   event package [15]. Dialog information is only revealed to trusted
   parties, so the application would need to be trusted by one of the
   users in order to obtain this information.

   At any point during the dialog, the application can instantiate user
   interface components on the user device of the caller or callee. It
   can do this either using SUBSCRIBE or REFER, depending on the type of
   user interface (presentation capable or presentation free).

6.2  Co-Resident Application

      +--------+    SIP (INVITE)    +-----+
      |  User  A--------------------X SIP |
      | Device |        RTP         | UA  |
      |........B--------------------Y     |
      |        |      SUB/NOT       | App)|
      |   UI   A'-------------------X'    |
      +--------+     REFER/HTTP     +-----+

                  Figure 3: Co-Resident Topology

   In this deployment topology, the application is co-resident with one
   of the user agents (the one on the right in the picture above). This
   application can install client-local user interface components on the
   other user agent, which is acting as the user device. These
   components can be installed using either SUBSCRIBE, for presentation
   free user interfaces, or REFER, for presentation capable ones. This
   situation typically arises when the application wishes to install UI
   components on a presentation capable user interface. If the only
   user input is via keypad input, the framework is not needed per se,
   because the UA/application will receive the input via RFC 2833 in the
   RTP stream.

   If the application resides in the called party, it is called a
   terminating application. If it resides in the calling party, it is
   called an originating application.

   This kind of topology is common in protocol converter and gateway
   applications.

6.3  Third Party Application and User Device Proxy

                                            +-------------+
                                        /---| Application |
                                       /    +-------------+
                                      /
                             SUB/    /  REFER/
                              NOT   /   HTTP
                                   /
   +-----+        SIP         +---M----+        SIP         +-----+
   |     V--------------------C        A--------------------X     |
   | SIP |                    |   UI   |                    | SIP |
   | UAa |        RTP         |        |        RTP         | UAb |
   |     W--------------------D        B--------------------Y     |
   +-----+                    +--------+                    +-----+
     User                        User
    Device                      Device
                                Proxy

                Figure 4: User Device Proxy Topology

   In this deployment topology, there is a third party application as in
   Section 6.1.
   However, instead of installing a user interface component on the end
   user device, the component is installed in an intermediate device,
   known as a User Device Proxy. From the perspective of the actual
   user device (on the left), the User Device Proxy is a client remote
   user interface. As such, media, typically transported using RTP
   (including RFC 2833 for carrying user input), is sent from the user
   device to the client remote user interface on the User Device Proxy.
   As far as the application is concerned, it is installing what it
   thinks is a client local user interface on the user device, but it
   happens to be on a user device proxy which looks like the user device
   to the application.

   The user device proxy will need to terminate and re-originate both
   signaling (SIP) and media traffic towards the actual peer in the
   conversation. The User Device Proxy is a media relay in the
   terminology of RFC 3550 [16]. The User Device Proxy will need to
   monitor the media streams associated with each dialog, in order to
   convert user input received in the media stream to events reported to
   the user interface. This can pose a challenge in multi-media
   systems, where it may be unclear on which media stream the user input
   is being sent. As discussed in RFC 3264 [18], if a user agent has a
   single media source and is supporting multiple streams, it is
   supposed to send that source to all streams. In cases where there
   are multiple sources, the mapping is a matter of local policy. In
   the absence of a way to explicitly identify or request which sources
   map to which streams, the user device proxy will need to do the best
   job it can. This specification RECOMMENDS that the User Device Proxy
   monitor the first stream (defined in terms of ordering of media
   sessions within a session description).
   As such, user agents SHOULD send their user input on the first
   stream, absent a policy to direct it otherwise.

6.4  Proxy Application

                          +----------+
             SUB/NOT      |   App    |     SUB/NOT
        +---------------->|          |<-----------------+
        |   REFER/HTTP    |..........|    REFER/HTTP    |
        |                 |   SIP    |                  |
        |                 |  Proxy   |                  |
        |                 +----------+                  |
        V                  ^        |                   V
   +----------+            |        |             +----------+
   |    UI    |   INVITE   |        |   INVITE    |    UI    |
   |          |------------+        +------------>|          |
   |..........|                                   |..........|
   |   SIP    |...................................|   SIP    |
   |    UA    |                                   |    UA    |
   +----------+                RTP                +----------+
   User Device                                    User Device

                Figure 5: Proxy Application Topology

   In this topology, the application is co-resident with a transaction
   stateful, record-routing proxy server on the call path between two
   user devices. The application uses SUBSCRIBE or REFER to install
   user interface components on one or both user devices.

   This topology is common in routing applications, such as a
   web-assisted call routing application.

7.  Application Behavior

   The behavior of an application within this framework depends on
   whether it seeks to use a client-local or client-remote user
   interface.

7.1  Client Local Interfaces

   One key component of this framework is support for client local user
   interfaces.

7.1.1  Discovering Capabilities

   A client local user interface can only be instantiated on a user
   agent if the user agent supports that type of user interface
   component. Support for client local user interface components is
   declared by both the UAC and the UAS in the Accept, Allow, Contact,
   and Allow-Events header fields of their dialog-initiating requests
   and responses.
   If the Allow header field indicates support for the SIP SUBSCRIBE
   method, and the Allow-Events header field indicates support for the
   kpml package [7], and the Supported header field indicates that its
   Contact URI is a GRUU [8], it means that the UA can instantiate
   presentation free user interface components. In this case, the
   application MAY push presentation free user interface components
   according to the rules of Section 7.1.2. The specific markup
   languages that can be supported are indicated in the Accept header
   field.

   If the Allow header field indicates support for the SIP REFER method,
   the Supported header field indicates support for the "refer-context"
   extension described below, and the Contact header field contains UA
   capabilities [5] that indicate support for the HTTP URI scheme, it
   means that the UA supports presentation capable user interface
   components. In this case, the application MAY push presentation
   capable user interface components to the client according to the
   rules of Section 7.1.2. The specific markups that are supported are
   indicated in the Accept header field.

   A third party application that is not present on the call path will
   not be privy to these headers in the dialog requests that pass by.
   As such, it will need to obtain this capability information in other
   ways. One way is through the registration event package [19], which
   can contain user agent capability information provided in REGISTER
   requests [5].

7.1.2  Pushing an Initial Interface Component

   Generally, we anticipate that interface components will need to be
   created at various points in a SIP session. Clearly, they will need
   to be pushed during session setup, or after the session is
   established. A user interface component is always associated with a
   specific dialog, however.
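
   Before pushing anything, the application has to apply the capability
   rules of Section 7.1.1. The following is a minimal, non-normative
   Python sketch of those checks. It assumes the header fields have
   already been parsed into a dictionary of comma-separated values and
   that the "schemes" Contact parameter has been extracted into a set;
   the function names and that data layout are illustrative assumptions,
   not part of this framework.

```python
def tokens(value: str) -> set:
    """Split a comma-separated header value into a set of trimmed tokens."""
    return {t.strip() for t in value.split(",") if t.strip()}


def supports_presentation_free(headers: dict) -> bool:
    """SUBSCRIBE in Allow, kpml in Allow-Events, and gruu in Supported."""
    return (
        "SUBSCRIBE" in tokens(headers.get("Allow", ""))
        and "kpml" in tokens(headers.get("Allow-Events", "").lower())
        and "gruu" in tokens(headers.get("Supported", "").lower())
    )


def supports_presentation_capable(headers: dict, contact_schemes: set) -> bool:
    """REFER in Allow, refer-context in Supported, http among Contact schemes."""
    return (
        "REFER" in tokens(headers.get("Allow", ""))
        and "refer-context" in tokens(headers.get("Supported", "").lower())
        and "http" in {s.lower() for s in contact_schemes}
    )


def accepted_markups(headers: dict) -> set:
    """MIME types listed in Accept, with parameters such as q= dropped."""
    return {t.split(";")[0].strip().lower()
            for t in tokens(headers.get("Accept", ""))}
```

   An application would typically call one of the two predicates first
   and then consult accepted_markups() to pick a markup the UA lists in
   its Accept header field.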

   An application MUST NOT attempt to push a user interface component to
   a user agent until it has determined that the user agent has the
   necessary capabilities and a dialog has been created. In the case of
   a UAC, this means that an application MUST NOT push a user interface
   component for an INVITE initiated dialog until the application has
   seen a request confirming the receipt of a dialog-creating response.
   This could be an ACK for a 200 OK, or a PRACK for a provisional
   response [2]. For SUBSCRIBE initiated dialogs, it MUST NOT push a
   user interface component until the application has seen a 200 OK to
   the NOTIFY request. For a user interface component on a UAS, the
   application MUST NOT push a user interface component for an INVITE
   initiated dialog until it has seen a dialog-creating response from
   the UAS. For a SUBSCRIBE initiated dialog, it MUST NOT push a user
   interface component until it has seen a NOTIFY request from the
   notifier.

   To create a presentation capable UI component on the UA, the
   application sends a REFER request to the UA. This REFER MUST be sent
   to the Globally Routable UA URI (GRUU) [8] advertised by that UA in
   the Contact header field of the dialog initiating request or response
   sent by that UA. Note that this REFER request creates a separate
   dialog between the application and the UA. The Refer-To header field
   of the REFER request MUST contain an HTTP URI that references the
   markup document to be fetched.

   Furthermore, it is essential for the REFER request to be correlated
   with the dialog to which the user interface component will be
   associated. This is necessary for authorization and for terminating
   the user interface components when the dialog terminates. To provide
   this context, this specification defines the "context" header field
   parameter as an extension to the Refer-To header field.
   The grammar for this header field parameter is:

   refer-to-ctxt = "context" EQUAL DQUOTE local-tag "," remote-tag
                   "," callid DQUOTE ; callid defined in RFC 3261
                   ;; NOTE: any DQUOTEs inside callid MUST be escaped
                   ;; using quoted pair
   local-tag     = token
   remote-tag    = token

   Refer-To = ("Refer-To" / "r") HCOLON ( name-addr / addr-spec )
              *(SEMI (generic-param / refer-to-ctxt))

   The application MUST include the context header field parameter in
   the REFER request. The remote-tag MUST be set to the remote tag of
   the dialog as seen by the user device. The local-tag MUST be set to
   the local tag of the dialog as seen by the user device. The callid
   MUST be set to the Call-ID of the dialog as seen by the device.
   Since the callid grammar allows it to contain double quotes, any such
   double quotes MUST be represented with a quoted pair.

   Since the "context" parameter in the Refer-To header field must be
   understood by the UA to process the request, this specification
   defines a new SIP option tag, "refer-context". A REFER request
   generated by an application MUST include a Require header field with
   this option tag value. Fortunately, the application will know ahead
   of time whether this extension is supported, as discussed in Section
   7.1.1.

   To create a presentation free user interface component, the
   application sends a SUBSCRIBE request to the UA. The SUBSCRIBE MUST
   be sent to the GRUU advertised by the UA. This SUBSCRIBE request
   creates a separate dialog. The SUBSCRIBE request MUST use the KPML
   [7] event package. The Event header field MUST contain parameters
   which identify the particular dialog that the interface component is
   being instantiated against. The body of the SUBSCRIBE request
   contains the markup document that defines the conditions under which
   the application wishes to be notified of user input.
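
   The construction of the Refer-To "context" parameter can be
   illustrated with a short, non-normative Python sketch. The function
   name and the example URI used with it are hypothetical. The grammar
   only requires double quotes inside the callid to be escaped; the
   sketch additionally escapes backslashes, which is an assumption made
   here so that the quoted pairs can be decoded unambiguously.

```python
def build_refer_to(markup_uri: str, local_tag: str,
                   remote_tag: str, call_id: str) -> str:
    """Return a Refer-To header line carrying the "context" parameter.

    The tags and Call-ID are those of the existing dialog as seen by
    the user device that will receive the REFER.
    """
    # Escape backslashes first (an assumption, see the text above),
    # then double quotes, each as a quoted pair.
    escaped = call_id.replace("\\", "\\\\").replace('"', '\\"')
    # The name-addr form (angle brackets) keeps "context" a header
    # field parameter rather than a parameter of the URI itself.
    return (f'Refer-To: <{markup_uri}>'
            f';context="{local_tag},{remote_tag},{escaped}"')


def required_option_tag() -> str:
    """Require header line that accompanies the REFER request."""
    return "Require: refer-context"
```

   For instance, build_refer_to("http://app.example.com/ui.html",
   "abc", "def", "1234@host.example.com") yields a header field whose
   context value is "abc,def,1234@host.example.com".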

   In both cases, the REFER or SUBSCRIBE request SHOULD include a
   display name in the From header field which identifies the name of
   the application. For example, a prepaid calling card might include a
   From header field which looks like:

   From: "Prepaid Calling Card"

   Any of the SIP identity assertion mechanisms that have been defined,
   such as [10] and [12], are applicable to these requests as well.

7.1.3  Updating an Interface Component

   Once a user interface component has been created on a client, it can
   be updated. The means for updating it depends on the type of UI
   component.

   Presentation capable UI components are updated using techniques
   already in place for those markups. In particular, user input will
   cause an HTTP POST operation to push the user input to the
   application. The result of the POST operation is a new markup that
   the UI is supposed to use. This allows the UI to be updated in
   response to user action. Some markups, such as HTML, provide the
   ability to force a refresh after a certain period of time, so that
   the UI can be updated without user input. Those mechanisms can be
   used here as well. However, there is no support for an asynchronous
   push of an updated UI component from the application to the user
   agent. A new REFER request to the same GRUU would create a new UI
   component rather than updating any components already in place.

   For presentation free UI, the story is different. The application
   MAY update the filter at any time by generating a SUBSCRIBE refresh
   with the new filter. The UA will immediately begin using this new
   filter.

7.1.4  Terminating an Interface Component

   User interface components have a well defined lifetime. They are
   created when the component is first pushed to the client. User
   interface components are always associated with the SIP dialog on
   which they were pushed.
   As such, their lifetime is bound by the lifetime of the dialog. When
   the dialog ends, so does the interface component.

   However, there are some cases where the application would like to
   terminate the user interface component before its natural termination
   point. For presentation capable user interfaces, this is not
   possible. For presentation free user interfaces, the application MAY
   terminate the component by sending a SUBSCRIBE with Expires equal to
   zero. This terminates the subscription, which removes the UI
   component.

   A client can remove a UI component at any time. For presentation
   capable UI, this is analogous to the user dismissing the web form
   window. There is no mechanism provided for reporting this kind of
   event to the application. The application MUST be prepared to time
   out, and never receive input from a user. The duration of this
   timeout is application dependent. For presentation free user
   interfaces, the UA can explicitly terminate the subscription. This
   will result in the generation of a NOTIFY with a Subscription-State
   header field equal to "terminated".

7.2  Client Remote Interfaces

   As an alternative to, or in conjunction with, client local user
   interfaces, an application can make use of client remote user
   interfaces. These user interfaces can execute co-resident with the
   application itself (in which case no standardized interfaces between
   the UI and the application need to be used), or they can run
   separately. This framework assumes that the user interface runs on a
   host that has a sufficient trust relationship with the application.
   As such, the means for instantiating the user interface is not
   considered here.

   The primary issue is to connect the user device to the remote user
   interface. Doing so requires the manipulation of media streams
   between the client and the user interface.
   Such manipulation can only be done by user agents. There are two
   types of user agent applications within this framework:
   originating/terminating applications, and intermediary applications.

7.2.1  Originating and Terminating Applications

   Originating and terminating applications are applications which are
   themselves the originator or the final recipient of a SIP invitation.
   They are "pure" user agent applications, not back-to-back user
   agents. The classic example of such an application is an interactive
   voice response (IVR) application, which is typically a terminating
   application. It is a terminating application because the user
   explicitly calls it; i.e., it is the actual called party. An example
   of an originating application is a wakeup call application, which
   calls a user at a specified time in order to wake them up.

   Because originating and terminating applications are a natural
   termination point of the dialog, manipulation of the media session by
   the application is trivial. Traditional SIP techniques for adding
   and removing media streams, modifying codecs, and changing the
   address of the recipient of the media streams, can be applied.
   Similarly, the application can directly authenticate itself to the
   user through S/MIME, since it is the peer UA in the dialog.

7.2.2  Intermediary Applications

   Intermediary applications are, at the same time, more common than
   originating/terminating applications, and more complex. Intermediary
   applications are applications that are neither the actual caller nor
   the called party. Rather, they represent a "third party" that wishes
   to interact with the user. The classic example is the ubiquitous
   pre-paid calling card application.

   In order for the intermediary application to add a client remote user
   interface, it needs to manipulate the media streams of the user agent
   to terminate on that user interface. This also introduces a
   fundamental feature interaction issue. Since the intermediary
   application is not an actual participant in the call, the user will
   need to interact with both the intermediary application and its peer
   in the dialog. Doing both at the same time is complicated, and is
   discussed in more detail in Section 9.

8.  User Agent Behavior

8.1  Advertising Capabilities

   In order to participate in applications that make use of stimulus
   interfaces, a user agent needs to advertise its interaction
   capabilities.

   If a user agent supports presentation capable user interfaces, it
   MUST support the REFER method, along with the "context" extension
   defined here. It MUST include, in all dialog initiating requests and
   responses, an Allow header field that includes the REFER method and
   a Supported header field that includes the value "refer-context".
   Furthermore, the UA MUST support the SIP user agent capabilities
   specification [5]. The UA MUST be capable of being REFER'd to an
   HTTP URI. It MUST include, in the Contact header field of its dialog
   initiating requests and responses, a "schemes" Contact header field
   parameter that includes the http URI scheme. The UA MUST include, in
   all dialog initiating requests and responses, an Accept header field
   listing all of those markups supported by the UA. It is RECOMMENDED
   that all user agents that support presentation capable user
   interfaces support HTML.

   If a user agent supports presentation free user interfaces, it MUST
   support the SUBSCRIBE [3] method. It MUST support the KPML [7] event
   package.
   It MUST include, in all dialog initiating requests and responses, an
   Allow header field that includes the SUBSCRIBE method. It MUST
   include, in all dialog initiating requests and responses, an
   Allow-Events header field that lists the KPML event package. The UA
   MUST include, in all dialog initiating requests and responses, an
   Accept header field listing those event filters it supports. At a
   minimum, a UA MUST support the "application/kpml-request+xml" MIME
   type.

   For either presentation free or presentation capable user interfaces,
   the user agent MUST support the GRUU [8] specification. The Contact
   header field in all dialog initiating requests and responses MUST
   contain a GRUU. The UA MUST include a Supported header field which
   contains the "gruu" option tag.

   Because these headers are examined by proxies which may be executing
   applications, a UA that wishes to support client local user
   interfaces should not encrypt them.

8.2  Receiving User Interface Components

   Once the UA has created a dialog (in either the early or confirmed
   states), it MUST be prepared to receive a SUBSCRIBE or REFER request
   against its GRUU. If the UA receives such a request prior to the
   establishment of a dialog, the UA MUST reject the request.

   A user agent SHOULD attempt to authenticate the sender of the
   request. The sender will generally be an application, and therefore
   the user agent is unlikely to ever have a shared secret with it,
   making digest authentication useless. However, authenticated
   identities can be obtained through other means, such as [10].

   A user agent MAY have pre-defined authorization policies which permit
   applications which have authenticated themselves with a particular
   identity, to push user interface components. If such a set of
   policies is present, it is checked first. If the application is
   authorized, processing proceeds.

   If the application has authenticated itself, but it is not explicitly
   authorized or blocked, this specification RECOMMENDS that the
   application be automatically authorized if it can prove that it was
   either on the call path, or is trusted by one of the elements on the
   call path. An application proves this to the user agent by
   presenting it with the dialog identifiers in the SUBSCRIBE or REFER
   request. In the case of SUBSCRIBE, those identifiers are present in
   the Event header field [7]. In the case of REFER, those identifiers
   are present in the "context" parameter of the Refer-To header field.

   Because the dialog identifiers serve as a tool for authorization,
   a user agent compliant to this framework SHOULD use dialog
   identifiers that are cryptographically random, with at least 128 bits
   of randomness. It is recommended that this randomness be split
   between the Call-ID and From header field tag in the case of a UAC.

   Furthermore, to ensure that only applications resident in or trusted
   by on-path elements can instantiate a user interface component, a
   user agent compliant to this specification SHOULD use the sips URI
   scheme for all dialogs it initiates. This will guarantee secure
   links between all of the elements on the signaling path.

   If the dialog was not established with a sips URI, or the user agent
   did not choose cryptographically random dialog identifiers, then the
   application MUST NOT automatically be authorized, even if it
   presented valid dialog identifiers. A user agent MAY apply any other
   policies in addition to (but not instead of) the ones specified here
   in order to authorize the creation of the user interface component.
   One such mechanism would be to prompt the user, informing them of the
   identity of the application and the dialog it is associated with.
   If an authorization policy requires user interaction, the user agent
   SHOULD respond to the SUBSCRIBE or REFER request with a 202 response.
   In the case of SUBSCRIBE, if authorization is not granted, the user
   agent SHOULD generate a NOTIFY to terminate the subscription. In the
   case of REFER, the user agent MUST NOT act upon the URI in the
   Refer-To header field until user authorization has been obtained.

   If an application does not present a valid dialog identifier in its
   REFER or SUBSCRIBE request, the user agent MUST reject the request
   with a 403 response.

   If a REFER request to an HTTP URI is authorized, the UA executes the
   URI and fetches the content to be rendered to the user. This
   instantiates a presentation capable user interface component. If a
   SUBSCRIBE is authorized, a presentation free user interface
   component is instantiated.

8.3 Mapping User Input to User Interface Components

   Once the user interface components are instantiated, the user agent
   must direct user input to the appropriate component. In the case of
   presentation capable user interfaces, this process is known as focus
   selection. It is done by means that are specific to the user
   interface on the device. In the case of a PC, for example, the
   window manager would allow the user to select the appropriate user
   interface component that their input is directed to.

   For presentation free user interfaces, the situation is more
   complicated. In some cases, the device may support a mechanism that
   allows the user to select a "line", and thus the associated dialog.
   Any user input on the keypad while this line is selected is fed to
   the user interface components associated with that dialog.

   Otherwise, for client local user interfaces, the user input is
   assumed to be associated with all user interface components.
   For client remote user interfaces, the user device converts the user
   input to media, typically conveyed using RFC 2833, and sends this to
   the client remote user interface. This user interface then needs to
   map user input from potentially many media streams into user
   interface events. The process for doing this is described in Section
   6.3.

8.4 Receiving Updates to User Interface Components

   For presentation capable user interfaces, updates to the user
   interface occur in ways specific to that user interface component.
   In the case of HTML, for example, the document can tell the client to
   fetch a new document periodically. However, this framework does not
   provide any additional machinery to asynchronously push a new user
   interface component to the client.

   For presentation free user interfaces, an application can push an
   update to a component by sending a SUBSCRIBE refresh with a new
   filter. The user agent will process these according to the rules of
   the event package.

8.5 Terminating a User Interface Component

   Termination of a presentation capable user interface component is a
   trivial procedure. The user agent merely dismisses the window (or
   equivalent). The fact that the component is dismissed is not
   communicated to the application. As such, it is purely a local
   matter.

   In the case of a presentation free user interface, the user might
   wish to cease interacting with the application. However, most
   presentation free user interfaces will not have a way for the user to
   signal this through the device. If such a mechanism does exist, the
   UA SHOULD generate a NOTIFY request with a Subscription-State equal
   to "terminated" and a reason of "rejected". This tells the
   application that the component has been removed, and that it should
   not attempt to re-subscribe.

9. Inter-Application Feature Interaction

   The inter-application feature interaction problem is inherent to
   stimulus signaling. Whenever there are multiple applications, there
   are multiple user interfaces. The system has to determine to which
   user interface any particular input is destined. That question is
   the essence of the inter-application feature interaction problem.

   Inter-application feature interaction is not an easy problem to
   resolve. For now, we consider separately the issues for client-local
   and client-remote user interface components.

9.1 Client Local UI

   When the user interface itself resides locally on the client device,
   the feature interaction problem is actually much simpler. The end
   device knows explicitly about each application, and therefore can
   present the user with each one separately. When the user provides
   input, the client device can determine to which user interface the
   input is destined. The user interface to which input is destined is
   referred to as the application in focus, and the means by which the
   focused application is selected is called focus determination.

   Generally speaking, focus determination is purely a local operation.
   In the PC universe, focus determination is provided by window
   managers. An application does not know about focus; it merely
   receives the user input that has been targeted to it when it is in
   focus. This basic concept applies to SIP-based applications as well.

   Focus determination will frequently be trivial, depending on the user
   interface type. Consider a user that makes a call from a PC. The
   call passes through a pre-paid calling card application, and a call
   recording application. Both of these wish to interact with the user.
   Both push an HTML-based user interface to the user. On the PC, each
   user interface would appear as a separate window.
   The user interacts
   with the call recording application by selecting its window, and with
   the pre-paid calling card application by selecting its window. Focus
   determination is literally provided by the PC window manager. It is
   clear to which application the user input is targeted.

   As another example, consider the same two applications, but on a
   "smart phone" that has a set of buttons, and next to each button, an
   LCD display that can provide the user with an option. This user
   interface can be represented using the Wireless Markup Language
   (WML).

   The phone would allocate some number of buttons to each application.
   The pre-paid calling card would get one button for its "hangup"
   command, and the recording application would get one for its
   "start/stop" command. The user can easily determine which
   application to interact with by pressing the appropriate button.
   Pressing a button determines focus and provides user input, both at
   the same time.

   Unfortunately, not all devices will have these advanced displays. A
   PSTN gateway, or a basic IP telephone, may only have a 12-key keypad.
   The user interfaces for these devices are provided through the Keypad
   Markup Language (KPML). Considering once again the feature
   interaction case above, the pre-paid calling card application and the
   call recording application would both pass a KPML document to the
   device. When the user presses a button on the keypad, to which
   document does the input apply? The user interface does not allow the
   user to select. A user interface where the user cannot provide focus
   is called a focusless user interface. This is quite a hard problem
   to solve.
   This framework does not make any explicit normative
   recommendation, but concludes that the best option is to send the
   input to both user interfaces unless the markup in one interface has
   indicated that it should be suppressed from others. This is a
   sensible choice by analogy: it is exactly what the existing circuit
   switched telephone network will do. It is an explicit non-goal to
   provide a better mechanism for feature interaction resolution than
   the PSTN on devices which have the same user interface as they do on
   the PSTN. Devices with better displays, such as PCs or screen
   phones, can benefit from the capabilities of this framework, allowing
   the user to determine which application they are interacting with.

   Indeed, when a user provides input on a focusless device, the input
   must be passed to all client local user interfaces, and to all client
   remote user interfaces, unless the markup tells the UI to suppress
   the media. In the case of KPML, key events are passed to remote user
   interfaces by encoding them in RFC 2833 [17]. Of course, since a
   client cannot determine if a media stream terminates in a remote user
   interface or not, these key events are passed in all audio media
   streams unless the KPML request document is used to suppress them.

9.2 Client-Remote UI

   When the user interfaces run remotely, the determination of focus can
   be much, much harder. There are many architectures that can be
   deployed to handle the interaction. None are ideal. However, all
   are beyond the scope of this specification.

10. Intra Application Feature Interaction

   An application can instantiate a multiplicity of user interface
   components. For example, a single application can instantiate two
   separate HTML components and one WML component. Furthermore, an
   application can instantiate both client local and client remote user
   interfaces.

   The feature interaction issues between these components within the
   same application are less severe. If an application has multiple
   client user interface components, their interaction is resolved
   identically to the inter-application case, through focus
   determination. However, the problems in focusless user interfaces
   (such as a keypad) generally won't exist, since the application can
   generate user interfaces which do not overlap in their usage of an
   input.

   The real issue is that the optimal user experience frequently
   requires some kind of coupling between the differing user interface
   components. This is a classic problem in multi-modal user
   interfaces, such as those described by Speech Application Language
   Tags (SALT). As an example, consider a user interface where a user
   can either press a labeled button to make a selection, or listen to a
   prompt, and speak the desired selection. Ideally, when the user
   presses the button, the prompt should cease immediately, since both
   of them were targeted at collecting the same information in parallel.
   Such interactions are best handled by markups which natively support
   them, such as SALT, and thus require no explicit support from this
   framework.

11. Example Call Flow

   This section shows the operation of a call recording application.
   This application allows a user to record the media in their call by
   clicking on a button in a web form. The application uses a
   presentation capable user interface component that is pushed to the
   caller.

   A                  Recording App                  B
   |(1) INVITE              |                        |
   |----------------------->|                        |
   |                        |(2) INVITE              |
   |                        |----------------------->|
   |                        |(3) 200 OK              |
   |                        |<-----------------------|
   |(4) 200 OK              |                        |
   |<-----------------------|                        |
   |(5) ACK                 |                        |
   |----------------------->|                        |
   |                        |(6) ACK                 |
   |                        |----------------------->|
   |(7) REFER               |                        |
   |<-----------------------|                        |
   |(8) 200 OK              |                        |
   |----------------------->|                        |
   |(9) NOTIFY              |                        |
   |----------------------->|                        |
   |(10) 200 OK             |                        |
   |<-----------------------|                        |
   |(11) HTTP GET           |                        |
   |----------------------->|                        |
   |(12) 200 OK             |                        |
   |<-----------------------|                        |
   |(13) NOTIFY             |                        |
   |----------------------->|                        |
   |(14) 200 OK             |                        |
   |<-----------------------|                        |
   |(15) HTTP POST          |                        |
   |----------------------->|                        |
   |(16) 200 OK             |                        |
   |<-----------------------|                        |

                               Figure 8

   First, the caller, A, sends an INVITE to set up a call (message 1).
   Since the caller supports the framework, and can handle presentation
   capable user interface components, it includes the Supported header
   field indicating that the GRUU extension and the REFER context
   extension are understood, Allow indicating that REFER is understood,
   and a Contact header field that includes the "schemes" header field
   parameter.

   INVITE sips:B@example.com SIP/2.0
   Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8
   From: Caller ;tag=kkaz-
   To: Callee
   Call-ID: faif9ahhs9dd8==-sd98ajzz@host.example.com
   CSeq: 1 INVITE
   Max-Forwards: 70
   Supported: gruu, refer-context
   Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER
   Contact: ;schemes="http,sip,sips"
   Content-Length: ...
   Content-Type: application/sdp

   --SDP not shown--

   The proxy acts as a recording server, and forwards the INVITE to the
   called party (message 2):

   INVITE sips:B@pc.example.com SIP/2.0
   Record-Route:
   Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK97sh
   Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8
   From: Caller ;tag=kkaz-
   To: Callee
   Call-ID: faif9ahhs9dd8==-sd98ajzz@host.example.com
   CSeq: 1 INVITE
   Max-Forwards: 69
   Supported: gruu, refer-context
   Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER
   Contact: ;schemes="http,sip,sips"
   Content-Length: ...
   Content-Type: application/sdp

   --SDP not shown--

   B accepts the call with a 200 OK (message 3). It does not support
   the framework, and so the various header fields are not present.

   SIP/2.0 200 OK
   Record-Route:
   Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK97sh
   Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8
   From: Caller ;tag=kkaz-
   To: Callee ;tag=7777
   Call-ID: faif9ahhs9dd8==-sd98ajzz@host.example.com
   CSeq: 1 INVITE
   Contact:
   Content-Length: ...
   Content-Type: application/sdp

   --SDP not shown--

   This 200 OK is passed back to the caller (message 4):

   SIP/2.0 200 OK
   Record-Route:
   Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz8
   From: Caller ;tag=kkaz-
   To: Callee ;tag=7777
   Call-ID: faif9ahhs9dd8==-sd98ajzz@host.example.com
   CSeq: 1 INVITE
   Contact:
   Content-Length: ...
   Content-Type: application/sdp

   --SDP not shown--

   The caller generates an ACK (message 5).

   ACK sips:B@pc.example.com SIP/2.0
   Route:
   Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz9
   From: Caller ;tag=kkaz-
   To: Callee ;tag=7777
   Call-ID: faif9ahhs9dd8==-sd98ajzz@host.example.com
   CSeq: 1 ACK

   The ACK is forwarded to the called party (message 6).

   ACK sips:B@pc.example.com SIP/2.0
   Via: SIP/2.0/TLS app.example.com;branch=z9hG4bKh7s
   Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9zz9
   From: Caller ;tag=kkaz-
   To: Callee ;tag=7777
   Call-ID: faif9ahhs9dd8==-sd98ajzz@host.example.com
   CSeq: 1 ACK

   Now, the application decides to push a user interface component to
   user A. So, it sends it a REFER request (message 7):

   REFER sips:bad998asd8asd0000a@example.com SIP/2.0
   Refer-To: https://app.example.com/script.pl
     ;context="kkaz-,7777,faif9ahhs9dd8==-sd98ajzz@host.example.com"
   Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK9zh6
   Max-Forwards: 70
   From: Recorder Application ;tag=jhgf
   To: Caller
   Call-ID: 66676776767@app.example.com
   CSeq: 1 REFER
   Event: refer
   Contact:

   The REFER is answered by a 200 OK (message 8).

   SIP/2.0 200 OK
   Via: SIP/2.0/TLS app.example.com;branch=z9hG4bK9zh6
   From: Recorder Application ;tag=jhgf
   To: Caller ;tag=pqoew
   Call-ID: 66676776767@app.example.com
   Supported: gruu, refer-context
   Allow: INVITE, OPTIONS, BYE, CANCEL, ACK, REFER
   Contact: ;schemes="http,sip,sips"
   CSeq: 1 REFER

   User A sends a NOTIFY (message 9):

   NOTIFY sips:app.example.com SIP/2.0
   Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9320394238995
   To: Recorder Application ;tag=jhgf
   From: Caller ;tag=pqoew
   Call-ID: 66676776767@app.example.com
   CSeq: 1 NOTIFY
   Max-Forwards: 70
   Event: refer;id=93809824
   Subscription-State: active;expires=3600
   Contact: ;schemes="http,sip,sips"
   Content-Type: message/sipfrag;version=2.0
   Content-Length: 20

   SIP/2.0 100 Trying

   And the recording server responds with a 200 OK (message 10):

   SIP/2.0 200 OK
   Via: SIP/2.0/TLS host.example.com;branch=z9hG4bK9320394238995
   To: Recorder Application ;tag=jhgf
   From: Caller ;tag=pqoew
   Call-ID: 66676776767@app.example.com
   CSeq: 1 NOTIFY

   The
   REFER request contained a "context" Refer-To header field
   parameter with a valid dialog identifier. Furthermore, all of the
   signaling was over TLS and the dialog identifiers contain sufficient
   randomness. As such, the caller, A, automatically authorizes the
   application. It then acts on the Refer-To URI, fetching the script
   from app.example.com (message 11). The response, message 12,
   contains a web application that the user can click on to enable
   recording. Because the client executed the URL in the Refer-To, it
   generates another NOTIFY to the application, informing it of the
   successful response (message 13). This is answered with a 200 OK
   (message 14). When the user clicks on the link (message 15), the
   results are posted to the server, and an updated display is provided
   (message 16).

12. Security Considerations

   There are many security considerations associated with this
   framework. It allows applications in the network to instantiate user
   interface components on a client device. Such instantiations need to
   be from authenticated applications, and also need to be authorized to
   place a UI into the client. Indeed, the stronger requirement is
   authorization. It is not so important to know the name of the
   provider of the application, but rather, that the provider is
   authorized to instantiate components.

   This specification defines specific authorization techniques and
   requirements. Automatic authorization is granted if the application
   can prove that it is on the call path, or is trusted by an element on
   the call path. As documented above, this can be accomplished by the
   use of cryptographically random dialog identifiers and the usage of
   sips for message confidentiality. It is RECOMMENDED that sips be
   implemented by user agents compliant to this specification. This
   does not represent a change from the requirements in RFC 3261.
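
   As a non-normative sketch of the requirements summarized above, the
   following Python fragment generates dialog identifiers carrying 128
   bits of randomness, split evenly between the Call-ID and the From
   tag, and applies the automatic authorization rule of Section 8.2.
   It assumes that pre-defined policies have neither authorized nor
   blocked the application. The function names, the tuple layout of a
   dialog identifier, and the returned strings are assumptions made for
   illustration. Extracting the presented identifiers from the Event
   header field or the "context" parameter is not shown.

```python
import secrets
from typing import Tuple


def new_dialog_identifiers(host: str) -> Tuple[str, str]:
    """Return (Call-ID, From tag) with 64 random bits each, 128 in total."""
    call_id = f"{secrets.token_hex(8)}@{host}"  # 8 bytes = 64 bits
    from_tag = secrets.token_hex(8)             # 8 bytes = 64 bits
    return call_id, from_tag


def authorize(presented: Tuple[str, str, str],
              actual: Tuple[str, str, str],
              dialog_used_sips: bool,
              identifiers_random: bool) -> str:
    """Apply the automatic authorization rule to a SUBSCRIBE or REFER.

    Identifiers are (Call-ID, From tag, To tag) tuples (hypothetical layout).
    """
    if presented != actual:
        return "reject-403"          # no valid dialog identifier presented
    if dialog_used_sips and identifiers_random:
        return "auto-authorize"      # on-path, or trusted by an on-path element
    return "other-policy-required"   # for example, prompt the user
```

   The third outcome reflects that valid identifiers alone are not
   sufficient: without sips and random identifiers, the user agent has
   to rely on some other policy, such as prompting the user.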

13. IANA Considerations

13.1 SIP Option Tag

   This specification registers a new SIP option tag, as per the
   guidelines in Section 27.1 of RFC 3261 [1].

   Name: refer-context

   Description: This option tag is used to identify the REFER extension
      that defines the "context" parameter of the Refer-To header field.

13.2 Header Field Parameter

   This specification defines a new header field parameter, as per the
   registry created by [9]. The required information is as follows:

   Header field in which the parameter can appear: Refer-To

   Name of the Parameter: context

   RFC Reference: RFC XXXX [[NOTE TO IANA: Please replace XXXX with the
      RFC number of this specification.]]

14. Contributors

   This document was produced as a result of discussions amongst the
   application interaction design team. All members of this team
   contributed significantly to the ideas embodied in this document.
   The members of this team were:

      Eric Burger
      Cullen Jennings
      Robert Fairlie-Cuninghame

15. Acknowledgements

   The authors would like to thank Martin Dolly and Rohan Mahy for their
   input and comments. Thanks to Allison Mankin for her support of this
   work.

16. References

16.1 Normative References

   [1] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
       Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP:
       Session Initiation Protocol", RFC 3261, June 2002.

   [2] Rosenberg, J. and H. Schulzrinne, "Reliability of Provisional
       Responses in Session Initiation Protocol (SIP)", RFC 3262, June
       2002.

   [3] Roach, A., "Session Initiation Protocol (SIP)-Specific Event
       Notification", RFC 3265, June 2002.

   [4] McGlashan, S., Lucas, B., Porter, B., Rehor, K., Burnett, D.,
       Carter, J., Ferrans, J. and A. Hunt, "Voice Extensible Markup
       Language (VoiceXML) Version 2.0", W3C CR CR-voicexml20-20030220,
       February 2003.

   [5] Rosenberg, J., Schulzrinne, H. and P. Kyzivat, "Indicating User
       Agent Capabilities in the Session Initiation Protocol (SIP)",
       RFC 3840, August 2004.

   [6] Sparks, R., "The Session Initiation Protocol (SIP) Refer
       Method", RFC 3515, April 2003.

   [7] Burger, E., "A Session Initiation Protocol (SIP) Event Package
       for Key Press Stimulus (KPML)", draft-ietf-sipping-kpml-04
       (work in progress), July 2004.

   [8] Rosenberg, J., "Obtaining and Using Globally Routable User Agent
       (UA) URIs (GRUU) in the Session Initiation Protocol (SIP)",
       draft-ietf-sip-gruu-02 (work in progress), July 2004.

   [9] Camarillo, G., "The Internet Assigned Number Authority (IANA)
       Header Field Parameter Registry for the Session Initiation
       Protocol (SIP)", draft-ietf-sip-parameter-registry-02 (work in
       progress), June 2004.

16.2 Informative References

   [10] Peterson, J., "Enhancements for Authenticated Identity
        Management in the Session Initiation Protocol (SIP)",
        draft-ietf-sip-identity-03 (work in progress), September 2004.

   [11] Day, M., Rosenberg, J. and H. Sugano, "A Model for Presence and
        Instant Messaging", RFC 2778, February 2000.

   [12] Jennings, C., Peterson, J. and M. Watson, "Private Extensions
        to the Session Initiation Protocol (SIP) for Asserted Identity
        within Trusted Networks", RFC 3325, November 2002.

   [13] Rosenberg, J., "A Framework for Conferencing with the Session
        Initiation Protocol",
        draft-ietf-sipping-conferencing-framework-02 (work in
        progress), June 2004.

   [14] Rosenberg, J., Schulzrinne, H. and P. Kyzivat, "Caller
        Preferences for the Session Initiation Protocol (SIP)", RFC
        3841, August 2004.

   [15] Rosenberg, J. and H. Schulzrinne, "An INVITE Initiated Dialog
        Event Package for the Session Initiation Protocol (SIP)",
        draft-ietf-sipping-dialog-package-04 (work in progress),
        February 2004.

   [16] Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson,
        "RTP: A Transport Protocol for Real-Time Applications", RFC
        3550, July 2003.

   [17] Schulzrinne, H. and S. Petrack, "RTP Payload for DTMF Digits,
        Telephony Tones and Telephony Signals", RFC 2833, May 2000.

   [18] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model with
        Session Description Protocol (SDP)", RFC 3264, June 2002.

   [19] Rosenberg, J., "A Session Initiation Protocol (SIP) Event
        Package for Registrations", RFC 3680, March 2004.

Author's Address

   Jonathan Rosenberg
   Cisco Systems
   600 Lanidex Plaza
   Parsippany, NJ 07054
   US

   Phone: +1 973 952-5000
   EMail: jdrosen@dynamicsoft.com
   URI:   http://www.jdrosen.net

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights. Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.
   Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2004). This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.