| < draft-rosenberg-sipping-app-interaction-framework-00.txt | draft-rosenberg-sipping-app-interaction-framework-01.txt > | |||
|---|---|---|---|---|
| Internet Engineering Task Force SIPPING WG | SIPPING J. Rosenberg | |||
| Internet Draft J. Rosenberg | Internet-Draft dynamicsoft | |||
| dynamicsoft | Expires: December 29, 2003 June 30, 2003 | |||
| draft-rosenberg-sipping-app-interaction-framework-00.txt | ||||
| October 28, 2002 | ||||
| Expires: April 2003 | ||||
| A Framework and Requirements for Application Interaction in SIP | A Framework and Requirements for Application Interaction in the | |||
| Session Initiation Protocol (SIP) | ||||
| draft-rosenberg-sipping-app-interaction-framework-01 | ||||
| STATUS OF THIS MEMO | Status of this Memo | |||
| This document is an Internet-Draft and is in full conformance with | This document is an Internet-Draft and is in full conformance with | |||
| all provisions of Section 10 of RFC2026. | all provisions of Section 10 of RFC2026. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that other | |||
| other groups may also distribute working documents as Internet- | groups may also distribute working documents as Internet-Drafts. | |||
| Drafts. | ||||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
| and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
| time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
| material or to cite them other than as "work in progress". | material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at http:// | |||
| http://www.ietf.org/ietf/1id-abstracts.txt | www.ietf.org/ietf/1id-abstracts.txt. | |||
| To view the list Internet-Draft Shadow Directories, see | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
| This Internet-Draft will expire on December 29, 2003. | ||||
| Copyright Notice | ||||
| Copyright (C) The Internet Society (2003). All Rights Reserved. | ||||
| Abstract | Abstract | |||
| This document describes a framework and requirements for the | This document describes a framework and requirements for the | |||
| interaction between users and Session Initiation Protocol (SIP) based | interaction between users and Session Initiation Protocol (SIP) based | |||
| applications. By interacting with applications, users can guide the | applications. By interacting with applications, users can guide the | |||
| way in which they operate. The focus of this framework is stimulus | way in which they operate. The focus of this framework is stimulus | |||
| signaling, which allows a user agent to interact with an application | signaling, which allows a user agent to interact with an application | |||
| without knowledge of the semantics of that application. Stimulus | without knowledge of the semantics of that application. Stimulus | |||
| signaling can occur to a user interface running locally with the | signaling can occur to a user interface running locally with the | |||
| client, or to a remote user interface, through media streams. | client, or to a remote user interface, through media streams. | |||
| Stimulus signaling encompasses a wide range of mechanisms, ranging | Stimulus signaling encompasses a wide range of mechanisms, ranging | |||
| from clicking on hyperlinks, to pressing buttons, to traditional Dual | from clicking on hyperlinks, to pressing buttons, to traditional Dual | |||
| Tone Multi Frequency (DTMF) input. In all cases, stimulus signaling | Tone Multi Frequency (DTMF) input. In all cases, stimulus signaling | |||
| is supported through the use of markup languages, which play a key | is supported through the use of markup languages, which play a key | |||
| role in this framework. | role in this framework. | |||
| Table of Contents | Table of Contents | |||
| 1 Introduction ........................................ 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
| 2 Definitions ......................................... 3 | 2. Definitions . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
| 3 A Model for Application Interaction ................. 6 | 3. A Model for Application Interaction . . . . . . . . . . . . 7 | |||
| 3.1 Function vs. Stimulus ............................... 8 | 3.1 Function vs. Stimulus . . . . . . . . . . . . . . . . . . . 8 | |||
| 3.2 Real-Time vs. Non-Real Time ......................... 8 | 3.2 Real-Time vs. Non-Real Time . . . . . . . . . . . . . . . . 9 | |||
| 3.3 Client-Local vs. Client-Remote ...................... 9 | 3.3 Client-Local vs. Client-Remote . . . . . . . . . . . . . . . 9 | |||
| 3.4 Interaction Scenarios on Telephones ................. 10 | 3.4 Interaction Scenarios on Telephones . . . . . . . . . . . . 10 | |||
| 3.4.1 Client Remote ....................................... 10 | 3.4.1 Client Remote . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 3.4.2 Client Local ........................................ 10 | 3.4.2 Client Local . . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 3.4.3 Flip-Flop ........................................... 11 | 3.4.3 Flip-Flop . . . . . . . . . . . . . . . . . . . . . . . . . 11 | |||
| 4 Framework Overview .................................. 12 | 4. Framework Overview . . . . . . . . . . . . . . . . . . . . . 13 | |||
| 5 Client Local Interfaces ............................. 13 | 5. Client Local Interfaces . . . . . . . . . . . . . . . . . . 15 | |||
| 5.1 Discovering Capabilities ............................ 14 | 5.1 Discovering Capabilities . . . . . . . . . . . . . . . . . . 15 | |||
| 5.2 Pushing an Initial Interface Component .............. 14 | 5.2 Pushing an Initial Interface Component . . . . . . . . . . . 15 | |||
| 5.3 Updating an Interface Component ..................... 16 | 5.3 Updating an Interface Component . . . . . . . . . . . . . . 17 | |||
| 5.4 Terminating an Interface Component .................. 17 | 5.4 Terminating an Interface Component . . . . . . . . . . . . . 18 | |||
| 6 Client Remote Interfaces ............................ 17 | 6. Client Remote Interfaces . . . . . . . . . . . . . . . . . . 19 | |||
| 6.1 Originating and Terminating Applications ............ 18 | 6.1 Originating and Terminating Applications . . . . . . . . . . 19 | |||
| 6.2 Intermediary Applications ........................... 18 | 6.2 Intermediary Applications . . . . . . . . . . . . . . . . . 19 | |||
| 7 Inter-Application Feature Interaction ............... 18 | 7. Inter-Application Feature Interaction . . . . . . . . . . . 21 | |||
| 7.1 Client Local UI ..................................... 19 | 7.1 Client Local UI . . . . . . . . . . . . . . . . . . . . . . 21 | |||
| 7.2 Client-Remote UI .................................... 20 | 7.2 Client-Remote UI . . . . . . . . . . . . . . . . . . . . . . 22 | |||
| 7.2.1 Centralized Server .................................. 20 | 8. Intra Application Feature Interaction . . . . . . . . . . . 23 | |||
| 7.2.2 Pipe-and-Filter ..................................... 21 | 9. Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 24 | |||
| 7.2.2.1 Client Resolution ................................... 22 | 10. Security Considerations . . . . . . . . . . . . . . . . . . 25 | |||
| 7.2.3 Comparison .......................................... 31 | 11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 26 | |||
| 8 Intra Application Feature Interaction ............... 33 | Informative References . . . . . . . . . . . . . . . . . . . 27 | |||
| 9 Examples ............................................ 34 | Author's Address . . . . . . . . . . . . . . . . . . . . . . 28 | |||
| 10 Security Considerations ............................. 35 | Intellectual Property and Copyright Statements . . . . . . . 29 | |||
| 11 Contributors ........................................ 35 | ||||
| 12 Authors Address ..................................... 35 | ||||
| 13 Normative References ................................ 37 | ||||
| 14 Informative References .............................. 37 | ||||
| 1 Introduction | 1. Introduction | |||
| The Session Initiation Protocol (SIP) [1] provides the ability for | The Session Initiation Protocol (SIP) [1] provides the ability for | |||
| users to initiate, manage, and terminate communications sessions. | users to initiate, manage, and terminate communications sessions. | |||
| Frequently, these sessions will involve a SIP application. A SIP | Frequently, these sessions will involve a SIP application. A SIP | |||
| application is defined as a program running on a SIP-based element | application is defined as a program running on a SIP-based element | |||
| (such as a proxy or user agent) that provides some value-added | (such as a proxy or user agent) that provides some value-added | |||
| function to a user or system administrator. Examples of SIP | function to a user or system administrator. Examples of SIP | |||
| applications include pre-paid calling card calls, conferencing, and | applications include pre-paid calling card calls, conferencing, and | |||
| presence-based [2] call routing. | presence-based [3] call routing. | |||
| In order for most applications to properly function, they need input | In order for most applications to properly function, they need input | |||
| from the user to guide their operation. As an example, a pre-paid | from the user to guide their operation. As an example, a pre-paid | |||
| calling card application requires the user to input their calling | calling card application requires the user to input their calling | |||
| card number, their PIN code, and the destination number they wish to | card number, their PIN code, and the destination number they wish to | |||
| reach. The process by which a user provides input to an application | reach. The process by which a user provides input to an application | |||
| is called "application interaction". | is called "application interaction". | |||
| Application interaction can be either functional or stimulus. | Application interaction can be either functional or stimulus. | |||
| Functional interaction requires the user agent to understand the | Functional interaction requires the user agent to understand the | |||
| semantics of the application, whereas stimulus interaction does not. | semantics of the application, whereas stimulus interaction does not. | |||
| Stimulus signaling allows for applications to be built without | Stimulus signaling allows for applications to be built without | |||
| requiring modifications to the client. Stimulus interaction is the | requiring modifications to the client. Stimulus interaction is the | |||
| subject of this framework. The framework provides a model for how | subject of this framework. The framework provides a model for how | |||
| users interact with applications through user interfaces, and how | users interact with applications through user interfaces, and how | |||
| user interfaces and applications can be distributed throughout a | user interfaces and applications can be distributed throughout a | |||
| network. This model is then used to describe how applications can | network. This model is then used to describe how applications can | |||
| instantiate and manage user interfaces. | instantiate and manage user interfaces. | |||
| 2 Definitions | 2. Definitions | |||
| SIP Application: A SIP application is defined as a program | SIP Application: A SIP application is defined as a program running on | |||
| running on a SIP-based element (such as a proxy or user | a SIP-based element (such as a proxy or user agent) that provides | |||
| agent) that provides some value-added function to a user or | some value-added function to a user or system administrator. | |||
| system administrator. Examples of SIP applications include | Examples of SIP applications include pre-paid calling card calls, | |||
| pre-paid calling card calls, conferencing, and presence- | conferencing, and presence-based [3] call routing. | |||
| based [2] call routing. | ||||
| Application Interaction: The process by which a user provides | Application Interaction: The process by which a user provides input | |||
| input to an application. | to an application. | |||
| Real-Time Application Interaction: Application interaction that | Real-Time Application Interaction: Application interaction that takes | |||
| takes place while an application instance is executing. For | place while an application instance is executing. For example, | |||
| example, when a user enters their PIN number into a pre- | when a user enters their PIN number into a pre-paid calling card | |||
| paid calling card application, this is real-time | application, this is real-time application interaction. | |||
| application interaction. | ||||
| Non-Real Time Application Interaction: Application interaction | Non-Real Time Application Interaction: Application interaction that | |||
| that takes place asynchronously with the execution of the | takes place asynchronously with the execution of the application. | |||
| application. Generally, non-real time application | Generally, non-real time application interaction is accomplished | |||
| interaction is accomplished through provisioning. | through provisioning. | |||
| Functional Application Interaction: Application interaction is | Functional Application Interaction: Application interaction is | |||
| functional when the user device has an understanding of the | functional when the user device has an understanding of the | |||
| semantics of the application that the user is interacting | semantics of the application that the user is interacting with. | |||
| with. | ||||
| Stimulus Application Interaction: Application interaction is | Stimulus Application Interaction: Application interaction is | |||
| considered to be stimulus when the user device has no | considered to be stimulus when the user device has no | |||
| understanding of the semantics of the application that the | understanding of the semantics of the application that the user is | |||
| user is interacting with. | interacting with. | |||
| User Interface (UI): The user interface provides the user with | User Interface (UI): The user interface provides the user with | |||
| context in order to make decisions about what they want. | context in order to make decisions about what they want. The user | |||
| The user enters information into the user interface. The | enters information into the user interface. The user interface | |||
| user interface interprets the information, and passes it to | interprets the information, and passes it to the application. | |||
| the application. | ||||
| User Interface Component: A piece of user interface which | User Interface Component: A piece of user interface which operates | |||
| operates independently of other pieces of the user | independently of other pieces of the user interface. For example, | |||
| interface. For example, a user might have two separate web | a user might have two separate web interfaces to a pre-paid | |||
| interfaces to a pre-paid calling card application - one for | calling card application - one for hanging up and making another | |||
| hanging up and making another call, and another for | call, and another for entering the username and PIN. | |||
| entering the username and PIN. | ||||
| User Device: The software or hardware system that the user | User Device: The software or hardware system that the user directly | |||
| directly interacts with in order to communicate with the | interacts with in order to communicate with the application. An | |||
| application. An example of a user device is a telephone. | example of a user device is a telephone. Another example is a PC | |||
| Another example is a PC with a web browser. | with a web browser. | |||
| User Input: The "raw" information passed from a user to a user | User Input: The "raw" information passed from a user to a user | |||
| interface. Examples of user input include a spoken word or | interface. Examples of user input include a spoken word or a click | |||
| a click on a hyperlink. | on a hyperlink. | |||
| Client-Local User Interface: A user interface which is co- | Client-Local User Interface: A user interface which is co-resident | |||
| resident with the user device. | with the user device. | |||
| Client Remote User Interface: A user interface which executes | Client Remote User Interface: A user interface which executes | |||
| remotely from the user device. In this case, a standardized | remotely from the user device. In this case, a standardized | |||
| interface is needed between them. Typically, this is done | interface is needed between them. Typically, this is done through | |||
| through media sessions - audio, video, or application | media sessions - audio, video, or application sharing. | |||
| sharing. | ||||
| Media Interaction: A means of separating a user and a user | Media Interaction: A means of separating a user and a user interface | |||
| interface by connecting them with media streams. | by connecting them with media streams. | |||
| Interactive Voice Response (IVR): An IVR is a type of user | Interactive Voice Response (IVR): An IVR is a type of user interface | |||
| interface that allows users to speak commands to the | that allows users to speak commands to the application, and hear | |||
| application, and hear responses to those commands prompting | responses to those commands prompting for more information. | |||
| for more information. | ||||
| Prompt-and-Collect: The basic primitive of an IVR user | Prompt-and-Collect: The basic primitive of an IVR user interface. The | |||
| interface. The user is presented with a voice option, and | user is presented with a voice option, and the user speaks their | |||
| the user speaks their choice. | choice. | |||
| Barge-In: In an IVR user interface, a user is prompted to enter | Barge-In: In an IVR user interface, a user is prompted to enter some | |||
| some information. With some prompts, the user may enter the | information. With some prompts, the user may enter the requested | |||
| requested information before the prompt completes. In that | information before the prompt completes. In that case, the prompt | |||
| case, the prompt ceases. The act of entering the | ceases. The act of entering the information before completion of | |||
| information before completion of the prompt is referred to | the prompt is referred to as barge-in. | |||
| as barge-in. | ||||
| Focus: A user interface component has focus when user input is | Focus: A user interface component has focus when user input is | |||
| provided to it, as opposed to any other user interface | provided to it, as opposed to any other user interface | |||
| components. This is not to be confused with the term focus | components. This is not to be confused with the term focus within | |||
| within the SIP conferencing framework, which refers to the | the SIP conferencing framework, which refers to the center user | |||
| center user agent in a conference [3]. | agent in a conference [4]. | |||
| Focus Determination: The process by which the user device | Focus Determination: The process by which the user device determines | |||
| determines which user interface component will receive the | which user interface component will receive the user input. | |||
| user input. | ||||
| Focusless User Interface: A user interface which has no ability | Focusless User Interface: A user interface which has no ability to | |||
| to perform focus determination. An example of a focusless | perform focus determination. An example of a focusless user | |||
| user interface is a keypad on a telephone. | interface is a keypad on a telephone. | |||
| Feature Interaction: A class of problems which result when | Feature Interaction: A class of problems which result when multiple | |||
| multiple applications or application components are trying | applications or application components are trying to provide | |||
| to provide services to a user at the same time. | services to a user at the same time. | |||
| Inter-Application Feature Interaction: Feature interactions that | Inter-Application Feature Interaction: Feature interactions that | |||
| occur between applications. | occur between applications. | |||
| DTMF: Dual-Tone Multi-Frequency. DTMF refers to a class of tones | DTMF: Dual-Tone Multi-Frequency. DTMF refers to a class of tones | |||
| generated by circuit switched telephony devices when the | generated by circuit switched telephony devices when the user | |||
| user presses a key on the keypad. As a result, DTMF and | presses a key on the keypad. As a result, DTMF and keypad input | |||
| keypad input are often used synonymously, when in fact one | are often used synonymously, when in fact one of them (DTMF) is | |||
| of them (DTMF) is merely a means of conveying the other | merely a means of conveying the other (the keypad input) to a | |||
| (the keypad input) to a client-remote user interface (the | client-remote user interface (the switch, for example). | |||
| switch, for example). | ||||
| Application Instance: A single execution path of a SIP | Application Instance: A single execution path of a SIP application. | |||
| application. | ||||
| Originating Application: A SIP application which acts as a UAC, | Originating Application: A SIP application which acts as a UAC, | |||
| calling the user. | calling the user. | |||
| Terminating Application: A SIP application which acts as a UAS, | Terminating Application: A SIP application which acts as a UAS, | |||
| answering a call generated by a user. IVR applications are | answering a call generated by a user. IVR applications are | |||
| terminating applications. | terminating applications. | |||
| Intermediary Application: A SIP application which is neither the | Intermediary Application: A SIP application which is neither the | |||
| caller nor callee, but rather, a third party involved in a | caller nor callee, but rather, a third party involved in a call. | |||
| call. | ||||
| 3 A Model for Application Interaction | 3. A Model for Application Interaction | |||
| +---+ +---+ +---+ +---+ | +---+ +---+ +---+ +---+ | |||
| | | | | | | | | | | | | | | | | | | |||
| | | | U | | U | | A | | | | | U | | U | | A | | |||
| | | Input | s | Input | s | Results | p | | | | Input | s | Input | s | Results | p | | |||
| | | ---------> | e | ---------> | e | ----------> | p | | | | ---------> | e | ---------> | e | ----------> | p | | |||
| | U | | r | | r | | l | | | U | | r | | r | | l | | |||
| | s | | | | | | i | | | s | | | | | | i | | |||
| | e | | D | | I | | c | | | e | | D | | I | | c | | |||
| | r | Output | e | Output | f | Update | a | | | r | Output | e | Output | f | Update | a | | |||
| | | <--------- | v | <--------- | a | <.......... | t | | | | <--------- | v | <--------- | a | <.......... | t | | |||
| | | | i | | c | | i | | | | | i | | c | | i | | |||
| | | | c | | e | | o | | | | | c | | e | | o | | |||
| | | | e | | | | n | | | | | e | | | | n | | |||
| | | | | | | | | | | | | | | | | | | |||
| +---+ +---+ +---+ +---+ | +---+ +---+ +---+ +---+ | |||
| Figure 1: Model for Real-Time Interactions | Figure 1: Model for Real-Time Interactions | |||
| Figure 1 presents a general model for how users interact with | Figure 1 presents a general model for how users interact with | |||
| applications. Generally, users interact with a user interface through | applications. Generally, users interact with a user interface through | |||
| a user device. A user device can be a telephone, or it can be a PC | a user device. A user device can be a telephone, or it can be a PC | |||
| with a web browser. Its role is to pass the user input from the user, | with a web browser. Its role is to pass the user input from the user, | |||
| to the user interface. The user interface provides the user with | to the user interface. The user interface provides the user with | |||
| context in order to make decisions about what they want. The user | context in order to make decisions about what they want. The user | |||
| enters information into the user interface. The user interface | enters information into the user interface. The user interface | |||
| interprets the information, and passes it to the application. The | interprets the information, and passes it to the application. The | |||
| application may be able to modify the user interface based on this | application may be able to modify the user interface based on this | |||
| skipping to change at page 7, line 48 ¶ | skipping to change at page 8, line 22 ¶ | |||
| application, when the user is prompted to enter their PIN, the prompt | application, when the user is prompted to enter their PIN, the prompt | |||
| should generally stop immediately once the first digit of the PIN is | should generally stop immediately once the first digit of the PIN is | |||
| entered. This is referred to as barge-in. After the user-interface | entered. This is referred to as barge-in. After the user-interface | |||
| collects the rest of the PIN, it can tell the user to "please wait | collects the rest of the PIN, it can tell the user to "please wait | |||
| while processing". The PIN can then be gradually transmitted to the | while processing". The PIN can then be gradually transmitted to the | |||
| application. In this example, the user interface has compensated for | application. In this example, the user interface has compensated for | |||
| a slow UI to application interface by asking the user to wait. | a slow UI to application interface by asking the user to wait. | |||
| The separation between user interface and application is absolutely | The separation between user interface and application is absolutely | |||
| fundamental to the entire framework provided in this document. Its | fundamental to the entire framework provided in this document. Its | |||
| importance cannot be understated. | importance cannot be overstated. | |||
| With this basic model, we can begin to taxonomize the types of | With this basic model, we can begin to taxonomize the types of | |||
| systems that can be built. | systems that can be built. | |||
| 3.1 Function vs. Stimulus | 3.1 Function vs. Stimulus | |||
| The first way to taxonomize the system is to consider the interface | The first way to taxonomize the system is to consider the interface | |||
| between the UI and the application. There are two fundamentally | between the UI and the application. There are two fundamentally | |||
| different models for this interface. In a functional interface, the | different models for this interface. In a functional interface, the | |||
| user interface has detailed knowledge about the application, and is, | user interface has detailed knowledge about the application, and is, | |||
| skipping to change at page 8, line 40 ¶ | skipping to change at page 9, line 13 ¶ | |||
| application in order to change the way in which they render | application in order to change the way in which they render | |||
| information to the user, stimulus user interfaces are usually slower, | information to the user, stimulus user interfaces are usually slower, | |||
| less user friendly, and less responsive than a functional | less user friendly, and less responsive than a functional | |||
| counterpart. However, they allow for substantial innovation in | counterpart. However, they allow for substantial innovation in | |||
| applications, since no standardization activity is needed to build a | applications, since no standardization activity is needed to build a | |||
| new application, as long as it can interact with the user within the | new application, as long as it can interact with the user within the | |||
| confines of the user interface mechanism. | confines of the user interface mechanism. | |||
| In SIP systems, functional interfaces are provided by extending the | In SIP systems, functional interfaces are provided by extending the | |||
| SIP protocol to provide the needed functionality. For example, the | SIP protocol to provide the needed functionality. For example, the | |||
| SIP caller preferences specification [4] provides a functional | SIP caller preferences specification [5] provides a functional | |||
| interface that allows a user to request applications to route the | interface that allows a user to request applications to route the | |||
| call to specific types of user agents. Functional interfaces are | call to specific types of user agents. Functional interfaces are | |||
| important, but are not the subject of this framework. The primary | important, but are not the subject of this framework. The primary | |||
| goal of this framework is to address the role of stimulus interfaces | goal of this framework is to address the role of stimulus interfaces | |||
| to SIP applications. | to SIP applications. | |||
| 3.2 Real-Time vs. Non-Real Time | 3.2 Real-Time vs. Non-Real Time | |||
| Application interaction systems can also be real-time or non-real- | Application interaction systems can also be real-time or | |||
| time. Non-real-time interaction allows the user to enter information about | non-real-time. Non-real-time interaction allows the user to enter | |||
| application operation asynchronously with its invocation. | information about application operation asynchronously with its | |||
| Frequently, this is done through provisioning systems. As an example, | invocation. Frequently, this is done through provisioning systems. As | |||
| a user can set up the forwarding number for a call-forward on no- | an example, a user can set up the forwarding number for a | |||
| answer application using a web page. Real-time interaction requires | call-forward on no-answer application using a web page. Real-time | |||
| the user to interact with the application at the time of its | interaction requires the user to interact with the application at the | |||
| invocation. | time of its invocation. | |||
| 3.3 Client-Local vs. Client-Remote | 3.3 Client-Local vs. Client-Remote | |||
| Another axis in the taxonomization is whether the user interface is | Another axis in the taxonomization is whether the user interface is | |||
| co-resident with the user device (which we refer to as a client-local | co-resident with the user device (which we refer to as a client-local | |||
| user interface), or the user interface runs in a host separated from | user interface), or the user interface runs in a host separated from | |||
| the client (which we refer to as a client-remote user interface). In | the client (which we refer to as a client-remote user interface). In | |||
| a client-remote user interface, there exists some kind of protocol | a client-remote user interface, there exists some kind of protocol | |||
| between the client device and the UI that allows the client to | between the client device and the UI that allows the client to | |||
| interact with the user interface over a network. | interact with the user interface over a network. | |||
| The most important way to separate the UI and the client device is | The most important way to separate the UI and the client device is | |||
| through media interaction. In media interaction, the interface | through media interaction. In media interaction, the interface | |||
| between the user and the user interface is through media - audio, | between the user and the user interface is through media - audio, | |||
| video, messaging, and so on. This is the classic mode of operation | video, messaging, and so on. This is the classic mode of operation | |||
| for VoiceXML [5], where the user interface (also referred to as the | for VoiceXML [2], where the user interface (also referred to as the | |||
| voice browser) runs on a platform in the network. Users communicate | voice browser) runs on a platform in the network. Users communicate | |||
| with the voice browser through the telephone network (or using a SIP | with the voice browser through the telephone network (or using a SIP | |||
| session). The voice browser interacts with the application using HTTP | session). The voice browser interacts with the application using HTTP | |||
| to convey the information collected from the user. | to convey the information collected from the user. | |||
| We refer to the second sub-case as a client-local user interface. In | We refer to the second sub-case as a client-local user interface. In | |||
| this case, the user interface runs co-located with the user. The | this case, the user interface runs co-located with the user. The | |||
| interface between them is through the software that interprets the | interface between them is through the software that interprets the | |||
| user's input and passes it to the user interface. The classic | user's input and passes it to the user interface. The classic | |||
| example of this is the web. In the web, the user interface is a web | example of this is the web. In the web, the user interface is a web | |||
| skipping to change at page 11, line 4 ¶ | skipping to change at page 11, line 25 ¶ | |||
| (such as PCMU). An alternative, and generally the preferred approach, | (such as PCMU). An alternative, and generally the preferred approach, | |||
| is to transmit the keypad input using RFC 2833 [7], which provides an | is to transmit the keypad input using RFC 2833 [7], which provides an | |||
| encoding mechanism for carrying keypad input within RTP. | encoding mechanism for carrying keypad input within RTP. | |||
| In this classic model, the user interface would run on a server in | In this classic model, the user interface would run on a server in | |||
| the IP network. It would perform speech recognition and DTMF | the IP network. It would perform speech recognition and DTMF | |||
| recognition to derive the user intent, feed them through the user | recognition to derive the user intent, feed them through the user | |||
| interface, and provide the result to an application. | interface, and provide the result to an application. | |||
| 3.4.2 Client Local | 3.4.2 Client Local | |||
| An alternative model is for the entire user interface to reside on | An alternative model is for the entire user interface to reside on | |||
| the telephone. The user interface can be a VoiceXML browser, running | the telephone. The user interface can be a VoiceXML browser, running | |||
| speech recognition on the microphone input, and feeding the keypad | speech recognition on the microphone input, and feeding the keypad | |||
| input directly into the script. As discussed above, the VoiceXML | input directly into the script. As discussed above, the VoiceXML | |||
| script could be rendered using text instead of voice, if the | script could be rendered using text instead of voice, if the | |||
| telephone had a textual display. | telephone had a textual display. | |||
| 3.4.3 Flip-Flop | 3.4.3 Flip-Flop | |||
| A middle-ground approach is to flip back and forth between a client- | A middle-ground approach is to flip back and forth between a | |||
| local and client-remote user interface. Many voice applications are | client-local and client-remote user interface. Many voice | |||
| of the type which listen to the media stream and wait for some | applications are of the type which listen to the media stream and | |||
| specific trigger that kicks off a more complex user interaction. The | wait for some specific trigger that kicks off a more complex user | |||
| long pound in a pre-paid calling card application is one example. | interaction. The long pound in a pre-paid calling card application is | |||
| Another example is a conference recording application, where the user | one example. Another example is a conference recording application, | |||
| can press a key at some point in the call to begin recording. When | where the user can press a key at some point in the call to begin | |||
| the key is pressed, the user hears a whisper to inform them that | recording. When the key is pressed, the user hears a whisper to | |||
| recording has started. | inform them that recording has started. | |||
| The idea way to support such an application is to install a client- | The ideal way to support such an application is to install a | |||
| local user interface component that waits for the trigger to kick off | client-local user interface component that waits for the trigger to | |||
| the real interaction. Once the trigger is received, the application | kick off the real interaction. Once the trigger is received, the | |||
| connects the user to a client-remote user interface that can play | application connects the user to a client-remote user interface that | |||
| announcements, collect more information, and so on. | can play announcements, collect more information, and so on. | |||
| The benefit of flip-flopping between a client-local and client-remote | The benefit of flip-flopping between a client-local and client-remote | |||
| user interface is cost. The client-local user interface will | user interface is cost. The client-local user interface will | |||
| eliminate the need to send media streams into the network just to | eliminate the need to send media streams into the network just to | |||
| wait for the user to press the pound key on the keypad. | wait for the user to press the pound key on the keypad. | |||
| The Keypad Markup Language (KPML) was designed to support exactly | The Keypad Markup Language (KPML) was designed to support exactly | |||
| this kind of need. It models the keypad on a phone, and allows an | this kind of need [8]. It models the keypad on a phone, and allows an | |||
| application to be informed when any sequence of keys has been | application to be informed when any sequence of keys has been | |||
| pressed. However, KPML has no presentation component. Since user | pressed. However, KPML has no presentation component. Since user | |||
| interfaces generally require a response to user input, the | interfaces generally require a response to user input, the | |||
| presentation will need to be done using a client-remote user | presentation will need to be done using a client-remote user | |||
| interface that gets instantiated as a result of the trigger. | interface that gets instantiated as a result of the trigger. | |||
| It is tempting to use a hybrid model, where a prompt-and-collect | It is tempting to use a hybrid model, where a prompt-and-collect | |||
| application is implemented by using a client-remote user interface | application is implemented by using a client-remote user interface | |||
| that plays the prompts, and a client-local user interface, described | that plays the prompts, and a client-local user interface, described | |||
| by KPML, that collects digits. However, this only complicates the | by KPML, that collects digits. However, this only complicates the | |||
| application. Firstly, the keypad input will be sent to both the media | application. Firstly, the keypad input will be sent to both the media | |||
| stream and the KPML user interface. This requires the application to | stream and the KPML user interface. This requires the application to | |||
| sort out which user inputs are duplicates, a process that is very | sort out which user inputs are duplicates, a process that is very | |||
| complicated. Secondly, the primary benefit of KPML is to avoid having | complicated. Secondly, the primary benefit of KPML is to avoid having | |||
| a media stream towards a user interface. However, there is already a | a media stream towards a user interface. However, there is already a | |||
| media stream for the prompting, so there is no real savings. | media stream for the prompting, so there is no real savings. | |||
| That said, the framework does support this hybrid model. | 4. Framework Overview | |||
| 4 Framework Overview | ||||
| In this framework, we use the term "SIP application" to refer to a | In this framework, we use the term "SIP application" to refer to a | |||
| broad set of functionality. A SIP application is a program running on | broad set of functionality. A SIP application is a program running on | |||
| a SIP-based element (such as a proxy or user agent) that provides | a SIP-based element (such as a proxy or user agent) that provides | |||
| some value-added function to a user or system administrator. SIP | some value-added function to a user or system administrator. SIP | |||
| applications can execute on behalf of a caller, a called party, or a | applications can execute on behalf of a caller, a called party, or a | |||
| multitude of users at once. | multitude of users at once. | |||
| Each application has a number of instances that are executing at any | Each application has a number of instances that are executing at any | |||
| given time. An instance represents a single execution path for an | given time. An instance represents a single execution path for an | |||
| skipping to change at page 13, line 15 ¶ | skipping to change at page 14, line 13 ¶ | |||
| interface, and in what format. In this framework, all client-local | interface, and in what format. In this framework, all client-local | |||
| user interface components are described by a markup language. A | user interface components are described by a markup language. A | |||
| markup language describes a logical flow of presentation of | markup language describes a logical flow of presentation of | |||
| information to the user, collection of information from the user, and | information to the user, collection of information from the user, and | |||
| transmission of that information to an application. Examples of | transmission of that information to an application. Examples of | |||
| markup languages include HTML, WML, VoiceXML, the Keypad Markup | markup languages include HTML, WML, VoiceXML, the Keypad Markup | |||
| Language (KPML) [8] and the Media Server Control Markup Language | Language (KPML) [8] and the Media Server Control Markup Language | |||
| (MSCML) [9]. | (MSCML) [9]. | |||
| The interface between the user interface component and the | The interface between the user interface component and the | |||
| application is typically markup-language specific. However, all of | application is typically markup-language specific. For those markups | |||
| the markup languages discussed above use HTTP form POST requests as | which support rendering of information to a user, such as HTML, HTTP | |||
| the primary interface [note that this is still an open issue with | form POST operations are used. For those markups where no information | |||
| KPML]. As discussed in Section 3, this interface is well suited to | is rendered to the user, the markup can play one of two roles. The | |||
| HTTP, which is a good match for its latency, reliability, and content | first is called "one shot". In the one-shot role, the markup waits | |||
| requirements. | for a user to enter some information, and when they do, reports this | |||
| event to the application. The application then does something, and | ||||
| the markup is no longer used. In the other modality, called | ||||
| "monitor", the markup stays permanently resident, and reports | ||||
| information back to an application continuously. However, the act of | ||||
| reporting information back to the application does not cause the | ||||
| installation of a new markup. In markups where one-shot or monitor | ||||
| modalities are used, a SIP MESSAGE request is used to report the | ||||
| status. | ||||
| To create a client-local user interface, the application passes the | To create a client-local user interface, the application passes the | |||
| markup document (or a reference to it) in a SIP message to that | markup document (or a reference to it) in a SIP message to that | |||
| client. The SIP message can be one explicitly generated by the | client. The SIP message can be one explicitly generated by the | |||
| application (in which case the application has to be a UA or B2BUA), | application (in which case the application has to be a UA or B2BUA), | |||
| or it can be placed in a SIP message that passes by (in which case | or it can be placed in a SIP message that passes by (in which case | |||
| the application can be running in a proxy). | the application can be running in a proxy). | |||
| Client local user interface components are always associated with the | Client local user interface components are always associated with the | |||
| dialog that the SIP message itself is associated with. Consequently, | dialog that the SIP message itself is associated with. Consequently, | |||
| skipping to change at page 13, line 47 ¶ | skipping to change at page 15, line 5 ¶ | |||
| which the application knows a UI can be created. However, the | which the application knows a UI can be created. However, the | |||
| application does need to connect the user device to the user | application does need to connect the user device to the user | |||
| interface. This will require manipulation of media streams in order | interface. This will require manipulation of media streams in order | |||
| to establish that connection. | to establish that connection. | |||
| Once a user interface component is created, the application needs to | Once a user interface component is created, the application needs to | |||
| be able to change it, and to remove it. Finally, more advanced | be able to change it, and to remove it. Finally, more advanced | |||
| applications may require coupling between application components. The | applications may require coupling between application components. The | |||
| framework supports rudimentary capabilities there. | framework supports rudimentary capabilities there. | |||
| 5 Client Local Interfaces | 5. Client Local Interfaces | |||
| One key component of this framework is support for client local user | One key component of this framework is support for client local user | |||
| interfaces. | interfaces. | |||
| 5.1 Discovering Capabilities | 5.1 Discovering Capabilities | |||
| A client local user interface can only be instantiated on a client if | A client local user interface can only be instantiated on a client if | |||
| the user device has the capabilities needed to do so. Specifically, | the user device has the capabilities needed to do so. Specifically, | |||
| an application needs to know what markup languages, if any, are | an application needs to know what markup languages, if any, are | |||
| supported by the client. For example, does the client support HTML? | supported by the client. For example, does the client support HTML? | |||
| VoiceXML? However, that information is not sufficient to determine | VoiceXML? However, that information is not sufficient to determine if | |||
| if a client local user interface can be instantiated. In order to | a client local user interface can be instantiated. In order to | |||
| instantiate the user interface, the application needs to transfer the | instantiate the user interface, the application needs to transfer the | |||
| markup document to the client. There are two ways in which the markup | markup document to the client. There are two ways in which the markup | |||
| document can be transferred. The application can send the client a | document can be transferred. The application can send the client a | |||
| URI which the client can use to fetch the markup, or the markup can | URI which the client can use to fetch the markup, or the markup can | |||
| be sent inline within the message. The application needs to know | be sent inline within the message. The application needs to know | |||
| which of these modes are supported, and in the case of indirection, | which of these modes are supported, and in the case of indirection, | |||
| which URI schemes are supported to obtain the indirection. | which URI schemes are supported to obtain the indirection. | |||
| Many applications will need to know these capabilities at the time an | Many applications will need to know these capabilities at the time an | |||
| application instance is first created. Since applications can be | application instance is first created. Since applications can be | |||
| created through SIP requests or responses, SIP needs to provide a | created through SIP requests or responses, SIP needs to provide a | |||
| means to convey this information. This introduces several concrete | means to convey this information. This introduces several concrete | |||
| requirements for SIP: | requirements for SIP: | |||
| REQ 1: A SIP request or response must be capable of conveying | REQ 1: A SIP request or response must be capable of conveying the set | |||
| the set of markup languages supported by the UA that | of markup languages supported by the UA that generated the request | |||
| generated the request or response. | or response. | |||
| REQ 2: A SIP request or response must be capable of indicating | REQ 2: A SIP request or response must be capable of indicating | |||
| whether a UA can obtain markups inline, or through an | whether a UA can obtain markups inline, or through an indirection. | |||
| indirection. In the case of indirection, the UA must be | In the case of indirection, the UA must be capable of indicating | |||
| capable of indicating what URI schemes it supports. | what URI schemes it supports. | |||
| 5.2 Pushing an Initial Interface Component | 5.2 Pushing an Initial Interface Component | |||
| Once the application has determined that the UA is capable of | Once the application has determined that the UA is capable of | |||
| supporting client local user interfaces, the next step is for the | supporting client local user interfaces, the next step is for the | |||
| application to push an interface component to the application. | application to push an interface component to the user device. | |||
| Generally, we anticipate that interface components will need to be | Generally, we anticipate that interface components will need to be | |||
| created at various different points in a SIP session. Clearly, they | created at various different points in a SIP session. Clearly, they | |||
| will need to be pushed during an initial INVITE, in both responses | will need to be pushed during an initial INVITE, in both responses | |||
| (so as to place a component into the calling UA) and in the request | (so as to place a component into the calling UA) and in the request | |||
| (so as to place a component into the called UA). As an example, a | (so as to place a component into the called UA). As an example, a | |||
| conference recording application allows the users to record the media | conference recording application allows the users to record the media | |||
| for the session at any time. The application would like to push an | for the session at any time. The application would like to push an | |||
| HTML user interface component to both the caller and callee at the | HTML user interface component to both the caller and callee at the | |||
| time the call is set up, allowing either to record the session. The | time the call is set up, allowing either to record the session. The | |||
| HTML component would have buttons to start and stop recording. To | HTML component would have buttons to start and stop recording. To | |||
| push the HTML component to the caller, it needs to be pushed in the | push the HTML component to the caller, it needs to be pushed in the | |||
| 200 OK (and possibly provisional response), and to push it to the | 200 OK (and possibly provisional response), and to push it to the | |||
| callee, in the INVITE itself. | callee, in the INVITE itself. | |||
| To state the requirement more concretely: | To state the requirement more concretely: | |||
| REQ 3: An application must be able to add a reference to, or an | REQ 3: An application must be able to add a reference to, or an | |||
| inline version of, a user interface component into any | inline version of, a user interface component into any request or | |||
| request or response that passes through or is eminated from | response that passes through or is emanated from that application. | |||
| that application. | ||||
| However, there will also be cases where the application needs to push | However, there will also be cases where the application needs to push | |||
| a new interface component to a UA, but it is not as a result of any | a new interface component to a UA, but it is not as a result of any | |||
| SIP message. As an example, a pre-paid calling card application will | SIP message. As an example, a pre-paid calling card application will | |||
| set a timer that determines how long the call can proceed, given the | set a timer that determines how long the call can proceed, given the | |||
| availability of funds in the user's account. When the timer fires, | availability of funds in the user's account. When the timer fires, | |||
| the application would like to push a new interface component to the | the application would like to push a new interface component to the | |||
| calling UA, allowing them to click to add more funds. | calling UA, allowing them to click to add more funds. | |||
| In this case, there is no message already in transit that can be used | In this case, there is no message already in transit that can be used | |||
| as a vehicle for pushing a user interface component. This requires | as a vehicle for pushing a user interface component. This requires | |||
| that applications can generate their own messages to push a new | that applications can generate their own messages to push a new | |||
| component to a UA: | component to a UA: | |||
| REQ 4: A UA application must be able to send a SIP message to | REQ 4: A UA application must be able to send a SIP message to the UA | |||
| the UA at the other end of the dialog, asking it to create | at the other end of the dialog, asking it to create a new | |||
| a new interface component. | interface component. | |||
| In all cases, the information passed from the application to the UA | In all cases, the information passed from the application to the UA | |||
| must include more than just the interface component itself (or a | must include more than just the interface component itself (or a | |||
| reference to it). The user must be able to decide whether or not it | reference to it). The user must be able to decide whether or not it | |||
| wants to proceed with this application. To make that determination, | wants to proceed with this application. To make that determination, | |||
| the user must have information about the application. Specifically, | the user must have information about the application. Specifically, | |||
| it will need the name of the application, and an identifier of the | it will need the name of the application, and an identifier of the | |||
| owner or administrator for the application. As an example, a typical | owner or administrator for the application. As an example, a typical | |||
| name would be "Prepaid Calling Card" and the owner could be | name would be "Prepaid Calling Card" and the owner could be | |||
| "voiceprovider.com". | "voiceprovider.com". | |||
| REQ 5: Any user interface component passed to a client (either | REQ 5: Any user interface component passed to a client (either inline | |||
| inline or through a reference) must also include markup | or through a reference) must also include markup meta-data, | |||
| meta-data, including a human readable name of the | including a human readable name of the application, and an | |||
| application, and an identifier of the owner of the | identifier of the owner of the application. | |||
| application. | ||||
| Clearly, there are security implications. The user will need to | Clearly, there are security implications. The user will need to | |||
| verify the identity of the application owner, and be sure that the | verify the identity of the application owner, and be sure that the | |||
| user interface component is not being replayed, that is, it actually | user interface component is not being replayed, that is, it actually | |||
| belongs with this specific SIP message. | belongs with this specific SIP message. | |||
| REQ 6: It must be possible for the client to validate the | REQ 6: It must be possible for the client to validate the | |||
| authenticity and integrity of the markup document (or its | authenticity and integrity of the markup document (or its | |||
| reference) and its associated meta-data. It must be | reference) and its associated meta-data. It must be possible for | |||
| possible for the client to verify that the information has | the client to verify that the information has not been replayed | |||
| not been replayed from a previous SIP message. | from a previous SIP message. | |||
| If the user decides not to execute the user interface component, it | If the user decides not to execute the user interface component, it | |||
| simply discards it. There is no explicit requirement for the user to | simply discards it. There is no explicit requirement for the user to | |||
| be able to inform the application that the component was discarded. | be able to inform the application that the component was discarded. | |||
| Effectively, the application will think that the component was | Effectively, the application will think that the component was | |||
| executed, but that the user never entered any information. | executed, but that the user never entered any information. | |||
| OPEN ISSUE: Are we certain? Adding support for this makes | ||||
| the system more complicated though. Warning headers may | ||||
| make sense here. | ||||
| 5.3 Updating an Interface Component | 5.3 Updating an Interface Component | |||
| Once a user interface component has been created on a client, it can | Once a user interface component has been created on a client, it can | |||
| be updated in two ways. The first way is the "normal" path inherent | be updated in two ways. The first way is the "normal" path inherent | |||
| to that component. The client enters some data, the user interface | to that component. The client enters some data, the user interface | |||
| transfers the information to the application (typically through | transfers the information to the application (typically through | |||
| HTTP), and the result of that transfer brings a new markup document | HTTP), and the result of that transfer brings a new markup document | |||
| describing an updated interface. This is referred to as a synchronous | describing an updated interface. This is referred to as a synchronous | |||
| update, since it is syncrhonized with user interaction. | update, since it is synchronized with user interaction. | |||
| However, synchronous updates are not sufficient for many | However, synchronous updates are not sufficient for many | |||
| applications. Frequently, the interface will need to be updated | applications. Frequently, the interface will need to be updated | |||
| asynchronously by the application, without an explicit user action. A | asynchronously by the application, without an explicit user action. A | |||
| good example of this is, once again, the pre-paid calling card | good example of this is, once again, the pre-paid calling card | |||
| application. The application might like to update the user interface | application. The application might like to update the user interface | |||
| when the timer runs out on the call. This introduces several | when the timer runs out on the call. This introduces several | |||
| requirements: | requirements: | |||
| REQ 7: It must be possible for an application to asynchronously | REQ 7: It must be possible for an application to asynchronously push | |||
| push an update to an existing user interface component, | an update to an existing user interface component, either in a | |||
| either in a message that was already in transit, or by | message that was already in transit, or by generating a new | |||
| generating a new message. | message. | |||
| REQ 8: It must be possible for the client to associate the new | REQ 8: It must be possible for the client to associate the new | |||
| interface component with the one that it is supposed to | interface component with the one that it is supposed to replace, | |||
| replace, so that the old one can be removed. | so that the old one can be removed. | |||
| Unfortunately, pushing of application components introduces a race | Unfortunately, pushing of application components introduces a race | |||
| condition. What if the user enters data into the old component, | condition. What if the user enters data into the old component, | |||
| causing an HTTP request to the application, while an update of that | causing an HTTP request to the application, while an update of that | |||
| component is in progress? The client will get an interface component | component is in progress? The client will get an interface component | |||
| in the HTTP response, and also get the new one in the SIP message. | in the HTTP response, and also get the new one in the SIP message. | |||
| Which one does the client use? There needs to be a way in which to | Which one does the client use? There needs to be a way in which to | |||
| properly order the components: | properly order the components: | |||
| REQ 9: It must be possible for the client to relatively order | REQ 9: It must be possible for the client to relatively order user | |||
| user interface updates it receives as the result of | interface updates it receives as the result of synchronous and | |||
| syncrhonous and asynchronous messaging. | asynchronous messaging. | |||
| 5.4 Terminating an Interface Component | 5.4 Terminating an Interface Component | |||
| User interface components have a well defined lifetime. They are | User interface components have a well defined lifetime. They are | |||
| created when the component is first pushed to the client. User | created when the component is first pushed to the client. User | |||
| interface components are always associated with the SIP dialog on | interface components are always associated with the SIP dialog on | |||
| which they were pushed. As such, their lifetime is bound by the | which they were pushed. As such, their lifetime is bound by the | |||
| lifetime of the dialog. When the dialog ends, so does the interface | lifetime of the dialog. When the dialog ends, so does the interface | |||
| component. | component. | |||
| This rule applies to early dialogs as well. If a user interface | This rule applies to early dialogs as well. If a user interface | |||
| component is passed in a provisional response to INVITE, and a | component is passed in a provisional response to INVITE, and a | |||
| separate branch eventually answers the call, the component terminates | separate branch eventually answers the call, the component terminates | |||
| with the arrival of the 2xx. Thats because the early dialog itself | with the arrival of the 2xx. That's because the early dialog itself | |||
| terminates with the arrival of the 2xx. | terminates with the arrival of the 2xx. | |||
| However, there are some cases where the application would like to | However, there are some cases where the application would like to | |||
| terminate the user interface component before its natural termination | terminate the user interface component before its natural termination | |||
| point. To do this, the application pushes a "null" update to the | point. To do this, the application pushes a "null" update to the | |||
| client. This is an update that replaces the existing user interface | client. This is an update that replaces the existing user interface | |||
| component with nothing. | component with nothing. | |||
| REQ 10: It must be possible for an application to terminate a | REQ 10: It must be possible for an application to terminate a user | |||
| user interface component before its natural expiration. | interface component before its natural expiration. | |||
| The user can also terminate the user interface component. However, | The user can also terminate the user interface component. However, | |||
| there is no explicit signaling required in this case. The component | there is no explicit signaling required in this case. The component | |||
| is simply dismissed. To the application, it appears as if the user | is simply dismissed. To the application, it appears as if the user | |||
| has simply ceased entering data. | has simply ceased entering data. | |||
| 6 Client Remote Interfaces | 6. Client Remote Interfaces | |||
| As an alternative to, or in conjunction with client local user | As an alternative to, or in conjunction with client local user | |||
| interfaces, an application can make use of client remote user | interfaces, an application can make use of client remote user | |||
| interfaces. These user interfaces can execute co-resident with the | interfaces. These user interfaces can execute co-resident with the | |||
| application itself (in which case no standardized interfaces between | application itself (in which case no standardized interfaces between | |||
| the UI and the application need to be used), or they can run | the UI and the application need to be used), or they can run | |||
| separately. This framework assumes that the user interface runs on a | separately. This framework assumes that the user interface runs on a | |||
| host that has a sufficient trust relationship with the application. | host that has a sufficient trust relationship with the application. | |||
| As such, the means for instantiating the user interface is not | As such, the means for instantiating the user interface is not | |||
| considered here. | considered here. | |||
| skipping to change at page 18, line 29 ¶ | skipping to change at page 19, line 41 ¶ | |||
| application. It's a terminating application because the user | application. It's a terminating application because the user | |||
| explicitly calls it; i.e., it is the actual called party. An example | explicitly calls it; i.e., it is the actual called party. An example | |||
| of an originating application is a wakeup call application, which | of an originating application is a wakeup call application, which | |||
| calls a user at a specified time in order to wake them up. | calls a user at a specified time in order to wake them up. | |||
| Because originating and terminating applications are a natural | Because originating and terminating applications are a natural | |||
| termination point of the dialog, manipulation of the media session by | termination point of the dialog, manipulation of the media session by | |||
| the application is trivial. Traditional SIP techniques for adding and | the application is trivial. Traditional SIP techniques for adding and | |||
| removing media streams, modifying codecs, and changing the address of | removing media streams, modifying codecs, and changing the address of | |||
| the recipient of the media streams, can be applied. Similarly, the | the recipient of the media streams, can be applied. Similarly, the | |||
| application can directly authenticate itself to the user through | application can directly authenticate itself to the user through S/ | |||
| S/MIME, since it is the peer UA in the dialog. | MIME, since it is the peer UA in the dialog. | |||
| 6.2 Intermediary Applications | 6.2 Intermediary Applications | |||
| Intermediary applications are, at the same time, more common than | Intermediary applications are, at the same time, more common than | |||
| originating/terminating applications, and more complex. Intermediary | originating/terminating applications, and more complex. Intermediary | |||
| applications are applications that are neither the actual caller nor | applications are applications that are neither the actual caller nor | |||
| called party. Rather, they represent a "third party" that wishes to | called party. Rather, they represent a "third party" that wishes to | |||
| interact with the user. The classic example is the ubiquitous pre- | interact with the user. The classic example is the ubiquitous | |||
| paid calling card application. | pre-paid calling card application. | |||
| In order for the intermediary application to add a client remote user | In order for the intermediary application to add a client remote user | |||
| interface, it needs to manipulate the media streams of the user agent | interface, it needs to manipulate the media streams of the user agent | |||
| to terminate on that user interface. This also introduces a | to terminate on that user interface. This also introduces a | |||
| fundamental feature interaction issue. Since the intermediary | fundamental feature interaction issue. Since the intermediary | |||
| application is not an actual participant in the call, how does the | application is not an actual participant in the call, how does the | |||
| user interact with the intermediary application, and its actual peer | user interact with the intermediary application, and its actual peer | |||
| in the dialog, at the same time? This is discussed in more detail in | in the dialog, at the same time? This is discussed in more detail in | |||
| Section 7. In fact, the choice about how this problem is solved | Section 7. | |||
| completely determines the architecture of the application. | ||||
| 7. Inter-Application Feature Interaction | ||||
| 7 Inter-Application Feature Interaction | ||||
| The inter-application feature interaction problem is inherent to | The inter-application feature interaction problem is inherent to | |||
| stimulus signaling. Whenever there are multiple applications, there | stimulus signaling. Whenever there are multiple applications, there | |||
| are multiple user interfaces. When the user provides an input, to | are multiple user interfaces. When the user provides an input, to | |||
| which user interface is the input destined? That question is the | which user interface is the input destined? That question is the | |||
| essence of the inter-application feature interaction problem. | essence of the inter-application feature interaction problem. | |||
| Inter-application feature interaction is not an easy problem to | Inter-application feature interaction is not an easy problem to | |||
| resolve. For now, we consider separately the issues for client-local | resolve. For now, we consider separately the issues for client-local | |||
| and client-remote user interface components. | and client-remote user interface components. | |||
| skipping to change at page 19, line 50 ¶ | skipping to change at page 22, line 4 ¶ | |||
| clear to which application the user input is targeted. | clear to which application the user input is targeted. | |||
| As another example, consider the same two applications, but on a | As another example, consider the same two applications, but on a | |||
| "smart phone" that has a set of buttons, and next to each button, an | "smart phone" that has a set of buttons, and next to each button, an | |||
| LCD display that can provide the user with an option. This user | LCD display that can provide the user with an option. This user | |||
| interface can be represented using the Wireless Markup Language | interface can be represented using the Wireless Markup Language | |||
| (WML). | (WML). | |||
| The phone would allocate some number of buttons to each application. | The phone would allocate some number of buttons to each application. | |||
| The prepaid calling card would get one button for its "hangup" | The prepaid calling card would get one button for its "hangup" | |||
| command, and the recording application would get one for its | command, and the recording application would get one for its "start/ | |||
| "start/stop" command. The user can easily determine which application | stop" command. The user can easily determine which application to | |||
| to interact with by pressing the appropriate button. Pressing a | interact with by pressing the appropriate button. Pressing a button | |||
| button determines focus and provides user input, both at the same | determines focus and provides user input, both at the same time. | |||
| time. | ||||
| Unfortunately, not all devices will have these advanced displays. A | Unfortunately, not all devices will have these advanced displays. A | |||
| PSTN gateway, or a basic IP telephone, may only have a 12-key keypad. | PSTN gateway, or a basic IP telephone, may only have a 12-key keypad. | |||
| The user interfaces for these devices are provided through the Keypad | The user interfaces for these devices are provided through the Keypad | |||
| Markup Language (KPML). Considering once again the feature | Markup Language (KPML). Considering once again the feature | |||
| interaction case above, the pre-paid calling card application and the | interaction case above, the pre-paid calling card application and the | |||
| call recording application would both pass a KPML document to the | call recording application would both pass a KPML document to the | |||
| device. When the user presses a button on the keypad, to which | device. When the user presses a button on the keypad, to which | |||
| document does the input apply? The user interface does not allow the | document does the input apply? The user interface does not allow the | |||
| user to select. A user interface where the user cannot provide focus | user to select. A user interface where the user cannot provide focus | |||
| is called a focusless user interface. This is quite a hard problem to | is called a focusless user interface. This is quite a hard problem to | |||
| solve. This framework does not make any explicit normative | solve. This framework does not make any explicit normative | |||
| recommendation, but concludes that the best option is to send the | recommendation, but concludes that the best option is to send the | |||
| input to both user interfaces. This is a sensible choice by analogy - | input to both user interfaces unless the markup in one interface has | |||
| it's exactly what the existing circuit switched telephone network will | indicated that it should be suppressed from others. This is a | |||
| do. It is an explicit non-goal to provide a better mechanism for | sensible choice by analogy - it's exactly what the existing circuit | |||
| feature interaction resolution than the PSTN on devices which have | switched telephone network will do. It is an explicit non-goal to | |||
| the same user interface as they do on the PSTN. Devices with better | provide a better mechanism for feature interaction resolution than | |||
| displays, such as PCs or screen phones, can benefit from the | the PSTN on devices which have the same user interface as they do on | |||
| capabilities of this framework, allowing the user to determine which | the PSTN. Devices with better displays, such as PCs or screen phones, | |||
| application they are interacting with. | can benefit from the capabilities of this framework, allowing the | |||
| user to determine which application they are interacting with. | ||||
| Indeed, when a user provides input on a focusless device, the input | Indeed, when a user provides input on a focusless device, the input | |||
| must be passed to all client local user interfaces, AND all client | must be passed to all client local user interfaces, AND all client | |||
| remote user interfaces. In the case of KPML, key events are passed to | remote user interfaces, unless the markup tells the UI to suppress | |||
| remote user interfaces by encoding them in RFC 2833 [7]. Of course, | the media. In the case of KPML, key events are passed to remote user | |||
| since a client cannot determine if a media stream terminates in a | interfaces by encoding them in RFC 2833 [7]. Of course, since a | |||
| remote user interface or not, these key events are passed in all | client cannot determine if a media stream terminates in a remote user | |||
| audio media streams. | interface or not, these key events are passed in all audio media | |||
| streams unless the "Q" digit is used to suppress. | ||||
| 7.2 Client-Remote UI | 7.2 Client-Remote UI | |||
| When the user interfaces run remotely, the determination of focus can | When the user interfaces run remotely, the determination of focus can | |||
| be much, much harder. There are three architectures supported in this | be much, much harder. There are many architectures that can be | |||
| framework for determining focus. The first is a centralized server | deployed to handle the interaction. None are ideal. However, all are | |||
| model, the second is a pipe-and-filter model, and the third is a | beyond the scope of this specification. | |||
| client model. | ||||
| 7.2.1 Centralized Server | ||||
| One approach to resolving the feature interaction is to deploy a | ||||
| centralized server whose goal is to do just that. The user sends a | ||||
| single copy of their media to this server, and the server is the sole | ||||
| source of media towards the user. Each application that wishes to | ||||
| interact with the user does so using a client local user interface. | ||||
| However, the user interface is not instantiated on the client; it's | ||||
| instantiated on this central server. The central server is presumed | ||||
| to know enough about each application so that it can do a good job of | ||||
| determining how media should be passed to each user interface | ||||
| requested by each application. This is shown pictorially in Figure 2. | ||||
| This model has minimal impact on the client, but it only works well | ||||
| in a controlled environment where the entire set of applications is | ||||
| known ahead of time. | ||||
| 7.2.2 Pipe-and-Filter | ||||
| In order to resolve the interaction, each application acts as a B2BUA | ||||
| and as a media relay. This is shown in Figure 3. Each application | ||||
| takes its media from the "previous hop", which will be an end-user or | ||||
| another B2BUA application, and passes some or all of it on to the | ||||
| "next hop". Each application can pick off any media input it feels is | ||||
| relevant to its operation, passing the result off to the next hop. | ||||
| Furthermore, it can inject media in each direction as it so chooses. | ||||
| Conceptually, each application pipes the media it receives to the | ||||
| next hop, and can filter it appropriately before sending it on. Thus | ||||
| the name, pipe-and-filter. | ||||
| The pipe-and-filter model describes the resolution of focus as | ||||
| provided in the existing circuit-switched telephony network. | ||||
| Of course, it is not strictly necessary for the application to always | ||||
| be a focal point for media. The application can allow the media to | ||||
| pass directly between participants when the application has no media | ||||
| to present to the user. When the application does have media to | ||||
| present to the user, it can execute a re-INVITE to move the media | ||||
| streams to a central point of control. | ||||
| An example of this is shown in Figure 4. In this example, there are | ||||
| two applications - a prepaid calling card application and a call | ||||
| recording application. The user makes a call to the prepaid number | ||||
| (1). The prepaid application acts as a UAS, answering the INVITE (2- | ||||
| 3). It prompts the user to enter their calling card, PIN, and | ||||
| destination number (4). Once the user has done that, the prepaid | ||||
| application makes a call towards the destination number (5). This | ||||
| passes through the recording application, which acts as a B2BUA with | ||||
| media (i.e., it will also be a media intermediary), and forwards the | ||||
| INVITE to the called party (6). The called party answers (7), and the | ||||
| 200 OKs and ACKs are propagated normally (8-10). At this point, both | ||||
| the prepaid application and the call recording application are B2BUA, | ||||
| so that the media flows between the caller and the prepaid app (11), | ||||
| then to the call recording app (12), and then to the called party | ||||
| (13). | ||||
| However, once the call is established, the prepaid calling card | ||||
| application does not really wish to remain on the media path. All it | ||||
| wants is to wait for the long-pound which the caller uses to signal | ||||
| the end of the call. To do that, it uses a re-INVITE (14) to both | ||||
| remove itself from the media path, and to instantiate a client-local | ||||
| user interface, using KPML, into the calling UA. That INVITE contains | ||||
| no SDP, as it uses flow I from the third party call control | ||||
| specification [10]. The 200 OK from the caller contains its SDP (15), | ||||
| which is passed from the prepaid application to the call recording | ||||
| application (16). Since the call recording application is a B2BUA, it | ||||
| modifies the SDP to keep itself on the media path, passing that SDP | ||||
| to the called party (17). The called party answers with its updated | ||||
| SDP (18), which is passed to the call recording application, modified | ||||
| by it, and passed to the prepaid application (19). The prepaid | ||||
| application passes this SDP to the caller in an ACK (22), and then | ||||
| generates an ACK back towards the call recording application (20-21). | ||||
| Now, media flows from the caller to the call recording application | ||||
| (23), and from there, towards the called party (24). | ||||
| At some point later, the caller presses the long pound. This is | ||||
| passed to the KPML document, which has a single rule waiting for that | ||||
| sequence. The result is passed to the prepaid calling card | ||||
| application (25). The calling card application now knows that it | ||||
| needs to terminate the call with the called party. So, it sends a BYE | ||||
| (27), which is propagated normally (28-30). Now, the prepaid | ||||
| application needs to prompt the user for the next number. To do that, | ||||
| it needs to re-establish a media connection to it, in order to | ||||
| execute its client-remote user interface. To do that, it uses a re- | ||||
| INVITE (31-33), connecting the application to the caller (34). | ||||
| 7.2.3 Client Resolution | ||||
| Having the client resolve the interaction represents a fundamentally | ||||
| different way of thinking about intermediary applications. | ||||
| Instead of having intermediary applications be a B2BUA just to insert | ||||
| themselves into the media stream, they are implemented as a UA (i.e., | ||||
| not back-to-back). Each application is a separate UA, and as such, | ||||
| will create and maintain a separate dialog with the user that it | ||||
| wishes to interact with. How does the user handle this multiplicity | ||||
| of dialogs? Simply put, it acts like a focus. A focus, as defined in | ||||
| the SIP conferencing framework [3], is a SIP element that terminates | ||||
| multiple SIP dialogs, each of which represents a participant in the | ||||
| conference. Effectively, the conferencing framework itself provides | ||||
| +-+ +-+ | ||||
| |A| |A| | ||||
| |p| |p| | ||||
| |p| |p| | ||||
| |1| |2| | ||||
| | | | | | ||||
| |U| |U| | ||||
| |I| |I| | ||||
| +-+ +-+ | ||||
| +---------+ +------+ +------+ | ||||
| | | | | | | | ||||
| | Central |........>| App1 |..........>| App2 | | ||||
| | Server | | | | | | ||||
| | |+++ +------+ +------+ | ||||
| +---------+** ++++ . | ||||
| ^ + * **** ++++ . | ||||
| . + * *** +++++ . | ||||
| . + * **** ++++ . | ||||
| . + * *** ++++ . | ||||
| . + * **** ++++ . | ||||
| . + * *** +++ V | ||||
| +---+--+ **** +------+ | ||||
| | | ** | | | ||||
| |Client| |Callee| | ||||
| | | | | | ||||
| +------+ +------+ | ||||
| +++++++ RTP Path | ||||
| ******* SIP Dialog | ||||
| ....... SIP INVITE Path | ||||
| Figure 2: Centralized Server Resolution | ||||
| +--------+ +--------+ | ||||
| | |+++++++++ | | | ||||
| | App1 |********* | App2 | | ||||
| | |........> | | | ||||
| +--------+ +--------+ | ||||
| ^ * + . * + | ||||
| . * + . * + | ||||
| . * + . * + | ||||
| . * + . * + | ||||
| . * + . * + | ||||
| . * + . * + | ||||
| . * + . * + | ||||
| . * + . * + | ||||
| . * + . * + | ||||
| . * + . * + | ||||
| * + V * + | ||||
| +--------+ +--------+ | ||||
| | | | | | ||||
| | Caller | | Callee | | ||||
| | | | | | ||||
| +--------+ +--------+ | ||||
| +++++++ RTP Path | ||||
| ******* SIP Dialog | ||||
| ....... SIP INVITE Path | ||||
| Figure 3: Pipe-and-Filter Model | ||||
| the foundation upon which client resolution of multiple applications | ||||
| will take place. | ||||
| Each application has particular requirements on how it would like its | ||||
| media stream treated in relation to the other media streams that the | ||||
| focus may be managing. As an example, a prepaid calling card | ||||
| application will generate media towards the client, in order to | ||||
| inform them that they are running out of time in the call. The | ||||
| Caller Prepaid App Recorder App Callee | ||||
| |(1) INVITE | | | | ||||
| |--------------->| | | | ||||
| |(2) 200 OK | | | | ||||
| |<---------------| | | | ||||
| |(3) ACK | | | | ||||
| |--------------->| | | | ||||
| |(4) RTP | | | | ||||
| |collect PIN | | | | ||||
| |and number | | | | ||||
| |................| | | | ||||
| | |(5) INVITE | | | ||||
| | |--------------->| | | ||||
| | | |(6) INVITE | | ||||
| | | |--------------->| | ||||
| | | |(7) 200 OK | | ||||
| | | |<---------------| | ||||
| | | |(8) ACK | | ||||
| | | |--------------->| | ||||
| | |(9) 200 OK | | | ||||
| | |<---------------| | | ||||
| | |(10) ACK | | | ||||
| | |--------------->| | | ||||
| |(11) RTP | | | | ||||
| |................| | | | ||||
| | |(12) RTP | | | ||||
| | |................| | | ||||
| | | |(13) RTP | | ||||
| | | |................| | ||||
| |(14) INVITE | | | | ||||
| |no SDP | | | | ||||
| |KPML | | | | ||||
| |<---------------| | | | ||||
| |(15) 200 OK | | | | ||||
| |SDP1 | | | | ||||
| |--------------->| | | | ||||
| | |(16) INVITE | | | ||||
| | |SDP1 | | | ||||
| | |--------------->| | | ||||
| | | |(17) INVITE | | ||||
| | | |SDP2 | | ||||
| | | |--------------->| | ||||
| | | |(18) 200 OK | | ||||
| | | |SDP3 | | ||||
| | | |<---------------| | ||||
| | |(19) 200 OK | | | ||||
| | |SDP4 | | | ||||
| | |<---------------| | | ||||
| | |(20) ACK | | | ||||
| | |--------------->| | | ||||
| | | |(21) ACK | | ||||
| | | |--------------->| | ||||
| |(22) ACK | | | | ||||
| |SDP4 | | | | ||||
| |<---------------| | | | ||||
| |(23) RTP | | | | ||||
| |.................................| | | ||||
| | | |(24) RTP | | ||||
| | | |................| | ||||
| |Hit # | | | | ||||
| |(25) HTTP POST | | | | ||||
| |--------------->| | | | ||||
| |(26) 200 OK | | | | ||||
| |<---------------| | | | ||||
| | |(27) BYE | | | ||||
| | |--------------->| | | ||||
| | | |(28) BYE | | ||||
| | | |--------------->| | ||||
| | | |(29) 200 OK | | ||||
| | | |<---------------| | ||||
| | |(30) 200 OK | | | ||||
| | |<---------------| | | ||||
| |(31) INVITE | | | | ||||
| |<---------------| | | | ||||
| |(32) 200 OK | | | | ||||
| |--------------->| | | | ||||
| |(33) ACK | | | | ||||
| |<---------------| | | | ||||
| |(34) RTP | | | | ||||
| |................| | | | ||||
| Figure 4: Pre-Paid Application with Pipe-and-Filter | ||||
| application would like this announcement to be spoken more loudly | ||||
| than the media from the other participants in the call (which is | ||||
| usually just the other party in the call, but could include other | ||||
| applications too!). Furthermore, the prepaid calling card application | ||||
| would like to receive media from just the calling user, not from any | ||||
| other applications or from the other participant in the call. To | ||||
| implement this, the application uses the media policy control | ||||
| protocol [3]. This protocol allows a participant in a conference to | ||||
| inform the focus about its desired policies for media handling. Each | ||||
| application would act as a client of this protocol, passing its | ||||
| request to the media policy server, which actually runs on the end | ||||
| user device. | ||||
| The media policy server in the end user device would reconcile the | ||||
| various requests, and generate the appropriate media streams towards | ||||
| each application, and towards the other user in the call. Indeed, the | ||||
| media policy server can reconcile the requests in any way it likes, | ||||
| so long as it has sufficient information about what each application | ||||
| wants to do. When the user device has a powerful user interface, the | ||||
| user themselves can be asked to select which application their media | ||||
| is targeted to. Effectively, the client determines the application | ||||
| focus, just as in the client-local user interface case (Section 7.1). | ||||
| Figure 5 depicts this basic model pictorially. The calling device | ||||
| makes an initial INVITE to set up a basic call with the called party. | ||||
| This INVITE passes through two proxies, both of which kick off | ||||
| applications (app1 and app2) as the request is proxied towards the | ||||
| called party. The result is a single dialog setup between the caller | ||||
| and called party (dialog C). However, the INVITE from the caller | ||||
| indicated that the device is capable of acting as a focus. How did it | ||||
| do that? It did so by indicating support for the SIP Join extension | ||||
| [11] which allows a UA to request to be conferenced into an existing | ||||
| dialog. As such, both app1 and app2, acting as a pure UAC, generate | ||||
| an INVITE towards this focus, with a Join header requesting to be | ||||
| added to a conference which includes the original dialog. The result | ||||
| is two additional dialogs, dialog A and dialog B respectively, which | ||||
| join the original dialog in their connection to a focus co-resident | ||||
| with the caller. Both app1 and app2 use the media policy control | ||||
| protocol to interact with the media policy server co-resident with | ||||
| the user device (interaction not shown). This would require the | ||||
| caller to have indicated that it supports a media policy control | ||||
| server. | ||||
| REQ 11: There must be a way for a UA to indicate that it | ||||
| supports a media policy server function. | ||||
| In this model, there may be a media stream from the called party, | ||||
| app1, and app2, towards the mixer present in the calling UA. This | ||||
| "may" is important. In many cases, each application is not really | ||||
| actively generating media towards the user. It may only need to | ||||
| sporadically interact with the user, and during those times, the | ||||
| desired effect is for media from other applications, and the peer | ||||
| user, to be suppressed. Therefore, a client can support this model of | ||||
| resolution without ever needing to actually mix any media! | ||||
| Interestingly, this model for resolving the interaction problem does | ||||
| not introduce any new requirements into SIP. The existing | ||||
| conferencing framework and its associated requirements provide all | ||||
| the tools that are needed. For example, the framework will allow an | ||||
| application to initiate a new dialog towards the endpoint focus, | ||||
| allowing it to join the call without "ringing" the phone again. | ||||
| Figure 6 shows a call flow for the example scenario of Section 7.2.2, | ||||
| but using the client resolution architecture. The caller sends out an | ||||
| initial INVITE to the prepaid application (1). This INVITE contains a | ||||
| Supported header indicating the ability to receive INVITE requests | ||||
| with Join headers. It also indicates that the UA supports a media | ||||
| policy control server. This arrives at the pre-paid application. The | ||||
| pre-paid application generates a 183 to the initial INVITE (2). Then, | ||||
| it sends a brand new INVITE request (i.e., not a re-INVITE, and not | ||||
| with the same dialog identifiers as the original INVITE) towards the | ||||
| caller (3). This INVITE has a Join header containing the dialog | ||||
| identifiers from the 183. This is received by the caller. The caller | ||||
| mutates into a focus [3], and generates a 200 OK to the INVITE (4). | ||||
| The Contact header field in this 200 OK contains the conference URI. | ||||
| Effectively, the caller is now hosting a conference that has two | ||||
| dialogs - one towards the prepaid application, and the other, an | ||||
| early dialog. The prepaid application uses the media policy control | ||||
| protocol, and informs the caller that it wishes to be the sole source | ||||
| and sink of media (6). This media policy request could be presented | ||||
| to the user, informing them that the prepaid calling card application | ||||
| is now in focus. The application prompts the user for their calling | ||||
| card number, their PIN, and the destination number. Once collected, | ||||
| the prepaid calling card application acts as a B2BUA on the original | ||||
| INVITE request, and forwards it to the call recording application | ||||
| (8). Note that the prepaid application is a B2BUA on this dialog | ||||
| because it needs to hang up the call. It does not act as a B2BUA with | ||||
| media on this dialog; that is, it does not touch the SDP. | ||||
| The forwarded INVITE is received by the call recording application. | ||||
| At this point, it just proxies the request towards the called party | ||||
| (9). It is not a B2BUA on this dialog, although it does record-route. | ||||
| The called party receives the INVITE, and answers with a 200 OK (10). | ||||
| This is propagated to the call recording application, which carefully | ||||
| +------+ +------+ | ||||
| | | 2 | | | ||||
| > | App1 | .............>| App2 | | ||||
| . | | | | . | ||||
| . +------+ +------+ . | ||||
| . * ** . | ||||
| . ** *** . | ||||
| . * **** . | ||||
| . *A *** . | ||||
| 1. ** *** . | ||||
| . * ***B . | ||||
| . ** *** .3 | ||||
| . * **** . | ||||
| . * *** . | ||||
| . ** *** . | ||||
| +----*----**---------------+ . | ||||
| | +----------+ | . | ||||
| | | Endpoint | **** | . | ||||
| | | Focus | ******* | . | ||||
| | +----------+ ******* . | ||||
| | * +-----+ +--------+| ******* V | ||||
| | * |mixer| | Media || C******* +--------+ | ||||
| | * +-----+ | Policy || ****| | | ||||
| | +------+ | Server || |+------+| | ||||
| | | User | +--------+| || User || | ||||
| | +------+ | |+------+| | ||||
| +--------------------------+ +--------+ | ||||
| Calling Device Called Device | ||||
| ........ Path of initial SIP INVITE | ||||
| ******** SIP Dialog | ||||
| Figure 5: Architecture for Client Resolution | ||||
| Caller Prepaid App Recorder App Callee | ||||
| |(1) INVITE | | | | ||||
| |--------------->| | | | ||||
| |(2) 183 | | | | ||||
| |<---------------| | | | ||||
| |(3) INVITE | | | | ||||
| |Join | | | | ||||
| |<---------------| | | | ||||
| |(4) 200 OK | | | | ||||
| |--------------->| | | | ||||
| |(5) ACK | | | | ||||
| |<---------------| | | | ||||
| |(6) MS-CTRL | | | | ||||
| |just me | | | | ||||
| |<---------------| | | | ||||
| |(7) RTP | | | | ||||
| |collect PIN | | | | ||||
| |and number | | | | ||||
| |................| | | | ||||
| | |(8) INVITE | | | ||||
| | |--------------->| | | ||||
| | | |(9) INVITE | | ||||
| | | |--------------->| | ||||
| | | |(10) 200 OK | | ||||
| | | |<---------------| | ||||
| | |(11) 200 OK | | | ||||
| | |<---------------| | | ||||
| |(12) 200 OK | | | | ||||
| |<---------------| | | | ||||
| |(13) ACK | | | | ||||
| |--------------->| | | | ||||
| | |(14) ACK | | | ||||
| | |--------------->| | | ||||
| | | |(15) ACK | | ||||
| | | |--------------->| | ||||
| |(16) BYE | | | | ||||
| |<---------------| | | | ||||
| |(17) 200 OK | | | | ||||
| |--------------->| | | | ||||
| |(18) INVITE | | | | ||||
| |Join,no media | | | | ||||
| |KPML | | | | ||||
| |<---------------| | | | ||||
| |(19) 200 OK | | | | ||||
| |--------------->| | | | ||||
| |(20) ACK | | | | ||||
| |<---------------| | | | ||||
| |(21) INVITE | | | | ||||
| |Join | | | | ||||
| |<--------------------------------| | | ||||
| |(22) 200 OK | | | | ||||
| |-------------------------------->| | | ||||
| |(23) ACK | | | | ||||
| |<--------------------------------| | | ||||
| |(24) MS-CTRL | | | | ||||
| |fork to me | | | | ||||
| |<--------------------------------| | | ||||
| |Hits # | | | | ||||
| |(25) HTTP POST | | | | ||||
| |--------------->| | | | ||||
| |(26) 200 OK | | | | ||||
| |<---------------| | | | ||||
| | |(27) BYE | | | ||||
| | |--------------->| | | ||||
| | | |(28) BYE | | ||||
| | | |--------------->| | ||||
| | | |(29) 200 OK | | ||||
| | | |<---------------| | ||||
| | |(30) 200 OK | | | ||||
| | |<---------------| | | ||||
| |(31) BYE | | | | ||||
| |<--------------------------------| | | ||||
| |(32) 200 OK | | | | ||||
| |-------------------------------->| | | ||||
| |(33) INVITE | | | | ||||
| |enable | | | | ||||
| |media | | | | ||||
| |<---------------| | | | ||||
| |(34) 200 OK | | | | ||||
| |--------------->| | | | ||||
| |(35) ACK | | | | ||||
| |<---------------| | | | ||||
| |(36) MS-CTRL | | | | ||||
| |just me | | | | ||||
| |<---------------| | | | ||||
| |(37) RTP | | | | ||||
| |................| | | | ||||
| Figure 6: Prepaid Application with Client Resolution | ||||
| notes the dialog identifier. This 200 OK is passed to the prepaid | ||||
| application (11), which also notes the dialog identifier. The 200 OK | ||||
| is passed towards the caller (12). The ACK is propagated back towards | ||||
| the called party normally (13-15). The 200 OK will have the effect of | ||||
| terminating the early dialog that was established by the pre-paid | ||||
| calling card application. This leaves the caller with a hosted | ||||
| conference with itself, and the pre-paid application as members, | ||||
| along with a new dialog (outside of the conference) created from the | ||||
| 200 OK. | ||||
| Knowing this is the case, the prepaid calling card application | ||||
| terminates its previous dialog with the caller (16-17). This dialog | ||||
| is not useful any more, since it is not joined with the dialog which | ||||
| was actually created for the call. However, the prepaid calling card | ||||
| application would like to be involved in the successful dialog. For | ||||
| now, it doesn't need media, but it wishes to install a client-local | ||||
| user interface, in KPML, to watch for the long pound. So, it sends an | ||||
| INVITE with no media, with a Join header containing the dialog | ||||
| identifier for the established call. The INVITE also contains a KPML | ||||
| document (18). This INVITE completes successfully (19-20). | ||||
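As a rough illustration of steps 18 and 21, the Join header value can be assembled from the dialog identifiers that each application noted from the 200 OK. The sketch below assumes the header syntax of the SIP Join header specification [11] (a Call-ID followed by to-tag and from-tag parameters). The function name and the sample values are hypothetical, and which tag counts as "to" or "from" depends on the perspective defined in that specification.

```python
def build_join_header(call_id: str, to_tag: str, from_tag: str) -> str:
    """Return a Join header line naming the dialog to be joined.

    Assumption: the three values were copied from the 200 OK that
    established the call (messages 10 and 11 in Figure 6).
    """
    fields = (("call_id", call_id), ("to_tag", to_tag), ("from_tag", from_tag))
    for name, value in fields:
        # Reject empty values and characters that would break the header line.
        if not value or any(c in value for c in " ;\r\n"):
            raise ValueError(f"invalid {name}: {value!r}")
    return f"Join: {call_id};to-tag={to_tag};from-tag={from_tag}"


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    print(build_join_header("12adf2f34456gs5", "12345", "54321"))
```

The prepaid application and the recording application would each send their own INVITE carrying such a header, since both noted the same dialog identifier.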
| Now, the call recording application needs to receive a copy of the | ||||
| media stream, in order to record it. To do that, it also generates an | ||||
| INVITE towards the caller (21), with a Join header containing the | ||||
| dialog identifiers from message 10. The INVITE indicates a receive | ||||
| only media stream. This dialog completes successfully (22-23). Now, | ||||
| the caller is hosting a conference which contains itself, the prepaid | ||||
| calling card application (which is neither sending nor receiving media), | ||||
| the recording application (which is receiving media), and the called | ||||
| party (which is sending and receiving media). The call recording | ||||
| application instructs the media policy server in the UA (24) that it | ||||
| would like to receive a copy of the media, including that received | ||||
| from the called party. Note that there is no need for endpoint mixing | ||||
| to support this conference. | ||||
| The caller has their conversation. Eventually, they hit the long | ||||
| pound to hang up. This results in an HTTP POST to the prepaid | ||||
| application, based on the rules in the KPML (25). The prepaid calling | ||||
| card application sends a BYE towards the recording application (27). | ||||
| The recording application proxies it (28), and it completes normally | ||||
| (29-30). Now, recall that the call recording application was actually | ||||
| a combination of a proxy (for the original dialog), and a pure UA (to | ||||
| record the media stream). Now that the call is over, it terminates | ||||
| its dialog with the caller (31-32), and it is now out of the loop. | ||||
| The prepaid calling card would now like to communicate with the | ||||
| caller. It already has a dialog active with it. So, it merely | ||||
| generates a re-INVITE on that dialog (33), adding media streams. This | ||||
| dialog completes successfully (34-35). Now, the pre-paid application | ||||
| uses the media policy control protocol to tell the caller that they | ||||
| are the only ones that should be sending or receiving a media stream | ||||
| (36). The prepaid application can then prompt for the next number. | ||||
| 7.2.3 Comparison | ||||
| There are important differences between the three models. Each has | ||||
| pros and cons. We generally compare only the client and | ||||
| pipe-and-filter models; the centralized server model is not generally | ||||
| applicable since it assumes centralized coordination of applications. | ||||
| The client-based model in Section 7.2.2 has many benefits. First, it has excellent | ||||
| security properties. Because each application has a direct dialog | ||||
| with the user, and that dialog manages media streams directly between | ||||
| the user and each application, the existing SIP security tools can be | ||||
| directly used. S/MIME and potentially TLS (if there are no | ||||
| intervening proxies between each application and the user device) can | ||||
| provide for authentication services. The client device can know the | ||||
| complete set of applications it is interacting with, since each one | ||||
| can authenticate directly with the UA (and vice versa). In the | ||||
| pipe-and-filter model, there is a single dialog between the user and | ||||
| their "first" application. Therefore, the user cannot directly | ||||
| authenticate each application, and vice versa. | ||||
| Similarly, each media stream can be properly secured using SRTP [12]. | ||||
| Because each application is a UA, and not a B2BUA, SRTP key exchanges | ||||
| (using MIKEY, for example [13]) are done directly with the | ||||
| application to which the media is being sent. In the pipe-and-filter | ||||
| model, the applications are the terminating point of the signaling, | ||||
| but may not even touch the media stream (once again, consider the | ||||
| pre-paid calling card application). Such a configuration might | ||||
| preclude the use of SRTP, since the intermediary application would | ||||
| appear as a man-in-the-middle attacker! | ||||
| B2BUAs also have well understood interactions with end-to-end | ||||
| encryption. If the caller should encrypt their SDP, B2BUA | ||||
| applications will not be able to manipulate it, and so the | ||||
| pipe-and-filter model will simply fail. However, the endpoint-based | ||||
| model of Section 7.2.2 still works in the presence of end-to-end | ||||
| encryption of SDP. This is, of course, because there are no B2BUAs. | ||||
| That leads to another benefit - feature transparency. B2BUAs can | ||||
| interfere with the operations of features when messages are | ||||
| propagated through them. This problem is completely eliminated in the | ||||
| client-based architecture of Section 7.2.2. | ||||
| There is another interesting benefit of the client-based architecture | ||||
| - firewall traversal. In the pipe-and-filter architecture, | ||||
| many applications will not need to always be on the | ||||
| media path. The applications will use re-INVITEs to move the media | ||||
| streams to themselves when needed, and then move them back when done. | ||||
| The result of this, as far as the user is concerned, is that a single | ||||
| media stream will, at times, appear to be coming from different | ||||
| source IP addresses. This means that a SIP-enabled firewall (or one | ||||
| controlled by MIDCOM [14]) will need to open a "cone" for the media | ||||
| stream - allowing it to go to the user, but come from any source | ||||
| address. Such cones are less secure, and less desirable, than a | ||||
| pinhole. With the client-based architecture of Section 7.2.2, a | ||||
| SIP-enabled firewall can open a cone initially, and when the media | ||||
| arrives from the application, close the cone to a pinhole by | ||||
| restricting media packets to always have the same source IP address | ||||
| from then on. This restriction is possible because media on a | ||||
| particular dialog comes from a single source - the application or the | ||||
| user, depending on which dialog. The source of the media does not | ||||
| change within a single dialog, as it does in the pipe-and-filter model. | ||||
| TODO: A picture and some more words are needed here to | ||||
| explain this. | ||||
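The cone-to-pinhole behavior described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any real firewall or MIDCOM interface: the class name, its methods, and the addresses are all hypothetical. The rule starts as a cone (any source may reach the user's media port) and locks onto the source address of the first packet it admits.

```python
from typing import Optional


class MediaPinhole:
    """Hypothetical firewall rule for one inbound media stream."""

    def __init__(self, dest_ip: str, dest_port: int) -> None:
        self.dest = (dest_ip, dest_port)
        # None means the rule is still a cone: any source is accepted.
        self.locked_source: Optional[str] = None

    def admit(self, src_ip: str, dest_ip: str, dest_port: int) -> bool:
        """Return True if the packet may pass through the firewall."""
        if (dest_ip, dest_port) != self.dest:
            return False
        if self.locked_source is None:
            # First media packet: close the cone down to a pinhole.
            self.locked_source = src_ip
            return True
        # From then on, only the same source address is admitted.
        return src_ip == self.locked_source


if __name__ == "__main__":
    rule = MediaPinhole("10.0.0.5", 4000)
    print(rule.admit("192.0.2.1", "10.0.0.5", 4000))     # first packet locks the source
    print(rule.admit("198.51.100.7", "10.0.0.5", 4000))  # a different source is refused
```

Locking on the first packet only works when the media source stays fixed for the life of the dialog, which is the property the client-based architecture provides.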
| Conceptually, the client-based architecture allows for a unified view | ||||
| of applications. A SIP application that desires to instantiate a | ||||
| remote client user interface is always a normal user agent, whether | ||||
| it be a "terminating" type of application, or "intermediary" type of | ||||
| application. These two cases therefore become merged into one. | ||||
| Furthermore, the inter-application feature interactions between client | ||||
| local user interfaces and client remote user interfaces become | ||||
| unified - both become local focus determination problems. | ||||
| In addition, many of the interactions between application components | ||||
| (discussed in Section 8) are simplified because of the simple | ||||
| correlation of a dialog to a single application. | ||||
| Unfortunately, the benefits of the client-based architecture come at | ||||
| a cost of complexity. End devices need to support a focus capability, | ||||
| a media policy server function, and possibly a media mixer, although | ||||
| the latter can probably be avoided. The model also requires the | ||||
| client to construct a globally routable URI to represent its focus, | ||||
| something which is not trivial in an IP network laden with NATs and | ||||
| firewalls. | ||||
| 8 Intra Application Feature Interaction | 8. Intra Application Feature Interaction | |||
| An application can instantiate a multiplicity of user interface | An application can instantiate a multiplicity of user interface | |||
| components. For example, a single application can instantiate two | components. For example, a single application can instantiate two | |||
| separate HTML components and one WML component. Furthermore, an | separate HTML components and one WML component. Furthermore, an | |||
| application can instantiate both client local and client remote user | application can instantiate both client local and client remote user | |||
| interfaces. | interfaces. | |||
| The feature interaction issues between these components within the | The feature interaction issues between these components within the | |||
| same application are less severe. If an application has multiple | same application are less severe. If an application has multiple | |||
| client user interface components, their interaction is resolved | client user interface components, their interaction is resolved | |||
| identically to the inter-application case - through focus | identically to the inter-application case - through focus | |||
| determination. However, the problems in focusless user interfaces | determination. However, the problems in focusless user interfaces | |||
| (such as a keypad) generally won't exist, since the application can | (such as a keypad) generally won't exist, since the application can | |||
| generate user interfaces which do not overlap in their usage of an | generate user interfaces which do not overlap in their usage of an | |||
| input. | input. | |||
| The real issue is that the optimal user experience frequently | The real issue is that the optimal user experience frequently | |||
| requires some kind of coupling between the differing user interface | requires some kind of coupling between the differing user interface | |||
| components. This is a classic problem in multi-modal user interfaces, | components. This is a classic problem in multi-modal user interfaces, | |||
| such as those described by SALT [15]. As an example, consider a user | such as those described by Speech Application Language Tags (SALT). | |||
| interface where a user can either press a labeled button to make a | As an example, consider a user interface where a user can either | |||
| selection, or listen to a prompt, and speak the desired selection. | press a labeled button to make a selection, or listen to a prompt, | |||
| Ideally, when the user presses the button, the prompt should cease | and speak the desired selection. Ideally, when the user presses the | |||
| immediately, since both of them were targeted at collecting the same | button, the prompt should cease immediately, since both of them were | |||
| information in parallel. Such interactions are best handled by | targeted at collecting the same information in parallel. Such | |||
| markups which natively support such interactions, such as SALT, and | interactions are best handled by markups which natively support such | |||
| thus require no explicit support from this framework. | interactions, such as SALT, and thus require no explicit support from | |||
| this framework. | ||||
| There is, however, a very common interaction in voice-based | ||||
| applications which merits support from this framework. Many | ||||
| interactive voice response systems (IVR) allow for a user to | ||||
| "interrupt" a prompt by generating a response before the prompt | ||||
| finishes. The ideal user experience is achieved by having the prompt | ||||
| cease immediately when the user speaks the input. This is known as | ||||
| barge-in. | ||||
| In a traditional implementation of an IVR system, there would be a | ||||
| client-remote user interface, rendered in VoiceXML. VoiceXML has | ||||
| native support for barge-in. However, because the VoiceXML script is | ||||
| interpreted remotely, there is a fundamental latency between the | ||||
| client and the remote user interface. That is, when the user speaks | ||||
| or presses a key, the speech or key must be transmitted to the | ||||
| platform and interpreted, and then the VoiceXML server ceases playing | ||||
| out media. For this to be observed by the client, the last media | ||||
| packet must still travel from the VoiceXML server to the client, | ||||
| through its playout buffers, and out the speaker system. | ||||
| This framework allows for better performance. A VoiceXML user | ||||
| interface can actually delegate a component of the user interface to | ||||
| be interpreted on the client. Specifically, the collection of the | ||||
| keypad input from the user can be delegated to the client by placing | ||||
| a KPML-based user interface on the client solely for this purpose. | ||||
| KPML has a barge-in feature as well. When the barge-in option is | ||||
| selected, and user input matches a regular expression, all incoming | ||||
| media streams associated with the application are muted, and the | ||||
| playout buffers on the client are flushed. This situation persists | ||||
| until the beginning of the next talkspurt, framed by the marker bit | ||||
| in the RTP stream. | ||||
| OPEN ISSUE: Is the marker bit the right way to do this? | ||||
| In this framework, a client local user interface is bound to a | ||||
| dialog. A media stream is said to be associated with that user | ||||
| interface component if the media stream is managed on the same dialog | ||||
| the user interface component is bound to. As a result, if a KPML | ||||
| script results in a barge-in, all media streams on that dialog are | ||||
| muted until their marker bits flip. | ||||
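The client-side barge-in behavior can be sketched as follows. This is only an illustration under assumptions: the class and method names are hypothetical, one object stands for one media stream on the dialog, and the marker bit is read as the high-order bit of the second octet of the RTP fixed header [6]. Whether the marker bit is the right trigger remains the open issue noted above.

```python
from collections import deque


class BargeInStream:
    """Hypothetical playout state for one media stream on a dialog."""

    def __init__(self) -> None:
        self.playout = deque()  # packets waiting to be played
        self.muted = False

    def barge_in(self) -> None:
        """Called when the KPML script matches and barge-in is enabled."""
        self.playout.clear()  # flush the playout buffer
        self.muted = True     # drop media until the next talkspurt

    def on_rtp(self, packet: bytes) -> bool:
        """Queue a packet for playout; return False if it was dropped."""
        if len(packet) < 12:  # shorter than the RTP fixed header
            return False
        marker = bool(packet[1] & 0x80)
        if self.muted:
            if not marker:
                return False  # still inside the interrupted prompt
            self.muted = False  # marker bit set: a new talkspurt begins
        self.playout.append(packet)
        return True
```

An application with several streams on the same dialog would call barge_in() on each of them, since all media streams on that dialog are muted together.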
| A similar delegation can occur by instantiating a VoiceXML-based | ||||
| user interface in the client. That would allow barge-in to | ||||
| operate for speech driven IVR, in addition to keypad driven IVR. | ||||
| This capability can allow VoIP-based IVR applications to operate with | ||||
| zero-latency barge-in, better than today's circuit-switched IVR | ||||
| applications. This is shown in Figure 7, which demonstrates a call | ||||
| flow for this example. The caller makes an INVITE to a VoiceXML | ||||
| server (1). The VoiceXML server fetches the script to execute (2). | ||||
| The script, returned in (3), indicates that a prompt should be | ||||
| played, and if the user presses pound, to barge in. So, the VoiceXML | ||||
| server generates a KPML script that looks for pound, and sets the | ||||
| barge flag to true. This is returned in the 200 OK (4). The user is | ||||
| played the prompt, and presses pound in the middle. The KPML notes | ||||
| this, and the UA ceases playout of the prompt immediately. At the | ||||
| same time, the client generates a POST to the VoiceXML server (7). | ||||
| The VoiceXML server knows that the pound has been pressed. So, it | ||||
| fetches the next VoiceXML script (8), and extracts from it the next | ||||
| KPML script, passed in the 200 OK response to the POST from the | ||||
| client (10). | ||||
| 9 Examples | 9. Examples | |||
| TODO. | TODO. | |||
| 10 Security Considerations | 10. Security Considerations | |||
| There are many security considerations associated with this | There are many security considerations associated with this | |||
| framework. It allows applications in the network to instantiate user | framework. It allows applications in the network to instantiate user | |||
| interface components on a client device. Such instantiations need to | interface components on a client device. Such instantiations need to | |||
| be from authenticated applications, and also need to be authorized to | be from authenticated applications, and also need to be authorized to | |||
| place a UI into the client. | place a UI into the client. Indeed, the stronger requirement is | |||
| authorization. It is not so important to know the name of the | ||||
| The means by which the authentication and authorization are done | provider of the application, but rather, that the provider is | |||
| depend on the architectural model in use. A pipe-and-filter model | authorized to instantiate components. | |||
| will make it difficult for the user device to authenticate each | ||||
| application, since there is no direct dialog between them. Direct | ||||
| dialogs matter because they are required for S/MIME, which is the | ||||
| primary tool for client authentication of a server through proxies. | ||||
| However, authorization is reasonably simple. An application is | ||||
| authorized if it was on the original call path. By using a secure SIP | ||||
| URI [1], the caller can obtain this guarantee as long as it trusts | ||||
| each element on the call setup path. | ||||
| With the client-based resolution model, authentication is much | Generally, an application should be considered authorized if it was | |||
| better, as noted in Section 7.2.2, since it can be done with S/MIME. | an application that was legitimately part of the call setup path. | |||
| Authorization works identically to the pipe-and-filter model. If the | With this definition, authorization can be enforced using the sips | |||
| caller initiated the call with a secure SIP URI, an application could | URI scheme when the call is initiated. | |||
| never learn the dialog identifiers unless it was in-path. Therefore, | ||||
| an application which generates an INVITE to join a dialog created | ||||
| from a SIPS URI must have been on the call path. However, this | ||||
| application itself must use SIPS to contact the UA, in order to | ||||
| protect the confidentiality of the dialog identifiers. | ||||
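The authorization rule for the client-based model can be sketched as a simple check at the UA. This is a hedged illustration rather than a specified procedure: the type and function names are hypothetical, and the UA is assumed to keep a set of identifiers for dialogs that were created from a SIPS URI.

```python
from typing import NamedTuple, Set


class DialogId(NamedTuple):
    call_id: str
    to_tag: str
    from_tag: str


def authorize_join(sips_dialogs: Set[DialogId], join: DialogId,
                   request_scheme: str) -> bool:
    """Decide whether an INVITE with a Join header may join a dialog.

    Assumptions: sips_dialogs holds only dialogs created from a SIPS URI,
    and request_scheme is the URI scheme the joining INVITE arrived with.
    """
    # The joining application must itself use SIPS, so the dialog
    # identifiers are never exposed in the clear.
    if request_scheme.lower() != "sips":
        return False
    # Knowing the identifiers of a SIPS dialog implies the sender was
    # on the original call path.
    return join in sips_dialogs
```

A request that names an unknown dialog, or that arrives over plain SIP, would be refused under this sketch.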
| 11 Contributors | 11. Contributors | |||
| This document was produced as a result of discussions amongst the | This document was produced as a result of discussions amongst the | |||
| application interaction design team. All members of this team | application interaction design team. All members of this team | |||
| contributed significantly to the ideas embodied in this document. The | contributed significantly to the ideas embodied in this document. The | |||
| members of this team were: | members of this team were: | |||
| Eric Burger | Eric Burger | |||
| Cullen Jennings | Cullen Jennings | |||
| Robert Fairlie-Cuninghame | Robert Fairlie-Cuninghame | |||
| 12 Authors Address | Informative References | |||
| Jonathan Rosenberg | [1] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., | |||
| Caller VXML Server Web Server | Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP: | |||
| | | | | Session Initiation Protocol", RFC 3261, June 2002. | |||
| | | | | ||||
| |(1) SIP INVITE | | | ||||
| |--------------->| | | ||||
| | | | | ||||
| | | | | ||||
| | |(2) HTTP GET | | ||||
| | |--------------->| | ||||
| | | | | ||||
| | |(3) HTTP 200 OK | | ||||
| | |VXML | | ||||
| | |<---------------| | ||||
| | | | | ||||
| |(4) SIP 200 OK | | | ||||
| |KPML | | | ||||
| |<---------------| | | ||||
| | | | | ||||
| | | | | ||||
| |(5) SIP ACK | | | ||||
| |--------------->| | | ||||
| | | | | ||||
| | | | | ||||
| |(6) RTP | | | ||||
| |................| | | ||||
| | | | | ||||
| | | | | ||||
| |press # | | | ||||
| | | | | ||||
| | | | | ||||
| | | | | ||||
| |playout ends | | | ||||
| | | | | ||||
| | | | | ||||
| | | | | ||||
| |(7) HTTP POST | | | ||||
| |--------------->| | | ||||
| | | | | ||||
| | | | | ||||
| | |(8) HTTP POST | | ||||
| | |--------------->| | ||||
| | | | | ||||
| | |(9) 200 OK | | ||||
| | |VXML | | ||||
| | |<---------------| | ||||
| | | | | ||||
| |(10) 200 OK | | | ||||
| |KPML | | | ||||
| |<---------------| | | ||||
| | | | | ||||
| | | | | ||||
| | | | | ||||
| | | | | ||||
| Figure 7: Zero-Latency Barge In | [2] McGlashan, S., Lucas, B., Porter, B., Rehor, K., Burnett, D., | |||
| dynamicsoft | Carter, J., Ferrans, J. and A. Hunt, "Voice Extensible Markup | |||
| 72 Eagle Rock Avenue | Language (VoiceXML) Version 2.0", W3C CR CR-voicexml20-20030220, | |||
| First Floor | February 2003. | |||
| East Hanover, NJ 07936 | ||||
| email: jdrosen@dynamicsoft.com | ||||
| 13 Normative References | [3] Day, M., Rosenberg, J. and H. Sugano, "A Model for Presence and | |||
| Instant Messaging", RFC 2778, February 2000. | ||||
| 14 Informative References | [4] Rosenberg, J., "A Framework for Conferencing with the Session | |||
| Initiation Protocol", | ||||
| draft-ietf-sipping-conferencing-framework-00 (work in progress), | ||||
| May 2003. | ||||
| [1] J. Rosenberg, H. Schulzrinne, G. Camarillo, A. Johnston, J. | [5] Rosenberg, J., Schulzrinne, H. and P. Kyzivat, "Caller | |||
| Peterson, R. Sparks, M. Handley, and E. Schooler, "SIP: session | Preferences and Callee Capabilities for the Session Initiation | |||
| initiation protocol," RFC 3261, Internet Engineering Task Force, June | Protocol (SIP)", draft-ietf-sip-callerprefs-08 (work in | |||
| 2002. | progress), March 2003. | |||
| [2] M. Day, J. Rosenberg, and H. Sugano, "A model for presence and | [6] Schulzrinne, H., Casner, S., Frederick, R. and V. Jacobson, | |||
| instant messaging," RFC 2778, Internet Engineering Task Force, Feb. | "RTP: A Transport Protocol for Real-Time Applications", RFC | |||
| 2000. | 1889, January 1996. | |||
| [3] J. Rosenberg, "A framework for conferencing in the session | [7] Schulzrinne, H. and S. Petrack, "RTP Payload for DTMF Digits, | |||
| initiation protocol," Internet Draft, Internet Engineering Task | Telephony Tones and Telephony Signals", RFC 2833, May 2000. | |||
| Force, Oct. 2002. Work in progress. | ||||
| [4] H. Schulzrinne and J. Rosenberg, "Session initiation protocol | [8] Burger, E., "Keypad Markup Language (KPML)", | |||
| (SIP) caller preferences and callee capabilities," Internet Draft, | draft-burger-sipping-kpml-02 (work in progress), July 2003. | |||
| Internet Engineering Task Force, July 2002. Work in progress. | ||||
| [5] VoiceXML Forum, "Voice extensible markup language (VoiceXML) | [9] Dyke, J., Burger, E. and A. Spitzer, "Media Server Control | |||
| version 1.0," W3C Note NOTE-voicexml-20000505, World Wide Web | Markup Language (MSCML) and Protocol", draft-vandyke-mscml-02 | |||
| Consortium (W3C), May 2000. Available at | (work in progress), July 2003. | |||
| http://www.w3.org/TR/voicexml/. | ||||
| [6] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: a | Author's Address | |||
| transport protocol for real-time applications," RFC 1889, Internet | ||||
| Engineering Task Force, Jan. 1996. | ||||
| [7] H. Schulzrinne and S. Petrack, "RTP payload for DTMF digits, | Jonathan Rosenberg | |||
| telephony tones and telephony signals," RFC 2833, Internet | dynamicsoft | |||
| Engineering Task Force, May 2000. | 600 Lanidex Plaza | |||
| Parsippany, NJ 07054 | ||||
| US | ||||
| [8] E. Burger, "The keypad markup language (kpml)," Internet Draft, | Phone: +1 973 952-5000 | |||
| Internet Engineering Task Force, Oct. 2002. Work in progress. | EMail: jdrosen@dynamicsoft.com | |||
| URI: http://www.jdrosen.net | ||||
| [9] J. V. Dyke, E. Burger, and A. Spitzer, "Snowshore media server | Intellectual Property Statement | |||
| control markup language and protocol," Internet Draft, Internet | ||||
| Engineering Task Force, Oct. 2002. Work in progress. | ||||
| [10] J. Rosenberg, J. Peterson, H. Schulzrinne, and G. Camarillo, | The IETF takes no position regarding the validity or scope of any | |||
| "Best current practices for third party call control in the session | intellectual property or other rights that might be claimed to | |||
| initiation protocol," Internet Draft, Internet Engineering Task | pertain to the implementation or use of the technology described in | |||
| Force, June 2002. Work in progress. | this document or the extent to which any license under such rights | |||
| might or might not be available; neither does it represent that it | ||||
| has made any effort to identify any such rights. Information on the | ||||
| IETF's procedures with respect to rights in standards-track and | ||||
| standards-related documentation can be found in BCP-11. Copies of | ||||
| claims of rights made available for publication and any assurances of | ||||
| licenses to be made available, or the result of an attempt made to | ||||
| obtain a general license or permission for the use of such | ||||
| proprietary rights by implementors or users of this specification can | ||||
| be obtained from the IETF Secretariat. | ||||
| [11] R. Mahy and D. Petrie, "The session initiation protocol (sip) | The IETF invites any interested party to bring to its attention any | |||
| join header," Internet Draft, Internet Engineering Task Force, Oct. | copyrights, patents or patent applications, or other proprietary | |||
| 2002. Work in progress. | rights which may cover technology that may be required to practice | |||
| this standard. Please address the information to the IETF Executive | ||||
| Director. | ||||
| [12] M. Baugher et al. , "The secure real-time transport protocol," | Full Copyright Statement | |||
| Internet Draft, Internet Engineering Task Force, June 2002. Work in | ||||
| progress. | ||||
| [13] J. Arkko et al. , "MIKEY: Multimedia internet KEYing," Internet | Copyright (C) The Internet Society (2003). All Rights Reserved. | |||
| Draft, Internet Engineering Task Force, Aug. 2002. Work in progress. | ||||
| [14] P. Srisuresh, J. Kuthan, J. Rosenberg, A. Molitor, and A. | This document and translations of it may be copied and furnished to | |||
| Rayhan, "Middlebox communication architecture and framework," RFC | others, and derivative works that comment on or otherwise explain it | |||
| 3303, Internet Engineering Task Force, Aug. 2002. | or assist in its implementation may be prepared, copied, published | |||
| and distributed, in whole or in part, without restriction of any | ||||
| kind, provided that the above copyright notice and this paragraph are | ||||
| included on all such copies and derivative works. However, this | ||||
| document itself may not be modified in any way, such as by removing | ||||
| the copyright notice or references to the Internet Society or other | ||||
| Internet organizations, except as needed for the purpose of | ||||
| developing Internet standards in which case the procedures for | ||||
| copyrights defined in the Internet Standards process must be | ||||
| followed, or as required to translate it into languages other than | ||||
| English. | ||||
| [15] S. Forum, "Speech application language tags 1.0 specification | The limited permissions granted above are perpetual and will not be | |||
| (SALT)," salt forum recommendation, Salt Forum, July 2002. Work in | revoked by the Internet Society or its successors or assignees. | |||
| progress. | ||||
| This document and the information contained herein is provided on an | ||||
| "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING | ||||
| TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING | ||||
| BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION | ||||
| HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF | ||||
| MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. | ||||
| Acknowledgement | ||||
| Funding for the RFC Editor function is currently provided by the | ||||
| Internet Society. | ||||
| End of changes. 105 change blocks. | ||||
| 1092 lines changed or deleted | 361 lines changed or added | |||