| < draft-rosenberg-sip-app-components-00.txt | draft-rosenberg-sip-app-components-01.txt > | |||
|---|---|---|---|---|
| Internet Engineering Task Force SIP WG | Internet Engineering Task Force SIP WG | |||
| Internet Draft Rosenberg/Mataga/Schulzrinne | Internet Draft Rosenberg/Mataga/Schulzrinne | |||
| draft-rosenberg-sip-app-components-00.txt dynamicsoft/Columbia U. | draft-rosenberg-sip-app-components-01.txt dynamicsoft/Columbia U. | |||
| November 15, 2000 | March 2, 2001 | |||
| Expires: May 2001 | Expires: September 2001 | |||
| An Application Server Component Architecture for SIP | An Application Server Component Architecture for SIP | |||
| STATUS OF THIS MEMO | STATUS OF THIS MEMO | |||
| This document is an Internet-Draft and is in full conformance with | This document is an Internet-Draft and is in full conformance with | |||
| all provisions of Section 10 of RFC2026. | all provisions of Section 10 of RFC2026. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
| skipping to change at page 3, line 36 ¶ | skipping to change at page 3, line 36 ¶ | |||
| application may require special purpose hardware. This | application may require special purpose hardware. This | |||
| component can be distributed to a specialized processor, with | component can be distributed to a specialized processor, with | |||
| a normal off-the-shelf processor handling the more generic | a normal off-the-shelf processor handling the more generic | |||
| software tasks. Several of the components that we are | software tasks. Several of the components that we are | |||
| describing fit into this category (such as the TTS server). | describing fit into this category (such as the TTS server). | |||
| Sharing of resources. By decomposing a server into components, a | Sharing of resources. By decomposing a server into components, a | |||
| many-to-many interaction between them becomes possible. | many-to-many interaction between them becomes possible. | |||
| This means that one component can provide services to many | This means that one component can provide services to many | |||
| other components. This provides for sharing of resources, | other components. This provides for sharing of resources, | |||
| which ultimately results in cost reduction. | which ultimately results in capital cost reduction. | |||
| Expertise. Building a complex application requires expertise in | Expertise. Building a complex application requires expertise in | |||
| call control, media services, compression, web, speech | call control, media services, compression, web, speech | |||
| recognition, etc. It is highly unlikely that one | recognition, etc. It is highly unlikely that one | |||
| organization will have enough expertise in all of these to | organization will have enough expertise in all of these to | |||
| build them all. By decomposing an application server into | build them all. By decomposing an application server into | |||
| subpieces, organizations with expertise in one particular | subpieces, organizations with expertise in one particular | |||
| piece can build that one. The result is that the complete | piece can build that one. The result is that the complete | |||
| system can be composed of best-of-breed components. | system can be composed of best-of-breed components. | |||
| skipping to change at page 4, line 41 ¶ | skipping to change at page 4, line 41 ¶ | |||
| Typically, the MGCP interface between the two devices is fairly | Typically, the MGCP interface between the two devices is fairly | |||
| "busy"; there is a lot of messaging for complex applications. | "busy"; there is a lot of messaging for complex applications. | |||
| In this model, there is a tightly coupled relationship between the MS | In this model, there is a tightly coupled relationship between the MS | |||
| and AS. The MS cannot function without the AS, and the AS needs to | and AS. The MS cannot function without the AS, and the AS needs to | |||
| perform tight, low-level controls over the detailed operation of the | perform tight, low-level controls over the detailed operation of the | |||
| media server. | media server. | |||
| To some degree, breaking an application server into these two | To some degree, breaking an application server into these two | |||
| components represents an implementation detail of how one builds a | components represents an implementation detail of how one builds a | |||
| large, monolithic application server. It is not generally possible | large, monolithic application server. It is not generally practical | |||
| for the two components to be owned by separate providers. In fact, it | for the two components to be owned by separate providers, due to the | |||
| has yet to be shown that complete interoperability and integration is | master/slave relationship between the two. | |||
| possible with two components from different vendors, let alone | ||||
| different providers. | ||||
| This decomposition also does not provide a true separation of | This decomposition also does not provide a true separation of | |||
| function. Most applications that require media interaction (IVR, | function. Most applications that require media interaction (IVR, | |||
| credit card and debit card, etc.) have very cleanly separated media | credit card and debit card, etc.) have very cleanly separated media | |||
| phases and signaling phases. The details of the media interactions | phases and signaling phases. The details of the media interactions | |||
| are usually not important to the signaling component, and vice | ||||
| versa. As an example, consider a debit card application. The | ||||
| .................... | .................... | |||
| . . | . . | |||
| . +-------------+ . | . +-------------+ . | |||
| . | | . | . | | . | |||
| SIP . | | . | SIP . | | . | |||
| -------------+ AS | . | -------------+ AS | . | |||
| . | | . | . | | . | |||
| . | | . | . | | . | |||
| . | | . | . | | . | |||
| . +-------------+ . | . +-------------+ . | |||
| skipping to change at page 5, line 36 ¶ | skipping to change at page 5, line 36 ¶ | |||
| . | | . | . | | . | |||
| . | | . | . | | . | |||
| . +-------------+ . | . +-------------+ . | |||
| . . | . . | |||
| .................... | .................... | |||
| Complete Application | Complete Application | |||
| Server | Server | |||
| Figure 1: MGCP-based decomposition | Figure 1: MGCP-based decomposition | |||
| are usually not important to the signaling component, and vice | ||||
| versa. As an example, consider a debit card application. The | ||||
| application starts with the user making a call. As part of the call | application starts with the user making a call. As part of the call | |||
| processing, interaction is needed with the user via the media stream | processing, interaction is needed with the user via the media stream | |||
| to determine the debit card number. The precise set of menu | to determine the debit card number. The precise set of menu | |||
| operations and interactions used to obtain this number aren't | operations and interactions used to obtain this number aren't | |||
| important to the call/signaling processing piece; only the result | important to the call/signaling processing piece; only the result | |||
| (the number) is important. Once the number is returned, media | (the number) is important. Once the number is returned, media | |||
| processing ceases, and data and call processing commence. The debit | processing ceases, and data and call processing commence. The debit | |||
| card is looked up in a subscriber database, and if enough time | card is looked up in a subscriber database, and if enough time | |||
| remains, the call is completed. The signaling component monitors the | remains, the call is completed. The signaling component monitors the | |||
| call, and when the card has run out of minutes, the call is | call, and when the card has run out of minutes, the call is | |||
| skipping to change at page 6, line 30 ¶ | skipping to change at page 6, line 28 ¶ | |||
| interactions with the MS that are provided by MGCP, in addition to | interactions with the MS that are provided by MGCP, in addition to | |||
| the detailed signaling and data processing operations. The developers | the detailed signaling and data processing operations. The developers | |||
| will also need to build and manage the low level state representing | will also need to build and manage the low level state representing | |||
| the controlled entity, which can be painful. The result is longer | the controlled entity, which can be painful. The result is longer | |||
| development times, less code reuse, and slower innovation. | development times, less code reuse, and slower innovation. | |||
| It has been argued that one of the benefits of the MGCP decomposition | It has been argued that one of the benefits of the MGCP decomposition | |||
| is that it offloads the "burden" of call control from the media | is that it offloads the "burden" of call control from the media | |||
| server. However, from a complexity standpoint, the MGCP processing | server. However, from a complexity standpoint, the MGCP processing | |||
| required is probably on par with (if not more than) the simple | required is probably on par with (if not more than) the simple | |||
| amount of call control and SIP processing needed if SIP were used | amount of call control and event processing needed if SIP and | |||
| directly. | VoiceXML were used. | |||
| From a reliability perspective, an MGCP style decomposition is less | From a reliability perspective, an MGCP style decomposition is less | |||
| desirable. Since the components are strongly coupled, the system will | desirable. Since the components are strongly coupled, the system will | |||
| fail as soon as any one of the pieces fails. Failure can also be | fail as soon as any one of the pieces fails. Failure can also be | |||
| introduced because of additional network resources needed for | introduced because of additional network resources needed for | |||
| communications between the boxes. The result is that the MGCP | communications between the boxes. The result is that the MGCP | |||
| decomposition may actually increase the probability of failure, as | decomposition may actually increase the probability of failure, as | |||
| compared to no decomposition at all. | compared to no decomposition at all. | |||
| Another decomposition that has been proposed is to break a proxy into | Another decomposition that has been proposed is to break a proxy into | |||
| a routing and call control component, plus a services component. The | a routing and call control component, plus a services component. The | |||
| interface between the two is then a transactional interface for | interface between the two is then a transactional interface for | |||
| services, similar in concept to INAP, based upon state transitions | services, similar in concept to INAP, based upon state transitions | |||
| within a call model. This is another form of tight coupling, since it | within a call model. This is another form of tight coupling, since it | |||
| requires the services component to have detailed knowledge of the | requires the services component to have detailed knowledge of the | |||
| operational model of the call control component. We believe that this | operational model of the call control component. We believe that this | |||
| decomposition is limiting, for the same reasons the AS/MS | decomposition is limiting, for the same reasons the AS/MS | |||
| decomposition is limiting. | decomposition is limiting. | |||
| 4 The Decoupled Model | 4 The Decoupled Model | |||
| 4.1 Architecture | 4.1 Architecture | |||
| As a result of this, we see the master/slave decomposition as being | As a result of this, we see the master/slave decomposition as being | |||
| ideal for a single vendor to build a large system. However, this | ideal for a single vendor to build a large system. However, this | |||
| decomposition does not solve the other distribution needs we have | decomposition does not solve the other distribution needs we have | |||
| motivated above. As a result, we propose that the AS be decomposed | motivated above. As a result, we propose that the AS be decomposed | |||
| into an application component responsible for coordinating the | into an application component responsible for coordinating the | |||
| overall execution of the application (called the controller), and | overall execution of the application (called the controller), and | |||
| application server components that provide pieces of the overall | application server components that provide pieces of the overall | |||
| application. These components are only loosely coupled with the | application. These components are only loosely coupled with the | |||
| coordinating application server. The loose coupling implies that the | coordinating application server. The loose coupling implies that the | |||
| interaction between them is the same as the interaction between the | interaction between them is the same as the interaction between the | |||
| skipping to change at page 9, line 4 ¶ | skipping to change at page 8, line 50 ¶ | |||
| A prompt would be played over that session, something like "please | A prompt would be played over that session, something like "please | |||
| record your message for Joe now", and then the component takes the | record your message for Joe now", and then the component takes the | |||
| media input stream, records it, and saves it. When it is done, the | media input stream, records it, and saves it. When it is done, the | |||
| session is terminated. | session is terminated. | |||
| In some cases, the session may require a "side channel" over which | In some cases, the session may require a "side channel" over which | |||
| intermediate data is passed, needed to control the session | intermediate data is passed, needed to control the session | |||
| interactions from that point forward. IVR is the classic example. In | interactions from that point forward. IVR is the classic example. In | |||
| some cases the coordinating application server can kick off the IVR | some cases the coordinating application server can kick off the IVR | |||
| script, and then only get back the final result - a menu option, a | script, and then only get back the final result - a menu option, a | |||
| credit card number, or what have you. In other cases, the | ||||
| coordinating component may need to get intermediate results, so that | ||||
| +-----------+ | +-----------+ | |||
| | | | | | | |||
| | | | | | | |||
| | AS | | | AS | | |||
| |coordinator| | |coordinator| | |||
| | | | | | | |||
| | | | | | | |||
| +-----------+ | +-----------+ | |||
| SIP, -- \ --- | SIP, -- \ --- | |||
| RTP? -- \ ---- SIP, | RTP? -- \ ---- SIP, | |||
| skipping to change at page 10, line 4 ¶ | skipping to change at page 10, line 4 ¶ | |||
| +----------+ | | | +----------+ | | | |||
| | | | | | | | | | | |||
| | | | | | | | | | | |||
| | | | ASC | | | | | ASC | | |||
| | ASC | | | | | ASC | | | | |||
| | | | | | | | | | | |||
| | | +----------+ | | | +----------+ | |||
| +----------+ | +----------+ | |||
| Figure 2: Decoupled Architecture | Figure 2: Decoupled Architecture | |||
| credit card number, or what have you. In other cases, the | ||||
| coordinating component may need to get intermediate results, so that | ||||
| it can guide the operation of the IVR moving forward. This requires a | it can guide the operation of the IVR moving forward. This requires a | |||
| companion control channel that provides data output from the | companion control channel that provides data output from the | |||
| component server back to the client, and then returns further high | component server back to the client, and then returns further high | |||
| level instructions from the client back to the server. | level instructions from the client back to the server. | |||
| There is a thin line in some cases between this control channel and | There is a thin line in some cases between this control channel and | |||
| the tightly coupled interactions of a master-slave MGCP relationship. | the tightly coupled interactions of a master-slave MGCP relationship. | |||
| However, the loosely coupled nature of the interaction can be | However, the loosely coupled nature of the interaction can be | |||
| maintained by using coarse-grained data passing over a distributed | maintained by using coarse-grained data passing over a distributed | |||
| client-server protocol, such as HTTP or CORBA. | client-server protocol, such as HTTP or CORBA. | |||
| skipping to change at page 17, line 4 ¶ | skipping to change at page 16, line 50 ¶ | |||
| caller (if they were using a softphone), the service execution code | caller (if they were using a softphone), the service execution code | |||
| is unchanged. | is unchanged. | |||
| Others have proposed that DTMF digits be carried in SIP directly from | Others have proposed that DTMF digits be carried in SIP directly from | |||
| the caller to the AS [9,10]. However, this approach does not work | the caller to the AS [9,10]. However, this approach does not work | |||
| for anything beyond DTMF, while our approach works for DTMF, speech, | for anything beyond DTMF, while our approach works for DTMF, speech, | |||
| and web interfaces. Another drawback of the DTMF-in-SIP approach is | and web interfaces. Another drawback of the DTMF-in-SIP approach is | |||
| that all entities on the call signaling path will receive any DTMF | that all entities on the call signaling path will receive any DTMF | |||
| digits dialed by the called party. Furthermore, since the caller | digits dialed by the called party. Furthermore, since the caller | |||
| doesn't know if there is an entity interested in DTMF, it is required | doesn't know if there is an entity interested in DTMF, it is required | |||
| to send DTMF within SIP messages all the time, even if no entity is | ||||
| interested. | ||||
| Caller Coordinator Media Server Callee | Caller Coordinator Media Server Callee | |||
| | | | | | | | | | | |||
| |(1) SIP INV | | | | |(1) SIP INV | | | | |||
| |--------------->|(2) SIP INV | | | |--------------->|(2) SIP INV | | | |||
| | |----------------->| | | | |----------------->| | | |||
| | |(3) 200 OK | | | | |(3) 200 OK | | | |||
| | |<-----------------| | | | |<-----------------| | | |||
| | |(4) SIP ACK | | | | |(4) SIP ACK | | | |||
| | |----------------->| | | | |----------------->| | | |||
| | |(5) SIP INV | | | | |(5) SIP INV | | | |||
| skipping to change at page 18, line 4 ¶ | skipping to change at page 18, line 4 ¶ | |||
| | |<-----------------+-----------------| | | |<-----------------+-----------------| | |||
| | |(19) SIP ACK | | | | |(19) SIP ACK | | | |||
| | |------------------+---------------->| | | |------------------+---------------->| | |||
| | | | | | | | | | | |||
| | | | | | | | | | | |||
| | | | | | | | | | | |||
| | | | | | | | | | | |||
| | | | | | | | | | | |||
| Figure 3: Call Flow for DTMF Enabled Hold Service | Figure 3: Call Flow for DTMF Enabled Hold Service | |||
| to send DTMF within SIP messages all the time, even if no entity is | ||||
| interested. | ||||
| There have been proposals for adding a subscription/notification | There have been proposals for adding a subscription/notification | |||
| mechanism on top of this to avoid this problem. However, this further | mechanism on top of this to avoid this problem. However, this further | |||
| complicates the system by adding a requirement for the caller to | complicates the system by adding a requirement for the caller to | |||
| support a subscription and notification service just for DTMF. | support a subscription and notification service just for DTMF. | |||
| Our approach fits well within the existing SIP framework, and | Our approach fits well within the existing SIP framework, and | |||
| requires no additional work from the end users. Furthermore, it | requires no additional work from the end users. Furthermore, it | |||
| transparently supports multiple application server components | transparently supports multiple application server components | |||
| receiving DTMF. This is because an AS is able to send a DTMF stream | receiving DTMF. This is because an AS is able to send a DTMF stream | |||
| to a component by adding a new media line to the list of media | to a component by adding a new media line to the list of media | |||
| skipping to change at page 20, line 4 ¶ | skipping to change at page 19, line 48 ¶ | |||
| 6 Patterns for Accessing Components | 6 Patterns for Accessing Components | |||
| In this section, we propose a set of patterns that define the | In this section, we propose a set of patterns that define the | |||
| interaction of a controller with an application server component. | interaction of a controller with an application server component. | |||
| These patterns manifest themselves in the description of the service | These patterns manifest themselves in the description of the service | |||
| invoked when a session is initiated, a discussion of the naming | invoked when a session is initiated, a discussion of the naming | |||
| conventions of the service, and a description of any back channel | conventions of the service, and a description of any back channel | |||
| used for control and data passing. | used for control and data passing. | |||
| 6.1 Interactive Voice Response Services | 6.1 Interactive Voice Response Services | |||
| We have touched upon the basics of the interaction between a | ||||
| controller and an IVR server. The controller initiates a call to the | ||||
| server, the server executes some kind of IVR service, and data is | ||||
| Caller A B Callee | Caller A B Callee | |||
| | | | | | | | | | | |||
| |(1) SIP INV | | | | |(1) SIP INV | | | | |||
| |-------------->|(2) SIP INV | | | |-------------->|(2) SIP INV | | | |||
| | |--------------->|(3) SIP INV | | | |--------------->|(3) SIP INV | | |||
| | | |---------------->| | | | |---------------->| | |||
| | | |(4) 200 OK | | | | |(4) 200 OK | | |||
| | |(5) 200 OK |<----------------| | | |(5) 200 OK |<----------------| | |||
| |(6) 200 OK |<---------------| | | |(6) 200 OK |<---------------| | | |||
| |<--------------| | | | |<--------------| | | | |||
| skipping to change at page 20, line 42 ¶ | skipping to change at page 20, line 42 ¶ | |||
| |(18) SIP ACK |<---------------| | | |(18) SIP ACK |<---------------| | | |||
| |<--------------| | | | |<--------------| | | | |||
| | | | | | | | | | | |||
| | | | | | | | | | | |||
| | | | | | | | | | | |||
| | | | | | | | | | | |||
| | | | | | | | | | | |||
| Figure 4: Multiple Application Servers and DTMF | Figure 4: Multiple Application Servers and DTMF | |||
| We have touched upon the basics of the interaction between a | possibly fed back to the controller with intermediate and/or final | |||
| controller and an IVR server. The controller initiates a call to the | results of the IVR interaction. | |||
| server, the server executes some kind of IVR service, and data is | ||||
| A number of questions still need to be answered, however: | ||||
| 1. How is the IVR service identified? | 1. How is the IVR service identified? | |||
| 2. How can the controller specify the details of the dialog | 2. How can the controller specify the details of the dialog | |||
| the IVR carries out with the user? | the IVR carries out with the user? | |||
| 3. How does data from the IVR get passed back to the | 3. How does data from the IVR get passed back to the | |||
| controller? | controller? | |||
| 4. How is intermediate control performed (e.g., to interrupt | 4. How is intermediate control performed (e.g., to interrupt | |||
| skipping to change at page 22, line 47 ¶ | skipping to change at page 22, line 44 ¶ | |||
| which is returned by the controller (7). The prompts are played to | which is returned by the controller (7). The prompts are played to | |||
| the caller, and the identity of the called party is collected. This | the caller, and the identity of the called party is collected. This | |||
| is passed to the controller through another POST (8), which returns | is passed to the controller through another POST (8), which returns | |||
| an empty VoiceXML script (9)[1] complete, the controller hangs up | an empty VoiceXML script (9)[1] complete, the controller hangs up | |||
| with it (10 and 11). The information the controller got in the POST | with it (10 and 11). The information the controller got in the POST | |||
| (8) is used to determine the next hop SIP server, and the initial | (8) is used to determine the next hop SIP server, and the initial | |||
| INVITE is proxied there (12). | INVITE is proxied there (12). | |||
| It's important to observe that all call control related to executing | It's important to observe that all call control related to executing | |||
| the service lives within the controlling application server. The IVR | the service lives within the controlling application server. The IVR | |||
| application server deals strictly with the media component. This | ||||
| division of work, as we have discussed above, allows for independent | ||||
| _________________________ | _________________________ | |||
| [1] Note that it is unusual for an empty script to be | [1] Note that it is unusual for an empty script to be | |||
| returned; this is because we want the AS to maintain | returned; this is because we want the AS to maintain | |||
| control of the call signaling | control of the call signaling | |||
| application server deals strictly with the media component. This | ||||
| division of work, as we have discussed above, allows for independent | ||||
| evolution of the call control and media components of services. For | evolution of the call control and media components of services. For | |||
| example, if the desired called party did not have a reachable SIP | example, if the desired called party did not have a reachable SIP | |||
| address, but they did have an email address, the call could be | address, but they did have an email address, the call could be | |||
| redirected to a mailto URL. To support this twist, only the | redirected to a mailto URL. To support this twist, only the | |||
| controlling application server code need change. The media component | controlling application server code need change. The media component | |||
| remains completely and totally unchanged. | remains completely and totally unchanged. | |||
| Readers familiar with VoiceXML will observe that VoiceXML almost | Readers familiar with VoiceXML will observe that VoiceXML almost | |||
| achieves this perfect separation. It lacks any call control except | achieves this perfect separation. It lacks any call control except | |||
| two tags - for call transfer and call termination. These tags are | two tags - for call transfer and call termination. These tags are | |||
| skipping to change at page 24, line 4 ¶ | skipping to change at page 23, line 50 ¶ | |||
| We observe once more that all of these conferencing "servers" are | We observe once more that all of these conferencing "servers" are | |||
| really conferencing applications that are just bundled as a server. | really conferencing applications that are just bundled as a server. | |||
| These conferencing applications can be decomposed into components in | These conferencing applications can be decomposed into components in | |||
| exactly the way we have described above. At the core of each of these | exactly the way we have described above. At the core of each of these | |||
| conferencing applications is a mixing service. This service is | conferencing applications is a mixing service. This service is | |||
| responsible for taking N audio or video streams, mixing them | responsible for taking N audio or video streams, mixing them | |||
| according to some matrix, and returning the mixed stream to each | according to some matrix, and returning the mixed stream to each | |||
| participant. Issues such as conference policy, provisioning of | participant. Issues such as conference policy, provisioning of | |||
| conferences, and authentication are all completely separate and | conferences, and authentication are all completely separate and | |||
| outside of this basic mixing component. | ||||
| | INVITE (1) | | | | INVITE (1) | | | |||
| |------------------------>| | | |------------------------>| | | |||
| | | INVITE (2) | | | | INVITE (2) | | |||
| | |------------------------->| | | |------------------------->| | |||
| | | 200 OK (3) | | | | 200 OK (3) | | |||
| | |<-------------------------| | | |<-------------------------| | |||
| | 183 (4) | | | | 183 (4) | | | |||
| |<------------------------| | | |<------------------------| | | |||
| | | ACK (5) | | | | ACK (5) | | |||
| | |------------------------->| | | |------------------------->| | |||
| skipping to change at page 25, line 4 ¶ | skipping to change at page 25, line 4 ¶ | |||
| | | | | | | | | |||
| | | | | | | | | |||
| | | | | | | | | |||
| | | | | | | | | |||
| | | | | | | | | |||
| | | | | | | | | |||
| Caller Controller IVR Server | Caller Controller IVR Server | |||
| Figure 5: Interaction of App Server and IVR Component | Figure 5: Interaction of App Server and IVR Component | |||
| outside of this basic mixing component. | ||||
| For this reason, we argue that a large variety of conferencing | For this reason, we argue that a large variety of conferencing | |||
| applications can be easily constructed by having the mixing service | applications can be easily constructed by having the mixing service | |||
| as a separate application server component. | as a separate application server component. | |||
| What does the interface to such a mixing server look like? For the | What does the interface to such a mixing server look like? For the | |||
| call control interface, users would join a conference by calling the | call control interface, users would join a conference by calling the | |||
| server. The server would answer the call, thus appearing as a SIP | server. The server would answer the call, thus appearing as a SIP | |||
| UAS. The media sent from the user is mixed with other users in the | UAS. The media sent from the user is mixed with other users in the | |||
| conference, and the media sent back to the user is the mixed stream. | conference, and the media sent back to the user is the mixed stream. | |||
| The user can leave the conference by sending a BYE to the server, and | The user can leave the conference by sending a BYE to the server, and | |||
| skipping to change at page 28, line 4 ¶ | skipping to change at page 27, line 49 ¶ | |||
| in mechanisms for session state sharing between the SIP and HTTP | in mechanisms for session state sharing between the SIP and HTTP | |||
| components. | components. | |||
| For this simple conferencing service, it was sufficient for the | For this simple conferencing service, it was sufficient for the | |||
| controller to act as a proxy. That's because it does not need to | controller to act as a proxy. That's because it does not need to | |||
| forcibly kick anyone out of the conference once they are in. To | forcibly kick anyone out of the conference once they are in. To | |||
| support that kind of functionality, third party call control is | support that kind of functionality, third party call control is | |||
| needed. Let us examine a more complex service in the next section. | needed. Let us examine a more complex service in the next section. | |||
| 6.2.2 Web Scheduled, IVR supported, Time Limited Conference | 6.2.2 Web Scheduled, IVR supported, Time Limited Conference | |||
| In this more complex example, we once again wish to use a web | ||||
| interface to set up the conferences. However, we wish to add a stop | ||||
| time. If there are participants in the conference when the stop time | ||||
| | | | | (1) HTTP POST | | | | | | | (1) HTTP POST | | | |||
| |--------------------------->| | | |--------------------------->| | | |||
| | | | | (2) 200 OK | | | | | | | (2) 200 OK | | | |||
| |<---------------------------| | | |<---------------------------| | | |||
| | | | | | | | | | | | | | | |||
| | | | | (3) INVITE | | | | | | | (3) INVITE | | | |||
| | |----------------------->| (4) INVITE | | | |----------------------->| (4) INVITE | | |||
| | | | | |--------------------->| | | | | | |--------------------->| | |||
| | | | | | (5) 200 OK | | | | | | | (5) 200 OK | | |||
| | | | | (6) 200 OK |<---------------------| | | | | | (6) 200 OK |<---------------------| | |||
| skipping to change at page 29, line 4 ¶ | skipping to change at page 29, line 4 ¶ | |||
| | | | |<---------------| | | | | | |<---------------| | | |||
| | | | |(17) ACK | | | | | | |(17) ACK | | | |||
| | | | |--------------->| | | | | | |--------------->| | | |||
| | | | | | | | | | | | | | | |||
| | | | | | | | | | | | | | | |||
| | | | | | | | | | | | | | | |||
| Web A B C Controller Mixer | Web A B C Controller Mixer | |||
| Figure 6: Web Scheduled Conference Services | Figure 6: Web Scheduled Conference Services | |||
| In this more complex example, we once again wish to use a web | ||||
| interface to set up the conferences. However, we wish to add a stop | ||||
| time. If there are participants in the conference when the stop time | ||||
| arrives, a warning announcement is played 10 minutes prior, and then | arrives, a warning announcement is played 10 minutes prior, and then | |||
| they are kicked off. In addition, when a user joins the conference, | they are kicked off. In addition, when a user joins the conference, | |||
| before they are added, they hear an announcement that states the name | before they are added, they hear an announcement that states the name | |||
| of the person that set up the conference, and what the start and stop | of the person that set up the conference, and what the start and stop | |||
| times are. They are then asked to speak their name. Then, they are | times are. They are then asked to speak their name. Then, they are | |||
| dropped in. The conference server then speaks their name, so that | dropped in. The conference server then speaks their name, so that | |||
| everyone knows who just joined. | everyone knows who just joined. | |||
| This seemingly complex service is very easily constructed by adding | This seemingly complex service is very easily constructed by adding | |||
| an IVR server as described above. Now, we have a controller, a mixing | an IVR server as described above. Now, we have a controller, a mixing | |||
| skipping to change at page 32, line ? ¶ | skipping to change at page 30, line 47 ¶ | |||
| These examples demonstrate the component model we are proposing. The | These examples demonstrate the component model we are proposing. The | |||
| mixing component does not have application level intelligence. It has | mixing component does not have application level intelligence. It has | |||
| a call control interface, allowing it to exist anywhere (and be | a call control interface, allowing it to exist anywhere (and be | |||
| provided by any ASP service) and yet be a callable resource by other | provided by any ASP service) and yet be a callable resource by other | |||
| application server components. By combining a controller with an IVR | application server components. By combining a controller with an IVR | |||
| server and the mixing server, complex and useful applications can be | server and the mixing server, complex and useful applications can be | |||
| constructed in a distributed fashion. | constructed in a distributed fashion. | |||
| 6.3 Continuous Text-to-Speech | 6.3 Continuous Text-to-Speech | |||
| Another example of an application server component is a continuous | ||||
| Text-to-Speech (TTS) converter. This kind of service allows a real | ||||
| time text stream (encapsulated in RTP using the RTP payload format | ||||
| Caller Controller IVR Server Mixing Server | Caller Controller IVR Server Mixing Server | |||
| | | | | | | | | | | |||
| | (1) INVITE | | | | | (1) INVITE | | | | |||
| |-------------->| (2) INVITE | | | |-------------->| (2) INVITE | | | |||
| | |----------------->| | | | |----------------->| | | |||
| | | (3) 200 OK | | | | | (3) 200 OK | | | |||
| | (4) 183 |<-----------------| | | | (4) 183 |<-----------------| | | |||
| |<--------------| | | | |<--------------| | | | |||
| | | (5) ACK | | | | | (5) ACK | | | |||
| | |----------------->| | | | |----------------->| | | |||
| skipping to change at page 33, line 6 ¶ | skipping to change at page 33, line 6 ¶ | |||
| | (18) 200 OK | | | | (18) 200 OK | | | |||
| |<----------------------| | | |<----------------------| | | |||
| | | | | | | | | |||
| | | | | | | | | |||
| Controller Mixer IVR Server | Controller Mixer IVR Server | |||
| Figure 8: Advanced Web Scheduled Conference Service: Warning | Figure 8: Advanced Web Scheduled Conference Service: Warning | |||
| Announcement | Announcement | |||
| Another example of an application server component is a continuous | ||||
| Text-to-Speech (TTS) converter. This kind of service allows a real | ||||
| time text stream (encapsulated in RTP using the RTP payload format | ||||
| for text [14]) to be received, which is then converted to speech and | for text [14]) to be received, which is then converted to speech and | |||
| returned as an audio stream encoded using a traditional speech codec, | returned as an audio stream encoded using a traditional speech codec, | |||
| be it G.723.1, G.711, or what have you. | be it G.723.1, G.711, or what have you. | |||
| Like the IVR server and mixing server, the TTS server acts as a user | Like the IVR server and mixing server, the TTS server acts as a user | |||
| agent server. It answers incoming calls, and basically mirrors | agent server. It answers incoming calls, and basically mirrors | |||
| incoming text back as speech. It continues to do so until the call | incoming text back as speech. It continues to do so until the call | |||
| is hung up by the initiating client. | is hung up by the initiating client. | |||
| A TTS service can be done using VoiceXML with an IVR server, as in | A TTS service can be done using VoiceXML with an IVR server, as in | |||
| skipping to change at page 34, line 15 ¶ | skipping to change at page 34, line 12 ¶ | |||
| payload type number bound to text/t140. The stream MUST be marked as | payload type number bound to text/t140. The stream MUST be marked as | |||
| receive-only. | receive-only. | |||
| The client then ACKs the request. The TTS server SHOULD attempt to | The client then ACKs the request. The TTS server SHOULD attempt to | |||
| convert all text received on the incoming text stream to speech, and | convert all text received on the incoming text stream to speech, and | |||
| return the resulting speech on the outgoing audio stream. | return the resulting speech on the outgoing audio stream. | |||
| 6.3.2 Hearing Impaired Service | 6.3.2 Hearing Impaired Service | |||
| The TTS server is extremely useful in supporting hearing impaired | The TTS server is extremely useful in supporting hearing impaired | |||
| services. Examples of such services are described in describes a | services. Examples of such services are described in [16]. | |||
| service where a controller accesses a TTS service. | Specifically, Section 2.4 describes a service where a controller | |||
| accesses a TTS service. | ||||
| 6.4 Messaging Servers | 6.4 Messaging Servers | |||
| Another type of application server component is a messaging server. | Another type of application server component is a messaging server. | |||
| Messaging servers allow for callers to record audio messages for | Messaging servers allow for callers to record audio messages for | |||
| users on the system. Users can also call into the server to retrieve | users on the system. Users can also call into the server to retrieve | |||
| these messages, delete them, and file them. The system operates | these messages, delete them, and file them. The system operates | |||
| through the use of voice prompts combined with DTMF detection and/or | through the use of voice prompts combined with DTMF detection and/or | |||
| speech recognition. The prompts that are played are context | speech recognition. The prompts that are played are context | |||
| dependent. A messaging server can be viewed as a specialized version | dependent. A messaging server can be viewed as a specialized version | |||
| skipping to change at page 35, line 9 ¶ | skipping to change at page 35, line 5 ¶ | |||
| An example usage of this application component is a web front end | An example usage of this application component is a web front end | |||
| that allows users to leave voicemail for company employees through | that allows users to leave voicemail for company employees through | |||
| the company web page. The page has a URL for each company employee. | the company web page. The page has a URL for each company employee. | |||
| If some user A clicks on a URL for employee B, A's phone rings. When | If some user A clicks on a URL for employee B, A's phone rings. When | |||
| A picks up, they hear a greeting to record a message for employee B. | A picks up, they hear a greeting to record a message for employee B. | |||
| The call flow for this application combines third party call | The call flow for this application combines third party call | |||
| control with access to the messaging service. It is shown in | control with access to the messaging service. It is shown in | |||
| Figure 9. | Figure 9. | |||
| | | | | | ||||
| | | (1) HTTP GET | | | ||||
| |-------------------->| | | ||||
| | | (2) 200 OK | | | ||||
| |<--------------------| | | ||||
| | | (3) INV | | | ||||
| | |<-------------| | | ||||
| | | (4) 200 OK | | | ||||
| | |------------->| | | ||||
| | | (5) ACK | | | ||||
| | |<-------------| | | ||||
| | | | (6) INV | | ||||
| | | |--------------------->| | ||||
| | | | (7) 200 OK | | ||||
| | | |<---------------------| | ||||
| | | | (8) ACK | | ||||
| | | |--------------------->| | ||||
| | | (9) INV | | | ||||
| | |<-------------| | | ||||
| | | (10) 200 OK | | | ||||
| | |------------->| | | ||||
| | | (11) ACK | | | ||||
| | |<-------------| | | ||||
| | | | | | ||||
| | | | | | ||||
| | | | | | ||||
| Web SIP Controller Messaging | ||||
| Caller Server | ||||
| Figure 9: Web Enabled Message Drops | ||||
| The caller, from a web page, clicks on the URL for the user they wish | The caller, from a web page, clicks on the URL for the user they wish | |||
| to leave a message for. The result is an HTTP request (1) to the | to leave a message for. The result is an HTTP request (1) to the | |||
| controller. The URI in this request would be some controller-specific | controller. The URI in this request would be some controller-specific | |||
| identifier that tells the controller what it needs to do. The | identifier that tells the controller what it needs to do. The | |||
| controller then calls the user (3) using an SDP with a single media | controller then calls the user (3) using an SDP with a single media | |||
| stream on hold initially. This is accepted (4), and the resulting SDP | stream on hold initially. This is accepted (4), and the resulting SDP | |||
| is used in an INVITE to the messaging server (6). The URI of this | is used in an INVITE to the messaging server (6). The URI of this | |||
| INVITE is that for message drop with standard greeting (sip:sub- | INVITE is that for message drop with standard greeting (sip:sub- | |||
| jdrosen-deposit@voiceserver.com). The call is accepted (7) and the | jdrosen-deposit@voiceserver.com). The call is accepted (7) and the | |||
| 200 OK is used in a re-INVITE to the caller (9) to set the address of | 200 OK is used in a re-INVITE to the caller (9) to set the address of | |||
| skipping to change at page 36, line 4 ¶ | skipping to change at page 36, line 46 ¶ | |||
| components could be offered by separate providers, for example, | components could be offered by separate providers, for example, | |||
| enabling an ASP component model to evolve. We have observed that many | enabling an ASP component model to evolve. We have observed that many | |||
| of the components can be described as having some kind of session | of the components can be described as having some kind of session | |||
| level resource that can be communicated with, usually in an automated | level resource that can be communicated with, usually in an automated | |||
| fashion. Access to these resources is typically parameterized. As a | fashion. Access to these resources is typically parameterized. As a | |||
| result, SIP access, using the request URI as a service indicator, is | result, SIP access, using the request URI as a service indicator, is | |||
| an ideal way to communicate across these components. | an ideal way to communicate across these components. | |||
| To validate this model, we examined the specific service interfaces | To validate this model, we examined the specific service interfaces | |||
| that would be defined by IVR servers, conferencing servers, text-to- | that would be defined by IVR servers, conferencing servers, text-to- | |||
| | | | | | ||||
| | | (1) HTTP GET | | | ||||
| |-------------------->| | | ||||
| | | (2) 200 OK | | | ||||
| |<--------------------| | | ||||
| | | (3) INV | | | ||||
| | |<-------------| | | ||||
| | | (4) 200 OK | | | ||||
| | |------------->| | | ||||
| | | (5) ACK | | | ||||
| | |<-------------| | | ||||
| | | | (6) INV | | ||||
| | | |--------------------->| | ||||
| | | | (7) 200 OK | | ||||
| | | |<---------------------| | ||||
| | | | (8) ACK | | ||||
| | | |--------------------->| | ||||
| | | (9) INV | | | ||||
| | |<-------------| | | ||||
| | | (10) 200 OK | | | ||||
| | |------------->| | | ||||
| | | (11) ACK | | | ||||
| | |<-------------| | | ||||
| | | | | | ||||
| | | | | | ||||
| | | | | | ||||
| Web SIP Controller Messaging | ||||
| Caller Server | ||||
| Figure 9: Web Enabled Message Drops | ||||
| speech servers and messaging servers. We gave call flows of complex | speech servers and messaging servers. We gave call flows of complex | |||
| applications built up from these components using the specified | applications built up from these components using the specified | |||
| interfaces. | interfaces. | |||
| 9 Author's Addresses | 9 Changes from -00 | |||
| o Minor edits | ||||
| 10 Author's Addresses | ||||
| Jonathan Rosenberg | Jonathan Rosenberg | |||
| dynamicsoft | dynamicsoft | |||
| 72 Eagle Rock Avenue | 72 Eagle Rock Avenue | |||
| First Floor | First Floor | |||
| East Hanover, NJ 07936 | East Hanover, NJ 07936 | |||
| email: jdrosen@dynamicsoft.com | email: jdrosen@dynamicsoft.com | |||
| Peter Mataga | Peter Mataga | |||
| dynamicsoft | dynamicsoft | |||
| 72 Eagle Rock Avenue | 72 Eagle Rock Avenue | |||
| First Floor | First Floor | |||
| East Hanover, NJ 07936 | East Hanover, NJ 07936 | |||
| email: jdrosen@dynamicsoft.com | email: pmataga@dynamicsoft.com | |||
| Henning Schulzrinne | Henning Schulzrinne | |||
| Columbia University | Columbia University | |||
| M/S 0401 | M/S 0401 | |||
| 1214 Amsterdam Ave. | 1214 Amsterdam Ave. | |||
| New York, NY 10027-7003 | New York, NY 10027-7003 | |||
| email: schulzrinne@cs.columbia.edu | email: schulzrinne@cs.columbia.edu | |||
| 10 Bibliography | 11 Bibliography | |||
| [1] N. Greene, M. Ramalho, and B. Rosen, "Media gateway control | [1] N. Greene, M. Ramalho, and B. Rosen, "Media gateway control | |||
| protocol architecture and requirements," Request for Comments 2805, | protocol architecture and requirements," Request for Comments 2805, | |||
| Internet Engineering Task Force, Apr. 2000. | Internet Engineering Task Force, Apr. 2000. | |||
| [2] M. Arango, A. Dugan, I. Elliott, C. Huitema, and S. Pickett, | [2] M. Arango, A. Dugan, I. Elliott, C. Huitema, and S. Pickett, | |||
| "Media gateway control protocol (MGCP) version 1.0," Request for | "Media gateway control protocol (MGCP) version 1.0," Request for | |||
| Comments 2705, Internet Engineering Task Force, Oct. 1999. | Comments 2705, Internet Engineering Task Force, Oct. 1999. | |||
| [3] F. Cuervo, N. Greene, C. Huitema, A. Rayhan, B. Rosen, and J. | [3] F. Cuervo, N. Greene, C. Huitema, A. Rayhan, B. Rosen, and J. | |||
| skipping to change at page 39, line 5 ¶ | skipping to change at page 38, line 47 ¶ | |||
| [13] S. Donovan, "The SIP INFO method," Request for Comments 2976, | [13] S. Donovan, "The SIP INFO method," Request for Comments 2976, | |||
| Internet Engineering Task Force, Oct. 2000. | Internet Engineering Task Force, Oct. 2000. | |||
| [14] G. Hellstrom, "RTP payload for text conversation," Request for | [14] G. Hellstrom, "RTP payload for text conversation," Request for | |||
| Comments 2793, Internet Engineering Task Force, May 2000. | Comments 2793, Internet Engineering Task Force, May 2000. | |||
| [15] H. Alvestrand, "Tags for the identification of languages," | [15] H. Alvestrand, "Tags for the identification of languages," | |||
| Request for Comments 1766, Internet Engineering Task Force, Mar. | Request for Comments 1766, Internet Engineering Task Force, Mar. | |||
| 1995. | 1995. | |||
| [16] J. Rosenberg, H. Schulzrinne, and H. Sinnreich, "Sip enabled | ||||
| services to support the hearing impaired," Internet Draft, Internet | ||||
| Engineering Task Force, July 2000. Work in progress. | ||||
| Table of Contents | Table of Contents | |||
| 1 Introduction ........................................ 2 | 1 Introduction ........................................ 2 | |||
| 2 Why Decompose ....................................... 2 | 2 Why Decompose ....................................... 2 | |||
| 3 Tightly Coupled Decomposition ....................... 4 | 3 Tightly Coupled Decomposition ....................... 4 | |||
| 4 The Decoupled Model ................................. 6 | 4 The Decoupled Model ................................. 6 | |||
| 4.1 Architecture ........................................ 7 | 4.1 Architecture ........................................ 6 | |||
| 4.2 Benefits of the Decoupling .......................... 10 | 4.2 Benefits of the Decoupling .......................... 10 | |||
| 5 Architecture for the Interfaces ..................... 11 | 5 Architecture for the Interfaces ..................... 11 | |||
| 5.1 Naming .............................................. 12 | 5.1 Naming .............................................. 12 | |||
| 5.2 Additional Message Content .......................... 14 | 5.2 Additional Message Content .......................... 14 | |||
| 5.3 Session Duration .................................... 14 | 5.3 Session Duration .................................... 14 | |||
| 5.4 Third Party Call Control ............................ 15 | 5.4 Third Party Call Control ............................ 15 | |||
| 5.5 Side Channels ....................................... 18 | 5.5 Side Channels ....................................... 18 | |||
| 6 Patterns for Accessing Components ................... 19 | 6 Patterns for Accessing Components ................... 19 | |||
| 6.1 Interactive Voice Response Services ................. 19 | 6.1 Interactive Voice Response Services ................. 19 | |||
| 6.2 Conferencing Servers ................................ 23 | 6.2 Conferencing Servers ................................ 23 | |||
| 6.2.1 Web Scheduled Conference Services ................... 26 | 6.2.1 Web Scheduled Conference Services ................... 26 | |||
| 6.2.2 Web Scheduled, IVR supported, Time Limited | 6.2.2 Web Scheduled, IVR supported, Time Limited | |||
| Conference ..................................................... 27 | Conference ..................................................... 27 | |||
| 6.3 Continuous Text-to-Speech ........................... 30 | 6.3 Continuous Text-to-Speech ........................... 30 | |||
| 6.3.1 Service Interface ................................... 33 | 6.3.1 Service Interface ................................... 33 | |||
| 6.3.2 Hearing Impaired Service ............................ 34 | 6.3.2 Hearing Impaired Service ............................ 34 | |||
| 6.4 Messaging Servers ................................... 34 | 6.4 Messaging Servers ................................... 34 | |||
| 6.4.1 Service Interface ................................... 34 | 6.4.1 Service Interface ................................... 34 | |||
| 6.4.2 Web Enabled Message Drops ........................... 34 | 6.4.2 Web Enabled Message Drops ........................... 34 | |||
| 7 Security Considerations ............................. 35 | 7 Security Considerations ............................. 36 | |||
| 8 Conclusion .......................................... 35 | 8 Conclusion .......................................... 36 | |||
| 9 Author's Addresses .................................. 37 | 9 Changes from -00 .................................... 36 | |||
| 10 Bibliography ........................................ 37 | 10 Author's Addresses .................................. 37 | |||
| 11 Bibliography ........................................ 37 | ||||
| End of changes. 31 change blocks. | ||||
| 70 lines changed or deleted | 82 lines changed or added | |||
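
The message-drop flow of Figure 9 (Section 6.4.2) can be written out as an ordered list of messages, which makes it easier to check which party sends each request. The following is only a minimal sketch: the `Step` type, the `message_drop_flow` function, and the party labels are illustrative names that do not come from the draft, and the comments paraphrase the prose that follows the figure. The example URI is the one quoted in the draft.

```python
from dataclasses import dataclass

# URI quoted in the draft for "message drop with standard greeting".
EXAMPLE_DROP_URI = "sip:sub-jdrosen-deposit@voiceserver.com"


@dataclass(frozen=True)
class Step:
    """One numbered arrow in the Figure 9 call flow (hypothetical helper type)."""
    number: int
    sender: str
    receiver: str
    message: str


def message_drop_flow(web="Web", caller="SIP Caller",
                      controller="Controller", messaging="Messaging Server"):
    """Return the eleven messages of Figure 9 in order.

    The party names are assumed labels taken from the figure's column
    headings; nothing is sent on the network here.
    """
    hops = [
        (web, controller, "HTTP GET"),      # (1) user clicks the employee's URL
        (controller, web, "200 OK"),        # (2)
        (controller, caller, "INVITE"),     # (3) SDP with one media stream, on hold
        (caller, controller, "200 OK"),     # (4) caller's SDP is returned
        (controller, caller, "ACK"),        # (5)
        (controller, messaging, "INVITE"),  # (6) caller's SDP, message-drop request URI
        (messaging, controller, "200 OK"),  # (7) messaging server's SDP
        (controller, messaging, "ACK"),     # (8)
        (controller, caller, "INVITE"),     # (9) re-INVITE built from the 200 OK in (7)
        (caller, controller, "200 OK"),     # (10)
        (controller, caller, "ACK"),        # (11)
    ]
    return [Step(i, s, r, m) for i, (s, r, m) in enumerate(hops, start=1)]


if __name__ == "__main__":
    for step in message_drop_flow():
        print(f"({step.number}) {step.sender} -> {step.receiver}: {step.message}")
```

Every SIP request in this listing originates at the controller, which is what makes the flow an instance of third party call control: the caller and the messaging server never signal each other directly.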