Internet Engineering Task Force                               SIPPING WG
Internet Draft                                              J. Rosenberg
                                                             dynamicsoft
                                                             J. Peterson
                                                                 Neustar
                                                          H. Schulzrinne
                                                             Columbia U.
                                                            G. Camarillo
                                                                Ericsson
draft-ietf-sipping-3pcc-00.txt
May 10, 2002
Expires: November 2002

          Best Current Practices for Third Party Call Control
                   in the Session Initiation Protocol

STATUS OF THIS MEMO

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   To view the list of Internet-Draft Shadow Directories, see
   http://www.ietf.org/shadow.html.

Abstract

   Third party call control refers to the ability of one entity to
   create a call in which communication is actually between other
   parties. Third party call control is possible using the mechanisms
   specified within the Session Initiation Protocol (SIP). However,
   there are several possible approaches, each with different benefits
   and drawbacks. This document discusses best current practices for the
   use of SIP for third party call control.

Table of Contents

   1     Introduction
   2     Terminology
   3     Definitions
   4     3pcc Call Establishment
   4.1   Flow I
   4.2   Flow II
   4.3   Flow III
   4.4   Flow IV
   4.5   Recommendations
   5     Error Handling
   6     Continued Processing
   7     3pcc and Early Media
   8     Third Party Call Control and SDP Preconditions
   9     Example Call Flows
   9.1   Click to Dial
   9.2   Mid-Call Announcement Capability
   10    Implementation Recommendations
   11    Security Considerations
   12    IANA Considerations
   13    Authors' Addresses
   14    Normative References
   15    Informative References

1 Introduction

   (Note to RFC Editor - please replace all instances of RFC BBBB with
   RFC 3261 when draft-ietf-sip-rfc2543bis is published as an RFC.
   Please replace all instances of RFC MMMM with the RFC number of
   draft-ietf-sip-manyfolks-resource when it issues as an RFC.)

   In the traditional telephony context, third party call control allows
   one entity (which we call the controller) to set up and manage a
   communications relationship between two or more other parties. Third
   party call control (referred to as 3pcc) is often used for operator
   services (where an operator creates a call that connects two
   participants together), and conferencing.

   Similarly, many SIP services are possible through third party call
   control. These include the traditional ones on the PSTN, but also new
   ones such as click-to-dial. Click-to-dial allows a user to click on a
   web page when they wish to speak to a customer service
   representative. The web server then creates a call between the user
   and a customer service representative. The call can be between two
   phones, a phone and an IP host, or two IP hosts.

   Third party call control is possible using only the mechanisms
   specified within RFC BBBB [1]. Indeed, many different call flows are
   possible, each of which will work with SIP compliant user agents.
   However, there are benefits and drawbacks to each of these flows. The
   usage of third party call control also becomes more complex when
   aspects of the call utilize SIP extensions or optional features of
   SIP. In particular, the usage of RFC MMMM [2] (used for coupling of
   signaling to resource reservation) with third party call control is
   non-trivial.
   Similarly, the usage of early media (where session data
   is exchanged before the call is accepted) with third party call
   control is not trivial.

   This document serves as a best current practice for implementing
   third party call control. Section 4 presents the known call flows
   that can be used to achieve third party call control, and provides
   guidelines on their usage. Section 7 discusses the interactions of
   early media with third party call control. Section 8 discusses the
   interactions of RFC MMMM [2] with third party call control. Section 9
   provides example applications that make use of the flows
   recommended here.

2 Terminology

   In this document, the key words "MUST", "MUST NOT", "REQUIRED",
   "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
   and "OPTIONAL" are to be interpreted as described in RFC 2119 [3] and
   indicate requirement levels for compliant implementations.

3 Definitions

   The following terms are used throughout this document:

      3pcc: Third Party Call Control, which refers to the general
         ability to manipulate calls between other parties.

      Controller: A controller is a SIP User Agent that wishes to
         create a session between two other user agents.

4 3pcc Call Establishment

   The primary primitive operation of third party call control is the
   establishment of a session between participants A and B.
   Establishment of this session is orchestrated by a third party,
   referred to as the controller.

   This section documents four call flows that the controller can
   utilize in order to provide this primitive operation.

4.1 Flow I

   A              Controller               B
   |(1) INVITE no SDP  |                   |
   |<------------------|                   |
   |(2) 200 offer1     |                   |
   |------------------>|                   |
   |                   |(3) INVITE offer1  |
   |                   |------------------>|
   |                   |(4) 200 OK answer1 |
   |                   |<------------------|
   |                   |(5) ACK            |
   |                   |------------------>|
   |(6) ACK answer1    |                   |
   |<------------------|                   |
   |(7) RTP            |                   |
   |-------------------------------------->|

   Figure 1: 3pcc Flow I

   The call flow for Flow I is shown in Figure 1. The controller first
   sends an INVITE to A (1). This INVITE has no session description. A's
   phone rings, and A answers. This results in a 200 OK (2) that
   contains an offer [4]. The controller needs to send its answer in the
   ACK, as mandated by [1]. To obtain the answer, it sends the offer it
   got from A (offer1) in an INVITE to B (3). B's phone rings. When B
   answers, the 200 OK (4) contains the answer to this offer, answer1.
   The controller sends an ACK to B (5), and then passes answer1 to A in
   an ACK sent to it (6). Because the offer was generated by A, and the
   answer generated by B, the actual media session is between A and B.
   Therefore, media flows between them (7).

   This flow is simple, requires no manipulation of the SDP by the
   controller, and works for any media types supported by both
   endpoints. However, it has a serious timeout problem. User B may not
   answer the call immediately. The result is that the controller cannot
   send the ACK to A right away. This causes A to retransmit the 200 OK
   response periodically. As specified in RFC BBBB Section 13.3.1.4, the
   200 OK will be retransmitted for 64*T1 (32 seconds with the default
   T1 of 500 ms). If an ACK does not arrive by then, the call is
   considered to have failed. This limits the applicability of this flow
   to scenarios where the controller knows that B will answer the INVITE
   immediately.

4.2 Flow II

   An alternative flow, Flow II, is shown in Figure 2.
   The controller
   first sends an INVITE to user A (1). This is a standard INVITE,
   containing an offer (sdp1) with a single audio media line, one codec,
   a random port number (but not zero), and a connection address of
   0.0.0.0. This creates an initial media stream that is "black holed",
   since no media (or RTCP packets [8]) will flow from A. The INVITE
   causes A's phone to ring.

   When A answers (2), the 200 OK contains an answer, sdp2. The
   controller sends an ACK (4). It then generates a second INVITE (3).
   This INVITE is addressed to user B, and it contains sdp2 as the offer
   to B. Note that the role of sdp2 has changed. In the 200 OK (message
   2), it was an answer, but in the INVITE, it is an offer. Fortunately,
   all valid answers are valid initial offers. This INVITE causes B's
   phone to ring. When B answers, it generates a 200 OK (5) with an
   answer, sdp3. The controller then generates an ACK (6). Next, it
   sends a re-INVITE to A (7) containing sdp3 as the offer. Once again,
   there has been a reversal of roles. sdp3 was an answer, and now it is
   an offer. Fortunately, an answer to an answer recast as an offer is,
   in turn, a valid offer. This re-INVITE generates a 200 OK (8) with
   sdp2, assuming that A doesn't decide to change any aspects of the
   session as a result of this re-INVITE. This 200 OK is ACKed (9), and
   then media can flow from A to B.
   Media from B to A could already
   start flowing once message 5 was sent.

   A              Controller               B
   |(1) INVITE bh sdp1 |                   |
   |<------------------|                   |
   |(2) 200 sdp2       |                   |
   |------------------>|                   |
   |                   |(3) INVITE sdp2    |
   |                   |------------------>|
   |(4) ACK            |                   |
   |<------------------|                   |
   |                   |(5) 200 OK sdp3    |
   |                   |<------------------|
   |                   |(6) ACK            |
   |                   |------------------>|
   |(7) INVITE sdp3    |                   |
   |<------------------|                   |
   |(8) 200 OK sdp2    |                   |
   |------------------>|                   |
   |(9) ACK            |                   |
   |<------------------|                   |
   |(10) RTP           |                   |
   |-------------------------------------->|

   Figure 2: 3pcc Flow II

   This flow has the advantage that all final responses are immediately
   ACKed. It therefore does not suffer from the timeout and message
   inefficiency problems of Flow I. However, it too has troubles. First,
   it requires that the controller know the media types to be used
   for the call (since it must generate a "blackhole" SDP, which
   requires media lines). Secondly, the first INVITE to A (1) contains
   media with a 0.0.0.0 connection address. The controller expects that
   the response contains a valid, non-zero connection address for A.
   However, experience has shown that many UAs respond to an offer of a
   0.0.0.0 connection address with an answer containing a 0.0.0.0
   connection address. The offer-answer specification [4] now explicitly
   tells implementors not to do this, but at the time of publication of
   this document, many implementations still did. If A should respond
   with a 0.0.0.0 connection address in sdp2, the flow will not work.

   However, the most serious flaw in this flow is the assumption that
   the 200 OK to the re-INVITE (message 8) contains the same SDP as in
   message 2. This may not be the case. If it is not, the controller
   needs to re-INVITE B with that SDP (say, sdp4), which may result in
   getting a different SDP, sdp5, in the 200 OK from B.
   Then, the
   controller needs to re-INVITE A again, and so on. The result is an
   infinite loop of re-INVITEs. It is possible to break this cycle by
   having very smart UAs which can return the same SDP whenever
   possible, or really smart controllers that can analyze the SDP to
   determine if a re-INVITE is really needed. However, we wish to keep
   this mechanism simple, and avoid SDP awareness in the controller. As
   a result, this flow is not really workable. It is therefore NOT
   RECOMMENDED.

4.3 Flow III

   A                 Controller                  B
   |(1) INVITE no SDP     |                      |
   |<---------------------|                      |
   |(2) 200 offer1        |                      |
   |--------------------->|                      |
   |(3) ACK answer1 (bh)  |                      |
   |<---------------------|                      |
   |                      |(4) INVITE no SDP     |
   |                      |--------------------->|
   |                      |(5) 200 OK offer2     |
   |                      |<---------------------|
   |(6) INVITE offer2'    |                      |
   |<---------------------|                      |
   |(7) 200 answer2'      |                      |
   |--------------------->|                      |
   |                      |(8) ACK answer2       |
   |                      |--------------------->|
   |(9) ACK               |                      |
   |<---------------------|                      |
   |(10) RTP              |                      |
   |-------------------------------------------->|

   Figure 3: 3pcc Flow III

   A third flow, Flow III, is shown in Figure 3.

   First, the controller sends an INVITE (1) to user A without any SDP
   (which is good, since it means that the controller doesn't need to
   assume anything about the media composition of the session). A's
   phone rings. When A answers, a 200 OK is generated (2) containing its
   offer, offer1. The controller generates an immediate ACK containing
   an answer (3). This answer is a "black hole" SDP, with its connection
   address set to 0.0.0.0.

   The controller then sends an INVITE to B without SDP (4). This causes
   B's phone to ring. When they answer, a 200 OK is sent, containing
   their offer, offer2 (5). This SDP is used to create a re-INVITE back
   to A (6).
   That re-INVITE is based on offer2, but may need to be
   reorganized to match up media lines, or to trim media lines. For
   example, if offer1 contained an audio and a video line, in that
   order, but offer2 contained just an audio line, the controller would
   need to add a video line to the offer (setting its port to zero) to
   create offer2'. Since this is a re-INVITE, it should complete quickly
   in the general case. That's good, since user B is retransmitting their
   200 OK, waiting for an ACK. The SDP in the 200 OK (7) from A,
   answer2', may also need to be reorganized or trimmed before sending
   it in the ACK to B (8) as answer2. Finally, an ACK is sent to A (9),
   and then media can flow.

   This flow has many benefits. First, it will usually operate without
   any spurious retransmissions or timeouts (although this may still
   happen if a re-INVITE is not responded to quickly). Secondly, it does
   not require the controller to guess the media that will be used by
   the participants. Thirdly, it does not assume that a device responds
   properly to an INVITE with SDP containing a connection address of
   0.0.0.0.

   There are some drawbacks. The controller does need to perform SDP
   manipulations. Specifically, it must take some SDP, and generate
   another SDP which has the same media composition, but has connection
   addresses of 0.0.0.0. This is needed for message 3. Secondly, it may
   need to reorder and trim SDP X, so that its media lines match up
   with those in some other SDP, Y. Thirdly, the offer from B (offer2)
   may have no codecs or media streams in common with the offer from A
   (offer1). The controller will need to detect this condition, and
   terminate the call. Finally, the flow is far more complicated than
   the simple and elegant Flow I (Figure 1).

4.4 Flow IV

   Flow IV shows a variation on Flow III that reduces its complexity.
   The actual message flow is identical, but the SDP placement and
   construction differs. The initial INVITE (1) contains SDP with no
   media at all, meaning that there are no m lines. This is valid, and
   implies that the media makeup of the session will be established
   later through a re-INVITE [4]. The 200 OK (2) has an answer with no
   media either. This is acknowledged by the controller (3). The flow
   from this point onwards is identical to Flow III. However, the
   manipulations required to convert offer2 to offer2', and answer2' to
   answer2, are much simpler. Indeed, no media manipulations are needed
   at all. The only change that is needed is to modify the origin lines,
   so that the origin line in offer2' is valid based on the value in
   offer1 (validity requires that the version increments by one, and
   that the other parameters remain unchanged).

   A                 Controller                  B
   |(1) INVITE offer1     |                      |
   |no media              |                      |
   |<---------------------|                      |
   |(2) 200 answer1       |                      |
   |no media              |                      |
   |--------------------->|                      |
   |(3) ACK               |                      |
   |<---------------------|                      |
   |                      |(4) INVITE no SDP     |
   |                      |--------------------->|
   |                      |(5) 200 OK offer2     |
   |                      |<---------------------|
   |(6) INVITE offer2'    |                      |
   |<---------------------|                      |
   |(7) 200 answer2'      |                      |
   |--------------------->|                      |
   |                      |(8) ACK answer2       |
   |                      |--------------------->|
   |(9) ACK               |                      |
   |<---------------------|                      |
   |(10) RTP              |                      |
   |-------------------------------------------->|

   Figure 4: 3pcc Flow IV

4.5 Recommendations

   Flow I (Figure 1) represents the simplest and the most efficient
   flow. This flow SHOULD be used by a controller if it knows with
   certainty that user B is actually an automaton that will answer the
   call immediately. This is the case for devices such as media servers,
   conferencing servers, and messaging servers, for example.
   Since we
   expect a great deal of third party call control to be to automata,
   special-casing this scenario is reasonable.

   For calls to unknown entities, or to entities known to represent
   people, it is RECOMMENDED that Flow IV (Figure 4) be used for third
   party call control. Flow III MAY be used instead, but it provides no
   additional benefits over Flow IV. However, Flow II SHOULD NOT be
   used, because of the potential for infinite ping-ponging of
   re-INVITEs.

   Several of these flows use a "black hole" connection address of
   0.0.0.0. This is an IPv4 address with the property that packets sent
   to it will never leave the host which sent them; they are just
   discarded. Those flows are therefore specific to IPv4. For other
   network or address types, an address with an equivalent property
   SHOULD be used.

5 Error Handling

   With all of the call flows in Section 4, one call is established to
   A, and then the controller attempts to establish a call to B.
   However, this call attempt may fail, for any number of reasons. User
   B might be busy (resulting in a 486 response to the INVITE), there
   may not be any media in common, the request may time out, and so on.
   If the call attempt to B should fail, it is RECOMMENDED that the
   controller send a BYE to A. This BYE SHOULD include a Reason header
   [5] which carries the status code from the error response. This will
   inform A of the precise reason for the failure. The information is
   important from a user interface perspective. For example, if A was
   calling from a black phone, and B generated a 486, the BYE will
   contain a Reason code of 486, and this could be used to generate a
   local busy signal so that A knows that B is busy.

6 Continued Processing

   Once the calls are established, both participants believe they are in
   a single point-to-point call.
   However, they are exchanging media
   directly with each other, rather than with the controller. The
   controller is involved in two dialogs, yet sees no media.

   Since the controller is still a central point for signaling, it now
   has complete control over the call. If it receives a BYE from one of
   the participants, it can create a new BYE and hang up with the other
   participant. This is shown in Figure 5.

   Similarly, if it receives a re-INVITE from one of the participants,
   it can forward it to the other participant. Depending on which flow
   was used, this may require some manipulation on the SDP before
   passing it on.

   A              Controller               B
   |(1) BYE            |                   |
   |------------------>|                   |
   |(2) 200 OK         |                   |
   |<------------------|                   |
   |                   |(3) BYE            |
   |                   |------------------>|
   |                   |(4) 200 OK         |
   |                   |<------------------|

   Figure 5: Hanging Up with 3PCC

   However, the controller need not "proxy" the SIP messages received
   from one of the parties. Since it is a B2BUA, it can invoke any
   signaling mechanism on each dialog, as it sees fit. For example, if
   the controller receives a BYE from A, it can generate a new INVITE to
   a third party, C, and connect B to that participant instead. A call
   flow for this is shown in Figure 6, assuming the case where C
   represents an end user, not an automaton. Note that it is just Flow
   IV.

   From here, new parties can be added, removed, transferred, and so on,
   as the controller sees fit.

   It is important to point out that the call need not have been
   established by the controller in order for the processing of this
   section to be used. Rather, the controller could have acted as a
   B2BUA during a call established by A towards B (or vice versa).

7 3pcc and Early Media

   Early media represents the condition where the session is established
   (as a result of the completion of an offer/answer exchange), yet the
   call itself has not been accepted.
   This is usually used to convey
   tones or announcements regarding progress of the call. Handling of
   early media in a third party call is straightforward.

   A           Controller            B                C
   |(1) BYE         |                |                |
   |--------------->|                |                |
   |(2) 200 OK      |                |                |
   |<---------------|                |                |
   |                |(3) INV no media|                |
   |                |-------------------------------->|
   |                |(4) 200 no media|                |
   |                |<--------------------------------|
   |                |(5) ACK         |                |
   |                |-------------------------------->|
   |                |(6) INV no SDP  |                |
   |                |--------------->|                |
   |                |(7) 200 offer3  |                |
   |                |<---------------|                |
   |                |(8) INV offer3' |                |
   |                |-------------------------------->|
   |                |(9) 200 answer3'|                |
   |                |<--------------------------------|
   |                |(10) ACK        |                |
   |                |-------------------------------->|
   |                |(11) ACK answer3|                |
   |                |--------------->|                |
   |                |                |(12) RTP        |
   |                |                |--------------->|

   Figure 6: Alternative to Hangup

   Figure 7 shows the case where user B generates early media before
   answering the call. The flow is almost identical to Flow IV from
   Figure 4. The only difference is that user B generates a reliable
   provisional response (5) [6] instead of a final response, and answer2
   is carried in a PRACK (8) instead of an ACK. When party B finally
   does accept the call (11), there is no change in the session state,
   and therefore, no signaling needs to be done with user A. The
   controller simply ACKs the 200 OK (12) to confirm the dialog.

   The case where user A generates early media is more complicated, and
   is shown in Figure 8. The flow is based on Flow IV. The controller
   sends an INVITE to user A (1), with an offer containing no media
   streams. User A generates a reliable provisional response (2)
   containing an answer with no media streams. The controller PRACKs
   this provisional response (3). Now, the controller sends an INVITE
   without SDP to user B (5).
   User B's phone rings, and they answer,
   resulting in a 200 OK (6) with an offer, offer2. The controller now
   needs to update the session parameters with user A. However, since
   the call has not been answered, it cannot use a re-INVITE. Rather, it
   uses a SIP UPDATE request (7) [7], passing the offer (after modifying
   it to get the origin field correct). User A generates its answer in
   the 200 OK to the UPDATE (8). This answer is passed to user B in the
   ACK (9). When user A finally answers (11), there is no change in
   session state, so the controller simply ACKs the 200 OK (12).

   A                 Controller                  B
   |(1) INVITE offer1     |                      |
   |no media              |                      |
   |<---------------------|                      |
   |(2) 200 answer1       |                      |
   |no media              |                      |
   |--------------------->|                      |
   |(3) ACK               |                      |
   |<---------------------|                      |
   |                      |(4) INVITE no SDP     |
   |                      |--------------------->|
   |                      |(5) 183 offer2        |
   |                      |<---------------------|
   |(6) INVITE offer2'    |                      |
   |<---------------------|                      |
   |(7) 200 answer2'      |                      |
   |--------------------->|                      |
   |                      |(8) PRACK answer2     |
   |                      |--------------------->|
   |                      |(9) 200 PRACK         |
   |                      |<---------------------|
   |(10) RTP              |                      |
   |<--------------------------------------------|
   |                      |(11) 200 OK           |
   |                      |<---------------------|
   |                      |(12) ACK              |
   |                      |--------------------->|

   Figure 7: Early Media from User B

   Note that it is likely that there will be clipping of media in this
   call flow. User A is likely a PSTN gateway, and has generated a
   provisional response because of early media from the PSTN side. The
   PSTN will deliver this media even though the gateway does not have
   anywhere to send it, since the initial offer from the controller had
   no media streams. When user B answers, media can begin to flow.
   However, any media sent to the gateway from the PSTN up to that point
   will be lost.

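   The choice of request in Figures 7 and 8 depends on the state of
   each dialog rather than on anything specific to early media. A
   minimal Python sketch of that rule is given here for illustration
   only. The names LegState, offer_method and answer_carrier are
   hypothetical, and the sketch assumes that any provisional response
   carrying an offer was sent reliably [6].

```python
from enum import Enum


class LegState(Enum):
    """State of the controller's dialog with one participant (hypothetical)."""
    EARLY = "early"          # INVITE sent, call not yet answered
    CONFIRMED = "confirmed"  # call answered and ACKed


def offer_method(state: LegState) -> str:
    """Pick the request that carries a new offer to a participant.

    A re-INVITE can only be used once the call has been answered, so an
    early dialog gets an UPDATE instead (Figure 8, message 7).
    """
    return "UPDATE" if state is LegState.EARLY else "INVITE"


def answer_carrier(offer_status: int) -> str:
    """Pick the request that carries the answer back to the offerer.

    An offer in a 2xx is answered in the ACK. An offer in a provisional
    response is answered in the PRACK (Figure 7, message 8); this
    assumes the provisional response was sent reliably.
    """
    if 200 <= offer_status < 300:
        return "ACK"
    if 100 < offer_status < 200:
        return "PRACK"
    raise ValueError(f"no offer expected in a {offer_status} response")


# Figure 8: A has not answered yet, B's offer arrived in a 200 OK.
print(offer_method(LegState.EARLY), answer_carrier(200))
```

   With these two helpers the controller does not need separate code
   paths for Figures 7 and 8; it only needs to track whether each leg
   has been answered and which response carried the offer.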
   A                 Controller                  B
   |(1) INVITE offer1     |                      |
   |no media              |                      |
   |<---------------------|                      |
   |(2) 183 answer1       |                      |
   |no media              |                      |
   |--------------------->|                      |
   |(3) PRACK             |                      |
   |<---------------------|                      |
   |(4) 200 PRACK         |                      |
   |--------------------->|                      |
   |                      |(5) INVITE no SDP     |
   |                      |--------------------->|
   |                      |(6) 200 OK offer2     |
   |                      |<---------------------|
   |(7) UPDATE offer2'    |                      |
   |<---------------------|                      |
   |(8) 200 answer2'      |                      |
   |--------------------->|                      |
   |                      |(9) ACK answer2       |
   |                      |--------------------->|
   |(10) RTP              |                      |
   |-------------------------------------------->|
   |(11) 200 OK           |                      |
   |--------------------->|                      |
   |(12) ACK              |                      |
   |<---------------------|                      |

   Figure 8: Early Media from User A

8 Third Party Call Control and SDP Preconditions

   A SIP extension has been specified that allows for the coupling of
   signaling and resource reservation [2]. That extension relies on
   exchanges of session descriptions before completion of the call
   setup. These flows are initiated when certain SDP parameters are
   passed in the initial INVITE. As a result, the interaction of this
   mechanism with third party call control is not obvious, and worth
   detailing.

   Consider the call flow in Figure 9. The controller follows Flow IV;
   it has no specific requirements for support of the preconditions
   specification [2]. Indeed, there is no mechanism that can be used
   with Flow IV which allows the controller to request preconditions.
   Therefore, it sends an INVITE (1) with SDP that contains no media
   lines. User A is interested in supporting preconditions, and does not
   want to ring its phone until resources are reserved. Since there are
   no media streams in the INVITE, it can't ring the phone until they
   are conveyed in a subsequent offer. Therefore, it generates a 183
   with the answer, and doesn't alert the user (2).
   The controller
   PRACKs this (3) and A responds to the PRACK (4).

   At this point, the controller attempts to bring B into the call. It
   sends B an INVITE without SDP (5). B is interested in having
   preconditions for this call. Therefore, it generates its offer in a
   183 that contains the appropriate SDP attributes (6). The controller
   passes this offer to A in an UPDATE request (7). The controller uses
   UPDATE because the call has not been answered yet, and therefore, it
   cannot use a re-INVITE. User A sees that its peer is capable of
   supporting preconditions. Since it desires preconditions for the
   call, it generates an answer in the 200 OK (8) to the UPDATE. This
   answer, in turn, is passed to B in the PRACK for the provisional
   response (9). Now, both sides perform resource reservation. User A
   succeeds first, and passes an updated session description in an
   UPDATE request (13). The controller simply passes this to B (after
   the manipulation of the origin field, as required in Flow IV) in an
   UPDATE (14), and the answer (15) is passed back to A (16). The same
   flow happens, but from B to A, when B's reservation succeeds (17-20).
   Since the preconditions have been met, both sides ring (21 and 22),
   and then both answer (23 and 25), completing the call.

   What is important about this flow is that the controller doesn't know
   anything about preconditions. It merely passes the SDP back and forth
   as needed. The trick is the usage of UPDATE and PRACK to pass the SDP
   when needed. That determination is made entirely based on the
   offer/answer rules described in [6] and [7], and is independent of
   preconditions.

9 Example Call Flows

9.1 Click to Dial

   The first application of this capability we discuss is click to dial.
   In this service, a user is browsing the web page of an e-commerce
   site, and would like to speak to a customer service representative.
   They click on a link, and a call is placed to a customer service
   representative. When the representative picks up, the phone on the
   user's desk rings. When the user picks up, the customer service
   representative is there, ready to talk to the user.

   A                 Controller                  B
   |(1) INVITE offer1     |                      |
   |no media              |                      |
   |<---------------------|                      |
   |(2) 183 answer1       |                      |
   |no media              |                      |
   |--------------------->|                      |
   |(3) PRACK             |                      |
   |<---------------------|                      |
   |(4) 200 OK            |                      |
   |--------------------->|                      |
   |                      |(5) INVITE no SDP     |
   |                      |--------------------->|
   |                      |(6) 183 offer2        |
   |                      |des=sendrecv          |
   |                      |conf=recv             |
   |                      |cur=none              |
   |                      |<---------------------|
   |(7) UPDATE offer2'    |                      |
   |des=sendrecv          |                      |
   |conf=recv             |                      |
   |cur=none              |                      |
   |<---------------------|                      |
   |(8) 200 UPDATE        |                      |
   |answer2'              |                      |
   |des=sendrecv          |                      |
   |conf=recv             |                      |
   |cur=none              |                      |
   |--------------------->|                      |
   |                      |(9) PRACK answer2     |
   |                      |des=sendrecv          |
   |                      |conf=recv             |
   |                      |cur=none              |
   |                      |--------------------->|
   |                      |(10) 200 PRACK        |
   |                      |<---------------------|
   |(11) reservation      |                      |
   |-------------------------------------------->|
   |(12) reservation      |                      |
   |<--------------------------------------------|
   |(13) UPDATE offer3    |                      |
   |des=sendrecv          |                      |
   |conf=recv             |                      |
   |cur=recv              |                      |
   |--------------------->|                      |
   |                      |(14) UPDATE offer3'   |
   |                      |des=sendrecv          |
   |                      |conf=recv             |
   |                      |cur=recv              |
   |                      |--------------------->|
   |                      |(15) 200 UPDATE       |
   |                      |answer3'              |
   |                      |des=sendrecv          |
   |                      |conf=recv             |
   |                      |cur=send              |
   |                      |<---------------------|
   |(16) 200 UPDATE       |                      |
   |answer3               |                      |
   |des=sendrecv          |                      |
   |conf=recv             |                      |
   |cur=send              |                      |
   |<---------------------|                      |
   |                      |(17) UPDATE offer4    |
   |                      |des=sendrecv          |
   |                      |conf=recv             |
   |                      |cur=sendrecv          |
   |                      |<---------------------|
   |(18) UPDATE offer4'   |                      |
   |des=sendrecv          |                      |
   |conf=recv             |                      |
   |cur=sendrecv          |                      |
   |<---------------------|                      |
   |(19) 200 UPDATE       |                      |
   |answer4'              |                      |
   |des=sendrecv          |                      |
   |conf=recv             |                      |
   |cur=sendrecv          |                      |
   |--------------------->|                      |
   |                      |(20) 200 UPDATE       |
   |                      |answer4               |
   |                      |des=sendrecv          |
   |                      |conf=recv             |
   |                      |cur=sendrecv          |
   |                      |--------------------->|
   |(21) 180 INVITE       |                      |
   |--------------------->|                      |
   |                      |(22) 180 INVITE       |
   |                      |<---------------------|
   |(23) 200 INVITE       |                      |
   |--------------------->|                      |
   |(24) ACK              |                      |
   |<---------------------|                      |
   |                      |(25) 200 INVITE       |
   |                      |<---------------------|
   |                      |(26) ACK              |
   |                      |--------------------->|

   Figure 9: Call Flow for Preconditions

Customer Service  Controller         User's Phone       User's Browser
   |                   |(1) HTTP POST      |                   |
   |                   |<--------------------------------------|
   |                   |(2) HTTP 200 OK    |                   |
   |                   |-------------------------------------->|
   |(3) INVITE offer1  |                   |                   |
   |no media           |                   |                   |
   |<------------------|                   |                   |
   |(4) 200 answer1    |                   |                   |
   |no media           |                   |                   |
   |------------------>|                   |                   |
   |(5) ACK            |                   |                   |
   |<------------------|                   |                   |
   |                   |(6) INVITE no SDP  |                   |
   |                   |------------------>|                   |
   |                   |(7) 200 OK offer2  |                   |
   |                   |<------------------|                   |
   |(8) INVITE offer2' |                   |                   |
   |<------------------|                   |                   |
   |(9) 200 answer2'   |                   |                   |
   |------------------>|                   |                   |
   |                   |(10) ACK answer2   |                   |
   |                   |------------------>|                   |
   |(11) ACK           |                   |                   |
   |<------------------|                   |                   |
   |(12) RTP           |                   |                   |
   |-------------------------------------->|                   |

   Figure 10: Click to Dial Call Flow

   The call flow for this service is given in Figure 10. It is identical
   to that of Figure 4, with the exception that the service is triggered
   through an HTTP POST request (1) when the user clicks on the link.

   We note that this service can be provided through other mechanisms,
   namely PINT [9].
   However, there are numerous differences between the way in which the
   service is provided by PINT, and the way in which it is provided
   here:

   o  The PINT solution enables calls only between two PSTN
      endpoints. The solution described here allows calls between
      PSTN phones (through SIP enabled gateways) and native IP
      phones.

   o  When used for calls between two PSTN phones, the solution here
      may result in a portion of the call being routed over the
      Internet. In PINT, the call is always routed only over the
      PSTN. This may result in better quality calls with the PINT
      solution, depending on the codec in use and QoS capabilities
      of the network routing the Internet portion of the call.

   o  The PINT solution requires extensions to SIP (PINT is an
      extension to SIP), whereas the solution described here is done
      with baseline SIP.

   o  The PINT solution allows the controller (acting as a PINT
      client) to "step out" once the call is established. The
      solution described here requires the controller to maintain
      call state for the entire duration of the call.

9.2 Mid-Call Announcement Capability

   The third party call control mechanism described here can also be
   used to enable mid-call announcements. Consider a service for
   pre-paid calling cards. Once the pre-paid call is established, the
   system needs to set a timer to fire when the user runs out of
   minutes. When this timer fires, we would like the user to hear an
   announcement which tells them to enter a credit card to continue.
   Once they enter the credit card info, more money is added to the
   pre-paid card, and the user is reconnected to the destination party.

   We consider here the usage of third party call control just for
   playing the mid-call dialog to collect the credit card information.

   We assume the call is set up so that the controller is in the call as
   a B2BUA.
   When the timer fires, we wish to connect the caller to a media
   server. The flow for this is shown in Figure 11. When the timer
   expires, the controller sends the called party an INVITE whose SDP
   has a connection address of zero (1). This effectively "disconnects"
   the called party. The controller then sends an INVITE without SDP to
   the pre-paid caller (4). The offer returned from the caller (5) is
   used in an INVITE to the media server which will be collecting digits
   (6). This is an instantiation of Flow II. This flow can only be used
   here because the media server is an automaton, and will answer the
   INVITE immediately. If the controller were connecting the pre-paid
   user with another end user, Flow III would need to be used. The media
   server returns an immediate 200 OK (7) with an answer, which is
   passed to the caller in an ACK (8). The result is that the media
   server and the pre-paid caller have their media streams connected.

Pre-Paid User     Controller         Called Party        Media Server
   |                   |(1) INV SDP c=0    |                   |
   |                   |------------------>|                   |
   |                   |(2) 200 answer1    |                   |
   |                   |<------------------|                   |
   |                   |(3) ACK            |                   |
   |                   |------------------>|                   |
   |(4) INV no SDP     |                   |                   |
   |<------------------|                   |                   |
   |(5) 200 offer2     |                   |                   |
   |------------------>|                   |                   |
   |                   |(6) INV offer2     |                   |
   |                   |-------------------------------------->|
   |                   |(7) 200 answer2    |                   |
   |                   |<--------------------------------------|
   |(8) ACK answer2    |                   |                   |
   |<------------------|                   |                   |
   |                   |(9) ACK            |                   |
   |                   |-------------------------------------->|
   |(10) RTP           |                   |                   |
   |---------------------------------------------------------->|
   |                   |(11) BYE           |                   |
   |                   |-------------------------------------->|
   |                   |(12) 200 OK        |                   |
   |                   |<--------------------------------------|
   |                   |(13) INV no SDP    |                   |
   |                   |------------------>|                   |
   |                   |(14) 200 offer3    |                   |
   |                   |<------------------|                   |
   |(15) INV offer3'   |                   |                   |
   |<------------------|                   |                   |
   |(16) 200 answer3'  |                   |                   |
   |------------------>|                   |                   |
   |                   |(17) ACK answer3'  |                   |
   |                   |------------------>|                   |
   |(18) ACK           |                   |                   |
   |<------------------|                   |                   |
   |(19) RTP           |                   |                   |
   |-------------------------------------->|                   |

   Figure 11: Mid-Call Announcement

   The media server plays an announcement, and prompts the user to enter
   a credit card number. After collecting the number, the card number is
   validated. The controller can then hang up the call to the media
   server (11). How the controller knows when to hang up the call is
   outside the scope of this document; it might be signaled through an
   HTTP message from the media server to the controller, for example.

   After hanging up with the media server, the controller reconnects the
   user to the original called party. To do this, the controller sends
   an INVITE without SDP to the called party (13). The 200 OK (14)
   contains an offer, offer3. The controller modifies the SDP (as is
   done in Flow III), and passes the offer in an INVITE to the pre-paid
   user (15). The pre-paid user generates an answer in a 200 OK (16)
   which the controller passes to the called party in the ACK (17). At
   this point, the caller and called party are reconnected.

10 Implementation Recommendations

   Most of the work involved in supporting third party call control is
   within the controller. A standard SIP UA should be controllable using
   the mechanisms described here. However, third party call control
   relies on a few features that might not be implemented.
   As such, we RECOMMEND that implementors of user agent servers support
   the following:

   o  Re-invites that change the port to which media should be sent

   o  Re-invites that change the connection address

   o  Re-invites that add a media stream

   o  Re-invites that remove a media stream (setting its port to
      zero)

   o  Re-invites that add a codec amongst the set in a media stream

   o  SDP Connection address of zero

   o  Initial invites with a connection address of zero

   o  Initial invites with no SDP

   o  Initial invites with SDP but no media lines

   o  Re-invites with no SDP

   o  The UPDATE method [7]

   o  Reliability of provisional responses [6]

11 Security Considerations

   The mechanism described here introduces several security
   considerations. The first issue is that of identity. When the
   controller initiates the call, what identity does it place in the
   From field of the INVITE? The controller could indicate that the call
   is from itself (From: sip:controller@company.com), but in many cases,
   the service is more usable if it "spoofs" the identity of the
   participant that is actually calling. However, to differentiate
   legitimate use of 3pcc from real attacks where a caller is faking an
   identity, user agents SHOULD authenticate the requests. The
   controller will, of course, authenticate itself as the controller,
   rather than either participant. It is RECOMMENDED that user agents be
   configurable with credentials for entities that are legitimate
   controllers. Note that this will result in SIP messages whose From
   field does not match the identity of the originator as determined
   from the authentication mechanism.

   Some of the flows require the controller to manipulate the SDP. If
   S/MIME is used to encrypt or sign the bodies of the request
   end-to-end, third party call control will fail.
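
   The identity handling described at the start of this section can be
   illustrated with a small sketch. It is not part of any SIP stack and
   is only one possible policy: the names IncomingInvite, Decision,
   screen_invite, and trusted_controllers are hypothetical, and
   identities are compared as plain strings rather than with full SIP
   URI comparison rules. It assumes the authentication layer has
   already determined who sent the request (or that nobody did).

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class Decision(Enum):
    ACCEPT = "accept"
    CHALLENGE = "challenge"  # ask the sender to authenticate
    REJECT = "reject"


@dataclass(frozen=True)
class IncomingInvite:
    """Hypothetical view of an INVITE after the authentication step."""
    from_uri: str                    # identity claimed in the From field
    authenticated_id: Optional[str]  # identity proven by authentication


def screen_invite(invite: IncomingInvite,
                  trusted_controllers: FrozenSet[str]) -> Decision:
    """Decide whether to process an INVITE whose From may be spoofed.

    Assumed policy (an illustration, not mandated by the text):
    - unauthenticated requests are challenged;
    - a request whose From matches the authenticated sender is accepted;
    - a mismatch is accepted only when the authenticated sender is a
      configured, legitimate controller; otherwise it is rejected.
    """
    if invite.authenticated_id is None:
        return Decision.CHALLENGE
    if invite.authenticated_id == invite.from_uri:
        return Decision.ACCEPT
    if invite.authenticated_id in trusted_controllers:
        # Legitimate 3pcc: From names the participant, while the
        # credentials identify the controller.
        return Decision.ACCEPT
    return Decision.REJECT
```

   In this sketch, a controller authenticated as
   sip:controller@company.com may place any participant's address in
   the From field, while the same mismatch from an unknown sender is
   treated as an attack.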

12 IANA Considerations

   There are no IANA considerations associated with this specification.

13 Authors' Addresses

   Jonathan Rosenberg
   dynamicsoft
   72 Eagle Rock Avenue
   First Floor
   East Hanover, NJ 07936
   email: jdrosen@dynamicsoft.com

   Jon Peterson
   NeuStar, Inc
   1800 Sutter Street, Suite 570
   Concord, CA 94520
   USA
   email: jon.peterson@neustar.com

   Henning Schulzrinne
   Columbia University
   M/S 0401
   1214 Amsterdam Ave.
   New York, NY 10027-7003
   email: schulzrinne@cs.columbia.edu

   Gonzalo Camarillo
   Ericsson
   Advanced Signalling Research Lab.
   FIN-02420 Jorvas
   Finland
   Phone: +358 9 299 3371
   Fax: +358 9 299 3052
   Email: Gonzalo.Camarillo@ericsson.com

14 Normative References

   [1] J. Rosenberg, H. Schulzrinne, et al., "SIP: Session initiation
   protocol," Internet Draft, Internet Engineering Task Force, Feb.
   2002. Work in progress.

   [2] W. Marshall, G. Camarillo, and J. Rosenberg, "Integration of
   resource management and SIP," Internet Draft, Internet Engineering
   Task Force, Apr. 2002. Work in progress.

   [3] S. Bradner, "Key words for use in RFCs to indicate requirement
   levels," RFC 2119, Internet Engineering Task Force, Mar. 1997.

   [4] J. Rosenberg and H. Schulzrinne, "An offer/answer model with
   SDP," Internet Draft, Internet Engineering Task Force, Feb. 2002.
   Work in progress.

   [5] H. Schulzrinne, D. Oran, and G. Camarillo, "The reason header
   field for the session initiation protocol," Internet Draft, Internet
   Engineering Task Force, Apr. 2002. Work in progress.

   [6] J. Rosenberg and H. Schulzrinne, "Reliability of provisional
   responses in SIP," Internet Draft, Internet Engineering Task Force,
   Feb. 2002. Work in progress.

   [7] J. Rosenberg, "The SIP UPDATE method," Internet Draft, Internet
   Engineering Task Force, Mar. 2002. Work in progress.

15 Informative References

   [8] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson, "RTP: a
   transport protocol for real-time applications," RFC 1889, Internet
   Engineering Task Force, Jan. 1996.

   [9] S. Petrack and L. Conroy, "The PINT service protocol: Extensions
   to SIP and SDP for IP access to telephone call services," RFC 2848,
   Internet Engineering Task Force, June 2000.

Full Copyright Statement

   Copyright (c) The Internet Society (2002). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph are
   included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.