SIPPING                                                     J. Rosenberg
Internet-Draft                                                     Cisco
Intended status: Informational                         February 24, 2008
Expires: August 27, 2008


  Identification of Communications Services in the Session Initiation
                             Protocol (SIP)
              draft-ietf-sipping-service-identification-01

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 27, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2008).

Abstract

   This document considers the problem of service identification in the
   Session Initiation Protocol (SIP).  Service identification is the
   process of determining the user-level use case that is driving the
   signaling being utilized by the user agent.  This document discusses
   the uses of service identification, and outlines several
   architectural principles behind the process.  It identifies several
   perils associated with service identification, including fraud,
   interoperability failures and stifling of innovation.

Table of Contents

   1.  Introduction
   2.  Services and Service Identification
   3.  Example Services
     3.1.  IPTV vs. Multimedia
     3.2.  Gaming vs. Voice Chat
     3.3.  Configuration vs. Pager Messaging
   4.  Using Service Identification
     4.1.  Application Invocation in the User Agent
     4.2.  Application Invocation in the Network
     4.3.  Network Quality of Service Authorization
     4.4.  Service Authorization
     4.5.  Accounting and Billing
     4.6.  Negotiation of Service
     4.7.  Dispatch to Devices
   5.  Key Principles of Service Identification
     5.1.  Services are a By-Product of Signaling
     5.2.  Identical Signaling Produces Identical Services
     5.3.  Do What I Say, not What I Mean
     5.4.  Explicit Service Identifiers are Redundant
   6.  Perils of Explicit Identifiers
     6.1.  Fraud
     6.2.  Systematic Interoperability Failures
     6.3.  Stifling of Service Innovation
   7.  Recommendations
   8.  Security Considerations
   9.  IANA Considerations
   10. Acknowledgements
   11. Informational References
   Author's Address
   Intellectual Property and Copyright Statements

1.  Introduction

   The Session Initiation Protocol (SIP) [RFC3261] defines mechanisms
   for initiating and managing communications sessions between agents.
   SIP allows for a broad array of session types between agents.  It can
   manage audio sessions, ranging from low bitrate voice-only up to
   multi-channel hi fidelity music.  It can manage video sessions,
   ranging from small, "talking-head" style video chat, up to high
   definition multipoint video conferencing, to low bandwidth
   user-generated content, up to high definition movie and TV content.
   SIP endpoints can be anything - adaptors that convert an old analog
   telephone to Voice over IP (VoIP), dedicated hardphones, fancy
   hardphones with rich displays and user entry capabilities, softphones
   on a PC, buddylist and presence applications on a PC, dedicated
   videoconferencing peripherals, and speakerphones.

   This breadth of applicability is SIP's greatest asset, but it also
   introduces numerous challenges.  One of these is that, when an
   endpoint generates a SIP INVITE for a session, or receives one, that
   session can potentially be within the context of any number of
   different use cases and endpoint types.
   For example, a SIP INVITE with a single audio stream could represent
   a Push-To-Talk session between mobile devices, a VoIP session between
   softphones, or audio-based access to stored content on a server.

   These differing use cases have driven implementors and system
   designers to seek techniques for service identification.  Service
   identification is the process of determining and/or signaling the
   specific use case that is driving the signaling being generated by a
   user agent.  At first glance, this seems harmless and easy enough.
   It is tempting to define a new header, "Service-ID", for example, and
   have a user agent populate it with any number of well-known tokens
   which define what the service is.  It could then be consumed for any
   number of purposes.  A service identifier placed into the signaling
   is called an explicit service identifier.

   Explicit service identifiers have many problems, however.  They are
   redundant with the signaling itself (which is the ultimate expression
   of the service that is desired), and are an example of Do-What-I-Mean
   (DWIM).  Consequently, their usage can lead to fraud, systemic
   interoperability failures, and a complete stifling of the innovation
   that SIP was meant to achieve.  The purpose of this document is to
   describe service identification in more detail and describe how these
   problems arise.

   Section 2 begins by defining a service and the service identification
   problem.  Section 3 gives some concrete examples of services and why
   they can be challenging to identify.  Section 4 explores the ways in
   which a service identification can be utilized within a network.
   Next, Section 5 discusses the key architectural principles of service
   identification.  Section 6 describes how explicit service identifiers
   can lead to fraud, interoperability failures, and stifling of service
   innovation.

2.  Services and Service Identification

   The problem of identifying services within SIP is not a new one.  The
   problem has been considered extensively in the context of presence.
   In particular, the presence data model for SIP [RFC4479] defines the
   concept of a service as one of the core notions that presence
   describes.  Services are described in Section 3.3 of RFC 4479.

   Essentially, the service is the user-visible use case that is driving
   the behavior of the user agents and servers in the SIP network.
   Being user-visible means that there is a difference in user
   experience between two services that are different.  That user
   experience can be part of the call, or outside of the call.  Within a
   call, the user experience can be based on different media types (an
   audio call vs. a video chat), different content within a particular
   media type (stored content, such as a movie or TV session), different
   devices (a wireless device for "telephony" vs. a PC application for
   "voice-chat"), different user interfaces (a buddy list view of voice
   on a PC application vs. a software emulation of a hard phone),
   different communities that can be accessed (voice chat with other
   users that have the same voice chat client, vs. voice communications
   with any endpoint on the PSTN), or different applications that are
   invoked by the user (manually selecting a push-to-talk application
   from a wireless phone vs. a telephony application).  Outside of a
   call, the difference in user experience can be a billing one (cheaper
   for one service than another), a notification feature for one and not
   another (for example, an IM that gets sent whenever a user makes a
   call), and so on.

   In some cases, there is very little difference in the underlying
   technology that will support two different services, and in other
   cases, there are big differences.
   However, for purposes of this discussion, the key definition is that
   two services are distinct when there is a perceived difference by the
   user in the two services.

   This leads naturally to the desire to perform service identification.
   Service identification is defined as the process of (1) determining
   the underlying service which is driving a particular signaling
   exchange, (2) associating that service with some kind of moniker, and
   (3) attaching that moniker to a signaling message (typically a SIP
   INVITE), and then utilizing it for various purposes within the
   network.  Service identification can be done in the endpoints, in
   which case the UA would insert the moniker directly into the
   signaling message based on its awareness of the service.  Or, it can
   be done within a server in the network (such as a proxy), based on
   inspection of the SIP message, or based on hints placed into the
   message by the user.

3.  Example Services

   It is very useful to consider several example services, especially
   ones that appear difficult to differentiate from each other.

3.1.  IPTV vs. Multimedia

   IP Television (IPTV) is the usage of IP networks to access
   traditional television content, such as movies and shows.  SIP can be
   utilized to establish a session to a media server in a network, which
   then serves up multimedia content and streams it as an audio and
   video stream towards the client.  Whether SIP is ideal for IPTV is,
   in itself, a good question.  However, such a discussion is outside
   the scope of this document.

   Consider multimedia conferencing.  The user accesses a voice and
   video conference at a conference server.  The user might join in
   listen-only mode, in which case the user receives audio and video
   streams, but does not send.

   These two services, IPTV and listen-only multimedia conferencing,
   clearly appear as different services.
   They have different user experiences and applications.  A user is
   unlikely to ever be confused about whether a session is IPTV or
   listen-only multimedia conferencing.  Indeed, they are likely to have
   different software applications or endpoints for the two services.

   However, these two services look remarkably alike based on the
   signaling.  Both utilize audio and video.  Both could utilize the
   same codecs.  Both are unidirectional streams (from a server in the
   network to the client).  Thus, it would appear on the surface that
   there is no way to differentiate them, based on inspection of the
   signaling alone.

3.2.  Gaming vs. Voice Chat

   Consider an interactive game, played between two users from their
   mobile devices.  The game involves the users sending each other game
   moves, using a messaging channel, in addition to voice.  In another
   service, users have a voice and IM chat conversation using a buddy
   list application on their PC.

   In both services, there are two media streams - audio and messaging.
   The audio uses the same codecs.  Both use the Message Session Relay
   Protocol (MSRP) [RFC4975].  In both cases, the caller would send an
   INVITE to the Address of Record (AOR) of the target user.  However,
   these represent fairly different services, in terms of user
   experience.

3.3.  Configuration vs. Pager Messaging

   The SIP MESSAGE method [RFC3428] provides a way to send one-shot
   messages to a particular AOR.  This specification is primarily aimed
   at Short Message Service (SMS) style messaging, commonly found in
   wireless phones.  Receipt of a MESSAGE request would cause the
   messaging application on a phone to launch, allowing the user to
   browse message history and respond.

   However, MESSAGE is sometimes used for the delivery of content to a
   device for other purposes.
   For example, some providers use it to deliver configuration updates,
   such as new phone settings or parameters, or to indicate that a new
   version of firmware is available.  Though not designed for this
   purpose, MESSAGE gets used since, in existing wireless networks, SMS
   is used for this purpose, and MESSAGE is the SIP equivalent of SMS.

   Consequently, the MESSAGE request sent to a phone can be for two
   different services.  One would require invocation of a messaging app,
   whereas the other would be consumed by the software in the phone,
   without any user interaction at all.

4.  Using Service Identification

   It is important to understand what the service identity would be
   utilized for, if known.  This section discusses the primary uses.
   These are application invocation in user agents and the network,
   Quality of Service authorization, service authorization, accounting
   and billing, service negotiation, and device dispatch.

4.1.  Application Invocation in the User Agent

   In some of the examples above, there were multiple software
   applications executing on the host.  One common way of achieving this
   is to utilize a common SIP user agent implementation that listens for
   requests on a single port.  When an incoming INVITE or MESSAGE
   arrives, it must be delivered to the appropriate application
   software.  When each service is bound to a distinct software
   application, it would seem that the service identity is needed to
   dispatch the message to the appropriate piece of software.  This is
   shown in Figure 1.

      +---------------------------------+
      |                                 |
      | +-------------+ +-------------+ |
      | |     UI      | |     UI      | |
      | +-------------+ +-------------+ |
      | +-------------+ +-------------+ |
      | |             | |             | |
      | |  Service 1  | |  Service 2  | |
      | |             | |             | |
      | +-------------+ +-------------+ |
      | +-----------------------------+ |
      | |                             | |
      | |             SIP             | |
      | |            Layer            | |
      | |                             | |
      | +-----------------------------+ |
      |                                 |
      +---------------------------------+

                Physical Device

                   Figure 1

   The role of the SIP layer is to parse incoming messages, handle the
   SIP state machinery for transactions and dialogs, and then dispatch
   the request to the appropriate service.  This software architecture
   is analogous to the way web servers frequently work.  An HTTP server
   listens on port 80 for requests, and based on the HTTP Request-URI,
   dispatches the request to a number of disparate applications.  The
   same is happening here.  For the example services in Section 3.2, an
   incoming INVITE for the gaming service would be delivered to the
   gaming application software.  An incoming INVITE for the voice chat
   service would be delivered to the voice chat application software.
   For the examples in Section 3.3, a MESSAGE request for user to user
   messaging would be delivered to the messaging or SMS app, and a
   MESSAGE request containing configuration data would be delivered to a
   configuration update application.

   Unlike the web, however, in all three use cases, the user initiating
   communications has only a single identifier for the recipient - their
   AOR.  Consequently, the SIP Request-URI cannot be used for
   dispatching, as it is identical in all three cases.

4.2.  Application Invocation in the Network

   Another usage of a service identifier would be to cause servers in
   the SIP network to provide additional processing, based on the
   service.
   For example, an INVITE issued by a user agent for IPTV would pass
   through a server that does some kind of content rights management,
   authorizing whether the user is allowed to access that content.  On
   the other hand, an INVITE issued by a user for multimedia
   conferencing would pass through a server providing "traditional"
   telephony features, such as outbound call screening and call
   recording.  It would make no sense for the INVITE associated with
   IPTV to have outbound call screening and call recording applied, and
   it would make no sense for the multimedia conferencing INVITE to be
   processed by the content rights management server.  Indeed, in these
   cases, it's not just an efficiency issue (invoking servers when not
   needed), but rather, truly incorrect behavior can occur.  For
   example, if an outbound call screening application is set to block
   outbound calls to everything except for the phone numbers of friends
   and family, an IPTV request that gets processed by such a server
   would be blocked (as it's not targeted to the AOR of a friend or
   family member).  This would block a user's attempt to access IPTV
   services, when that was not the goal at all.

   Similarly, a MESSAGE request from Section 3.3 might need to pass
   through a message server for filtering when it is associated with
   chat, but not when it is associated with configuration.  Consider a
   filter which gets applied to MESSAGE requests, and that filter runs
   in a server in the network.  The filter operation prevents user Joe
   from sending messages to user Bob that contain the words "stock" or
   "purchase", due to some regulations that disallow Joe and Bob from
   discussing stock trading.  However, a MESSAGE for configuration
   purposes might contain an XML document that uses the token "stock" as
   some kind of attribute.  This configuration update would be discarded
   by the filtering server, when it should not have been.

4.3.  Network Quality of Service Authorization

   The IP network can provide differing levels of Quality of Service
   (QoS) to IP packets.  This service can include guaranteed throughput,
   latency, or loss characteristics.  Typically, the user agent will
   make some kind of QoS request, either using explicit signaling
   protocols (such as RSVP) or through marking of the Diffserv value in
   packets.  The network will need to make a policy decision based on
   whether these QoS treatments are authorized or not.  One common
   authorization policy is to check if the user has invoked a service
   using SIP that they are authorized to invoke, and that this service
   requires the level of QoS treatment the user has requested.

   For example, consider IPTV and multimedia conferencing as described
   in Section 3.1.  IPTV is a non-real time service.  Consequently,
   media traffic for IPTV would be authorized for bandwidth guarantees,
   but not for latency or loss guarantees.  On the other hand,
   multimedia conferencing is real time.  Its traffic would require
   bandwidth, loss and latency guarantees from the network.

   Consequently, if a user should make an RSVP reservation for a media
   stream, and ask for latency guarantees for that stream, the network
   would like to be able to authorize it if the service was multimedia
   conferencing, but not if it was IPTV.  This would require the server
   performing the QoS authorization to know the service associated with
   the INVITE that set up the session.

4.4.  Service Authorization

   Frequently, a network administrator will want to authorize whether a
   user is allowed to invoke a particular service.  Not all users will
   be authorized to use all services that are provided.  For example, a
   user may not be authorized to access IPTV services, whereas they are
   authorized to utilize multimedia conferencing.
   A user might not be able to utilize a multiplayer gaming service,
   whereas they are authorized to utilize voice chat services.

   Consequently, when an INVITE arrives at a server in the network, the
   server will need to determine what the requested service is, so that
   the server can make an authorization decision.

4.5.  Accounting and Billing

   Service authorization and accounting/billing go hand in hand.  One of
   the primary reasons for authorizing that a user can utilize a service
   is that they are being billed differently based on the type of
   service.  Consequently, one of the goals of a service identity is to
   be able to include it in accounting records, so that the appropriate
   billing model can be applied.

   For example, in the case of IPTV, a service provider can bill based
   on the content (US $5 per movie, perhaps), whereas for multimedia
   conferencing, they can bill by the minute.  This requires the
   accounting streams to indicate which service was invoked for the
   particular session.

4.6.  Negotiation of Service

   In some cases, when the caller initiates a session, they don't
   actually know which service will be utilized.  Rather, they might
   like to offer up all of the services they have available to the
   called party, and then let the called party decide, or let the system
   make a decision based on overlapping service capabilities.

   As an example, a user can do both the game and the voice chat service
   of Section 3.2.  They initiate a session to a target AOR, but the
   devices used by that user can only support voice chat.  The called
   device returns, in its call acceptance, an indication that only voice
   chat can be used.  Consequently, voice chat gets utilized for the
   session.

4.7.  Dispatch to Devices

   When a user has multiple devices, each with varying capabilities in
   terms of service, it is useful to dispatch an incoming request to the
   right device based on whether the device can support the service that
   has been requested.

   For example, if a user initiates a gaming session with voice chat,
   and the target user has two devices - one that can support the gaming
   service, and the other that cannot, the INVITE should be dispatched
   to the device which supports the gaming session.

5.  Key Principles of Service Identification

   In this section, we describe four key principles of service
   identification:

   1.  Services are a by-product of signaling

   2.  Identical signaling produces identical services

   3.  Explicit service identifiers are an example of Do-What-I-Mean
       (DWIM)

   4.  Explicit service identifiers are redundant

5.1.  Services are a By-Product of Signaling

   Almost always, the first solution that people consider is to add some
   kind of field to the signaling messages which indicates what the
   service is.  This field would then be inserted by the user agent, and
   then can be used by the proxies and other user agents as a service
   identifier.

   This approach, however, misses a key point, which cannot be stressed
   enough, and which represents the core architectural principle to be
   understood here:

      A service is the by-product of the signaling and the context
      around it (the user profile, time-of-day and so on) - the effects
      of the signaling message once launched into the network.  The
      service identity is therefore always derivable from the signaling
      and its context without additional identifiers.

   When a user sends an INVITE request to the network, and targets that
   request at an IPTV server, and includes SDP for audio and video
   streaming, the *result* of sending such an INVITE is that an IPTV
   session occurs.
   The entire purpose of the INVITE is to establish such a session, and
   therefore, invoke the service.  Thus, a service is not something that
   is different from the rest of the signaling message.  A service is
   what the user gets after the network and other user agents have
   processed a signaling message.

5.2.  Identical Signaling Produces Identical Services

   This principle is a natural conclusion of the previous assertion.  If
   a service is the byproduct of signaling, how can a user have
   different experiences and different services when the signaling
   message is the same?  They cannot.

   But how can that be?  From the examples in Section 3, it would seem
   that there are services which are different, but have identical
   signaling.  If we hold true to the assertion, there is in fact only
   one logical conclusion:

      If two services are different, but their signaling appears to be
      the same, it is because there is in fact something different that
      has been overlooked, or something has been implied from the
      signaling which should have been signaled explicitly.

   To illustrate this, let us take each of the example services in
   Section 3 and investigate whether there is, or should be, something
   different in the signaling in each case.

   IPTV vs. Multimedia Conferencing:  The two services in Section 3.1
      appear to have identical signaling.  They both involve audio and
      video streams, both of which are unidirectional.  Both might
      utilize the same codecs.  However, there is another important
      difference in the signaling - the target URI.  In the case of
      IPTV, the request is targeted at a media server or to a particular
      piece of content to be viewed.  In the case of multimedia
      conferencing, the target is a conference server.
      The administrator of the domain can therefore examine the two
      Request-URIs, and figure out whether each is targeted for a
      conference server or a content server, and use that to derive the
      service associated with the request.

   Gaming vs. Voice Chat:  Though both sessions involve MSRP and voice,
      and both are targeted to the same AOR of the called user, there is
      a difference.  The MSRP messages for the gaming session carry
      content which is game specific, whereas the MSRP messages for the
      voice chat are just regular text, meant for rendering to a user.
      Thus, the MSRP session in the SDP will indicate the specific
      content type that MSRP is carrying, and this type will differ
      between the two cases.  Even if the game moves look like text,
      since they are being consumed by an automaton there is an
      underlying schema that dictates their content, and therefore, this
      schema represents the actual content type that should be signaled.

   Configuration vs. Pager Messaging:  Just as in the case of gaming vs.
      voice chat, the content type of the messages differentiates the
      service that occurs as a consequence of the messages.

5.3.  Do What I Say, not What I Mean

   An explicit service identifier is a field included in the signaling
   message that contains a token whose value indicates the specific
   service invoked by the calling user.  This would be "IPTV" or "voice
   chat" or "shoot-em game" or "short message service".  This explicit
   identifier would typically be inserted by the originating user agent,
   and carried in the signaling message.

   "Do What I Mean", abbreviated as DWIM, is a concept in computer
   science.  It is sometimes used to describe a function which tries to
   intelligently guess at what the user intended.  It is in contrast to
   "Do What I Say", or DWIS, which describes a function that behaves
   concretely based on the inputs provided.
   Systems built on the DWIM concept can have unexpected behaviors
   because they are driven by unstated rules.

   An explicit service identifier is an example of a DWIM approach.  An
   explicit service identifier itself has no well-defined impact on the
   state machinery or protocols in the system; it has various
   side-effects based on an assumption of what is meant by the service
   identifier.  Interpretation of the signaling directly is an
   expression of the principle of DWIS - the behavior of the system is
   based entirely on the specifics of the protocol and is well defined
   by the protocol specification.

5.4.  Explicit Service Identifiers are Redundant

   Because an explicit service identifier is, by definition, inside of
   the signaling message, and because the signaling itself completely
   defines the behavior of the service, another natural conclusion is
   that an explicit service identifier is redundant with the signaling
   itself.  It says nothing that could not otherwise be derived from
   examination of the signaling.

6.  Perils of Explicit Identifiers

   Based on these principles, several perils of an explicit service
   identifier can be described.  They are:

   1.  Explicit identifiers can be used for fraud

   2.  Explicit identifiers can hurt interoperability

   3.  Explicit identifiers can stifle service innovation

6.1.  Fraud

   Explicit service identifiers can lead to fraud.  If a provider uses
   the service identifier for billing and accounting purposes, or for
   authorization purposes, it opens an avenue for attack.  The user can
   construct the signaling message so that its actual effect (which is
   the service the user will receive) is what the user desires, but the
   user places a service identifier into the request (which is what is
   used for billing and authorization) that identifies a cheaper
   service, or one that the user is authorized to receive.
In such a 585 case, the user will be billed for something they did not receive. 587 If, however, the domain administrator derived the service identifier 588 from the signaling itself, the user cannot lie. If they did lie, 589 they wouldn't get the desired service. 591 Consider the example of IPTV vs. multimedia conferencing. If 592 multimedia conferencing is cheaper, the user could send an INVITE for 593 an IPTV session, but include a service identifier which indicates 594 multimedia conferencing. The user gets the service associated with 595 IPTV, but at the cost of multimedia conferencing. 597 This same principle shows up in other places, for example in the 598 identification of an emergency services call 599 [I-D.ietf-ecrit-framework]. It is desirable to give emergency 600 services calls special treatment, such as being free, authorized even 601 when the user cannot otherwise make calls, and to give them priority. 602 If emergency calls were indicated through something other than the 603 target of the call being an emergency services URN [RFC5031], it 604 would open an avenue for fraud. The user could place any desired URI 605 in the request-URI, and indicate that the call is an emergency 606 services call. This call would then get special treatment, but of 607 course get routed to the target URI. The only way to prevent this 608 fraud is to consider an emergency call as any call whose target is an 609 emergency services URN. Thus, the service identification here is 610 based on the target of the request. When the target is an emergency 611 services URN, the request can get special treatment. The user cannot 612 lie, since there is no way to separately indicate this is an 613 emergency call, besides targeting it to an emergency URN. 615 6.2. Systematic Interoperability Failures 617 How can inclusion of an explicit service identifier cause loss of 618 interoperability?
When such an identifier is used to drive 619 functionality - such as dispatch on the phones, in the network, or 620 QoS authorization - it means that the wrong thing can happen when this 621 field is not set properly. Consider a user in domain 1, calling a 622 user in domain 2. Domain 1 provides the user with a service they 623 call "voice chat", which utilizes voice and IM for real time 624 conversation, driven off of a buddy list application on a PC. Domain 625 2 provides their users with a service they call "text telephony", 626 which is a voice service on a wireless device that also allows the 627 user to send text messages. Consider the case where domain 1 and 628 domain 2 both have their user agents insert a service identifier 629 into the request, and then use that to derive QoS authorization, 630 accounting, and invocation of applications in the network and in the 631 device. The user in domain 1 calls the user in domain 2, and inserts 632 the identifier "Voice Chat" into the INVITE. When this arrives at 633 the server in domain 2, the service identifier is unknown. 634 Consequently, the request does not get the proper QoS treatment, even 635 if the call itself will succeed. 637 Explicit service identifiers, used between domains, cause 638 interoperability failures unless all interconnected domains agree on 639 exactly the same set of services and how to name them. Of course, 640 lack of service identifiers does not guarantee service 641 interoperability. However, SIP was built with rich tools for 642 negotiation of capabilities at a finely granular level. One user 643 agent can make a call using audio and video, but if the receiving UA 644 only supports audio, SIP allows both sides to negotiate down to the 645 lowest common denominator. Thus, communication is still provided.
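The audio and video case above can be sketched in code. This is a minimal illustration only, assuming a simplified model in which each m-line is a (media type, port) pair. The function names answer_offer and active_media and the port numbers are hypothetical; they do not come from this document or from any SIP library. The sketch follows the offer/answer convention that an answer carries one m-line per offered m-line, with a port of zero marking a rejected stream.

```python
def answer_offer(offer, local_ports):
    """Build a simplified answer to an SDP-style offer.

    offer: list of (media_type, port) pairs, one per offered m-line.
    local_ports: dict mapping each media type the answerer supports
    to the local port it would use (hypothetical values).
    """
    answer = []
    for media_type, _offered_port in offer:
        # Port 0 rejects the stream; otherwise accept on a local port.
        answer.append((media_type, local_ports.get(media_type, 0)))
    return answer


def active_media(answer):
    """Return the media types that survived negotiation."""
    return [media_type for media_type, port in answer if port != 0]


# The caller offers audio and video; the called UA supports only audio.
offer = [("audio", 49170), ("video", 51372)]
answer = answer_offer(offer, {"audio": 40000})
print(answer)                # [('audio', 40000), ('video', 0)]
print(active_media(answer))  # ['audio']
```

No service token is consulted at any point in this sketch; the resulting session is determined by the media lines alone.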
646 As another example, if one agent initiates a Push-To-Talk session 647 (which is audio with a companion floor control mechanism), and the 648 other side supports only regular audio, SIP would be able to negotiate 649 back down to a regular voice call. As another example, if a calling 650 user agent is running a high-definition video conferencing endpoint, 651 and the called user agent supports just a regular video endpoint, the 652 codecs themselves can negotiate downward to a lower rate, picture 653 size, and so on. Thus, interoperability is achieved. Interestingly, 654 the final "service" may no longer be well characterized by the 655 service identifier that would have been placed in the original 656 INVITE. For example, in this case, if the original INVITE from the 657 caller had contained the service identifier "hi-fi video", but the 658 video gets negotiated down to a lower rate and picture size, the 659 service identifier is no longer really appropriate. 661 This illustrates another key aspect of the interoperability problem. 662 Usage of explicit service identifiers in the request will result in 663 inconsistencies between those service identifiers and the results of 664 any SIP negotiation that might otherwise be applied in the session. 666 When a service identifier becomes something that both proxies and the 667 user agent need to understand in order to properly treat a request, 668 it becomes equivalent to including a token in the Proxy-Require and 669 Require header fields of every single SIP request. The very reason 670 that [RFC4485] frowns upon usage of Require and certainly Proxy- 671 Require is the huge impact on interoperability it causes. It is for 672 this same reason that explicit service identifiers need to be 673 avoided. 675 6.3.
Stifling of Service Innovation 677 The probability that any pair of service providers ends up with 678 the same set of services, and gives them the same names, becomes 679 vanishingly small as the number of providers grows. Indeed, it would 680 almost certainly require a centralized authority to identify what the 681 services are, how they work, and what they are named. This, in turn, 682 leads to a requirement for complete homogeneity in order to 683 facilitate interconnection. Two providers cannot usefully 684 interconnect unless they agree on the set of services they are 685 offering to their customers, and each does the same thing. This is 686 because each provider has become dependent on inclusion of the proper 687 service identifier in the request, in order for the overall treatment 688 of the request to proceed correctly. This is, in a very real sense, 689 anathema to the entire notion of SIP, which is built on the idea that 690 heterogeneous domains can interconnect and still get 691 interoperability. 693 Explicit service identifiers lead to a requirement for homogeneity in 694 service definitions across providers that interconnect, ruining the 695 very service heterogeneity that SIP was meant to bring. 697 Indeed, Metcalfe's law says that the value of a network grows with 698 the square of the number of participants. As a consequence of this, 699 once a bunch of large domains did get together, agree on a set of 700 services, and then a set of well-known identifiers for those 701 services, it would force other providers to also deploy the same 702 services, in order to obtain the value that interconnection brings. 703 This, in turn, will stifle innovation, and quickly force the set of 704 services in SIP to become fixed and never expand beyond the ones 705 initially agreed upon.
This, too, is anathema to the very framework 706 on which SIP is built, and defeats much of the purpose of why 707 providers have chosen to deploy SIP in their own networks. 709 Consider the following example. Several providers get together, and 710 standardize on a bunch of service identifiers. One of these uses 711 audio and video (say, "multimedia conversation"). This service is 712 successful, and is widely utilized. Endpoints look for this 713 identifier to dispatch calls to the right software applications, and 714 the network looks for it to invoke features, perform accounting, and 715 QoS. A new provider gets the idea for a new service, say, avatar- 716 enhanced multimedia conversation. In this service, there is audio 717 and video, but there is a third stream, which renders an avatar. A 718 caller can press buttons on their phone, to cause the avatar on the 719 other person's device to show emotion, make noise, and so on. This 720 is similar to the way emoticons are used today in IM. This service 721 is enabled by adding a third media stream (and consequently, third 722 m-line) to the SDP. 724 Normally, this service would be backwards compatible with a regular 725 audio-video endpoint, which would just reject the third media stream. 726 However, because a large network has been deployed that is expecting 727 to see the token, "multimedia conversation" and its associated audio+ 728 video service, it is nearly impossible for the new provider to roll 729 out this new service. If they did, it would fail completely, or 730 partially fail, when their users call users in other provider 731 domains. 733 7. Recommendations 735 From these principles, several recommendations can be made: 737 o Systems needing to perform service identification must examine 738 existing signaling constructs to identify the service based on 739 fields that exist within the signaling message already.
741 o If it appears that the signaling currently defined in standards is 742 not sufficient to identify the service, it may be due to lack of 743 sufficient signaling to convey what is needed, and new standards 744 work should be undertaken to fill this gap. 746 o The usage of an explicit service identifier does make sense as a 747 way to cache a decision made by a network element, for usage by 748 another network element within the same domain. However, service 749 identifiers are fundamentally useful only within a particular domain, 750 and any such header must be stripped at a network boundary. 752 o Device dispatch should be done following the principles of 753 [RFC3841], using implicit preferences based on the signaling. For 754 example, [I-D.rosenberg-sip-app-media-tag] defines a new UA 755 capability that can be used to dispatch requests based on 756 different types of application media streams. 758 o Presence can help a great deal with service identification. When 759 a user wishes to contact another user, and knows only the AOR for 760 the target (which is usually the case), the user can fetch the 761 presence document for the target. That document, in turn, can 762 contain numerous service URIs for contacting the target with 763 different services. The usage of different URIs for contacting 764 different services makes it very easy to identify the service - 765 it's the actual target of the request itself. When possible, this 766 is the best solution to the problem. 768 8. Security Considerations 770 Oftentimes, the service associated with a request is utilized for 771 purposes such as authorization, accounting, and billing. When 772 service identification is not done properly, the possibility of 773 unauthorized service use and network fraud is introduced. It is for 774 this reason, discussed extensively in Section 6.1, that the usage of 775 explicit service identifiers inserted by a UA is not recommended. 777 9.
IANA Considerations 779 There are no IANA considerations associated with this specification. 781 10. Acknowledgements 783 This document is based on discussions with Paul Kyzivat and Andrew 784 Allen, who contributed significantly to the ideas here. Much of the 785 content in this draft is a result of discussions amongst participants 786 in the SIPPING mailing list, including Dean Willis, Tom Taylor, Eric 787 Burger, Dale Worley, Christer Holmberg, and John Elwell, amongst many 788 others. Thanks to Spencer Dawkins, Tolga Asveren, Mahesh Anjanappa 789 and Claudio Allochio for reviews of this document. 791 11. Informational References 793 [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, 794 A., Peterson, J., Sparks, R., Handley, M., and E. 795 Schooler, "SIP: Session Initiation Protocol", RFC 3261, 796 June 2002. 798 [RFC4479] Rosenberg, J., "A Data Model for Presence", RFC 4479, 799 July 2006. 801 [RFC4485] Rosenberg, J. and H. Schulzrinne, "Guidelines for Authors 802 of Extensions to the Session Initiation Protocol (SIP)", 803 RFC 4485, May 2006. 805 [RFC4975] Campbell, B., Mahy, R., and C. Jennings, "The Message 806 Session Relay Protocol (MSRP)", RFC 4975, September 2007. 808 [RFC5031] Schulzrinne, H., "A Uniform Resource Name (URN) for 809 Emergency and Other Well-Known Services", RFC 5031, 810 January 2008. 812 [I-D.ietf-ecrit-framework] 813 Rosen, B., Schulzrinne, H., Polk, J., and A. Newton, 814 "Framework for Emergency Calling using Internet 815 Multimedia", draft-ietf-ecrit-framework-04 (work in 816 progress), November 2007. 818 [I-D.rosenberg-sip-app-media-tag] 819 Rosenberg, J., "A Session Initiation Protocol (SIP) Media 820 Feature Tag for MIME Application Sub-Types", 821 draft-rosenberg-sip-app-media-tag-02 (work in progress), 822 November 2007. 824 [RFC3428] Campbell, B., Rosenberg, J., Schulzrinne, H., Huitema, C., 825 and D. Gurle, "Session Initiation Protocol (SIP) Extension 826 for Instant Messaging", RFC 3428, December 2002. 
828 [RFC3841] Rosenberg, J., Schulzrinne, H., and P. Kyzivat, "Caller 829 Preferences for the Session Initiation Protocol (SIP)", 830 RFC 3841, August 2004. 832 Author's Address 834 Jonathan Rosenberg 835 Cisco 836 Edison, NJ 837 US 839 Email: jdrosen@cisco.com 840 URI: http://www.jdrosen.net 842 Full Copyright Statement 844 Copyright (C) The IETF Trust (2008). 846 This document is subject to the rights, licenses and restrictions 847 contained in BCP 78, and except as set forth therein, the authors 848 retain all their rights. 850 This document and the information contained herein are provided on an 851 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 852 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 853 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS 854 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 855 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 856 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 858 Intellectual Property 860 The IETF takes no position regarding the validity or scope of any 861 Intellectual Property Rights or other rights that might be claimed to 862 pertain to the implementation or use of the technology described in 863 this document or the extent to which any license under such rights 864 might or might not be available; nor does it represent that it has 865 made any independent effort to identify any such rights. Information 866 on the procedures with respect to rights in RFC documents can be 867 found in BCP 78 and BCP 79. 869 Copies of IPR disclosures made to the IETF Secretariat and any 870 assurances of licenses to be made available, or the result of an 871 attempt made to obtain a general license or permission for the use of 872 such proprietary rights by implementers or users of this 873 specification can be obtained from the IETF on-line IPR repository at 874 http://www.ietf.org/ipr. 
876 The IETF invites any interested party to bring to its attention any 877 copyrights, patents or patent applications, or other proprietary 878 rights that may cover technology that may be required to implement 879 this standard. Please address the information to the IETF at 880 ietf-ipr@ietf.org. 882 Acknowledgment 884 Funding for the RFC Editor function is provided by the IETF 885 Administrative Support Activity (IASA).