idnits 2.17.1 

draft-ietf-sipping-service-identification-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 17.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 860.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 871.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 878.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 884.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 30 instances of too long lines in the document, the longest
     one being 3 characters in excess of 72.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  == The document seems to use 'NOT RECOMMENDED' as an RFC 2119 keyword, but
     does not include the phrase in its RFC 2119 key words list.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (August 1, 2007) is 6110 days in the past.  Is this
     intentional?


  Checking references for intended status: Best Current Practice
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: '14' is mentioned on line 199, but not defined

  == Missing Reference: '15' is mentioned on line 201, but not defined

  == Outdated reference: A later version (-13) exists of
     draft-ietf-ecrit-framework-01

  == Outdated reference: A later version (-07) exists of
     draft-ietf-ecrit-service-urn-06


     Summary: 2 errors (**), 0 flaws (~~), 6 warnings (==), 7 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	SIPPING                                                     J. Rosenberg
3	Internet-Draft                                                     Cisco
4	Intended status: Best Current                             August 1, 2007
5	Practice
6	Expires: February 2, 2008

8	  Identification of Communications Services in the Session Initiation
9	                             Protocol (SIP)
10	              draft-ietf-sipping-service-identification-00

12	Status of this Memo

14	   By submitting this Internet-Draft, each author represents that any
15	   applicable patent or other IPR claims of which he or she is aware
16	   have been or will be disclosed, and any of which he or she becomes
17	   aware will be disclosed, in accordance with Section 6 of BCP 79.

19	   Internet-Drafts are working documents of the Internet Engineering
20	   Task Force (IETF), its areas, and its working groups.  Note that
21	   other groups may also distribute working documents as Internet-
22	   Drafts.

24	   Internet-Drafts are draft documents valid for a maximum of six months
25	   and may be updated, replaced, or obsoleted by other documents at any
26	   time.  It is inappropriate to use Internet-Drafts as reference
27	   material or to cite them other than as "work in progress."

29	   The list of current Internet-Drafts can be accessed at
30	   http://www.ietf.org/ietf/1id-abstracts.txt.

32	   The list of Internet-Draft Shadow Directories can be accessed at
33	   http://www.ietf.org/shadow.html.

35	   This Internet-Draft will expire on February 2, 2008.

37	Copyright Notice

39	   Copyright (C) The IETF Trust (2007).

41	Abstract

43	   This document considers the problem of service identification in the
44	   Session Initiation Protocol (SIP).  Service identification is the
45	   process of determining the user-level use case that is driving the
46	   signaling being utilized by the user agent.  While seemingly simple,
47	   this process is quite complex, and when not addressed properly, can
48	   lead to fraud, interoperability problems, and stifling of innovation.

50	   This document discusses these problems and makes recommendations on
51	   how to address them.

53	Table of Contents

55	   1.  Introduction
56	   2.  Terminology
57	   3.  Services and Service Identification
58	   4.  Example Services
59	     4.1.  IPTV vs. Multimedia
60	     4.2.  Gaming vs. Voice Chat
61	     4.3.  Configuration vs. Pager Messaging
62	   5.  Using Service Identification
63	     5.1.  Application Invocation in the User Agent
64	     5.2.  Application Invocation in the Network
65	     5.3.  Network Quality of Service Authorization
66	     5.4.  Service Authorization
67	     5.5.  Accounting and Billing
68	     5.6.  Negotiation of Service
69	     5.7.  Dispatch to Devices
70	   6.  Key Principles of Service Identification
71	     6.1.  Services are a By-Product of Signaling
72	     6.2.  Perils of Explicit Identifiers
73	       6.2.1.  Fraud
74	       6.2.2.  Systematic Interoperability Failures
75	       6.2.3.  Stifling of Service Innovation
76	   7.  Recommendations
77	   8.  Security Considerations
78	   9.  IANA Considerations
79	   10. Acknowledgements
80	   11. References
81	     11.1. Normative References
82	     11.2. Informational References
83	   Author's Address
84	   Intellectual Property and Copyright Statements

86	1.  Introduction

88	   The Session Initiation Protocol (SIP) [2] defines mechanisms for
89	   initiating and managing communications sessions between agents.  SIP
90	   allows for a broad array of session types between agents.  It can
91	   manage audio sessions, ranging from low bitrate voice-only up to
92	   multi-channel high-fidelity music.  It can manage video sessions,
93	   ranging from small, "talking-head" style video chat, up to high
94	   definition multipoint video conferencing, to low bandwidth user-
95	   generated content, up to high definition movie and TV content.  SIP
96	   endpoints can be anything - adaptors that convert an old analog
97	   telephone to Voice over IP (VoIP), dedicated hardphones, fancy
98	   hardphones with rich displays and user entry capabilities, softphones
99	   on a PC, buddylist and presence applications on a PC, dedicated
100	   videoconferencing peripherals, and speakerphones.

102	   This breadth of applicability is SIP's greatest asset, but it also
103	   introduces numerous challenges.  One of these is that, when an
104	   endpoint generates a SIP INVITE for a session, or receives one, that
105	   session can potentially be within the context of any number of
106	   different use cases and endpoint types.  For example, a SIP INVITE
107	   with a single audio stream could represent a Push-To-Talk session
108	   between mobile devices, a VoIP session between softphones, or audio-
109	   based access to stored content on a server.

111	   These differing use cases have driven implementors and system
112	   designers to seek techniques for service identification.  Service
113	   identification is the process of determining and/or signaling the
114	   specific use case that is driving the signaling being generated by a
115	   user agent.  At first glance, this seems harmless and easy enough.
116	   It is tempting to define a new header, "Service-ID", for example, and
117	   have a user agent populate it with any number of well-known tokens
118	   which define what the service is.  This information could then be
119	   consumed for any number of purposes.

121	   However, as this document will demonstrate, service identification is
122	   a very complex and difficult process, and can very easily lead to
123	   fraud, systemic interoperability failures, and a complete stifling of
124	   the innovation that SIP was meant to achieve.

126	   Section 3 begins by defining a service and the service identification
127	   problem.  Section 4 gives some concrete examples of services and why
128	   they can be challenging to identify.  Section 5 explores the ways in
129	   which a service identification can be utilized within a network.
130	   Next, Section 6 discusses the key architectural principles of service
131	   identification, and how explicit service identifiers can lead to
132	   fraud, interoperability failures, and stifling of service innovation.

134	2.  Terminology

136	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
137	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
138	   document are to be interpreted as described in RFC 2119 [1].

140	3.  Services and Service Identification

142	   The problem of identifying services within SIP is not a new one.  The
143	   problem has been considered extensively in the context of presence.
144	   In particular, the presence data model for SIP [3] defines the
145	   concept of a service as one of the core notions that presence
146	   describes.  Services are described in Section 3.3 of RFC 4479, which
147	   has this to say on the topic:

149	   3.3.  Service

151	      Each presentity has access to a number of services.  Each of these
152	      represents a point of reachability for communications that can be
153	      used to interact with the user.  Examples of services are telephony
154	      (that is, traditional circuit-based telephone service), push-to-talk,
155	      instant messaging, Short Message Service (SMS), and Multimedia
156	      Message Service (MMS).

158	      It is difficult to give a precise definition for service.  One
159	      reasonable approach is to model each software or hardware agent in
160	      the system as a service.  If a user starts a softphone application on
161	      their PC, then that represents a service.  If a user has a videophone
162	      device, then that represents another service.  This is effectively a
163	      physical view of services.  This definition, however, starts to fall
164	      apart when a service is spread across multiple software agents or
165	      devices.  For example, a SIP URI representing an address-of-record
166	      can be routed to a softphone or a videophone, or both.  In that case,
167	      one might attempt instead to define a service based on its address on
168	      the network.  This definition also falls apart when modeling devices
169	      or applications that receive calls and dispatch them to different
170	      "helpers" based on potentially complex logic.  For example, a
171	      cellular telephone might house multiple SIP applications, each of
172	      which can "register" different handlers based on the method or even
173	      body type of the request.  Each of those applications or handlers can
174	      rightfully be considered a service, but it doesn't have an address on
175	      the network distinct from the others.

177	      Because of this inherent difficulty in precisely defining a service,
178	      the data model doesn't try to constrain what can be considered a
179	      service.  Rather, anything can be considered a service so long as it
180	      exhibits a set of key properties defined by this model.  In
181	      particular, each service is associated with characteristics that
182	      identify the nature and capabilities of that service, with reach
183	      information that indicates how to connect to the service, with status
184	      information representing the state of that service, and relative
185	      information that describes the ways in which that service relates to
186	      others associated with the presentity.

188	      As a consequence, in this model, services are not explicitly
189	      enumerated.  There is no central registry where one finds identifiers
190	      for each service.  Consequently, each service does not have a single
191	      "service" attribute with values such as "ptt" or "telephony".  That
192	      doesn't mean that these consolidated monikers aren't useful; indeed,
193	      they represent an essential summary of what the service is.  Such
194	      summarization is useful in creating icons that allow a user to choose
195	      one service over another.  A watcher is free to create such
196	      summarization information from any of the information associated with
197	      a service.  The reach information often provides valuable information
198	      for creating such a summarization.  Oftentimes, the scheme of the URI
199	      is synonymous with the view of what a service is.  An "sms" URI [14]
200	      clearly indicates SMS, for example.  For some URIs, there may be many
201	      services available, for example, SIP or tel [15], in which case the
202	      scheme is less meaningful as a way of creating a summary.  The reach
203	      information could also indicate that certain application software has
204	      to be invoked (such as a videogame), in which case that aspect of the
205	      reach information would be useful for generating an iconic
206	      representation of the game.

208	   Essentially, the service is the user-visible use case that is driving
209	   the behavior of the user-agents and servers in the SIP network.
210	   Being user-visible means that there is a difference in user
211	   experience between two services that are different.  That user
212	   experience can be part of the call, or outside of the call.  Within a
213	   call, the user experience can be based on different media types (an
214	   audio call vs. a video chat), different content within a particular
215	   media type (stored content, such as a movie or TV session), different
216	   devices (a wireless device for "telephony" vs. a PC application for
217	   "voice-chat"), different user interfaces (a buddy list view of voice
218	   on a PC application vs. a software emulation of a hard phone),
219	   different communities that can be accessed (voice chat with other
220	   users that have the same voice chat client, vs. voice communications
221	   with any endpoint on the PSTN), or different applications that are
222	   invoked by the user (manually selecting a push-to-talk application
223	   from a wireless phone vs. a telephony application).  Outside of a
224	   call, the difference in user experience can be a billing one (cheaper
225	   for one service than another), a notification feature for one and
226	   not another (for example, an IM that gets sent whenever a user makes
227	   a call), and so on.

229	   In some cases, there is very little difference in the underlying
230	   technology that will support two different services, and in other
231	   cases, there are big differences.  However, for purposes of this
232	   discussion, the key definition is that two services are distinct when
233	   there is a perceived difference by the user in the two services.

235	   This leads naturally to the desire to perform service identification.
236	   Service identification is defined as the process of (1) determining
237	   the underlying service which is driving a particular signaling
238	   exchange, (2) associating that service with some kind of moniker, and
239	   (3) attaching that moniker to a signaling message (typically a SIP
240	   INVITE), and then utilizing it for various purposes within the
241	   network.  Service identification can be done in the endpoints, in
242	   which case the UA would insert the moniker directly into the
243	   signaling message based on its awareness of the service.  Or, it can
244	   be done within a proxy in the network, based on inspection of the SIP
245	   message, or based on hints placed into the message by the user.

247	4.  Example Services

249	   It is very useful to consider several example services, especially
250	   ones that appear difficult to differentiate from each other.

252	4.1.  IPTV vs. Multimedia

254	   IP Television (IPTV) is the usage of IP networks to access
255	   traditional television content, such as movies and shows.  SIP can be
256	   utilized to establish a session to a media server in a network, which
257	   then serves up multimedia content and streams it as an audio and
258	   video stream towards the client.  Whether SIP is ideal for IPTV is,
259	   in itself, a good question.  However, such a discussion is outside
260	   the scope of this document.

262	   Consider multimedia conferencing.  The user accesses a voice and
263	   video conference at a conference server.  The user might join in
264	   listen-only mode, in which case the user receives audio and video
265	   streams, but does not send.

267	   These two services, IPTV and multimedia conferencing, clearly appear
268	   as different services.  They have different user experiences and
269	   applications.  A user is unlikely to ever be confused about whether a
270	   session is IPTV or multimedia conferencing.  Indeed, they are likely
271	   to have different software applications or endpoints for the two
272	   services.

274	   However, these two services look remarkably alike based on the
275	   signaling.  Both utilize audio and video.  Both could utilize the
276	   same codecs.  Both are unidirectional streams (from a server in the
277	   network to the client).  Thus, it would appear on the surface that
278	   there is no way to differentiate them, based on inspection of the
279	   signaling alone.
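
   As a rough illustration of why inspection of the media description
   alone does not separate these two services, the following Python
   sketch compares the two sessions on the media features discussed
   above.  The MediaStream type and the codec names are assumptions
   made for this example; they are not taken from any SIP or SDP
   specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaStream:
    kind: str        # "audio" or "video"
    codec: str       # codec name, chosen only for illustration
    direction: str   # as seen by the client, e.g. "recvonly"


# Hypothetical media descriptions for the two sessions.
iptv_session = frozenset({
    MediaStream("audio", "AAC", "recvonly"),
    MediaStream("video", "H264", "recvonly"),
})
listen_only_conference = frozenset({
    MediaStream("audio", "AAC", "recvonly"),
    MediaStream("video", "H264", "recvonly"),
})


def same_media_signature(a, b):
    """Return True when two sessions describe the same media streams."""
    return a == b


print(same_media_signature(iptv_session, listen_only_conference))
```

   Under these assumptions the two sessions compare equal, so any
   distinction between them has to come from some other part of the
   signaling or from its context.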

281	4.2.  Gaming vs. Voice Chat

283	   Consider an interactive game, played between two users from their
284	   mobile devices.  The game involves the users sending each other game
285	   moves, using a messaging channel, in addition to voice.  In another
286	   service, users have a voice and IM chat conversation using a buddy
287	   list application on their PC.

289	   In both services, there are two media streams - audio and messaging.
290	   The audio uses the same codecs.  Both use the Message Session Relay
291	   Protocol (MSRP) [5].  In both cases, the caller would send an INVITE
292	   to the AOR of the target user.  However, these represent fairly
293	   different services, in terms of user experience.

295	4.3.  Configuration vs. Pager Messaging

297	   The SIP MESSAGE method [8] provides a way to send one-shot messages
298	   to a particular AOR.  This specification is primarily aimed at Short
299	   Message Service (SMS) style messaging, commonly found in wireless
300	   phones.  Receipt of a MESSAGE request would cause the messaging
301	   application on a phone to launch, allowing the user to browse message
302	   history and respond.

304	   However, MESSAGE is sometimes used for the delivery of content to a
305	   device for other purposes.  For example, some providers use it to
306	   deliver configuration updates, such as new phone settings or
307	   parameters, or to indicate that a new version of firmware is
308	   available.  Though not designed for this purpose, MESSAGE gets used
309	   since, in existing wireless networks, SMS is used for this purpose,
310	   and MESSAGE is the SIP equivalent of SMS.

312	   Consequently, the MESSAGE request sent to a phone can be for two
313	   different services.  One would require invocation of a messaging app,
314	   whereas the other would be consumed by the software in the phone,
315	   without any user interaction at all.

317	5.  Using Service Identification

319	   It is important to understand what the service identity would be
320	   utilized for, if known.  The discussions in Section 4 give some hints
321	   to the possible usages.  Here, we explicitly discuss them.

323	5.1.  Application Invocation in the User Agent

325	   In some of the examples above, there were multiple software
326	   applications running within a single user agent.  When an incoming
327	   INVITE or MESSAGE arrives, it must be delivered to the appropriate
328	   application software.  When each service is bound to a distinct
329	   software application, it would seem that the service identity is
330	   needed to dispatch the message to the appropriate piece of software.
331	   This is shown in Figure 2.

333	                            +---------------------------------+
334	                            |                                 |
335	                            | +-------------+ +-------------+ |
336	                            | |     UI      | |     UI      | |
337	                            | +-------------+ +-------------+ |
338	                            | +-------------+ +-------------+ |
339	                            | |             | |             | |
340	                            | |  Service 1  | |  Service 2  | |
341	                            | |             | |             | |
342	                            | +-------------+ +-------------+ |
343	                            | +-----------------------------+ |
344	                            | |                             | |
345	                            | |             SIP             | |
346	                            | |            Layer            | |
347	                            | |                             | |
348	                            | +-----------------------------+ |
349	                            |                                 |
350	                            +---------------------------------+

352	                                      Physical Device

354	                                 Figure 2

356	   The role of the SIP layer is to parse incoming messages, handle the
357	   SIP state machinery for transactions and dialogs, and then dispatch
358	   requests to the appropriate service.  For the example services in
359	   Section 4.2, an incoming INVITE for the gaming service would be
360	   delivered to the gaming application software.  An incoming INVITE for
361	   the voice chat service would be delivered to the voice chat
362	   application software.  For the examples in Section 4.3, a MESSAGE
363	   request for user to user messaging would be delivered to the
364	   messaging or SMS app, and a MESSAGE request containing configuration
365	   data would be delivered to a configuration update application.
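
   The dispatch role described above can be sketched as follows.  This
   is a minimal illustration, not a description of any real SIP stack:
   the SipLayer class, the dictionary representation of a request, and
   the identify_by_body function (which guesses the service of a
   MESSAGE from its content type) are all hypothetical.

```python
from typing import Callable, Dict

Request = Dict[str, str]
Handler = Callable[[Request], str]


class SipLayer:
    """Toy SIP layer: identifies a service, then hands off the request."""

    def __init__(self, identify: Callable[[Request], str]) -> None:
        self._identify = identify
        self._handlers: Dict[str, Handler] = {}

    def register(self, service: str, handler: Handler) -> None:
        self._handlers[service] = handler

    def dispatch(self, request: Request) -> str:
        service = self._identify(request)
        handler = self._handlers.get(service)
        if handler is None:
            return "no application registered for " + service
        return handler(request)


def identify_by_body(request: Request) -> str:
    # Assumption for this sketch: user-to-user messages carry plain
    # text and configuration updates carry XML.
    if request.get("method") != "MESSAGE":
        return "unknown"
    if request.get("content_type") == "text/plain":
        return "messaging"
    if request.get("content_type") == "application/xml":
        return "configuration"
    return "unknown"


layer = SipLayer(identify_by_body)
layer.register("messaging", lambda req: "messaging app")
layer.register("configuration", lambda req: "configuration updater")

print(layer.dispatch({"method": "MESSAGE", "content_type": "text/plain"}))
print(layer.dispatch({"method": "MESSAGE", "content_type": "application/xml"}))
```

   The identification step is passed in as a function so that the
   sketch takes no position on how the service is determined; only the
   hand-off to the per-service application is shown.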

367	5.2.  Application Invocation in the Network

369	   Another usage of a service identifier would be to cause servers in
370	   the SIP network to provide additional processing, based on the
371	   service.  For example, an INVITE issued by a user agent for IPTV
372	   would pass through a server that does some kind of content rights
373	   management, authorizing whether the user is allowed to access that
374	   content.  On the other hand, an INVITE issued by a user for
375	   multimedia conferencing would pass through a server providing
376	   "traditional" telephony features, such as outbound call screening and
377	   call recording.  It would make no sense for the INVITE associated
378	   with IPTV to have outbound call screening and call recording applied,
379	   and it would make no sense for the multimedia conferencing INVITE to
380	   be processed by the content rights management server.  Indeed, in
381	   these cases, it's not just an efficiency issue (invoking servers
382	   when not needed), but rather, truly incorrect behavior can occur.
383	   For example, if an outbound call screening application is set to
384	   block outbound calls to everything except for the phone numbers of
385	   friends and family, an IPTV request that gets processed by such a
386	   server would be blocked (as it's not targeted to the AOR of a friend
387	   or family member).  This would block a user's attempt to access IPTV
388	   services, when that was not the goal at all.

390	   Similarly, a MESSAGE request from Section 4.3 might need to pass
391	   through a message server for filtering when it is associated with
392	   chat, but not when it is associated with config.  Consider a filter
393	   which gets applied to MESSAGE requests, and that filter runs in a
394	   server in the network.  The filter operation prevents user Joe from
395	   sending messages to user Bob that contain the words "stock" or
396	   "purchase", due to some regulations that disallow Joe and Bob from
397	   discussing stock trading.  However, a MESSAGE for configuration
398	   purposes might contain an XML document that uses the token "stock" as
399	   some kind of attribute.  This configuration update would be discarded
400	   by the filtering server, when it should not have been.
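
   The filtering problem can be made concrete with a short Python
   sketch.  The message bodies, the service labels "chat" and
   "config", and both function names are invented for illustration;
   the blocked words are the ones used in the example above.

```python
BLOCKED_WORDS = ("stock", "purchase")


def keyword_filter(body: str) -> bool:
    """Return True when the body contains none of the blocked words."""
    lowered = body.lower()
    return not any(word in lowered for word in BLOCKED_WORDS)


def deliver(service: str, body: str) -> bool:
    """Apply the keyword filter only to chat traffic."""
    if service == "chat":
        return keyword_filter(body)
    return True


chat_body = "Should we purchase more stock today?"
config_body = '<settings><ringtone source="stock"/></settings>'

# Applied blindly, the filter rejects both bodies.
print(keyword_filter(chat_body), keyword_filter(config_body))

# Applied per service, only the chat message is rejected.
print(deliver("chat", chat_body), deliver("config", config_body))
```

   The second function behaves correctly only if the server already
   knows which service a given MESSAGE belongs to, which is exactly
   the information this usage depends on.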

402	5.3.  Network Quality of Service Authorization

404	   The IP network can provide differing levels of Quality of Service
405	   (QoS) to IP packets.  This service can include guaranteed throughput,
406	   latency, or loss characteristics.  Typically, the user agent will
407	   make some kind of QoS request, either using explicit signaling
408	   protocols (such as RSVP) or through marking of the Diffserv value in
409	   packets.  The network will need to make a policy decision based on
410	   whether these QoS treatments are authorized or not.  One common
411	   authorization policy is to check if the user has invoked a service
412	   using SIP that they are authorized to invoke, and that this service
413	   requires the level of QoS treatment the user has requested.

415	   For example, consider IPTV and multimedia conferencing as described
416	   in Section 4.1.  IPTV is a non-real time service.  Consequently,
417	   media traffic for IPTV would be authorized for bandwidth guarantees,
418	   but not for latency or loss guarantees.  On the other hand,
419	   multimedia conferencing is real time.  Its traffic would require
420	   bandwidth, loss and latency guarantees from the network.

422	   Consequently, if a user should make an RSVP reservation for a media
423	   stream, and ask for latency guarantees for that stream, the network
424	   would like to be able to authorize it if the service was multimedia
425	   conferencing, but not if it was IPTV.  This would require the server
426	   performing the QoS authorization to know the service associated with
427	   the INVITE that set up the session.
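
   One way to picture this authorization check is as a lookup from
   service to the set of guarantees that service may receive.  The
   sketch below is an assumption-laden illustration: the service
   labels, the guarantee names, and the authorize_qos function are
   hypothetical, and the table simply restates the IPTV and
   conferencing example above.

```python
AUTHORIZED_GUARANTEES = {
    "iptv": {"bandwidth"},
    "multimedia-conferencing": {"bandwidth", "latency", "loss"},
}


def authorize_qos(service: str, requested: set) -> bool:
    """Allow a request only if every guarantee is permitted."""
    allowed = AUTHORIZED_GUARANTEES.get(service, set())
    return requested <= allowed


print(authorize_qos("iptv", {"bandwidth", "latency"}))
print(authorize_qos("multimedia-conferencing", {"bandwidth", "latency"}))
```

   A service that is absent from the table is granted nothing, which
   is a conservative default chosen for the sketch rather than a
   behavior described in this document.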

429	5.4.  Service Authorization

431	   Frequently, a network administrator will want to authorize whether a
432	   user is allowed to invoke a particular service.  Not all users will
433	   be authorized to use all services that are provided.  For example, a
434	   user may not be authorized to access IPTV services, whereas they are
435	   authorized to utilize multimedia conferencing.  A user might not be
436	   able to utilize a multiplayer gaming service, whereas they are
437	   authorized to utilize voice chat services.

439	   Consequently, when an INVITE arrives at a proxy in the network, the
440	   proxy will need to determine what the requested service is, so that
441	   the proxy can make an authorization decision.

443	5.5.  Accounting and Billing

445	   Service authorization and accounting/billing go hand in hand.
446	   Presumably, one of the primary reasons for authorizing that a user
447	   can utilize a service is that they are being billed differently based
448	   on the type of service.  Consequently, one of the goals of a service
449	   identity is to be able to include it in accounting records, so that
450	   the appropriate billing model can be applied.

452	   For example, in the case of IPTV, a service provider can bill based
453	   on the content (US $5 per movie, perhaps), whereas for multimedia
454	   conferencing, they can bill by the minute.  This requires the
455	   accounting streams to indicate which service was invoked for the
456	   particular session.
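
   The dependence of billing on the service can be sketched in a few
   lines of Python.  The US $5 per movie figure comes from the example
   above; the per-minute rate, the record layout, and the charge
   function are assumptions made only for illustration.

```python
from decimal import Decimal

PRICE_PER_MOVIE = Decimal("5.00")    # from the example in the text
PRICE_PER_MINUTE = Decimal("0.10")   # assumed rate, illustration only


def charge(record: dict) -> Decimal:
    """Compute the charge for one accounting record."""
    service = record["service"]
    if service == "iptv":
        return PRICE_PER_MOVIE * record["movies"]
    if service == "multimedia-conferencing":
        return PRICE_PER_MINUTE * record["minutes"]
    raise ValueError("no billing model for service: " + service)


records = [
    {"service": "iptv", "movies": 2},
    {"service": "multimedia-conferencing", "minutes": 30},
]
for record in records:
    print(record["service"], charge(record))
```

   Without the "service" field in the record, the function has no way
   to choose between the two billing models.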

458	5.6.  Negotiation of Service

460	   In some cases, when the caller initiates a session, they don't
461	   actually know which service will be utilized.  Rather, they might
462	   like to offer up all of the services they have available to the
463	   called party, and then let the called party decide, or let the system
464	   make a decision based on overlapping service capabilities.

466	   As an example, a user can do both the game and the voice chat service
467	   of Section 4.2.  They initiate a session to a target AOR, but the
468	   devices used by that user can only support voice chat.  Consequently,
469	   voice chat gets utilized for the session.
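
   The selection in this example amounts to picking the first offered
   service that the called party also supports.  The following sketch
   assumes that the caller lists services in order of preference; the
   negotiate function and the service labels are hypothetical.

```python
from typing import Iterable, Optional, Set


def negotiate(offered: Iterable[str], supported: Set[str]) -> Optional[str]:
    """Return the first offered service the callee supports, if any."""
    for service in offered:
        if service in supported:
            return service
    return None


caller_offer = ["game", "voice-chat"]   # caller's preference order
callee_supports = {"voice-chat"}

print(negotiate(caller_offer, callee_supports))
```

   When there is no overlap the function returns None, leaving the
   handling of a failed negotiation to the caller of the function.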

471	5.7.  Dispatch to Devices

473	   When a user has multiple devices, each with varying capabilities in
474	   terms of service, it is useful to dispatch an incoming request to the
475	   right device based on whether the device can support the service that
476	   has been requested.

478	   For example, if a user initiates a gaming session with voice chat,
479	   and the target user has two devices - one that can support the gaming
480	   service, and the other that cannot, the INVITE should be dispatched
481	   to the device which supports the gaming session.
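
   A sketch of this dispatch decision follows.  It assumes that the
   network already holds a table of which services each registered
   device supports; the device names, the service labels, and the
   select_devices function are invented for the example.

```python
from typing import Dict, List, Set


def select_devices(devices: Dict[str, Set[str]], service: str) -> List[str]:
    """Return the devices that support the requested service."""
    return [name for name, services in devices.items()
            if service in services]


registered = {
    "handset": {"game", "voice-chat"},
    "desk-phone": {"voice-chat"},
}

print(select_devices(registered, "game"))
print(select_devices(registered, "voice-chat"))
```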

483	6.  Key Principles of Service Identification

485	   In this section, we describe some of the key principles of performing
486	   service identification.

488	6.1.  Services are a By-Product of Signaling

490	   Almost always, the first solution that people consider is to add some
491	   kind of field to the signaling messages which indicates what the
492	   service is.  This field would then be inserted by the user agent, and
493	   then can be used by the proxies and other user agents as a service
494	   identifier.

496	   This approach, however, misses a key point, which cannot be stressed
497	   enough, and which represents the core architectural principle to be
498	   understood here:

500	      A service is the by-product of the signaling and the context
501	      around it (the user profile, time-of-day and so on) - the effects
502	      of the signaling message once launched into the network.  The
503	      service identity is therefore always derivable from the signaling
504	      and its context without additional identifiers.

506	   When a user sends an INVITE request to the network, and targets that
507	   request at an IPTV server, and includes SDP for audio and video
508	   streaming, the *result* of sending such an INVITE is that an IPTV
509	   session occurs.  The entire purpose of the INVITE is to establish
510	   such a session, and therefore, invoke the service.  Thus, a service
511	   is not something that is different from the rest of the signaling
512	   message.  A service is what the user gets after the network and other
513	   user agents have processed a signaling message.

515	   This principle leads to another important conclusion:

517	      If two services are different, but their signaling appears to be
518	      the same, it is because there is in fact something different that
519	      has been overlooked, or something has been implied from the
520	      signaling which should have been signaled explicitly.

522	   This makes sense; if a service is the byproduct of signaling, how can
523	   a user have different experiences and different services when the
524	   signaling message is the same?  There has to be something different
525	   in the messages, if the user experience was in fact different.

527	   To illustrate this, let us take each of the example services in
528	   Section 4 and investigate whether there is, or should be, something
529	   different in the signaling in each case.

531	   IPTV vs. Multimedia Conferencing:  The two services in Section 4.1
532	      appear to have identical signaling.  They both involve audio and
533	      video streams, both of which are unidirectional.  Both might
534	      utilize the same codecs.  However, there is another important
535	      difference in the signaling - the target URI.  In the case of
536	      IPTV, the request is targeted at a media server or to a particular
537	      piece of content to be viewed.  In the case of multimedia
538	      conferencing, the target is a conference server.  The
539	      administrator of the domain can therefore examine the Request-URI
540	      of a request, figure out whether it targets a conference server
541	      or a content server, and use that to derive the service associated
542	      with the request.

544	   Gaming vs. Voice Chat:  Though both sessions involve MSRP and voice,
545	      and both are targeted to the same AOR of the called user, there is
546	      a difference.  The MSRP messages for the gaming session carry
547	      content which is game specific, whereas the MSRP messages for the
548	      voice chat are just regular text, meant for rendering to a user.
549	      Thus, the MSRP session in the SDP will indicate the specific
550	      content type that MSRP is carrying, and this type will differ in
551	      both cases.  Even if the game moves look like text, since they are
552	      being consumed by an automaton, there is an underlying schema that
553	      dictates their content, and therefore, this schema represents the
554	      actual content type that should be signaled.

556	   Configuration vs. Pager Messaging:  Just as in the case of gaming vs.
557	      voice chat, the content type of the messages differentiates the
558	      service that occurs as a consequence of the messages.

560	   This is ultimately an expression of the principle of DWIM vs. DWIS
561	   (Do-What-I-Mean vs. Do-What-I-Say).  Explicit signaling is DWIS - the
562	   user is asking for a service by invoking the signaling that results
563	   in the desired effect.  A service identifier is DWIM - an unspecific
564	   request for something that is ill-defined and non-interoperable.
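
   As a rough sketch of deriving the service from the signaling alone,
   the function below looks at the request target first and then at the
   content types offered for MSRP.  The server URIs, the service labels,
   and the "application/x-game-moves" type are hypothetical, and the SDP
   handling is deliberately minimal: it only reads a=accept-types lines
   and is not a full SDP parser.

```python
# Hypothetical provisioning data: targets this domain knows about.
CONTENT_SERVERS = {"sip:channel5@tv.example.com"}
CONFERENCE_SERVERS = {"sip:weekly@conf.example.com"}
GAME_TYPE = "application/x-game-moves"  # invented content type


def msrp_accept_types(sdp):
    """Collect the types listed in any a=accept-types attribute of the SDP."""
    types = set()
    for line in sdp.splitlines():
        if line.startswith("a=accept-types:"):
            types.update(line.split(":", 1)[1].split())
    return types


def derive_service(request_uri, sdp):
    """Derive a domain-local service label from the signaling alone."""
    if request_uri in CONTENT_SERVERS:
        return "iptv"
    if request_uri in CONFERENCE_SERVERS:
        return "multimedia conferencing"
    types = msrp_accept_types(sdp)
    if GAME_TYPE in types:
        return "gaming"
    if "text/plain" in types:
        return "voice chat"
    return "unknown"
```

   Nothing in the request names the service; the label is computed by
   the domain from fields the request has to carry anyway.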

566	6.2.  Perils of Explicit Identifiers

568	   Given that the information in the signaling message always conveys
569	   enough information to identify the service, another important
570	   conclusion can be drawn:

572	      Inclusion of an explicit service identifier within a message is,
573	      at best, redundant, and at worst, an avenue for fraud, loss of
574	      interoperability, and stifling of service innovation.

576	   By "explicit service identifier", we mean a field included in the
577	   signaling message that contains a token whose value indicates the
578	   specific service invoked by the calling user.  This would be "IPTV"
579	   or "voice chat" or "shoot-em game" or "short message service".  This
580	   explicit identifier would typically be inserted by the originating
581	   user agent, and carried in the signaling message.

583	   Clearly, if the signaling message itself contains enough information
584	   to identify the service, inclusion of an extra field to say the same
585	   thing is going to be redundant.  Redundancy by itself is not a big
586	   deal.  However, redundancy can lead to other, more significant
587	   problems.

589	6.2.1.  Fraud

591	   First and foremost, it can lead to fraud.  If a provider uses the
592	   service identifier for billing and accounting purposes, or for
593	   authorization purposes, it opens an avenue for attack.  The user can
594	   construct the signaling message so that its actual effect (which is
595	   the service the user will receive), is what the user desires, but the
596	   service identity (which is what is used for billing and
597	   authorization) doesn't match, and indicates a cheaper service, or one
598	   that the user is authorized to receive.  If, however, the service
599	   identity used by the domain administrator is derived from the
600	   signaling itself, the user cannot lie.  If they did lie, they
601	   wouldn't get the desired service.

603	   Consider the example of IPTV vs. multimedia conferencing.  If
604	   multimedia conferencing is cheaper, the user could send an INVITE for
605	   an IPTV session, but include a service identifier which indicates
606	   multimedia conferencing.  They get the service associated with IPTV,
607	   but at the cost of multimedia conferencing.

609	   This same principle shows up in other places, for example, in the
610	   identification of an emergency services call [6].  It is desirable to
611	   give emergency services calls special treatment, such as being free,
612	   authorized even when the user cannot otherwise make calls, and
613	   prioritized.  If emergency calls were indicated through
614	   something other than the target of the call being an emergency
615	   services URN [7], it would open an avenue for fraud.  The user could
616	   place any desired URI in the request-URI, and indicate that the call
617	   is an emergency services call.  This call would then get special
618	   treatment, but of course get routed to the target URI.  The only way
619	   to prevent this fraud is to consider an emergency call as any call
620	   whose target is an emergency services URN.  Thus, the service
621	   identification here is based on the target of the request.  When the
622	   target is an emergency services URN, the request can get special
623	   treatment.  The user cannot lie, since there is no way to separately
624	   indicate this is an emergency call, besides targeting it to an
625	   emergency URN.
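
   A minimal sketch of this rule follows.  It assumes that emergency
   service URNs are "urn:service:sos" and its dotted sub-services, as in
   the service URN proposal [7]; the function name is hypothetical.

```python
def is_emergency(request_uri):
    """Treat a request as an emergency call only if its target says so."""
    uri = request_uri.strip().lower()
    return uri == "urn:service:sos" or uri.startswith("urn:service:sos.")
```

   Because the decision depends only on the target, a request that
   claims special treatment but is addressed to an ordinary SIP URI is
   simply not an emergency call.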

627	6.2.2.  Systematic Interoperability Failures

629	   How can inclusion of an explicit service identifier cause loss of
630	   interoperability?  When such an identifier is used to drive
631	   functionality - such as dispatch on the phones, in the network, or
632	   QoS authorization, it means that the wrong thing can happen when this
633	   field is not set properly.  Consider a user in domain 1, calling a
634	   user in domain 2.  Domain 1 provides the user with a service they
635	   call "voice chat", which utilizes voice and IM for real time
636	   conversation, driven off of a buddy list application on a PC.  Domain
637	   2 provides their users with a service they call "text telephony",
638	   which is a voice service on a wireless device that also allows the
639	   user to send text messages.  Consider the case where domain 1 and
640	   domain 2 both have their user agents insert service identifiers
641	   into the request, and then use that to derive QoS authorization,
642	   accounting, and invocation of applications in the network and in the
643	   device.  The user in domain 1 calls the user in domain 2, and inserts
644	   the identifier "Voice Chat" into the INVITE.  When this arrives at
645	   the proxy in domain 2, the service is unknown.  Consequently, the
646	   request does not get the proper QoS treatment.  When it gets
647	   delivered to the User Agent of the user in domain 2, the user agent
648	   does not see a service it understands, and so consequently, does not
649	   know to dispatch the request to the right application software.
650	   Thus, this call has completely failed, even when it could have
651	   succeeded.  This illustrates the following key point:

653	      Explicit service identifiers, used between domains, cause
654	      interoperability failures unless all interconnected domains agree
655	      on exactly the same set of services and how to name them.

657	   Of course, lack of service identifiers does not guarantee service
658	   interoperability.  However, SIP was built with rich tools for
659	   negotiation of capabilities at a finely granular level.  One user
660	   agent can make a call using audio and video, but if the receiving UA
661	   only supports audio, SIP allows both sides to negotiate down to the
662	   lowest common denominator.  Thus, communication is still provided.
663	   As another example, if one agent initiates a Push-To-Talk session
664	   (which is audio with a companion floor control mechanism), and the
665	   other side only did regular audio, SIP would be able to negotiate
666	   back down to a regular voice call.  As another example, if a calling
667	   user agent is running a high-definition video conferencing endpoint,
668	   and the called user agent supports just a regular video endpoint, the
669	   codecs themselves can negotiate downward to a lower rate, picture
670	   size, and so on.  Thus, interoperability is achieved.  Interestingly,
671	   the final "service" may no longer be well characterized by the
672	   service identifier that would have been placed in the original
673	   INVITE.  For example, in this case, if the original INVITE from the
674	   caller had contained the service identifier, "hi-fi video", but the
675	   video gets negotiated down to a lower rate and picture size, the
676	   service identifier is no longer really appropriate.

678	   This illustrates another key aspect of the interoperability problem:

680	      Usage of explicit service identifiers in the request will result
681	      in inconsistencies with results of any SIP negotiation that might
682	      otherwise be applied in the session.
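
   The negotiation described above can be sketched very roughly as
   follows.  This is a simplification, not an implementation of SDP
   offer/answer: media streams are reduced to their type names, and the
   function name is hypothetical.

```python
def negotiate(offered, supported):
    """Pair each offered media type with whether the answerer accepts it.

    In SDP offer/answer a rejected stream stays in the answer with its
    port set to zero; a boolean stands in for that here.
    """
    return [(media, media in supported) for media in offered]


# The caller offers audio and video; the callee only supports audio.
result = negotiate(["audio", "video"], {"audio"})
print(result)
```

   The session proceeds with audio only.  A service identifier chosen
   by the caller before this exchange would no longer describe what was
   actually established.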

684	   Of course, there are cases where negotiating to a common baseline is
685	   not what is desired.  SIP provides tools (such as Require), to force
686	   the call to fail unless the desired capabilities are supported.
687	   However, this is not recommended as a general rule [4].

689	   When a service identifier becomes something that both proxies and the
690	   user agent need to understand in order to properly treat a request,
691	   it becomes equivalent to including a token in the Proxy-Require and
692	   Require header fields of every single SIP request.  The very reason
693	   that RFC 4485 frowns upon usage of Require and certainly Proxy-
694	   Require is the huge impact on interoperability it causes.  It is for
695	   this same reason that explicit service identifiers need to be
696	   avoided:

698	      The usage of explicit service identifiers is equivalent to the
699	      usage of Require and Proxy-Require in the request, and has the
700	      same negative impact on interoperability as those headers have.

702	6.2.3.  Stifling of Service Innovation

704	   The probability that any two service providers end up with
705	   the same set of services, and give them the same names, becomes
706	   vanishingly small as the number of providers grows.  Indeed, it would
707	   almost certainly require a centralized authority to identify what the
708	   services are, how they work, and what they are named.  This, in turn,
709	   leads to a requirement for complete homogeneity in order to
710	   facilitate interconnection.  Two providers cannot usefully
711	   interconnect unless they agree on the set of services they are
712	   offering to their customers, and each do the same thing.  This is, in
713	   a very real sense, anathema to the entire notion of SIP, which is
714	   built on the idea that heterogeneous domains can interconnect and
715	   still get interoperability:

717	      Explicit service identifiers lead to a requirement for homogeneity
718	      in service definitions across providers that interconnect, ruining
719	      the very service heterogeneity that SIP was meant to bring.

721	   Indeed, Metcalfe's law says that the value of a network grows with
722	   the square of the number of participants.  As a consequence of this,
723	   once a bunch of large domains did get together, agree on a set of
724	   services, and then a set of well-known identifiers for those
725	   services, it would force other providers to also deploy the same
726	   services, in order to obtain the value that interconnection brings.
727	   This, in turn, will stifle innovation, and quickly force the set of
728	   services in SIP to become fixed and never expand beyond the ones
729	   initially agreed upon.  This, too, is anathema to the very framework
730	   on which SIP is built, and defeats much of the purpose of why
731	   providers have chosen to deploy SIP in their own networks:

733	      Metcalfe's law, when combined with explicit service identifiers,
734	      will stifle the ability of providers to develop new SIP services,
735	      since they have no hope of interconnecting them with anyone else.

737	   Consider the following example.  Several providers get together, and
738	   standardize on a bunch of service identifiers.  One of these uses
739	   audio and video (say, "multimedia conversation").  This service is
740	   successful, and is widely utilized.  Endpoints look for this
741	   identifier to dispatch calls to the right software applications, and
742	   the network looks for it to invoke features, perform accounting, and
743	   QoS.  A new provider gets the idea for a new service, say, avatar-
744	   enhanced multimedia conversation.  In this service, there is audio
745	   and video, but there is a third stream, which renders an avatar.  A
746	   caller can press buttons on their phone, to cause the avatar on the
747	   other person's device to show emotion, make noise, and so on.  This
748	   is similar to the way emoticons are used today in IM.  This service
749	   is enabled by adding a third media stream (and consequently, third
750	   m-line) to the SDP.

752	   Normally, this service would be backwards compatible with a regular
753	   audio-video endpoint, which would just reject the third media stream.
754	   However, because a large network has been deployed that is expecting
755	   to see the token, "multimedia conversation" and its associated audio+
756	   video service, it is nearly impossible for the new provider to roll
757	   out this new service.  If they did, it would fail completely, or
758	   partially fail, when their users call users in other provider
759	   domains.

761	7.  Recommendations

763	   From these principles, several recommendations can be made:

765	   o  Systems needing to perform service identification must examine
766	      existing signaling constructs to identify the service based on
767	      fields that exist within the signaling message already.

769	   o  If it appears that the signaling currently defined in standards is
770	      not sufficient to identify the service, it may be due to lack of
771	      sufficient signaling to convey what is needed, and new standards
772	      work should be undertaken to fill this gap.

774	   o  The usage of an explicit service identifier does make sense as a
775	      way to cache a decision made by a network element, for usage by
776	      another network element within the same domain.  However, service
777	      identifiers are fundamentally useful only within a single domain,
778	      and any such header must be stripped at a network boundary.
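
   The last recommendation might be sketched as follows for an element
   at the edge of a domain.  The header name "X-Service-Id" is purely
   hypothetical, since this document does not define such a header, and
   the request is modeled as a plain list of (name, value) pairs.

```python
SERVICE_HEADER = "x-service-id"  # hypothetical, domain-local header name


def strip_service_header(headers):
    """Drop the domain-local service header before a request leaves the domain.

    `headers` is a list of (name, value) pairs; SIP header field names
    are compared case-insensitively.
    """
    return [(n, v) for n, v in headers if n.lower() != SERVICE_HEADER]
```

   The cached decision stays usable inside the domain, while the next
   domain derives the service for itself from the signaling.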

780	8.  Security Considerations

782	   Oftentimes, the service associated with a request is utilized for
783	   purposes such as authorization, accounting, and billing.  When
784	   service identification is not done properly, the possibility of
785	   network fraud is introduced.  It is for this reason, discussed
786	   extensively in Section 6.2.1, that the usage of explicit service
787	   identifiers inserted by a UA is NOT RECOMMENDED.

789	9.  IANA Considerations

791	   There are no IANA considerations associated with this specification.

793	10.  Acknowledgements

795	   This document is based on discussions with Paul Kyzivat and Andrew
796	   Allen, who contributed significantly to the ideas here.  Much of the
797	   content in this draft is a result of discussions amongst participants
798	   in the SIPPING mailing list, including Dean Willis, Tom Taylor, Eric
799	   Burger, Dale Worley, Christer Holmberg, and John Elwell, amongst many
800	   others.

802	11.  References

804	11.1.  Normative References

806	   [1]  Bradner, S., "Key words for use in RFCs to Indicate Requirement
807	        Levels", BCP 14, RFC 2119, March 1997.

809	   [2]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
810	        Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP:
811	        Session Initiation Protocol", RFC 3261, June 2002.

813	11.2.  Informational References

815	   [3]  Rosenberg, J., "A Data Model for Presence", RFC 4479, July 2006.

817	   [4]  Rosenberg, J. and H. Schulzrinne, "Guidelines for Authors of
818	        Extensions to the Session Initiation Protocol (SIP)", RFC 4485,
819	        May 2006.

821	   [5]  Campbell, B., "The Message Session Relay Protocol",
822	        draft-ietf-simple-message-sessions-19 (work in progress),
823	        February 2007.

825	   [6]  Rosen, B., "Framework for Emergency Calling in Internet
826	        Multimedia", draft-ietf-ecrit-framework-01 (work in progress),
827	        March 2007.

829	   [7]  Schulzrinne, H., "A Uniform Resource Name (URN) for Services",
830	        draft-ietf-ecrit-service-urn-06 (work in progress), March 2007.

832	   [8]  Campbell, B., Rosenberg, J., Schulzrinne, H., Huitema, C., and
833	        D. Gurle, "Session Initiation Protocol (SIP) Extension for
834	        Instant Messaging", RFC 3428, December 2002.

836	Author's Address

838	   Jonathan Rosenberg
839	   Cisco
840	   Edison, NJ
841	   US

843	   Email: jdrosen@cisco.com
844	   URI:   http://www.jdrosen.net

846	Full Copyright Statement

848	   Copyright (C) The IETF Trust (2007).

850	   This document is subject to the rights, licenses and restrictions
851	   contained in BCP 78, and except as set forth therein, the authors
852	   retain all their rights.

854	   This document and the information contained herein are provided on an
855	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
856	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
857	   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
858	   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
859	   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
860	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

862	Intellectual Property

864	   The IETF takes no position regarding the validity or scope of any
865	   Intellectual Property Rights or other rights that might be claimed to
866	   pertain to the implementation or use of the technology described in
867	   this document or the extent to which any license under such rights
868	   might or might not be available; nor does it represent that it has
869	   made any independent effort to identify any such rights.  Information
870	   on the procedures with respect to rights in RFC documents can be
871	   found in BCP 78 and BCP 79.

873	   Copies of IPR disclosures made to the IETF Secretariat and any
874	   assurances of licenses to be made available, or the result of an
875	   attempt made to obtain a general license or permission for the use of
876	   such proprietary rights by implementers or users of this
877	   specification can be obtained from the IETF on-line IPR repository at
878	   http://www.ietf.org/ipr.

880	   The IETF invites any interested party to bring to its attention any
881	   copyrights, patents or patent applications, or other proprietary
882	   rights that may cover technology that may be required to implement
883	   this standard.  Please address the information to the IETF at
884	   ietf-ipr@ietf.org.

886	Acknowledgment

888	   Funding for the RFC Editor function is provided by the IETF
889	   Administrative Support Activity (IASA).