SIMPLE WG                                                       A. Houri
Internet-Draft                                                       IBM
Intended status: Standards Track                                 T. Rang
Expires: August 30, 2007                           Microsoft Corporation
                                                                 E. Aoki
                                                                 AOL LLC
                                                                V. Singh
                                                          H. Schulzrinne
                                                             Columbia U.
                                                       February 26, 2007

                    Problem Statement for SIP/SIMPLE
         draft-ietf-simple-interdomain-scaling-analysis-00.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 30, 2007.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Houri, et al.            Expires August 30, 2007                [Page 1]

Internet-Draft      Problem Statement for SIP/SIMPLE       February 2007

Abstract

   This document analyses the traffic generated by presence
   subscriptions between domains and shows that the amount of traffic
   can be extremely large.  It also analyses the effects of a large
   presence system on memory footprint and CPU load.  Several suggested
   optimizations to the SIMPLE protocol are analysed, together with
   their possible impact on the load.

Table of Contents

   1.  Requirements notation
   2.  Introduction
   3.  Message Load
     3.1.  Known Optimizations
     3.2.  Assumptions
     3.3.  Analysis
     3.4.  SIMPLE with no optimizations
     3.5.  SIMPLE with suggested optimizations
     3.6.  Presence Federations
       3.6.1.  Widely distributed inter-domain presence
       3.6.2.  Associated inter-domain presence
       3.6.3.  Very large network peering
       3.6.4.  Intra-domain peering
   4.  Resource List Service
   5.  State Management
     5.1.  State Size Calculations
       5.1.1.  Tiny System
       5.1.2.  Medium System
       5.1.3.  Large System
       5.1.4.  Very Large System
   6.  Processing complexities
     6.1.  Aggregation
     6.2.  Partial Publish and Notify
     6.3.  Filtering
     6.4.  Privacy
   7.  Possible Optimizations
     7.1.  Common NOTIFY for multiple watchers
       7.1.1.  Privacy filtering
       7.1.2.  NOTIFY failure aggregation
       7.1.3.  Transferring the watcher list
       7.1.4.  Message flow example
       7.1.5.  SIP message examples for common NOTIFY
     7.2.  Aggregation of NOTIFY messages (Batched notification)
       7.2.1.  Extracting and sending individual NOTIFY using
               Aggregated NOTIFY message body
       7.2.2.  Subscription termination and failure indication in
               NOTIFY delivery
       7.2.3.  Message flow example
       7.2.4.  SIP message flow example for batched notification
     7.3.  Timed presence
     7.4.  On-Demand presence (Fetch or Pull Model)
     7.5.  Adapting the subscription rate
     7.6.  Other Optimizations
   8.  Extremely Optimized Model
   9.  Suggested Requirements
   10. Conclusions
   11. Security Considerations
   12. Acknowledgments
   13. References
     13.1.  Normative References
     13.2.  Informational References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Requirements notation

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [1].

2.  Introduction

   This document analyses the traffic generated by presence
   subscriptions between domains and shows that the amount of traffic
   can be extremely large.  It also analyses the effects of a large
   presence system on memory footprint and CPU load.  Several suggested
   optimizations to the SIMPLE protocol are analysed, together with
   their possible impact on the load.

   Although this is an analysis document and not a BCP document, it
   lists several possible optimizations and directions, together with
   an initial set of requirements that characterize a solution to the
   problem stated here.

   This document is intended to be used by the SIMPLE WG in order to
   work on possible solutions that will make the deployment of a
   presence server a more reasonable task.

   Note that the document does not try to compare SIP-based presence
   servers to other types of presence servers; it analyses only the
   SIP-based presence server.  It is very likely that similar
   scalability issues are inherent to the deployment of presence
   systems in general rather than to a particular protocol.

   The document discusses the following areas.  In each area we try to
   show the complexity and the load that the presence server has to
   handle in order to provide its service.

   o  Message load - By computing the number of messages required to
      connect presence systems, the document shows that the number of
      messages is very large and that some optimizations are clearly
      needed.  We also show that the required bandwidth is very large.

   o  State management - Due to the nature of its service, the presence
      server has to manage a relatively large and complex state.  Some
      calculations are provided in the document.

   o  Processing complexities - The presence server maintains many small
      objects and has to do frequent operations on these objects.
      We show that these operations, and especially the optimizations
      intended to reduce the amount of data sent between watchers and
      presence servers, are not simple and may create a very heavy
      processing load on the presence server.

   o  Groups - Resource List Servers [12] reduce the number of sessions
      that are created between the watchers and the presence server.
      On the other hand, this optimization may create exponential
      growth in the number of subscriptions due to the unbearable ease
      of subscribing to large groups.

   The term presence domain or presence system appears in the document
   several times.  By this term we refer to a presence server that
   provides presence subscription and notification services to its
   users.  The system can be deployed in a small enterprise or in a
   very large consumer network.

3.  Message Load

   Even though some optimizations are approved or are being defined, we
   show in this section that a very large number of messages and a
   large bandwidth are needed in order to establish federation between
   presence systems of large communities.  Further thinking is needed
   in order to make large deployments of presence systems less resource
   demanding.

   Note that even though this document talks about inter-domain
   traffic, the introduction of resource list servers (RLSs) [12]
   creates a traffic pattern within a domain that is very similar to
   the pattern between domains.  See the detailed discussion of
   resource lists in Section 4.

3.1.  Known Optimizations

   The current optimizations that are approved or considered in the
   SIMPLE group can be divided into two categories:

   o  Dialog-saving optimizations - Here we refer to optimizations such
      as the resource list RFC [12] or the URI list subscriptions draft
      [18].
      These documents define ways to reduce the number of dialogs
      required between the subscriber and the presence system.

   o  Notification optimizations - Here we refer to the optimizations
      suggested in the subnot-etags draft [20].  This draft suggests
      ways to suppress unnecessary NOTIFY messages, for example when a
      subscription is refreshed.

   Other drafts reduce the size of messages, for example through
   partial notifications or filtering, but in this document we mostly
   care about the number of messages and the bandwidth.

3.2.  Assumptions

   This document makes several assumptions regarding the size of
   messages, the rate of presence changes and more.  These assumptions
   are not based on rigorous statistics gathered from actual SIP-based
   systems but on experience with other types of presence systems.

   The target here is not to analyse a specific system but to show that
   even with VERY moderate assumptions, the number of messages, the
   network bandwidth, the required state management and the load on the
   CPU are very high.  Real-life systems are likely to have much bigger
   scalability requirements.  For example, the presence state change
   rate that we assumed (three presence state changes per hour) is
   perhaps one of the most moderate assumptions that we have taken.
   Experience from consumer networks shows that the frequency is much
   higher, especially with the younger generation.  In an environment
   where a user may have several devices and other sources of presence
   information, such as geographical location and calendar, the
   frequency of presence state changes will be much higher.

   It is very hard to measure presence load since the behavior of users
   is very different.
   Some users will have a very small number of presentities in their
   watch list while others may have hundreds.  Some users will change
   their state a lot and have many sources of presence information
   while others may have a very small number of changes during the day.
   In addition, a "rush hour" calculation is not yet included in this
   document (to be added).  Rush hour differs between enterprises and
   differs again in consumer presence systems.  It is very hard, if not
   impossible, to capture all the possible combinations in a static
   model.

   That said, several things remain to be done to create a more
   complete picture:

   o  Get rigorous statistical data that can be formally published from
      real presence systems

   o  Add to the model the possibility of having multiple sources of
      presence data per presentity and change the calculations
      accordingly

   o  Add "rush hour" calculations for the end and the beginning of the
      day

   The authors will especially appreciate any input in this area that
   will help us create a more realistic model.  We intend to gather
   more data and improve the assumptions and the model in the next
   revisions of this document.

3.3.  Analysis

   The basic SIMPLE subscription dialog involves the following message
   transfer:

   o  SUBSCRIBE/200

   o  Initial NOTIFY/200

   o  (j) NOTIFY/200 where 'j' is the number of presence changes seen
      by the watcher

   o  (k) SUBSCRIBE/200 where 'k' is the number of subscription dialog
      refresh periods

   o  SUBSCRIBE/200 with Expires = 0 to terminate the dialog

   o  NOTIFY/200 ending the dialog

   An individual watcher will generate X SIMPLE subscription dialogs,
   where X is the number of presentities it chooses to watch.
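   The message list above can be turned into a simple count.  The
   following is a minimal sketch in Python; the function name and the
   sample values of j and k are illustrative assumptions and are not
   part of the draft.  It counts each request together with its 200
   response and counts only the exchanges listed above.

```python
def dialog_messages(presence_changes: int, refreshes: int) -> int:
    """Count SIP messages (requests plus 200 responses) in one dialog.

    presence_changes is 'j' and refreshes is 'k' from the list above.
    """
    setup = 2 + 2                    # SUBSCRIBE/200 and initial NOTIFY/200
    changes = 2 * presence_changes   # (j) NOTIFY/200
    refresh = 2 * refreshes          # (k) SUBSCRIBE/200
    teardown = 2 + 2                 # SUBSCRIBE (Expires = 0)/200 and final NOTIFY/200
    return setup + changes + refresh + teardown


# Illustrative values only: 24 presence changes and 8 refresh periods.
print(dialog_messages(24, 8))  # prints 72
```

   Even a dialog with no presence changes and no refreshes costs 8
   messages, so the fixed overhead matters when a watcher opens one
   dialog per presentity.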
   The amount of traffic generated is significantly affected by several
   factors:

   o  Number of watchers connected to the system

   o  Number of presentities connected to the system

   o  Frequency of changes to presence information

   This document contains several calculations that show the expected
   message rate and bandwidth between presence domains.  The following
   explains the assumptions and methods behind the calculations:

   o  (A01) Subscription lifetime (hours) - The assumed lifetime of a
      subscription in hours.  Here we assume 8 hours for all
      calculations.

   o  (A02) Presence state changes / hour - The average number of times
      that a presentity changes its status in one hour.  We assumed 3
      times per hour for most calculations.  Note that for some users
      in consumer messaging systems, the actual number of changes is
      likely to be much higher.

   o  (A03) Subscription refresh interval / hour - The duration of the
      SUBSCRIBE session after which it needs to be refreshed.  We
      assumed that the duration is one hour.

   o  (A04) Total federated presentities per watcher - The number of
      presentities that the watcher is watching.  This number changes
      in this document according to the type of the specific
      deployment.

   o  (A05) Number of dialogs to maintain per watcher - The number of
      SUBSCRIBE dialogs that are maintained per watcher.  If a dialog
      optimization is not assumed this number is equal to A04,
      otherwise it is 1.

   o  (A06) Number of watchers in a federated presence domain - The
      number of watchers in one presence domain that watch presentities
      in the other domain.  This number varies according to the
      assumptions for a specific deployment.
   o  (A07) Initial SUBSCRIBE/200 per watcher = A05*2 (message and an
      OK)

   o  (A08) Initial NOTIFY/200 per watcher = A05*2 (message and an OK)

   o  (A09) Total initial messages = (A07+A08)*A06

   o  (A10) NOTIFY/200 per watched presentity = (A02*A01*A04*2)
      (message and an OK)

   o  (A11) SUBSCRIBE/200 refreshes = (A01/A03)*A05*2 (message and an
      OK)

   o  (A12) NOTIFY/200 due to subscribe refresh - In a deployment where
      the notification optimization is not deployed this number is
      (A01/A03)*A05*2 (message and an OK), otherwise it is 0

   o  (A13) Number of steady state messages = (A10+A11+A12)*A06

   o  (A14) SUBSCRIBE termination = A05*2 (message and an OK)

   o  (A15) NOTIFY terminated = A05*2 (message and an OK)

   o  (A16) Number of sign-out messages = (A14+A15)*A06

   o  (A17) Total messages between domains (both directions, where
      users from domain A subscribe to users from domain B and vice
      versa) = (A09+A13+A16)*2

   o  (A18) Total number of messages / second = A17/A01/3600 (seconds
      in an hour)

   o  (A19) Total number of K bytes per second, assuming 1K bytes per
      SUBSCRIBE/200 pair and 4K bytes per NOTIFY/200 pair.  Note that
      in reality the NOTIFY size may be much bigger, but using partial
      NOTIFY should reduce the size considerably.

3.4.  SIMPLE with no optimizations

   The following table uses some common presence characteristics to
   demonstrate the effect these factors have on state and message rate
   within a presence domain using base SIMPLE protocols without any
   proposed optimizations.

   In this example, there are two presence domains, each with 20,000
   federating users with an average of 4 contacts in the peer domain.
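   The arithmetic behind formulas A07 through A18 can be sketched in a
   few lines of Python, using the inputs of this example.  This is a
   minimal sketch and not part of the draft: the class, field and
   function names are illustrative assumptions, the refresh count
   assumes that A03 divides A01 evenly, and A12 is counted as a NOTIFY
   plus its 200 OK per refresh (so that it equals A11 when the
   notification optimization is absent), which matches the per-watcher
   values used in the tables of this section.  The bandwidth estimate
   (A19) is left out.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Inputs A01 to A06; the field names are illustrative."""
    lifetime_hours: int      # A01
    changes_per_hour: int    # A02
    refresh_hours: int       # A03
    presentities: int        # A04
    dialogs: int             # A05
    watchers: int            # A06
    notify_on_refresh: bool  # False when the notification optimization is used


def message_load(d: Deployment) -> dict:
    """Apply formulas A07 to A18 and return the main derived values."""
    a07 = d.dialogs * 2                                  # initial SUBSCRIBE/200
    a08 = d.dialogs * 2                                  # initial NOTIFY/200
    a09 = (a07 + a08) * d.watchers                       # total initial messages
    a10 = d.changes_per_hour * d.lifetime_hours * d.presentities * 2
    # Assumes the refresh interval divides the lifetime evenly.
    a11 = (d.lifetime_hours // d.refresh_hours) * d.dialogs * 2
    a12 = a11 if d.notify_on_refresh else 0              # NOTIFY/200 per refresh
    a13 = (a10 + a11 + a12) * d.watchers                 # steady state messages
    a14 = d.dialogs * 2                                  # SUBSCRIBE termination
    a15 = d.dialogs * 2                                  # NOTIFY terminated
    a16 = (a14 + a15) * d.watchers                       # sign-out messages
    a17 = (a09 + a13 + a16) * 2                          # both directions
    a18 = a17 / d.lifetime_hours / 3600                  # messages per second
    return {"A09": a09, "A13": a13, "A16": a16, "A17": a17, "A18": a18}


base = Deployment(8, 3, 1, 4, 4, 20_000, notify_on_refresh=True)
optimized = Deployment(8, 3, 1, 4, 1, 20_000, notify_on_refresh=False)
for name, d in (("no optimizations", base), ("optimizations", optimized)):
    r = message_load(d)
    print(f"{name}: {r['A17']:,} messages, {r['A18']:.0f} per second")
```

   With these inputs the sketch yields 14,080,000 messages (489 per
   second) without optimizations and 8,640,000 messages (300 per
   second) with both optimizations, in line with Figure 1 and Figure 2.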
   (A01) Subscription lifetime (hours)...........................8
   (A02) Presence state changes / hour...........................3
   (A03) Subscription refresh interval / hour....................1
   (A04) Total federated presentities per watcher................4
   (A05) Number of dialogs to maintain per watcher...............4
   (A06) Number of watchers in a federated presence domain..20,000
   (A07) Initial SUBSCRIBE/200 per watcher.......................8
   (A08) Initial NOTIFY/200 per watcher..........................8
   (A09) Total initial messages............................320,000
   (A10) NOTIFY/200 per watched presentity.....................192
   (A11) SUBSCRIBE/200 refreshes................................64
   (A12) NOTIFY/200 due to subscribe refresh....................64
   (A13) Number of steady state messages.................6,400,000
   (A14) SUBSCRIBE termination...................................8
   (A15) NOTIFY terminated.......................................8
   (A16) Number of sign-out messages.......................320,000
   (A17) Total messages between domains.................14,080,000
   (A18) Total number of messages / second.....................489
   (A19) Total number of bytes / second on the wire..........830KB

               Figure 1: SIMPLE with no optimizations

3.5.  SIMPLE with suggested optimizations

   The same analysis is repeated here with the assumption that both the
   dialog and the notification optimizations are applied.  Note that
   the sign-in (ramp up) and sign-out message flows benefit most; the
   steady state rate improves much less because the NOTIFYs caused by
   presence changes (A10) are not affected.

   The optimizations enable the creation of a single dialog to the
   other domain from each watcher for the set of presentities it is
   watching.
   The optimizations also remove the need for a NOTIFY when a SUBSCRIBE
   is refreshed, since that NOTIFY would only repeat the state that was
   sent at the presentity's last state change.

   (A01) Subscription lifetime (hours)...........................8
   (A02) Presence state changes / hour...........................3
   (A03) Subscription refresh interval / hour....................1
   (A04) Total federated presentities per watcher................4
   (A05) Number of dialogs to maintain per watcher...............1
   (A06) Number of watchers in a federated presence domain..20,000
   (A07) Initial SUBSCRIBE/200 per watcher.......................2
   (A08) Initial NOTIFY/200 per watcher..........................2
   (A09) Total initial messages.............................80,000
   (A10) NOTIFY/200 per watched presentity.....................192
   (A11) SUBSCRIBE/200 refreshes................................16
   (A12) NOTIFY/200 due to subscribe refresh.....................0
   (A13) Number of steady state messages.................4,160,000
   (A14) SUBSCRIBE termination...................................2
   (A15) NOTIFY terminated.......................................2
   (A16) Number of sign-out messages........................80,000
   (A17) Total messages between domains..................8,640,000
   (A18) Total number of messages / second.....................300
   (A19) Total number of bytes / second on the wire..........571KB

                 Figure 2: SIMPLE with optimizations

3.6.  Presence Federations

   While these scalability issues exist in any large deployment, some
   deployments have characteristics that make them conducive to the
   existing resource-list optimizations, and others have
   characteristics that cannot be exploited with the existing SIMPLE
   model.  Following is a list of federation relationships that have
   varying usage characteristics.
   For each, a table of message rates and bandwidth is provided.  These
   characteristics can alter the overall effectiveness of existing
   optimizations.

3.6.1.  Widely distributed inter-domain presence

   In some environments presence federation may be very common, perhaps
   even more common than intra-domain presence.  An example of this
   type of environment is a small ISV or public server.  Users in that
   small ISV are not likely to subscribe to the presence of other users
   on their server since they do not necessarily have any relationship
   with each other aside from receiving service from the same provider.
   They are much more likely to be subscribed to the presence of users
   in one of the federated domains (whether in consumer domains,
   academic, other ISVs, etc).

   Common characteristics of this deployment are:

   o  Federated subscriptions are the majority of subscription traffic

   o  Individual users are likely to subscribe to multiple users in any
      one domain

   o  The intersection of users in the deployment watching the same
      presentities is quite small (i.e., the probability that watchers
      in the domain subscribe to the same presentity is low)

   To account for the extraordinarily high percentage of federation
   traffic, the number of federated presentities is increased to 20.
   The number of watchers in the domain could also be adjusted to
   account for an expected larger community of users being peered with;
   this is omitted here for simplicity.

   The first table below provides the calculations without
   optimizations; the second table provides the calculations with
   optimizations.  Note that the number of messages per second
   decreases by about 44% with the optimizations (from 2,444 to 1,367)
   but is still quite large.  It is interesting to see that the
   bandwidth with optimizations is roughly a quarter of the bandwidth
   without them.
   (A01) Subscription lifetime (hours)...........................8
   (A02) Presence state changes / hour...........................3
   (A03) Subscription refresh interval / hour....................1
   (A04) Total federated presentities per watcher...............20
   (A05) Number of dialogs to maintain per watcher..............20
   (A06) Number of watchers in a federated presence domain..20,000
   (A07) Initial SUBSCRIBE/200 per watcher......................40
   (A08) Initial NOTIFY/200 per watcher.........................40
   (A09) Total initial messages..........................1,600,000
   (A10) NOTIFY/200 per watched presentity.....................960
   (A11) SUBSCRIBE/200 refreshes...............................320
   (A12) NOTIFY/200 due to subscribe refresh...................320
   (A13) Number of steady state messages................32,000,000
   (A14) SUBSCRIBE termination..................................40
   (A15) NOTIFY terminated......................................40
   (A16) Number of sign-out messages.....................1,600,000
   (A17) Total messages between domains.................70,400,000
   (A18) Total number of messages / second...................2,444
   (A19) Total number of bytes / second on the wire.........1968KB

   Figure 3: Widely distributed inter-domain with no optimizations

   (A01) Subscription lifetime (hours)...........................8
   (A02) Presence state changes / hour...........................3
   (A03) Subscription refresh interval / hour....................1
   (A04) Total federated presentities per watcher...............20
   (A05) Number of dialogs to maintain per watcher...............1
   (A06) Number of watchers in a federated presence domain..20,000
   (A07) Initial SUBSCRIBE/200 per watcher.......................2
   (A08) Initial NOTIFY/200 per watcher..........................2
   (A09) Total initial messages.............................80,000
   (A10) NOTIFY/200 per watched presentity.....................960
   (A11) SUBSCRIBE/200 refreshes................................16
   (A12) NOTIFY/200 due to subscribe refresh.....................0
   (A13) Number of steady state messages................19,520,000
   (A14) SUBSCRIBE termination...................................2
   (A15) NOTIFY terminated.......................................2
   (A16) Number of sign-out messages........................80,000
   (A17) Total messages between domains.................39,360,000
   (A18) Total number of messages / second...................1,367
   (A19) Total number of bytes / second on the wire..........571KB

    Figure 4: Widely distributed inter-domain with optimizations

3.6.2.  Associated inter-domain presence

   In this type of environment, the domain is a collection of
   associated users such as an enterprise.  Here, federation is once
   again very common.  However, there is also a strong association
   between some users in the deployment.  These associations make it
   somewhat more likely that users in that domain will be watchers of
   the same presentity.  This can occur because of business
   relationships (e.g., two co-workers on a project federating with a
   partner company).

   Common characteristics of this deployment are:

   o  Federated subscriptions are a large minority or a small majority
      of subscription traffic

   o  Individual users are likely to subscribe to multiple users in any
      one domain, especially their own

   o  The intersection of users in the deployment watching the same
      presentities increases

   This federation type has traffic rates similar to the previous
   examples but with different levels of association of the users.

3.6.3.  Very large network peering

   In this environment, two or more very large networks create a
   peering relationship allowing their users to subscribe to presence
   in the other domains.  Whereas the number of users in other
   deployment types ranges from hundreds to several hundred thousand,
   these large networks host up to hundreds of millions of users.
   Examples of these networks are large wireless carriers and consumer
   IM networks.

   Common characteristics of this deployment are:

   o  As users become accustomed to network boundaries disappearing,
      federated subscriptions become as common as subscriptions within
      the same domain

   o  Individual users are highly likely to want to see the presence of
      multiple presentities in the peer network

   o  The intersection of users in the deployment watching the same
      presentities is very high (i.e., two or more users in network A
      are extremely likely to be watching the same user in network B)

   o  Status changes increase greatly due to typical observed consumer
      behavior

   The first table below provides the calculations without
   optimizations; the second table provides the calculations with
   optimizations.  Even though the optimizations help a lot (they cut
   the number of messages by more than a quarter), the numbers are
   still very high.  Note also that the bandwidth required is very high
   (almost 1GB per second).

   (A01) Subscription lifetime (hours)..............................8
   (A02) Presence state changes / hour..............................6
   (A03) Subscription refresh interval / hour.......................1
   (A04) Total federated presentities per watcher..................10
   (A05) Number of dialogs to maintain per watcher.................10
   (A06) Number of watchers in a federated presence domain.10,000,000
   (A07) Initial SUBSCRIBE/200 per watcher.........................20
   (A08) Initial NOTIFY/200 per watcher............................20
   (A09) Total initial messages...........................400,000,000
   (A10) NOTIFY/200 per watched presentity........................960
   (A11) SUBSCRIBE/200 refreshes..................................160
   (A12) NOTIFY/200 due to subscribe refresh......................160
   (A13) Number of steady state messages...............12,800,000,000
   (A14) SUBSCRIBE termination.....................................20
   (A15) NOTIFY terminated.........................................20
   (A16) Number of sign-out messages......................400,000,000
   (A17) Total messages between domains................27,200,000,000
   (A18) Total number of messages / second....................944,444
   (A19) Total number of bytes / second on the wire.........880,555KB

      Figure 5: Very large network peering with no optimizations

   (A01) Subscription lifetime (hours)..............................8
   (A02) Presence state changes / hour..............................6
   (A03) Subscription refresh interval / hour.......................1
   (A04) Total federated presentities per watcher..................10
   (A05) Number of dialogs to maintain per watcher..................1
   (A06) Number of watchers in a federated presence domain.10,000,000
   (A07) Initial SUBSCRIBE/200 per watcher..........................2
   (A08) Initial NOTIFY/200 per watcher.............................2
   (A09) Total initial messages............................40,000,000
   (A10) NOTIFY/200 per watched presentity........................960
   (A11) SUBSCRIBE/200 refreshes...................................16
   (A12) NOTIFY/200 due to subscribe refresh........................0
   (A13) Number of steady state messages................9,760,000,000
   (A14) SUBSCRIBE termination......................................2
   (A15) NOTIFY terminated..........................................2
   (A16) Number of sign-out messages.......................40,000,000
   (A17) Total messages between domains................19,680,000,000
   (A18) Total number of messages / second....................683,333
   (A19) Total number of bytes / second on the wire.........545,833KB

       Figure 6: Very large network peering with optimizations

3.6.4.  Intra-domain peering

   Within a particular domain, multiple presence infrastructures are
   deployed with users split between the two.  This scenario is unique
   in that federated messages do not pass outside the administrative
   domain's network.  The two infrastructures peer directly inside the
   domain.  A common example of this is an enterprise IT system with
   multiple independent vendor presence solutions deployed (e.g., a
   presence solution for desktop messaging deployed alongside a
   presence solution for IP telephony).

   Common characteristics of this deployment are:

   o  The difference between subscriptions to presentities in one
      system vs. the other is completely arbitrary.  Any one presentity
      is as likely to be homed on one infrastructure as the other

   o  Active users are almost guaranteed to subscribe to many users in
      the peer infrastructure

   o  The level of intersection of presentities is extremely high

   The first table below provides the calculations without
   optimizations; the second table provides the calculations with
   optimizations.  Even though relatively conservative numbers are
   used, the number of messages is still very high, although the
   optimizations cut the message rate by about 43% and the bandwidth
   by more than half.

   (A01) Subscription lifetime (hours)..............................8
   (A02) Presence state changes / hour..............................3
   (A03) Subscription refresh interval / hour.......................1
   (A04) Total federated presentities per watcher..................10
   (A05) Number of dialogs to maintain per watcher.................10
   (A06) Number of watchers in a federated presence domain.....60,000
   (A07) Initial SUBSCRIBE/200 per watcher.........................20
   (A08) Initial NOTIFY/200 per watcher............................20
   (A09) Total initial messages.............................2,400,000
   (A10) NOTIFY/200 per watched presentity........................480
   (A11) SUBSCRIBE/200 refreshes..................................160
   (A12) NOTIFY/200 due to subscribe refresh......................160
   (A13) Number of steady state messages...................48,000,000
   (A14) SUBSCRIBE termination.....................................20
   (A15) NOTIFY terminated.........................................20
   (A16) Number of sign-out messages........................2,400,000
   (A17) Total messages between domains...................105,600,000
   (A18) Total number of messages / second......................3,667
(A19) Total number of bytes / second on the wire...........3,683KB

Figure 7: Inter-domain peering with no optimizations

(A01) Subscription lifetime (hours)..............................8
(A02) Presence state changes / hour..............................3
(A03) Subscription refresh interval / hour.......................1
(A04) Total federated presentities per watcher..................10
(A05) Number of dialogs to maintain per watcher..................1
(A06) Number of watchers in a federated presence domain.....60,000
(A07) Initial SUBSCRIBE/200 per watcher..........................2
(A08) Initial NOTIFY/200 per watcher.............................2
(A09) Total initial messages...............................240,000
(A10) NOTIFY/200 per watched presentity........................480
(A11) SUBSCRIBE refreshes.......................................16
(A12) NOTIFY/200 due to subscribe refresh........................0
(A13) Number of steady state messages...................29,760,000
(A14) SUBSCRIBE termination......................................2
(A15) NOTIFY terminated..........................................2
(A16) Number of sign-out messages..........................240,000
(A17) Total messages between domains....................60,480,000
(A18) Total number of messages / second......................2,100
(A19) Total number of bytes / second on the wire...........1,675KB

Figure 8: Inter-domain peering with optimizations

4. Resource List Service

RFC [12] defines a way to subscribe to a single URI that actually represents a list of resources, so that all of them are subscribed to through a single subscription.
Although this is quite a useful mechanism that significantly reduces the number of sessions between the watcher and the presence server (as we show in the message calculations), it has the potential to make the scalability issue of presence systems harder and more complex. The reasons that resource lists may make the scalability problem of the presence server even more complex are:

o  Subscriptions and state - The resource list may contain references to many other presence servers in many other domains. This requires the RLS to create subscriptions to other presence servers and to buffer the state of all presentities in order to be able to provide the full state of the presentities in the list when needed. So in the overall system, the subscriptions that were saved between the watcher and the presence server are moved to the backend system, while state is duplicated between the RLSs and the various presence servers that serve the various presentities. This issue could be mitigated if there were a way for the RLS to retrieve the presence information once for many watchers while still adhering to privacy rules when sending the actual notifications to the watchers.

o  Interlinkage - The resource list subscription will reach one RLS that will expand it and send it to many presence servers and to other RLSs (if there is a subgroup inside the list). This creates a complex linkage between the state of many components, which makes state management and other maintenance of a presence system quite complex.

o  Big lists are easy - There are two types of groups that may be used with this feature: private groups that are defined by/for each watcher, and public groups that are defined in the system and can be used by any watcher. Although we should expect IT administrators to take caution when creating public groups, this may not be the case in real life.
The connection between the size of a public group and the load on the presence server system may not be apparent to everyone. Furthermore, many public groups that are used in presence systems may have been created for other purposes, such as email systems (where the size of the lists was not so important), and are carried over unchanged to presence systems. So, for example, we may very easily find that a public group that covers all the users in the enterprise is used by many users in the enterprise, thus creating an unbearable load on the presence server. Note that this is not a protocol or design issue but rather a usage issue that may have a real impact on the presence system.

o  Stopping notifications - A watcher may accidentally subscribe to a very big list and be overwhelmed by the number of notifications that it receives from the presence server. There is currently no way to stop this stream of notifications, and even canceling the subscription may take time to become effective.

The issues mentioned above are one example of an optimization that helps in one part of the system but creates even bigger problems in the overall system. The problems listed above need attention, but more than that, there is a need to make sure that when an optimization is introduced it does not create issues in other places.

5. State Management

In the previous sections we discussed the large number of messages that need to be sent to/from a presence server. In this section the state that needs to be maintained by a presence server is analysed and shown to be far from trivial.

The presence server has two parallel tasks.

1. Maintain the state of the presentities to which watchers subscribe.

2.
Maintain the state of the subscriptions of watchers and provide timely updates to the watchers.

For a single subscription from a single watcher to a presentity, the presence server has to maintain the following state:

o  Subscription state, including all the parameters that are needed in order to maintain the subscription, such as timers.

o  Optional filtering information that was requested by the watcher, with enough detail to perform the filtering. Additional information has to be maintained if partial notification is supported for the subscription.

o  Optional rate management information, such as throttling.

o  Watcher information [5], [7] that results from the subscription, in order to enable watched presentities to see who is watching them.

For each presentity that has been subscribed to in the presence server, the presence server has to maintain the following state:

o  A list of the subscriptions for the presentity. Note that, for the size calculation, this is already accounted for by the subscription state above.

o  Privacy information for the presentity.

For each presentity for which there was any publication and whose state is other than the default value, the presence server has to maintain the current value of the presentity.

5.1. State Size Calculations

Let's assume the following sizes:

o  Subscription size - 2K bytes. This includes the watcher information that needs to be created by the presence server for each subscription.

o  Subscribed-to resource - 1K bytes (for privacy information and other management information). The subscriptions themselves are already counted in the previous bullet.

o  Resource with a state - 6K bytes. This is a moderate assumption if we take into account the amount of data that is put in a presence document, such as multiple devices, calendar and geographical information.

5.1.1.
Tiny System

o  10K subscriptions = 19M bytes.

o  5K subscribed-to presentities = 5M bytes.

o  10K presentities with state = 58M bytes.

Total is 82M bytes.

5.1.2. Medium System

o  100K subscriptions = 195M bytes.

o  50K subscribed-to presentities = 49M bytes.

o  100K presentities with state = 586M bytes.

Total is 830M bytes.

5.1.3. Large System

o  6M subscriptions = 11,718M bytes.

o  3M subscribed-to presentities = 2,929M bytes.

o  4M presentities with state = 23,437M bytes.

Total is 38G bytes.

5.1.4. Very Large System

o  150M subscriptions = 292,969M bytes.

o  75M subscribed-to presentities = 73,242M bytes.

o  100M presentities with state = 585,937M bytes.

Total is 952G bytes, which is a very big number for the very dynamic storage needed by the presence server.

Although the numbers above may seem moderate enough for the sizes that the presence server is handling, we should consider the following:

o  Dynamic state - Although the state may not seem so big by database standards, even for the very large system, we need to remember that this state is very dynamic. Subscriptions come and go all the time, the status of presentities is being updated, and so forth. This means that the presence server has to manage its state in a medium that supports very frequent updates, and for such large sizes this task is not trivial.

o  Interlinked state - The subscriptions and the subscribed-to presentities depend on each other. There needs to be a link from the presentity to the subscriptions and vice versa. See Section 4 about the interlinkage that is created due to resource lists.

o  Moderate assumptions - The size assumptions that were made above are quite moderate. Presence is becoming more of a core middleware functionality that holds a lot of data on the user.
In real life the numbers above may be even higher, and the presence server can have additional overhead such as managing the SIP sessions, networking and more.

Although the calculations above do not show a real issue with state management of presence in medium or even big systems, since it should be possible to divide the state between different machines, the state size is still very big. A bigger issue arises when resource lists are involved and create interlinked state between many servers. In that case, dividing a very big state between multiple servers becomes less trivial.

6. Processing complexities

The basic presence paradigm consists of a watcher and a presentity that the watcher watches. It sounds simple enough, but there are many additions and extensions that the presence server has to manage that make its processing very complex.

In this section we show that, in addition to the large number of messages and the big state that the presence server has to handle, it also has to handle quite intensive processing for aggregation, partial notify and publish, filtering and privacy. This adds CPU load to the network and memory loads that were described before.

6.1. Aggregation

A presence document may contain multiple resources. These resources can be devices of the presentity or information that is received from external providers of presence information for the presentity, such as geographical and calendar information. The presence server needs to be able to get the updates from all the resources and aggregate them correctly into a single presence document.
Although this is just an "XML processing" task, the number of updates that the presence server may get, the need to keep the presence document aligned with its schema, and the need to notify the users as soon as possible create a significant processing burden on the presence server.

6.2. Partial Publish and Notify

Drafts [13], [14] define a way for the watcher to request only what has changed in the presence document, and for the publisher of presence information to publish only what has changed in the presence document since the last publish. Although these optimizations help in reducing the amount of data that is sent from/to the presence server, they create additional processing burden on the presence server.

When a partial publish arrives at the presence server, the presence server has to process it and change only what is indicated in it, while keeping the presence document well formed according to the schema.

In partial notify the processing is even more complex, since each watcher needs to get the partial update relative to the last update that was received by that watcher. Therefore [13] specifies a versioning mechanism that enables the watcher to get the updates based on the previous state that it has seen. This versioning mechanism has to be maintained by the presence server for each watcher that is subscribed to a presentity and requires partial notify.

6.3. Filtering

Filtering as defined in RFCs [10], [11] enables a watcher to request to be notified only when the presence document fulfills certain conditions. Although this is a very convenient feature for watchers, the burden that is put on the presence server is quite big.
For each change in the presence document, the presence server needs to evaluate the filtering expressions, which can be very complex, and decide whether and what to send to the watchers that have requested filtering.

6.4. Privacy

Draft [15] defines presence authorization rules that can be used by presentities to define who can see what from their presence documents. The processing that the presence server has to do here is very similar to filtering. When there is a change to any presence document that has privacy defined for it, the presence server needs to create different notifications for different watchers according to what is defined in the authorization rules.

7. Possible Optimizations

This section contains techniques which can be employed by the presence server and clients to reduce presence traffic, specifically on inter-domain links. Several techniques are proposed and briefly described here. The quantitative analysis of these techniques is not fully done yet and will be presented in a future version of this document. Protocol mechanisms to employ these techniques are described briefly. This section is intended to help us evaluate and decide whether such techniques should become a part of the SIMPLE protocol suite.

7.1. Common NOTIFY for multiple watchers

When multiple watchers from a domain (for example, domain B) SUBSCRIBE to a presentity in another domain (for example, domain A), a single NOTIFY [2] per presentity in domain A can be sent to domain B's presence server (PS). The presence server in domain B can then distribute the NOTIFY messages to each of the watchers. This eliminates the need to send individual NOTIFY messages from domain A's presence server to each watcher in domain B.
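As a rough illustration of the saving described above, the following Python sketch counts the NOTIFY/200 messages that cross the inter-domain link for one state change per presentity, with and without a common NOTIFY, and shows the local fan-out step in domain B. The function names and the example watcher counts are hypothetical and are not part of this document or of any SIMPLE specification; each NOTIFY is counted together with its 200 response, as in the tables earlier in this document.

```python
def per_watcher_notifies(watchers_per_presentity):
    """Inter-domain NOTIFY/200 messages for one state change of each
    presentity when domain A notifies every domain B watcher directly."""
    return sum(2 * count for count in watchers_per_presentity.values())


def common_notifies(watchers_per_presentity):
    """Inter-domain NOTIFY/200 messages for one state change of each
    presentity when one common NOTIFY is sent per watched presentity."""
    return sum(2 for count in watchers_per_presentity.values() if count > 0)


def fan_out(presentity, body, watchers):
    """Domain B's presence server turns one received NOTIFY into one
    local NOTIFY per watcher (modelled here as plain tuples)."""
    return [(watcher, presentity, body) for watcher in watchers]


if __name__ == "__main__":
    # Hypothetical example: domain B watchers per domain A presentity.
    example = {"userA1@domainA.com": 2, "userA2@domainA.com": 5}
    print(per_watcher_notifies(example))  # 14
    print(common_notifies(example))       # 4
```

The sketch ignores privacy filtering and delivery failures, which are discussed in the following subsections.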
The presence server and resource list server (RLS) are assumed to be co-located, as a result of which NOTIFY messages are sent to the presence server (RLS) in domain B rather than delivered directly to the watchers in domain B. The server distributes the NOTIFY message to a list of watchers based on a single NOTIFY message received from another presence agent.

There are three main issues, namely privacy filtering, failure aggregation and transfer of the watcher list to the watcher domain's presence server so that it can distribute the NOTIFY. We discuss these in the next subsections.

7.1.1. Privacy filtering

Privacy filtering is typically done by the presentity's presence server. We propose that the presentity's privacy filtering task be handled by the watcher domain's presence server, in this case domain B's presence server. There are two possibilities for the privacy filtering rules of the presentity, as described below.

Per domain privacy filters: The presentity in domain A has the same privacy filter rules for all the watchers in domain B. In other words, there is a domain level privacy filter specified by the presentity for users from domain B. Privacy filtering can be done by the presence server in domain A and a single NOTIFY can be sent to the presence server in domain B.

Per watcher privacy filters: The presentity in domain A has different privacy filter rules for different watchers in domain B. In this case the privacy filter has to be applied by the presence server in domain B, and complete presence state information needs to be sent from the presentity's domain to the watcher's domain.

Delegating the task of privacy filtering doesn't compromise any additional privacy information when compared with normal operations. The model is very similar to the e-mail trust model.
Transfer of a single NOTIFY from the presentity's domain to the watcher's domain implies that the presence server in the watcher's domain receives that information and can potentially distribute it to unauthorized watchers. Thus, the presentity implicitly trusts the presence server in its own domain as well as the one in the watcher's domain. The proposed mechanism extends such trust to the presence server in domain B so that it performs the privacy filtering on behalf of the presentity in domain A. One potential issue arises when the presence server in domain A encrypts the presence document for each watcher using S/MIME, in which case the watcher domain's PS cannot perform privacy filtering. Hence, this kind of privacy filtering requires a layer 8 security negotiation between the presence servers of the two domains.

7.1.2. NOTIFY failure aggregation

The success or failure of delivering a NOTIFY message changes the subscription status of the watcher on the presentity's presence server. Hence, to report NOTIFY delivery failures, domain B's presence server aggregates the success and failure responses for each watcher and sends them to the presence server in domain A. Alternatively, application level negative acknowledgements can be used.

7.1.3. Transferring the watcher list

In order to distribute the NOTIFY message received from domain A, the watcher domain's presence server requires the list of watchers in its domain for that presentity. We propose the following ways to achieve this.

o  Watcher list sent in NOTIFY message: The watcher list can be sent from domain A's presence server to domain B's presence server in each NOTIFY message. The NOTIFY is then distributed to each watcher in the list. This has a disadvantage: when the number of watchers from domain B is very large, every NOTIFY message increases in size proportionately.
An alternative could be sending the complete list initially, then sending changes to the list using the XML-patch operations [16] specified in partial publication, and maintaining the list on the presence server in domain B. Sending the watcher list and distributing it is similar to multi-recipient messages, i.e., [19], and SUBSCRIBE-contained lists or exploders.

o  Watcher list obtained by subscribing to the WINFO [5] package: In this technique, the presence server in the watcher's domain (domain B) obtains the watcher list from domain A's PS. It also receives any changes to the watcher list from domain A's PS by subscribing to the presentity with the presence.winfo event package. Domain A's PS maintains and updates the watcher list as a part of its normal operation. The updates are sent whenever the watcher list changes. They contain information about watchers from domain B only.

o  Watcher list created on subscriber's presence server: The watcher domain's presence server maintains and updates the list of watchers per presentity based on the SUBSCRIBE requests from these watchers. Such a list is like a resource list of watchers per presentity in the watcher's domain, built dynamically from SUBSCRIBE requests, which are not sent directly to the presentity's PS.

7.1.4. Message flow example

Below is the message flow diagram of how the system may work.
Watchers           Domain B            Domain A      Presentity
(userB1, B2)      (PS + RLS)          (PS + RLS)    (userA1, A2)
-----------------------------------------------------------
| SUBSCRIBE (t:userA1  |                   |              |
|--------------------->|                   |              |
|           f:userB1)  |     SUBSCRIBE     |              |
|<-------200OK---------|------------------>|              |
|                      |<-----200OK -------|              |
|                      |                   |              |
|                      |      NOTIFY       |              |
|                      |<------------------|              |
| NOTIFY (f:userA1     |------200OK------->|              |
|<---------------------|                   |              |
|         t:userB1)    |                   |              |
|---------200 OK------>|  XCAP Filter B1   |              |
|                      |<-----------------<|              |
|                      |                   |              |
| SUBSCRIBE (t:userA1  |                   |              |
|--------------------->|     SUBSCRIBE     |              |
|           f:userB2)  |------------------>|              |
|<-------200OK---------|<-----200OK -------|              |
|                      |                   |              |
|                      |      NOTIFY       |              |
| NOTIFY (f:userA1     |<------------------|              |
|<---------------------|------200OK------->|              |
|         t:userB2)    |                   |              |
|---------200 OK------>|  XCAP Filter B2   |              |
|                      |<-----------------<|   PUBLISH    |
|                      |                   |<-------------|
|                      |                   |------200OK ->|
|                      | NOTIFY (f:userA1  |              |
|                      |<------------------|              |
|                      | t:userB1, userB2) |              |
| NOTIFY (f:userA1     |------200OK------->|              |
|<---------------------|                   |              |
|         t:userB1)    |                   |              |
|---------200 OK------>|                   |              |
| NOTIFY (f:userA1     |                   |              |
|<---------------------|                   |              |
|         t:userB2)    |                   |   PUT XCAP   |
|---------200 OK------>|                   |<-------------|
|                      |  NOTIFY (filter)  |              |
|                      |<------------------|              |
|                      |------200OK------->|              |
|                      |                   |              |
|                      |  XCAP Filter B2   |              |
|                      |<-----------------<|              |
-----------------------------------------------------------

Figure 9: Example message flow for common NOTIFY for watchers in a domain

We can see in the figure above that a single NOTIFY from userA1@domainA.com is sent to watchers {userB1, userB2}@domainB.com.
Also, we can see that a change in a privacy filter rule causes a NOTIFY, which triggers an XCAP-based download of the privacy filtering rules by domain B's PS.

7.1.5. SIP message examples for common NOTIFY

The following NOTIFY message contains the list of watchers and the presence document of the presentity. The RLS/presence server in domain B will distribute it to all the watchers in the list.

NOTIFY sip:rlserver.domainB.com SIP/2.0
Via: SIP/2.0/TCP rlsserver.domainA.com;branch=z9hG4bK4EPlfSFQK1
Max-Forwards: 70
From: ;tag=zpNctbZq
To: ;tag=ie4hbb8t
Call-ID: cdB34qLToC@domainA.com
CSeq: 997935769 NOTIFY
Contact:
Event: presence
Subscription-State: active;expires=7200
Content-Type: multipart/related;type="resource-lists+xml";
    start="<2BEI83@rlsserver.domainA.com>";
    boundary="tuLLl3lDyPZX0GMr2YOo"
Content-Length: 2014

--tuLLl3lDyPZX0GMr2YOo
Content-Transfer-Encoding: binary
Content-ID: <2BEI83@rlsserver.domainA.com>
Content-Type: application/resource-lists+xml; charset="UTF-8"

--tuLLl3lDyPZX0GMr2YOo
Content-Transfer-Encoding: binary
Content-ID: <2BEI83@rlsserver.domainA.example.com>
Content-Type:type="application/pidf+xml;charset="UTF-8" start=""; boundary="TfZxoxgAvLqgj4wRWPDL"

--TfZxoxgAvLqgj4wRWPDL
closed
--TfZxoxgAvLqgj4wRWPDL--

Figure 10: SIP message examples using common notify technique

7.2. Aggregation of NOTIFY messages (Batched notification)

When a watcher from a domain (for example, domain B) SUBSCRIBEs to multiple presentities in another domain (domain A), domain A's presence server can aggregate the notification messages and send them together as a single NOTIFY message to the presence server in domain B. The presence server in domain B can then deliver the message to the watcher, or create individual NOTIFY messages for different watchers and send them to those watchers.
This reduces the number of NOTIFY/200 OK messages on the inter-domain link as well as on the access network. This aggregation of NOTIFY messages can be done on a per watcher or per domain basis. The RLS specification describes aggregation and throttling; however, it leaves the details open to implementers.

One problem with aggregation is that presence status updates for different presentities may not occur simultaneously. Hence, in order to bundle the NOTIFY messages for each watcher or domain, the presence server may have to delay some of the NOTIFY messages. One approach to solve this issue could be that the watcher specifies a tolerable delay for receiving presence state updates of the presentities. The watcher can specify this delay value using the watcher filtering mechanism or a SIP header extension in the SUBSCRIBE message. The presence server in the presentity's domain can then hold the NOTIFY message only for the amount of time specified.

7.2.1. Extracting and sending individual NOTIFY using Aggregated NOTIFY message body

The aggregation of NOTIFY bodies originating from different presentities into a single NOTIFY body is based on MIME multipart. Bundling of notifications implies aggregating multiple NOTIFY bodies destined to a single watcher (or watcher domain) into a single NOTIFY that is delivered to the watcher domain's presence server. If all the NOTIFY messages are destined to a single watcher, the watcher domain's presence server delivers the message directly. Otherwise, the server extracts the multiple presence bodies (PIDF) from the received NOTIFY message. Each presence document (PIDF [6]) contains an entity attribute which uniquely identifies the presentity; hence, there is no dependency on SIP headers to construct individual NOTIFY messages for delivering them to watchers. Delivering bundled NOTIFY messages reduces the traffic on the access network as well.

7.2.2.
Subscription termination and failure indication in NOTIFY delivery

The Subscription-State header in the NOTIFY message is used to indicate subscription termination to a watcher. A bundled notification doesn't indicate subscription termination; hence, terminating NOTIFY messages cannot be sent using this mechanism. Additionally, the notifier needs to know whether the NOTIFY was delivered successfully or not, since the subscription can be terminated if the NOTIFY is not delivered successfully. The presence server in domain B should aggregate the success or failure of NOTIFY messages and send it to the PS in domain A.

The advantage is observed when a single watcher subscribes to multiple presentities from another domain. The delay tolerance interval specified by the watcher should be long enough so that multiple NOTIFY messages can be bundled or aggregated. The reduction in traffic can be seen under two scenarios: (i) When a watcher logs in and subscribes to all the presentities, the NOTIFY messages from multiple presentities can be bundled and delivered as a single message to the watcher. (ii) In steady state, the gain can be calculated based on the delay tolerance interval, the number of presentities to which a watcher is subscribed, and the probability of these presentities changing state in that interval. As the number of presentities increases, the probability that several presentities update their presence state within one delay tolerance interval increases, and hence the inter-domain traffic reduction (gain) increases.

7.2.3. Message flow example

The message flow diagram in the figure below assumes watchers in domain B (userB1, userB2) and presentities in domain A (userA1, userA2). We can see that when userA1 and userA2 send PUBLISH, a single NOTIFY is sent from domain A to domain B, which is converted to individual NOTIFY messages by the presence server in domain B.
Watchers           Domain B            Domain A      Presentity
(userB1,B2)       (PS + RLS)          (PS + RLS)    (userA1,A2)
-----------------------------------------------------------
| SUBSCRIBE (t:userA1  |                   |              |
|--------------------->|                   |              |
|           f:userB1)  |                   |              |
|<-------200OK---------|     SUBSCRIBE     |              |
|                      |------------------>|              |
|                      |<-----200OK -------|              |
|                      |                   |              |
|                      |      NOTIFY       |              |
|                      |<------------------|              |
| NOTIFY (f:userA1     |------200OK------->|              |
|<---------------------|                   |              |
|         t:userB1)    |                   |              |
|---------200 OK------>|                   |              |
|                      |                   |              |
|                      |                   |              |
| SUBSCRIBE (t:userA2  |                   |              |
|--------------------->|     SUBSCRIBE     |              |
|           f:userB1)  |------------------>|              |
|<-------200OK---------|<-----200OK -------|              |
|                      |                   |              |
|                      |      NOTIFY       |              |
| NOTIFY (f:userA2     |<------------------|              |
|<---------------------|------200OK------->|              |
|         t:userB1)    |                   |              |
|---------200 OK------>|                   |   PUBLISH    |
|                      |             userA1|<-------------|
|                      |                   |------200OK ->|
|                      |                   |              |
|                      |                   |   PUBLISH    |
|                      |             userA2|<-------------|
|                      |                   |------200OK ->|
|                      |                   |              |
|                      | NOTIFY (multipart)|              |
|                      |<------------------|              |
| NOTIFY (f:userA1     |  (userA1,userA2)  |              |
|<---------------------|------200OK------->|              |
|         t:userB1)    |                   |              |
|---------200 OK------>|                   |              |
|                      |                   |              |
|                      |                   |              |
| NOTIFY (f:userA2     |                   |              |
|<---------------------|                   |              |
|         t:userB1)    |                   |              |
|---------200 OK------>|                   |              |
-----------------------------------------------------------

Figure 11: Message flow for aggregation or batched notification

7.2.4. SIP message flow example for batched notification

The following NOTIFY message contains presence documents of multiple presentities. In the example, all the presence documents are destined to a single watcher.
NOTIFY sip:rlserver.domainB.com SIP/2.0
Via: SIP/2.0/TCP rlsserver.domainA.example.com;branch=z9hG4bK4EPlfSFQK1
Max-Forwards: 70
From: ;tag=zpNctbZq
To: ;tag=ie4hbb8t
Call-ID: cdB34qLToC@domainA.com
CSeq: 997935769 NOTIFY
Contact:
Event: presence
Subscription-State: active;expires=7200
Content-Type: multipart/related;type="rlmi+xml";
    start="<2BEI83@rlsserver.domainB.example.com>";
    boundary="tuLLl3lDyPZX0GMr2YOo"
Content-Length: 2862

--tuLLl3lDyPZX0GMr2YOo
Content-Transfer-Encoding: binary
Content-ID: <2BEI83@rlsserver.domainB.example.com>
Content-Type: application/pidf+xml;charset="UTF-8"

open sip:joe@stockholm.example.org

--tuLLl3lDyPZX0GMr2YOo
Content-Transfer-Encoding: binary
Content-ID:
Content-Type: application/pidf+xml;charset="UTF-8"

closed

--tuLLl3lDyPZX0GMr2YOo--

Figure 12: Message Flow for Aggregation or Batched Notification

7.3. Timed presence

Watchers may be interested in general, coarse-grained availability information of certain presentities rather than getting a notification for every status change of the presentity. For example, a manager may be interested in knowing whether the employees reporting to him are available or on vacation (calendar/timed-presence) rather than getting a notification about every status change. This can be achieved using timed-presence [8]. An example of timed-presence status is below:

open
closed
sip:Vishal@cs.columbia.edu
I'll be in San Diego IETF meeting

Figure 13: Time-presence status example

Thus, timed-presence can be used to automatically switch the subscription on or off, which can lower the presence notification traffic. However, with the current watcher filtering specification it is not straightforward to automatically enable or disable notifications based on calendar information from timed-presence.
Watchers cannot specify a watcher filter indicating not to send a NOTIFY based on timed-status, as it would require them to know the 'from'/'until' attributes beforehand. The watcher filtering specification does not allow watchers to specify filter rules that disable notifications based on comparison of timestamps. A watcher application, upon obtaining the <timed-status>, can specify a watcher filter using the 'from' and 'until' attributes in the received <timed-status>, instructing the server not to send a NOTIFY unless the <timed-status> or its 'from' or 'until' attribute changes. A watcher should not blindly unsubscribe for the time specified in the <timed-status>, because the presentity may update the timed-status and the watcher may not be aware of this. Hence, the watcher must specify a watcher filter which triggers a notification upon changes in elements of <timed-status>, after it has received the first <timed-status>. Once the interval for the received <timed-status> is over, the watcher application removes the filter and starts receiving notifications in the normal manner. However, differential notification can be used to learn about changes in the timed-presence. From the above discussion, it is clear that the watcher filtering specification requires enhancements for timestamp based watcher filters.

7.4. On-Demand presence (Fetch or Pull Model)

Watchers need not be notified about every presence update of all their contacts at all times. Watchers may be interested in regularly receiving presence updates for some of their contacts. But for other contacts, watchers may only want to know their presence information when they want to start a communication session. This can be labeled as on-demand presence and can be accomplished by using a fetch-based SUBSCRIBE with the expiration interval set to zero.
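To make the fetch model concrete, the following Python sketch builds the text of a one-shot presence SUBSCRIBE whose Expires header is zero. It is only an illustration: the function name, host names, tags and the choice of TCP are hypothetical, and a real client would use a SIP stack rather than assembling the request by hand.

```python
import uuid


def build_fetch_subscribe(watcher, presentity, contact_host):
    """Return the text of a one-shot (fetch) presence SUBSCRIBE.

    'Expires: 0' asks the notifier for the current state only, so no
    long-lived subscription has to be refreshed afterwards.
    All identifiers generated here are illustrative placeholders.
    """
    branch = "z9hG4bK" + uuid.uuid4().hex[:10]
    lines = [
        f"SUBSCRIBE sip:{presentity} SIP/2.0",
        f"Via: SIP/2.0/TCP {contact_host};branch={branch}",
        "Max-Forwards: 70",
        f"From: <sip:{watcher}>;tag={uuid.uuid4().hex[:8]}",
        f"To: <sip:{presentity}>",
        f"Call-ID: {uuid.uuid4().hex}@{contact_host}",
        "CSeq: 1 SUBSCRIBE",
        f"Contact: <sip:{contact_host}>",
        "Event: presence",
        "Accept: application/pidf+xml",
        "Expires: 0",
        "Content-Length: 0",
    ]
    # SIP header lines end with CRLF; an empty line closes the headers.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    print(build_fetch_subscribe("userB1@domainB.com",
                                "userA1@domainA.com",
                                "pc1.domainB.com"))
```

In response to such a request the notifier is expected to send a single NOTIFY carrying the current presence document, with the subscription already marked as terminated.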
This approach requires a mechanism in the watcher application that lets watchers indicate that they are not interested in regular presence updates and only require presence information when starting a new session. Examples include services where the presence status does not have to be visible to a watcher all of the time. For example, a watcher associated with a cell phone may need presence updates only while the cell-phone application (e.g., the phone book) runs in the foreground on the device. Another example is presence-based call routing in telephony, where - before the call is delivered - a watcher issues a fetch-based SUBSCRIBE to learn whether and where the callee is available.

7.5. Adapting the subscription rate

The rate of notification can be adjusted based on statistical information about past multimedia sessions with the user's contacts. This can be initiated by the client or done automatically by the server, since the server can obtain such information from stored call and text session records. Anecdotally, 60-70% of calls and IM messages go to 20% of the contacts [Reference required; observation based on call detail records of friends]. Nearly 50% of the buddies are called rarely. These may include buddies from an old office, old college or old city who are present in the buddy list but are not contacted actively. Based on such information, the presence server or the client can adapt the subscription rate and use the fetch model for such buddies.

7.6. Other Optimizations

This section lists and discusses several other optimizations that are either already part of the SIMPLE protocol, being worked on, or suggested in various drafts.

o Subnot-etags - Draft [20].
This draft suggests ways to suppress the sending of unnecessary NOTIFY requests, for example when a subscription is refreshed. This appears to be an efficient optimization since it saves both on the number of messages sent and on the processing time of the presence server.

o Resource List Service - [12] enables creating a single subscription session between the watcher and the presence server for subscribing to a list of users. This reduces the number of sessions created between watchers and presence servers. On the other hand, this mechanism makes it possible for relatively few clients to create a very large number of subscriptions between presence servers and RLSs, especially if large public groups are used. To really optimize in this area, it seems that the use of large public groups should not be considered a BCP, and there should be a way for an RLS to create a single subscription for multiple occurrences of the same resource in resource lists. See consolidated subscriptions below.

o Partial notify/publish - Drafts [13], [14] define a way for the subscriber to request only what has changed in the presence document, and for the publisher of presence information to publish only what has changed in the presence document since the last publish. Although these optimizations help reduce the amount of data sent to and from the presence server, they create an additional processing burden on the presence server, as discussed above.

o Filtering - as defined in RFCs [10], [11], filtering enables a watcher to request notification only when the presence document fulfills certain conditions. Although this optimization reduces the number of messages sent from the presence server to the watcher, it puts more burden on the processing time of the presence server, as discussed above.
o Throttling [http://tools.ietf.org/html/draft-niemi-sipping-event-throttle-04 - expired at the time of the writing of this document] defines a mechanism in which a watcher requests to be updated only at certain intervals. Although this mechanism may add some load on the processing time of the presence server, that load is negligible, and the reduction in the number of messages sent from the presence server to the watchers is significant. This optimization is even more important with resource lists: a list can contain many resources, and if the update traffic on the list is not regulated, the watcher may receive a very large number of notifications.

o Presence-specific SigComp dictionary - [17] defines a SigComp [3] dictionary for presence. This optimization reduces the number of bytes transferred in presence systems by compressing the textual SIP messages; with the specialized presence dictionary, the compression may be more significant than with SigComp alone. Note that the number of actual messages remains the same; a calculation of the number of bytes saved may be useful here.

o Content Indirection [9] enables sending only the URI of the presence document to the watcher, thus relieving the presence server from sending the presence document itself. This optimization may be useful in some cases, but in practice it has several drawbacks:

   1. Due to partial notification, privacy, filtering and other functionalities, it will be relatively rare for many watchers to get exactly the same presence document.

   2. There should be a mechanism for removing the content from the content server at the appropriate time. Defining the appropriate time is far from trivial, since the removal must be synchronized with all the watchers that need to get the content.
o Resubscription to resource list - [12] requires that full state be sent on subscribe refreshes. For large resource lists, the amount of data that needs to be sent for each subscribe refresh may be very large. An optimization that allows sending only partial information on subscribe refreshes would make RLS subscriptions more efficient.

o No Resubscriptions - Because SIP is network agnostic and always assumes the worst about the network layer, resubscriptions are part of the SIP subscribe/notify model [2]. In many cases it should be possible to negotiate a special connection between watchers and presence servers; this type of connection would use a different mechanism, e.g. keep-alives, and would not necessitate resubscriptions. This would matter most between presence domains and between RLSs and presence servers, and may save many messages.

8. Extremely Optimized Model

The following calculations assume that the following optimizations are deployed:

o No resubscriptions are necessary.

o Consolidated subscriptions are possible.

The following table shows the number of messages required in this model, using the numbers of the very large network model. We assume that even though there are 10M watchers from one domain to the other, the number of actually watched resources is only 3M.
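The totals for this model follow from simple arithmetic on the stated inputs. The sketch below is one reading of how the figures combine, not the authors' own calculation: it assumes that per-resource message counts are multiplied by the 3M watched resources, that the result is doubled to cover both directions of the peering, and that the rate is averaged over the 8-hour subscription lifetime. The function and parameter names are illustrative.

```python
def interdomain_messages(resources: int, lifetime_hours: int,
                         changes_per_hour: int, presentities_per_watcher: int,
                         refreshes_per_hour: int) -> tuple[int, float]:
    """Return (total messages between two domains, messages per second)."""
    # Initial SUBSCRIBE/200 plus initial NOTIFY/200: 4 messages per resource.
    initial = (2 + 2) * resources
    # Each state change costs a NOTIFY and its 200 OK.
    notify_per_resource = (lifetime_hours * changes_per_hour
                           * presentities_per_watcher * 2)
    # Each refresh costs a SUBSCRIBE and its 200 OK; with conditional
    # subscriptions no NOTIFY is assumed to follow a refresh.
    refresh_per_resource = lifetime_hours * refreshes_per_hour * 2
    steady = (notify_per_resource + refresh_per_resource) * resources
    # Terminating SUBSCRIBE/200 plus final NOTIFY/200.
    sign_out = (2 + 2) * resources
    total = 2 * (initial + steady + sign_out)  # both directions (assumption)
    return total, total / (lifetime_hours * 3600)


no_refresh = interdomain_messages(3_000_000, 8, 6, 10, 0)
hourly_refresh = interdomain_messages(3_000_000, 8, 6, 10, 1)
```

With these inputs the totals come to 5,808,000,000 messages without refreshes and 5,904,000,000 with one refresh per hour, a difference of 96,000,000 messages over the subscription lifetime.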
   (A01) Subscription lifetime (hours)..............................8
   (A02) Presence state changes / hour..............................6
   (A03) Subscription refresh interval / hour.......................0
   (A04) Total federated presentities per watcher..................10
   (A05) Number of dialogs to maintain per watcher..................1
   (A06) Number of watchers in a federated presence domain.10,000,000
   (A06-1) Number of resources watched......................3,000,000
   (A07) Initial SUBSCRIBE/200 per watcher..........................2
   (A08) Initial NOTIFY/200 per watcher.............................2
   (A09) Total initial messages............................12,000,000
   (A10) NOTIFY/200 per watched presentity........................960
   (A11) SUBSCRIBE/200 refreshes....................................0
   (A12) NOTIFY/200 due to subscribe refresh........................0
   (A13) Number of steady state messages................2,880,000,000
   (A14) SUBSCRIBE termination......................................2
   (A15) NOTIFY terminated..........................................2
   (A16) Number of sign-out messages.......................12,000,000
   (A17) Total messages between domains.................5,808,000,000
   (A18) Total number of messages / second....................201,667
   (A19) Total number of bytes / second on the wire.........402,083KB

   Figure 14: Very large network peering with extreme optimizations

Note that the number of messages drops almost threefold merely by assuming that the 10M watchers subscribe to 3M resources and that consolidated subscriptions are possible. However, since the NOTIFY messages are large, the saving in bandwidth is not as large.

Because the subnot-etags [20] optimization is used, the total removal of resubscriptions does not save many messages, as the following table shows:
   (A01) Subscription lifetime (hours)..............................8
   (A02) Presence state changes / hour..............................6
   (A03) Subscription refresh interval / hour.......................1
   (A04) Total federated presentities per watcher..................10
   (A05) Number of dialogs to maintain per watcher..................1
   (A06) Number of watchers in a federated presence domain.10,000,000
   (A06-1) Number of resources watched......................3,000,000
   (A07) Initial SUBSCRIBE/200 per watcher..........................2
   (A08) Initial NOTIFY/200 per watcher.............................2
   (A09) Total initial messages............................12,000,000
   (A10) NOTIFY/200 per watched presentity........................960
   (A11) SUBSCRIBE/200 refreshes...................................16
   (A12) NOTIFY/200 due to subscribe refresh........................0
   (A13) Number of steady state messages................2,928,000,000
   (A14) SUBSCRIBE termination......................................2
   (A15) NOTIFY terminated..........................................2
   (A16) Number of sign-out messages.......................12,000,000
   (A17) Total messages between domains.................5,904,000,000
   (A18) Total number of messages / second....................205,000
   (A19) Total number of bytes / second on the wire.........402,088KB

   Figure 15: Very large network extreme optimizations+resubscribe

Only about 3,300 additional messages per second are needed if we re-introduce resubscriptions (96,000,000 more messages over the 8-hour subscription lifetime), since the subnot-etags [20] optimization is used.

Note that even other protocols that do not require subscription refreshes will have a hard time bettering the above scalability calculation.

9. Suggested Requirements

In the previous sections we have shown several areas where the deployment of a presence system is far from trivial; these include network load, memory load and CPU load. This section lists an initial set of requirements for possible optimizations in these areas.

Backward compatibility requirements

o The solution should not hinder the ability of existing SIMPLE clients and/or servers to peer with a domain or client implementing the solution. No changes may be required of existing servers in order to interoperate.

o It does NOT constrain any existing RFC functional or security requirements for presence.

o Systems that are not using the new additions to the protocol should operate at the same level as they do today.

Policy, privacy, permissions requirements

o The solution does not limit the ability of presentities to present different views of presence to different watchers.

o The solution does not restrict the ability of a presentity to obtain its list of watchers.

o The solution MUST NOT create any new privacy holes or make any existing ones worse.

Scalability requirements

o It is highly desirable for any presence system (intra- or inter-domain) to scale linearly as the number of watchers and presentities increases linearly.

o The solution SHOULD NOT require significantly more state in order to be implemented.

o It MUST be able to scale to tens of millions of concurrent users in each domain and in each peer domain.

o It MUST support a very high level of watcher/presentity intersections in various intersection models.

o Protocol changes MUST NOT prohibit optimizations in different deployment models, especially
where there is a high level of cross subscriptions between the domains.

o New functionalities and extensions to the presence protocol SHOULD take into account scalability with respect to the number of messages, state size and management, and processing load.

Topology requirement

o The solution SHOULD allow for arbitrary federation topologies, including direct peering and intermediary routing.

10. Conclusions

This document analyses the scalability of presence systems, and of SIP-based ones in particular. It is apparent that the scalability of these systems is far from trivial from several perspectives: number of messages, network bandwidth, state management and CPU load.

Several optimizations are suggested or surveyed in this document. It is important to note that not every optimization is really an optimization; some may seem to optimize in one place while actually creating load in other parts of the system.

It is quite possible that the issues described in this document are inherent to presence systems in general and not specific to the SIMPLE protocol. Organizations need to be prepared to invest heavily in network and hardware in order to build really large systems. However, it is apparent that not all possible optimizations have been made yet, and further work is needed in the IETF in order to provide better scalability.

It seems that we need to think about the problem in a different way. We need to think about scalability as part of the protocol design. The IETF tends not to think about actual deployments when designing a protocol, but in this case it seems that if we do not think about scalability during protocol design, it will be very hard to scale.

We should also consider whether using the same protocol between clients and servers and between servers is a good choice for this problem.
It may be that between domains, or even between servers in the same domain (as between RLSs and presence servers), there is a need for a different protocol that is highly optimized for the load and can make some assumptions about the network (e.g. use only TCP and not an unreliable protocol such as UDP).

Another issue, which concerns protocol design more directly, is whether NOTIFY messages should be treated as media, in the way that audio, video and even text messaging are. The SUBSCRIBE could be extended to perform a three-way handshake similar to INVITE and negotiate where the NOTIFY messages should go, their rate and other parameters. This way the load could be offloaded to specialized NOTIFY "relays", thus not loading the control path of SIP.

11. Security Considerations

This document discusses scalability issues with the existing SIP/SIMPLE presence protocol and model. Therefore, there are no security considerations for this document itself. However, many of the possible optimizations discussed in theory in this document will most probably have security implications that will need to be addressed.

12. Acknowledgments

We would like to thank Jonathan Rosenberg (Cisco), Markus Isomaki (Nokia), Piotr Boni (Verizon), David Viamonte (Genaker) and Aki Niemi (Nokia) for their ideas and input.

13. References

13.1. Normative References

[1] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

13.2. Informational References

[2] Roach, A., "Session Initiation Protocol (SIP)-Specific Event Notification", RFC 3265, June 2002.

[3] Price, R., Bormann, C., Christoffersson, J., Hannu, H., Liu, Z., and J.
Rosenberg, "Signaling Compression (SigComp)", RFC 3320, January 2003.

[4] Rosenberg, J., "A Presence Event Package for the Session Initiation Protocol (SIP)", RFC 3856, August 2004.

[5] Rosenberg, J., "A Watcher Information Event Template-Package for the Session Initiation Protocol (SIP)", RFC 3857, August 2004.

[6] Sugano, H., Fujimoto, S., Klyne, G., Bateman, A., Carr, W., and J. Peterson, "Presence Information Data Format (PIDF)", RFC 3863, August 2004.

[7] Rosenberg, J., "An Extensible Markup Language (XML) Based Format for Watcher Information", RFC 3858, August 2004.

[8] Schulzrinne, H., "Timed Presence Extensions to the Presence Information Data Format (PIDF) to Indicate Status Information for Past and Future Time Intervals", RFC 4481, July 2006.

[9] Burger, E., "A Mechanism for Content Indirection in Session Initiation Protocol (SIP) Messages", RFC 4483, May 2006.

[10] Khartabil, H., Leppanen, E., Lonnfors, M., and J. Costa-Requena, "Functional Description of Event Notification Filtering", RFC 4660, September 2006.

[11] Khartabil, H., Leppanen, E., Lonnfors, M., and J. Costa-Requena, "An Extensible Markup Language (XML)-Based Format for Event Notification Filtering", RFC 4661, September 2006.

[12] Roach, A., Campbell, B., and J. Rosenberg, "A Session Initiation Protocol (SIP) Event Notification Extension for Resource Lists", RFC 4662, August 2006.

[13] Lonnfors, M., "Session Initiation Protocol (SIP) extension for Partial Notification of Presence Information", draft-ietf-simple-partial-notify-08 (work in progress), July 2006.

[14] Lonnfors, M., "Publication of Partial Presence Information", draft-ietf-simple-partial-publish-06 (work in progress), February 2007.

[15] Rosenberg, J., "Presence Authorization Rules", draft-ietf-simple-presence-rules-08 (work in progress), October 2006.
[16] Urpalainen, J., "An Extensible Markup Language (XML) Patch Operations Framework Utilizing XML Path Language (XPath) Selectors", draft-ietf-simple-xml-patch-ops-02 (work in progress), March 2006.

[17] Garcia-Martin, M., "The Presence-specific Dictionary for the Signaling Compression (Sigcomp) Framework", draft-garcia-simple-presence-dictionary-01 (work in progress), December 2006.

[18] Camarillo, G., "Subscriptions to Request-Contained Resource Lists in the Session Initiation Protocol (SIP)", draft-ietf-sipping-uri-list-subscribe-05 (work in progress), May 2006.

[19] Garcia-Martin, M. and G. Camarillo, "Multiple-Recipient MESSAGE Requests in the Session Initiation Protocol (SIP)", draft-ietf-sip-uri-list-message-01 (work in progress), January 2007.

[20] Niemi, A., "An Extension to Session Initiation Protocol (SIP) Events for Issuing Conditional Subscriptions", draft-niemi-sip-subnot-etags-02 (work in progress), October 2006.

Authors' Addresses

   Avshalom Houri
   IBM
   Science Park Building 18/D
   Rehovot, Israel

   Email: avshalom@il.ibm.com

   Tim Rang
   Microsoft Corporation
   One Microsoft Way
   Redmond, WA 98052
   USA

   Email: timrang@microsoft.com

   Edwin Aoki
   AOL LLC
   360 W. Caribbean Drive
   Sunnyvale, CA 94089
   USA

   Email: aoki@aol.net

   Vishal Singh
   Columbia University
   Department of Computer Science
   450 Computer Science Building
   New York, NY 10027
   US

   Email: vs2140@cs.columbia.edu
   URI: http://www.cs.columbia.edu/~vs2140

   Henning Schulzrinne
   Columbia University
   Department of Computer Science
   450 Computer Science Building
   New York, NY 10027
   US

   Phone: +1 212 939 7004
   Email: hgs+ecrit@cs.columbia.edu
   URI: http://www.cs.columbia.edu/~hgs
Full Copyright Statement

Copyright (C) The IETF Trust (2007).

This document is subject to the rights, licenses and restrictions contained in BCP 78, and except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any Intellectual Property Rights or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; nor does it represent that it has made any independent effort to identify any such rights. Information on the procedures with respect to rights in RFC documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementers or users of this specification can be obtained from the IETF on-line IPR repository at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights that may cover technology that may be required to implement this standard. Please address the information to the IETF at ietf-ipr@ietf.org.
Acknowledgment

Funding for the RFC Editor function is provided by the IETF Administrative Support Activity (IASA).