Network Working Group Sonu Aggarwal, Microsoft Corp. INTERNET DRAFT Colin Benson, Net Effect John Stracke, eCal Corp. Christophe Vermeulen, Alcatel Expires September, 2000 9 March, 2000 Transport Protocol for Presence Information/ Instant Messaging 1 Status of this Document This document is an Internet-Draft and is in full conformance with all provisions of Section 10 of RFC2026. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as ``work in progress.'' The list of current Internet-Drafts can be accessed at The list of Internet-Draft Shadow Directories can be accessed at This document and related documents are discussed on the impp mailing list. To join the list, send mail to impp-request@iastate.edu. To contribute to the discussion, send mail to impp@iastate.edu. The archives are at http://www.imppwg.org/ml_archives.html. The IMPP working group charter, including the current list of group documents, can be found at http://www.ietf.org/html.charters/impp-charter.html. STRACKE: This version (-01) of this document is being pulled together at the last minute (two days before the Adelaide deadline) by John Stracke, instead of Colin Benson, the usual editor. Colin is unreachable this week (for a good reason), and the PITP/MITP team realized on Monday that we needed to release an update. John is taking a two-month-old interim version from Colin and modifying it to reflect current thinking; this is obviously more error-prone than Colin's usual practice of making small changes as we go along, but it'll have to do. 
Paragraphs which may not accurately reflect consensus have been marked with "STRACKE:". 2 Abstract Presence and Instant Messaging have recently emerged as a new medium of communications over the Internet. Presence is a means for finding, retrieving, and subscribing to changes in the presence information (e.g. "online" or "offline") of other users. Instant Messaging is a means for sending small, simple messages that are delivered immediately to online users. A goal of the Instant Messaging and Presence Protocol (IMPP) Working Group is to produce an Internet Standard for Presence and Instant Messaging. Aggarwal, et al. [Page 1] INTERNET-DRAFT PITP/MITP 9 March, 2000 The document draft-ietf-impp-reqts-04.txt [IMPP-REQTS] specifies a detailed set of requirements that such a protocol must meet. The document series draft-ietf-impp-pitp-mitp-xx is a prospective deliverable of the Working Group, with the eventual goal of specifying a transport protocol for Presence Information and Instant Messaging that, in conjunction with the other deliverables of the Working Group, meet the requirements in [IMPP-REQTS]. This version of the document specifies the scope of the transport protocol document(s), discusses the design options for the key issues, and then characterizes the state of discussion on the IMPP mailing list for these key issues. Later revisions of the document will specify protocol details, as the issues are gradually resolved in the Working Group. 
3 Scope of PITP/MITP

The Presence and Instant Messaging Transport document(s) will eventually specify protocol operations that include, but are not necessarily limited to, the following operations (see [IMPP-MODEL] for definitions of the capitalized terms):

o Publishing PRESENCE INFORMATION
o Establishing SUBSCRIPTIONS
o Retrieving PRESENCE INFORMATION
o Retrieving WATCHER INFORMATION
o Sending NOTIFICATIONS of changes to PRESENCE INFORMATION
o Sending INSTANT MESSAGES

In order to specify protocol details for the above operations, the document(s) need to first resolve key issues such as the following:

o Domain architecture
o Base and higher-level transport layers (e.g. TCP/UDP/SMTP/HTTP/etc.)
o Addressing and name resolution
o Connection model

Once the above issues are resolved, the transport protocol specification can include the following aspects:

o Command model
o Command encoding formats
o Command set
o Error handling
o Protocol syntax by command

Further details about the terminology above are given later in the document, where each aspect is independently described.

Since the Working Group is expected to be chartered to specify transport protocols for Presence and Instant Messaging in separate documents, future versions of this document may be produced as two separate documents, depending on the degree of commonality in the transport protocols. Regardless of whether eventually specified in one document or two, the transport protocol(s) must support deploying Presence and Instant Messaging as separate services, as mandated in [IMPP-REQTS].

This document series complements the other proposed deliverables of the Working Group. The [IMPP-SECURITY] deliverable will specify all security-related aspects of IMPP.
Since security considerations are of great import even in the initial stages of designing a transport protocol, technical discussions in this document commence with a summary discussion of security considerations and guiding principles. It is expected that this discussion is consistent with the current intent of the authors of the [IMPP-SECURITY] deliverable. The [IMPP-PIDF] deliverable will specify a data format for PRESENCE INFORMATION. The [IMPP-MIDF] deliverable will specify a data format for INSTANT MESSAGES. The [IMPP-SERVICE] deliverable will specify operational aspects of IMPP deployments.

There have been notable earlier efforts in the IMPP transport area, including the IMPP Interim Meeting in San Francisco, as well as the [IMPP-BASIS] and [VERMEULEN-TRANSPORT] documents. This document is intended to build upon those efforts.

4 Security Considerations

There is a separate team [IMPP-SECURITY] chartered to investigate the security aspects of the IMPP protocol. It is necessary, however, to keep some security-related aspects of [IMPP-REQTS] in mind when designing PITP and MITP.

There are many access control requirements. PITP must allow a PRINCIPAL to control which WATCHERS are allowed by a PRESENCE SERVICE to access the PRINCIPAL'S PRESENCE INFORMATION. At a minimum, the protocol must allow the PRINCIPAL to configure the PRESENCE SERVICE to make such decisions autonomously ([IMPP-REQTS] Section 5.3.7). It might be desirable to allow the PRINCIPAL to exercise this kind of control manually when possible, but [IMPP-REQTS] does not appear to require it. The access control portion of PITP must also allow a PRINCIPAL to specify that only certain portions of its PRESENCE INFORMATION be made available to a given WATCHER. To further complicate things, a PRINCIPAL must be able to grant limited control of a PRESENTITY to other PRINCIPALS.

One mechanism for achieving this kind of control is an access control list (often called an ACL).
At the time of writing there is much controversy on the WG mailing list surrounding the proper place for such an ACL.

A similar set of access control requirements applies to INSTANT INBOXES. The controlling PRINCIPAL of an INSTANT INBOX must be able to decide which other PRINCIPALS can send messages to that INSTANT INBOX.

If the group reaches a consensus suggesting that ACLs or some other similar mechanism are part of the PITP and MITP, then it may be possible to adopt a similar scheme for authorizing access to both PRESENCE INFORMATION and INSTANT INBOXES.

Other than access control, the requirements also generally stipulate that the protocol must have sound authentication and encryption capabilities. Design of the transport protocol must take into account the need for such operations across the Internet, across different administrative entities (e.g. domains).

5 Key Issues

We now describe some key transport issues that the PITP/MITP effort must address, and that largely influence further specification of protocol details. For each issue, we describe the different design options and considerations for each option, as well as the current state of discussion on the list and in prior submissions.

5.1 Base Transport

The PITP/MITP base transport is concerned with the business of getting presence and instant messaging data between clients and servers. It doesn't address the format of that data or its semantics.

As with every other interesting issue in IMPP, there is a lot of overlap between the issues surrounding the base transport and many other areas including but not limited to security, higher level transport issues and the connection model.
A (probably partial) list of the issues most closely linked with the base transport protocol follows:

o Reliability
o Efficiency
o Security
o Multiplexing of IM and PP
o Request/response matching

Discussion of these issues has generated four main options: TCP, UDP, Hybrid TCP/UDP and SIP.

STRACKE: The working group has come to consensus (except for a minority that thinks we should accommodate WAP) that the base transport for client-to-server connections must be TCP-based. For server-to-server connections, there has never been much interest in UDP; most of the arguments in its favor have been client-centric (e.g., it would permit extremely lightweight devices, and conserve bandwidth on the narrowband last hop). Thus, it would appear that we have consensus that all connections will be TCP-based. Accordingly, the following subsections are included for background only; they will be deleted in future versions of this document.

5.1.1 TCP

TCP's primary advantages are well-known: it's a reliability layer that's already in the OSes, that's been debugged over the past 20 years, and that plays nicely with firewalls. However, there remains a camp that believes TCP is too heavyweight for our needs, and that something better could be designed today. There is some reason to believe that a custom reliability protocol could serve us better, because it could be packet-oriented instead of byte-oriented; the problem is that, since it wouldn't be as widely used as TCP, it wouldn't get the attention that TCP gets, and so it wouldn't get refined as much.

It is worth noting that there are two sub-options within TCP. First, a long-lived connection could be used. This causes difficulties with existing wireless networks and so may not be viable. Another option is to use TCP for single transactions or short-lived, closely related groups of transactions (as in HTTP).
5.1.2 UDP

UDP's primary advantage is that it is lightweight: it is much easier to implement than TCP, and, since it doesn't require connection setup, it gives less latency and consumes less bandwidth. In addition, it may be easier to support on a WAP-type device (WAP has been declared out of scope, but it's still on some people's It Would Be Nice If lists).

However, UDP does not provide reliability or congestion control. (The latter is most serious; TCP's congestion control is one of the things that keeps the Internet from collapsing.) Any pure UDP protocol would have to implement these features; by the time you're done, you've done almost as much work as a TCP implementation.

5.1.3 Hybrid TCP/UDP - Dynamic Switching

A hybrid approach would be to send each message via UDP, and then, if the receiver does not acknowledge the message, retry via TCP. This approach was suggested by the UDP camp as an answer to the difficulties of pure UDP. The counter-arguments are (a) it does not free the endpoint from having to support a TCP stack; (b) it requires extra machinery to prevent duplicate messages (when the message gets through, but the acknowledgement doesn't); (c) it means you're still sending one non-flow-controlled UDP packet per message.

5.1.4 Hybrid TCP/UDP - Negotiation Based

Another type of hybrid approach that might be taken is to allow clients to negotiate for their preferred means of communication. Clients who don't want to use TCP don't have to. However, doing so requires that we build congestion control etc. into the UDP protocol or get agreement from the WG chairs/IESG to allow the protocol to violate the requirements in some cases.

5.1.5 SIP

The Session Initiation Protocol [SIP] is an IETF standard that has been designed to allow easy setup, modification and termination of "sessions". There are many types of sessions including buddy lists and multimedia conferences.
SIP has a lot of interesting characteristics.

o It is modeled on HTTP, but fits within requirements set by the IESG.
o It is already stable and has support from major players.
o It allows usage of UDP and TCP, including a compact form to reduce message length.
o It allows for any MIME-type of "body".
o It allows multicast and unicast communication inside the group.
o It supports multiple "media types" (mail, phone, streams, ...).

Additionally, it has already looked at (solved?) most of the problems faced by IMPP, including SRV DNS records, UDP and TCP usage, Content-encoding, digital signatures (with canonicalisation, though it seems that only PGP is covered in the current standard), and a simple trust model for responses.

5.2 Higher-level Transport

5.2.1 Goals

A higher level transport operates over the base transport. It provides semantics linked to multi-operation sequences, security and addressing. It should be noted that there is some overlap between the higher level transport and the base transport in terms of goals, difficult problems and potential solutions. The higher level transport has several goals.

5.2.1.1 Support small clients

It appears to be universally accepted in the working group that we want a protocol that will function acceptably in low-bandwidth, slow-CPU environments (e.g., PDAs with wireless connectivity).

5.2.1.2 Avoid long-lived connections

STRACKE: The working group has agreed that long-lived connections are necessary. Since there is no reliable way for a server to connect to a client (see "Avoid server-to-client connections", below), all messages from server to client must flow over a client-initiated connection. Since polling imposes serious costs (forcing one to choose between high latency and high bandwidth consumption), it is necessary for the client to maintain a long-lived connection to the server. Accordingly, this section is included only for background purposes, and will be removed in a future version of this document.
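The polling trade-off mentioned in the STRACKE note can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name, the 200-byte poll size, and the assumption that messages arrive at uniformly random times between polls are not taken from the working group discussion.

```python
def polling_cost(interval_s: float, poll_bytes: int) -> tuple[float, float]:
    """Return (mean added latency in seconds, idle bandwidth in bytes/second).

    Assumes a client that polls every `interval_s` seconds and spends
    `poll_bytes` bytes per poll (request plus empty response), and that
    messages arrive at uniformly random times between polls.
    """
    mean_latency = interval_s / 2.0        # a message waits half an interval on average
    idle_bandwidth = poll_bytes / interval_s
    return mean_latency, idle_bandwidth


if __name__ == "__main__":
    # Hypothetical 200-byte poll exchange at several polling intervals.
    for interval in (1, 10, 60):
        latency, bandwidth = polling_cost(interval, 200)
        print(f"interval={interval:>2}s  latency={latency:.1f}s  idle={bandwidth:.1f} B/s")
```

Under these assumptions, shortening the interval reduces latency only by raising idle traffic in the same proportion, which is the choice the note describes; a long-lived connection avoids both costs while the connection is idle.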
Some of the working group participants with experience in the wireless space point out that many current wireless services charge by the minute, even when no data is flowing, and so users should not have to keep a connection open when they aren't using it; if the server needs to notify the client, it should open a new connection.

5.2.1.3 Avoid server-to-client connections

Nearly all business users, and many home users, are behind firewalls and/or NATs, which means that it is difficult or impossible for them to accept incoming connections. Furthermore, many users do not have static IP addresses, meaning that a server might attempt to connect to a client at its last known address and deliver a message to the wrong user.

Obviously, this goal conflicts with "avoid long-lived connections" (above). The working group has come to agree that this goal is more important.

5.2.1.4 Minimize latency

It appears to be universally accepted in the working group that we want to minimize the latency in the protocol. This is required by section 7.3.1 of [IMPP-REQTS], which states that we must be able to provide conversational performance.

5.2.1.5 Avoid frequent connection setups

Setting up a connection in a reliable protocol such as TCP is relatively slow; it would be useful to do it as infrequently as possible. If every IM delivered required a connection setup step, it might drive up latency unacceptably. This could mean that we need long-lived connections, which conflicts with "avoid long-lived connections" (5.2.1.2 above).

5.2.1.6 Security

We probably want the security features to be as low in the stack as possible (but no lower). Security is difficult to get right; if we can share it between IM and presence, then we can do one security analysis instead of two, and implementers can write one set of security code instead of two.
However, it's not clear how many of the security requirements are common between messaging and presence information.

5.2.1.7 Rapid development

We would like to get this protocol designed quickly, because the existing IM world badly needs standardization. This does not mean we should be willing to make mistakes, but we need to keep our eyes open for safe shortcuts.

5.2.1.8 Resolving the conflicts

It might be possible to support two modes, one with long-lived connections and one with short-lived connections, set up for each message or for a few messages at a time. This approach has not been extensively discussed on the mailing list; the problem with it is that each server must support both modes, which makes inter-operability more difficult. No consensus exists on how to resolve this problem.

5.2.2 Necessary characteristics

5.2.2.1 Simple

To support small clients, the protocol will have to be simple, in order to reduce memory footprint and CPU requirements.

5.2.2.2 Low Overhead

To support small clients, the protocol will have to have minimal overhead, in order to reduce bandwidth requirements.

5.2.2.3 Minimal Round Trips

To minimize latency, we must minimize round trips. The usual counter-example is SMTP without pipelining, where the client must issue a sequence of commands and wait for the response to each. It will probably be necessary to have round-trip delays during authentication (since the client must wait to find out what authentication schemes the server supports); but, after that, all round-trip delays should be avoided.

5.2.2.4 Unsolicited Messages

To minimize round trips, it must be possible for either end of the connection to send a message (IM or presence notification) at any time (as long as it isn't already sending a message). If end A is already sending a message, it must be possible for end B to start sending a message without waiting for A to finish.
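The Unsolicited Messages characteristic can be illustrated with a small sketch of one full-duplex connection on which each end sends whenever it likes. The 4-byte length prefix, the message strings, and all function names below are assumptions made for illustration; this draft does not yet define any framing or command encoding.

```python
import socket
import struct
import threading


def send_frame(sock: socket.socket, payload: bytes) -> None:
    """Send one message as a 4-byte big-endian length followed by the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, raising if the peer closes the connection early."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed connection")
        buf += chunk
    return buf


def recv_frame(sock: socket.socket) -> bytes:
    """Read one length-prefixed message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def run_demo() -> dict:
    """Exchange messages in both directions over one connected socket pair."""
    a, b = socket.socketpair()          # stands in for one client/server connection
    received = {"a": [], "b": []}

    def reader(name: str, sock: socket.socket, count: int) -> None:
        for _ in range(count):
            received[name].append(recv_frame(sock))

    # Each end reads in its own thread, so a send in one direction never has
    # to wait for a request or a response travelling in the other direction.
    threads = [
        threading.Thread(target=reader, args=("a", a, 2)),
        threading.Thread(target=reader, args=("b", b, 2)),
    ]
    for t in threads:
        t.start()

    send_frame(a, b"IM: hello")             # A sends without being asked
    send_frame(b, b"NOTIFY: foo online")    # B sends without waiting for A
    send_frame(b, b"IM: hi back")
    send_frame(a, b"NOTIFY: bar offline")

    for t in threads:
        t.join()
    a.close()
    b.close()
    return received


if __name__ == "__main__":
    print(run_demo())
```

The two directions of the connection are independent byte streams, so ordering is preserved within each direction but nothing forces a request/response alternation between them.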
5.2.2.5 Protocol Reuse

To get the design completed quickly, it makes sense to reuse existing work whenever possible, provided we can do so without compromising our other goals. The known options are HTTP, LDAP and SIP.

5.2.3 Options

5.2.3.1 HTTP

We could build MITP and PITP over HTTP.

5.2.3.2 LDAP

We could build PITP over LDAP, with the extensions being developed by the ldapext group.

5.2.3.3 Custom

We could design our own protocol(s).

5.2.3.3.1 One protocol

We could design one core protocol to be used by both MITP and PITP. It remains an open question whether MITP and PITP would be layered over the core protocol, or would be extensions to the core protocol.

5.2.3.3.2 Two protocols

We could design two distinct protocols.

5.2.3.3.3 Pros and cons

If the needs for MITP and PITP overlap enough, then one protocol will be easier than two. The uncertainty is whether we will discover significant mismatches in their needs. The list has not yet seen much discussion on this point; the consensus seems to be that we would like to have just one protocol, but we'll have to wait and see whether it's workable.

5.2.3.4 Pros and cons

Using HTTP provides the Simple and Protocol Reuse characteristics, but it has problems. Building any sort of event notification (including IM or presence) over HTTP is fraught with difficulties, because of its limited request/response structure (it does not have the Unsolicited Messages characteristic). There has been some discussion on the list on this point; the proponents of HTTP pointed out that it is possible to emulate unsolicited responses via something like the multipart/x-mixed-replace Content-Type. The counter-argument is that such an approach would do a poor job of maintaining anything like a long-term session; if the client needs to change its session state, it has to close the current HTTP connection and open a new one.
More generally, HTTP comes with a lot of features that don't make sense for notification (e.g., caching), and doesn't provide other features that do (e.g., Unsolicited Messages).

It might be possible to build a presence service over LDAP; this would give us Protocol Reuse. (Note: there has not been much discussion on this option on the list so far, so this paragraph is somewhat speculative.) For a limited service (no SUBSCRIBERs), we wouldn't even have to wait for ldapext; [LDAPV3] provides the dynamic update features we would need. However, it's unclear whether dynamic LDAP can provide the necessary performance; LDAP's original speed benefits over DAP were achieved in part by omitting dynamic update features. In addition, the fact that it is based on ASN.1 may make it too heavyweight for small clients (it doesn't have the Simple and Low Overhead characteristics); and we don't really need its powerful searching capabilities. Also, since LDAP probably wouldn't be usable for IM, even with ldapext, we'd have to design our own IM protocol anyway, so, even though we'd have the Protocol Reuse characteristic, we wouldn't meet the rapid development goal.

Designing our own protocol may be more work, but it seems to be the only way to get the characteristics we need. Recent discussion seems to have assumed that this is the direction we will take.

5.3 Domain architecture

The term 'domain architecture' refers to the distribution of both client and server functions. For example, we could choose to have two separate protocols, one for client to server interaction and another for server to server interaction.

The issues that drive our decision making here include but are not limited to:

o making good use of the performance characteristics of LAN and WAN networks where they are applicable.
o avoiding conflicts with potential security schemes (especially since a potential goal for the security team is the avoidance of imposing new security schemes for sites that already have one that they like).

5.3.1 Pros and Cons

Using the same protocol for both client-server and server-server communications may yield some simplifications, since we'd be able to leverage the overlap between the two cases. On the other hand, if the requirements for the two cases turn out to be substantially different, this may cause more trouble than it saves.

This point has been brought up on the list, but there has been little or no technical discussion so far; the motivation has been to develop a server-server protocol first, so that existing proprietary systems can inter-operate. As a result, we have not yet figured out which way is technically better.

5.3.2 Do we allow nested domains?

Nested domains are domains in which, say, foo@example.com may actually be foo@bar.example.com; the server for example.com manages the mapping somehow, either by forwarding or by redirecting incoming connections. It sounds like a good idea; it makes for greater manageability in large domains, and reduces the cost to remote sites. However, when it was discussed recently, several problems came up. The server for example.com becomes a bottleneck for its subdomains; it becomes necessary for bar.example.com to trust example.com (sounds reasonable, but it is also necessary for example.com to trust com, which is more of a problem); and the redirection approach means that remote servers don't get to maintain fewer connections (which means costs aren't reduced), while the forwarding approach increases latency. It is also worth noting that example.com and bar.example.com might be completely separate organizations, making this kind of hierarchy meaningless.
The proponents of nested domains seem to have accepted that they do not provide benefits in proportion to the complexity they require.

5.3.3 Can clients talk to remote servers?

Do clients talk only to local servers or can they also talk to remote servers? This point has not been discussed much; most discussion has more or less assumed the client->server->server->client model.

It seems likely that a client cannot talk to a remote server unless the client plays the part of a server. This comes about as a result of our goal of preventing forged messages (see section 8.4.3 of [IMPP-REQTS]). The domain model is designed to meet this goal, by giving the domain's IMPP servers the responsibility of authenticating their users. The simplest way to do this is for every IM or NOTIFICATION leaving the domain to pass through the domain's IMPP servers. So a client which wants to connect directly to a remote server must be able to identify itself as a domain server for its domain. (How that identification is done is still undetermined.)

However, there is an alternative: if we wanted to permit a client to connect to a remote server, then we could design a model in which the client connects to the remote server and presents its credentials. The credentials could be signed by the IMPP server for the client's home domain, or the remote server could contact the home server and ask it to authenticate the credentials. This latter approach requires credentials that can be passed around safely (it would obviously be inappropriate with passwords, for example). It is not clear whether the flexibility of connecting directly to remote servers is worth the complexity of forwarding credentials. More discussion is needed.

Another attribute of a scheme that allows clients to be occasional servers is that it enables efficient messaging on systems where privacy and security are a lesser concern, e.g., within a corporate LAN.
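The second alternative above, in which the remote server asks the client's home server to authenticate the credentials, might look roughly like the following in-memory sketch. Everything here is hypothetical: the class and method names, the use of a random one-time token as the credential, and the dictionary that stands in for server-to-server communication are illustrative assumptions, not a mechanism proposed by the working group.

```python
import secrets


class HomeServer:
    """Stands in for the IMPP server of the client's home domain."""

    def __init__(self, domain: str) -> None:
        self.domain = domain
        self._tokens = {}  # token -> user it was issued to

    def issue_token(self, user: str) -> str:
        """Issue a credential that is safe to pass around (unlike a password)."""
        token = secrets.token_hex(16)
        self._tokens[token] = user
        return token

    def verify(self, token: str):
        """Return (user, domain) for a valid token, else None. Tokens are one-time."""
        user = self._tokens.pop(token, None)
        return (user, self.domain) if user is not None else None


class RemoteServer:
    """Stands in for a server in another domain that the client contacts directly."""

    def __init__(self, home_servers: dict) -> None:
        # domain -> HomeServer; replaces real server-to-server lookup and transport
        self.home_servers = home_servers

    def accept(self, claimed_user: str, claimed_domain: str, token: str) -> bool:
        """Accept the client only if its home server vouches for the claimed identity."""
        home = self.home_servers.get(claimed_domain)
        if home is None:
            return False
        return home.verify(token) == (claimed_user, claimed_domain)


if __name__ == "__main__":
    home = HomeServer("example.com")
    remote = RemoteServer({"example.com": home})
    token = home.issue_token("foo")
    print(remote.accept("foo", "example.com", token))   # first presentation
    print(remote.accept("foo", "example.com", token))   # replay of the same token
```

The sketch shows where the extra complexity comes from: the remote server needs a trusted way to reach the home server for every direct client connection, which is the cost weighed against the flexibility in the text above.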
5.4 Addressing

We need to determine two things: the format of addresses and how they are resolved.

5.4.1 Name format

It is taken as given that we do not use IP addresses, since that would violate several requirements in [IMPP-REQTS] which specify that users must be able to communicate without revealing their IP addresses. (For example, see sections 8.1.18, 8.2.5, and 8.2.6.)

5.4.1.1 URL

We could define a URL scheme or schemes which identify IMPP-related names. This approach is favored by people who want to be able to express any COMMUNICATIONS ADDRESS as a URL; what the model calls COMMUNICATIONS MEANS and CONTACT ADDRESS then become the URL scheme and the scheme-specific part.

We could define a single impp: URL scheme, which is used to refer to both INSTANT MESSAGE INBOXES and PRESENTITIES.

We could define two separate schemes for IM and presence (say, im: and pi:) to better separate the two protocols.

5.4.1.1.1 Pros and cons

The advantage of a single URL is simplicity; the advantage of separate URLs is that it makes it easier to, say, get your IM and presence services from two different servers. This hasn't been discussed in detail on the list, but several WG meetings seem to have found that having separable instant messaging and presence addresses is important.

5.4.1.2 Email format

An alternate approach is to use an RFC-822 style address.

In one scenario, we use an email address format, but don't actually use email addresses. One common proposal is to use RFC-2303 tagged email.

In another scenario, we use something that looks like an ordinary email address, normally the same as the user's actual email address.

5.4.1.2.1 Pros and cons

The "actual email address" scenario seems to have surfaced just recently, and it's not yet clear what all the tradeoffs are. The main advantage is that a user can give out one contact address; if you know somebody's email address, you know their IM address.
The main disadvantage is that there's no way for the protocol to ensure that foo@example.com actually goes to the same person via IM as via email, which means that there'd be a risk of messages getting delivered to the wrong person. However, we have not yet figured out whether this is likely to be a problem in actual practice.

5.4.1.3 Email format in a URL

This scenario defines a URL scheme with an email-like format for the rest of the URL, as in "im:foo@example.com". The idea is to make the URL format more intuitive than "im://example.com/foo". It could also be used to provide the ability to embed a "send me an IM" link in a Web page, even if we don't use URLs internally.

5.4.1.3.1 Pros and cons

The advantages of using a URL scheme are (a) it permits "send me an IM" links (like mailto: links), (b) it provides IMPP systems a clean way to gateway to other systems (just tell your server you want to send an IM to, say, "sms:+3585551234567"), and (c) it permits PIDF to treat INSTANT INBOX ADDRESSes the same as other COMMUNICATIONS ADDRESSes.

STRACKE: Gateway functionality may turn out to be important in the early stages of IMPP deployment (just as it was in the early days of SMTP): we can't expect everybody in the world to adopt IMPP instantly, but we can make the transition simpler.

The chief advantage of using email format is that it's what users are used to; if we adopt actual email addresses, then it reduces their memorization load, too. Some usability testing has indicated that this is what users actually want; they do not want URL-like addresses, and they do not want to have to learn a new set of addresses for their friends.

5.4.2 Resolution

Any of the above formats basically boils down to an ordered pair (User, Domain). To be neutral, this section will use the ordered pair syntax. For brevity, this section refers to instant messaging only, although everything said here applies equally to presence.
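The reduction of each candidate format to (User, Domain) can be sketched as follows. This is a minimal illustration: the function name is hypothetical, the scheme names are only the examples floated in section 5.4.1, and no real syntax has been agreed on, so the parsing rules are assumptions.

```python
# Candidate scheme names mentioned in section 5.4.1; none has been adopted.
CANDIDATE_SCHEMES = ("im", "pi", "impp")


def parse_address(address: str) -> tuple[str, str]:
    """Reduce an address in one of the candidate formats to (user, domain).

    Handles "foo@example.com", "im:foo@example.com" and "im://example.com/foo".
    """
    rest = address
    scheme, colon, tail = address.partition(":")
    if colon and scheme.lower() in CANDIDATE_SCHEMES:
        rest = tail                                   # drop the URL scheme

    if rest.startswith("//"):
        # Hierarchical URL form: //domain/user
        domain, sep, user = rest[2:].partition("/")
    else:
        # Email-like form: user@domain
        user, sep, domain = rest.rpartition("@")

    if not sep or not user or not domain:
        raise ValueError(f"cannot reduce {address!r} to (User, Domain)")
    # DNS names are case-insensitive, so the domain part is normalized.
    return user, domain.lower()


if __name__ == "__main__":
    for example in ("foo@example.com", "im:foo@example.com", "im://example.com/foo"):
        print(example, "->", parse_address(example))
```

All three example spellings reduce to the same pair, which is why the resolution discussion can proceed without first settling the name format.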
Given the INSTANT INBOX ADDRESS (User, Domain), it is necessary to be able to find a server that can deliver messages to the INSTANT INBOX it represents. [Note: the server might be part of a client implementation on the same LAN segment as the sender.] There have been a few proposals along these lines.

This section refers to "the sender", which is simply the entity (be it a SENDER USER AGENT, a SERVER, a PRESENCE USER AGENT, or a wombat) which needs to find the server for (User, Domain).

5.4.2.1 DNS-based

This approach uses DNS to map Domain to a server; User is ignored.

5.4.2.1.1 A only

In this scenario, the sender does a DNS lookup looking for A records for Domain; if at least one is found, then the sender uses the server it specifies.

5.4.2.1.2 SRV only

In this scenario, the sender searches for SRV records according to the algorithm of [SRV]. If an SRV record is found, then the sender uses the server it specifies.

5.4.2.1.3 A followed by SRV

In this scenario, the sender tries the "A only" algorithm first; if it fails, or if the sender cannot connect to an IM server on the host found, the sender then tries the "SRV only" algorithm.

5.4.2.1.4 SRV followed by A

In this scenario, the sender tries the "SRV only" algorithm first; if it fails, or if the sender cannot connect to an IM server on the host found, the sender then tries the "A only" algorithm.

5.4.2.1.5 Pros and cons

"A only" is simple, but somewhat unpleasant, as it requires DOMAIN names of the form "im.example.com" rather than just "example.com", and makes administration difficult, since the IM server cannot easily be moved from one machine to another.

"SRV only" is clearly problematic, since there are probably DNS servers deployed that don't support SRV. Nobody has suggested it on the mailing list; it is mentioned here merely for completeness.
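Before the two combined scenarios are weighed, the control flow of "SRV followed by A" (5.4.2.1.4) can be sketched as below. The DNS lookups and the connection attempt are passed in as functions so that the sketch stays self-contained; those callables, the record tuple layout, and the default port are hypothetical, and the weight-based selection among equal-priority SRV records described in [SRV] is deliberately omitted.

```python
def resolve_srv_then_a(domain, lookup_srv, lookup_a, try_connect, default_port):
    """Find an IM server for `domain` using "SRV followed by A".

    lookup_srv(domain)      -> list of (priority, weight, port, target) tuples
    lookup_a(domain)        -> list of IP address strings
    try_connect(host, port) -> True if an IM server answered there
    Returns (host, port) for the first server reached, or None.
    """
    # Step 1: the "SRV only" algorithm, lowest priority value first.
    for priority, weight, port, target in sorted(lookup_srv(domain), key=lambda r: r[0]):
        if try_connect(target, port):
            return target, port

    # Step 2: SRV failed or no listed host accepted a connection, so fall
    # back to the "A only" algorithm on Domain itself.
    for address in lookup_a(domain):
        if try_connect(address, default_port):
            return address, default_port
    return None


if __name__ == "__main__":
    # Fake lookups standing in for real DNS; all values are made up.
    srv = {"example.com": [(20, 0, 5000, "backup.example.com"),
                           (10, 0, 5000, "im.example.com")]}
    a = {"example.org": ["192.0.2.7"]}
    reachable = {("im.example.com", 5000), ("192.0.2.7", 5000)}

    def fake_srv(d): return srv.get(d, [])
    def fake_a(d): return a.get(d, [])
    def fake_connect(host, port): return (host, port) in reachable

    print(resolve_srv_then_a("example.com", fake_srv, fake_a, fake_connect, 5000))
    print(resolve_srv_then_a("example.org", fake_srv, fake_a, fake_connect, 5000))
```

Swapping the two steps gives "A followed by SRV"; the structure is the same, and only the order in which failures are discovered changes.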
"A followed by SRV" is also problematic, since Domain's A record might predate IMPP, and the host it specifies might not be running an IM server, which means that the sender would incur an unnecessary latency in attempting to connect. "SRV followed by A" uses SRV the way it's meant to be used: as a more specific way of pointing to a server for Domain. It can incur unnecessary latencies when Domain has an A record rather than an SRV record (since it introduces an extra DNS lookup); but, if DNS negative caching ( [DNS-NCACHE]) is implemented, then the latency will be reduced in most cases. Theoretically, the latency could be avoided by packing two DNS queries into a single packet; however, although the protocol specification permits this, it doesn't seem to be widely implemented. 5.4.3 Directory search Some of the proponents of URL-based naming syntaxes have proposed that we use URLs internally and use directory searches to map user-friendly names to URLs. STRACKE: WG support for using directory servers seems to be weak or nonexistent; nobody has spoken up in their favor in some time (at least since before the first version of this Draft was published). 5.4.3.1 Pros and cons The main advantage of a directory search is flexibility; for example, a directory could map a single user-friendly address to multiple alternate URLs. The disadvantage is that it's slow and complex. In addition, we would need some way of finding out which directory server to look in. LDAP referrals would be nice, but they aren't ubiquitous enough, so we'd probably have to use DNS, at which point we might as well use DNS directly. Aggarwal, et al. [Page 14] INTERNET-DRAFT PITP/MITP 9 March, 2000 5.5 Connection Model There are several questions to be asked in relation to the connection model. They include (but are probably not limited to): o Does connectivity imply presence? o Do LANs and WANs offer useful opportunities for different modes of behavior? 
o Should client-client connections be permitted?
o Do we have to worry about connection management?
o Can we optimize server-server connections? Is it worth doing?

5.5.1 Connectivity and Presence

STRACKE: There seems to be at least weak consensus that connectivity should not imply presence; an IM client which wishes such semantics should act as a presence client, too, and publish PI on its own. This brings about its own problems; for example, it requires that we permit presentities to publish diffs to their PI, so that an IM client won't overwrite existing non-IM PRESENCE TUPLES. On the other hand, such an ability would be extremely useful anyway, as it would permit different communications endpoints to maintain their own PRESENCE TUPLES (for example, turning on your cellphone could update your PRESENCE TUPLE for your cell number).

If the client opens a connection to a server and sends some kind of keep-alive periodically, the server can use the presence of this connection to indicate PRESENCE. Is this a good idea?

5.5.1.1 Pros and Cons

On the positive side, connectivity implying presence works well on a LAN segment where frequent keep-alive messages are a reasonable thing to do. It is a simple and easily understood semantic.

On the negative side, the notion doesn't work well for clients with slower or less reliable WAN connections. The notion is no help when a client wishes to inform only a subset of SUBSCRIBERs of a change in status. The most telling argument against connectivity implying presence is that it conflicts with the idea that presence is independent of instant messaging (see [IMPP-REQTS] section 5.1.1), although this can be countered by an observation that the presence service could revert to the distribution of some default PRESENCE INFORMATION in the absence of a connection.
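The point in 5.5.1 about publishing diffs can be illustrated with a small sketch. It assumes, purely for illustration, that PRESENCE INFORMATION can be modeled as a mapping from COMMUNICATIONS ADDRESS to a status string; the real structure is up to [IMPP-PIDF]. The name apply_diff and the convention that None withdraws a tuple are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical model: one PRESENCE TUPLE per address, reduced to a status.
PresenceInfo = Dict[str, str]
Diff = Dict[str, Optional[str]]   # None means "withdraw this tuple"


def apply_diff(current: PresenceInfo, diff: Diff) -> PresenceInfo:
    """Return a new PI with only the tuples named in `diff` changed."""
    updated = dict(current)           # leave the caller's copy untouched
    for address, status in diff.items():
        if status is None:
            updated.pop(address, None)
        else:
            updated[address] = status
    return updated
```

An IM client that publishes {"im:foo@example.com": "online"} as a diff leaves a tuple maintained by, say, a cellphone untouched, whereas publishing a complete replacement PI would have erased it.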
5.5.2 Client-Client Communications

If two clients are on the same corporate LAN (or some other setting where a relaxed privacy model is appropriate) then it is possible that they could send instant messages to each other directly and avoid involving a server in their conversation.

5.5.2.1 Pros and Cons

On the positive side, the process is more efficient. If the server is outside the company firewall, it also makes intranet communications easier to secure.

On the negative side, either a new client-client protocol must be built or clients must include aspects of the server-side protocol (though it should be noted that this can be an optional feature so that devices with limited resources don't have to support this mode of interaction).

5.5.3 Connection Management

If we permit long-lived TCP connections (a separate but related issue) then we have to have a way for servers to force the release of TCP connections when server data structures (e.g. sockets) are in short supply.

5.5.3.1 Pros and Cons

Such a capability is needed if long-lived TCP connections are to be permitted, so all of the advantages and disadvantages of long-lived connections carry over to this item. In addition it should be noted that allowing a server to abort connections (whether based on inactivity or random selection) interferes with the use of connectivity to indicate presence and requires a more complex implementation on both client and server components.

This whole argument may be moot anyway. TCP connections are going to be aborted occasionally whatever we specify, so both client and server must be able to handle their disappearance (if TCP is used in the protocol).

There are probably different attributes of this issue that apply to client-server and server-server interactions.
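As a rough illustration of the inactivity-based policy mentioned in 5.5.3.1, a server might choose which connections to release as follows. This is a sketch only: connections_to_release and its limit parameter are hypothetical, and how a server tracks last activity is left open.

```python
from typing import Dict, List


def connections_to_release(last_activity: Dict[str, float], limit: int) -> List[str]:
    """Return ids of the least recently active connections beyond `limit`.

    `last_activity` maps a connection id to the time of its last traffic.
    """
    excess = len(last_activity) - limit
    if excess <= 0:
        return []                       # resources are not in short supply
    # Sort connection ids by last activity, oldest first.
    by_idle = sorted(last_activity, key=last_activity.get)
    return by_idle[:excess]
```

Whatever the selection policy, the client on a released connection sees the same thing it would see on any aborted TCP connection, which is the reason the argument above may be moot.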
5.5.4 Server-Server connection optimization

If there exists a high volume of IMPP traffic between two servers, it might be worthwhile to allow them to pass messages from many clients over a single TCP connection (assuming TCP is the chosen transport mechanism).

5.5.4.1 Pros and Cons

The most obvious advantage is that connection setup may be avoided. This is significant in terms of computational overhead and latency reduction. This scheme might also make better use of kernel resources on busy servers.

The disadvantage is that the scheme adds some complexity to the implementation (if we assume that it's harder to make an IMPP implementation do its own multiplexing than it is to make use of lots of sockets).

6 Other Protocol Aspects

The following are aspects of the protocol that will be discussed and specified after the key issues are resolved. The issues are described below with the intent of illustrating the scope of further work in the transport area, and of encouraging discussion about these issues once the key issues are resolved. Both the list of aspects and the scope of each are probably not exhaustive at this point.

6.1 Command Model

The transport protocol will likely consist of a number of discrete operations, or "commands". The protocol may prescribe a certain general pattern of interaction between clients and servers. For example, will a command sequence consist of a "request" followed by a "response"? If so, how can requests and responses be interleaved in a valid interaction? (For example, is pipelining allowed?) How do the various components of a command - e.g. the request and response - get transmitted across intervening "hops", if any?

6.2 Command set

This will specify the various discrete protocol operations (such as "FETCH" and "CHANGE") and the specific purposes they are used for.
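To make the pipelining question in 6.1 concrete, here is one possible shape of an answer, sketched in Python. Nothing here is proposed syntax: the Pipeline class, the "t1"-style tags, and the one-line wire format are all hypothetical, and "FETCH" and "CHANGE" are used only because they are the example names in 6.2. The sketch assumes each response carries the tag of the request it answers, so that responses may arrive in any order.

```python
import itertools
from typing import Dict


class Pipeline:
    """Tracks requests sent without waiting for earlier responses."""

    def __init__(self) -> None:
        self._tags = itertools.count(1)
        self._pending: Dict[str, str] = {}   # tag -> command awaiting a response

    def send(self, command: str) -> str:
        """Tag a command and return the line that would go on the wire."""
        tag = f"t{next(self._tags)}"
        self._pending[tag] = command
        return f"{tag} {command}"

    def receive(self, line: str) -> str:
        """Match a tagged response line to the request it answers."""
        tag, _, status = line.partition(" ")
        # Raises KeyError if the tag answers nothing we sent.
        command = self._pending.pop(tag)
        return f"{command} -> {status}"

    @property
    def outstanding(self) -> int:
        """Number of requests still waiting for a response."""
        return len(self._pending)
```

A strictly alternating request/response model would be the special case in which the sender never lets the number of outstanding requests exceed one.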
6.3 Command encoding

This will specify the general format for individual protocol operations. For example, a request may consist of a command name, certain parameters, a security descriptor, etc., arranged in a certain order; this section will specify the order and syntax of the different "fields" in a protocol entity. The format may be partially or completely adapted from an existing protocol such as IMAP or HTTP, or may be distinct from existing formats. Note that data formats for PRESENCE INFORMATION and INSTANT MESSAGES will be entirely specified by the [IMPP-PIDF] and [IMPP-MIDF] deliverables; this section will only define the other transport-related aspects of protocol formats.

6.4 Error handling

The protocol will specify error handling policies. How are errors identified, and by which entities? Does the protocol adopt end-to-end or hop-by-hop error detection? How do various entities respond to errors? What policy do entities adopt for retrying protocol operations? How do entities respond gracefully to crashes or unanticipated failures?

6.5 Error set

The protocol will specify an exhaustive and useful set of error conditions. Such a set may or may not draw upon existing error sets, such as that used by HTTP.

6.6 Command syntax and semantics, by command

For each protocol command, the protocol will specify the precise syntax specific to that command, including the arguments and other fields present in both the request and response (if organized as such) for that command. In addition, the protocol will specify semantics specific to the command, as well as handling of error conditions specific to that command. For example, the command(s) to establish SUBSCRIPTIONS may specify a mechanism to establish leases on SUBSCRIPTIONS, and similarly the command to publish PRESENCE INFORMATION may specify a mechanism to establish a lease or a timeout on that information.
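The lease idea mentioned at the end of 6.6 can be sketched as follows. This is only an illustration of what "a lease on a SUBSCRIPTION" might mean operationally; the LeaseTable class, its method names, and the use of seconds as the time unit are all assumptions, and the caller supplies the current time so that the sketch does not depend on a clock.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str]   # (subscriber, presentity)


class LeaseTable:
    """Subscriptions that lapse unless renewed before their lease expires."""

    def __init__(self) -> None:
        self._expiry: Dict[Key, float] = {}

    def grant(self, subscriber: str, presentity: str, now: float, duration: float) -> float:
        """Create or renew a lease; returns the new expiry time."""
        expires = now + duration
        self._expiry[(subscriber, presentity)] = expires
        return expires

    def active(self, subscriber: str, presentity: str, now: float) -> bool:
        """True while the lease exists and has not yet expired."""
        return self._expiry.get((subscriber, presentity), float("-inf")) > now

    def expire(self, now: float) -> List[Key]:
        """Drop and return every subscription whose lease has run out."""
        lapsed = sorted(key for key, t in self._expiry.items() if t <= now)
        for key in lapsed:
            del self._expiry[key]
        return lapsed
```

The same table could hold timeouts on published PRESENCE INFORMATION by keying on the presentity alone; renewing is simply granting again before the expiry time.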
6.7 Scenarios

The protocol will also specify, though perhaps as an informational exercise rather than a binding specification, actual scenarios such as a user "logging on" to a Presence/Instant Messaging service, adding users to their "contact list", maintaining up-to-date status for those users, and sending users Instant Messages. It may also be useful to describe specific scenarios that pertain solely to either Presence or Instant Messaging, since [IMPP-REQTS] stipulates that the services must be deployable independently.

7 Internationalization Considerations

This document has no internationalization impact (yet).

8 IANA Considerations

This document does not introduce any new IANA considerations (yet).

9 Copyright

The following copyright notice is copied from RFC-2026 [Bradner, 1996], section 10.4, and describes the applicable copyright for this document.

Copyright (C) The Internet Society April 5, 1998. All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assignees.
This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

10 Intellectual Property

The following notice is copied from RFC-2026 [Bradner, 1996], section 10.4, and describes the position of the IETF concerning intellectual property claims made against this document.

The IETF takes no position regarding the validity or scope of any intellectual property or other rights that might be claimed to pertain to the implementation or use of the technology described in this document or the extent to which any license under such rights might or might not be available; neither does it represent that it has made any effort to identify any such rights. Information on the IETF's procedures with respect to rights in standards-track and standards-related documentation can be found in BCP-11. Copies of claims of rights made available for publication and any assurances of licenses to be made available, or the result of an attempt made to obtain a general license or permission for the use of such proprietary rights by implementors or users of this specification can be obtained from the IETF Secretariat.

The IETF invites any interested party to bring to its attention any copyrights, patents or patent applications, or other proprietary rights which may cover technology that may be required to practice this standard. Please address the information to the IETF Executive Director.

11 Acknowledgements

Valuable review comments were received from Greg Hudson, Marc Horowitz and Jonathan Rosenberg.

12 References

[DNS-NCACHE] M. Andrews, "Negative Caching of DNS Queries (DNS NCACHE)", RFC-2308. CSIRO. March, 1998.

[IMPP-BASIS] S. Aggarwal, M. Day, G. Hudson, and G.
Mohr, "Proposed Design Decisions for IMPP," draft-day-impp-basis-00.txt. Internet Draft, work in progress. Lotus; Microsoft; CMGI Solutions; MIT. October, 2000.

[IMPP-MIDF] J. Stracke, "Message Information Data Format." draft-ietf-impp-midf-00.txt. Internet Draft, work in progress. eCal Corp. January, 2000.

[IMPP-MODEL] M. Day, J. Rosenberg, H. Sugano, "A Model for Presence and Instant Messaging". RFC-2778. Lotus; Bell Labs; Fujitsu. February, 2000.

[IMPP-PIDF] H. Sugano, C. Vermeulen, "Presence Information Data Format." draft-ietf-impp-pidf-00.txt. Internet Draft, work in progress. Fujitsu; Alcatel. January, 2000.

[IMPP-REQTS] M. Day, S. Aggarwal, G. Mohr, J. Vincent, "Instant Messaging / Presence Protocol Requirements." RFC-2779. Lotus; Microsoft; Activerse; Into Networks. February, 2000.

[IMPP-SECURITY] G. Klyne, "Security Framework for Instant Messaging and Presence Protocol." draft-ietf-impp-security-framework-01.txt. Internet Draft, work in progress. Content Technologies. March, 2000.

[IMPP-SERVICE] "Service Document for Instant Messaging and Presence Protocol." Proposed document to be produced by the IMPP working group.

[LDAPV3] M. Wahl, T. Howes, S. Kille. "Lightweight Directory Access Protocol (v3)." RFC-2251. Critical Angle; Netscape; Isode. December, 1997.

[SIP] M. Handley, H. Schulzrinne, E. Schooler, J. Rosenberg. "SIP: Session Initiation Protocol." RFC-2543. ACIRI; Columbia U.; Cal Tech; Bell Labs. March, 1999.

[SRV] A. Gulbrandsen, P. Vixie, L. Esibov. "A DNS RR for specifying the location of services (DNS SRV)." RFC-2782. Troll Technologies; ISC; Microsoft. February, 2000.

[VERMEULEN-TRANSPORT] C. Vermeulen, "A Presence Information Transport Protocol." draft-vermeulen-impp-pitp-00.txt. Internet Draft, work in progress. Alcatel. November, 1999.

[MUSTS] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels," BCP 14, RFC-2119, Harvard University, March 1997.
13 Authors' Addresses

Sonu Aggarwal
Microsoft Corp.
One Microsoft Way
Redmond, WA 98052
USA
sonuag@microsoft.com

Colin Benson
Net Effect
4144 Lankershim Boulevard
Suite 200
North Hollywood, CA 91602
USA
colin@neteffect.com

John Stracke
eCal Corp.
234 N. Columbus Blvd., 2nd Floor
Philadelphia, PA 94043
USA
francis@ecal.com

Christophe Vermeulen
Alcatel
Research Center DS9/C0
F. Wellesplein, 1
B-2018 Antwerp
Belgium
Christophe.Vermeulen@alcatel.be