An XMPP Sub-protocol for WebSocket
&yet, lance@andyet.net
Mozilla, jack@metajack.im
cstar industries, eric@cestari.info
Applications
XMPP Working Group
I-D, Internet-Draft, WebSocket, XMPP
This document defines a binding for the XMPP protocol over a
WebSocket transport layer. A WebSocket binding for XMPP
provides higher performance than the current HTTP binding for
XMPP.
Applications using XMPP (see [RFC6120]
and [RFC6121]) on the Web currently make
use of BOSH (see [XEP-0124] and
[XEP-0206]), an XMPP binding to HTTP. BOSH is
based on the HTTP long polling technique, and it suffers from
high transport overhead compared to XMPP's native binding
to TCP. In addition, there are a number of other known
issues with long polling [RFC6202], which have
an impact on BOSH-based systems.
It would be much better in most circumstances to avoid
tunneling XMPP over HTTP long polled connections and instead
use the XMPP protocol directly. However, the APIs and sandbox
that browsers have provided do not allow this. The WebSocket
protocol [RFC6455] now exists to solve these
kinds of problems. The WebSocket protocol is a bi-directional
protocol that provides a simple message-based framing layer
over raw sockets and allows for more robust and efficient
communication in web applications.
The WebSocket protocol enables two-way communication
between a client and a server, effectively emulating TCP
at the application layer and therefore overcoming many of
the problems with existing long-polling techniques for
bidirectional HTTP. This document defines a WebSocket
sub-protocol for the Extensible Messaging and Presence
Protocol (XMPP).
The basic unit of framing in the WebSocket protocol is called
a message. In XMPP, the basic unit is the stanza, which is a
subset of the first-level children of each document in an XMPP
stream (see Section 9 of [RFC6120]). XMPP also
has a concept of messages, which are stanzas whose top-level
element name is <message/>. In this document, the word "message"
will mean a WebSocket message, not an XMPP message stanza.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described
in [RFC2119].
The XMPP sub-protocol is used to transport XMPP over a
WebSocket connection. The client and server agree to this
protocol during the WebSocket handshake (see Section 1.3 of
[RFC6455]).
During the WebSocket handshake, the client MUST include the
|Sec-WebSocket-Protocol| header in its handshake, and the
value |xmpp| MUST be included in the list of protocols. The
reply from the server MUST also contain |xmpp| in its own
|Sec-WebSocket-Protocol| header in order for an XMPP
sub-protocol connection to be established.
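
The handshake requirement above can be illustrated with a minimal sketch. The function names, the host, and the path are hypothetical; a real client would normally leave the handshake to a WebSocket library and only supply the sub-protocol name.

```python
import base64
import os

XMPP_SUBPROTOCOL = "xmpp"


def build_handshake_request(host: str, path: str) -> str:
    """Return a WebSocket opening handshake that offers the xmpp sub-protocol."""
    # The key is a random 16-byte nonce, base64-encoded.
    key = base64.b64encode(os.urandom(16)).decode("ascii")
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Upgrade: websocket",
        "Connection: Upgrade",
        f"Sec-WebSocket-Key: {key}",
        "Sec-WebSocket-Version: 13",
        f"Sec-WebSocket-Protocol: {XMPP_SUBPROTOCOL}",
    ]
    return "\r\n".join(lines) + "\r\n\r\n"


def server_accepted_xmpp(response_headers: dict[str, str]) -> bool:
    """Check that the server's reply names xmpp in Sec-WebSocket-Protocol."""
    for name, value in response_headers.items():
        # HTTP header names are case-insensitive.
        if name.lower() == "sec-websocket-protocol":
            offered = [token.strip() for token in value.split(",")]
            return XMPP_SUBPROTOCOL in offered
    return False
```

If server_accepted_xmpp() returns False, no XMPP sub-protocol connection has been established and the client should not send XMPP data on the socket.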
Once the handshake is complete, WebSocket messages sent or
received will conform to the protocol defined in the rest of
this document.
Data frame messages in the XMPP sub-protocol MUST be of the
text type and contain UTF-8 encoded data. The close control
frame's contents are specified in the description of closing
the connection, below.
Control frames other than close are not restricted.
Unless noted in text, the word "message" will mean a
WebSocket message composed of text data frames.
The first message sent after the handshake is complete MUST
be an <open /> element using the
"urn:ietf:params:xml:ns:xmpp-framing" namespace, whose 'from',
'id', 'to', and 'version' attributes mirror those in the XMPP
opening stream tag as defined for the
'http://etherx.jabber.org/streams' namespace in XMPP [RFC6120].
The '<' character of the open tag MUST be the first character
of the text payload.
The server MUST respond with an <open /> element, or
a <close /> element.
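
As an illustration of stream initiation, the following sketch builds the client's <open /> message and classifies the server's first reply. The helper names are hypothetical, and the "1.0" default for 'version' is an assumption taken from the XMPP stream version in common use.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

FRAMING_NS = "urn:ietf:params:xml:ns:xmpp-framing"


def build_open(to: str, version: str = "1.0") -> str:
    """Serialize the <open/> element; '<' is the first character of the payload."""
    return (
        f"<open xmlns={quoteattr(FRAMING_NS)} "
        f"to={quoteattr(to)} version={quoteattr(version)}/>"
    )


def classify_reply(message: str) -> str:
    """Return 'open' or 'close' for a framing element, 'other' otherwise."""
    tag = ET.fromstring(message).tag
    if tag == f"{{{FRAMING_NS}}}open":
        return "open"
    if tag == f"{{{FRAMING_NS}}}close":
        return "close"
    return "other"
```

A reply classified as "other" at this point would indicate that the server is not following the framing rules described here.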
Clients MUST NOT attempt to multiplex XMPP streams for multiple
JIDs over the same WebSocket.
Stream level errors in XMPP are terminal. Should such an
error occur, the server MUST send the stream error as a
complete element in a message to the client.
If the error occurs during the opening of a stream, the
server MUST send the initial open element response, followed by
the stream level error in a second WebSocket message. The
server MUST then close the connection as described below.
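
A server-side sketch of this sequence follows. The function name and parameters are hypothetical, the condition is assumed to be a valid stream error condition name from XMPP core (for example "host-unknown"), and sending the WebSocket close frame itself is left to the WebSocket library.

```python
from xml.sax.saxutils import quoteattr

FRAMING_NS = "urn:ietf:params:xml:ns:xmpp-framing"
STREAMS_NS = "http://etherx.jabber.org/streams"
STREAM_ERRORS_NS = "urn:ietf:params:xml:ns:xmpp-streams"


def opening_error_messages(server_domain: str, stream_id: str,
                           condition: str) -> list[str]:
    """Return the text messages a server sends when stream opening fails.

    Each list item is sent as its own WebSocket message, in order.
    """
    open_reply = (
        f"<open xmlns={quoteattr(FRAMING_NS)} "
        f"from={quoteattr(server_domain)} "
        f'id={quoteattr(stream_id)} version="1.0"/>'
    )
    # The stream error is a complete element with its namespaces declared.
    error = (
        f"<stream:error xmlns:stream={quoteattr(STREAMS_NS)}>"
        f"<{condition} xmlns={quoteattr(STREAM_ERRORS_NS)}/>"
        f"</stream:error>"
    )
    close = f"<close xmlns={quoteattr(FRAMING_NS)}/>"
    return [open_reply, error, close]
```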
Either the server or the client may close the connection at any
time. Before closing the connection, the closing party SHOULD
close the XMPP stream, if it has been established, by sending a
message with the <close /> element, qualified by
the "urn:ietf:params:xml:ns:xmpp-framing" namespace. The
stream is considered closed when a corresponding <close/>
element is received from the other party.
To initiate closing the WebSocket connection, the closing
party MUST send a normal WebSocket close message with
an empty body. The connection is considered closed when a
matching close message is received (see Section 1.4 of
[RFC6455]).
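
The two-step close (XMPP stream first, then WebSocket) can be tracked with a small amount of state. The class below is a hypothetical sketch and does not send anything itself; the caller is assumed to transmit the returned text and, once stream_closed is true, to send the WebSocket close frame.

```python
import xml.etree.ElementTree as ET

FRAMING_NS = "urn:ietf:params:xml:ns:xmpp-framing"
CLOSE_TAG = f"{{{FRAMING_NS}}}close"


class CloseTracker:
    """Track whether both sides have sent the framing <close/> element."""

    def __init__(self) -> None:
        self.sent_close = False
        self.received_close = False

    def start_close(self) -> str:
        """Return the <close/> message to send and record that it was sent."""
        self.sent_close = True
        return f'<close xmlns="{FRAMING_NS}"/>'

    def on_message(self, text: str) -> None:
        """Record a <close/> element received from the other party."""
        if ET.fromstring(text).tag == CLOSE_TAG:
            self.received_close = True

    @property
    def stream_closed(self) -> bool:
        """True once a <close/> has been both sent and received."""
        return self.sent_close and self.received_close
```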
If a client closes the WebSocket connection without closing the
XMPP stream after having enabled stream management (see
[XEP-0198]), the server SHOULD keep the XMPP session alive
for a period of time based on server policy, as specified in
[XEP-0198]. If the client has not negotiated the
use of [XEP-0198], there is no distinction between
a stream that was closed as described above and a simple disconnection;
the stream is then considered implicitly closed and the XMPP session
ended.
If the server (or a connection manager intermediary) wishes to instruct
the client to move to a different WebSocket endpoint (e.g. for load balancing
purposes), the server MAY send a <close /> element and set the
"see-other-uri" attribute to the URI of the new WebSocket endpoint.
Clients MUST NOT accept suggested endpoints with a lower security context (e.g. moving
from a "wss://" endpoint to a "ws://" endpoint).
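
A client-side check for a suggested endpoint might look like the sketch below. The function name is hypothetical, and rejecting schemes other than "ws" and "wss" is an assumption of this sketch rather than a rule stated above.

```python
from urllib.parse import urlsplit


def may_follow_redirect(current_uri: str, suggested_uri: str) -> bool:
    """Decide whether a "see-other-uri" suggestion may be followed."""
    current = urlsplit(current_uri).scheme.lower()
    suggested = urlsplit(suggested_uri).scheme.lower()
    # Assumption: only WebSocket URIs are meaningful as a new endpoint.
    if suggested not in ("ws", "wss"):
        return False
    # Never move from a secure endpoint to an insecure one.
    if current == "wss" and suggested != "wss":
        return False
    return True
```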
Every XMPP stanza or other XML element sent directly over
the XMPP stream (e.g. <features xmlns="http://etherx.jabber.org/streams" />)
MUST be sent in its own message. As such, every WebSocket text
message that is received MUST be a complete
and parsable XML fragment, with all relevant xmlns and xml:lang
declarations specified.
As it is already mandated that the content of each message is
UTF-8 encoded, XML text declarations SHOULD NOT be included
in messages.
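
A receiver can check the framing rule with any namespace-aware XML parser. The sketch below uses Python's standard library parser; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def is_complete_fragment(text: str) -> bool:
    """Return True if the message is exactly one well-formed XML element.

    Partial elements, several top-level elements in one message, and
    undeclared namespace prefixes all make the parser raise ParseError.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True
```

For example, a message containing only '<stream:features/>' is rejected because the 'stream' prefix is not declared within that message.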
After successful SASL authentication, an XMPP stream needs
to be restarted. In this case, as soon as the message
containing the success indication is sent (or received), both
the server and client streams are implicitly closed, and
new streams need to be opened. The client MUST open a new
stream by sending a new <open /> element, as for the initial
stream, and MUST NOT send a closing <close /> element.
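
The restart rule can be sketched from the client's point of view as follows. The function name is hypothetical; the SASL namespace is the one defined by XMPP core.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from xml.sax.saxutils import quoteattr

FRAMING_NS = "urn:ietf:params:xml:ns:xmpp-framing"
SASL_SUCCESS_TAG = "{urn:ietf:params:xml:ns:xmpp-sasl}success"


def restart_message_for(received: str, server_domain: str) -> Optional[str]:
    """Return the <open/> to send if `received` is a SASL success, else None.

    No <close/> is sent first: the old stream is implicitly closed.
    """
    if ET.fromstring(received).tag != SASL_SUCCESS_TAG:
        return None
    return (
        f"<open xmlns={quoteattr(FRAMING_NS)} "
        f'to={quoteattr(server_domain)} version="1.0"/>'
    )
```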
XMPP servers send whitespace pings as keepalives between
stanzas, and XMPP clients can do the same as these extra
whitespace characters are not significant in the protocol.
Servers and clients SHOULD use WebSocket ping control
frames instead for this purpose.
In some cases, the WebSocket connection might be served by
an intermediary connection manager and not the XMPP server.
In these situations, the use of WebSocket ping messages is
insufficient to test that the XMPP stream is still alive.
Both the XMPP Ping extension [XEP-0199] and
the XMPP Stream Management extension [XEP-0198]
provide mechanisms to ping the XMPP server, and either extension
(or both) MAY be used to determine the state of the connection.
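
An application-level ping using the XMPP Ping extension could be sketched as below. The helper names are hypothetical. Treating an error reply as proof of liveness is an assumption of this sketch: any reply carrying the matching id shows that the XMPP server, and not only the intermediary, handled the request.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

IQ_TAG = "{jabber:client}iq"


def build_ping(server_domain: str, ping_id: str) -> str:
    """Serialize an XMPP Ping request as one complete WebSocket message."""
    return (
        f'<iq xmlns="jabber:client" to={quoteattr(server_domain)} '
        f'id={quoteattr(ping_id)} type="get">'
        f'<ping xmlns="urn:xmpp:ping"/></iq>'
    )


def is_ping_reply(message: str, ping_id: str) -> bool:
    """Return True if `message` is an <iq/> reply to the ping with this id."""
    element = ET.fromstring(message)
    return (
        element.tag == IQ_TAG
        and element.get("id") == ping_id
        and element.get("type") in ("result", "error")
    )
```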
TLS cannot be used at the XMPP sub-protocol layer because the
sub-protocol does not allow for raw binary data to be sent.
Instead, enabling TLS SHOULD be done at the WebSocket layer
using secure WebSocket connections via the |wss| URI scheme
(see Section 10.6 of [RFC6455]).
Because TLS is to be provided outside of the XMPP
sub-protocol layer, a server MUST NOT advertise
TLS as a stream feature (see Section 4.6 of
[RFC6120]), and a client MUST ignore any
advertised TLS stream feature, when using the XMPP
sub-protocol.
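
On the client side, ignoring an advertised TLS feature amounts to filtering it out before negotiation. A minimal sketch, with a hypothetical function name and the STARTTLS namespace from XMPP core:

```python
import xml.etree.ElementTree as ET

STARTTLS_TAG = "{urn:ietf:params:xml:ns:xmpp-tls}starttls"


def usable_features(features_xml: str) -> list[str]:
    """Return the qualified names of advertised features, minus STARTTLS."""
    features = ET.fromstring(features_xml)
    return [child.tag for child in features if child.tag != STARTTLS_TAG]
```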
In order to alleviate the problems of temporary disconnections,
the XMPP Stream Management extension [XEP-0198]
MAY be used to confirm when stanzas have been received by the server.
In particular, session resumption as defined in [XEP-0198]
MAY be used to recreate the same stream session state after a
temporary network unavailability or after navigating to a new
URL in a browser.
The XMPP extension Discovering Alternate XMPP Connection Methods
[XEP-0156] provides mechanisms to discover
the additional information needed to connect to an XMPP server
outside of the procedure defined in Section 3 of
[RFC6120].
Servers MAY expose such discovery information, and clients MAY
use such information to determine the WebSocket endpoint for a server.
The HTTP lookup method in [XEP-0156] MAY be used
to establish trust between the XMPP server domain and the WebSocket endpoint,
particularly in multi-tenant situations where the same WebSocket endpoint is
serving multiple XMPP domains.
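
As a sketch of the HTTP lookup method, the code below extracts WebSocket endpoints from a host-meta document that has already been fetched. The function name is hypothetical. The XRD namespace and the "urn:xmpp:alt-connections:websocket" link relation are taken from the discovery extension as the author of this sketch understands it and should be checked against that specification.

```python
import xml.etree.ElementTree as ET

XRD_NS = "http://docs.oasis-open.org/ns/xri/xrd-1.0"
WEBSOCKET_REL = "urn:xmpp:alt-connections:websocket"


def websocket_endpoints(host_meta_xml: str) -> list[str]:
    """Return the href of every WebSocket link in a host-meta XRD document."""
    root = ET.fromstring(host_meta_xml)
    endpoints = []
    for link in root.iter(f"{{{XRD_NS}}}Link"):
        href = link.get("href")
        if link.get("rel") == WEBSOCKET_REL and href:
            endpoints.append(href)
    return endpoints
```

Fetching the document over HTTPS from the XMPP server's own domain is what lets the result serve as a trust delegation from that domain to the listed endpoint.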
This specification requests IANA to register the WebSocket XMPP
sub-protocol under the "WebSocket Subprotocol Name" Registry
with the following data:
Subprotocol Identifier: xmpp
Subprotocol Common Name: WebSocket Transport for the Extensible
Messaging and Presence Protocol (XMPP)
Subprotocol Definition: RFC XXXX
[[ NOTE TO RFC EDITOR: Please replace "XXXX" with the number assigned to this document upon publication as an RFC. ]]
A URN sub-namespace for framing of Extensible Messaging and
Presence Protocol (XMPP) streams is defined as follows.
URI: urn:ietf:params:xml:ns:xmpp-framing
Specification: RFC XXXX
Description: This is the XML namespace name for framing of
Extensible Messaging and Presence Protocol (XMPP) streams as
defined by RFC XXXX.
Registrant Contact: IESG <iesg@ietf.org>
Since application-level TLS cannot be used (see the discussion
of TLS above), applications which need to protect the privacy
of the XMPP traffic need to do so at the WebSocket or other
appropriate layer.
Browser-based applications are not able to inspect
and verify at the application layer the certificate used for the WebSocket connection to ensure
that it corresponds to the domain specified as the "to" address
of the XMPP stream. For hosts whose domain matches the origin for the
WebSocket connection, that check is already performed by the browser.
However, in situations where the domain of the XMPP server
might not match the origin for the WebSocket endpoint (especially multi-tenant
hosting situations), the HTTP discovery
method in [XEP-0156] MAY be used to delegate trust from
the XMPP server domain to the WebSocket origin.
When presented with a new WebSocket endpoint via the "see-other-uri" attribute
of a <close/> element, clients MUST NOT accept the suggestion if the security
context of the new endpoint is lower than the current one in order to prevent downgrade
attacks from a "wss://" endpoint to "ws://".
The Security Considerations for both WebSocket (see Section
10 of [RFC6455]) and XMPP (see Section 13 of
[RFC6120]) apply to the WebSocket XMPP
sub-protocol.
The following schema formally defines the 'urn:ietf:params:xml:ns:xmpp-framing' namespace used in this document, in conformance with W3C XML Schema (XML Schema Part 1: Structures Second Edition). Because validation of XML streams and stanzas is optional, this schema is not normative and is provided for descriptive purposes only.