An XMPP Sub-protocol for WebSocket
&yet
lance@andyet.net
jack@metajack.im
ProcessOne
ecestari@process-one.com
Applications
HyBi Working Group
I-D
Internet-Draft
WebSocket
XMPP
This document defines a binding for the XMPP protocol over a
WebSocket transport layer. A WebSocket binding for XMPP
provides higher performance than the current HTTP binding for
XMPP.
Applications using XMPP on the Web currently make use of
BOSH, an XMPP binding to HTTP. BOSH is based on the HTTP
long polling technique, and it suffers from high transport
overhead compared to XMPP's native binding to TCP. In
addition, there are a number of other known issues with
long polling, which have an impact on BOSH-based systems.
It would be much better in most circumstances to avoid
tunneling XMPP over HTTP long-polled connections and instead
use the XMPP protocol directly. However, the APIs and sandbox
that browsers have provided do not allow this. The WebSocket
protocol now exists to solve these
kinds of problems. The WebSocket protocol is a bi-directional
protocol that provides a simple message-based framing layer
over raw sockets and allows for more robust and efficient
communication in web applications.
The WebSocket protocol enables two-way communication
between a client and a server, effectively emulating TCP
at the application layer and therefore overcoming many of
the problems with existing long-polling techniques for
bidirectional HTTP. This document defines a WebSocket
sub-protocol for the Extensible Messaging and Presence
Protocol (XMPP).
The basic unit of framing in the WebSocket protocol is called
a message. In XMPP, the basic unit is the stanza, which is a
subset of the first-level children of each document in an XMPP
stream (see Section 9 of the XMPP core specification). XMPP also
has a concept of messages, which are stanzas whose top-level
element name is message. In this document, the word "message"
will mean a WebSocket message, not an XMPP message stanza.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described
in RFC 2119.
The XMPP sub-protocol is used to transport XMPP over a
WebSocket connection. The client and server agree to this
protocol during the WebSocket handshake (see Section 1.3 of
the WebSocket protocol specification).
During the WebSocket handshake, the client MUST include the
|Sec-WebSocket-Protocol| header in its handshake, and the
value |xmpp| MUST be included in the list of protocols. The
reply from the server MUST also contain |xmpp| in its own
|Sec-WebSocket-Protocol| header in order for an XMPP
sub-protocol connection to be established.
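The negotiation rule above can be sketched as a pair of header checks. This is a minimal illustration, not part of the specification: the function names are hypothetical, and it assumes the header value has already been extracted from the handshake as a string (or None when the header is absent).

```python
from typing import List, Optional

XMPP_SUBPROTOCOL = "xmpp"


def parse_protocols(header_value: str) -> List[str]:
    """Split a Sec-WebSocket-Protocol header value into protocol names."""
    return [name.strip() for name in header_value.split(",") if name.strip()]


def server_select(client_header: Optional[str]) -> Optional[str]:
    """Return "xmpp" if the client offered it, otherwise None."""
    if client_header is None:
        return None
    if XMPP_SUBPROTOCOL in parse_protocols(client_header):
        return XMPP_SUBPROTOCOL
    return None


def client_accepts(server_header: Optional[str]) -> bool:
    """True only if the server's reply contains the xmpp sub-protocol."""
    if server_header is None:
        return False
    return XMPP_SUBPROTOCOL in parse_protocols(server_header)
```

A server would call server_select() on the client's header and echo the result; a client would call client_accepts() on the reply before treating the connection as an XMPP sub-protocol connection.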
Once the handshake is complete, WebSocket messages sent or
received will conform to the protocol defined in the rest of
this document.
Data frame messages in the XMPP sub-protocol MUST be of the
text type and contain UTF-8 encoded data. The contents of the
close control frame are specified later in this document.
Control frames other than close are not restricted.
Unless noted in text, the word "message" will mean a
WebSocket message containing a text data frame.
The first message sent after the handshake is complete MUST
be an XMPP opening stream tag as defined in XMPP, or an XML
text declaration (see Section 4.3.1 of the XML specification)
followed by an XMPP opening stream tag. The stream tag MUST
NOT be closed (i.e., the closing </stream:stream> tag is not
included in the message) as it is the start of the client's
outgoing XML. The '<' character of the tag or text
declaration MUST be the first character of the text payload.
The server MUST respond with a message containing an error,
its own opening stream tag, or an XML text declaration
followed by an opening stream tag.
Except in the case of certain stream errors, the opening
stream tag, <stream:stream>, MUST appear in a message by
itself.
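As an illustration of the rules for the first message, the following sketch builds an opening stream tag and checks a payload against the constraints stated above. The helper names, the choice of attributes, and the jabber:client default namespace are assumptions made for the example; the checks are simple string tests, not a full XML parse.

```python
from xml.sax.saxutils import quoteattr

STREAM_NS = "http://etherx.jabber.org/streams"
TEXT_DECLARATION = "<?xml version='1.0' encoding='UTF-8'?>"


def opening_stream_message(to_domain: str, declaration: bool = False) -> str:
    """Build the text payload of the first message (hypothetical helper)."""
    tag = (
        "<stream:stream"
        f" to={quoteattr(to_domain)}"
        ' version="1.0"'
        ' xmlns="jabber:client"'
        f' xmlns:stream="{STREAM_NS}">'
    )
    # The optional text declaration, when present, precedes the stream tag.
    return (TEXT_DECLARATION if declaration else "") + tag


def is_valid_first_message(payload: str) -> bool:
    """Check the constraints on the first message after the handshake."""
    # '<' must be the very first character of the text payload.
    if not payload.startswith("<"):
        return False
    # The stream tag must be left open in this message.
    if "</stream:stream>" in payload:
        return False
    return "<stream:stream" in payload
```

For example, opening_stream_message("example.com") yields a payload that passes is_valid_first_message(), while the same payload with leading whitespace or an appended closing tag does not.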
Stream level errors in XMPP are terminal. Should such an
error occur, the server MUST send the stream error as a
complete element in a message to the client.
If the error occurs during the opening of a stream, the
stream error message MUST start with an opening stream tag
(see Section 4.7.1 of the XMPP core specification) and end
with a closing stream tag.
After the stream error and closing stream tag have been
sent, the server MUST close the connection using the close
procedure defined in this document.
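The two stream error cases can be sketched as follows. This is an illustrative reading, not a normative one: the function names are hypothetical, and sending the closing stream tag as a separate message once a stream is already established is an assumption drawn from the rule that the closing tag normally appears in a message by itself.

```python
from typing import List

STREAM_NS = "http://etherx.jabber.org/streams"
ERROR_NS = "urn:ietf:params:xml:ns:xmpp-streams"
CLOSING_TAG = "</stream:stream>"


def stream_error_element(condition: str) -> str:
    """Serialize a complete <stream:error> element for a condition name."""
    return f'<stream:error><{condition} xmlns="{ERROR_NS}"/></stream:error>'


def stream_error_messages(condition: str, stream_established: bool) -> List[str]:
    """Return the WebSocket text messages a server would send, in order."""
    error = stream_error_element(condition)
    if stream_established:
        # Assumption: the error is one message, the closing tag another.
        return [error, CLOSING_TAG]
    # Error during stream opening: a single message that starts with an
    # opening stream tag and ends with a closing stream tag.
    opening = (
        '<stream:stream version="1.0" xmlns="jabber:client"'
        f' xmlns:stream="{STREAM_NS}">'
    )
    return [opening + error + CLOSING_TAG]
```

After sending the returned messages, the server would go on to close the WebSocket connection.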
Either the server or the client may close the connection at
any time. Before closing the connection, the closing party
MUST close the XMPP stream if it has been established. To
initiate the close, the closing party MUST send a normal
WebSocket close message with an empty body. The connection
is considered closed when a matching close message is
received (see Section 1.4 of the WebSocket protocol
specification).
Except in the case of certain stream errors, the closing
stream tag, </stream:stream>, MUST appear in a message by
itself.
Each XMPP stanza MUST be sent in its own message. A stanza
MUST NOT be split over multiple messages. All first-level
children of the <stream:stream> element MUST be treated
the same as stanzas (e.g. <stream:features> and
<stream:error>).
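A receiver could check the one-stanza-per-message rule roughly as shown below. This is a sketch under stated assumptions: the function name is hypothetical, and the payload is wrapped in a synthetic stream element (declaring the jabber:client and stream namespaces) purely so that a standard XML parser can resolve the stream: prefix.

```python
import xml.etree.ElementTree as ET

# Synthetic wrapper standing in for the already-open stream element.
_WRAPPER_OPEN = (
    '<stream:stream xmlns="jabber:client"'
    ' xmlns:stream="http://etherx.jabber.org/streams">'
)
_WRAPPER_CLOSE = "</stream:stream>"


def is_single_stanza(payload: str) -> bool:
    """True if the payload is exactly one complete first-level element."""
    try:
        root = ET.fromstring(_WRAPPER_OPEN + payload + _WRAPPER_CLOSE)
    except ET.ParseError:
        # A split or otherwise incomplete stanza is not well-formed.
        return False
    if len(root) != 1:
        return False
    # Reject character data before or after the single element
    # (surrounding whitespace is tolerated in this sketch).
    leading = (root.text or "").strip()
    trailing = (root[0].tail or "").strip()
    return not leading and not trailing
```

With this check, a complete message or presence stanza passes, as does a first-level element such as stream:features, while two stanzas in one payload or half of a stanza do not.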
After successful SASL authentication, an XMPP stream needs
to be restarted. In this case, as soon as the message
containing the success indication is sent (or received), both
the server and client streams are implicitly closed, and
new streams need to be opened. The client MUST open a new
stream in the same way as the initial stream and MUST NOT
send a closing stream tag.
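The restart behaviour on the client side might look like the sketch below. The function names are hypothetical, and the opening stream tag is passed in as a ready-made string; the only protocol detail relied on is the SASL success element in the urn:ietf:params:xml:ns:xmpp-sasl namespace.

```python
import xml.etree.ElementTree as ET
from typing import List

SASL_SUCCESS = "{urn:ietf:params:xml:ns:xmpp-sasl}success"


def is_sasl_success(payload: str) -> bool:
    """True if the message payload is a SASL <success/> element."""
    try:
        return ET.fromstring(payload).tag == SASL_SUCCESS
    except ET.ParseError:
        return False


def client_messages_after(received: str, opening_tag: str) -> List[str]:
    """Messages the client sends in response to a received payload.

    On SASL success the old streams are implicitly closed, so the client
    sends only a new opening stream tag and no closing stream tag.
    """
    if is_sasl_success(received):
        return [opening_tag]
    return []
```

The returned list never contains a closing stream tag, which reflects the implicit close described above.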
XMPP servers send whitespace pings as keepalives between
stanzas, and XMPP clients can do the same thing. These extra
whitespace characters are not significant in the protocol.
Servers and clients SHOULD use WebSocket ping messages
instead for this purpose.
The XMPP Ping extension allows
entities to send and respond to ping requests. A client
sending a WebSocket ping is equivalent to pinging the
WebSocket server, which may also be the XMPP server. When
the XMPP server is not also the WebSocket server, a
WebSocket ping may be useful to check the health of the
intermediary server.
TLS cannot be used at the XMPP sub-protocol layer because the
sub-protocol does not allow for raw binary data to be sent.
Instead, enabling TLS SHOULD be done at the WebSocket layer
using secure WebSocket connections via the |wss| URI scheme
(see Section 10.6 of the WebSocket protocol specification).
Because TLS is to be provided outside of the XMPP
sub-protocol layer, a server MUST NOT advertise
TLS as a stream feature (see Section 4.6 of
the XMPP core specification), and a client MUST ignore any
advertised TLS stream feature, when using the XMPP
sub-protocol.
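On the client side, the TLS rules above amount to two checks, sketched here with hypothetical function names. The sketch assumes the stream features arrive as one message payload and, as a parsing convenience only, wraps that payload in a synthetic stream element so the stream: prefix resolves.

```python
import xml.etree.ElementTree as ET
from typing import List
from urllib.parse import urlsplit

TLS_FEATURE = "{urn:ietf:params:xml:ns:xmpp-tls}starttls"
_WRAPPER_OPEN = (
    '<stream:stream xmlns="jabber:client"'
    ' xmlns:stream="http://etherx.jabber.org/streams">'
)
_WRAPPER_CLOSE = "</stream:stream>"


def uses_secure_websocket(uri: str) -> bool:
    """True if the connection URI uses the wss scheme."""
    return urlsplit(uri).scheme.lower() == "wss"


def usable_features(features_payload: str) -> List[str]:
    """List advertised feature tags, ignoring any TLS stream feature."""
    root = ET.fromstring(_WRAPPER_OPEN + features_payload + _WRAPPER_CLOSE)
    features = root[0]  # the <stream:features> element
    return [child.tag for child in features if child.tag != TLS_FEATURE]
```

A client that filters features this way never attempts TLS negotiation inside the sub-protocol, even when talking to a server that advertises it in error.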
Implications of, and recommendation to use, the XMPP Stream
Management extension to be added.
Examples will be added as soon as the WebSocket protocol
specification is more stable.
Since application-level TLS cannot be used, applications that
need to protect the privacy of the XMPP traffic need to do so
at the WebSocket or other appropriate layer.
The Security Considerations for both WebSocket (see Section
10 of the WebSocket protocol specification) and XMPP (see
Section 13 of the XMPP core specification) apply to the
WebSocket XMPP sub-protocol.
This specification requests IANA to register the WebSocket XMPP
sub-protocol under the "WebSocket Subprotocol Name" Registry
with the following data:
Subprotocol Identifier: xmpp
Subprotocol Common Name: WebSocket Transport for the Extensible
Messaging and Presence Protocol (XMPP)
Subprotocol Definition: RFC XXXX
[[NOTE TO RFC EDITOR: Please change XXXX to the number assigned
to this document upon publication.]]