Re: [hybi] WebSockets feedback (Was: Bayeux / Jetty perspective.)

Ian Hickson <ian@hixie.ch> Mon, 06 April 2009 08:27 UTC

Date: Mon, 06 Apr 2009 08:28:23 +0000
From: Ian Hickson <ian@hixie.ch>
To: Paul Prescod <paul@prescod.net>, Peter Saint-Andre <stpeter@stpeter.im>, Greg Wilkins <gregw@webtide.com>, Sylvain Hellegouarch <sh@defuze.org>, Alex Russell <slightlyoff@google.com>, Tony Garnock-Jones <tonyg@lshift.net>
In-Reply-To: <49D9008B.9060606@lshift.net>
Message-ID: <Pine.LNX.4.62.0904051912520.25058@hixie.dreamhostps.com>
References: <1cb725390904051000m45e9db66geb038acb452d6479@mail.gmail.com> <49D8FE80.2010100@lshift.net> <49D9008B.9060606@lshift.net>
Content-Language: en-GB-hixie
Content-Style-Type: text/css
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset="US-ASCII"
Cc: hybi@ietf.org
Subject: Re: [hybi] WebSockets feedback (Was: Bayeux / Jetty perspective.)
List-Id: Server-Initiated HTTP <hybi.ietf.org>

What I'd like to work on is "TCP for the Web".

I think one of the key points is that when adding things to the Web
platform, IMHO we want to add basic building blocks, rather than
comprehensive protocols.

For example, we have HTML, not DocBook. We have a generic scripting 
language, not a domain-specific language. There are libraries like jQuery, 
Prototype, Y!UI, etc, which provide richer environments for authors. In 
recent times we've added generic SQL database support; despite the main 
use case being client-side e-mail storage, we didn't provide, say, local 
IMAP support. We added generic JavaScript background threads; we didn't 
add, say, a high-level database synchronisation mechanism.

The idea is to provide features that allow people to build libraries that 
provide complex features. This is similar to how the POSIX platform 
provides basic building blocks (libc) and then libraries like libssl, 
libz, libcurl, etc, provide the high-level functionality.

So for instance with bidirectional communication, I think what we should 
be aiming for is something on the level of a TCP socket (modulo security 
requirements), not something like XMPP or HTTP. We're not trying to solve 
a specific problem, we're trying to make it possible to solve a broad 
range of specific problems by providing a simple building block: a socket, 
the Web way.


Now, I have no objection to Jabber or other high-level protocols being 
made available to Web pages, if they can be shown to be securable, but 
that's not the kind of problem I'm trying to solve. WebSocket isn't 
intended to compete with that kind of feature.


With that in mind:

On Fri, 3 Apr 2009, Paul Prescod wrote:
> On Fri, Apr 3, 2009 at 2:07 PM, Ian Hickson <ian@hixie.ch> wrote:
> >
> > [IM is] _an_ application, but I don't think it's the biggest. I would 
> > expect games to be bigger, for example, and would expect background 
> > update notifications in systems like Google Calendar to be even more
> > common.
> 
> Most real-time social games will have chat embedded or on the page 
> around them.

Sure, but they'll all have subtly different needs. I imagine there will be 
libraries that provide such functionality for many sites, but for example 
while GMail will want to provide chat features with all the users in the 
greater federated Jabber network, a site like casualcollective will want 
to only have chat within the users of the site, and during a game, only 
the users in the game. A game from the Rainbow 6 series will want to
distinguish between players who are still alive and players who are not;
a game from the Quake series will not. Some will
want some kinds of smileys, others might want audio and video. If we 
provide a protocol that is tuned to the purposes of one kind of chat, 
we'll fail to provide the features desired by another site, and sites will 
have to coerce the provided features into doing what they want.


> For example, how do I send a GIF over a Web Socket? HTTP and Jabber have 
> already solved that problem. How do I correlate a response to a message 
> with the original message?

If the solution HTTP has for sending a file is satisfactory, then 
presumably we don't need one for WebSocket. :-)

There are lots of solutions to the problem of sending files to a server, 
or sending messages in a variety of ways correlated with a request, like 
HTTP and Jabber as you mention, but also like FTP, IMAP, POP, BEEP, and a 
whole host of other high-level application protocols.

I'm interested in providing a feature that is the level below that, like 
TCP is to the above.


> > I'm not convinced that it's really a fair comparison, either. rHTTP as 
> > a protocol is remarkably complex, in that it includes by reference the 
> > whole of HTTP, which is a remarkably complex protocol.
> 
> I did say that the NEW specification is very short and simple.

It doesn't really matter how simple the "NEW" specification is; what
matters is the complexity the author is exposed to.


> > Actually using rHTTP for true full-duplex connections would be much 
> > more complicated, since you'd need to somehow get two separate 
> > connections from the client to the server into the same CGI script.
> 
> I don't know what you mean.
> 
> These services cannot be served with CGI. CGI is built around 
> request-response. Just as PHP scripts do not in general implement HTTP, 
> I wouldn't expect the handlers of these things to implement the 
> protocol.

On the contrary, I expect the wide variety of uses of this feature to lead 
directly to a situation where people write custom implementations all the 
time. That's one of the assumptions behind the Web Socket design.
Interoperable heterogeneity is good: it reduces the impact of any single
security bug, for instance.


> Here's how I expect it would work: there is a service optimized for 
> long-running TCP connections. Maybe it's written in C or Erlang. There 
> are PHP scripts (or whatever) for info coming in from web clients. When 
> PHP wants to tell the universe that "something happens" it puts a 
> message in a queue. Erlang or C delivers it.

That might work for some uses of bi-directional communication, but it 
wouldn't work for all of them. Luckily, this is the kind of thing that is 
easily built on top of the Web Socket protocol.
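
A minimal sketch of that queue-and-deliver arrangement, assuming the
long-running service holds one writable connection per client. The
Broadcaster class, its method names, and the use of io.BytesIO as a
stand-in for real sockets are all illustrative assumptions; the frame
bytes follow the 0x00 ... 0xFF text framing described elsewhere in this
message.

```python
import io
import queue


class Broadcaster:
    """Hypothetical long-running service that fans queued messages out."""

    def __init__(self):
        self.outbox = queue.Queue()  # filled by request handlers
        self.clients = []            # writable connections (sockets in practice)

    def publish(self, text):
        """Called by a request handler when "something happens"."""
        self.outbox.put(text)

    def deliver_pending(self):
        """Send every queued message to every client; return the count sent."""
        sent = 0
        while True:
            try:
                text = self.outbox.get_nowait()
            except queue.Empty:
                return sent
            frame = b"\x00" + text.encode("utf-8") + b"\xff"
            for client in self.clients:
                client.write(frame)
            sent += 1


hub = Broadcaster()
hub.clients = [io.BytesIO(), io.BytesIO()]  # stand-ins for two open sockets
hub.publish("calendar updated")
hub.publish("new mail")
print(hub.deliver_pending())
print(hub.clients[0].getvalue())
```

A real service would run the delivery loop in its own thread or event
loop and would have to cope with slow or disconnected clients; none of
that is shown here.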


> Web developers do not want to become protocol handler developers and 
> they should not. The protocol should not be designed with the 
> presumption that they will be protocol developers. In fact: in my more 
> than 10 years of web development experience, I don't recall ever 
> implementing an HTTP client or server.

Certainly it would be reasonable for people to write servers that can be 
extended in that manner, but that doesn't mean we should preclude the 
possibility of people writing their own. Just like how with CGI scripts it 
is common for libraries to be used for interpreting the CGI environment 
variables, but people still sometimes write their own.


> What I have done frequently though, is trace the text flowing on the 
> wire, which is why a text-based protocol is preferable to me than a 
> binary one.

The Web Socket protocol is trivial to understand (0x00 indicates the start 
of a frame, 0xFF the end, and everything in between is ASCII/UTF-8) so I 
really don't think that it will be hard to debug on the wire.
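
A minimal sketch of that text framing, assuming nothing beyond what the
paragraph above states (0x00, then UTF-8, then 0xFF). The function names
are illustrative, and the handshake that precedes any frames is not
shown.

```python
def encode_text_frame(text: str) -> bytes:
    """Wrap text in a sentinel frame: 0x00, UTF-8 payload, 0xFF."""
    # The byte 0xFF never occurs in valid UTF-8, so it is a safe terminator.
    return b"\x00" + text.encode("utf-8") + b"\xff"


def decode_text_frames(stream: bytes) -> list[str]:
    """Split a byte stream into the text frames it contains."""
    frames = []
    pos = 0
    while pos < len(stream):
        if stream[pos] != 0x00:
            raise ValueError(f"expected 0x00 at offset {pos}")
        # bytes.index raises ValueError if the frame is unterminated.
        end = stream.index(b"\xff", pos + 1)
        frames.append(stream[pos + 1:end].decode("utf-8"))
        pos = end + 1
    return frames


wire = encode_text_frame("hello") + encode_text_frame("caf\u00e9")
print(wire)
print(decode_text_frames(wire))
```

Printing the raw bytes shows why tracing is straightforward: the payload
is readable text with a single marker byte on either side.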


> > With Web Socket, there is only one underlying TCP connection.
> 
> So what? How does that make my app simpler? Are you expecting PHP 
> programmers to write long-running TCP server apps?

Yes.


> I don't understand why it matters whether it is one connection or two.

Having two connections is IMHO a misuse of the underlying protocol (TCP).


> > - It's not clear that Jabber's protocol supports the opt-in security
> >   model that is desired for an HTML API.
> 
> Not sure what that means.

We need to make sure that whatever we provide is consistent with the 
browser "same-origin" security model. It's not clear to me that Jabber is 
compatible with this model, since it assumes that the user agent is 
acting on behalf of the user.


> > Also, it's not clear to me that streaming XML is a particularly good 
> > solution. Experience with Web authors suggests that XML's draconian 
> > error handling and XML's namespaces are more confusing than 
> > desirable.
> 
> The web developer is just putting messages into a queue. It's the 
> protocol handler (like Apache, ejabberd, postfix) that talks the 
> protocol.
> 
> Can you point to a single protocol deployed on the Internet that is 
> commonly implemented and re-implemented and then re-re-implemented in 
> scripts, rather than being packaged in servers in libraries?

IRC. The multitude of multiplayer game protocols. The variety of protocols 
layered on top of <iframe>+<script>+XHR. The name/value pairs used in form 
data to write applications over HTML+forms.

The whole point here is to enable more of this for Web application 
developers. Right now they have no way to invent their own protocol at the 
TCP level.


On Fri, 3 Apr 2009, Paul Prescod wrote:
>
> By the way, you mentioned calendars: calDAV is the web protocol for 
> manipulating them. It might be a good fit for reverse HTTP.

We're not talking about manipulating calendars, we're talking about the UI
side of an application talking to its logic backend. Other examples would 
be Google Docs getting a user's file list, or updating a document or 
spreadsheet as it is being edited. AtomPub and CalDAV wouldn't be 
appropriate, because they are generic inter-application protocols, not 
application-specific internal protocols tuned for the needs of those 
specific applications.


On Fri, 3 Apr 2009, Peter Saint-Andre wrote:
> > 
> >  - It's not clear that one can easily upgrade from HTTP to a 
> >    bidirectional Jabber connection.
> 
> Why upgrade from HTTP to XMPP?

Because in some environments HTTP is all we have, so we need a migration 
path from an HTTP connection to the two-way solution.


> Just build XMPP support into the browser, and you can refer to certain 
> resources as XMPP URIs/IRIs and other resources as HTTP URLs. Or the 
> client (browser) and server can negotiate an XMPP channel between 
> themselves for more advanced bidirectional functionality.

As noted earlier, I have no objection to this happening as well. But it 
doesn't solve all the needs. For example it wouldn't be a very efficient
way of doing the train control I mentioned. It also would be a pretty poor 
transport for live editing updates as seen in Google Docs -- if the user 
hits one key, and so you need to send one byte to the server, a 200% 
overhead (as with Web Sockets) is already pretty extreme. Forcing the 
application author to frame each character in XML elements and namespaces 
and so forth would not be our proudest moment.
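
The 200% figure can be checked with a little arithmetic: one payload byte
plus two framing bytes. The sketch below does that, and compares it with
an XML wrapper that is purely hypothetical (the element name and
namespace URL are invented for illustration and are not taken from XMPP
or any other spec).

```python
def overhead_percent(payload: bytes, framed: bytes) -> float:
    """Framing overhead as a percentage of the payload size."""
    return 100.0 * (len(framed) - len(payload)) / len(payload)


key = b"a"  # the single byte the user typed

# Sentinel framing: 0x00 before the payload, 0xFF after it.
sentinel_frame = b"\x00" + key + b"\xff"

# Hypothetical XML wrapper, invented only for comparison.
xml_frame = b"<key xmlns='http://example.com/edit'>" + key + b"</key>"

print(overhead_percent(key, sentinel_frame))  # 200.0
print(overhead_percent(key, xml_frame))
```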


> >  - It's not clear that Jabber's protocol is designed to be resistant to 
> >    the attacks we expect (such as connecting to an SMTP server, or an 
> >    existing Jabber server, or whatnot).
> 
> Have you completed a threat analysis for the projected use cases? I 
> don't see anything about that in draft-hixie-thewebsocketprotocol-07 but 
> perhaps it is defined elsewhere.

I don't have anything written down, but the protocol was examined in this
context, yes. I would welcome an independent analysis though.


> > Experience with Web authors suggests that XML's draconian error 
> > handling and XML's namespaces are more confusing than desirable.
> 
> Whose experience? Where is this documented?

For draconian error handling, the disaster that is XML deployment on the
Web is the primary documentation.

For namespaces, see: http://wiki.whatwg.org/wiki/Namespace_confusion


> The XMPP developer community has produced a large volume of software 
> (see links above) and runs a large network of servers with millions of 
> IM users. So far we have not experienced any significant interop 
> problems.

Sure, but you're all competent. We're talking about letting random authors 
who are barely able to write CGI scripts create new interactive 
applications here. More than 95% of HTML pages are invalid in some way. 
Simplicity is key here.


> > - The protocol should support running over ports 80 and 443,
> >   ideally with the ability to share the port on the server
> >   with an HTTP server.
> 
> Why? Using a different port for a different protocol seems like the 
> proper approach.

Because in practice, there are environments that block all but port 80, 
and applications like GMail won't use solutions that don't work in those 
environments too.

(Note that Web Sockets default to another port; the requirement is only 
that ports 80 and 443 be supported, not required or the default.)


On Sat, 4 Apr 2009, Greg Wilkins wrote:
> 
> If the IETF does come out with a bidirectional HTTP recommendation, it 
> will be a huge once in a decade opportunity to upgrade the protocols 
> supported by the network infrastructure.

Why would it be a once in a decade opportunity? If there is a need, then 
it can be filled. It's not like we can only create new protocols when the 
planets align in some particular way.


> Websockets sentinel UTF-8 datagrams may well be ideally suited for a 
> specific type of application, but can we really say that this will be 
> the only content type that we need to transport asynchronously server to 
> client for the next decade or two?

The protocol is generic enough (it can do text and known-length binary out 
of the box, and is easily extensible to handle multiplexed infinite 
streams, structured data, finite unknown length binary data via chunking, 
etc), and simple enough (it has virtually no features, leaving everything 
up to the application) to be usable for most bi-directional use cases for 
Web applications.


> What if UTF-16 becomes the preferred encoding? or gzipped UTF-16 or some 
> as-of-yet-un-invented compression algorithm specially designed for 
> packets of UTF-16 data holding JSON becomes in common usage.

We can use the known-length encoding mechanism to transport all of those, 
so they would be easy extensions.


> I'm already working with clients that wish to push images.

We can use the known-length encoding mechanism supported by the Web Socket
protocol as designed today to handle any kind of binary data.
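
A minimal sketch of a known-length frame, assuming this layout: a type
byte with its high bit set, then the payload length written in base 128
(most significant group first, with the high bit of each length byte
meaning "more length bytes follow"), then the payload itself. The type
value 0x80 and the function names are illustrative assumptions; check
the draft for the authoritative byte layout.

```python
BINARY_FRAME_TYPE = 0x80  # assumed type value; any byte with the high bit set


def encode_length(n: int) -> bytes:
    """Encode n in base 128, big-endian; high bit set on all but the last byte."""
    if n < 0:
        raise ValueError("length must be non-negative")
    groups = [n & 0x7F]  # final byte: high bit clear
    n >>= 7
    while n:
        groups.append((n & 0x7F) | 0x80)  # earlier bytes: high bit set
        n >>= 7
    return bytes(reversed(groups))


def encode_binary_frame(payload: bytes) -> bytes:
    """Type byte, then the encoded length, then the raw payload."""
    return bytes([BINARY_FRAME_TYPE]) + encode_length(len(payload)) + payload


def decode_binary_frame(data: bytes) -> tuple[bytes, bytes]:
    """Return (payload, remaining bytes) for one known-length frame."""
    if not data or not data[0] & 0x80:
        raise ValueError("not a known-length frame")
    length = 0
    pos = 1
    while True:
        b = data[pos]  # IndexError here means the length field is truncated
        pos += 1
        length = length * 128 + (b & 0x7F)
        if not b & 0x80:
            break
    if len(data) < pos + length:
        raise ValueError("truncated frame")
    return data[pos:pos + length], data[pos + length:]


image_bytes = bytes(range(256)) + b"\xff\x00"  # arbitrary data, including 0xFF
frame = encode_binary_frame(image_bytes)
payload, rest = decode_binary_frame(frame)
print(len(frame), payload == image_bytes, rest)
```

Because the receiver counts bytes rather than scanning for a terminator,
the payload may contain any byte value, which is what makes this form
suitable for images or compressed data.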


> If web socket supports arbitrary binary data, then why is there a need 
> for the sentinel mechanism and special status for UTF-8 data?

There's a need for a "this is text" frame separate from the "this is
arbitrary data" frame because at the API level the only datatype today is text.
(This is why the binary mode frames are currently discarded -- JS doesn't 
yet support representing the binary data.)

Once you're sending binary data, you want to avoid length markers because 
authors are very much more likely to get measuring UTF-8 data wrong
compared to just sticking a 0xFF byte on the end.


> Note also with websocket binary transfer, I believe it is a limitation 
> that there cannot be meta data (headers) associated with a datagram. 
> Thus information like mime types, char-sets content-encodings and 
> transfer encodings will need to be handled with stateful mechanisms 
> based on assumption, connection and/or previous sent datagrams.

There can be as much metadata as the application layer wants -- just put
it before the data, using an application-specific format.

This is just like TCP: there's no room for MIME types. Protocols that
use TCP, like HTTP, layer MIME types on top. This isn't a failing of TCP.
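
A minimal sketch of application-level metadata carried inside a text
message, assuming one possible application-specific format: a single
line of JSON, a newline, then the body. The format, the field names, and
the function names are all invented for illustration; nothing here is
part of the protocol.

```python
import json


def pack_message(meta: dict, body: str) -> str:
    """Put a one-line JSON header in front of the body."""
    # json.dumps without indent emits no raw newlines, so the first
    # newline in the message always marks the end of the header.
    return json.dumps(meta) + "\n" + body


def unpack_message(message: str) -> tuple[dict, str]:
    """Split a packed message back into (metadata, body)."""
    header, _, body = message.partition("\n")
    return json.loads(header), body


msg = pack_message({"type": "text/plain", "id": 7}, "line one\nline two")
meta, body = unpack_message(msg)
print(meta["type"], meta["id"])
print(body)
```

The packed string would then be sent as the payload of an ordinary text
frame, so each message carries its own metadata and no connection-level
state is needed.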


On Fri, 3 Apr 2009, Alex Russell wrote:
> >
> > I'm not convinced that it's really a fair comparison, either. rHTTP as 
> > a protocol is remarkably complex, in that it includes by reference the 
> > whole of HTTP, which is a remarkably complex protocol.
> 
> That sort of depends on which version you're talking about. HTTP 1.0 is 
> relatively simple, whereas 1.1 is hugely complicated, particularly when 
> it comes to accommodating chunked encoding and pipelining.

HTTP 1.0 is relatively simple compared to 1.1, but it has all kinds of 
weird subtleties, like header continuations, the many verbs (GET, POST,
etc) and response codes (200, 404, etc), that I wouldn't be at all
surprised to see misimplemented by random authors. Getting a complete 
conforming implementation is non-trivial, even for 1.0.


> If you're looking to have the complexity of 1.0 as a high-water-mark, 
> that would seem perfectly reasonable. People are going to mis-implement 
> whatever you spec anyway.

Granted, but it's actually pretty hard to misimplement WebSockets as 
specced now and still have it work in the common case. (Though I'm sure 
creative Web designers will prove me wrong.)

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'