[hybi] WebSocket feedback

Ian Hickson <ian@hixie.ch> Fri, 24 April 2009 09:50 UTC

Date: Fri, 24 Apr 2009 09:51:40 +0000
From: Ian Hickson <ian@hixie.ch>
To: hybi@ietf.org

Sorry about my absence from this list the past few days; I've been busy 
with HTML5 work. I put aside a few e-mails on the topic of WebSocket, 
which I reply to below:

On Mon, 6 Apr 2009, Peter Saint-Andre wrote:
> On 4/6/09 2:28 AM, Ian Hickson wrote:
> > What I'd like to work on is "TCP for the Web".
> 
> That's what we tried to define with BOSH. Whether we succeeded is an 
> open question.

I think reusing TCP's semantics (e.g. the tight coupling of the two 
directions of communication into one TCP channel) is important in making a 
"TCP for the Web" -- I don't think that a TCP analogue should map to 
multiple TCP connections underneath.

On Tue, 7 Apr 2009, Mridul Muralidharan wrote:
>
> Note that browser based javascript clients are not the only consumers of 
> bi-directional http; others like mobile clients, firewalled thick 
> clients, etc are also possible (and actually exist with BOSH for 
> example).
>
> So possibly targeting specific implementation technology might not be 
> appropriate for designing the spec ?

Why wouldn't mobile clients, firewalled thick clients, etc, just use a raw 
TCP socket?

The only reason we're not looking at raw TCP sockets when it comes to what 
to provide JavaScript Web apps access to is that TCP doesn't natively 
support the browser's security model. If it wasn't for that, raw TCP is 
exactly what we would have by now -- in fact the SVG WG even specced an 
API out for exactly that in SVG 1.2 (before the security problems with 
doing that were brought up).

On Tue, 7 Apr 2009, Sylvain Hellegouarch wrote:
> 
> In this context I can understand why your draft is so short and that's 
> probably a good sign because it'll be that easier to see deployed. 
> Nonetheless this long discussion makes me wonder whether or not having a 
> new protocol is a better idea than picking one of the existing 
> alternatives (Comet, BOSH). Wouldn't it be better to focus energy on 
> best practices when it comes to long polling connections based on what's 
> working now?

Adding something new doesn't prevent things that already work from 
working.

On Wed, 8 Apr 2009, Mark Lentczner wrote:
> 
> I see the attraction of just giving something very simple to application 
> developers. Giving them a protocol that does Minimal Events is easy to 
> explain and would have a simple API. However, I have seen that over 
> time, applications often need to build most of the framework that is 
> already provided by more complex abstractions. Where the cost of those 
> abstractions is small, and the ability to use them in simpler ways easy, 
> it is often better to supply a richer abstraction to all, since the 
> uniformity of the often implemented bits means better code for all.

That's what libraries are for. Core platforms (POSIX, Win32, Cocoa, the 
Web) provide fundamental building blocks, on top of which libraries are 
built that expose many different possible abstractions.

It's better that way, as it gives authors more choice as to how to do 
things. Someone might want a comprehensive protocol with all kinds of 
metadata and channels and reconnection support and so on; someone else 
might want a different set of abstractions based more on resources with 
identifiers and types. Why should we force one to use the protocol the 
other wanted? Better to provide an infrastructure on top of which both of 
these can be built.

Having said that, if people want higher-level protocols standardised, I've 
nothing against that.

> Example: Implementing a Minimal, Event protocol with Message, Req/Resp 
> is just a matter of accepting the overhead of the framing of the simple 
> data blob, and the cost of the minimal response message.

It also probably means dealing with a framework that's more than a hundred 
lines of Perl (say), which puts a lot of burden on the developer, and 
probably the inexperienced one at that.

> On the other hand, implementing a Message, Req/Resp. protocol with 
> Minimal, Events, means building a whole framework for encoding message 
> parts (address, meta-data and data) into the minimal data units 
> (strings? binary blobs?) and coming up with a way to correlate responses 
> to requests, etc... all in the application layer.

Right, but in this case it's likely the experienced developer.

On Thu, 9 Apr 2009, Peter Saint-Andre wrote:
> 
> The same-origin model is often invoked here. Is there a good definition 
> of it? Perhaps it would be helpful for someone to write a small I-D 
> about this.

The HTML5 spec defines it formally, but I don't know of a simple 
introduction to it.

Basically it consists of partitioning untrusted code based on tuples that 
consist of hostname or IP address, port, and protocol (scheme). Code is 
then limited to dealing with resources from the same origin (with some 
historical exceptions, like the ability to invoke script from any origin 
in the context of one's own origin).

Cross-origin mechanisms work on the principle of code from one origin 
opting into communication with code from another origin, with 
communication only being possible if both sides opt in.
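
As a rough illustration of the partitioning described above, here is a 
minimal Python sketch that reduces a URL to its (scheme, host, port) tuple 
and compares two such tuples. The function names and the small table of 
default ports are assumptions made for this example; the formal definition 
in the HTML5 spec covers more cases than this does.

```python
from urllib.parse import urlsplit

# Default ports for the schemes used in this sketch (an illustrative subset).
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) tuple that identifies a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fall back to the scheme's default port when none is given explicitly.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """True when both URLs share scheme, host, and port."""
    return origin(url_a) == origin(url_b)
```

With this sketch, http://example.com/a and http://example.com:80/b compare 
as the same origin, while changing the scheme, the host, or the port gives 
a different one.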

On Thu, 9 Apr 2009, Linner, David wrote:
>
> Looking into the hybi wiki and the use cases listed there (e.g. 
> collaborative editing, multiplayer games) I get the feeling state 
> synchronization is quite important. So, I think a shared memory 
> interaction model would be suitable for these use cases. Implementing 
> such an interaction model on top of bidirectional messaging between 
> browser and server is not a problem, but a realization as JavaScript 
> library would result in a significant loss of performance.

In practice JavaScript performance has improved by orders of magnitude in 
the recent past and shows every sign of continuing to be the focus of 
serious performance work, so I don't think that JavaScript libraries are 
likely to be a performance bottleneck in the future.

On Fri, 10 Apr 2009, Greg Wilkins wrote:
> 
> However, IF we are in the game of picking winners or designing new 
> protocols, we could do a lot worse than starting with a close inspection 
> of BEEP and the concerns that they address.

BEEP was amongst the many protocols that I examined closely when writing 
the WebSocket proposal.

> The standard HTTP security model would be used to establish a connection 
> and then the upgrade is unable to break out of that connection etc.

What is the standard HTTP security model? WebSocket wasn't able to use it, 
nor was XHR2; in both cases we had to add an extra layer to get 
cross-origin restrictions into HTTP. Are we missing something?

On Mon, 13 Apr 2009, Jonas Sicking wrote:
> 
> But the point is that the type is transferred at a specified location. 
> And IMHO that specified location should be separated from other fields 
> with other purposes. My understanding is that the path is intended to 
> allow for several websocket services from the same origin, thus I would 
> prefer if the protocol is separate from the path.

I've added an optional protocol ID to WebSocket, which is sent along with 
the handshake and (if present) must be present in the response back from 
the server.

On Tue, 14 Apr 2009, Jamie Lokier wrote:
> 
> But it would be nice if the (imho highly desirable ;-) features from my 
> protocol were available to others without them needing yet another big 
> Javascript library in future browsers, and then only perhaps with one 
> web server and application backend.

Why would it be better for it to be a big bunch of new code in the 
browser, with perhaps only one web server and application backend, rather 
than a big bunch of new code in a JS library, with perhaps only one web 
server and application backend?

> But they don't look compelling for "I want to write an attractive web 
> application with lots of updating documents and tiny widgets without 
> using a pre-existing application framework or large bit of Javascript".

Writing an attractive web application with lots of updating documents and 
tiny widgets without using a large bit of Javascript seems optimistic.

On Wed, 15 Apr 2009, Greg Wilkins wrote:
> 
> while workers might make such sharing technically possible, I think 
> there is still an issue since the websocket API is both an application 
> API and a protocol API.  Developers who just want UTF-8 datagrams and 
> don't care about the transport will use the websocket API directly and 
> will thus inadvertently make a decision about the creating a connection 
> to the server!.

Well, it's their server. Maybe they don't mind.

> I think in this regard websocket API is a step in the wrong direction as 
> it makes the developer directly request a connection.  Instead they 
> should be directly requesting a channel and it should be up to the 
> browser to decide if that is a new connection or a shared multiplexed 
> connection.

IMHO, taking control away from the author like this isn't a good idea. 
Giving authors a shared worker and low-level sockets gives them much more 
flexibility than imposing a particular multiplexing design on them.

On Thu, 16 Apr 2009, Greg Wilkins wrote:
> 
> It is my experience in deploying many Comet Ajax Push applications that 
> scalability is limited either by message rate (for apps like stock 
> tickers) or by number of connections (apps like auctions, IM and/or 
> chat). Connections are a big problem and will only become more so if 
> creating them become easier.

But the tools to fix the problem are available to the author. It's not a 
problem if the author can fix it. It's not like we're preventing authors 
from using multiplexed connections.

> If sockets are to be shared, then it will have to be as a result of 
> those applications and frameworks voluntarily and explicitly taking part 
> in some multi-plexing scheme - just like in the good old days when 
> multi-tasking was done by applications voluntarily giving back the CPU!

The difference is that the resource here is per-server anyway. Cooperative 
multitasking on a per-process level is a perfectly reasonable strategy 
(indeed it underlies "modern" techniques like fibers); it's only a problem 
when it is the mechanism used to share the CPU between different 
processes, where it allows one to take down another.

On Fri, 17 Apr 2009, Greg Wilkins wrote:
> 
> The spec really needs to have the key concepts described in introductory 
> sections rather than by pseudo code repeated several times in the body 
> of the document.

The introduction section is yet to be written, but will be in due course.

> Another problem is that the handshake is defined by explicitly saying 
> what bytes to send: [...] That's a really long winded way of saying send 
> a HTTP/1.1 request with some headers set.

There's a big difference, which is that the exact bytes matter, as the 
client will check that they are sent as described exactly.

> More over, that specification implies that the order has to be:
> 
>  GET /path HTTP/1.1
>  Upgrade: WebSocket
>  Connection: Upgrade
>  Host: acme.com
>  Origin: origin
> 
> and thus it would be illegal to send:
> 
>  GET /path HTTP/1.1
>  Origin:                origin
>  Host:
>    acme.com:80
>  Upgrade:                 WebSocket
>  Connection:      Upgrade

Yes, it is indeed illegal as a WebSocket handshake. This is required 
because if we allowed any correct HTTP request/response that happened to 
have the right semantics, it would be significantly easier to smuggle 
messages to non-HTTP servers.
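
To make the difference concrete, here is a minimal Python sketch of a 
byte-exact check, using the field order quoted above. The function names 
and the example origin value are hypothetical, and this is only an 
illustration of the idea, not the text of the spec.

```python
def build_handshake(path, host, origin):
    """Build the opening request with fields in one fixed order and spelling."""
    lines = [
        f"GET {path} HTTP/1.1",
        "Upgrade: WebSocket",
        "Connection: Upgrade",
        f"Host: {host}",
        f"Origin: {origin}",
        "",  # the two empty entries produce the terminating CR LF CR LF
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def is_exact_handshake(received, path, host, origin):
    """Accept only a byte-for-byte match; no HTTP-style leniency."""
    return received == build_handshake(path, host, origin)
```

A request that carries the same fields in a different order, or with extra 
whitespace, is valid HTTP but fails this comparison, which is the property 
that makes it harder to coax an arbitrary HTTP client or server into 
producing an acceptable handshake.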

On Fri, 24 Apr 2009, Greg Wilkins wrote:
> 
> It would be great if the websocket proposal could include standard 
> definitions for mime encoded datagrams.
> 
> Current frame types are:
> 
>   0x00  - sentinel framed UTF-8 message
>   0x80  - length framed binary data.
> 
> I'd like to see two additional frame types supported
> by default:
> 
>   0x01  - sentinel framed UTF-8 encoded MIME message
>   0x81  - length framed MIME message.
> 
> Both these data types would contain a data that commenced with a 
> standard mime header (RFC 2045).  The header is optional and terminated 
> by CR LF CR LF.  Thus these types have a minimal overhead of 4 bytes.

Why can't the 0x00 and 0x80 frames be used for this?
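
For example, a MIME-style header can simply travel inside the payload of an 
ordinary 0x00 frame. The Python sketch below assumes the sentinel that ends 
a 0x00 frame is the byte 0xFF (a byte that never occurs in valid UTF-8); 
that detail, and all of the function names, are assumptions made for 
illustration rather than quotations from the draft.

```python
def encode_text_frame(text):
    """Wrap UTF-8 text in a 0x00 ... 0xFF sentinel frame (0xFF is assumed)."""
    return b"\x00" + text.encode("utf-8") + b"\xff"


def decode_text_frame(frame):
    """Strip the frame type byte and the sentinel, then decode the payload."""
    if not (frame.startswith(b"\x00") and frame.endswith(b"\xff")):
        raise ValueError("not a sentinel-framed text frame")
    return frame[1:-1].decode("utf-8")


def wrap_mime(content_type, body):
    """Application-level convention: header, then CR LF CR LF, then the body."""
    return f"Content-Type: {content_type}\r\n\r\n{body}"


def unwrap_mime(message):
    """Split a message into a header dict and a body at the first blank line."""
    header, _, body = message.partition("\r\n\r\n")
    headers = dict(line.split(": ", 1) for line in header.split("\r\n") if line)
    return headers, body
```

Here the framing layer knows nothing about MIME; the header convention 
lives entirely in the two small helper functions, which is the kind of 
thing a library on each side could provide.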

> The websocket API would need to be slightly extended to support some 
> common types of message.

Why can't the existing API be wrapped by a library to offer this?

Generally speaking I am strongly against the idea of the Web platform APIs 
doing things for convenience rather than adding actual new features. As 
noted above, I think it is important for browsers to provide a core 
feature set but not to succumb to feature creep where they provide all 
the features that authors will ever need.

Christopher Blizzard of Mozilla said it in the context of O3D recently:

   http://www.0xdeadbeef.com/weblog/?p=1223

I think his point applies very much here.

On Thu, 16 Apr 2009, Greg Wilkins wrote:
> Ian Hickson wrote:
> > I don't understand the relevance of HTTP to this discussion.
> 
> HTTP is our starting point.

TCP is my starting point.

> It is how our browsers initiate connections

HTTP is how browsers request resources. That this is layered on top of 
connections is IMHO mostly academic; HTTP isn't a connection protocol.

> It is the protocol that is most likely to get through firewalls

SSL seems even more likely to go through reliably.

> It is a brilliant example of how an extensible flexible protocol can
> adapt to changing usage.

It certainly has been extended a lot.

> So if you want to have a bidirection full-duplex protocol, then the 
> chances are the connection will be created by a HTTP client talking to a 
> HTTP server and then doing an upgrade.

That's certainly one scenario that seems useful to support, yes.

> For that upgrade to be successful, we are going to need to engage all 
> the firewall/gateway/proxy/cache venders and developers so that they 
> will support upgrade either to a specific protocol or to a class of 
> protocols.

Or we can tunnel it over SSL.

> If websocket is just one of many possible upgrade targets, then I don't 
> think that is a problem and websocket can continue to focus on it's 
> prime use-cases. However, if websocket is to be the only widely 
> supported upgrade protocol, then I think it needs to learn some lessons 
> from HTTP as to how to keep most people mostly happy most of the time.

I don't see why WebSocket would be the only widely supported upgrade 
protocol. I do think each protocol should be considered, developed, and 
implemented on its own merits.

On Thu, 9 Apr 2009, Peter Saint-Andre wrote:
> > 
> > Sure, if you can use libraries then implementing any protocol is 
> > trivial. The point is that the protocol should be simple enough that 
> > one doesn't need to rely on libraries to get a compliant 
> > implementation.
> 
> Where does this requirement come from? I have never seen a similar 
> requirement in any RFC.

It comes from experience seeing all those RFCs get implemented in 
incredibly buggy ways over and over again.

> And are you assuming that developers should be able to implement this 
> protocol without any libraries whatsoever? Not even for TCP, UTF-8, or 
> anything else?

I wouldn't expect people to implement the UTF-8 layer themselves, though 
if they don't need to transfer user text (or if they just pass the user 
text unmodified to a database, or some such) then they wouldn't even need 
that. Both UTF-8 and TCP are core parts of most system APIs these days.

On Tue, 14 Apr 2009, Maciej Stachowiak wrote:
> 
> I think multiplexing multiple message streams, though it may seem useful 
> at the protocol level, is not a critical building block.

I agree; I think it is easily enough implemented as a layer on both sides 
(just prefix each message with a number giving the channel ID, and 
dispatch incoming messages to registered listeners based on ID).
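
The layer described above can be sketched in a few lines of Python. The 
class name, the "id:message" text format, and the send_raw callback (which 
stands in for whatever sends one message over the underlying socket) are 
all assumptions made for this example.

```python
class Multiplexer:
    """Share one message-based connection between numbered channels."""

    def __init__(self, send_raw):
        # send_raw is a hypothetical callable that sends one text message.
        self._send_raw = send_raw
        self._listeners = {}

    def listen(self, channel_id, callback):
        """Register a callback for messages arriving on channel_id."""
        self._listeners.setdefault(channel_id, []).append(callback)

    def send(self, channel_id, message):
        """Prefix the message with its channel ID and send it."""
        self._send_raw(f"{channel_id}:{message}")

    def on_raw_message(self, raw):
        """Split off the channel ID and dispatch to registered listeners."""
        prefix, _, message = raw.partition(":")
        for callback in self._listeners.get(int(prefix), []):
            callback(message)
```

The same class can be used on both ends: each side calls send() with a 
channel number and feeds everything it receives into on_raw_message(). 
Messages for channels with no listener are silently dropped in this sketch.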

On Tue, 14 Apr 2009, Greg Wilkins wrote:
> 
> implementing multiplexing is not the pain point as there are a million 
> ways to do it, ranging from trivial to amazingly complex.

Exactly. That's why we shouldn't bake one into the spec, IMHO.

> Websocket presents a nice simple interface that is easy for web developers
> to use directly.  So they will use it directly!

Why won't people use libraries on the client, as you argue they will to 
use WebSocket on the server?

> There is ZERO chance that all these diverse communities will come together
> and decide on how to share and multiplex connections.

Why is that a problem? We don't need everyone doing the same thing. We 
only need each site to be consistent within the site.

On Tue, 14 Apr 2009, Jamie Lokier wrote:
> 
> It's easy to write a simple HTTP server using a select() loop too, if 
> you know unix, or Perl, or Python.  There are lots of bad ones around to 
> prove it.  It's even easier with fork().

It's a lot more than 100 lines of code to write a compliant HTTP server 
in Perl.

On Thu, 16 Apr 2009, Greg Wilkins wrote:
> 
> I don't know if limiting connections on the server was a motivation for 
> the 2 connection limit, but as a developer of a HTTP server, I 
> appreciate the limit as every connection to the server consumes some 
> kernel resources and buffers.

The Web Socket protocol spec has been updated to clarify that there is no 
limit to the number of connections, but that user agents should never 
actually open more than one _new_ connection to the server at a time (so 
the maximum number of unexpected connections per client that a server can 
see is one).
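
One way a client could behave like this is to serialise only the opening 
step per server, while leaving established connections unlimited. The 
Python sketch below uses one asyncio lock per host; the class name and the 
handshake coroutine passed in are hypothetical, and this is an illustration 
of the rule rather than how any user agent implements it.

```python
import asyncio
from collections import defaultdict


class ConnectionOpener:
    """Allow at most one connection attempt in progress per host."""

    def __init__(self):
        # One lock per host, created lazily on first use.
        self._locks = defaultdict(asyncio.Lock)

    async def open(self, host, handshake):
        """Run handshake(host) while holding that host's lock.

        handshake is a caller-supplied coroutine function that establishes
        the connection and returns it. The lock is released as soon as the
        handshake finishes, so the number of open connections is not capped.
        """
        async with self._locks[host]:
            return await handshake(host)
```

If several parts of a page call open() for the same host at once, the 
attempts are made one after another, so the server never sees more than 
one connection from this client that is still mid-handshake.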

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
