There have been a number of suggestions recently for changing how the
handshake works in Web Sockets. I'd rather avoid doing this unless we can
get a significant improvement out of it, so I'd like to solicit feedback
on how much of an improvement such changes would result in.
The main suggestions are:
* Using a challenge instead of a fixed sequence of bytes for the
handshake. Basically the client would send a nonce, and the server would
have to send back some value generated from that nonce.
  - this means the server has to read the handshake, which avoids the
    cross-site message forgery attack.
  - we could make it even more secure by saying that you have to xor
    something derived from this nonce with the frame types (or maybe every
    byte) of all the data sent and received, so that even if the server
    still muddles through without a nonce, you can't possibly inject
    anything meaningful into the stream.
  - makes it less likely that an attack can be found where you can cause
    the server to echo text back to do a cross-protocol attack.
  - means that people who are using infrastructures with limitations on
    header order or whatnot can still use it.
  - it has been suggested that we'd still want the first few bytes of
    the response to be fixed (and unique) even with the nonce; this would
    lose most of the legacy intermediary benefits, though.
  - it would dramatically increase the complexity of the simplest possible
    websocket server. Right now you can have a server that doesn't even
    read the client handshake at all; this would force you to parse it.
    You could no longer write simple test servers that just send back a
    bunch of frames, for instance.
* Not using HTTP at all is something I'd like to do, but since people are
going to reuse port 443 with this regardless of what we say, it seems
better to make it usable with HTTP-based parsers.
* Using ports 81/815 instead of 80/443 would be ideal, but IANA said that
if we look like HTTP, we must use ports 80/443.
* Using CONNECT instead of GET with an Upgrade has been suggested by some
people. From the point of view of Web Sockets it doesn't matter either
way, it's just an opaque handshake. From the point of view of HTTP, it
doesn't seem CONNECT is very well defined, so I'm not sure what the
implications of this are. As far as I can tell, we wouldn't do an
Upgrade, we'd just be using the HTTP server like a proxy. It seems a bit
dubious from an HTTP semantics point of view. As far as I can tell, it
would mean that we don't have to worry about the server sending back
something that looks like HTTP, though, which would get around the
issue of limitations in legacy intermediaries.
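
To make the challenge suggestion concrete, here is a rough sketch in
Python of what "send a nonce, get back a value generated from it" and
"xor something derived from the nonce with the data" could look like.
Nothing here is proposed wire format: the use of SHA-256, the label
strings, the 16-byte nonce size, and all of the function names are
assumptions made purely for illustration.

```python
import hashlib
import hmac
import os


def make_nonce(size: int = 16) -> bytes:
    """Client side: pick a fresh random nonce (size is an assumption)."""
    return os.urandom(size)


def derive_response(nonce: bytes) -> bytes:
    """Server side: compute a value that depends on the client's nonce.

    Hypothetical derivation: SHA-256 over a fixed label plus the nonce.
    A server cannot produce this without actually reading the handshake.
    """
    return hashlib.sha256(b"handshake-sketch:" + nonce).digest()


def client_accepts(nonce: bytes, response: bytes) -> bool:
    """Client side: accept only the response matching our own nonce."""
    return hmac.compare_digest(derive_response(nonce), response)


def derive_mask(nonce: bytes) -> bytes:
    """Hypothetical per-connection mask derived from the nonce."""
    return hashlib.sha256(b"mask-sketch:" + nonce).digest()


def xor_mask(data: bytes, mask: bytes) -> bytes:
    """XOR every byte with the repeating mask; applying it twice undoes it."""
    return bytes(b ^ mask[i % len(mask)] for i, b in enumerate(data))
```

A server that replies with fixed bytes without reading the handshake
fails client_accepts(), which is the point of the challenge, and also
the reason the trivial "just send back a bunch of frames" test server
would stop working. With the xor step, anything injected by a party
that does not know the nonce decodes to noise on the other end.
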
I'm especially interested in feedback regarding actual effects these
changes would have on concrete deployed software (like, how much easier
would this make deploying Web Socket on a specific load balancer) and
protocols (like, what actual attacks does this prevent or allow).
--
Ian Hickson U+1047E )\._.,--....,'``. fL
http://ln.hixie.ch/ U+263A /, _.. \ _\ ;`._ ,.
Things that are impossible just take longer. `._.-(,_..'--(,_..'`-.;.'