Re: [hybi] Redesigning the Web Socket handshake

Maciej Stachowiak <mjs@apple.com> Wed, 03 February 2010 05:21 UTC

On Feb 2, 2010, at 2:31 PM, Justin Erenkrantz wrote:
> Yes, I admit it'd be a bit helpful if the nonce proposal was little
> more concrete.

Let me start by laying out the security risks in the current handshake. Then I will explain my proposed change and how I believe it addresses them.


1) Hostile JavaScript could use WebSocket for a cross-protocol attack against vanilla HTTP resources or non-HTTP servers.

If the attacker can trick a non-WebSocket server into echoing back chosen text (for example through something in the URL part of the request), then they could make it give what appears to be a valid WebSocket handshake response. This could result in unauthorized access.

The current handshake attempts to mitigate this by requiring some fixed parts of the response after the status line; thus, the fixed part of the handshake response effectively includes newlines. Since none of the parts of the request controlled by client JS can contain literal newlines, this somewhat increases the difficulty of echoing back exactly the right handshake. However, this is weaker than it could be: the exact correct handshake response is fully predictable, and furthermore consists only of fixed parts and verbatim repetition of parts of the request.
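
To make the predictability point concrete, here is a rough Python sketch. It assumes the current response has the same layout as the example response later in this message, minus the nonce header; the function name `predict_current_response` and its parameters are hypothetical, chosen only for illustration.

```python
def predict_current_response(host, path, origin, protocol):
    """Build the handshake response a script could predict in advance.

    Assumption: the response consists only of fixed text plus values
    copied verbatim from the request (origin, host, path, protocol).
    """
    lines = [
        "HTTP/1.1 101 Web Socket Protocol Handshake",
        "Upgrade: WebSocket",
        "Connection: Upgrade",
        "WebSocket-Origin: " + origin,
        "WebSocket-Location: ws://" + host + path,
        "WebSocket-Protocol: " + protocol,
    ]
    # Header lines are CRLF-terminated, followed by an empty line.
    return "\r\n".join(lines) + "\r\n\r\n"
```

Every input here is known to the script that opens the connection, so nothing in the output is a surprise to it.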

While we do not know of any service that is currently vulnerable, it would be much more robust if client JS could not predict the correct handshake response. And we do know that cross-protocol attacks from Web browsers are a real issue in general.


2) Cross-site XMLHttpRequest (using CORS or XDomainRequest) could be used for a cross-protocol attack against WebSocket resources, potentially violating integrity (though not confidentiality).

The WebSocket protocol currently does not require any checking of the client handshake. However, any WebSocket server that performs any side effects in response to messages from the client has a security vulnerability if it does not check the correctness of the handshake request from the client. A cross-site XMLHttpRequest facility could be used to send the victim an HTTP request with a body that looks like one or more valid WebSocket messages. This request would be a POST instead of a GET, and would lack the required WebSocket headers. But if the server is not checking, it would happily chug along.

The server would, of course, not respond with the right CORS headers needed for the attacker to actually read the response, so confidentiality cannot be violated. However, if the server treats incoming client messages as commands, then there is no protection against violating integrity.

To give a concrete example, consider a chat service over WebSocket. Let's say it doesn't check the correctness of the handshake, and does authentication in-band. An attacker could construct an HTTP POST body that looks like an authentication message followed by a couple of sent messages. True, the attacker cannot read the victim's chat messages, but could send chat messages as the victim. Clearly this would be a significant security breach.

Admittedly, the spec could be changed to require servers to check some aspects of client handshake correctness. But because a server can give a correct response without processing the request at all, it remains relatively easy to forget to do this. A handshake protocol that required the server to read the client's request would reduce the risk.
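
As an illustration of the kind of check a server could make today, here is a minimal Python sketch. It assumes the request has already been parsed into a method string and a dict of headers; the function name `looks_like_websocket_handshake` is hypothetical, and the set of checks is only an example, not a complete validation.

```python
def looks_like_websocket_handshake(method, headers):
    """Return True only if the request has the shape of a WebSocket handshake.

    `headers` maps header names to values; names and the values checked
    here are compared case-insensitively.
    """
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    # The cross-site request described above would be a POST, not a GET.
    if method != "GET":
        return False
    # It would also lack the WebSocket upgrade headers.
    if lowered.get("upgrade", "").lower() != "websocket":
        return False
    if lowered.get("connection", "").lower() != "upgrade":
        return False
    return True
```

A server that calls something like this before processing any messages would reject the POST in the chat example, but nothing in the current handshake forces it to do so.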

Ian argued that services which do not read messages from the client at all (perhaps a data stream service like a stock ticker) would face an extra burden without gaining any safety from such a requirement. However, pure data stream services are easier to do via EventSource anyway; EventSource deals with intermediaries and can work with a completely unmodified Web server. So I don't think WebSocket needs to put a lot of weight on that use case.


Suggested solution:

We could mitigate both of these risks, and also reduce the implementation difficulty for servers and proxies, by changing the handshake. We would remove the requirement for newlines or exact capitalization of the headers. Instead, we could have the browser generate a unique random nonce, which it includes in the handshake request. The server would be required to read it, and respond with a hash of the nonce value. 

This has two benefits: (a) client JS can no longer predict the exact text of a valid handshake response; and (b) servers are essentially forced to look at the handshake, and must look for at least one element that cannot be forged with cross-site XMLHttpRequest. Thus, this mechanism mitigates risks (1) and (2) above. The reason to respond with a hash of the nonce, instead of the nonce literally, is so the response includes something that is both unpredictable *and* not a verbatim echo of any part of the request. There is no need for the hash to be cryptographically strong.
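
The nonce exchange can be sketched in a few lines of Python. This is only an illustration under assumptions: MD5 stands in for whatever hash would be chosen, the hash is taken over the ASCII text of the nonce (hashing the decoded bytes would work equally well), and the function names are hypothetical.

```python
import hashlib
import secrets


def make_nonce():
    """Client side: generate 128 random bits, written as 32 hex characters."""
    return secrets.token_hex(16)


def nonce_hash(nonce):
    """Server side: hash the nonce text read from the request."""
    return hashlib.md5(nonce.encode("ascii")).hexdigest()


def response_is_valid(sent_nonce, received_hash):
    """Client side: accept only a response carrying the hash of our own nonce."""
    return nonce_hash(sent_nonce) == received_hash.strip().lower()
```

Because the nonce is generated by the browser rather than by script, the script cannot know in advance what the server must send back, and a server that merely echoes request text will not produce the hash.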

There are many possible variants, but here is an example of what the handshake request and response could look like (the hash used here is MD5, though we could use a weaker hash that is faster to compute):

Handshake from the client:

        GET /demo HTTP/1.1
        Upgrade: WebSocket
        Connection: Upgrade
        Host: example.com
        Origin: http://example.com
        WebSocket-Protocol: sample
        WebSocket-Nonce: 2d1283cf01e0e9f562989f0781450e7e

Response from the server:

        HTTP/1.1 101 Web Socket Protocol Handshake
        Upgrade: WebSocket
        Connection: Upgrade
        WebSocket-Origin: http://example.com
        WebSocket-Location: ws://example.com/demo
        WebSocket-Protocol: sample
        WebSocket-Nonce-Hash: 8ba7ca1e53376d29842e88d0f9db6978

The status line must come first, but order and capitalization of all request and response headers would be free. If we wanted to, we could even allow the status line to use any HTTP version.
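
A client-side parser for such a relaxed response might look like the following Python sketch. The function name `parse_handshake_response` is hypothetical, and the sketch assumes CRLF line endings and a response that has already been read in full as text.

```python
def parse_handshake_response(raw):
    """Parse a handshake response into (status_code, headers).

    The status line must come first; header order and capitalization
    are ignored by lowercasing names into a dict.
    """
    head = raw.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    status_parts = lines[0].split(" ", 2)
    # Any HTTP version is accepted; only the "HTTP/" prefix is required.
    if len(status_parts) < 2 or not status_parts[0].startswith("HTTP/"):
        raise ValueError("malformed status line")
    status = int(status_parts[1])
    headers = {}
    for line in lines[1:]:
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError("malformed header line")
        headers[name.strip().lower()] = value.strip()
    return status, headers
```

With this shape, the client would look up `websocket-nonce-hash` in the returned dict and compare it against the hash of the nonce it sent, without caring where in the response the header appeared.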

A possible variant would be to include the nonce hash in the status line instead of in a header. But I think a header is probably better.


Regards,
Maciej