[hybi] A WebSocket handshake
Adam Barth <ietf@adambarth.com> Tue, 05 October 2010 22:15 UTC
From: Adam Barth <ietf@adambarth.com>
Date: Tue, 05 Oct 2010 15:15:22 -0700
Message-ID: <AANLkTimQ5x-v+Mz_OHrNDdtVd94E+HOBWwo3_f1ktEeg@mail.gmail.com>
To: Hybi <hybi@ietf.org>
Please find below a proposal for a new WebSocket handshake. The handshake attempts to combine the benefits of the HTTP handshake with the benefits of a TLS-based handshake. The handshake incorporates ideas from a number of the other handshakes discussed previously, including those from Maciej Stachowiak, Ian Hickson, and Greg Wilkins. In addition to proposing a handshake, the document also contains a threat model and a security analysis.

Feedback appreciated.

Kind regards,
Adam

Pretty HTML version:
https://docs0.google.com/document/edit?id=1hRLcVc8FHsXOQvaulG2KmvGKepgFffcevyJn-dAEsrI&hl=en&authkey=COOWhaAD&pli=1

Not-so-pretty text version:

= A WebSocket Handshake =

Adam Barth
Eric Rescorla
October 5, 2010

== Introduction ==

This document describes a handshake for the WebSocket protocol that resists cross-protocol attacks. The handshake sends a fixed sequence of bytes and a random nonce from the client to the server to establish two keys for a bidirectional encrypted tunnel, which the parties then use for further communication. Although an eavesdropper can determine the encryption keys, computing the keys requires knowledge of a globally unique identifier, making it unlikely that an observer unfamiliar with the WebSocket protocol will interpret the encrypted bytes on the wire as anything other than random bytes.

Before explaining the handshake, we present a model of the threats posed by exposing a new network protocol to untrusted content running in a web browser. We then work through some simple handshake designs to build intuition for what can go wrong in a flawed design.

== Threats ==

In this document, we evaluate the risks posed by exposing the WebSocket protocol to untrusted web content in a standard web browser. We make the usual assumption in web security that the user visits the attacker’s web site. Web browsers already expose an HTTP-based networking facility to untrusted web content.
In designing WebSockets, we are concerned with the additional risks incurred by granting the attacker additional network privileges. We are chiefly concerned with three scenarios:

1) The attacker uses the WebSocket protocol to attack a server that does not support the WebSocket protocol. In this scenario, we are concerned with protecting a wide variety of servers that implement a wide variety of protocols.

   a) We do not assume the server implements any particular protocol exactly according to its specification. Instead, we aim for “real world” security in which servers might have a number of common bugs.

   b) We do not assume the server uses a strong authentication scheme. In particular, we are concerned with protecting servers that rely on connectivity alone for authentication (e.g., inside a corporate intranet). Although using strong authentication is a best practice, strong authentication is far from universal in deployments.

2) The attacker uses other network facilities in the browser to attack a WebSocket server. For example, the attacker might use an HTML form element to generate an HTTP message targeted at a WebSocket server. In this scenario, we do not assume the WebSocket server follows the WebSocket protocol specification in every detail. Instead, we seek to protect WebSocket servers that contain some implementation errors. Of course, we cannot hope to protect servers with arbitrary implementation errors (e.g., memory safety errors), but, when given a choice, we prefer protocols whose security is robust to sloppy implementation. We are concerned with two kinds of attacks in this model:

   a) The attacker crafts an HTTP request that confuses the WebSocket server into performing an undesirable mutation to its internal state.
   b) The attacker crafts an HTTP request that confuses the WebSocket server into responding with content that the browser then interprets to the detriment of the server (e.g., allows the attacker to mount a cross-site scripting attack against the server’s origin).

3) The attacker communicates with a WebSocket server, but the ensuing traffic confuses a network intermediary. Without loss of generality, we can assume that the WebSocket server colludes with the attacker to aid him or her in confusing the intermediary. We are especially concerned with transparent HTTP proxies in corporate intranets because these proxies are common and confusing such a proxy could let the attacker extract confidential information from the corporation.

== Strawmen ==

One natural approach is to design the handshake to mimic an HTTP POST request. Using a POST request as a template is attractive because an attacker can already generate POST requests to many network locations using the HTML form element. If WebSockets are less generative than the form element, then we can argue by reduction that WebSockets do not increase the attack surface for cross-protocol attacks. Here’s an example WebSocket handshake templated on a POST request:

Client -> Server:
    POST /path/of/attackers/choice HTTP/1.1
    Host: host-of-attackers-choice.com
    Sec-WebSocket-Key: <connection-key>

Server -> Client:
    HTTP/1.1 200 OK
    Sec-WebSocket-Accept: <connection-key>

The idea behind this protocol is that by echoing back the connection-key, the server has agreed to establish a WebSocket connection. Unfortunately, this handshake has serious problems. If the attacker can host an htaccess file at any location on a target HTTP server, the attacker can opt the server into using WebSockets. The server will believe the first HTTP request is complete and will expect another HTTP request on the socket.
However, the attacker can now send (roughly) arbitrary bytes on the socket, spoofing HTTP requests and reading back the responses.

To repair this vulnerability, we replace the value of the Sec-WebSocket-Accept response header with HMAC-SHA1(<connection-key>, <uuid>), on the assumption that a simple configuration file will be unable to compute an HMAC. However, this modification is insufficient. Consider, for example, a virtual hosting environment in which the attacker can place PHP scripts on the server. Such hosting environments are widely available commercially (e.g., from 1and1.com). Now, the attacker can complete the WebSocket handshake because the PHP script can compute the HMAC and send the appropriate response header. The attacker has now opted into the WebSocket protocol on behalf of the entire socket. Unfortunately, the attacker is only empowered to speak on behalf of his or her own virtual host. This privilege escalation is likely to be exploitable by spoofing further HTTP requests in WebSocket message frames. In these spoofed messages, the attacker can spoof the Host header and interact with other virtual hosts reachable on the same socket.

To attempt to repair this vulnerability, we remove the attacker’s ability to designate a PHP script on the server:

Client -> Server:
    OPTIONS * HTTP/1.1
    Host: host-of-attackers-choice.com
    Sec-WebSocket-Key: <connection-key>

Server -> Client:
    HTTP/1.1 200 OK
    Sec-WebSocket-Accept: HMAC(<connection-key>, “...”)

This handshake still has problems in more sophisticated virtual hosting scenarios, but let’s put those aside for the moment to consider how this handshake interacts with transparent HTTP proxies. Recall that the browser will not use the proxy version of the handshake because the proxy is transparent. Unfortunately, this handshake is likely to confuse a transparent proxy.
After seeing these messages exchanged, a transparent proxy will likely believe that the next bytes emitted by the browser will be another HTTP request. However, the browser believes it has established a WebSocket connection and will let the attacker send WebSocket frames to the transparent proxy. The attacker can likely use these frames to spoof HTTP requests for intranet resources (again, by spoofing the Host header) and read back the responses, stealing confidential information from the corporation’s intranet.

To attempt to repair this vulnerability, we add the Upgrade header to inform the transparent proxy that the socket is switching protocols:

Client -> Server:
    OPTIONS * HTTP/1.1
    Host: host-of-attackers-choice.com
    Connection: Upgrade
    Sec-WebSocket-Key: <connection-key>
    Upgrade: WebSocket

Server -> Client:
    HTTP/1.1 101 Switching Protocols
    Connection: Upgrade
    Upgrade: WebSocket
    Sec-WebSocket-Accept: HMAC(<connection-key>, “...”)

Unfortunately, the RFC 2817 HTTP upgrade mechanism is virtually unused in practice. If you search the web for references to upgrade, you find either links to RFC 2817 or discussions of the WebSocket protocol. It seems entirely likely that some number of transparent proxies will be oblivious to the HTTP upgrade mechanism. Organizations could easily deploy such proxies and never have any operational issues with them. For this reason, assuming that transparent proxies understand the HTTP upgrade mechanism is dangerous. If the proxy is oblivious to HTTP upgrade, the proxy could easily treat this handshake the same way it would treat the previous iteration, which allows the attacker to steal confidential information from corporate intranets.

Rather than relying upon the rarely used HTTP upgrade mechanism to inform network intermediaries that the remainder of the socket is not HTTP, we propose using the RFC 2817 CONNECT mechanism. This mechanism is widely used on the Internet to tunnel TLS connections through proxies.
Proxy implementations that lack support for the CONNECT mechanism will likely discover and repair that oversight quickly.

== Proposal ==

In this section, we present our proposal for a WebSocket handshake and tunnel. The handshake establishes a shared “secret” between the client and the server, which they use to encrypt subsequent traffic. This handshake lacks a number of endpoint and extension negotiation features of the current handshake. We expect the working group to add these features inside the encrypted tunnel.

=== Handshake Request ===

To establish a WebSocket connection, the browser sends an RFC 2817 CONNECT request:

Client -> Server:
    CONNECT 1C1BCE63-1DF8-455C-8235-08C2646A4F21.invalid:443 HTTP/1.1
    Host: 1C1BCE63-1DF8-455C-8235-08C2646A4F21.invalid:443
    Sec-WebSocket-Key: <connection-key1>

where <connection-key1> is a 128-bit random number encoded in base64. This initial message has several desirable properties:

1) The attacker cannot influence any of the bytes included in the message. Instead of using the attacker’s host name, we use an invalid host name (per RFC 2606). Although we could use any invalid host name, we use this host name as a globally unique identifier for the WebSocket protocol.

2) Any intermediaries that understand this message according to its HTTP semantics will route the request to a non-existent domain and fail the request. In particular, they will not route the Sec-WebSocket-Key to the attacker, making it difficult for the attacker to perform actions based on the key.

3) Transparent proxies are likely to interpret this request as an HTTPS connect request and assume the remainder of the socket is unintelligible. Because the remainder of the bytes on the socket are encrypted (see below), the attacker is unlikely to be able to trick the transparent proxy into taking further action.

4) This message cannot be generated by a web attacker in today’s browsers.
5) A server that wishes to multiplex HTTP and WebSockets on the same port can use the request-line to distinguish the two protocols.

The client can also include additional information in the first handshake message by encrypting that information in AES-128-CTR using the key HMAC-SHA1(<connection-key1>, “C1BA787A-0556-49F3-B6AE-32E5376F992B”) and a counter block that is the byte number represented in 128-bit network byte order (big-endian). We expect browsers to use this additional information to carry meta-data about the connection (e.g., the origin of the web site that created the WebSocket) rather than application-layer messages.

Encrypting the additional information makes it difficult for the attacker to predict the bytes that appear on the wire. Without the ability to predict on-the-wire bytes, the attacker will have difficulty crafting a network message that confuses a non-WebSocket server or an intermediary. Effectively, the attacker is limited to sending random traffic to a chosen server. To avoid spamming unwitting servers with too much traffic, the browser should limit the amount of unsolicited data the attacker can send (500 bytes?) before the server accepts the WebSocket connection.

=== Handshake Response ===

To accept the request, the server replies with the following message:

Server -> Client:
    HTTP/1.1 200 OK
    Sec-WebSocket-Accept: <hmac>
    Sec-WebSocket-Key: <connection-key2>

where <hmac> is HMAC-SHA1(<connection-key1>, “258EAFA5-E914-47DA-95CA-C5AB0DC85B11”) encoded in base64 and <connection-key2> is a 128-bit random number encoded in base64. If <connection-key2> is identical to <connection-key1>, the client aborts the handshake.

This message completes the CONNECT mechanism. The entity that generated the HMAC has demonstrated understanding of the WebSocket protocol by including the UUID in the HMAC.
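
To make the request and response concrete, here is a minimal Python sketch of the client's side of the exchange described above. It is an illustration, not part of the proposal. It assumes that <connection-key1> is used as raw bytes for the HMAC key (the text does not say whether the raw or the base64 form is keyed), that the UUID string is the HMAC message, and that the header block ends with a blank line as in ordinary HTTP. All function names are hypothetical.

```python
import base64
import hashlib
import hmac
import secrets

# Host name and UUID quoted from the proposal text.
HANDSHAKE_HOST = "1C1BCE63-1DF8-455C-8235-08C2646A4F21.invalid:443"
ACCEPT_UUID = b"258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def new_connection_key() -> bytes:
    """Return a fresh 128-bit random connection key as raw bytes."""
    return secrets.token_bytes(16)


def build_request(key1: bytes) -> bytes:
    """Build the fixed CONNECT request carrying connection-key1 in base64."""
    lines = [
        f"CONNECT {HANDSHAKE_HOST} HTTP/1.1",
        f"Host: {HANDSHAKE_HOST}",
        f"Sec-WebSocket-Key: {base64.b64encode(key1).decode('ascii')}",
        "",  # blank line ending the header block (assumed, as in HTTP)
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def accept_value(key1: bytes) -> str:
    """Compute base64(HMAC-SHA1(connection-key1, accept UUID)).

    Assumption: the raw key bytes are the HMAC key and the UUID is the message.
    """
    digest = hmac.new(key1, ACCEPT_UUID, hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


def client_accepts(key1: bytes, accept_header: str, key2_header: str) -> bool:
    """Check the two response headers the way the client would."""
    expected = accept_value(key1).encode("utf-8")
    if not hmac.compare_digest(accept_header.encode("utf-8"), expected):
        return False
    # The proposal says the client aborts if key2 is identical to key1.
    return base64.b64decode(key2_header) != key1
```

A server would call `accept_value` on the decoded request key and send the result together with its own fresh key; the client then runs `client_accepts` on the two response headers before treating the socket as a WebSocket tunnel.
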
Because the original network message did not designate any particular host, we can have reasonable assurance that the entity that generated the HMAC speaks on behalf of the entire socket (and not just on behalf of one virtual host). Because the HMAC occurs near the beginning of the socket (and is preceded by a fixed string), we mitigate the risk that the replying entity is actually speaking a non-HTTP, non-WebSocket protocol.

After sending the handshake response, the server can begin sending information over the encrypted tunnel described in the following section. We expect that the first message sent by the server will contain meta-data about the connection and that subsequent messages will contain application-layer messages.

=== Tunnel ===

The handshake establishes two keys, which the client and server use to form an encrypted tunnel for further communication:

Client -> Server Key:
    HMAC-SHA1(<connection-key1> || <connection-key2>, “363A6078-74D2-4C0B-8CBC-1E6A36E83442”)

Server -> Client Key:
    HMAC-SHA1(<connection-key1> || <connection-key2>, “2306C3BE-0ACF-42C0-B69E-DFFE02CFA346”)

All subsequent bytes are encrypted using AES-128-CTR with the appropriate directional key and a counter block that is the byte number represented in 128-bit network byte order (big-endian).

Encrypting the tunnel makes it difficult for an attacker to use the browser’s HTTP network facilities to attack a poorly implemented WebSocket server. Because the attacker is unable to learn the <connection-key2> chosen by the server, the attacker will have difficulty crafting an HTTP request that the WebSocket server will decrypt to something sensible.

Encrypting the traffic from the server to the client makes it difficult for the attacker to generate an HTTP request to an honest but poorly implemented WebSocket server that causes its response to be interpreted to its detriment by the browser.
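
As a rough illustration of the key derivation in the Tunnel section, the following Python sketch derives the two directional keys and builds a counter block. Several details are assumptions rather than statements from the proposal: “||” is taken to mean concatenation of the raw key bytes, the concatenation is used as the HMAC key with the UUID as the message, and the 20-byte HMAC-SHA1 output is truncated to its first 16 bytes to obtain an AES-128 key (the text does not say how the output is reduced to 128 bits). The AES-CTR step itself is omitted because the Python standard library has no AES; the function names are hypothetical.

```python
import hashlib
import hmac
from typing import Tuple

# UUIDs quoted from the proposal text.
C2S_UUID = b"363A6078-74D2-4C0B-8CBC-1E6A36E83442"
S2C_UUID = b"2306C3BE-0ACF-42C0-B69E-DFFE02CFA346"


def directional_keys(key1: bytes, key2: bytes) -> Tuple[bytes, bytes]:
    """Return (client-to-server key, server-to-client key).

    Assumption: the HMAC key is key1 || key2 (raw bytes) and each
    20-byte HMAC-SHA1 output is truncated to 16 bytes for AES-128.
    """
    secret = key1 + key2
    c2s = hmac.new(secret, C2S_UUID, hashlib.sha1).digest()[:16]
    s2c = hmac.new(secret, S2C_UUID, hashlib.sha1).digest()[:16]
    return c2s, s2c


def counter_block(byte_number: int) -> bytes:
    """Encode a byte number as a 128-bit big-endian counter block."""
    return byte_number.to_bytes(16, "big")
```

Because each direction mixes a different UUID into the HMAC, the two keys differ even though both are derived from the same pair of connection keys, so the same counter value is never used with the same key in both directions.
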
In particular, it is unlikely that the server’s response will be treated as an HTML document by the browser, preventing the attacker from leveraging the WebSocket server to mount a cross-site scripting attack against the server’s origin.

== Analysis ==

We analyze the risks of this protocol in the three scenarios of interest:

1) The attacker uses the WebSocket protocol to attack a server that does not support the WebSocket protocol. There are two cases to consider: the server is familiar with HTTP semantics or the server is oblivious to HTTP:

   a) The attacker will find it difficult to attack a server that is familiar with HTTP semantics with this handshake because the HTTP semantics of the handshake point to routing the request to a non-existent network location. If the request somehow routes to the attacker, the HTTP semantics then point to transporting opaque data over the socket.

   b) The attacker will find it difficult to attack an HTTP-oblivious server with this handshake because the attacker can send only a fixed message followed by seemingly random bytes. None of the bytes sent to the server can be controlled directly by the attacker. It seems unlikely that the attacker will be able to advance the non-WebSocket server very far down its state machine.

2) The attacker uses other network facilities in the browser to attack a WebSocket server. There are two cases to consider: the server implements the WebSocket protocol correctly or the server implements an imperfect version of the WebSocket protocol:

   a) If the server correctly implements the WebSocket protocol, the attacker will be unable to use the other network facilities of the browser to complete the handshake with the server because the attacker is unable to generate the first network message.

   b) If the server implements an imperfect version of the WebSocket protocol, the attacker will be unable to learn the value of either of the directional keys for the tunnel.
      Without knowledge of these keys, the attacker will find it difficult (i) to craft a message that decrypts to something meaningful to the WebSocket server and (ii) to trick the WebSocket server into responding with something meaningful to the browser.

3) The attacker communicates with a WebSocket server, but the ensuing traffic confuses a network intermediary. If the intermediary attempts to route the request (e.g., because the intermediary is an HTTP proxy), the handshake will fail because the request does not contain any routing information for the target server. If the handshake completes and the intermediary understands HTTP semantics (as widely used), the intermediary will likely reason that the remainder of the socket is an opaque TLS connection. In either case, the intermediary is unlikely to take undesirable actions as a result of the WebSocket connection.

== Conclusion ==

We believe this handshake is superior to the current handshake because this handshake has a stronger argument for security. Because the attacker cannot control any of the bytes sent by the browser, the attacker will have difficulty mounting a cross-protocol attack using this handshake. That said, there is no guarantee that this handshake resists cross-protocol attacks. These security properties are not very well studied, which makes designing protocols that achieve them more art than science. However, the handshake we propose has a number of heuristic properties that suggest it might stand up to further scrutiny.