Re: [hybi] WebSocket -76 is incompatible with HTTP reverse proxies

Greg Wilkins <gregw@webtide.com> Tue, 06 July 2010 23:14 UTC


Willy

Thanks for giving a real-world example of the HTTP-incompatibility
problems that were predicted to occur if -76 was adopted.

My understanding is that the nonce is sent as a non-HTTP, non-WebSocket
data packet precisely to detect whether there is an intermediary that
cannot handle WebSocket, and thus fail fast. Your example shows that
this mechanism is neither a good test for WebSocket capability, nor
does it actually fail fast. I believe HAProxy is a load balancer that
will handle the first HTTP request/response on a connection as HTTP
before turning into a byte-forwarding pipe. Such a proxy will work
fine with WebSocket if the handshake is compliant HTTP, but fails on
-76 because it is not compliant.
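
A rough sketch of why this happens, in Python. This is not HAProxy's
code: split_http_request and the sample request are made up for
illustration, and Transfer-Encoding is ignored for brevity. It delimits
a request the way a generic HTTP/1.1 parser would. With no
Content-Length the body is empty, so the 8 nonce bytes are left over as
possible next-request data instead of being forwarded with the request.

```python
def split_http_request(buf: bytes):
    """Split buf into (head, body, leftover) as a generic HTTP/1.1
    parser would. Illustrative only; Transfer-Encoding is not handled."""
    head, sep, rest = buf.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete request head")
    length = 0  # no Content-Length means no request body
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    return head + sep, rest[:length], rest[length:]


# A -76 style request: 8 raw bytes follow the blank line, unannounced.
draft76_request = (
    b"GET /demo HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Connection: Upgrade\r\n"
    b"Upgrade: WebSocket\r\n"
    b"\r\n"
    b"^n:ds[4U"
)

head, body, leftover = split_http_request(draft76_request)
# body is empty; the 8 bytes end up in leftover, i.e. they are held back
# as pending data rather than sent on to the server with this request.
```
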

The proposal that I have made several times on this list is that the
handshake should include the nonce as a header.  The 16-byte check
code can be sent as non-HTTP data after the 101, as there is no longer
a requirement to be HTTP then. However, I think that if we wish to
test WebSocket capability, it would be far better to send that in a
WebSocket frame:

Send:

    GET /index.html HTTP/1.1
    Host: example.com
    Connection: Upgrade
    Sec-WebSocket-Key1: 4 @1  46546xW%0l 1 5
    Sec-WebSocket-Key2: 12998 5 Y3 1  .P00
    Sec-WebSocket-Nonce: 4ABC2FEE237CD165
    Sec-WebSocket-Protocol: sample
    Upgrade: WebSocket
    Origin: http://example.com

Receive:

        HTTP/1.1 101 WebSocket Protocol Handshake
        Upgrade: WebSocket
        Connection: Upgrade
        Sec-WebSocket-Origin: http://example.com
        Sec-WebSocket-Location: ws://example.com/demo
        Sec-WebSocket-Protocol: sample

        0x81 0x10  <16 bytes>
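
As a sketch only, here is how a client might validate that frame in
Python. The assumptions are mine: the Sec-WebSocket-Nonce value is
treated as 8 hex-encoded bytes, the check code is computed as in -76
(MD5 over the two key numbers as 32-bit big-endian integers followed by
the 8 nonce bytes), and the function names are invented for the
example.

```python
import hashlib
import struct


def key_number(key: str) -> int:
    """-76 key rule: concatenate the digits, divide by the space count."""
    digits = int("".join(c for c in key if c in "0123456789"))
    spaces = key.count(" ")
    if spaces == 0 or digits % spaces != 0:
        raise ValueError("malformed key")
    return digits // spaces


def expected_check(key1: str, key2: str, nonce_hex: str) -> bytes:
    """16-byte check code; hex encoding of the nonce is an assumption."""
    nonce = bytes.fromhex(nonce_hex)
    if len(nonce) != 8:
        raise ValueError("nonce must be 8 bytes")
    challenge = struct.pack(">II", key_number(key1), key_number(key2)) + nonce
    return hashlib.md5(challenge).digest()


def validate_check_frame(frame: bytes, expected: bytes) -> bool:
    """Accept only 0x81 0x10 followed by exactly the expected 16 bytes."""
    return (
        len(frame) == 18
        and frame[0] == 0x81
        and frame[1] == 0x10
        and frame[2:] == expected
    )


check = expected_check(
    "4 @1  46546xW%0l 1 5", "12998 5 Y3 1  .P00", "4ABC2FEE237CD165"
)
# The client would fire onOpen only if this returns True.
ok = validate_check_frame(bytes([0x81, 0x10]) + check, check)
```
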


Only after the 16-byte binary WebSocket frame is received and
validated should the client trigger the onOpen handling. This
handshake is compliant HTTP and includes a check of WebSocket
capability from server to client without an extra round trip. The only
thing it lacks is a check of client-to-server WebSocket capability,
but:
   a) we can see that the current -76 check doesn't work anyway
   b) all the intermediaries see the upgrade request and 101 response,
so they have the opportunity to refuse the upgrade
   c) it would be easy to add a client-to-server WebSocket keep-alive
sent after the handshake that would check that path.  The server could
time out the connection if no message is received within a reasonable
period.  To check connectivity there is no way to avoid having such a
timeout, since you can't test for something not being received any
other way. At least this way, the message could be a real
application message, and thus useful, rather than another round
trip.
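
For point c), a minimal server-side sketch in Python, assuming a plain
blocking socket. The function name and the 4096-byte read size are
arbitrary choices for the example, not part of any draft.

```python
import socket


def first_message_or_none(conn: socket.socket, timeout_s: float):
    """Wait up to timeout_s for the first client-to-server data.

    Returns the bytes received, or None if nothing arrived in time (or
    the peer closed), in which case the server would drop the connection.
    """
    conn.settimeout(timeout_s)
    try:
        data = conn.recv(4096)
    except socket.timeout:
        return None
    return data or None
```

Whatever arrives first can be an ordinary application message, so the
check costs no extra round trip.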

But all this has been said before and ignored by the editor, who won't
even remove the inappropriate redirection in the intro of
   http://www.ietf.org/id/draft-ietf-hybi-thewebsocketprotocol-00.txt
so don't hold your breath waiting for good sense or good process to
prevail.  Sorry for the sour grapes... but I don't know what else to
say.

regards



On 7 July 2010 07:00, Willy Tarreau <w@1wt.eu> wrote:
> Hi,
>
> I was having a private discussion with Ian last week about a recent
> issue introduced in draft 76, but I realized it would be more useful
> and constructive to bring the issue to the list than to privately
> discuss possible fixes.
>
> Last week, it was reported to me that a site that was running fine
> on draft 75 could not get the draft 76 handshake to complete via a
> HAProxy load balancer, which runs as an HTTP reverse proxy. The
> connection would remain open between the client and haproxy, and
> between haproxy and the server, with the server never responding.
> The same client (Chromium 6.0.414.0) directly connected to the
> server worked fine.
>
> The guy was kind enough to send me some network captures which show
> an obvious problem: the 8-byte nonce from the client is not advertised
> as a content-length, so it is not forwarded by the reverse proxy as
> it is either part of a next request or pending data for when the
> handshake completes. Unfortunately, the server wants those bytes to
> complete the handshake, so we have a dirty deadlock not even detectable
> by the end user.
>
> Ian proposed to upgrade the reverse proxy so that it detects the
> WebSocket handshake in the request (before it completes) and agrees
> to forward those 8 bytes.
>
> I can't agree with that because until the handshake completes, the
> proxy does not know whether the server will handle the request as
> a WS handshake or anything else, and it absolutely must not
> blindly trust any random client who sets an Upgrade header that
> any server is free to ignore. Doing so would make the reverse-proxy
> vulnerable to HTTP request smuggling attacks [1] or even to any type
> of filtering bypass depending on the length of the data it lets go.
> Even with 8 bytes it is possible to send a "GET /xx\n" which is a
> valid HTTP/0.9 request and is accepted by some servers in a keep-alive
> connection (including Apache).
>
> Example:
>
>        GET /index.html HTTP/1.1
>        Host: example.com
>        Connection: Upgrade
>        Sec-WebSocket-Key2: 12998 5 Y3 1  .P00
>        Sec-WebSocket-Protocol: sample
>        Upgrade: WebSocket
>        Sec-WebSocket-Key1: 4 @1  46546xW%0l 1 5
>        Origin: http://example.com
>
>        GET /..
>
> If the server does not handle WebSocket for this resource, it will
> happily return the index and/or a 404 and proceed with next request
> contained in the 8 bytes it received.
>
> Conversely, having no Content-Length header in the request means that
> we don't know what a reverse proxy will do if it receives a valid one.
> For instance, we could very well imagine that some reverse proxies
> which assume that Content-Length == 8 for any request containing
> "Upgrade: WebSocket" will have trouble when receiving a different
> Content-Length header. This could be used to pass larger amounts of
> data than what is allowed by the protocol to a second reverse-proxy,
> which, if it is able to parallelize pipelined requests, will forward
> the first one to the server and the second one (embedded in the apparent
> data) to another server.
>
> The first obvious solution that comes to mind is to comply with the
> HTTP protocol which will be implemented along the whole chain and to
> simply add a "Content-Length: 8" header in the request. This fixes
> the dirty hang, this fixes the fact that reverse proxies have to blindly
> trust the client, this fixes the case with the different content length,
> and it even makes it possible for WebSocket aware reverse proxies to
> refuse requests which don't have exactly "Content-Length: 8".
>
> But this raises a second point: shouldn't we switch to POST instead of
> GET then? After all, GET + content-length is not well defined; httpbis
> p1 says:
>
>   The presence of a message-body in a request is signaled by the
>   inclusion of a Content-Length or Transfer-Encoding header field in
>   the request's header fields.  When a request message contains both a
>   message-body of non-zero length and a method that does not define any
>   semantics for that request message-body, then an origin server SHOULD
>   either ignore the message-body or respond with an appropriate error
>   message (e.g., 413).  A proxy or gateway, when presented the same
>   request, SHOULD either forward the request inbound with the message-
>   body or ignore the message-body when determining a response.
>
> So that means that we're not certain again that the data will pass
> through all reverse proxies. That said, we could consider that WebSocket
> aware reverse proxies will let the data flow, but the problem of the hung
> connection remains if the reverse proxy eats the data before forwarding
> to the server.
>
> The POST solves all that. POST + content-length is normal and well
> supported everywhere. The POST also has the nice advantage that we're
> sure we won't get a reply from a cache (and the POST method is even
> one of the 3 defined methods which must invalidate caches).
>
> The POST also has a nice advantage for various client implementations
> that it is easier to implement than GET with some data.
>
> After some thinking, I'm wondering why we want to pass the nonce as
> data in the request instead of passing it as headers. After all, we're
> interested in data in the response to ensure we're able to let data
> flow and that the whole chain understands the Upgrade request. In fact,
> if any one intermediary ignores the 101 and takes it as a 100 (as is
> explicitly permitted by httpbis-p2), the response will be aborted
> because the pending data does not look like an HTTP response, and this
> is perfect, it's what we're looking for. But in the request? I fail
> to see the added value of having it in the data. It would probably
> be easier to put it in a header and keep the GET.
>
> Anyway, we have to do something now because we've reached the point Ian
> tried to ensure we would avoid a long time ago: the deadlock which is
> undetectable by the client. And it already happens through a common
> reverse proxy since draft 76, and will happen through load balancers, IDS
> and HTTP-aware firewalls for the same reasons, without any simple way
> to fix it without breaking HTTP security on those components. And it's
> not as if we could imagine those components will not be in use where
> WebSocket is deployed!
>
> Best regards,
> Willy
>
> [1]   http://www.owasp.org/index.php/HTTP_Request_Smuggling