Re: [hybi] Reliable message delivery (was Re: Technical feedback.)

Jamie Lokier <jamie@shareable.org> Wed, 03 February 2010 00:53 UTC

Date: Wed, 03 Feb 2010 00:54:23 +0000
From: Jamie Lokier <jamie@shareable.org>
To: Justin Erenkrantz <justin@erenkrantz.com>
Message-ID: <20100203005423.GF32743@shareable.org>
References: <20100130144936.GD19124@shareable.org> <5c902b9e1001301552n6efb7969o34110373e3ab4945@mail.gmail.com> <4B672C9D.9010205@ericsson.com> <op.u7gy9bag64w2qv@annevk-t60> <96935605-E8B8-4718-B60F-570FD2C199E4@apple.com> <8B0A9FCBB9832F43971E38010638454F032E5065ED@SISPE7MB1.commscope.com> <20100202012534.GB32743@shareable.org> <8B0A9FCBB9832F43971E38010638454F032E50667D@SISPE7MB1.commscope.com> <D17188F8-4B5B-41B4-993F-7AF0DC0B4D47@apple.com> <5c902b9e1002020905x5f38c65epa82d8b25ff044d6b@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <5c902b9e1002020905x5f38c65epa82d8b25ff044d6b@mail.gmail.com>
User-Agent: Mutt/1.5.13 (2006-08-11)
Cc: Hybi <hybi@ietf.org>, "Thomson, Martin" <Martin.Thomson@andrew.com>
Subject: Re: [hybi] Reliable message delivery (was Re: Technical feedback.)
List-Id: Server-Initiated HTTP <hybi.ietf.org>

Justin Erenkrantz wrote:
> On Mon, Feb 1, 2010 at 7:56 PM, Maciej Stachowiak <mjs@apple.com> wrote:
> > If either of these concerns makes half-close-only impractical, then I think Jamie's suggestion is the best: allow any of half-close, explicit close frame, or both.
> 
> Given how hard it is to get useful (or consistent) TCP error
> information from most OS stacks, I think having an explicit close
> frame in the common case is going to be well worth it in the long run.

I agree.  It didn't occur to me until I'd thought about it long and hard.

And then I realised that an explicit close message(*) is essential if
you want to know that the application cleanly closed the socket, versus:

    - Application crash looks the same as shutdown() to the receiver.

    - OS killing the application due to, e.g., memory pressure looks
      the same as shutdown() to the receiver.

    - Intermediate TCP relay (there are many types; see my other post)
      killing the connection looks the same as shutdown() to the receiver.

    - A hard reboot of the virtual machine containing the application
      looks the same as shutdown() to the receiver with some VM types.

    - An HTTP proxy forwarding over CONNECT that kills the connection
      looks the same as shutdown() to the receiver.

In a nutshell, there are *two* distinct uses for signalling clean shutdown:

    1. To avoid the TCP reset hazard.

    2. To signal that it was a clean shutdown, so there is no protocol
       uncertainty about what has been processed / what can be retried
       on another connection.(**)(***)

shutdown() and lingering close do deal with 1, but they aren't reliable
for 2 because of all the other things which can look like shutdown().

So I agree with Justin and others, that an explicit "clean close"
message is more useful and we'll be glad of it in the long run,
compared with just shutdown().
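
To make the distinction concrete, here is a minimal sketch of a receiver
that only reports a clean close when it saw an explicit close message
before EOF.  The two-byte CLOSE_MESSAGE encoding and the function name are
my own assumptions for illustration, not anything agreed on this list, and
the sketch assumes ordinary payload never ends with those bytes (a real
protocol would rely on its message framing instead).

```python
import socket

# Hypothetical wire encoding of the "clean close" message (an assumption).
CLOSE_MESSAGE = b"\xff\x00"


def read_until_end(sock):
    """Read until EOF or error.

    Returns (payload, clean).  clean is True only if the last bytes
    received before EOF were the explicit close message.  A bare EOF or a
    reset is reported as not clean, because a crash, a killed process or a
    relay dropping the connection all look the same as shutdown().
    """
    data = bytearray()
    while True:
        try:
            chunk = sock.recv(4096)
        except ConnectionError:
            return bytes(data), False      # reset: definitely not clean
        if not chunk:                      # EOF: clean only if announced
            break
        data += chunk
    if data.endswith(CLOSE_MESSAGE):
        return bytes(data[:-len(CLOSE_MESSAGE)]), True
    return bytes(data), False


if __name__ == "__main__":
    # Orderly sender: close message, then half-close.
    a, b = socket.socketpair()
    a.sendall(b"hello" + CLOSE_MESSAGE)
    a.shutdown(socket.SHUT_WR)
    print(read_until_end(b))

    # "Crashed" sender: the socket just goes away.
    c, d = socket.socketpair()
    c.sendall(b"hello")
    c.close()
    print(read_until_end(d))
```

In both cases the receiver gets the same payload and the same EOF; only
the close message tells them apart.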

Oh yes, and there is another reason not to rely on shutdown():

    - Not all TCP relays will forward a half-close.  I know
      for a fact that some I've written (long ago ;-) didn't have a
      shutdown() call anywhere.

    - I would not be surprised to find some HTTP proxy over CONNECT
      doesn't forward a half-close reliably, or at all.

Here are some more technical issues we should consider:

    a. Because the sender must have a timeout after sending the close
       message(****), for clean shutdown from *both* sides, it's
       desirable that the receiver finishes the connection quickly,
       rather than writing "all queued messages", which can be any
       amount of data and take any amount of time.  The receiver
       should detect the close message as soon as possible, and finish
       sending its current message only, or even abort the current
       message (if we used byte 0xfe for that purpose, say).
       Detecting the close message is made quicker by following it
       with FIN (EOF), because processing code will see the EOF before
       it parses incoming queued data.  (OOB would expedite it further
       but I don't see it being popular :-)

    b. It is possible for *both* sides to decide to close around the
       same time, independently.  E.g. if they both implement an idle
       timeout or are driven by resource limits.  I'd appreciate it if
       someone double-checked that this works out OK.

    c. This "lingering close" (of any kind) takes time - up to the
       timeout, which might be 30 seconds, 2 minutes or whatever makes
       sense.  (E.g. see Apache).  What should applications do when
       they want to terminate and are in this state?  E.g. a web
       browser sends a close message, and then the user quits the
       browser.  Does it abruptly close(), or is it important to leave
       a background process handling the lingering close?

    d. Sending a close message followed by shutdown() to half-close.
       Is that *actually* reliable over HTTP proxies when using
       CONNECT?  I wonder if some proxies will close() the next hop,
       thereby breaking orderly close if shutdown() is used.  Perhaps
       Anne's idea to send the close message _without_ using
       shutdown() is a good idea after all.  Does anyone have data?
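
For points (a) to (c), the initiating side's lingering close can be
sketched roughly as below: send the close message, half-close, then wait
a bounded time for the peer's own close message followed by EOF.  This is
only a sketch under assumptions: the CLOSE_MESSAGE bytes, the function
name and the 2 second default are hypothetical, and it uses shutdown(),
so it does nothing to answer the proxy question in (d).

```python
import socket
import time

# Hypothetical wire encoding of the "clean close" message (an assumption).
CLOSE_MESSAGE = b"\xff\x00"


def lingering_close(sock, timeout=2.0):
    """Initiate an orderly close and wait, up to `timeout` seconds.

    Returns True only if the peer answered with its own close message
    followed by EOF before the deadline.  Timeouts, resets and a bare EOF
    all return False.  The socket is always closed on return.
    """
    deadline = time.monotonic() + timeout
    tail = b""                            # last bytes seen from the peer
    try:
        sock.settimeout(timeout)
        sock.sendall(CLOSE_MESSAGE)
        sock.shutdown(socket.SHUT_WR)     # FIN follows the close message
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False              # peer took too long
            sock.settimeout(remaining)
            chunk = sock.recv(4096)
            if not chunk:                 # EOF from the peer
                return tail.endswith(CLOSE_MESSAGE)
            # Drain and discard late data, remembering only the tail.
            tail = (tail + chunk)[-len(CLOSE_MESSAGE):]
    except OSError:                       # timeout, reset, broken pipe
        return False
    finally:
        sock.close()
```

With a local socketpair, the simultaneous-close case in (b) works out in
this sketch: if the peer has already sent its close message and
half-closed, lingering_close() still sends its own, drains the peer's
data and returns True.  A peer that never answers makes it return False
once the timeout expires.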

Footnotes

(*)    Let's not call them frames, to avoid confusion with TCP
       segments/packets :-)

(**)   ACK messages alone cannot do this, because without a clean
       shutdown message, you can't tell if an ACK was lost.

(***)  It also signals "don't bother sending more requests because I
       won't be sending any more responses on this connection".
       Unless you get into the dubious area of reviving half-dead
       connections.

(****) It's related to keepalive.  You can't use a keepalive to avoid
       needing the timeout; that'd just be an equivalent timeout.
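
To illustrate footnote (**), here is a rough sketch of how a sender might
sort its outstanding messages once a connection has ended.  It assumes,
purely for illustration, that each message has an id and that a peer
closing cleanly sends ACKs for everything it processed before its close
message; the function and key names are hypothetical.

```python
def classify_outstanding(sent_ids, acked_ids, clean_close):
    """Sort sent messages by what the sender can know about them.

    done:      ACKed, so processed by the peer.
    retry:     not ACKed, and the peer closed cleanly, so (under the
               assumption above) not processed; safe to resend elsewhere.
    uncertain: not ACKed, and the connection ended uncleanly; the message
               or its ACK may have been lost, so we cannot tell.
    """
    acked = set(acked_ids)
    done = [m for m in sent_ids if m in acked]
    pending = [m for m in sent_ids if m not in acked]
    if clean_close:
        return {"done": done, "retry": pending, "uncertain": []}
    return {"done": done, "retry": [], "uncertain": pending}
```

Without the clean close signal, every unACKed message lands in
"uncertain", which is the protocol uncertainty described in point 2.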

And a couple of notes about TCP stacks:

     - Good TCP stacks (e.g. Linux (MSG_MORE/TCP_CORK), FreeBSD
       (TCP_NOPUSH)) can be made to combine the close message and FIN
       in the same packet.  So the close message is essentially free
       on them anyway.

     - Many TCP stacks actually do implement "lingering close" in the
       OS.  See SO_LINGER.  But it is often not correctly implemented
       or not useful, which is why Apache et al must do it themselves.
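
As a rough illustration of the first note, on Linux the close message and
FIN can be queued together by corking the socket before the final write.
This is a sketch, not a guarantee: CLOSE_MESSAGE and the function name
are hypothetical, TCP_CORK is only looked up if the platform exposes it,
and whether the two really share one packet depends on the kernel, so a
packet capture is the only way to confirm it.

```python
import socket

# Hypothetical wire encoding of the "clean close" message (an assumption).
CLOSE_MESSAGE = b"\xff\x00"


def send_close_with_fin(sock):
    """Send the close message and half-close, corking first if possible.

    With TCP_CORK set, the kernel holds back the small close message
    instead of sending it at once, so the shutdown() that follows has a
    chance to put the FIN on the same segment.
    """
    cork = getattr(socket, "TCP_CORK", None)   # exposed on Linux only
    if cork is not None:
        try:
            sock.setsockopt(socket.IPPROTO_TCP, cork, 1)
        except OSError:
            pass                               # not a TCP socket; carry on
    sock.sendall(CLOSE_MESSAGE)
    sock.shutdown(socket.SHUT_WR)              # flushes the data, sends FIN
```

On stacks without TCP_CORK the function still sends the close message
followed by FIN, just possibly in two packets.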

-- Jamie