[hybi] [permessage-deflate] Compressing fragmented data

Takeshi Yoshino <tyoshino@google.com> Tue, 14 January 2014 04:05 UTC

From: Takeshi Yoshino <tyoshino@google.com>
Date: Tue, 14 Jan 2014 13:05:07 +0900
Message-ID: <CAH9hSJbeY5VOY_iuwrdBq-KYcoVkW_8KArPp70hP4tdZj6eQfg@mail.gmail.com>
To: Arman Djusupov <arman@noemax.com>
Cc: "hybi@ietf.org" <hybi@ietf.org>

Hi Arman,

Let me change the subject.

On Thu, Jan 9, 2014 at 11:46 PM, Arman Djusupov <arman@noemax.com> wrote:

> Hello Takeshi,
>
> In the current draft, the description of the compression process states:
>
>    1.  Compress all the octets of the payload of the message using
>        DEFLATE.
>
> This doesn’t take into account outbound messages of arbitrary size that
> NEED to be fragmented, for which it is more favorable to use
> fragmentation and to compress & send fragments until the end of the
> message is reached.
>

OK. I'll add some text to explain how this works for fragmented messages.


>
> In such cases the implementation buffers the input from its source and
> then flushes it out into a compressed fragment, repeating this process
> until the end of the source data is reached. At that point it flushes any
> remaining buffered bytes into a final frame, or flushes a final frame of
> 0 length.
>
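
The buffer-and-flush loop described in the paragraph above can be sketched
as follows. This is only an illustration, assuming Python's zlib module in
raw DEFLATE mode (negative wbits) with one sync flush per fragment; the
helper name compress_fragments and the sample chunks are hypothetical and
do not come from the draft.

```python
import zlib


def compress_fragments(chunks):
    """Yield one compressed fragment per input chunk (hypothetical helper)."""
    # Negative wbits selects raw DEFLATE, i.e. no zlib header or checksum.
    compressor = zlib.compressobj(
        zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -zlib.MAX_WBITS
    )
    for chunk in chunks:
        # Z_SYNC_FLUSH pushes out everything buffered so far and ends the
        # output on a byte boundary with an empty stored block.
        yield compressor.compress(chunk) + compressor.flush(zlib.Z_SYNC_FLUSH)


fragments = list(compress_fragments([b"first part, ", b"second part"]))
```

With this approach every fragment produced by a sync flush ends with the
octets 0x00 0x00 0xFF 0xFF, and a single raw inflater can decode the
fragments in order because they share one compression context.
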
> When the implementation produces compressed fragments, it periodically
> produces frames with 0x00 0x00 0xFF 0xFF at the end of the frame due to
> flushing DEFLATE. But because of the following requirements:
>
>    2.  If the resulting data does not end with an empty DEFLATE block
>        with no compression (the "BTYPE" bits is set to 00), append an
>        empty DEFLATE block with no compression to the tail end.
>
>    3.  Remove 4 octets (that are 0x00 0x00 0xff 0xff) from the tail end.
>        After this step, the last octet of the compressed data contains
>        (possibly part of) the DEFLATE header bits with the "BTYPE" bits
>        set to 00.
>
> the implementation must ensure that the message ends with an empty block
> without a trailing 0x0000FFFF, so the implementation must keep track of
> what type of block was at the end of the frame that was sent last. If the
> last frame sent ends with 0x0000FFFF then the implementation cannot remove
> those 4 bytes from the wire, and must instead artificially produce an
> empty DEFLATE block to send as the final frame. This is only due to the
> requirement to remove 0x0000FFFF.
>
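
For reference, steps 1 to 3 quoted above can be sketched for a message
sent as a single frame. This is a minimal illustration, assuming Python's
zlib module in raw DEFLATE mode; compress_message and decompress_message
are hypothetical names, and the sketch relies on a sync flush already
leaving an empty stored block at the end, so step 2 needs no extra work.

```python
import zlib

TAIL = b"\x00\x00\xff\xff"  # the 4 octets that step 3 removes


def compress_message(payload: bytes) -> bytes:
    """Apply steps 1-3 to a whole message (hypothetical helper)."""
    compressor = zlib.compressobj(
        zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -zlib.MAX_WBITS
    )
    # Steps 1 and 2: compress, then sync-flush so that the output ends
    # with an empty stored (BTYPE=00) block.
    data = compressor.compress(payload) + compressor.flush(zlib.Z_SYNC_FLUSH)
    assert data.endswith(TAIL)
    # Step 3: drop the 4 trailing octets before sending.
    return data[: -len(TAIL)]


def decompress_message(data: bytes) -> bytes:
    """Receiver side: restore the removed octets, then inflate."""
    inflater = zlib.decompressobj(-zlib.MAX_WBITS)
    return inflater.decompress(data + TAIL)
```

The complication discussed in this thread appears only when the message is
split into frames: a frame that has already been sent with the 4 octets
still attached cannot be trimmed afterwards.
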

Yes. It's just two bytes long: DEFLATE block header (3 bits) + fixed Huffman
code for the end-of-block symbol (7 bits) + padding (6 bits).
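
The 3 + 7 + 6 = 16 bit layout can be checked with a small sketch. It
assumes Python's zlib module, BFINAL=0, all padding bits set to zero, and a
receiver that appends 0x00 0x00 0xFF 0xFF before inflating; the helper
names and sample data are hypothetical.

```python
import zlib


def pack_bits_lsb_first(bits):
    """Pack bits into bytes, filling each byte from its least significant bit."""
    out = bytearray((len(bits) + 7) // 8)
    for index, bit in enumerate(bits):
        if bit:
            out[index // 8] |= 1 << (index % 8)
    return bytes(out)


header = [0, 1, 0]      # BFINAL=0, then BTYPE=01 (fixed Huffman), LSB first
end_of_block = [0] * 7  # fixed Huffman code for symbol 256 is 0000000
padding = [0] * 6       # fill up to the next byte boundary
EMPTY_BLOCK = pack_bits_lsb_first(header + end_of_block + padding)


def inflate_message(payloads):
    """Receiver side: join the frame payloads, append the tail, inflate."""
    inflater = zlib.decompressobj(-zlib.MAX_WBITS)
    return inflater.decompress(b"".join(payloads) + b"\x00\x00\xff\xff")


compressor = zlib.compressobj(
    zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -zlib.MAX_WBITS
)
# Two data frames that were already sent with their sync-flush tails.
frames = [
    compressor.compress(b"hello ") + compressor.flush(zlib.Z_SYNC_FLUSH),
    compressor.compress(b"world") + compressor.flush(zlib.Z_SYNC_FLUSH),
]
# End of data is discovered only afterwards: send the empty block as the
# final frame.
frames.append(EMPTY_BLOCK)
```

Under these assumptions EMPTY_BLOCK is the two octets 0x02 0x00. Once the
receiver appends the tail, the zero padding bits are read as the header of
the empty stored block, so the message still decodes to the original data.
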


>
> Wouldn’t it be easier to make the removal of 0x00 0x00 0xFF 0xFF optional,
> while at the same time requiring that the receiving side appends 0x00
> 0x00 0xFF 0xFF to the final frame when it is missing?
>

I think it complicates decoders. Code that doesn't know the DEFLATE
decompressor's state cannot tell whether a trailing 0x00 0x00 0xFF 0xFF is
really the body of an empty uncompressed block. It could be something else,
e.g. part of the body of a compressed block.


>  It is much simpler to just check the final bytes of the message instead
> of having to remember the state of the previous frame.
>