Re: [Cbor] [core] draft-ietf-cbor-7049bis - Change suggested Canonicalization

Francesca Palombini <francesca.palombini@ericsson.com> Thu, 05 October 2017 11:20 UTC

From: Francesca Palombini <francesca.palombini@ericsson.com>
To: "cbor@ietf.org" <cbor@ietf.org>
CC: Alexey Melnikov <aamelnikov@fastmail.fm>, "Joe Hildebrandt (hildjj@cursive.net)" <hildjj@cursive.net>
Thread-Topic: [core] draft-ietf-cbor-7049bis - Change suggested Canonicalization
Thread-Index: AQLkcV5/cuYwYNvq7zdAWu7g9fdkoQDl1w1UoI+AzVCAAC11AIAb1yIQ
Date: Thu, 05 Oct 2017 11:20:40 +0000
Message-ID: <HE1PR0701MB2539219033904FD2A45771BA98700@HE1PR0701MB2539.eurprd07.prod.outlook.com>
References: <012801d32f2e$a95aaf10$fc100d30$@augustcellars.com> <7C19E4CE-32E2-44B2-BD44-1BAA48190674@tzi.org> <013a01d32fcb$ac8cede0$05a6c9a0$@augustcellars.com> <C55850CF-C510-4D2E-8298-3A40E3623CDB@tzi.org>
In-Reply-To: <C55850CF-C510-4D2E-8298-3A40E3623CDB@tzi.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/cbor/2RjjqTV_JTwm0M2jb2dWGhrmDEQ>
Subject: Re: [Cbor] [core] draft-ietf-cbor-7049bis - Change suggested Canonicalization

Hi,

(chair-hat on) I would really like to hear the opinion of the working group on this one.
Please make sure you read through the mail thread and express your thoughts.

Thanks,
Francesca

> -----Original Message-----
> From: core [mailto:core-bounces@ietf.org] On Behalf Of Carsten Bormann
> Sent: den 17 september 2017 20:08
> To: Jim Schaad <ietf@augustcellars.com>
> Cc: cbor@ietf.org; core@ietf.org WG <core@ietf.org>
> Subject: Re: [core] draft-ietf-cbor-7049bis - Change suggested Canonicalization
> 
> Hi Jim,
> 
> I do share your frustration a bit, because I do believe the canonical map key
> sorting order is one of the very few things we didn’t quite get right about RFC
> 7049.  The question about fixing this is really a procedural one: can we fix
> it, and what is the impact on people who are now using the old order?
> 
> > I complained that this was not a good sorting order before RFC 7049 was
> > published, so this is not a new issue.  I wish that I could find the mails from
> > the time as my memory was that you said we could discuss it on the next
> > version which is what I am trying to do.
> 
> I must admit I don’t remember that discussion (it would have been more
> than four years in the past) and my mail program apparently doesn’t either.
> 
> (My mail program does remember errata report 4409 from July 2015, where
> you did propose another sorting order.)
> 
> >> Why do you think the values need to be in there, too?
> >> Map keys must be different, so the value never can have an effect,
> >
> > That is from below - where you could have multiple duplicate keys.  It is
> > very possible people will do this even if it is not a valid encoding.
> 
> The reason why 7049 does not outright make emitting duplicate keys a
> conformance issue is that, in a streaming implementation, it is hard to check
> for duplicate keys.  That argument does not apply to an implementation that
> has to sort map member keys to achieve a canonical encoding.  So, while
> creating canonical encoding, the implementation can also check for key
> duplication and raise an error if that happens.
> 
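The duplicate check described above can be sketched as follows. This is a minimal sketch, assuming the keys are already available as canonical encodings (binary strings); `sort_and_check_duplicates` is a hypothetical helper for illustration, not part of RFC 7049 or of any gem, and it uses the length-first order that RFC 7049 currently suggests.

```ruby
# Hypothetical helper (an assumption for illustration only).
# pairs: array of [encoded_key, value], where encoded_key is the canonical
# CBOR encoding of the key as a binary String.
def sort_and_check_duplicates(pairs)
  # RFC 7049 order: shorter encodings first, then bytewise comparison.
  sorted = pairs.sort_by { |encoded_key, _value| [encoded_key.bytesize, encoded_key] }

  # After sorting, identical keys are adjacent, so one pass finds them.
  sorted.each_cons(2) do |(key_a, _value_a), (key_b, _value_b)|
    if key_a == key_b
      raise ArgumentError, "duplicate map key: #{key_a.unpack1('H*')}"
    end
  end

  sorted
end
```

Because the sort already brings equal keys next to each other, the check adds a single linear pass and no extra data structure.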
> >> Now, if it were mid-2013, I would just agree with you, change the
> >> draft, and ask the WG (at the time appsawg) whether there are any
> >> further comments.
> >
> > Given that you did not agree with me at the time when I raised this issue, I
> > would disagree with this statement.
> 
> Well, I probably should have said “knowing what I know now” (i.e., the
> degree of POLS violation that the current rules pose) I would agree with you.
> (At the time, the definition we now have in RFC 7049 appeared to be
> expedient.)
> 
> >> Unfortunately, CBOR has been out there, we are on the way to an
> >> Internet STD, and the current canonicalization is what’s implemented
> >> (at least in a couple of places).  If we change this, we have to take
> >> canonicalization out from the STD and put it into a separate specification.
> >> We could decide to do this, but I’m not sure that this helps.  (We’ll
> >> also have the CER vs. DER situation again.)
> >
> > I do not believe that changing this would in any way stop the advancement
> > to STD.
> 
> Now that (i.e., just going ahead and changing things here) is an interesting
> thought.
> 
> One problem with that is that in the IETF, the IESG is the gating function, and
> it is sufficient for a single member of the IESG to disagree with this thought to
> hold up the process.
> 
> > This section is completely non-normative; there is not a single normative
> > statement in the entire section.  Nor is the set of rules complete, given that
> > there are two paragraphs at the end which have additional rules that "might"
> > need to be added.  Making this one change would not alter the fact that it is
> > non-normative and incomplete.
> 
> Filling in those blanks might be another argument for going ahead and
> writing a separate document about canonicalized CBOR instead.
> 
> > There is a possible multiple encoding issue that may come, but that is
> > already implicit in the text of this section.  The current text reads "Those
> > protocols are free to define what they mean by a canonical format and what
> > encoders and decoders are expected to do."  Which means that multiple
> > encodings are already not only possible but probable.  I think that we can get
> > this fixed now and go forward with an obsoleted suggested canonicalization
> > and be fine.
> >
> > The suggested algorithm is far easier to understand, easier to get right and
> > also has some advantages where one can do the canonicalization without
> > having to do the encoding first.
> >
> > int CompareNodes(node1, node2)
> >   if node1.majorType != node2.majorType then
> >     return node1.majorType - node2.majorType;
> >   if node1.majorType has a length field and node1.length != node2.length then
> >     return node1.length - node2.length;
> >   return compare node1 and node2 values - this is major type dependent.
> >
> > With the current method, you need to do a lot more work to try and get
> > things in the correct order if you have any mixing of major types, which is
> > now very commonplace despite the statement that this is probably a bad
> > practice.
> 
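The comparison pseudocode above could look roughly like this in Ruby. It is only a sketch under stated assumptions: the `Node` struct and `compare_nodes` are hypothetical names, only integers and strings are handled, and a negative integer is represented by its encoded argument (so -1 is stored as 0), which is one way to make the value comparison match the encoded order.

```ruby
# Hypothetical node model (an assumption for illustration only):
#   major_type - CBOR major type (0 = unsigned int, 1 = negative int,
#                2 = byte string, 3 = text string)
#   length     - nil for types without a length field, else the string length
#   value      - integer argument (types 0 and 1) or raw bytes (types 2 and 3)
Node = Struct.new(:major_type, :length, :value)

def compare_nodes(a, b)
  # 1. Different major types: the lower major type sorts first.
  by_type = a.major_type <=> b.major_type
  return by_type unless by_type.zero?

  # 2. Same major type with a length field: the shorter item sorts first.
  if a.length
    by_length = a.length <=> b.length
    return by_length unless by_length.zero?
  end

  # 3. Same type and length: compare the values themselves.
  a.value <=> b.value
end
```

With this model, `nodes.sort { |a, b| compare_nodes(a, b) }` orders the keys without first producing their encodings; arrays, maps and tags would need a recursive value comparison that this sketch leaves out.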
> Indeed, this is one thing we have learned since 2013: There are quite good
> reasons for mixing major types in the keys of a single map.
> 
> > If you have to emit all of the keys, sort them and then remember the
> > original order to emit the values, it is harder than concatenating the entire
> > thing and then emitting the values after doing the sort.  You are going to
> > need to keep some type of more complex structure - and thus more code - to
> > do the emission in the generic case.  Yes, for a small fixed set of keys this is
> > not necessary, but I do not believe that is where a good canonicalization
> > routine is going to be needed.
> 
> Here is my implementation of a pre-canonicalizer for maps from the
> cbor-canonical gem (the type “Hash” in Ruby is an order-preserving map, so we
> can do all this entirely at the data model level):
> 
>       def cbor_pre_canonicalize
>         Hash[collect {|k, v|
>                       k = k.cbor_pre_canonicalize
>                       v = v.cbor_pre_canonicalize
>                       cc = k.to_cbor # already canonical
>                       [cc.size, cc, k, v]}.sort.collect{|s, cc, k, v| [k, v]}]
>       end
> 
> A sorting rule that is entirely based on (byte-wise) lexical ordering would
> enable pre-sorting on types — there often would be no need to generate the
> entire key encoding for sorting.  But then, you have to generate them
> anyway, so the additional overhead is mostly a matter of memory
> allocation/copying.
> 
> Here is an (untested) implementation of what I think you are proposing:
> 
>       def cbor_pre_canonicalize
>         Hash[collect {|k, v|
>                       k = k.cbor_pre_canonicalize
>                       v = v.cbor_pre_canonicalize
>                       cc = k.to_cbor # already canonical
>                       [cc, k, v]}.sort.collect{|cc, k, v| [k, v]}]
>       end
> 
> I.e., the size of the key encoding is removed from the list of things to sort by.
> 
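To make the difference between the two sort rules concrete, here is a small self-contained sketch. It assumes nothing beyond plain Ruby: the sample keys are hand-encoded CBOR byte strings, and `SAMPLE_KEYS`, `length_first_order` and `bytewise_order` are hypothetical names chosen for this illustration.

```ruby
# Hand-encoded CBOR keys (label => encoded bytes), written out by hand so
# that the sketch needs no CBOR library.
SAMPLE_KEYS = {
  "10"    => "\x0a".b,
  "100"   => "\x18\x64".b,
  "-1"    => "\x20".b,
  '"z"'   => "\x61\x7a".b,
  '"aa"'  => "\x62\x61\x61".b,
  "[100]" => "\x81\x18\x64".b,
  "[-1]"  => "\x81\x20".b,
  "false" => "\xf4".b
}.freeze

# Current RFC 7049 suggestion: shorter encodings first, then bytewise.
def length_first_order(keys)
  keys.sort_by { |_label, cc| [cc.bytesize, cc] }.map(&:first)
end

# Rule with the size removed: plain bytewise comparison of the encodings.
def bytewise_order(keys)
  keys.sort_by { |_label, cc| cc }.map(&:first)
end
```

For these sample keys, `length_first_order(SAMPLE_KEYS)` gives 10, -1, false, 100, "z", [-1], "aa", [100], while `bytewise_order(SAMPLE_KEYS)` gives 10, 100, -1, "z", "aa", [100], [-1], false. Only the second keeps keys of the same major type together.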
> > I strongly urge that this change be done even if it might hurt backwards
> > compatibility and the sooner we make the decision to do it the better as it
> > reduces the number of people who will do it wrong.
> 
> Now whether we should do this or not is a good question for the CBOR WG
> to look at.
> 
> (I’ve added the WG back to the recipient list.)
> 
> Grüße, Carsten