Hi Raphael,

Thanks for explaining! Using "descendants" is definitely better for sec= urity, but also, as Hubert mentioned, less efficient.

Also we did understand =93resolution=94 the same way as you when we proposed th= e strong parent hash (sorry for the confusion).

But we obviously really want to understand if/why using resolution is insecure= . Ultimately, the security property we care about is that privacy and authe= nticity of application messages holds. The Parent Hash mechanism helps us e= nsure this holds also for epochs in adversarially generated groups, as long as the adversary does not contr= ol any leaves in this epoch (i.e. does not know the signing key of the leaf= ). So to really see if your examples break security, it would be good to se= e if they let us violate this security goal.

To see why we believe the new parent hash notion (i.e. with resolutions) achi= eves this, let us first clarify what technical property about (even adversa= rially produced) ratchet trees we want from verifying parent hashes. It=92s= not that the adversary can=92t produce manipulated trees at all. It=92s that the only manipulations she is able t= o do will not let her produce a tree that violates the tree invariant. That= =92s because (in our model) when the tree invariant holds then we can show = that the epoch is secure.

So some manipulations *are* possible. They are just =93benign=94. For example= , suppose adversary A controls a leaf L with HPKE key pk in tree T (i.e. sh= e knows the signing key of L=92s owner). Then A can construct a new tree T= =92 identical to T but where leaf L has pk=92 !=3D pk (say, by performing a commit at leaf L to obtain T=92). But for th= is, in T=92 leaf L must still be controlled by A. We call this manipulation= =93benign=94 because even though A might now break privacy of application = messages in epoch with tree T=92 it doesnt violate our security goal because privacy is conditioned on A *not* controlling an= y leaves.

Conversely, one thing A can *not* do is the following. Say T=92 has a node V with HPKE= public key pk0 such that A knows corresponding sk0. Then for T=92 to pass = verification, it must contain a leaf controlled by A. This holds since for = T=92 to pass verification there must be a leaf in V=92s subtree with a signature covering a hash that includes pk0= . I.e. some leaf must =93attest=94 to having inserted pk0 into T=92. (Recal= l that in our model the adversary can=92t just leak sk0 alone. They must le= ak a full state of some client which also means leaking the client's signing key.)

So our question to you is the following. (Does this make sense? And) we=92re = not exactly sure what you have in mind when you say =93swapping out two lea= f nodes=94. But we think it would only be a problem if it is not benign. Is= this true for your attack? Could you expand a bit on what tree manipulation you have in mind?

Best,

Jo=EBl

I think that if you use "descendants" as=
you define it, then calculating the parent_hash will become O(n) all the t=
ime, rather than O(log n) in the best-case, since the size of the descendan=
ts list will be O(n). I'm not sure if there's
a good way to maintain O(log n) and still provide the desired security.

On Thu, 31 Dec 2020, at 06:47, Raphael Robert wrote:

> Hi Jo=EBl, Daniel and Marta,

>

> I just implemented the stronger parent hash scheme you proposed in the=

> way it is described in the spec now.

> The good news is that it is implementable, the bad news is that is doe= s

> not solve the problem you initially describe. Specifically, swapping <= br> > out two leaf nodes still goes undetected.

>

> I think it is due to a terminology conflict, but it would be great if =

> you could validate that assumption.

> In 7.4 you introduce the notion of =93original child resolution=94 as = " the

> array of HPKEPublicKey values of the nodes in the resolution of S=94. = The

> term =93resolution=94 is defined in 5.2. It is essentially a way of de= aling

> with blank nodes by substituting them with the first non-blank

> descendants you find when descending the subtree of a node.

> Here is my assumption: You probably thought the resolution would cover=

> all non-blank descendants of a node, not just the first non-blank

> descendant. In that case we would need another function that does that=

> (let=92s call it =93descendants=94).

>

> If we look at the example of 5.2:

>

> - The resolution of node 3 is the list [A, CD, C]

> - The descendants of node 3 is the list [A, CD, C, D]

>

> - The resolution of node 5 is the list [CD]

> - The descendants of node 5 is the list [CD, C, D] (we just need= to

> additionally filter out C, because it is an unmerged leaf of node 5)**
> **

> Here is a code snippet of what the =93descendants=94 function would lo= ok

> like (sorry that it=92s Rust, not Python):

>

> fn descendants(x: NodeIndex, size: LeafIndex) -> Vec<NodeIndex&g= t; {

> if level(x) =3D=3D 0 {

> vec![x]

> } else {

> let left_child =3D lef= t(x);

> let right_child =3D ri= ght(x, size);

> [

> = ; descendants(left_child, size),

> = ; vec![x],

> = ; descendants(right_child, size),

> ]

> .concat()

> }

> }

>

> There=92s a more efficient implementation of course, this is just to i= llustrate.

>

> In other words, the new =93original child resolution=94 of S would be = the

> subtree under S, but with filtered-out blank nodes and unmerged leaves=

> of S.

>

> If my assumption is correct, I=92ll follow up quickly with a PR since =

> this is currently a blocker for interop based on draft-11.

>

> Thanks

>

> Raphael

>

>

>

> > On 12. Nov 2020, at 22:14, Joel Alwen <jalwen@wickr.com> wr= ote:

> >

> > Hey people,

> >

> > Just a quick heads up that Daniel, Marta and I put in a PR with a= new version of

> > strong parent hash to prevent the attacks from the earlier thread= . We tried to

> > make it as minimal as we could to help with deniability while sti= ll preventing

> > the attacks permitted by weak parent hashes.

> >

> > In a nutshell, when computing parent_hash at node v with parent p= and sibling w

> > we include

> > - p's HPKE pubkey

> > - p's parent_hash value and

> > - HPKE pub keys in the resolution of w except for those belonging= to leaves

> > unmerged at p.

> >

> > As a sanity check, notice that as long as p's keys remain the sam= e one can

> > always recompute the same parent_hash value at v as was initially= computed by

> > the member that set p's keys. (In other words, new members can ve= rify that the

> > stored parent_hash values match whats in the tree.) In particular= , that's coz as

> > long as p's keys are unchanged so is the resolution of w. The onl= y exception are

> > leaves being added as unmerged at w. But those leaves are also ad= ded as unmerged

> > at p so they are left out of the hash.

> >

> > As for deniability, at least parent_hash only binds HPKE keys and= nothing else

> > (like, say, credentials or signatures). Its the best we were able= to come up

> > with for now...

> >

> > - jo=EBl

> >

> > _______________________________________________

> > MLS mailing list

> > MLS@ietf.org

> > https://www= .ietf.org/mailman/listinfo/mls

>

> _______________________________________________

> MLS mailing list

> MLS@ietf.org

> https://www.ietf= .org/mailman/listinfo/mls

>

_______________________________________________

MLS mailing list

MLS@ietf.org

https://www.ietf.org/= mailman/listinfo/mls

--_000_32c5c409abbc4cf79ecf74c9e2a2befainfethzch_--
From nobody Fri Jan 1 08:30:12 2021
Return-Path: On Thu, 31 Dec 2020, at 06:47, Raphael Robert wrote:

> Hi Jo=EBl, Daniel and Marta,

>

> I just implemented the stronger parent hash scheme you proposed in the=

> way it is described in the spec now.

> The good news is that it is implementable, the bad news is that is doe= s

> not solve the problem you initially describe. Specifically, swapping <= br> > out two leaf nodes still goes undetected.

>

> I think it is due to a terminology conflict, but it would be great if =

> you could validate that assumption.

> In 7.4 you introduce the notion of =93original child resolution=94 as = " the

> array of HPKEPublicKey values of the nodes in the resolution of S=94. = The

> term =93resolution=94 is defined in 5.2. It is essentially a way of de= aling

> with blank nodes by substituting them with the first non-blank

> descendants you find when descending the subtree of a node.

> Here is my assumption: You probably thought the resolution would cover=

> all non-blank descendants of a node, not just the first non-blank

> descendant. In that case we would need another function that does that=

> (let=92s call it =93descendants=94).

>

> If we look at the example of 5.2:

>

> - The resolution of node 3 is the list [A, CD, C]

> - The descendants of node 3 is the list [A, CD, C, D]

>

> - The resolution of node 5 is the list [CD]

> - The descendants of node 5 is the list [CD, C, D] (we just need= to

> additionally filter out C, because it is an unmerged leaf of node 5)

> Here is a code snippet of what the =93descendants=94 function would lo= ok

> like (sorry that it=92s Rust, not Python):

>

> fn descendants(x: NodeIndex, size: LeafIndex) -> Vec<NodeIndex&g= t; {

> if level(x) =3D=3D 0 {

> vec![x]

> } else {

> let left_child =3D lef= t(x);

> let right_child =3D ri= ght(x, size);

> [

> = ; descendants(left_child, size),

> = ; vec![x],

> = ; descendants(right_child, size),

> ]

> .concat()

> }

> }

>

> There=92s a more efficient implementation of course, this is just to i= llustrate.

>

> In other words, the new =93original child resolution=94 of S would be = the

> subtree under S, but with filtered-out blank nodes and unmerged leaves=

> of S.

>

> If my assumption is correct, I=92ll follow up quickly with a PR since =

> this is currently a blocker for interop based on draft-11.

>

> Thanks

>

> Raphael

>

>

>

> > On 12. Nov 2020, at 22:14, Joel Alwen <jalwen@wickr.com> wr= ote:

> >

> > Hey people,

> >

> > Just a quick heads up that Daniel, Marta and I put in a PR with a= new version of

> > strong parent hash to prevent the attacks from the earlier thread= . We tried to

> > make it as minimal as we could to help with deniability while sti= ll preventing

> > the attacks permitted by weak parent hashes.

> >

> > In a nutshell, when computing parent_hash at node v with parent p= and sibling w

> > we include

> > - p's HPKE pubkey

> > - p's parent_hash value and

> > - HPKE pub keys in the resolution of w except for those belonging= to leaves

> > unmerged at p.

> >

> > As a sanity check, notice that as long as p's keys remain the sam= e one can

> > always recompute the same parent_hash value at v as was initially= computed by

> > the member that set p's keys. (In other words, new members can ve= rify that the

> > stored parent_hash values match whats in the tree.) In particular= , that's coz as

> > long as p's keys are unchanged so is the resolution of w. The onl= y exception are

> > leaves being added as unmerged at w. But those leaves are also ad= ded as unmerged

> > at p so they are left out of the hash.

> >

> > As for deniability, at least parent_hash only binds HPKE keys and= nothing else

> > (like, say, credentials or signatures). Its the best we were able= to come up

> > with for now...

> >

> > - jo=EBl

> >

> > _______________________________________________

> > MLS mailing list

> > MLS@ietf.org

> > https://www= .ietf.org/mailman/listinfo/mls

>

> _______________________________________________

> MLS mailing list

> MLS@ietf.org

> https://www.ietf= .org/mailman/listinfo/mls

>

_______________________________________________

MLS mailing list

MLS@ietf.org

https://www.ietf.org/= mailman/listinfo/mls

I=E2=80=99m glad we are on the same =
page regarding the definition of the resolution. I think I understand =
where the misunderstanding is coming from.

Initially you proposed a =E2=80=9Cstrong =
hash=E2=80=9D solution on the ML (on Oct 23 2020) that was the =
following:

"strong parent_hash" : Define parent_hash to include = tree_hash (and tree_hash to*not* include = parent_hash). The tree_hash of a node covers all values in thesubtree rooted at that node (except the parent_hashes = and signatures). Sobasically something like = this:- leaf.tree_hash :=3D = H(leaf.data)- node.tree_hash :=3D = H(node.data, node.lchild.tree_hash, node.rchild.tree_hash)= - root.parent_hash :=3D 0- = node.parent_hash :=3D H(node.parent.tree_hash, = node.parent.parent_hash).

This definition did =
include the whole subtree and therefore I was assuming that this would =
also end up in the spec change. As far as I now understand, you went for =
a more lightweight approach where you say that just using the resolution =
is enough, because pinning all nodes in the subtree is not necessary and =
would only prevent =E2=80=9Cbenign=E2=80=9D attacks. I guess I missed =
the part where you switched from one concept to the other.

What I discovered during =
implementation is that when you only use the resolution approach and the =
tree has no blanks, you are not always able to detect if the attacker =
has swapped two leaves. This might indeed just be a benign attack in the =
sense that the tree invariant is not violated.

At this point I don=E2=80=99=
t have strong arguments for or against using the resolution approach, I =
would have to think about it some more. Just as a reminder for context, =
the original concept by Benjamin and Karthik was =E2=80=9Cparent hash =
covers the tree hash=E2=80=9D. That=E2=80=99s pretty much what you =
initially proposed, except that it also covered the KeyPackage at the =
leaf level, not just the HPKE public key.

My question right now would be: What is =
the benefit of the lightweight resolution-based approach? If it=E2=80=99s =
just efficiency, we should look at whether there=E2=80=99s really that =
much to gain.

On=
efficiency:

The parent hashes are verified in two =
instances: a) when a new member joins the group and validates the public =
tree it received and b) when a member receives a full Commit from =
another member.

Re a): The spec now says "Moreover, when joining a group, new =
members MUST authenticate each non-blank parent node P. A parent node P =
is authenticated=E2=80=9D. This means the effort is linear to start =
with. Depending on how you define the operation, it is even worse than =
linear. Say the tree has n leaves and therefore 2n-1 nodes. For every =
node, we need to iterate over at least a part of the subtree and collect =
public keys that we then concatenate. We end up only hashing once per =
node, but with a variable size payload. I think the cost of navigating =
the subtree and collecting the keys is O(log n) for the =E2=80=9Cresolutio=
n approach and O(n) for the =E2=80=9Cdescendants" approach and the cost =
of hashing is linearly proportional to the size of the payload. If we =
assume that hashing is the most expensive operation, the total cost =
would be O(n * log n) for =E2=80=9Cresolution=E2=80=9D and O(n ^ 2) for =
=E2=80=9Cdescendants".

Re b): It is very similar to a), except =
that we only have to iterate over log n nodes of the direct path, not =
the whole tree. That means that the total effort is O(log n * log n) for =
=E2=80=9Cresolution=E2=80=9D and O(n * log n) for =
=E2=80=9Cdescendants=E2=80=9D.

There=E2=80=99s clearly a difference in =
terms of efficiency between the two approaches. That being said, we =
should keep in mind that for small payloads, hashing is a lot faster =
than other cryptographic operations like e.g. producing and verifying =
signatures. Since a new joiner needs to also verify n signatures when =
joining a group, it could well be that the effort of verifying the =
parent hashes is negligible compared to the cost of verifying all =
signatures.

In =
summary: I still don=E2=80=99t have strong arguments against the =
=E2=80=9Cresolution=E2=80=9D approach, but I also have a feeling that =
the efficiency differences are not dramatic either.

Happy New Year and =
thanks!

Raphael

On 1. Jan 2021, at 15:48, = Mularczyk Marta <marta.mularczyk@inf.ethz.ch> wrote:Hi Raphael,Thanks for explaining! Using "descendants" is definitely better for security, = but also, as Hubert mentioned, less efficient.Also we did understand =E2=80=9Cresolution=E2=80=9D the same way as you when = we proposed the strong parent hash (sorry for the = confusion).But we obviously really want to understand if/why using resolution is = insecure. Ultimately, the security property we care about is that = privacy and authenticity of application messages holds. The Parent Hash = mechanism helps us ensure this holds also for epochs in adversarially generated groups, as long as the adversary does not = control any leaves in this epoch (i.e. does not know the signing key of = the leaf). So to really see if your examples break security, it would be = good to see if they let us violate this security goal.To see why we believe the new parent hash notion (i.e. with resolutions) = achieves this, let us first clarify what technical property about (even = adversarially produced) ratchet trees we want from verifying parent = hashes. It=E2=80=99s not that the adversary can=E2=80=99t produce manipulated trees at all. It=E2=80=99s that the only manipulations she = is able to do will not let her produce a tree that violates the tree = invariant. That=E2=80=99s because (in our model) when the tree invariant = holds then we can show that the epoch is secure.So some manipulations *are* possible. They are just =E2=80=9Cbenign=E2=80=9D= . For example, suppose adversary A controls a leaf L with HPKE key pk in = tree T (i.e. she knows the signing key of L=E2=80=99s owner). Then A can = construct a new tree T=E2=80=99 identical to T but where leaf L has = pk=E2=80=99 !=3D pk (say, by performing a commit at leaf L to obtain T=E2=80=99). = But for this, in T=E2=80=99 leaf L must still be controlled by A. We = call this manipulation =E2=80=9Cbenign=E2=80=9D because even though A = might now break privacy of application messages in epoch with tree T=E2=80= =99 it doesnt violate our security goal because privacy is conditioned on A *not* controlling = any leaves.Conversely, one thing A can *not* do is the following. Say T=E2=80=99 has a node V = with HPKE public key pk0 such that A knows corresponding sk0. Then for = T=E2=80=99 to pass verification, it must contain a leaf controlled by A. = This holds since for T=E2=80=99 to pass verification there must be a leaf in V=E2=80=99s subtree with a signature covering a hash that = includes pk0. I.e. some leaf must =E2=80=9Cattest=E2=80=9D to having = inserted pk0 into T=E2=80=99. (Recall that in our model the adversary = can=E2=80=99t just leak sk0 alone. They must leak a full state of some = client which also means leaking the client's signing key.)So our question to you is the following. (Does this make sense? And) = we=E2=80=99re not exactly sure what you have in mind when you say = =E2=80=9Cswapping out two leaf nodes=E2=80=9D. But we think it would = only be a problem if it is not benign. Is this true for your attack? = Could you expand a bit on what tree manipulation you have in = mind?Best,Jo=C3=ABl & MartaFrom:MLS= <mls-bounces@ietf.org> on behalf of Hubert Chathi = <hubertc@matrix.org>Sent:Thursday, December 31, 2020 = 5:36:53 PMTo:mls@ietf.orgSubject:Re: [MLS] New Parent = HashI think that if you use = "descendants" as you define it, then calculating the parent_hash will = become O(n) all the time, rather than O(log n) in the best-case, since = the size of the descendants list will be O(n). I'm not sure if = there's a good way to maintain O(log n) and still provide the desired = security.

On Thu, 31 Dec 2020, at 06:47, = Raphael Robert wrote:

> Hi Jo=C3=ABl, Daniel and = Marta,

>

> I just = implemented the stronger parent hash scheme you proposed in the

> way it = is described in the spec now.

> The good news is that = it is implementable, the bad news is that is does

> not = solve the problem you initially describe. Specifically, swapping

> out two = leaf nodes still goes undetected.

>

> I think = it is due to a terminology conflict, but it would be great if

> you = could validate that assumption.

> In 7.4 you introduce = the notion of =E2=80=9Coriginal child resolution=E2=80=9D as " the

> array = of HPKEPublicKey values of the nodes in the resolution of S=E2=80=9D. = The

> = term =E2=80=9Cresolution=E2=80=9D is defined in 5.2. It is essentially a = way of dealing

> with blank nodes by substituting them with the first = non-blank

> descendants you find when descending the subtree of a = node.

> Here is my assumption: You probably thought the = resolution would cover

> all = non-blank descendants of a node, not just the first non-blank

> = descendant. In that case we would need another function that does = that

>= (let=E2=80=99s call it =E2=80=9Cdescendants=E2=80=9D).

>

> If we look at the example of 5.2:

>

> =16=16 = - The resolution of node 3 is the list [A, CD, C]

> = - The descendants of node 3 is the list [A, CD, C, D]

>

> =16=16 - The resolution of node 5 is the list [CD]

> - The descendants of node 5 is the list [CD, C, D] = (we just need to

> additionally filter out C, because it is an unmerged = leaf of node 5)

>

> Here is = a code snippet of what the =E2=80=9Cdescendants=E2=80=9D function would = look

>= like (sorry that it=E2=80=99s Rust, not Python):

>

> fn = descendants(x: NodeIndex, size: LeafIndex) -> Vec<NodeIndex> = {

> if level(x) =3D=3D 0 {

> = vec![x]

> } else {

> let = left_child =3D left(x);

> let = right_child =3D right(x, size);

> [

> = ; descendants(left_child, size),

> = ; vec![x],

> = ; descendants(right_child, size),

> ]

> = .concat()

> }

> = }

>

> There=E2=80=99s a more efficient implementation of = course, this is just to illustrate.

>

> In = other words, the new =E2=80=9Coriginal child resolution=E2=80=9D of S = would be the

> subtree under S, but with filtered-out blank nodes and = unmerged leaves

> of S.

>

> If my = assumption is correct, I=E2=80=99ll follow up quickly with a PR = since

> this is currently a blocker for interop based on = draft-11.

>

> = Thanks

>

> = Raphael

>

>

>

> > On = 12. Nov 2020, at 22:14, Joel Alwen <jalwen@wickr.com> wrote:

> >

> > = Hey people,

> >

> > = Just a quick heads up that Daniel, Marta and I put in a PR with a new = version of

> > strong parent hash to prevent the = attacks from the earlier thread. We tried to

> > = make it as minimal as we could to help with deniability while still = preventing

> > the attacks permitted by weak parent = hashes.

> >

> > In = a nutshell, when computing parent_hash at node v with parent p and = sibling w

> > we include

> > - = p's HPKE pubkey

> > - p's parent_hash value and

> > - HPKE pub keys in the resolution of w except for = those belonging to leaves

> > unmerged at p.

> >

> > As a sanity check, notice that as long as p's keys = remain the same one can

> > always recompute the = same parent_hash value at v as was initially computed by

> > the member that set p's keys. (In other words, new = members can verify that the

> > stored parent_hash = values match whats in the tree.) In particular, that's coz as

> > long as p's keys are unchanged so is the resolution = of w. The only exception are

> > leaves being added = as unmerged at w. But those leaves are also added as unmerged

> > at p so they are left out of the hash.

> >

> > As for deniability, at least parent_hash only = binds HPKE keys and nothing else

> > (like, say, = credentials or signatures). Its the best we were able to come up

> > with for now...

> >

> > - = jo=C3=ABl

> >

> > = _______________________________________________

> > = MLS mailing list

> > MLS@ietf.org

> > https://www.ietf.org/mailman/listinfo/mls

>

> _______________________________________________

> MLS mailing list

> MLS@ietf.org

> https://www.ietf.org/mailman/listinfo/mls

>

_______________________________________________

MLS mailing list

MLS@ietf.org

https://www.ietf.org/mailman/listinfo/mls

On 1. Jan 2021, at 23:10, Joel = Alwen <jalwen@wickr.com> wrote:Servus Raphael,Happy New Year

Same to you! Here's hoping the new one will be a tad happier = than the last. ;-)

As for why "resolutions": The impetus to look for a new = approach after our

initial proposal came from a comment in your email on the = Fri, 23 October 2020

18:23 UTC in the thread "Weak vs. Strong Tree = Authentication". There you wrote:Going for fully hashing the tree = nodes as described locks down the whole

tree structure, = making any reshuffling impossible. Maybe there is a more

lightweight = approach that would yield the same result without =E2=80=9Cprovid[ing]

irrefutable = evidence that the signing key's owner was in a group with (at

least some) = of the other members in the group=E2=80=9D, but I can=E2=80=99t think = of

anything right now.

That was an intriguing challenge = and got us thinking. The result is the current

Parent Hash definition using = resolutions. We realized that to buy wiggle room in

not locking down as much of the = tree (especially, not locking down other members

identities into the hash) we = could pay a "price" that actually doesn't harm the

security we've been aiming for. = Namely, we can allow the adversary some more

tree manipulation but still only = benign ones that dont actually allow her to

break the security property = we've been aiming for.

That being said, this looks =
quite elegant!

In retrospect, a second reason to use resolutions is = efficiency (though to fair,

that was really a coincidence rather than an explicit goal of = the redesign).

Initially, we had been thinking of simply having parent hash = include tree hash

as in the Benjamin-Karthik (BK) proposal of days yore. But we = realized that

since then = MLS had introduced Unmerged Leaves which meant that the tree hash = of

some nodes = can change between when they are assigned their HPKE key and = later

when a new = member wants to verify the parent hash of the node. In other = words

the BK = approach was broken by unmerged leaves. So, to make the "include = the

whole subtree = in parent hash" approach work, it seems new members would, for

each node being verified, = traverse the nodes full sub-tree so as to weed out

those nodes that were added = after the to-be-verified node got its HPKE key. (I

believe this is the O(n) time = that Hubert is referring too?) In contrast, the

new version of parent hash = (using resolutions) has the potential to be much

faster; it requires hashing less = data but also visiting less of the tree. E.g.

when verifying near the root the = difference can be as much as visiting order n

nodes vs. visiting 1 node in the = new construction.

The more blanks & unmerged leaves in the tree the more = the new construction

collapses to the old one. But only in the most extreme case = do they actually

reach parity and the new approach is never slower than the = old.

Needless to = say, we didn't implement this stuff though, let alone benchmark = the

2 approaches = so any hard data elucidating what the actual difference is would = be

very welcome. = Still, the new approach can only be as fast or faster (but = never

slower) than = the new one. Nor do I think it's any more complicated to define = or

implement. = And from a security perspective they both let MLS meet our = security

definition. = So regardless of how small the speed up is (and ignoring the = locking

in the tree = discussion above) it's still not clear to me why we'd want to = switch

back.

A meaningful attack that only works on the new but not old = construction would be

one very quick way of changing my mind though! :-) But, I = believe such an attack

would require either new adversarial capabilities and/or = break a new security

property we didn't consider. (For the adversaries & = security we considered,

there is no security loss with the new approach compared to = the old one.)

I =
think we are now fully aligned on the definitions and the performance =
implications and now that we have better documentation we can continue =
with the interim effort.

I=E2=80=99d =
however like to leave it as an open question whether we shouldn=E2=80=99t =
reconsider locking down entire subtrees as a security-in-depth measure =
if we can convince ourselves the performance toll is not too =
high.

For example, I=E2=80=99m wondering if there =
isn=E2=80=99t a clever way to combine the tree hash and the parent hash. =
The tree hash has to be calculated for every epoch change anyway and =
it=E2=80=99s always in O(n).

Raphael

*From:* MLS <mls-bounces@ietf.org <mailto:mls-bounces@ietf.org>> on behalf of

- Jo=C3=ABl

On 01/01/2021 17:30, Raphael Robert wrote:Thanks = for the lengthy explanation!

I=E2=80=99m = glad we are on the same page regarding the definition of the = resolution.

I think I understand where the misunderstanding is coming = from.

Initially you proposed a =E2=80=9Cstron= g hash=E2=80=9D solution on the ML (on Oct 23 2020)

that was the = following:

"strong parent_hash" : Define = parent_hash to include tree_hash (and

tree_hash to *not* = include parent_hash). The tree_hash of a node covers all

values in the subtree rooted at that node (except the = parent_hashes and

signatures). So basically something like = this: - leaf.tree_hash :=3D

H(leaf.data) - node.tree_hash = :=3D H(node.data, node.lchild.tree_hash,

node.rchild.tree_hash) - root.parent_hash :=3D 0 - = node.parent_hash :=3D

H(node.parent.tree_hash, = node.parent.parent_hash).

This definition = did include the whole subtree and therefore I was assuming

that this = would also end up in the spec change. As far as I now understand,

you went for = a more lightweight approach where you say that just using the

resolution = is enough, because pinning all nodes in the subtree is not

necessary = and would only prevent =E2=80=9Cbenign=E2=80=9D attacks. I guess I = missed the part

where you switched from one concept to the other.

What I discovered during implementation is = that when you only use the

resolution = approach and the tree has no blanks, you are not always able to

detect if = the attacker has swapped two leaves. This might indeed just be a

benign = attack in the sense that the tree invariant is not violated.

At this point I don=E2=80=99t have strong = arguments for or against using the

resolution = approach, I would have to think about it some more. Just as a

reminder for = context, the original concept by Benjamin and Karthik was

=E2=80=9Cparen= t hash covers the tree hash=E2=80=9D. That=E2=80=99s pretty much what = you initially

proposed, except that it also covered the KeyPackage at the = leaf level, not

just the HPKE public key.

My = question right now would be: What is the benefit of the lightweight

resolution-based approach? If it=E2=80=99s just efficiency, = we should look at whether

there=E2=80=99s really that much = to gain.

On efficiency: The parent hashes = are verified in two instances: a) when a

new member joins = the group and validates the public tree it received and b)

when a member receives a full Commit from another member.

Re a): The spec now says "Moreover, when = joining a group, new members MUST

authenticate = each non-blank parent node P. A parent node P is authenticated=E2=80=9D.This means the effort is linear to start with. Depending on = how you define

the operation, it is even worse than = linear. Say the tree has n leaves and

therefore 2n-1 = nodes. For every node, we need to iterate over at least a part

of the subtree and collect public keys that we then = concatenate. We end up

only hashing once per node, but = with a variable size payload. I think the

cost of = navigating the subtree and collecting the keys is O(log n) for the

=E2=80=9Cresolution approach and O(n) for the =E2=80=9Cdescenda= nts" approach and the cost of

hashing is linearly = proportional to the size of the payload. If we assume

that = hashing is the most expensive operation, the total cost would be O(n = *

log n) for =E2=80=9Cresolution=E2=80=9D and O(n ^ 2) for = =E2=80=9Cdescendants".

Re b): It is very = similar to a), except that we only have to iterate over

log = n nodes of the direct path, not the whole tree. That means that the

total effort is O(log n * log n) for =E2=80=9Cresolution=E2=80=9D= and O(n * log n) for

=E2=80=9Cdescendants=E2=80=9D.

There=E2=80=99s clearly a difference in terms of efficiency = between the two

approaches. That being said, we should keep in mind that for = small payloads,

hashing is a lot faster than other cryptographic operations = like e.g.

producing and verifying signatures. Since a new joiner needs = to also verify

n signatures when joining a group, it could = well be that the effort of

verifying = the parent hashes is negligible compared to the cost of verifying

all = signatures.

In summary: I still don=E2=80=99t= have strong arguments against the =E2=80=9Cresolution=E2=80=9D

approach, = but I also have a feeling that the efficiency differences are not

dramatic = either.

Happy New Year and thanks!

RaphaelOn 1. Jan 2021, at 15:48, Mularczyk Marta = <marta.mularczyk@inf.ethz.ch

<mailto:marta.mularczyk@inf.ethz.ch>> wrote:

Hi Raphael,

Thanks = for explaining! Using "descendants" is definitely better for

security, = but also, as Hubert mentioned, less efficient.

Also we did understand =E2=80=9Cresolution=E2=80=9D the same = way as you when we proposed

the strong = parent hash (sorry for the confusion).

But = we obviously really want to understand if/why using resolution is

insecure. = Ultimately, the security property we care about is that privacy

and = authenticity of application messages holds. The Parent Hash = mechanism

helps us ensure this holds also for epochs in adversarially = generated

groups, as long as the adversary does not control any leaves = in this epoch

(i.e. does not know the signing key of the leaf). So to = really see if your

examples break security, it would be good to see if they let = us violate

this security goal.

To see why = we believe the new parent hash notion (i.e. with resolutions)

achieves = this, let us first clarify what technical property about (even

adversarially = produced) ratchet trees we want from verifying parent hashes.

It=E2=80=99s not that the adversary can=E2=80=99t produce = manipulated trees at all. It=E2=80=99s

that the only = manipulations she is able to do will not let her produce a

tree that violates the tree invariant. That=E2=80=99s because = (in our model) when

the tree invariant holds then we can show that the epoch is = secure.

So some manipulations *are* = possible. They are just =E2=80=9Cbenign=E2=80=9D. For example,

suppose adversary A controls a leaf L with HPKE key pk in = tree T (i.e. she

knows the signing key of L=E2=80=99s = owner). Then A can construct a new tree T=E2=80=99

identical to = T but where leaf L has pk=E2=80=99 !=3D pk (say, by performing a = commit

at= leaf L to obtain T=E2=80=99). But for this, in T=E2=80=99 leaf L must = still be

controlled by A. We call this manipulation =E2=80=9Cbenign=E2=80= =9D because even though A

might now = break privacy of application messages in epoch with tree T=E2=80=99 = it

doesnt= violate our security goal because privacy is conditioned on A = *not*

controlling any leaves.

Conversely, one thing A can *not* do is the following. Say = T=E2=80=99 has a node V

with HPKE public key pk0 such that = A knows corresponding sk0. Then for T=E2=80=99

to pass = verification, it must contain a leaf controlled by A. This holds

since for = T=E2=80=99 to pass verification there must be a leaf in V=E2=80=99s = subtree with

a signature covering a hash that includes pk0. I.e. some leaf = must

=E2=80=9Cattest=E2=80=9D to having inserted pk0 into = T=E2=80=99. (Recall that in our model the

adversary = can=E2=80=99t just leak sk0 alone. They must leak a full state of = some

client which also means leaking the client's signing = key.)

So our question to you is the = following. (Does this make sense? And) we=E2=80=99re

not exactly = sure what you have in mind when you say =E2=80=9Cswapping out two = leaf

nodes=E2=80=9D. But we think it would only be a problem if it = is not benign. Is

this true for your attack? Could you expand a bit on what = tree

manipulation you have in mind?

Best, Jo=C3=ABl& Marta

---------------------------------------------------------------= -----------------

Hubert Chathi <hubertc@matrix.org <mailto:hubertc@matrix.org>> *Sent:*<= /blockquote>

Thursday, = December 31, 2020 5:36:53 PM *To:* mls@ietf.org

<mailto:mls@ietf.org> = *Subject:* Re: [MLS] New Parent Hash

I = think that if you use "descendants" as you define it, then = calculating

the parent_hash will become O(n) all the time, rather than = O(log n) in the

best-case, since the size of the = descendants list will be O(n). I'm not

sure if = there's a good way to maintain O(log n) and still provide the

desired = security.

On Thu, 31 Dec 2020, at 06:47, = Raphael Robert wrote:Hi Jo=C3=ABl, Daniel and Marta,<https://www.ietf.org/mailman/listinfo/mls>

I = just implemented the stronger parent hash scheme you proposed in = the

way = it is described in the spec now. The good news is that it is

implementable,= the bad news is that is does not solve the problem you

initially = describe. Specifically, swapping out two leaf nodes still goes

undetected.

I think it is due to a terminology conflict, = but it would be great if

you could validate that = assumption. In 7.4 you introduce the notion of

=E2=80=9Corigi= nal child resolution=E2=80=9D as " the array of HPKEPublicKey values = of

the nodes in the resolution of S=E2=80=9D. The term = =E2=80=9Cresolution=E2=80=9D is defined in

5.2. It is = essentially a way of dealing with blank nodes by substituting

them with the first non-blank descendants you find when = descending the

subtree of a node. Here is my assumption: = You probably thought the

resolution would cover all = non-blank descendants of a node, not just the

first = non-blank descendant. In that case we would need another function

that does that (let=E2=80=99s call it =E2=80=9Cdescendants=E2=80= =9D).

If we look at the example of 5.2:

=16=16 - The resolution of node 3 is the list = [A, CD, C] - The descendants

of node 3 is the list [A, CD, = C, D]

=16=16 - The resolution of node 5 is = the list [CD] - The descendants of node

5 is the = list [CD, C, D] (we just need to additionally filter out C,

because it = is an unmerged leaf of node 5)

Here is a = code snippet of what the =E2=80=9Cdescendants=E2=80=9D function would = look

like (sorry that it=E2=80=99s Rust, not Python):

fn descendants(x: NodeIndex, size: LeafIndex) = -> Vec<NodeIndex> { if

level(x) =3D=3D= 0 { vec![x] } else { let left_child =3D left(x); let

right_child = =3D right(x, size); [ descendants(left_child, size), vec![x],

descendants(right_child, size), ] .concat() } }

There=E2=80=99s a more efficient = implementation of course, this is just to

illustrate.

In other words, the new =E2=80=9Coriginal = child resolution=E2=80=9D of S would be the

subtree = under S, but with filtered-out blank nodes and unmerged leaves

of S.

If my assumption is = correct, I=E2=80=99ll follow up quickly with a PR since this

is currently = a blocker for interop based on draft-11.

Thanks

RaphaelOn 12. Nov 2020, at 22:14, Joel Alwen <jalwen@wickr.com

<mailto:jalwen@wickr.com>> wrote:

Hey people,

Just a quick heads = up that Daniel, Marta and I put in a PR with a new

version of = strong parent hash to prevent the attacks from the earlier

thread. We = tried to make it as minimal as we could to help with

deniability = while still preventing the attacks permitted by weak

parent = hashes.

In a nutshell, when computing = parent_hash at node v with parent p and

sibling w we = include - p's HPKE pubkey - p's parent_hash value and -

HPKE pub = keys in the resolution of w except for those belonging to

leaves = unmerged at p.

As a sanity check, notice = that as long as p's keys remain the same one

can always = recompute the same parent_hash value at v as was initially

computed by = the member that set p's keys. (In other words, new members

can verify = that the stored parent_hash values match whats in the

tree.)= In particular, that's coz as long as p's keys are unchanged so

is the resolution of w. The only exception are leaves being = added as

unmerged at w. But those leaves are also added as = unmerged at p so they

are left out of the hash.

As for deniability, at least parent_hash only = binds HPKE keys and

nothing else (like, say, credentials or signatures). Its the = best we

were able to come up with for now...

- jo=C3=ABl

_______________________________________________ MLS mailing = list

MLS@ietf.org <mailto:MLS@ietf.org>

https://www.ietf.org/mailman/listinfo/mls<https://www.ietf.org/mailman/listinfo/mls>

_______________________________________________ MLS mailing = list

MLS@ietf.org <mailto:MLS@ietf.org>

https://www.ietf.org/mailman/listinfo/mls

_______________________________________________ MLS mailing = list

MLS@ietf.org <mailto:MLS@ietf.org>

https://www.ietf.org/mailman/listinfo/mls

<https://www.ietf.org/mailman/listinfo/mls>

= --Apple-Mail=_F6418678-4B60-4861-9A61-36A84FB4689F-- From nobody Sun Jan 3 00:06:05 2021 Return-Path:

1 pull requests submitted:

- #453 Use t= he GroupContext to derive the joiner_secret (by ericcornelissen)

- https://github.= com/mlswg/mls-architecture
- https://github.com/= mlswg/mls-protocol
- https://github.co= m/mlswg/mls-federation