idnits 2.17.1 draft-ietf-mls-protocol-08.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 4 instances of too long lines in the document, the longest one being 3 characters in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (November 15, 2019) is 1624 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Informational ---------------------------------------------------------------------------- == Missing Reference: 'I-D.ietf-mls-architecture' is mentioned on line 327, but not defined == Missing Reference: 'C' is mentioned on line 679, but not defined == Missing Reference: 'CD' is mentioned on line 679, but not defined == Missing Reference: 'A' is mentioned on line 679, but not defined -- Looks like a reference, but probably isn't: '0' on line 837 -- Looks like a reference, but probably isn't: '1' on line 835 -- Looks like a reference, but probably isn't: '2' on line 833 == Outdated reference: A later version (-12) exists of draft-irtf-cfrg-hpke-00 == Outdated reference: A later version (-42) exists of draft-ietf-trans-rfc6962-bis-33 Summary: 1 error (**), 0 flaws (~~), 7 warnings (==), 5 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group R. Barnes 3 Internet-Draft Cisco 4 Intended status: Informational B. Beurdouche 5 Expires: May 18, 2020 Inria 6 J. Millican 7 Facebook 8 E. Omara 9 Google 10 K. Cohn-Gordon 11 University of Oxford 12 R. Robert 13 Wire 14 November 15, 2019 16 The Messaging Layer Security (MLS) Protocol 17 draft-ietf-mls-protocol-08 19 Abstract 21 Messaging applications are increasingly making use of end-to-end 22 security mechanisms to ensure that messages are only accessible to 23 the communicating endpoints, and not to any servers involved in 24 delivering messages. Establishing keys to provide such protections 25 is challenging for group chat settings, in which more than two 26 clients need to agree on a key but may not be online at the same 27 time. In this document, we specify a key establishment protocol that 28 provides efficient asynchronous group key establishment with forward 29 secrecy and post-compromise security for groups in size ranging from 30 two to thousands. 32 Status of This Memo 34 This Internet-Draft is submitted in full conformance with the 35 provisions of BCP 78 and BCP 79. 37 Internet-Drafts are working documents of the Internet Engineering 38 Task Force (IETF). Note that other groups may also distribute 39 working documents as Internet-Drafts. The list of current Internet- 40 Drafts is at https://datatracker.ietf.org/drafts/current/. 42 Internet-Drafts are draft documents valid for a maximum of six months 43 and may be updated, replaced, or obsoleted by other documents at any 44 time. It is inappropriate to use Internet-Drafts as reference 45 material or to cite them other than as "work in progress." 47 This Internet-Draft will expire on May 18, 2020. 49 Copyright Notice 51 Copyright (c) 2019 IETF Trust and the persons identified as the 52 document authors. All rights reserved. 54 This document is subject to BCP 78 and the IETF Trust's Legal 55 Provisions Relating to IETF Documents 56 (https://trustee.ietf.org/license-info) in effect on the date of 57 publication of this document. Please review these documents 58 carefully, as they describe your rights and restrictions with respect 59 to this document. Code Components extracted from this document must 60 include Simplified BSD License text as described in Section 4.e of 61 the Trust Legal Provisions and are provided without warranty as 62 described in the Simplified BSD License. 64 Table of Contents 66 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 67 1.1. Change Log . . . . . . . . . . . . . . . . . . . . . . . 4 68 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 7 69 3. Basic Assumptions . . . . . . . . . . . . . . . . . . . . . . 7 70 4. Protocol Overview . . . . . . . . . . . . . . . . . . . . . . 8 71 5. Ratchet Trees . . . . . . . . . . . . . . . . . . . . . . . . 12 72 5.1. Tree Computation Terminology . . . . . . . . . . . . . . 12 73 5.2. Ratchet Tree Nodes . . . . . . . . . . . . . . . . . . . 14 74 5.3. Views of a Ratchet Tree . . . . . . . . . . . . . . . . . 15 75 5.4. Ratchet Tree Updates . . . . . . . . . . . . . . . . . . 16 76 5.5. Synchronizing Views of the Tree . . . . . . . . . . . . . 17 77 6. Cryptographic Objects . . . . . . . . . . . . . . . . . . . . 19 78 6.1. Ciphersuites . . . . . . . . . . . . . . . . . . . . . . 19 79 6.1.1. Curve25519, SHA-256, and AES-128-GCM . . . . . . . . 20 80 6.1.2. P-256, SHA-256, and AES-128-GCM . . . . . . . . . . . 20 81 6.2. Credentials . . . . . . . . . . . . . . . . . . . . . . . 21 82 6.3. Tree Hashes . . . . . . . . . . . . . . . . . . . . . . . 23 83 6.4. Group State . . . . . . . . . . . . . . . . . . . . . . . 24 84 6.5. Direct Paths . . . . . . . . . . . . . . . . . . . . . . 25 85 6.6. Key Schedule . . . . . . . . . . . . . . . . . . . . . . 26 86 6.7. Encryption Keys . . . . . . . . . . . . . . . . . . . . . 28 87 7. Initialization Keys . . . . . . . . . . . . . . . . . . . . . 29 88 7.1. Supported Versions and Supported Ciphersuites . . . . . . 30 89 7.2. Expiration . . . . . . . . . . . . . . . . . . . . . . . 31 90 8. Message Framing . . . . . . . . . . . . . . . . . . . . . . . 31 91 8.1. Metadata Encryption . . . . . . . . . . . . . . . . . . . 33 92 8.2. Content Signing and Encryption . . . . . . . . . . . . . 34 93 9. Group Creation . . . . . . . . . . . . . . . . . . . . . . . 35 94 10. Group Evolution . . . . . . . . . . . . . . . . . . . . . . . 37 95 10.1. Proposals . . . . . . . . . . . . . . . . . . . . . . . 37 96 10.1.1. Add . . . . . . . . . . . . . . . . . . . . . . . . 38 97 10.1.2. Update . . . . . . . . . . . . . . . . . . . . . . . 39 98 10.1.3. Remove . . . . . . . . . . . . . . . . . . . . . . . 39 99 10.1.4. External Proposals . . . . . . . . . . . . . . . . . 40 100 10.2. Commit . . . . . . . . . . . . . . . . . . . . . . . . . 40 101 10.2.1. Welcoming New Members . . . . . . . . . . . . . . . 43 102 11. Sequencing of State Changes . . . . . . . . . . . . . . . . . 45 103 11.1. Server-Enforced Ordering . . . . . . . . . . . . . . . . 46 104 11.2. Client-Enforced Ordering . . . . . . . . . . . . . . . . 47 105 12. Application Messages . . . . . . . . . . . . . . . . . . . . 47 106 12.1. Tree of Application Secrets . . . . . . . . . . . . . . 48 107 12.2. Sender Ratchets . . . . . . . . . . . . . . . . . . . . 49 108 12.3. Deletion Schedule . . . . . . . . . . . . . . . . . . . 49 109 12.4. Further Restrictions . . . . . . . . . . . . . . . . . . 51 110 12.5. Message Encryption and Decryption . . . . . . . . . . . 51 111 12.6. Delayed and Reordered Application messages . . . . . . . 52 112 13. Security Considerations . . . . . . . . . . . . . . . . . . . 53 113 13.1. Confidentiality of the Group Secrets . . . . . . . . . . 53 114 13.2. Authentication . . . . . . . . . . . . . . . . . . . . . 53 115 13.3. Forward and post-compromise security . . . . . . . . . . 54 116 13.4. Init Key Reuse . . . . . . . . . . . . . . . . . . . . . 54 117 14. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 54 118 14.1. MLS Ciphersuites . . . . . . . . . . . . . . . . . . . . 54 119 15. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 55 120 16. References . . . . . . . . . . . . . . . . . . . . . . . . . 56 121 16.1. Normative References . . . . . . . . . . . . . . . . . . 56 122 16.2. Informative References . . . . . . . . . . . . . . . . . 57 123 Appendix A. Tree Math . . . . . . . . . . . . . . . . . . . . . 58 124 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 61 126 1. Introduction 128 DISCLAIMER: This is a work-in-progress draft of MLS and has not yet 129 seen significant security analysis. It should not be used as a basis 130 for building production systems. 132 RFC EDITOR: PLEASE REMOVE THE FOLLOWING PARAGRAPH The source for this 133 draft is maintained in GitHub. Suggested changes should be submitted 134 as pull requests at https://github.com/mlswg/mls-protocol. 135 Instructions are on that page as well. Editorial changes can be 136 managed in GitHub, but any substantive change should be discussed on 137 the MLS mailing list. 139 A group of users who want to send each other encrypted messages needs 140 a way to derive shared symmetric encryption keys. For two parties, 141 this problem has been studied thoroughly, with the Double Ratchet 142 emerging as a common solution [doubleratchet] [signal]. Channels 143 implementing the Double Ratchet enjoy fine-grained forward secrecy as 144 well as post-compromise security, but are nonetheless efficient 145 enough for heavy use over low-bandwidth networks. 147 For a group of size greater than two, a common strategy is to 148 unilaterally broadcast symmetric "sender" keys over existing shared 149 symmetric channels, and then for each member to send messages to the 150 group encrypted with their own sender key. Unfortunately, while this 151 improves efficiency over pairwise broadcast of individual messages 152 and provides forward secrecy (with the addition of a hash ratchet), 153 it is difficult to achieve post-compromise security with sender keys. 154 An adversary who learns a sender key can often indefinitely and 155 passively eavesdrop on that member's messages. Generating and 156 distributing a new sender key provides a form of post-compromise 157 security with regard to that sender. However, it requires 158 computation and communications resources that scale linearly with the 159 size of the group. 161 In this document, we describe a protocol based on tree structures 162 that enable asynchronous group keying with forward secrecy and post- 163 compromise security. Based on earlier work on "asynchronous 164 ratcheting trees" [art], the protocol presented here uses an 165 asynchronous key-encapsulation mechanism for tree structures. This 166 mechanism allows the members of the group to derive and update shared 167 keys with costs that scale as the log of the group size. 169 1.1. Change Log 171 RFC EDITOR PLEASE DELETE THIS SECTION. 173 draft-08 175 o Change ClientInitKeys so that they only refer to one ciphersuite 176 (*) 178 o Decompose group operations into Proposals and Commits (*) 180 o Enable Add and Remove proposals from outside the group (*) 182 o Replace Init messages with multi-recipient Welcome message (*) 184 o Add extensions to ClientInitKeys for expiration and downgrade 185 resistance (*) 187 o Allow multiple Proposals and a single Commit in one MLSPlaintext 188 (*) 190 draft-07 191 o Initial version of the Tree based Application Key Schedule (*) 193 o Initial definition of the Init message for group creation (*) 195 o Fix issue with the transcript used for newcomers (*) 197 o Clarifications on message framing and HPKE contexts (*) 199 draft-06 201 o Reorder blanking and update in the Remove operation (*) 203 o Rename the GroupState structure to GroupContext (*) 205 o Rename UserInitKey to ClientInitKey 207 o Resolve the circular dependency that draft-05 introduced in the 208 confirmation MAC calculation (*) 210 o Cover the entire MLSPlaintext in the transcript hash (*) 212 draft-05 214 o Common framing for handshake and application messages (*) 216 o Handshake message encryption (*) 218 o Convert from literal state to a commitment via the "tree hash" (*) 220 o Add credentials to the tree and remove the "roster" concept (*) 222 o Remove the secret field from tree node values 224 draft-04 226 o Updating the language to be similar to the Architecture document 228 o ECIES is now renamed in favor of HPKE (*) 230 o Using a KDF instead of a Hash in TreeKEM (*) 232 draft-03 234 o Added ciphersuites and signature schemes (*) 236 o Re-ordered fields in UserInitKey to make parsing easier (*) 238 o Fixed inconsistencies between Welcome and GroupState (*) 239 o Added encryption of the Welcome message (*) 241 draft-02 243 o Removed ART (*) 245 o Allowed partial trees to avoid double-joins (*) 247 o Added explicit key confirmation (*) 249 draft-01 251 o Initial description of the Message Protection mechanism. (*) 253 o Initial specification proposal for the Application Key Schedule 254 using the per-participant chaining of the Application Secret 255 design. (*) 257 o Initial specification proposal for an encryption mechanism to 258 protect Application Messages using an AEAD scheme. (*) 260 o Initial specification proposal for an authentication mechanism of 261 Application Messages using signatures. (*) 263 o Initial specification proposal for a padding mechanism to 264 improving protection of Application Messages against traffic 265 analysis. (*) 267 o Inversion of the Group Init Add and Application Secret derivations 268 in the Handshake Key Schedule to be ease chaining in case we 269 switch design. (*) 271 o Removal of the UserAdd construct and split of GroupAdd into Add 272 and Welcome messages (*) 274 o Initial proposal for authenticating handshake messages by signing 275 over group state and including group state in the key schedule (*) 277 o Added an appendix with example code for tree math 279 o Changed the ECIES mechanism used by TreeKEM so that it uses nonces 280 generated from the shared secret 282 draft-00 284 o Initial adoption of draft-barnes-mls-protocol-01 as a WG item. 286 2. Terminology 288 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 289 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 290 "OPTIONAL" in this document are to be interpreted as described in BCP 291 14 [RFC2119] [RFC8174] when, and only when, they appear in all 292 capitals, as shown here. 294 Client: An agent that uses this protocol to establish shared 295 cryptographic state with other clients. A client is defined by 296 the cryptographic keys it holds. An application or user may use 297 one client per device (keeping keys local to each device) or sync 298 keys among a user's devices so that each user appears as a single 299 client. In the scenario where multiple devices share the 300 cryptographic material the client is referred to as a "virtual" 301 client. 303 Group: A collection of clients with shared cryptographic state. 305 Member: A client that is included in the shared state of a group, 306 hence has access to the group's secrets. 308 Initialization Key: A short-lived HPKE key pair used to introduce a 309 new client to a group. Initialization keys are published for each 310 client (ClientInitKey). 312 Leaf Key: A secret that represents a member's contribution to the 313 group secret (so called because the members' leaf keys are the 314 leaves in the group's ratchet tree). 316 Identity Key: A long-lived signing key pair used to authenticate the 317 sender of a message. 319 Terminology specific to tree computations is described in Section 5. 321 We use the TLS presentation language [RFC8446] to describe the 322 structure of protocol messages. 324 3. Basic Assumptions 326 This protocol is designed to execute in the context of a Messaging 327 Service (MS) as described in [I-D.ietf-mls-architecture]. In 328 particular, we assume the MS provides the following services: 330 o A long-term identity key provider which allows clients to 331 authenticate protocol messages in a group. These keys MUST be 332 kept for the lifetime of the group as there is no mechanism in the 333 protocol for changing a client's identity key. 335 o A broadcast channel, for each group, which will relay a message to 336 all members of a group. For the most part, we assume that this 337 channel delivers messages in the same order to all participants. 338 (See Section 11 for further considerations.) 340 o A directory to which clients can publish initialization keys and 341 download initialization keys for other participants. 343 4. Protocol Overview 345 The goal of this protocol is to allow a group of clients to exchange 346 confidential and authenticated messages. It does so by deriving a 347 sequence of secrets and keys known only to members. Those should be 348 secret against an active network adversary and should have both 349 forward and post-compromise secrecy with respect to compromise of a 350 participant. 352 We describe the information stored by each client as a _state_, which 353 includes both public and private data. An initial state, including 354 an initial set of clients, is set up by a group creator using the 355 _Init_ algorithm and based on information pre-published by clients. 356 The creator sends the _Init_ message to the clients, who can then set 357 up their own group state and derive the same shared secret. Clients 358 then exchange messages to produce new shared states which are 359 causally linked to their predecessors, forming a logical Directed 360 Acyclic Graph (DAG) of states. Members can send _Update_ messages 361 for post-compromise secrecy and new clients can be added or existing 362 members removed from the group. 364 The protocol algorithms we specify here follow. Each algorithm 365 specifies both (i) how a client performs the operation and (ii) how 366 other clients update their state based on it. 368 There are three major operations in the lifecycle of a group: 370 o Adding a member, initiated by a current member; 372 o Updating the leaf secret of a member; 374 o Removing a member. 376 Each of these operations is "proposed" by sending a message of the 377 corresponding type (Add / Update / Remove). The state of the group 378 is not changed, however, until a Commit message is sent to provide 379 the group with fresh entropy. In this section, we show each proposal 380 being committed immediately, but in more advanced deployment cases, 381 an application might gather several proposals before committing them 382 all at once. 384 Before the initialization of a group, clients publish ClientInitKey 385 objects to a directory provided to the Messaging Service. 387 Group 388 A B C Directory Channel 389 | | | | | 390 | ClientInitKeyA | | | | 391 |------------------------------------------------->| | 392 | | | | | 393 | | ClientInitKeyB | | | 394 | |-------------------------------->| | 395 | | | | | 396 | | | ClientInitKeyC | | 397 | | |--------------->| | 398 | | | | | 400 When a client A wants to establish a group with B and C, it first 401 downloads ClientInitKeys for B and C. It then initializes a group 402 state containing only itself and uses the ClientInitKeys to compute 403 Welcome and Add messages to add B and C, in a sequence chosen by A. 404 The Welcome messages are sent directly to the new members (there is 405 no need to send them to the group). The Add messages are broadcasted 406 to the group, and processed in sequence by B and C. Messages 407 received before a client has joined the group are ignored. Only 408 after A has received its Add messages back from the server does it 409 update its state to reflect their addition. 411 Group 412 A B C Directory Channel 413 | | | | | 414 | ClientInitKeyB, ClientInitKeyC | | 415 |<-------------------------------------------| | 416 |state.init() | | | | 417 | | | | | 418 | | | | Add(A->AB) | 419 | | | | Commit(Add) | 420 |--------------------------------------------------------------->| 421 | | | | | 422 | Welcome(B) | | | | 423 |------------->|state.init() | | | 424 | | | | | 425 | | | | Add(A->AB) | 426 | | | | Commit(Add) | 427 |<---------------------------------------------------------------| 428 |state.add(B) |<------------------------------------------------| 429 | |state.join() | | | 430 | | | | | 431 | | | | Add(AB->ABC) | 432 | | | | Commit(Add) | 433 |--------------------------------------------------------------->| 434 | | | | | 435 | | Welcome(C) | | | 436 |---------------------------->|state.init() | | 437 | | | | | 438 | | | | Add(AB->ABC) | 439 | | | | Commit(Add) | 440 |<---------------------------------------------------------------| 441 |state.add(C) |<------------------------------------------------| 442 | |state.add(C) |<---------------------------------| 443 | | |state.join() | | 445 Subsequent additions of group members proceed in the same way. Any 446 member of the group can download an ClientInitKey for a new client 447 and broadcast an Add message that the current group can use to update 448 their state and the new client can use to initialize its state. 450 To enforce forward secrecy and post-compromise security of messages, 451 each member periodically updates its leaf secret which represents its 452 contribution to the group secret. Any member of the group can send 453 an Update at any time by generating a fresh leaf secret and sending 454 an Update message that describes how to update the group secret with 455 that new information. Once all members have processed this message, 456 the group's secrets will be unknown to an attacker that had 457 compromised the sender's prior leaf secret. 459 It is left to the application to determine the interval of time 460 between Update messages. This policy could require a change for each 461 message, or it could require sending an update every week or more. 463 Group 464 A B ... Z Directory Channel 465 | | | | | 466 | Update(A) | | | | 467 | Commit(Upd) | | | | 468 |---------------------------------------------------------->| 469 | | | | | 470 | | | | Update(A) | 471 | | | | Commit(Upd) | 472 |<----------------------------------------------------------| 473 |state.upd(A) |<-------------------------------------------| 474 | |state.upd(A) |<----------------------------| 475 | | |state.upd(A) | | 476 | | | | | 478 Members are removed from the group in a similar way, as an update is 479 effectively removing the old leaf from the group. Any member of the 480 group can generate a Remove message that adds new entropy to the 481 group state that is known to all members except the removed member. 482 After other participants have processed this message, the group's 483 secrets will be unknown to the removed participant. Note that this 484 does not necessarily imply that any member is actually allowed to 485 evict other members; groups can layer authentication-based access 486 control policies on top of these basic mechanism. 488 Group 489 A B ... Z Directory Channel 490 | | | | | 491 | | | Remove(B) | | 492 | | | Commit(Rem) | | 493 | | |---------------------------->| 494 | | | | | 495 | | | | Remove(B) | 496 | | | | Commit(Rem) | 497 |<----------------------------------------------------------| 498 |state.del(B) | |<----------------------------| 499 | | |state.del(B) | | 500 | | | | | 501 | | | | | 503 5. Ratchet Trees 505 The protocol uses "ratchet trees" for deriving shared secrets among a 506 group of clients. 508 5.1. Tree Computation Terminology 510 Trees consist of _nodes_. A node is a _leaf_ if it has no children, 511 and a _parent_ otherwise; note that all parents in our trees have 512 precisely two children, a _left_ child and a _right_ child. A node 513 is the _root_ of a tree if it has no parents, and _intermediate_ if 514 it has both children and parents. The _descendants_ of a node are 515 that node, its children, and the descendants of its children, and we 516 say a tree _contains_ a node if that node is a descendant of the root 517 of the tree. Nodes are _siblings_ if they share the same parent. 519 A _subtree_ of a tree is the tree given by the descendants of any 520 node, the _head_ of the subtree. The _size_ of a tree or subtree is 521 the number of leaf nodes it contains. For a given parent node, its 522 _left subtree_ is the subtree with its left child as head 523 (respectively _right subtree_). 525 All trees used in this protocol are left-balanced binary trees. A 526 binary tree is _full_ (and _balanced_) if its size is a power of two 527 and for any parent node in the tree, its left and right subtrees have 528 the same size. If a subtree is full and it is not a subset of any 529 other full subtree, then it is _maximal_. 531 A binary tree is _left-balanced_ if for every parent, either the 532 parent is balanced, or the left subtree of that parent is the largest 533 full subtree that could be constructed from the leaves present in the 534 parent's own subtree. Note that given a list of "n" items, there is 535 a unique left-balanced binary tree structure with these elements as 536 leaves. In such a left-balanced tree, the "k-th" leaf node refers to 537 the "k-th" leaf node in the tree when counting from the left, 538 starting from 0. 540 The _direct path_ of a root is the empty list, and of any other node 541 is the concatenation of that node with the direct path of its parent. 542 The _copath_ of a node is the list of siblings of nodes in its direct 543 path. The _frontier_ of a tree is the list of heads of the maximal 544 full subtrees of the tree, ordered from left to right. 546 For example, in the below tree: 548 o The direct path of C is (C, CD, ABCD) 550 o The copath of C is (D, AB, EFG) 551 o The frontier of the tree is (ABCD, EF, G) 553 ABCDEFG 554 / \ 555 / \ 556 / \ 557 ABCD EFG 558 / \ / \ 559 / \ / \ 560 AB CD EF | 561 / \ / \ / \ | 562 A B C D E F G 564 1 1 1 565 0 1 2 3 4 5 6 7 8 9 0 1 2 567 Each node in the tree is assigned an _node index_, starting at zero 568 and running from left to right. A node is a leaf node if and only if 569 it has an even index. The node indices for the nodes in the above 570 tree are as follows: 572 o 0 = A 574 o 1 = AB 576 o 2 = B 578 o 3 = ABCD 580 o 4 = C 582 o 5 = CD 584 o 6 = D 586 o 7 = ABCDEFG 588 o 8 = E 590 o 9 = EF 592 o 10 = F 594 o 11 = EFG 596 o 12 = G 597 (Note that left-balanced binary trees are the same structure that is 598 used for the Merkle trees in the Certificate Transparency protocol 599 [I-D.ietf-trans-rfc6962-bis].) 601 The leaves of the tree are indexed separately, using a _leaf index_, 602 since the protocol messages only need to refer to leaves in the tree. 603 Like nodes, leaves are numbered left to right. Note that given the 604 above numbering, a node is a leaf node if and only if it has an even 605 node index, and a leaf node's leaf index is half its node index. The 606 leaf indices in the above tree are as follows: 608 o 0 = A 610 o 1 = B 612 o 2 = C 614 o 3 = D 616 o 4 = E 618 o 5 = F 620 o 6 = G 622 5.2. Ratchet Tree Nodes 624 A particular instance of a ratchet tree is based on the following 625 cryptographic primitives, defined by the ciphersuite in use: 627 o An HPKE ciphersuite, which specifies a Key Encapsulation Method 628 (KEM), an AEAD encryption scheme, and a hash function 630 o A Derive-Key-Pair function that produces an asymmetric key pair 631 for the specified KEM from a symmetric secret, using the specified 632 hash function. 634 Each node in a ratchet tree contains up to three values: 636 o A private key (only within direct path, see below) 638 o A public key 640 o An ordered list of leaf indices for "unmerged" leaves (see 641 Section 5.3) 643 o A credential (only for leaf nodes) 644 The conditions under which each of these values must or must not be 645 present are laid out in Section 5.3. 647 A node in the tree may also be _blank_, indicating that no value is 648 present at that node. The _resolution_ of a node is an ordered list 649 of non-blank nodes that collectively cover all non-blank descendants 650 of the node. 652 o The resolution of a non-blank node comprises the node itself, 653 followed by its list of unmerged leaves, if any 655 o The resolution of a blank leaf node is the empty list 657 o The resolution of a blank intermediate node is the result of 658 concatenating the resolution of its left child with the resolution 659 of its right child, in that order 661 For example, consider the following tree, where the "_" character 662 represents a blank node: 664 _ 665 / \ 666 / \ 667 _ CD[C] 668 / \ / \ 669 A _ C D 671 0 1 2 3 4 5 6 673 In this tree, we can see all of the above rules in play: 675 o The resolution of node 5 is the list [CD, C] 677 o The resolution of node 2 is the empty list [] 679 o The resolution of node 3 is the list [A, CD, C] 681 Every node, regardless of whether the node is blank or populated, has 682 a corresponding _hash_ that summarizes the contents of the subtree 683 below that node. The rules for computing these hashes are described 684 in Section 6.3. 686 5.3. Views of a Ratchet Tree 688 We generally assume that each participant maintains a complete and 689 up-to-date view of the public state of the group's ratchet tree, 690 including the public keys for all nodes and the credentials 691 associated with the leaf nodes. 693 No participant in an MLS group has full knowledge of the secret state 694 of the tree, i.e., private keys associated to the nodes. Instead, 695 each member is assigned to a leaf of the tree, which determines the 696 set of secret state known to the member. The credential stored at 697 that leaf is one provided by the member. 699 In particular, MLS maintains the members' views of the tree in such a 700 way as to maintain the _tree invariant_: 702 The private key for a node in the tree is known to a member of 703 the group only if that member's leaf is a descendant of 704 the node or equal to it. 706 In other words, if a node is not blank, then it holds a key pair, and 707 the private key of that key pair is known only to members holding 708 leaves below that node. 710 The reverse implication is not true: A member may not know the 711 private keys of all the intermediate nodes they're below. Such a 712 member has an _unmerged_ leaf. Encrypting to an intermediate node 713 requires encrypting to the node's public key, as well as the public 714 keys of all the unmerged leaves below it. A leaf is unmerged when it 715 is first added, because the process of adding the leaf does not give 716 it access to all of the nodes above it in the tree. Leaves are 717 "merged" as they receive the private keys for nodes, as described in 718 Section 5.4. 720 5.4. Ratchet Tree Updates 722 Nodes in a tree are always updated along the direct path from a leaf 723 to the root. The generator of the update chooses a random secret 724 value "path_secret[0]", and generates a sequence of "path secrets", 725 one for each node from the leaf to the root. That is, path_secret[0] 726 is used for the leaf, path_secret[1] for its parent, and so on. At 727 each step, the path secret is used to derive a new secret value for 728 the corresponding node, from which the node's key pair is derived. 730 path_secret[n] = HKDF-Expand-Label(path_secret[n-1], 731 "path", "", Hash.Length) 732 node_secret[n] = HKDF-Expand-Label(path_secret[n], 733 "node", "", Hash.Length) 734 node_priv[n], node_pub[n] = Derive-Key-Pair(node_secret[n]) 736 For example, suppose there is a group with four members: 738 G 739 / \ 740 / \ 741 / \ 742 E _ 743 / \ / \ 744 A B C D 746 If the second participant (B) subsequently generates an update based 747 on a secret X, then the sender would generate the following sequence 748 of path secrets and node secrets: 750 path_secret[2] ---> node_secret[2] 751 ^ 752 | 753 path_secret[1] ---> node_secret[1] 754 ^ 755 | 756 X = path_secret[0] ---> node_secret[0] 758 After the update, the tree will have the following structure, where 759 "ns[i]" represents the node_secret values generated as described 760 above: 762 ns[2] 763 / \ 764 ns[1] _ 765 / \ / \ 766 A ns[0] C D 768 5.5. Synchronizing Views of the Tree 770 The members of the group need to keep their views of the tree in sync 771 and up to date. When a client proposes a change to the tree (e.g., 772 to add or remove a member), it transmits a handshake message 773 containing a set of public values for intermediate nodes in the 774 direct path of a leaf. The other members of the group can use these 775 public values to update their view of the tree, aligning their copy 776 of the tree to the sender's. 778 To perform an update for a leaf, the sender broadcasts to the group 779 the following information for each node in the direct path of the 780 leaf, as well as the root: 782 o The public key for the node 784 o Zero or more encrypted copies of the path secret corresponding to 785 the node 787 The path secret value for a given node is encrypted for the subtree 788 corresponding to the parent's non-updated child, i.e., the child on 789 the copath of the leaf node. There is one encrypted path secret for 790 each public key in the resolution of the non-updated child. In 791 particular, for the leaf node, there are no encrypted secrets, since 792 a leaf node has no children. 794 The recipient of an update processes it with the following steps: 796 1. Compute the updated path secrets. 798 * Identify a node in the direct path for which the local member 799 is in the subtree of the non-updated child. 801 * Identify a node in the resolution of the copath node for which 802 this node has a private key. 804 * Decrypt the path secret for the parent of the copath node 805 using the private key from the resolution node. 807 * Derive path secrets for ancestors of that node using the 808 algorithm described above. 810 * The recipient SHOULD verify that the received public keys 811 agree with the public keys derived from the new node_secret 812 values. 814 2. Merge the updated path secrets into the tree. 816 * Replace the public keys for nodes on the direct path with the 817 received public keys. 819 * For nodes where an updated path secret was computed in step 1, 820 compute the corresponding node secret and node key pair and 821 replace the values stored at the node with the computed 822 values. 824 * For all updated nodes, set the list of unmerged leaves to the 825 empty list. 827 For example, in order to communicate the example update described in 828 the previous section, the sender would transmit the following values: 830 +------------+----------------------------------+ 831 | Public Key | Ciphertext(s) | 832 +------------+----------------------------------+ 833 | pk(ns[2]) | E(pk(C), ps[2]), E(pk(D), ps[2]) | 834 | | | 835 | pk(ns[1]) | E(pk(A), ps[1]) | 836 | | | 837 | pk(ns[0]) | | 838 +------------+----------------------------------+ 840 In this table, the value pk(X) represents the public key derived from 841 the node secret X. The value E(K, S) represents the public-key 842 encryption of the path secret S to the public key K. 844 6. Cryptographic Objects 846 6.1. Ciphersuites 848 Each MLS session uses a single ciphersuite that specifies the 849 following primitives to be used in group key computations: 851 o A hash function 853 o A Diffie-Hellman finite-field group or elliptic curve group 855 o An AEAD encryption algorithm [RFC5116] 857 The ciphersuite's Diffie-Hellman group is used to instantiate an HPKE 858 [I-D.irtf-cfrg-hpke] instance for the purpose of public-key 859 encryption. The ciphersuite must specify an algorithm "Derive-Key- 860 Pair" that maps octet strings with length Hash.length to HPKE key 861 pairs. 863 Ciphersuites are represented with the CipherSuite type. HPKE public 864 keys are opaque values in a format defined by the underlying Diffie- 865 Hellman protocol (see the Ciphersuites section of the HPKE 866 specification for more information): 868 enum { 869 P256_SHA256_AES128GCM(0x0000), 870 X25519_SHA256_AES128GCM(0x0001), 871 (0xFFFF) 872 } CipherSuite; 874 opaque HPKEPublicKey<1..2^16-1>; 876 6.1.1. Curve25519, SHA-256, and AES-128-GCM 878 This ciphersuite uses the following primitives: 880 o Hash function: SHA-256 882 o AEAD: AES-128-GCM 884 When HPKE is used with this ciphersuite, it uses the following 885 algorithms: 887 o KEM: 0x0002 = DHKEM(Curve25519) 889 o KDF: 0x0001 = HKDF-SHA256 891 o AEAD: 0x0001 = AES-GCM-128 893 Given an octet string X, the private key produced by the Derive-Key- 894 Pair operation is SHA-256(X). (Recall that any 32-octet string is a 895 valid Curve25519 private key.) The corresponding public key is 896 X25519(SHA-256(X), 9). 898 Implementations SHOULD use the approach specified in [RFC7748] to 899 calculate the Diffie-Hellman shared secret. Implementations MUST 900 check whether the computed Diffie-Hellman shared secret is the all- 901 zero value and abort if so, as described in Section 6 of [RFC7748]. 902 If implementers use an alternative implementation of these elliptic 903 curves, they SHOULD perform the additional checks specified in 904 Section 7 of [RFC7748] 906 6.1.2. P-256, SHA-256, and AES-128-GCM 908 This ciphersuite uses the following primitives: 910 o Hash function: SHA-256 912 o AEAD: AES-128-GCM 914 When HPKE is used with this ciphersuite, it uses the following 915 algorithms: 917 o KEM: 0x0001 = DHKEM(P-256) 919 o KDF: 0x0001 = HKDF-SHA256 921 o AEAD: 0x0001 = AES-GCM-128 922 Given an octet string X, the private key produced by the Derive-Key- 923 Pair operation is SHA-256(X), interpreted as a big-endian integer. 924 The corresponding public key is the result of multiplying the 925 standard P-256 base point by this integer. 927 P-256 ECDH calculations (including parameter and key generation as 928 well as the shared secret calculation) are performed according to 929 [IEEE1363] using the ECKAS-DH1 scheme with the identity map as key 930 derivation function (KDF), so that the shared secret is the 931 x-coordinate of the ECDH shared secret elliptic curve point 932 represented as an octet string. Note that this octet string (Z in 933 IEEE 1363 terminology) as output by FE2OSP, the Field Element to 934 Octet String Conversion Primitive, has constant length for any given 935 field; leading zeros found in this octet string MUST NOT be 936 truncated. 938 (Note that this use of the identity KDF is a technicality. The 939 complete picture is that ECDH is employed with a non-trivial KDF 940 because MLS does not directly use this secret for anything other than 941 for computing other secrets.) 943 Clients MUST validate remote public values by ensuring that the point 944 is a valid point on the elliptic curve. The appropriate validation 945 procedures are defined in Section 4.3.7 of [X962] and alternatively 946 in Section 5.6.2.3 of [keyagreement]. This process consists of three 947 steps: (1) verify that the value is not the point at infinity (O), 948 (2) verify that for Y = (x, y) both integers are in the correct 949 interval, (3) ensure that (x, y) is a correct solution to the 950 elliptic curve equation. For these curves, implementers do not need 951 to verify membership in the correct subgroup. 953 6.2. Credentials 955 A member of a group authenticates the identities of other 956 participants by means of credentials issued by some authentication 957 system, e.g., a PKI. Each type of credential MUST express the 958 following data: 960 o The public key of a signature key pair 962 o The identity of the holder of the private key 964 o The signature scheme that the holder will use to sign MLS messages 966 Credentials MAY also include information that allows a relying party 967 to verify the identity / signing key binding. 969 enum { 970 basic(0), 971 x509(1), 972 (255) 973 } CredentialType; 975 struct { 976 opaque identity<0..2^16-1>; 977 SignatureScheme algorithm; 978 SignaturePublicKey public_key; 979 } BasicCredential; 981 struct { 982 CredentialType credential_type; 983 select (Credential.credential_type) { 984 case basic: 985 BasicCredential; 987 case x509: 988 opaque cert_data<1..2^24-1>; 989 }; 990 } Credential; 992 The SignatureScheme type represents a signature algorithm. Signature 993 public keys are opaque values in a format defined by the signature 994 scheme. 996 enum { 997 ecdsa_secp256r1_sha256(0x0403), 998 ed25519(0x0807), 999 (0xFFFF) 1000 } SignatureScheme; 1002 opaque SignaturePublicKey<1..2^16-1>; 1004 Note that each new credential that has not already been validated by 1005 the application SHOULD be validated against the Authentication 1006 Service. 1008 [[OPEN ISSUE: 1. SHOULD vs MUST. 2. A client that wants to update 1009 its identity key can perform the operation UNDER THIS CONDITION by 1010 adding a new version of herself using a new credential signed under a 1011 new IdentityKey, then performing a remove of the old leaf. This is 1012 fine as long as the credential binds to the same identity for the 1013 application. If this verfication is not met, there is no 1014 authentication guarantee at the application layer anyway.]] 1016 6.3. Tree Hashes 1018 To allow group members to verify that they agree on the cryptographic 1019 state of the group, this section defines a scheme for generating a 1020 hash value that represents the contents of the group's ratchet tree 1021 and the members' credentials. 1023 The hash of a tree is the hash of its root node, which we define 1024 recursively, starting with the leaves. The hash of a leaf node is 1025 the hash of a "LeafNodeHashInput" object: 1027 struct { 1028 uint8 present; 1029 switch (present) { 1030 case 0: struct{}; 1031 case 1: T value; 1032 } 1033 } optional; 1035 struct { 1036 HPKEPublicKey public_key; 1037 Credential credential; 1038 } LeafNodeInfo; 1040 struct { 1041 uint8 hash_type = 0; 1042 optional info; 1043 } LeafNodeHashInput; 1045 The "public_key" and "credential" fields represent the leaf public 1046 key and the credential for the member holding that leaf, 1047 respectively. The "info" field is equal to the null optional value 1048 when the leaf is blank (i.e., no member occupies that leaf). 1050 Likewise, the hash of a parent node (including the root) is the hash 1051 of a "ParentNodeHashInput" struct: 1053 struct { 1054 HPKEPublicKey public_key; 1055 uint32_t unmerged_leaves<0..2^32-1>; 1056 } ParentNodeInfo; 1058 struct { 1059 uint8 hash_type = 1; 1060 optional info; 1061 opaque left_hash<0..255>; 1062 opaque right_hash<0..255>; 1063 } ParentNodeHashInput; 1064 The "left_hash" and "right_hash" fields hold the hashes of the node's 1065 left and right children, respectively. The "public_key" field holds 1066 the hash of the public key stored at this node, represented as an 1067 "optional" object, which is null if and only if the 1068 node is blank. 1070 6.4. Group State 1072 Each member of the group maintains a GroupContext object that 1073 summarizes the state of the group: 1075 struct { 1076 opaque group_id<0..255>; 1077 uint32 epoch; 1078 opaque tree_hash<0..255>; 1079 opaque confirmed_transcript_hash<0..255>; 1080 } GroupContext; 1082 The fields in this state have the following semantics: 1084 o The "group_id" field is an application-defined identifier for the 1085 group. 1087 o The "epoch" field represents the current version of the group key. 1089 o The "tree_hash" field contains a commitment to the contents of the 1090 group's rachet tree and the credentials for the members of the 1091 group, as described in Section 6.3. 1093 o The "confirmed_transcript_hash" field contains a running hash over 1094 the handshake messages that led to this state. 1096 When a new member is added to the group, an existing member of the 1097 group provides the new member with a Welcome message. The Welcome 1098 message provides the information the new member needs to initialize 1099 its GroupContext. 1101 Different changes to the group will have different effects on the 1102 group state. These effects are described in their respective 1103 subsections of Section 10.1. The following general rules apply: 1105 o The "group_id" field is constant 1107 o The "epoch" field increments by one for each Commit message that 1108 is processed 1110 o The "tree_hash" is updated to represent the current tree and 1111 credentials 1113 o The "confirmed_transcript_hash" is updated with the data for an 1114 MLSPlaintext message encoding a Commit message in two parts: 1116 struct { 1117 opaque group_id<0..255>; 1118 uint32 epoch; 1119 uint32 sender; 1120 ContentType content_type = commit; 1121 Proposal proposals<0..2^32-1>; 1122 Commit commit; 1123 } MLSPlaintextCommitContent; 1125 struct { 1126 opaque confirmation<0..255>; 1127 opaque signature<0..2^16-1>; 1128 } MLSPlaintextCommitAuthData; 1130 confirmed_transcript_hash_[n] = 1131 Hash(interim_transcript_hash_[n-1] || 1132 MLSPlaintextCommitContent_[n]); 1134 interim_transcript_hash_[n] = 1135 Hash(confirmed_transcript_hash_[n] || 1136 MLSPlaintextCommitAuthData_[n]); 1138 Thus the "confirmed_transcript_hash" field in a GroupContext object 1139 represents a transcript over the whole history of MLSPlaintext Commit 1140 messages, up to the confirmation field in the current MLSPlaintext 1141 message. The confirmation and signature fields are then included in 1142 the transcript for the next epoch. The interim transcript hash is 1143 passed to new members in the WelcomeInfo struct, and enables existing 1144 members to incorporate a handshake message into the transcript 1145 without having to store the whole MLSPlaintextCommitAuthData 1146 structure. 1148 When a new group is created, the "interim_transcript_hash" field is 1149 set to the zero-length octet string. 1151 6.5. Direct Paths 1153 As described in Section 5.4, each MLS message needs to transmit node 1154 values along the direct path of a leaf. The path contains a public 1155 key for the leaf node, and a public key and encrypted secret value 1156 for intermediate nodes in the path. In both cases, the path is 1157 ordered from the leaf to the root; each node MUST be the parent of 1158 its predecessor. 1160 struct { 1161 opaque kem_output<0..2^16-1>; 1162 opaque ciphertext<0..2^16-1>; 1163 } HPKECiphertext; 1165 struct { 1166 HPKEPublicKey public_key; 1167 HPKECiphertext encrypted_path_secret<0..2^16-1>; 1168 } DirectPathNode; 1170 struct { 1171 DirectPathNode nodes<0..2^16-1>; 1172 } DirectPath; 1174 The length of the "encrypted_path_secret" vector MUST be zero for the 1175 first node in the path. For the remaining elements in the vector, 1176 the number of ciphertexts in the "encrypted_path_secret" vector MUST 1177 be equal to the length of the resolution of the corresponding copath 1178 node. Each ciphertext in the list is the encryption to the 1179 corresponding node in the resolution. 1181 The HPKECiphertext values are computed as 1183 kem_output, context = SetupBaseI(node_public_key, "") 1184 ciphertext = context.Seal(group_context, path_secret) 1186 where "node_public_key" is the public key of the node that the path 1187 secret is being encrypted for, group_context is the current 1188 GroupContext object for the group, and the functions "SetupBaseI" and 1189 "Seal" are defined according to [I-D.irtf-cfrg-hpke]. 1191 Decryption is performed in the corresponding way, using the private 1192 key of the resolution node and the ephemeral public key transmitted 1193 in the message. 1195 6.6. Key Schedule 1197 Group keys are derived using the HKDF-Extract and HKDF-Expand 1198 functions as defined in [RFC5869], as well as the functions defined 1199 below: 1201 HKDF-Expand-Label(Secret, Label, Context, Length) = 1202 HKDF-Expand(Secret, HkdfLabel, Length) 1204 Where HkdfLabel is specified as: 1206 struct { 1207 opaque group_context<0..255> = Hash(GroupContext_[n]); 1208 uint16 length = Length; 1209 opaque label<7..255> = "mls10 " + Label; 1210 opaque context<0..2^32-1> = Context; 1211 } HkdfLabel; 1213 Derive-Secret(Secret, Label) = 1214 HKDF-Expand-Label(Secret, Label, "", Hash.length) 1216 The Hash function used by HKDF is the ciphersuite hash algorithm. 1217 Hash.length is its output length in bytes. In the below diagram: 1219 o HKDF-Extract takes its salt argument from the top and its IKM 1220 argument from the left 1222 o Derive-Secret takes its Secret argument from the incoming arrow 1224 When processing a handshake message, a client combines the following 1225 information to derive new epoch secrets: 1227 o The init secret from the previous epoch 1229 o The update secret for the current epoch 1231 o The GroupContext object for current epoch 1233 Given these inputs, the derivation of secrets for an epoch proceeds 1234 as shown in the following diagram: 1236 init_secret_[n-1] (or 0) 1237 | 1238 V 1239 update_secret -> HKDF-Extract = epoch_secret 1240 | 1241 +--> Derive-Secret(., "sender data", GroupContext_[n]) 1242 | = sender_data_secret 1243 | 1244 +--> Derive-Secret(., "handshake", GroupContext_[n]) 1245 | = handshake_secret 1246 | 1247 +--> Derive-Secret(., "app", GroupContext_[n]) 1248 | = application_secret 1249 | 1250 +--> Derive-Secret(., "confirm", GroupContext_[n]) 1251 | = confirmation_key 1252 | 1253 V 1254 Derive-Secret(., "init", GroupContext_[n]) 1255 | 1256 V 1257 init_secret_[n] 1259 6.7. Encryption Keys 1261 As described in Section 8, MLS encrypts three different types of 1262 information: 1264 o Metadata (sender information) 1266 o Proposal and Commit messages 1268 o Application messages 1270 The sender information used to look up the key for the content 1271 encryption is encrypted under AEAD using a random nonce and the 1272 sender_data_key which is derived from the sender_data_secret as 1273 follows: 1275 sender_data_key = 1276 HKDF-Expand-Label(sender_data_secret, "sd key", "", key_length) 1278 Each handshake message is encrypted using a key and a nonce derived 1279 from the handshake_secret for a specific sender to prevent two 1280 senders to perform in the following way: 1282 handshake_nonce_[sender] = 1283 HKDF-Expand-Label(handshake_secret, "hs nonce", [sender], nonce_length) 1285 handshake_key_[sender] = 1286 HKDF-Expand-Label(handshake_secret, "hs key", [sender], key_length) 1288 Here the value [sender] represents the index of the member that will 1289 use this key to send, encoded as a uint32. Each sender maintains two 1290 "generation" counters, one for application messages and one for 1291 handshake messages. These counters are incremented by one each time 1292 the sender sends a message. 1294 For application messages, a chain of keys is derived for each sender 1295 in a similar fashion. This allows forward secrecy at the level of 1296 application messages within and out of an epoch. A step in this 1297 chain (the second subscript) is called a "generation". The details 1298 of application key derivation are described in the Section 12.1 1299 section below. 1301 For handshake messages (Proposals and Commits), the same key is used 1302 for all messages, but the nonce is updated according to the 1303 generation of the message: 1305 handshake_nonce_[sender]_[generation] = handshake_nonce_[sender] 1306 XOR encode_big_endian(generation) 1308 where "encode_big_endian()" encodes the generation in a big-endian 1309 integer of the same size as the base handshake nonce. 1311 7. Initialization Keys 1313 In order to facilitate asynchronous addition of clients to a group, 1314 it is possible to pre-publish initialization keys that provide some 1315 public information about a user. ClientInitKey messages provide 1316 information about a client that any existing member can use to add 1317 this client to the group asynchronously. 1319 A ClientInitKey object specifies a ciphersuite that the client 1320 supports, as well as providing a public key that others can use for 1321 key agreement. The client's identity key is intended to be stable 1322 throughout the lifetime of the group; there is no mechanism to change 1323 it. Init keys are intended to be used only once and SHOULD NOT be 1324 reused except in case of last resort. (See Section 13.4). Clients 1325 MAY generate and publish multiple ClientInitKey objects to support 1326 multiple ciphersuites. ClientInitKeys contain an identifier chosen 1327 by the client, which the client MUST ensure uniquely identifies a 1328 given ClientInitKey object among the set of ClientInitKeys created by 1329 this client. 1331 The value for init_key MUST be a public key for the asymmetric 1332 encryption scheme defined by cipher_suite. The whole structure is 1333 signed using the client's identity key. A ClientInitKey object with 1334 an invalid signature field MUST be considered malformed. The input 1335 to the signature computation comprises all of the fields except for 1336 the signature field. 1338 enum { 1339 mls10(0), 1340 (255) 1341 } ProtocolVersion; 1343 enum { 1344 invalid(0), 1345 supported_versions(1), 1346 supported_ciphersuites(2), 1347 expiration(3), 1348 (65535) 1349 } ExtensionType; 1351 struct { 1352 ExtensionType extension_type; 1353 opaque extension_data<0..2^16-1>; 1354 } Extension; 1356 struct { 1357 ProtocolVersion supported_version; 1358 opaque client_init_key_id<0..255>; 1359 CipherSuite cipher_suite; 1360 HPKEPublicKey init_key; 1361 Credential credential; 1362 Extension extensions<0..2^16-1>; 1363 opaque signature<0..2^16-1>; 1364 } ClientInitKey; 1366 ClientInitKey objects MUST contain at least two extensions, one of 1367 type "supported_versions" and one of type "supported_ciphersuites". 1368 These extensions allow MLS session establishment to be safe from 1369 downgrade attacks on these two parameters (as discussed in 1370 Section 9), while still only advertising one version / ciphersuite 1371 per ClientInitKey. 1373 7.1. Supported Versions and Supported Ciphersuites 1375 The "supported_versions" extension contains a list of MLS versions 1376 that are supported by the client. The "supported_ciphersuites" 1377 extension contains a list of MLS ciphersuites that are supported by 1378 the client. 1380 ProtocolVersion supported_versions<0..255>; 1381 CipherSuite supported_ciphersuites<0..255>; 1383 7.2. Expiration 1385 The "expiration" extension represents the time at which clients MUST 1386 consider this ClientInitKey invalid. This time is represented as an 1387 absolute time, measured in seconds since the Unix epoch 1388 (1970-01-01T00:00:00Z). If a client receives a ClientInitKey that 1389 contains an expiration extension at a time after its expiration time, 1390 then it MUST consider the ClientInitKey invalid and not use it for 1391 any further processing. 1393 uint64 expiration; 1395 Note that as an extension, it is not required that any given 1396 ClientInitKey have an expiration time. In particular, applications 1397 that rely on "last resort" ClientInitKeys to ensure continued 1398 reachability may choose to omit the expiration extension from these 1399 keys, or give them much longer lifetimes than other ClientInitKeys. 1401 8. Message Framing 1403 Handshake and application messages use a common framing structure. 1404 This framing provides encryption to ensure confidentiality within the 1405 group, as well as signing to authenticate the sender within the 1406 group. 1408 The two main structures involved are MLSPlaintext and MLSCiphertext. 1409 MLSCiphertext represents a signed and encrypted message, with 1410 protections for both the content of the message and related metadata. 1411 MLSPlaintext represents a message that is only signed, and not 1412 encrypted. Applications SHOULD use MLSCiphertext to encode both 1413 application and handshake messages, but MAY transmit handshake 1414 messages encoded as MLSPlaintext objects in cases where it is 1415 necessary for the delivery service to examine such messages. 1417 enum { 1418 invalid(0), 1419 application(1), 1420 proposal(2), 1421 commit(3), 1422 (255) 1423 } ContentType; 1425 struct { 1426 opaque group_id<0..255>; 1427 uint32 epoch; 1428 uint32 sender; 1429 ContentType content_type; 1430 opaque authenticated_data<0..2^32-1>; 1432 select (MLSPlaintext.content_type) { 1433 case application: 1434 opaque application_data<0..2^32-1>; 1436 case proposal: 1437 Proposal proposals<1..2^32-1>; 1439 case commit: 1440 Proposal proposals<1..2^32-1>; 1441 Commit commit; 1442 opaque confirmation<0..255>; 1443 } 1445 opaque signature<0..2^16-1>; 1446 } MLSPlaintext; 1448 struct { 1449 opaque group_id<0..255>; 1450 uint32 epoch; 1451 ContentType content_type; 1452 opaque authenticated_data<0..2^32-1>; 1453 opaque sender_data_nonce<0..255>; 1454 opaque encrypted_sender_data<0..255>; 1455 opaque ciphertext<0..2^32-1>; 1456 } MLSCiphertext; 1458 The remainder of this section describes how to compute the signature 1459 of an MLSPlaintext object and how to convert it to an MLSCiphertext 1460 object. The overall process is as follows: 1462 o Gather the required metadata: 1464 * Group ID 1465 * Epoch 1467 * Content Type 1469 * Nonce 1471 * Sender index 1473 * Key generation 1475 o Sign the plaintext metadata - the group ID, epoch, sender index, 1476 and content type - as well as the authenticated data and message 1477 content 1479 o Randomly generate sender_data_nonce and encrypt the sender 1480 information using it and the key derived from the 1481 sender_data_secret 1483 o Encrypt the content using a content encryption key identified by 1484 the metadata 1486 The group identifier, epoch, content_type and authenticated data 1487 fields are copied from the MLSPlaintext object directly. The content 1488 encryption process populates the ciphertext field of the 1489 MLSCiphertext object. The metadata encryption step populates the 1490 encrypted_sender_data field. 1492 Decryption follows the same step in reverse: Decrypt the metadata, 1493 then the message and verify the content signature. 1495 8.1. Metadata Encryption 1497 The "sender data" used to look up the key for the content encryption 1498 is encrypted under AEAD using the MLSCiphertext sender_data_nonce and 1499 the sender_data_key from the keyschedule. It is encoded as an object 1500 of the following form: 1502 struct { 1503 uint32 sender; 1504 uint32 generation; 1505 } MLSSenderData; 1507 The Additional Authenticated Data (AAD) for the SenderData ciphertext 1508 computation is its prefix in the MLSCiphertext, namely: 1510 struct { 1511 opaque group_id<0..255>; 1512 uint32 epoch; 1513 ContentType content_type; 1514 opaque authenticated_data<0..2^32-1>; 1515 opaque sender_data_nonce<0..255>; 1516 } MLSCiphertextSenderDataAAD; 1518 When parsing a SenderData struct as part of message decryption, the 1519 recipient MUST verify that the sender field represents an occupied 1520 leaf in the ratchet tree. In particular, the sender index value MUST 1521 be less than the number of leaves in the tree. 1523 8.2. Content Signing and Encryption 1525 The signature field in an MLSPlaintext object is computed using the 1526 signing private key corresponding to the credential at the leaf in 1527 the tree indicated by the sender field. The signature covers the 1528 plaintext metadata and message content, i.e., all fields of 1529 MLSPlaintext except for the "signature" field. The signature also 1530 covers the GroupContext for the current epoch, so that signatures are 1531 specific to a given group and epoch. 1533 struct { 1534 GroupContext context; 1536 opaque group_id<0..255>; 1537 uint32 epoch; 1538 uint32 sender; 1539 ContentType content_type; 1540 opaque authenticated_data<0..2^32-1>; 1542 select (MLSPlaintext.content_type) { 1543 case application: 1544 opaque application_data<0..2^32-1>; 1546 case proposal: 1547 Proposal proposals<1..2^32-1>; 1549 case commit: 1550 Proposal proposals<1..2^32-1>; 1551 Commit commit; 1552 opaque confirmation<0..255>; 1553 } 1554 } MLSPlaintextSignatureInput; 1556 The ciphertext field of the MLSCiphertext object is produced by 1557 supplying the inputs described below to the AEAD function specified 1558 by the ciphersuite in use. The plaintext input contains content and 1559 signature of the MLSPlaintext, plus optional padding. These values 1560 are encoded in the following form: 1562 struct { 1563 select (MLSCiphertext.content_type) { 1564 case handshake: 1565 GroupOperation operation; 1566 opaque confirmation<0..255>; 1568 case application: 1569 opaque application_data<0..2^32-1>; 1570 } 1572 opaque signature<0..2^16-1>; 1573 opaque padding<0..2^16-1>; 1574 } MLSCiphertextContent; 1576 The key and nonce used for the encryption of the message depend on 1577 the content type of the message. The sender chooses the handshake 1578 key for a handshake message or an ununsed generation from its (per- 1579 sender) application key chain for the current epoch, according to the 1580 type of message being encrypted. 1582 The Additional Authenticated Data (AAD) input to the encryption 1583 contains an object of the following form, with the values used to 1584 identify the key and nonce: 1586 struct { 1587 opaque group_id<0..255>; 1588 uint32 epoch; 1589 ContentType content_type; 1590 opaque authenticated_data<0..2^32-1>; 1591 opaque sender_data_nonce<0..255>; 1592 opaque encrypted_sender_data<0..255>; 1593 } MLSCiphertextContentAAD; 1595 The ciphertext field of the MLSCiphertext object is produced by 1596 supplying these inputs to the AEAD function specified by the 1597 ciphersuite in use. 1599 9. Group Creation 1601 A group is always created with a single member, the "creator". The 1602 other members are added when the creator effectively sends itself an 1603 Add proposal and commits it, then sends the corresponding Welcome 1604 message to the new participants. These processes are described in 1605 detail in Section 10.1.1, Section 10.2, and Section 10.2.1. 1607 The creator of a group MUST take the following steps to initialize 1608 the group: 1610 o Fetch ClientInitKeys for the members to be added, and selects a 1611 version and ciphersuite according to the capabilities of the 1612 members. To protect against downgrade attacks, the creator MUST 1613 use the "supported_versions" and "supported_ciphersuites" fields 1614 in these ClientInitKeys to verify that the chosen version and 1615 ciphersuite is the best option supported by all members. 1617 o Initialize a one-member group with the following initial values 1618 (where "0" represents an all-zero vector of size Hash.length): 1620 * Ratchet tree: A tree with a single node, a leaf containing an 1621 HPKE public key and credential for the creator 1623 * Group ID: A value set by the creator 1625 * Epoch: 0x00000000 1627 * Tree hash: The root hash of the above ratchet tree 1629 * Confirmed transcript hash: 0 1631 * Interim transcript hash: 0 1633 * Init secret: 0 1635 o For each member, construct an Add proposal from the ClientInitKey 1636 for that member (see Section 10.1.1) 1638 o Construct a Commit message that commits all of the Add proposals, 1639 in any order chosen by the creator (see Section 10.2) 1641 o Process the Commit message to obtain a new group state (for the 1642 epoch in which the new members are added) and a Welcome message 1644 o Transmit the Welcome message to the other new members 1646 The recipient of a Welcome message processes it as described in 1647 Section 10.2.1. 1649 In principle, the above process could be streamlined by having the 1650 creator directly create a tree and choose a random value for first 1651 epoch's epoch secret. We follow the steps above because it removes 1652 unnecessary choices, by which, for example, bad randomness could be 1653 introduced. The only choices the creator makes here are its own HPKE 1654 key and credential, the leaf secret from which the Commit is built, 1655 and the intermediate key pairs along the direct path to the root. 1657 A new member receiving a Welcome message can recognize group creation 1658 if the number of entries in the "members" array is equal to the 1659 number of leaves in the tree minus one. A client receiving a Welcome 1660 message SHOULD verify whether it is a newly created group, and if so, 1661 SHOULD verify that the above process was followed by reconstructing 1662 the Add and Commit messages and verifying that the resulting 1663 transcript hashes and epoch secret match those found in the Welcome 1664 message. 1666 10. Group Evolution 1668 Over the lifetime of a group, its membership can change, and existing 1669 members might want to change their keys in order to achieve post- 1670 compromise security. In MLS, each such change is accomplished by a 1671 two-step process: 1673 1. A proposal to make the change is broadcast to the group in a 1674 Proposal message 1676 2. A member of the group broadcasts a Commit message that causes one 1677 or more proposed changes to enter into effect 1679 The group thus evolves from one cryptographic state to another each 1680 time a Commit message is sent and processed. These states are 1681 referred to as "epochs" and are uniquely identified among states of 1682 the group by four-octet epoch values. When a new group is 1683 initialized, its initial state epoch 0x00000000. Each time a state 1684 transition occurs, the epoch number is incremented by one. 1686 [[ OPEN ISSUE: It would be better to have non-linear epochs, in order 1687 to tolerate forks in the history. ]] 1689 10.1. Proposals 1691 Proposals are included in an MLSPlaintext by way of a Proposal 1692 structure that indicates their type: 1694 enum { 1695 invalid(0), 1696 add(1), 1697 update(2), 1698 remove(3), 1699 (255) 1700 } ProposalType; 1702 struct { 1703 ProposalType msg_type; 1704 select (Proposal.msg_type) { 1705 case add: Add; 1706 case update: Update; 1707 case remove: Remove; 1708 }; 1709 } Proposal; 1711 On receiving an MLSPlaintext containing a Proposal, a client MUST 1712 verify the signature on the enclosing MLSPlaintext. If the signature 1713 verifies successfully, then the Proposal should be cached in such a 1714 way that it can be retrieved using a ProposalID in a later Commit 1715 message. 1717 10.1.1. Add 1719 An Add proposal requests that a client with a specified ClientInitKey 1720 be added to the group. 1722 struct { 1723 ClientInitKey init_key; 1724 } Add; 1726 The proposer of the Add does not control where in the group's ratchet 1727 tree the new member is added. Instead, the sender of the Commit 1728 message chooses a location for each added member and states it in the 1729 Commit message. 1731 An Add is applied after being included in a Commit message. The 1732 position of the Add in the list of adds determines the leaf index 1733 "index" where the new member will be added. For the first Add in the 1734 Commit, "index" is the leftmost empty leaf in the tree, for the 1735 second Add, the next empty leaf to the right, etc. 1737 o If necessary, extend the tree to the right until it has at least 1738 index + 1 leaves 1740 o For each intermediate node along the path from the leaf at 1741 position "index" to the root, add "index" to the "unmerged_leaves" 1742 list for the node. 1744 o Blank the path from the leaf at position "index" to the root 1746 o Set the leaf node in the tree at position "index" to a new node 1747 containing the public key from the ClientInitKey in the Add, as 1748 well as the credential under which the ClientInitKey was signed 1750 10.1.2. Update 1752 An Update proposal requests that the sender's leaf node in the tree 1753 be updated with a new HPKE public key. 1755 struct { 1756 HPKEPublicKey leaf_key; 1757 } Update; 1759 A member of the group applies an Update message by taking the 1760 following steps: 1762 o Update the sender's leaf node by replacing the HPKE public key 1763 with the public key in the Update proposal 1765 o Blank the intermediate nodes along the path from the sender's leaf 1766 to the root 1768 10.1.3. Remove 1770 A Remove proposal requests that the client at a specified index in 1771 the tree be removed from the group. 1773 struct { 1774 uint32 removed; 1775 } Remove; 1777 A member of the group applies a Remove message by taking the 1778 following steps: 1780 o Replace the leaf node at position "removed" with a blank node 1782 o Blank the intermediate nodes along the path from the removed leaf 1783 to the root 1785 10.1.4. External Proposals 1787 Add and Remove proposals can be constructed and sent to the group by 1788 a party that is outside the group. For example, a Delivery Service 1789 might propose to remove a member of a group has been inactive for a 1790 long time, or propose adding a newly-hired staff member to a group 1791 representing a real-world team. Proposals originating outside the 1792 group are identified by having a "sender" value in the range 1793 0xFFFFFF00 - 0xFFFFFFFF. 1795 The specific value 0xFFFFFFFF is reserved for clients proposing that 1796 they themselves be added. Proposals with types other than Add MUST 1797 NOT be sent with this sender index. In such cases, the MLSPlaintext 1798 MUST be signed with the private key corresponding to the 1799 ClientInitKey in the Add message. Recipients MUST verify that the 1800 MLSPlaintext carrying the Proposal message is validly signed with 1801 this key. 1803 The remaining values 0xFFFFFF00 - 0xFFFFFFFE are reserved for signer 1804 that are pre-provisioned to the clients within a group. If proposals 1805 with these sender IDs are to be accepted within a group, the members 1806 of the group MUST be provisioned by the application with a mapping 1807 between sender indices in this range and authorized signing keys. To 1808 ensure consistent handling of external proposals, the application 1809 MUST ensure that the members of a group have the same mapping and 1810 apply the same policies to external proposals. 1812 An external proposal MUST be sent as an MLSPlaintext object, since 1813 the sender will not have the keys necessary to construct an 1814 MLSCiphertext object. 1816 [[ TODO: Should recognized external signers be added to some object 1817 that the group explicitly agrees on, e.g., as an extension to the 1818 GroupContext? ]] 1820 10.2. Commit 1822 A Commit message initiates a new epoch for the group, based on a 1823 collection of Proposals. It instructs group members to update their 1824 representation of the state of the group by applying the proposals 1825 and advancing the key schedule. 1827 A group member that has observed one or more Proposal messages within 1828 an epoch MUST send a Commit message before sending application data. 1829 This ensures, for example, that any members whose removal was 1830 proposed during the epoch are actually removed before any application 1831 information is transmitted. 1833 The sender of a Commit message MUST include in it all valid Proposals 1834 that the sender has received during the current epoch. Invalid 1835 Proposals include, for example, Proposals with an invalid signature 1836 or Proposals that are semantically inconsistent, such as a Remove 1837 proposal for an unoccupied leaf. The Commit MUST NOT combine 1838 Proposals sent within different epochs. Despite these requirements, 1839 it is still possible for a valid Proposal not to be covered by a 1840 Commit, e.g., because the sender of the Commit did not receive the 1841 Proposal. In such cases, the sender of the proposal can retransmit 1842 the Proposal in the new epoch. 1844 Each proposal covered by the Commit is identified by a ProposalID 1845 structure. The "sender" field in this structure indicates the member 1846 of the group that sent the proposal (according to their index in the 1847 ratchet tree). The "hash" field contains the hash of the 1848 MLSPlaintext in which the Proposal was sent, using the hash function 1849 for the group's ciphersuite. 1851 struct { 1852 uint32 sender; 1853 opaque hash<0..255>; 1854 } ProposalID; 1856 struct { 1857 ProposalID updates<0..2^16-1>; 1858 ProposalID removes<0..2^16-1>; 1859 ProposalID adds<0..2^16-1>; 1860 ProposalID ignored<0..2^16-1>; 1861 DirectPath path; 1862 } Commit; 1864 The sender of a Commit message MUST include in it all proposals that 1865 it has received during the current epoch. Proposals that recipients 1866 should implement are placed in the "updates", "removes", and "adds" 1867 vector, according to their type. Proposals that should not be 1868 implemented are placed in the "ignored" vector. For example, if two 1869 Update proposals are issued for the same leaf, then one of them 1870 (presumably the earlier one) should be ignored and the other 1871 (presumably the later) should be added to the "updates" vector. 1873 [[ OPEN ISSUE: This structure loses the welcome_info_hash, because 1874 new participants are no longer expected to have access to the Commit 1875 message adding them to the group. It might be we need to re- 1876 introduce this assumption, though it seems like the information 1877 confirmed by the welcome_info_hash is confirmed at the next epoch 1878 change anyway. ]] 1879 A member of the group applies a Commit message by taking the 1880 following steps: 1882 1. Verify that the "epoch" field of the enclosing MLSPlaintext 1883 message is equal to the "epoch" field of the current GroupContext 1884 object 1886 2. Verify that the signature on the MLSPlaintext message verifies 1887 using the public key from the credential stored at the leaf in 1888 the tree indicated by the "sender" field. 1890 3. Generate a provisional GroupContext object by applying the 1891 proposals referenced in the commit object in the order provided, 1892 as described in Section 10.1. Add proposals are applied left to 1893 right: Each Add proposal is applied at the leftmost unoccupied 1894 leaf, or appended to the right edge of the tree if all leaves are 1895 occupied. 1897 4. Process the "path" value to update the ratchet tree referenced by 1898 the provisional GroupContext and generate the update secret: 1900 * Update the ratchet tree by replacing nodes in the direct path 1901 of the sender with the corresponding nodes in the path (see 1902 Section 6.5). 1904 * The update secret is the value "path_secret[n+1]" derived from 1905 the "path_secret[n]" value associated to the root node. 1907 5. Use the update secret, the provisional GroupContext, and the init 1908 secret from the previous epoch to compute the epoch secret and 1909 derived secrets for the new epoch. 1911 6. Use the "confirmation_key" for the new epoch to compute the 1912 confirmation MAC for this message, as described below, and verify 1913 that it is the same as the "confirmation" field in the 1914 MLSPlaintext object. 1916 7. If the above checks are successful, consider the updated 1917 GroupContext object as the current state of the group. 1919 The confirmation value confirms that the members of the group have 1920 arrived at the same state of the group: 1922 MLSPlaintext.confirmation = 1923 HMAC(confirmation_key, GroupContext.confirmed_transcript_hash) 1925 HMAC [RFC2104] uses the Hash algorithm for the ciphersuite in use. 1927 [[ OPEN ISSUE: It is not possible for the recipient of a handshake 1928 message to verify that ratchet tree information in the message is 1929 accurate, because each node can only compute the secret and private 1930 key for nodes in its direct path. This creates the possibility that 1931 a malicious participant could cause a denial of service by sending a 1932 handshake message with invalid values for public keys in the ratchet 1933 tree. ]] 1935 10.2.1. Welcoming New Members 1937 The sender of a Commit message is responsible for sending a Welcome 1938 message to any new members added via Add proposals. The Welcome 1939 message provides the new members with the current state of the group, 1940 after the application of the Commit message. The new members will 1941 not be able to decrypt or verify the Commit message, but will have 1942 the secrets they need to participate in the epoch initiated by the 1943 Commit message. 1945 In order to allow the same Welcome message to be sent to all new 1946 members, information describing the group is encrypted with a 1947 symmetric key and nonce randomly chosen by the sender. This key and 1948 nonce are then encrypted to each new member using HPKE. In the same 1949 encrypted package, the committer transmits the path secret for the 1950 lowest node contained in the direct paths of both the committer and 1951 the new member. This allows the new member to compute private keys 1952 for nodes in its direct path that are being reset by the 1953 corresponding Commit. 1955 struct { 1956 HPKEPublicKey public_key; 1957 uint32_t unmerged_leaves<0..2^32-1>; 1958 optional credential; 1959 } RatchetNode; 1961 struct { 1962 // GroupContext inputs 1963 opaque group_id<0..255>; 1964 uint32 epoch; 1965 optional tree<1..2^32-1>; 1966 opaque confirmed_transcript_hash<0..255>; 1968 // Inputs to the next round of the key schedule 1969 opaque interim_transcript_hash<0..255>; 1970 opaque epoch_secret<0..255>; 1972 uint32 signer_index; 1973 opaque signature<0..255>; 1974 } GroupInfo; 1976 struct { 1977 opaque group_info_key<1..255>; 1978 opaque group_info_nonce<1..255>; 1979 opaque path_secret<1..255>; 1980 } KeyPackage; 1982 struct { 1983 opaque client_init_key_hash<1..255>; 1984 HPKECiphertext encrypted_key_package; 1985 } EncryptedKeyPackage; 1987 struct { 1988 ProtocolVersion version = mls10; 1989 CipherSuite cipher_suite; 1990 EncryptedKeyPackage key_packages<1..V>; 1991 opaque encrypted_group_info; 1992 } Welcome; 1994 In the description of the tree as a list of nodes, the "credential" 1995 field for a node MUST be populated if and only if that node is a leaf 1996 in the tree (i.e., a node with an even index). 1998 On receiving a Welcome message, a client processes it using the 1999 following steps: 2001 o Identify an entry in the "key_packages" array where the 2002 "client_init_key_hash" value corresponds to one of this client's 2003 ClientInitKeys, using the hash indicated by the "cipher_suite" 2004 field. If no such field exists, or if the ciphersuite indicated 2005 in the ClientInitKey does not match the one in the Welcome 2006 message, return an error. 2008 o Decrypt the "encrypted_key_package" using HPKE with the algorithms 2009 indicated by the ciphersuite and the HPKE public key in the 2010 ClientInitKey. 2012 o Decrypt the "encrypted_group_info" field using the key and nonce 2013 in the decrypted KeyPackage object. 2015 o Verify the signature on the GroupInfo object. The signature input 2016 comprises all of the fields in the GroupInfo object except the 2017 signature field. The public key and algorithm are taken from the 2018 credential in the leaf node at position "signer_index". If this 2019 verification fails, return an error. 2021 o Identify a leaf in the "tree" array (i.e., an even-numbered node) 2022 whose "public_key" and "credential" fields are identical to the 2023 corresponding fields in the ClientInitKey. If no such field 2024 exists, return an error. Let "index" represent the index of this 2025 node among the leaves in the tree, namely the index of the node in 2026 the "tree" array divided by two. 2028 o Construct a new group state using the information in the GroupInfo 2029 object. The new member's position in the tree is "index", as 2030 defined above. 2032 o Identify the lowest node at which the direct paths from "index" 2033 and "signer_index" overlap. Set private keys for that node and 2034 its parents up to the root of the tree, using the "path_secret" 2035 from the KeyPackage and following the algorithm in Section 5.4 to 2036 move up the tree. 2038 11. Sequencing of State Changes 2040 [[ OPEN ISSUE: This section has an initial set of considerations 2041 regarding sequencing. It would be good to have some more detailed 2042 discussion, and hopefully have a mechanism to deal with this issue. 2043 ]] 2045 Each handshake message is premised on a given starting state, 2046 indicated in its "prior_epoch" field. If the changes implied by a 2047 handshake messages are made starting from a different state, the 2048 results will be incorrect. 2050 This need for sequencing is not a problem as long as each time a 2051 group member sends a handshake message, it is based on the most 2052 current state of the group. In practice, however, there is a risk 2053 that two members will generate handshake messages simultaneously, 2054 based on the same state. 2056 When this happens, there is a need for the members of the group to 2057 deconflict the simultaneous handshake messages. There are two 2058 general approaches: 2060 o Have the delivery service enforce a total order 2062 o Have a signal in the message that clients can use to break ties 2064 As long as handshake messages cannot be merged, there is a risk of 2065 starvation. In a sufficiently busy group, a given member may never 2066 be able to send a handshake message, because he always loses to other 2067 members. The degree to which this is a practical problem will depend 2068 on the dynamics of the application. 2070 It might be possible, because of the non-contributivity of 2071 intermediate nodes, that update messages could be applied one after 2072 the other without the Delivery Service having to reject any handshake 2073 message, which would make MLS more resilient regarding the 2074 concurrency of handshake messages. The Messaging system can decide 2075 to choose the order for applying the state changes. Note that there 2076 are certain cases (if no total ordering is applied by the Delivery 2077 Service) where the ordering is important for security, ie. all 2078 updates must be executed before removes. 2080 Regardless of how messages are kept in sequence, implementations MUST 2081 only update their cryptographic state when valid handshake messages 2082 are received. Generation of handshake messages MUST be stateless, 2083 since the endpoint cannot know at that time whether the change 2084 implied by the handshake message will succeed or not. 2086 11.1. Server-Enforced Ordering 2088 With this approach, the delivery service ensures that incoming 2089 messages are added to an ordered queue and outgoing messages are 2090 dispatched in the same order. The server is trusted to resolve 2091 conflicts during race-conditions (when two members send a message at 2092 the same time), as the server doesn't have any additional knowledge 2093 thanks to the confidentiality of the messages. 2095 Messages should have a counter field sent in clear-text that can be 2096 checked by the server and used for tie-breaking. The counter starts 2097 at 0 and is incremented for every new incoming message. If two group 2098 members send a message with the same counter, the first message to 2099 arrive will be accepted by the server and the second one will be 2100 rejected. The rejected message needs to be sent again with the 2101 correct counter number. 2103 To prevent counter manipulation by the server, the counter's 2104 integrity can be ensured by including the counter in a signed message 2105 envelope. 2107 This applies to all messages, not only state changing messages. 2109 11.2. Client-Enforced Ordering 2111 Order enforcement can be implemented on the client as well, one way 2112 to achieve it is to use a two step update protocol: the first client 2113 sends a proposal to update and the proposal is accepted when it gets 2114 50%+ approval from the rest of the group, then it sends the approved 2115 update. Clients which didn't get their proposal accepted, will wait 2116 for the winner to send their update before retrying new proposals. 2118 While this seems safer as it doesn't rely on the server, it is more 2119 complex and harder to implement. It also could cause starvation for 2120 some clients if they keep failing to get their proposal accepted. 2122 12. Application Messages 2124 The primary purpose of the Handshake protocol is to provide an 2125 authenticated group key exchange to clients. In order to protect 2126 Application messages sent among the members of a group, the 2127 Application secret provided by the Handshake key schedule is used to 2128 derive nonces and encryption keys for the Message Protection Layer 2129 according to the Application Key Schedule. That is, each epoch is 2130 equipped with a fresh Application Key Schedule which consist of a 2131 tree of Application Secrets as well as one symmetric ratchet per 2132 group member. 2134 Each client maintains their own local copy of the Application Key 2135 Schedule for each epoch during which they are a group member. They 2136 derive new keys, nonces and secrets as needed while deleting old ones 2137 as soon as they have been used. 2139 Application messages MUST be protected with the Authenticated- 2140 Encryption with Associated-Data (AEAD) encryption scheme associated 2141 with the MLS ciphersuite using the common framing mechanism. Note 2142 that "Authenticated" in this context does not mean messages are known 2143 to be sent by a specific client but only from a legitimate member of 2144 the group. To authenticate a message from a particular member, 2145 signatures are required. Handshake messages MUST use asymmetric 2146 signatures to strongly authenticate the sender of a message. 2148 12.1. Tree of Application Secrets 2150 The application key schedule begins with the application secrets 2151 which are arranged in an "Application Secret Tree" or AS Tree for 2152 short; a left balanced binary tree with the same set of nodes and 2153 edges as the epoch's ratchet tree. Each leaf in the AS Tree is 2154 associated with the same group member as the corresponding leaf in 2155 the ratchet tree. Nodes are also assigned an index according to 2156 their position in the array representation of the tree (described in 2157 Appendix A). If N is a node index in the AS Tree then left(N) and 2158 right(N) denote the children of N (if they exist). 2160 Each node in the tree is assigned a secret. The root's secret is 2161 simply the application_secret of that epoch. (See Section 6.6 for 2162 the definition of application_secret.) 2164 astree_node_[root]_secret = application_secret 2166 The secret of any other node in the tree is derived from its parent's 2167 secret using a call to Derive-App-Secret. 2169 Derive-App-Secret(Secret, Label, Node, Generation, Length) = 2170 HKDF-Expand-Label(Secret, Label, ApplicationContext, Length) 2172 Where ApplicationContext is specified as: 2174 struct { 2175 uint32 node = Node; 2176 uint32 generation = Generation; 2177 } ApplicationContext; 2179 If N is a node index in the AS Tree then the secrets of the children 2180 of N are defined to be: 2182 astree_node_[N]_secret 2183 | 2184 | 2185 +--> Derive-App-Secret(., "tree", left(N), 0, Hash.length) 2186 | = astree_node_[left(N)]_secret 2187 | 2188 +--> Derive-App-Secret(., "tree", right(N), 0, Hash.length) 2189 = astree_node_[right(N)]_secret 2191 Note that fixing concrete values for GroupContext_[n] and 2192 application_secret completely defines all secrets in the AS Tree. 2194 12.2. Sender Ratchets 2196 The secret of a leaf in the AS Tree is used to initiate a symmetric 2197 hash ratchet which generates a sequence of keys and nonces. The 2198 group member assigned to that leaf uses the j-th key/nonce pair in 2199 the sequence to encrypt (using the AEAD) the j-th message they send 2200 during that epoch. In particular, each key/nonce pair MUST NOT be 2201 used to encrypt more than one message. 2203 More precisely, the initial secret of the ratchet for the group 2204 member assigned to the leaf with node index N is simply the secret of 2205 that leaf. 2207 application_[N]_[0]_secret = astree_node_[N]_secret 2209 Keys, nonces and secrets of ratchets are derived using Derive-App- 2210 Secret. The context in a given call consists of the index of the 2211 sender's leaf in the ratchet tree and the current position in the 2212 ratchet. In particular, the index of the sender's leaf in the 2213 ratchet tree is the same as the index of the leaf in the AS Tree used 2214 to initialize the sender's ratchet. 2216 application_[N]_[j]_secret 2217 | 2218 +--> Derive-App-Secret(., "app-nonce", N, j, AEAD.nonce_length) 2219 | = application_[N]_[j]_nonce 2220 | 2221 +--> Derive-App-Secret(., "app-key", N, j, AEAD.key_length) 2222 | = application_[N]_[j]_key 2223 | 2224 V 2225 Derive-App-Secret(., "app-secret", N, j, Hash.length) 2226 = application_[N]_[j+1]_secret 2228 Here, AEAD.nonce_length and AEAD.key_length denote the lengths in 2229 bytes of the nonce and key for the AEAD scheme defined by the 2230 ciphersuite. 2232 12.3. Deletion Schedule 2234 It is important to delete all security sensitive values as soon as 2235 they are _consumed_. A sensitive value S is said to be _consumed_ if 2237 o S was used to encrypt or (successfully) decrypt a message, or if 2239 o a key, nonce, or secret derived from S has been consumed. (This 2240 goes for values derived via Derive-Secret as well as HKDF-Expand- 2241 Label.) 2243 Here, S may be the "init_secret", "update_secret", "epoch_secret", 2244 "application_secret" as well as any secret in the AS Tree or one of 2245 the ratchets. 2247 As soon as a group member consumes a value they MUST immediately 2248 delete (all representations of) that value. This is crucial to 2249 ensuring Forward Secrecy for past messages. Members MAY keep 2250 unconsumed values around for some reasonable amount of time even if 2251 their generating secret was already consumed (e.g. due to out of 2252 order message delivery). 2254 For example, suppose a group member encrypts or (successfully) 2255 decrypts a message using the j-th key and nonce in the i-th ratchet. 2256 Then, for that member, at least the following values have been 2257 consumed and MUST be deleted: 2259 o the "init_secret", "update_secret", "epoch_secret", 2260 "application_secret" of that epoch, 2262 o all node secrets in the AS Tree on the path from the root to the 2263 leaf with index i, 2265 o the first j secrets in the i-th ratchet and 2267 o "application_[i]_[j]_key" and "application_[i]_[j]_nonce". 2269 Concretely, suppose we have the following AS Tree and ratchet for 2270 participant D: 2272 G 2273 / \ 2274 / \ 2275 E F 2276 / \ / \ 2277 A0 B0 C0 D0 -+- KD0 2278 | | 2279 | +- ND0 2280 | 2281 D1 -+- KD1 2282 | | 2283 | +- ND1 2284 | 2285 D2 2287 Then if a client uses key KD1 and nonce ND1 during epoch n then it 2288 must consume (at least) values G, F, D0, D1, KD1, ND1 as well as the 2289 update_secret and init_secret used to derive G (i.e. the 2290 application_secret). The client MAY retain (i.e., not consume) the 2291 values KD0 and ND0 to allow for out-of-order delivery, and SHOULD 2292 retain D2 to allow for processing future messages. 2294 12.4. Further Restrictions 2296 During each epoch senders MUST NOT encrypt more data than permitted 2297 by the security bounds of the AEAD scheme used. 2299 Note that each change to the Group through a Handshake message will 2300 also set a new application_secret. Hence this change MUST be applied 2301 before encrypting any new Application message. This is required both 2302 to ensure that any users removed from the group can no longer receive 2303 messages and to (potentially) recover confidentiality and 2304 authenticity for future messages despite a past state compromise. 2306 [[ OPEN ISSUE: At the moment there is no contributivity of 2307 Application secrets chained from the initial one to the next 2308 generation of Epoch secret. While this seems safe because 2309 cryptographic operations using the application secrets can't affect 2310 the group init_secret, it remains to be proven correct. ]] 2312 12.5. Message Encryption and Decryption 2314 The group members MUST use the AEAD algorithm associated with the 2315 negotiated MLS ciphersuite to AEAD encrypt and decrypt their 2316 Application messages according to the Message Framing section. 2318 The group identifier and epoch allow a recipient to know which group 2319 secrets should be used and from which Epoch secret to start computing 2320 other secrets and keys. The sender identifier is used to identify 2321 the member's symmetric ratchet from the initial group Application 2322 secret. The application generation field is used to determine how 2323 far into the ratchet to iterate in order to reproduce the required 2324 AEAD keys and nonce for performing decryption. 2326 Application messages SHOULD be padded to provide some resistance 2327 against traffic analysis techniques over encrypted traffic. [CLINIC] 2328 [HCJ16] While MLS might deliver the same payload less frequently 2329 across a lot of ciphertexts than traditional web servers, it might 2330 still provide the attacker enough information to mount an attack. If 2331 Alice asks Bob: "When are we going to the movie ?" the answer 2332 "Wednesday" might be leaked to an adversary by the ciphertext length. 2333 An attacker expecting Alice to answer Bob with a day of the week 2334 might find out the plaintext by correlation between the question and 2335 the length. 2337 Similarly to TLS 1.3, if padding is used, the MLS messages MUST be 2338 padded with zero-valued bytes before AEAD encryption. Upon AEAD 2339 decryption, the length field of the plaintext is used to compute the 2340 number of bytes to be removed from the plaintext to get the correct 2341 data. As the padding mechanism is used to improve protection against 2342 traffic analysis, removal of the padding SHOULD be implemented in a 2343 "constant-time" manner at the MLS layer and above layers to prevent 2344 timing side-channels that would provide attackers with information on 2345 the size of the plaintext. The padding length length_of_padding can 2346 be chosen at the time of the message encryption by the sender. 2347 Recipients can calculate the padding size from knowing the total size 2348 of the ApplicationPlaintext and the length of the content. 2350 [[ TODO: A preliminary formal security analysis has yet to be 2351 performed on this authentication scheme.]] 2353 [[ OPEN ISSUE: Currently, the group identifier, epoch and generation 2354 are contained as meta-data of the Signature. A different solution 2355 could be to include the GroupContext instead, if more information is 2356 required to achieve the security goals regarding cross-group attacks. 2357 ]] 2359 [[ OPEN ISSUE: Should the padding be required for handshake messages 2360 ? Can an adversary get more than the position of a participant in the 2361 tree without padding ? Should the base ciphertext block length be 2362 negotiated or is is reasonable to allow to leak a range for the 2363 length of the plaintext by allowing to send a variable number of 2364 ciphertext blocks ? ]] 2366 12.6. Delayed and Reordered Application messages 2368 Since each Application message contains the group identifier, the 2369 epoch and a message counter, a client can receive messages out of 2370 order. If they are able to retrieve or recompute the correct AEAD 2371 decryption key from currently stored cryptographic material clients 2372 can decrypt these messages. 2374 For usability, MLS clients might be required to keep the AEAD key and 2375 nonce for a certain amount of time to retain the ability to decrypt 2376 delayed or out of order messages, possibly still in transit while a 2377 decryption is being done. 2379 [[TODO: Describe here or in the Architecture spec the details. 2380 Depending on which Secret or key is kept alive, the security 2381 guarantees will vary.]] 2383 13. Security Considerations 2385 The security goals of MLS are described in [I-D.ietf-mls- 2386 architecture]. We describe here how the protocol achieves its goals 2387 at a high level, though a complete security analysis is outside of 2388 the scope of this document. 2390 13.1. Confidentiality of the Group Secrets 2392 Group secrets are derived from (i) previous group secrets, and (ii) 2393 the root key of a ratcheting tree. Only group members know their 2394 leaf private key in the group, therefore, the root key of the group's 2395 ratcheting tree is secret and thus so are all values derived from it. 2397 Initial leaf keys are known only by their owner and the group 2398 creator, because they are derived from an authenticated key exchange 2399 protocol. Subsequent leaf keys are known only by their owner. 2400 [[TODO: or by someone who replaced them.]] 2402 Note that the long-term identity keys used by the protocol MUST be 2403 distributed by an "honest" authentication service for clients to 2404 authenticate their legitimate peers. 2406 13.2. Authentication 2408 There are two forms of authentication we consider. The first form 2409 considers authentication with respect to the group. That is, the 2410 group members can verify that a message originated from one of the 2411 members of the group. This is implicitly guaranteed by the secrecy 2412 of the shared key derived from the ratcheting trees: if all members 2413 of the group are honest, then the shared group key is only known to 2414 the group members. By using AEAD or appropriate MAC with this shared 2415 key, we can guarantee that a member in the group (who knows the 2416 shared secret key) has sent a message. 2418 The second form considers authentication with respect to the sender, 2419 meaning the group members can verify that a message originated from a 2420 particular member of the group. This property is provided by digital 2421 signatures on the messages under identity keys. 2423 [[ OPEN ISSUE: Signatures under the identity keys, while simple, have 2424 the side-effect of preclude deniability. We may wish to allow other 2425 options, such as (ii) a key chained off of the identity key, or (iii) 2426 some other key obtained through a different manner, such as a 2427 pairwise channel that provides deniability for the message 2428 contents.]] 2430 13.3. Forward and post-compromise security 2432 Message encryption keys are derived via a hash ratchet, which 2433 provides a form of forward secrecy: learning a message key does not 2434 reveal previous message or root keys. Post-compromise security is 2435 provided by Update operations, in which a new root key is generated 2436 from the latest ratcheting tree. If the adversary cannot derive the 2437 updated root key after an Update operation, it cannot compute any 2438 derived secrets. 2440 In the case where the client could have been compromised (device 2441 loss...), the client SHOULD signal the delivery service to expire all 2442 the previous ClientInitKeys and publish fresh ones for PCS. 2444 13.4. Init Key Reuse 2446 Initialization keys are intended to be used only once and then 2447 deleted. Reuse of init keys can lead to replay attacks. 2449 14. IANA Considerations 2451 This document requests the creation of the following new IANA 2452 registries: 2454 o MLS Ciphersuites 2456 All of these registries should be under a heading of "Message Layer 2457 Security", and administered under a Specification Required policy 2458 [RFC8126]. 2460 14.1. MLS Ciphersuites 2462 The "MLS Ciphersuites" registry lists identifiers for suites of 2463 cryptographic algorithms defined for use with MLS. These are two- 2464 byte values, so the maximum possible value is 0xFFFF = 65535. Values 2465 in the range 0xF000 - 0xFFFF are reserved for vendor-internal usage. 2467 Template: 2469 o Value: The two-byte identifier for the ciphersuite 2471 o Name: The name of the ciphersuite 2473 o Reference: Where this algorithm is defined 2475 The initial contents for this registry are as follows: 2477 +--------+-------------------------+-----------+ 2478 | Value | Name | Reference | 2479 +--------+-------------------------+-----------+ 2480 | 0x0000 | P256_SHA256_AES128GCM | RFC XXXX | 2481 | | | | 2482 | 0x0001 | X25519_SHA256_AES128GCM | RFC XXXX | 2483 +--------+-------------------------+-----------+ 2485 [[ Note to RFC Editor: Please replace "XXXX" above with the number 2486 assigned to this RFC. ]] 2488 15. Contributors 2490 o Joel Alwen 2491 Wickr 2492 joel.alwen@wickr.com 2494 o Karthikeyan Bhargavan 2495 INRIA 2496 karthikeyan.bhargavan@inria.fr 2498 o Cas Cremers 2499 University of Oxford 2500 cas.cremers@cs.ox.ac.uk 2502 o Alan Duric 2503 Wire 2504 alan@wire.com 2506 o Srinivas Inguva 2507 Twitter 2508 singuva@twitter.com 2510 o Albert Kwon 2511 MIT 2512 kwonal@mit.edu 2514 o Eric Rescorla 2515 Mozilla 2516 ekr@rtfm.com 2518 o Michael Rosenberg 2519 Trail of Bits 2520 michael.rosenberg@trailofbits.com 2522 o Thyla van der Merwe 2523 Royal Holloway, University of London 2524 thyla.van.der@merwe.tech 2526 16. References 2528 16.1. Normative References 2530 [I-D.irtf-cfrg-hpke] 2531 Barnes, R. and K. Bhargavan, "Hybrid Public Key 2532 Encryption", draft-irtf-cfrg-hpke-00 (work in progress), 2533 July 2019. 2535 [IEEE1363] 2536 "IEEE Standard Specifications for Password-Based Public- 2537 Key Cryptographic Techniques", IEEE standard, 2538 DOI 10.1109/ieeestd.2009.4773330, n.d.. 2540 [RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed- 2541 Hashing for Message Authentication", RFC 2104, 2542 DOI 10.17487/RFC2104, February 1997, 2543 . 2545 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 2546 Requirement Levels", BCP 14, RFC 2119, 2547 DOI 10.17487/RFC2119, March 1997, 2548 . 2550 [RFC5116] McGrew, D., "An Interface and Algorithms for Authenticated 2551 Encryption", RFC 5116, DOI 10.17487/RFC5116, January 2008, 2552 . 2554 [RFC5869] Krawczyk, H. and P. Eronen, "HMAC-based Extract-and-Expand 2555 Key Derivation Function (HKDF)", RFC 5869, 2556 DOI 10.17487/RFC5869, May 2010, 2557 . 2559 [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for 2560 Writing an IANA Considerations Section in RFCs", BCP 26, 2561 RFC 8126, DOI 10.17487/RFC8126, June 2017, 2562 . 2564 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2565 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 2566 May 2017, . 2568 [RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol 2569 Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, 2570 . 2572 [X962] ANSI, "Public Key Cryptography For The Financial Services 2573 Industry: The Elliptic Curve Digital Signature Algorithm 2574 (ECDSA)", ANSI X9.62, 1998. 2576 16.2. Informative References 2578 [art] Cohn-Gordon, K., Cremers, C., Garratt, L., Millican, J., 2579 and K. Milner, "On Ends-to-Ends Encryption: Asynchronous 2580 Group Messaging with Strong Security Guarantees", January 2581 2018, . 2583 [CLINIC] Miller, B., Huang, L., Joseph, A., and J. Tygar, "I Know 2584 Why You Went to the Clinic: Risks and Realization of HTTPS 2585 Traffic Analysis", Privacy Enhancing Technologies pp. 2586 143-163, DOI 10.1007/978-3-319-08506-7_8, 2014. 2588 [dhreuse] Menezes, A. and B. Ustaoglu, "On reusing ephemeral keys in 2589 Diffie-Hellman key agreement protocols", International 2590 Journal of Applied Cryptography Vol. 2, pp. 154, 2591 DOI 10.1504/ijact.2010.038308, 2010. 2593 [doubleratchet] 2594 Cohn-Gordon, K., Cremers, C., Dowling, B., Garratt, L., 2595 and D. Stebila, "A Formal Security Analysis of the Signal 2596 Messaging Protocol", 2017 IEEE European Symposium on 2597 Security and Privacy (EuroS&P), 2598 DOI 10.1109/eurosp.2017.27, April 2017. 2600 [HCJ16] Husak, M., Čermak, M., Jirsik, T., and P. 2601 Čeleda, "HTTPS traffic analysis and client 2602 identification using passive SSL/TLS fingerprinting", 2603 EURASIP Journal on Information Security Vol. 2016, 2604 DOI 10.1186/s13635-016-0030-7, February 2016. 2606 [I-D.ietf-trans-rfc6962-bis] 2607 Laurie, B., Langley, A., Kasper, E., Messeri, E., and R. 2608 Stradling, "Certificate Transparency Version 2.0", draft- 2609 ietf-trans-rfc6962-bis-33 (work in progress), September 2610 2019. 2612 [keyagreement] 2613 Barker, E., Chen, L., Roginsky, A., and M. Smid, 2614 "Recommendation for Pair-Wise Key Establishment Schemes 2615 Using Discrete Logarithm Cryptography", National Institute 2616 of Standards and Technology report, 2617 DOI 10.6028/nist.sp.800-56ar2, May 2013. 2619 [RFC7748] Langley, A., Hamburg, M., and S. Turner, "Elliptic Curves 2620 for Security", RFC 7748, DOI 10.17487/RFC7748, January 2621 2016, . 2623 [signal] Perrin(ed), T. and M. Marlinspike, "The Double Ratchet 2624 Algorithm", n.d., 2625 . 2628 Appendix A. Tree Math 2630 One benefit of using left-balanced trees is that they admit a simple 2631 flat array representation. In this representation, leaf nodes are 2632 even-numbered nodes, with the n-th leaf at 2*n. Intermediate nodes 2633 are held in odd-numbered nodes. For example, a 11-element tree has 2634 the following structure: 2636 X 2637 X 2638 X X X 2639 X X X X X 2640 X X X X X X X X X X X 2641 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 2643 This allows us to compute relationships between tree nodes simply by 2644 manipulating indices, rather than having to maintain complicated 2645 structures in memory, even for partial trees. The basic rule is that 2646 the high-order bits of parent and child nodes have the following 2647 relation (where "x" is an arbitrary bit string): 2649 parent=01x => left=00x, right=10x 2651 The following python code demonstrates the tree computations 2652 necessary for MLS. Test vectors can be derived from the diagram 2653 above. 2655 # The largest power of 2 less than n. Equivalent to: 2656 # int(math.floor(math.log(x, 2))) 2657 def log2(x): 2658 if x == 0: 2659 return 0 2661 k = 0 2662 while (x >> k) > 0: 2663 k += 1 2664 return k-1 2666 # The level of a node in the tree. Leaves are level 0, their 2667 # parents are level 1, etc. If a node's children are at different 2668 # level, then its level is the max level of its children plus one. 2669 def level(x): 2670 if x & 0x01 == 0: 2671 return 0 2673 k = 0 2674 while ((x >> k) & 0x01) == 1: 2675 k += 1 2676 return k 2678 # The number of nodes needed to represent a tree with n leaves 2679 def node_width(n): 2680 return 2*(n - 1) + 1 2682 # The index of the root node of a tree with n leaves 2683 def root(n): 2684 w = node_width(n) 2685 return (1 << log2(w)) - 1 2687 # The left child of an intermediate node. Note that because the 2688 # tree is left-balanced, there is no dependency on the size of the 2689 # tree. The child of a leaf node is itself. 2690 def left(x): 2691 k = level(x) 2692 if k == 0: 2693 return x 2695 return x ^ (0x01 << (k - 1)) 2697 # The right child of an intermediate node. Depends on the size of 2698 # the tree because the straightforward calculation can take you 2699 # beyond the edge of the tree. The child of a leaf node is itself. 2700 def right(x, n): 2701 k = level(x) 2702 if k == 0: 2703 return x 2705 r = x ^ (0x03 << (k - 1)) 2706 while r >= node_width(n): 2707 r = left(r) 2708 return r 2710 # The immediate parent of a node. May be beyond the right edge of 2711 # the tree. 2712 def parent_step(x): 2713 k = level(x) 2714 b = (x >> (k + 1)) & 0x01 2715 return (x | (1 << k)) ^ (b << (k + 1)) 2717 # The parent of a node. As with the right child calculation, have 2718 # to walk back until the parent is within the range of the tree. 2719 def parent(x, n): 2720 if x == root(n): 2721 return x 2723 p = parent_step(x) 2724 while p >= node_width(n): 2725 p = parent_step(p) 2726 return p 2728 # The other child of the node's parent. Root's sibling is itself. 2729 def sibling(x, n): 2730 p = parent(x, n) 2731 if x < p: 2732 return right(p, n) 2733 elif x > p: 2734 return left(p) 2736 return p 2738 # The direct path of a node, ordered from the root 2739 # down, not including the root or the terminal node 2740 def direct_path(x, n): 2741 d = [] 2742 p = parent(x, n) 2743 r = root(n) 2744 while p != r: 2745 d.append(p) 2746 p = parent(p, n) 2747 return d 2749 # The copath of the node is the siblings of the nodes on its direct 2750 # path (including the node itself) 2751 def copath(x, n): 2752 d = dirpath(x, n) 2753 if x != sibling(x, n): 2754 d.append(x) 2756 return [sibling(y, n) for y in d] 2758 # Frontier is the list of full subtrees, from left to right. A 2759 # balanced binary tree with n leaves has a full subtree for every 2760 # power of two where n has a bit set, with the largest subtrees 2761 # furthest to the left. For example, a tree with 11 leaves has full 2762 # subtrees of size 8, 2, and 1. 2764 def frontier(n): 2765 st = [1 << k for k in range(log2(n) + 1) if n & (1 << k) != 0] 2766 st = reversed(st) 2768 base = 0 2769 f = [] 2770 for size in st: 2771 f.append(root(size) + base) 2772 base += 2*size 2773 return f 2775 # Leaves are in even-numbered nodes 2776 def leaves(n): 2777 return [2*i for i in range(n)] 2779 # The resolution of a node is the collection of non-blank 2780 # descendants of this node. Here the tree is represented by a list 2781 # of nodes, where blank nodes are represented by None 2782 def resolve(tree, x, n): 2783 if tree[x] != None: 2784 return [x] 2786 if level(x) == 0: 2787 return [] 2789 L = resolve(tree, left(x), n) 2790 R = resolve(tree, right(x, n), n) 2791 return L + R 2793 Authors' Addresses 2795 Richard Barnes 2796 Cisco 2798 Email: rlb@ipv.sx 2800 Benjamin Beurdouche 2801 Inria 2803 Email: benjamin.beurdouche@inria.fr 2805 Jon Millican 2806 Facebook 2808 Email: jmillican@fb.com 2809 Emad Omara 2810 Google 2812 Email: emadomara@google.com 2814 Katriel Cohn-Gordon 2815 University of Oxford 2817 Email: me@katriel.co.uk 2819 Raphael Robert 2820 Wire 2822 Email: raphael@wire.com