idnits 2.17.1 draft-ietf-mls-protocol-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (October 22, 2018) is 2012 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. Checking references for intended status: Informational ---------------------------------------------------------------------------- == Missing Reference: 'I-D.ietf-mls-architecture' is mentioned on line 243, but not defined == Missing Reference: 'CD' is mentioned on line 586, but not defined == Missing Reference: 'A' is mentioned on line 586, but not defined == Missing Reference: 'N' is mentioned on line 1489, but not defined -- Looks like a reference, but probably isn't: '32' on line 1550 == Outdated reference: A later version (-42) exists of draft-ietf-trans-rfc6962-bis-29 Summary: 0 errors (**), 0 flaws (~~), 6 warnings (==), 3 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group R. Barnes 3 Internet-Draft Cisco 4 Intended status: Informational J. Millican 5 Expires: April 25, 2019 Facebook 6 E. Omara 7 Google 8 K. Cohn-Gordon 9 University of Oxford 10 R. Robert 11 Wire 12 October 22, 2018 14 The Messaging Layer Security (MLS) Protocol 15 draft-ietf-mls-protocol-02 17 Abstract 19 Messaging applications are increasingly making use of end-to-end 20 security mechanisms to ensure that messages are only accessible to 21 the communicating endpoints, and not to any servers involved in 22 delivering messages. Establishing keys to provide such protections 23 is challenging for group chat settings, in which more than two 24 participants need to agree on a key but may not be online at the same 25 time. In this document, we specify a key establishment protocol that 26 provides efficient asynchronous group key establishment with forward 27 secrecy and post-compromise security for groups in size ranging from 28 two to thousands. 30 Status of This Memo 32 This Internet-Draft is submitted in full conformance with the 33 provisions of BCP 78 and BCP 79. 35 Internet-Drafts are working documents of the Internet Engineering 36 Task Force (IETF). Note that other groups may also distribute 37 working documents as Internet-Drafts. The list of current Internet- 38 Drafts is at https://datatracker.ietf.org/drafts/current/. 40 Internet-Drafts are draft documents valid for a maximum of six months 41 and may be updated, replaced, or obsoleted by other documents at any 42 time. It is inappropriate to use Internet-Drafts as reference 43 material or to cite them other than as "work in progress." 45 This Internet-Draft will expire on April 25, 2019. 47 Copyright Notice 49 Copyright (c) 2018 IETF Trust and the persons identified as the 50 document authors. All rights reserved. 52 This document is subject to BCP 78 and the IETF Trust's Legal 53 Provisions Relating to IETF Documents 54 (https://trustee.ietf.org/license-info) in effect on the date of 55 publication of this document. Please review these documents 56 carefully, as they describe your rights and restrictions with respect 57 to this document. Code Components extracted from this document must 58 include Simplified BSD License text as described in Section 4.e of 59 the Trust Legal Provisions and are provided without warranty as 60 described in the Simplified BSD License. 62 Table of Contents 64 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 65 1.1. Change Log . . . . . . . . . . . . . . . . . . . . . . . 4 66 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 67 3. Basic Assumptions . . . . . . . . . . . . . . . . . . . . . . 6 68 4. Protocol Overview . . . . . . . . . . . . . . . . . . . . . . 6 69 5. Ratchet Trees . . . . . . . . . . . . . . . . . . . . . . . . 9 70 5.1. Tree Computation Terminology . . . . . . . . . . . . . . 9 71 5.2. Ratchet Tree Nodes . . . . . . . . . . . . . . . . . . . 12 72 5.3. Blank Nodes and Resolution . . . . . . . . . . . . . . . 13 73 5.4. Ratchet Tree Updates . . . . . . . . . . . . . . . . . . 13 74 5.5. Cryptographic Objects . . . . . . . . . . . . . . . . . . 15 75 5.5.1. Curve25519, SHA-256, and AES-128-GCM . . . . . . . . 15 76 5.5.2. P-256, SHA-256, and AES-128-GCM . . . . . . . . . . . 16 77 5.6. Credentials . . . . . . . . . . . . . . . . . . . . . . . 17 78 5.7. Group State . . . . . . . . . . . . . . . . . . . . . . . 17 79 5.8. Direct Paths . . . . . . . . . . . . . . . . . . . . . . 19 80 5.9. Key Schedule . . . . . . . . . . . . . . . . . . . . . . 20 81 6. Initialization Keys . . . . . . . . . . . . . . . . . . . . . 22 82 7. Handshake Messages . . . . . . . . . . . . . . . . . . . . . 22 83 7.1. Init . . . . . . . . . . . . . . . . . . . . . . . . . . 24 84 7.2. Add . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 85 7.3. Update . . . . . . . . . . . . . . . . . . . . . . . . . 27 86 7.4. Remove . . . . . . . . . . . . . . . . . . . . . . . . . 27 87 8. Sequencing of State Changes . . . . . . . . . . . . . . . . . 28 88 8.1. Server-Enforced Ordering . . . . . . . . . . . . . . . . 29 89 8.2. Client-Enforced Ordering . . . . . . . . . . . . . . . . 29 90 8.3. Merging Updates . . . . . . . . . . . . . . . . . . . . . 30 91 9. Message Protection . . . . . . . . . . . . . . . . . . . . . 31 92 9.1. Application Key Schedule . . . . . . . . . . . . . . . . 31 93 9.1.1. Updating the Application Secret . . . . . . . . . . . 32 94 9.1.2. Application AEAD Key Calculation . . . . . . . . . . 33 96 9.2. Message Encryption and Decryption . . . . . . . . . . . . 33 97 9.2.1. Delayed and Reordered Application messages . . . . . 35 98 10. Security Considerations . . . . . . . . . . . . . . . . . . . 36 99 10.1. Confidentiality of the Group Secrets . . . . . . . . . . 36 100 10.2. Authentication . . . . . . . . . . . . . . . . . . . . . 36 101 10.3. Forward and post-compromise security . . . . . . . . . . 37 102 10.4. Init Key Reuse . . . . . . . . . . . . . . . . . . . . . 37 103 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 37 104 12. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 37 105 13. References . . . . . . . . . . . . . . . . . . . . . . . . . 38 106 13.1. Normative References . . . . . . . . . . . . . . . . . . 38 107 13.2. Informative References . . . . . . . . . . . . . . . . . 39 108 Appendix A. Tree Math . . . . . . . . . . . . . . . . . . . . . 40 109 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 43 111 1. Introduction 113 DISCLAIMER: This is a work-in-progress draft of MLS and has not yet 114 seen significant security analysis. It should not be used as a basis 115 for building production systems. 117 RFC EDITOR: PLEASE REMOVE THE FOLLOWING PARAGRAPH The source for this 118 draft is maintained in GitHub. Suggested changes should be submitted 119 as pull requests at https://github.com/mlswg/mls-protocol. 120 Instructions are on that page as well. Editorial changes can be 121 managed in GitHub, but any substantive change should be discussed on 122 the MLS mailing list. 124 A group of agents who want to send each other encrypted messages 125 needs a way to derive shared symmetric encryption keys. For two 126 parties, this problem has been studied thoroughly, with the Double 127 Ratchet emerging as a common solution [doubleratchet] [signal]. 128 Channels implementing the Double Ratchet enjoy fine-grained forward 129 secrecy as well as post-compromise security, but are nonetheless 130 efficient enough for heavy use over low-bandwidth networks. 132 For a group of size greater than two, a common strategy is to 133 unilaterally broadcast symmetric "sender" keys over existing shared 134 symmetric channels, and then for each agent to send messages to the 135 group encrypted with their own sender key. Unfortunately, while this 136 improves efficiency over pairwise broadcast of individual messages 137 and (with the addition of a hash ratchet) provides forward secrecy, 138 it is difficult to achieve post-compromise security with sender keys. 139 An adversary who learns a sender key can often indefinitely and 140 passively eavesdrop on that sender's messages. Generating and 141 distributing a new sender key provides a form of post-compromise 142 security with regard to that sender. However, it requires 143 computation and communications resources that scale linearly as the 144 size of the group. 146 In this document, we describe a protocol based on tree structures 147 that enable asynchronous group keying with forward secrecy and post- 148 compromise security. Based on earlier work on "asynchronous 149 ratcheting trees" [art], the mechanism presented here use a 150 asynchronous key-encapsulation mechanism for tree structures. This 151 mechanism allows the members of the group to derive and update shared 152 keys with costs that scale as the log of the group size. 154 1.1. Change Log 156 RFC EDITOR PLEASE DELETE THIS SECTION. 158 draft-02 160 o Removed ART (*) 162 o Allowed partial trees to avoid double-joins (*) 164 o Added explicit key confirmation (*) 166 draft-01 168 o Initial description of the Message Protection mechanism. (*) 170 o Initial specification proposal for the Application Key Schedule 171 using the per-participant chaining of the Application Secret 172 design. (*) 174 o Initial specification proposal for an encryption mechanism to 175 protect Application Messages using an AEAD scheme. (*) 177 o Initial specification proposal for an authentication mechanism of 178 Application Messages using signatures. (*) 180 o Initial specification proposal for a padding mechanism to 181 improving protection of Application Messages against traffic 182 analysis. (*) 184 o Inversion of the Group Init Add and Application Secret derivations 185 in the Handshake Key Schedule to be ease chaining in case we 186 switch design. (*) 188 o Removal of the UserAdd construct and split of GroupAdd into Add 189 and Welcome messages (*) 191 o Initial proposal for authenticating Handshake messages by signing 192 over group state and including group state in the key schedule (*) 194 o Added an appendix with example code for tree math 196 o Changed the ECIES mechanism used by TreeKEM so that it uses nonces 197 generated from the shared secret 199 draft-00 201 o Initial adoption of draft-barnes-mls-protocol-01 as a WG item. 203 2. Terminology 205 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 206 "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and 207 "OPTIONAL" in this document are to be interpreted as described in BCP 208 14 [RFC2119] [RFC8174] when, and only when, they appear in all 209 capitals, as shown here. 211 Participant: An agent that uses this protocol to establish shared 212 cryptographic state with other participants. A participant is 213 defined by the cryptographic keys it holds. An application may 214 use one participant per device (keeping keys local to each device) 215 or sync keys among a user's devices so that each user appears as a 216 single participant. 218 Group: A collection of participants with shared cryptographic state. 220 Member: A participant that is included in the shared state of a 221 group, and has access to the group's secrets. 223 Initialization Key: A short-lived Diffie-Hellman key pair used to 224 introduce a new member to a group. Initialization keys are 225 published for individual participants (UserInitKey). 227 Leaf Key: A short-lived Diffie-Hellman key pair that represents a 228 group member's contribution to the group secret, so called because 229 the participants leaf keys are the leaves in the group's ratchet 230 tree. 232 Identity Key: A long-lived signing key pair used to authenticate the 233 sender of a message. 235 Terminology specific to tree computations is described in Section 5. 237 We use the TLS presentation language [RFC8446] to describe the 238 structure of protocol messages. 240 3. Basic Assumptions 242 This protocol is designed to execute in the context of a Messaging 243 Service (MS) as described in [I-D.ietf-mls-architecture]. In 244 particular, we assume the MS provides the following services: 246 o A long-term identity key provider which allows participants to 247 authenticate protocol messages in a group. These keys MUST be 248 kept for the lifetime of the group as there is no mechanism in the 249 protocol for changing a participant's identity key. 251 o A broadcast channel, for each group, which will relay a message to 252 all members of a group. For the most part, we assume that this 253 channel delivers messages in the same order to all participants. 254 (See Section 8 for further considerations.) 256 o A directory to which participants can publish initialization keys, 257 and from which participant can download initialization keys for 258 other participants. 260 4. Protocol Overview 262 The goal of this protocol is to allow a group of participants to 263 exchange confidential and authenticated messages. It does so by 264 deriving a sequence of secrets and keys known only to group members. 265 Those should be secret against an active network adversary and should 266 have both forward and post-compromise secrecy with respect to 267 compromise of a participant. 269 We describe the information stored by each participant as a _state_, 270 which includes both public and private data. An initial state, 271 including an initial set of participants, is set up by a group 272 creator using the _Init_ algorithm and based on information pre- 273 published by the initial members. The creator sends the _GroupInit_ 274 message to the participants, who can then set up their own group 275 state and derive the same shared secret. Participants then exchange 276 messages to produce new shared states which are causally linked to 277 their predecessors, forming a logical Directed Acyclic Graph (DAG) of 278 states. Participants can send _Update_ messages for post-compromise 279 secrecy and new participants can be added or existing participants 280 removed from the group. 282 The protocol algorithms we specify here follow. Each algorithm 283 specifies both (i) how a participant performs the operation and (ii) 284 how other participants update their state based on it. 286 There are four major operations in the lifecycle of a group: 288 o Adding a member, initiated by a current member; 290 o Adding a member, initiated by the new member; 292 o Updating the leaf secret of a member; 294 o Removing a member. 296 Before the initialization of a group, participants publish 297 UserInitKey objects to a directory provided to the Messaging Service. 299 Group 300 A B C Directory Channel 301 | | | | | 302 | UserInitKeyA | | | | 303 |------------------------------------------->| | 304 | | | | | 305 | | UserInitKeyB | | | 306 | |---------------------------->| | 307 | | | | | 308 | | | UserInitKeyC | | 309 | | |------------->| | 310 | | | | | 312 When a participant A wants to establish a group with B and C, it 313 first downloads InitKeys for B and C. It then initializes a group 314 state containing only itself and uses the InitKeys to compute Add 315 messages to add B and C, in a sequence chosen by A. These messages 316 are broadcasted to the Group, and processed in sequence by B and C. 317 Messages received before a participant has joined the group are 318 ignored. Only after A has received its Add messages back from the 319 server does it update its state to reflect their addition. 321 Group 322 A B C Directory Channel 323 | | | | | 324 | UserInitKeyB, UserInitKeyC | | 325 |<-------------------------------------------| | 326 | | | | | 327 | | | | Add(A->AB) | 328 |--------------------------------------------------------------->| 329 | | | | | 330 | | | | Add(AB->ABC) | 331 |--------------------------------------------------------------->| 332 | | | | | 333 | | | | Add(A->AB) | 334 |<---------------------------------------------------------------| 335 |state.add(B) |<------------------------------------------------| 336 | |state.init() |x---------------------------------| 337 | | | | | 338 | | | | Add(AB->ABC) | 339 |<---------------------------------------------------------------| 340 |state.add(C) |<------------------------------------------------| 341 | |state.add(C) |<---------------------------------| 342 | | |state.init() | | 343 | | | | | 345 Subsequent additions of group members proceed in the same way. Any 346 member of the group can download an InitKey for a new participant and 347 broadcast an Add message that the current group can use to update 348 their state and the new participant can use to initialize its state. 350 To enforce forward secrecy and post-compromise security of messages, 351 each participant periodically updates its leaf secret which 352 represents its contribution to the group secret. Any member of the 353 group can send an Update at any time by generating a fresh leaf 354 secret and sending an Update message that describes how to update the 355 group secret with that new information. Once all participants have 356 processed this message, the group's secrets will be unknown to an 357 attacker that had compromised the sender's prior leaf secret. 359 It is left to the application to determine the interval of time 360 between Update messages. This policy could require a change for each 361 message, or it could require sending an update every week or more. 363 Group 364 A B ... Z Directory Channel 365 | | | | | 366 | Update(A) | | | | 367 |---------------------------------------------------------->| 368 | | | | | 369 | | | | Update(A) | 370 |<----------------------------------------------------------| 371 |state.upd(A) |<-------------------------------------------| 372 | |state.upd(A) |<----------------------------| 373 | | |state.upd(A) | | 374 | | | | | 376 Users are removed from the group in a similar way, as an update is 377 effectively removing the old leaf from the group. Any member of the 378 group can generate a Remove message that adds new entropy to the 379 group state that is known to all members except the removed member. 380 After other participants have processed this message, the group's 381 secrets will be unknown to the removed participant. Note that this 382 does not necessarily imply that any member is actually allowed to 383 evict other members; groups can layer authentication-based access 384 control policies on top of these basic mechanism. 386 Group 387 A B ... Z Directory Channel 388 | | | | | 389 | | | Remove(B) | | 390 | | |---------------------------->| 391 | | | | | 392 | | | | Remove(B) | 393 |<----------------------------------------------------------| 394 |state.del(B) | |<----------------------------| 395 | | |state.del(B) | | 396 | | | | | 397 | | | | | 399 5. Ratchet Trees 401 The protocol uses "ratchet trees" for deriving shared secrets among a 402 group of participants. 404 5.1. Tree Computation Terminology 406 Trees consist of _nodes_. A node is a _leaf_ if it has no children, 407 and a _parent_ otherwise; note that all parents in our ratchet trees 408 have precisely two children, a _left_ child and a _right_ child. A 409 node is the _root_ of a tree if it has no parents, and _intermediate_ 410 if it has both children and parents. The _descendants_ of a node are 411 that node, its children, and the descendants of its children, and we 412 say a tree _contains_ a node if that node is a descendant of the root 413 of the tree. Nodes are _siblings_ if they share the same parent. 415 A _subtree_ of a tree is the tree given by the descendants of any 416 node, the _head_ of the subtree. The _size_ of a tree or subtree is 417 the number of leaf nodes it contains. For a given parent node, its 418 _left subtree_ is the subtree with its left child as head 419 (respectively _right subtree_). 421 All trees used in this protocol are left-balanced binary trees. A 422 binary tree is _full_ (and _balanced_) if it its size is a power of 423 two and for any parent node in the tree, its left and right subtrees 424 have the same size. If a subtree is full and it is not a subset of 425 any other full subtree, then it is _maximal_. 427 A binary tree is _left-balanced_ if for every parent, either the 428 parent is balanced, or the left subtree of that parent is the largest 429 full subtree that could be constructed from the leaves present in the 430 parent's own subtree. Note that given a list of "n" items, there is 431 a unique left-balanced binary tree structure with these elements as 432 leaves. In such a left-balanced tree, the "k-th" leaf node refers to 433 the "k-th" leaf node in the tree when counting from the left, 434 starting from 0. 436 The _direct path_ of a root is the empty list, and of any other node 437 is the concatenation of that node with the direct path of its parent. 438 The _copath_ of a node is the list of siblings of nodes in its direct 439 path, excluding the root. The _frontier_ of a tree is the list of 440 heads of the maximal full subtrees of the tree, ordered from left to 441 right. 443 For example, in the below tree: 445 o The direct path of C is (C, CD, ABCD) 447 o The copath of C is (D, AB, EFG) 449 o The frontier of the tree is (ABCD, EF, G) 450 ABCDEFG 451 / \ 452 / \ 453 / \ 454 ABCD EFG 455 / \ / \ 456 / \ / \ 457 AB CD EF | 458 / \ / \ / \ | 459 A B C D E F G 461 1 1 1 462 0 1 2 3 4 5 6 7 8 9 0 1 2 464 Each node in the tree is assigned an _index_, starting at zero and 465 running from left to right. A node is a leaf node if and only if it 466 has an even index. The indices for the nodes in the above tree are 467 as follows: 469 o 0 = A 471 o 1 = AB 473 o 2 = B 475 o 3 = ABCD 477 o 4 = C 479 o 5 = CD 481 o 6 = D 483 o 7 = ABCDEFG 485 o 8 = E 487 o 9 = EF 489 o 10 = F 491 o 11 = EFG 493 o 12 = G 495 (Note that left-balanced binary trees are the same structure that is 496 used for the Merkle trees in the Certificate Transparency protocol 497 [I-D.ietf-trans-rfc6962-bis].) 499 5.2. Ratchet Tree Nodes 501 Ratchet trees are used for generating shared group secrets. In this 502 section, we describe the structure of a ratchet tree. A particular 503 instance of a ratchet tree is based on the following cryptographic 504 primitives, defined by the ciphersuite in use: 506 o A Diffie-Hellman finite-field group or elliptic curve 508 o A Derive-Key-Pair function that produces a key pair from an octet 509 string 511 o A hash function 513 A ratchet tree is a left-balanced binary tree, in which each node 514 contains up to three values: 516 o A secret octet string (optional) 518 o An asymmetric private key (optional) 520 o An asymmetric public key 522 The private key and public key for a node are derived from its secret 523 value using the Derive-Key-Pair operation. 525 The contents of a parent node are computed from one of its children 526 as follows: 528 parent_secret = Hash(child_secret) 529 parent_private, parent_public = Derive-Key-Pair(parent_secret) 531 The contents of the parent are based on the latest-updated child. 532 For example, if participants with leaf secrets A, B, C, and D join a 533 group in that order, then the resulting tree will have the following 534 structure: 536 H(H(D)) 537 / \ 538 H(B) H(D) 539 / \ / \ 540 A B C D 542 If the first participant subsequently changes its leaf secret to be 543 X, then the tree will have the following structure. 545 H(H(X)) 546 / \ 547 H(X) H(D) 548 / \ / \ 549 X B C D 551 5.3. Blank Nodes and Resolution 553 A node in the tree may be _blank_, indicating that no value is 554 present at that node. The _resolution_ of a node is an ordered list 555 of non-blank nodes that collectively cover all non-blank descendants 556 of the node. The nodes in a resolution are ordered according to 557 their indices. 559 o The resolution of a non-blank node is a one element list 560 containing the node itself 562 o The resolution of a blank leaf node is the empty list 564 o The resolution of a blank intermediate node is the result of 565 concatinating the resolution of its left child with the resolution 566 of its right child, in that order 568 For example, consider the following tree, where the "_" character 569 represents a blank node: 571 _ 572 / \ 573 / \ 574 _ CD 575 / \ / \ 576 A _ C D 578 0 1 2 3 4 5 6 580 In this tree, we can see all three of the above rules in play: 582 o The resolution of node 5 is the list [CD] 584 o The resolution of node 2 is the empty list [] 586 o The resolution of node 3 is the list [A, CD] 588 5.4. Ratchet Tree Updates 590 In order to update the state of the group such as adding and removing 591 participants, MLS messages are used to make changes to the group's 592 ratchet tree. The participant proposing an update to the tree 593 transmits a representation of a set of tree nodes along the direct 594 path from a leaf to the root. Other participants in the group can 595 use these nodes to update their view of the tree, aligning their copy 596 of the tree to the sender's. 598 To perform an update for a leaf, the sender transmits the following 599 information for each node in the direct path from leaf leaf to the 600 root: 602 o The public key for the node 604 o Zero or more encrypted copies of the node's secret value 606 The secret value is encrypted for the subtree corresponding to the 607 node's non-updated child, i.e., the child not on the direct path. 608 There is one encrypted secret for each public key in the resolution 609 of the non-updated child. In particular, for the leaf node, there 610 are no encrypted secrets, since a leaf node has no children. 612 The recipient of an update processes it with the following steps: 614 1. Compute the updated secret values * Identify a node in the direct 615 path for which the local participant is in the subtree of the 616 non-updated child * Identify a node in the resolution of the non- 617 updated child for which this node has a private key * Decrypt the 618 secret value for the direct path node using the private key from 619 the resolution node * Compute secret values for ancestors of that 620 node by hashing the decrypted secret 622 2. Merge the updated secrets into the tree * Replace the public keys 623 for nodes on the direct path with the received public keys * For 624 nodes where an updated secret was computed in step 1, replace the 625 secret value for the node with the updated value 627 For example, suppose we had the following tree: 629 G 630 / \ 631 / \ 632 E _ 633 / \ / \ 634 A B C D 636 If an update is made along the direct path B-E-G, then the following 637 values will be transmitted (using pk(X) to represent the public key 638 corresponding to the secret value X and E(K, S) to represent public- 639 key encryption to the public key K of the secret value S): 641 +------------+--------------------------+ 642 | Public Key | Ciphertext(s) | 643 +------------+--------------------------+ 644 | pk(G) | E(pk(C), G), E(pk(D), G) | 645 | | | 646 | pk(E) | E(pk(A), E) | 647 | | | 648 | pk(B) | | 649 +------------+--------------------------+ 651 5.5. Cryptographic Objects 653 Each MLS session uses a single ciphersuite that specifies the 654 following primitives to be used in group key computations: 656 o A hash function 658 o A Diffie-Hellman finite-field group or elliptic curve 660 o An AEAD encryption algorithm [RFC5116] 662 The ciphersuite must also specify an algorithm "Derive-Key-Pair" that 663 maps octet strings with the same length as the output of the hash 664 function to key pairs for the asymmetric encryption scheme. 666 Public keys used in the protocol are opaque values in a format 667 defined by the ciphersuite, using the following types: 669 uint16 CipherSuite; 670 opaque DHPublicKey<1..2^16-1>; 671 opaque SignaturePublicKey<1..2^16-1>; 673 5.5.1. Curve25519, SHA-256, and AES-128-GCM 675 This ciphersuite uses the following primitives: 677 o Hash function: SHA-256 679 o Diffie-Hellman group: Curve25519 [RFC7748] 681 o AEAD: AES-128-GCM 683 Given an octet string X, the private key produced by the Derive-Key- 684 Pair operation is SHA-256(X). (Recall that any 32-octet string is a 685 valid Curve25519 private key.) The corresponding public key is 686 X25519(SHA-256(X), 9). 688 Implementations SHOULD use the approach specified in [RFC7748] to 689 calculate the Diffie-Hellman shared secret. Implementations MUST 690 check whether the computed Diffie-Hellman shared secret is the all- 691 zero value and abort if so, as described in Section 6 of [RFC7748]. 692 If implementers use an alternative implementation of these elliptic 693 curves, they SHOULD perform the additional checks specified in 694 Section 7 of [RFC7748] 696 Encryption keys are derived from shared secrets by taking the first 697 16 bytes of H(Z), where Z is the shared secret and H is SHA-256. 699 5.5.2. P-256, SHA-256, and AES-128-GCM 701 This ciphersuite uses the following primitives: 703 o Hash function: P-256 705 o Diffie-Hellman group: secp256r1 (NIST P-256) 707 o AEAD: AES-128-GCM 709 Given an octet string X, the private key produced by the Derive-Key- 710 Pair operation is SHA-256(X), interpreted as a big-endian integer. 711 The corresponding public key is the result of multiplying the 712 standard P-256 base point by this integer. 714 P-256 ECDH calculations (including parameter and key generation as 715 well as the shared secret calculation) are performed according to 716 [IEEE1363] using the ECKAS-DH1 scheme with the identity map as key 717 derivation function (KDF), so that the shared secret is the 718 x-coordinate of the ECDH shared secret elliptic curve point 719 represented as an octet string. Note that this octet string (Z in 720 IEEE 1363 terminology) as output by FE2OSP, the Field Element to 721 Octet String Conversion Primitive, has constant length for any given 722 field; leading zeros found in this octet string MUST NOT be 723 truncated. 725 (Note that this use of the identity KDF is a technicality. The 726 complete picture is that ECDH is employed with a non-trivial KDF 727 because MLS does not directly use this secret for anything other than 728 for computing other secrets.) 730 Clients MUST validate remote public values by ensuring that the point 731 is a valid point on the elliptic curve. The appropriate validation 732 procedures are defined in Section 4.3.7 of [X962] and alternatively 733 in Section 5.6.2.3 of [keyagreement]. This process consists of three 734 steps: (1) verify that the value is not the point at infinity (O), 735 (2) verify that for Y = (x, y) both integers are in the correct 736 interval, (3) ensure that (x, y) is a correct solution to the 737 elliptic curve equation. For these curves, implementers do not need 738 to verify membership in the correct subgroup. 740 Encryption keys are derived from shared secrets by taking the first 741 16 bytes of H(Z), where Z is the shared secret and H is SHA-256. 743 5.6. Credentials 745 A member of a group authenticates the identities of other 746 participants by means of credentials issued by some authentication 747 system, e.g., a PKI. Each type of credential MUST express the 748 holder's identity as well as the public key of a signature key pair 749 that the holder of the credential will use to sign MLS messages. 750 Credentials MAY also include information that allows a relying party 751 to verify the identity / signing key binding. 753 enum { 754 basic(0), 755 x509(1), 756 (255) 757 } CredentialType; 759 struct { 760 opaque identity<0..2^16-1>; 761 SignaturePublicKey public_key; 762 } BasicCredential; 764 struct { 765 CredentialType credential_type; 766 select (credential_type) { 767 case basic: 768 BasicCredential; 770 case x509: 771 opaque cert_data<1..2^24-1>; 772 }; 773 } Credential; 775 5.7. Group State 777 Each participant in the group maintains a representation of the state 778 of the group: 780 struct { 781 uint8 present; 782 switch (present) { 783 case 0: struct{}; 784 case 1: T value; 785 } 786 } optional; 788 struct { 789 opaque group_id<0..255>; 790 uint32 epoch; 791 optional roster<1..2^32-1>; 792 optional tree<1..2^32-1>; 793 opaque transcript_hash<0..255>; 794 } GroupState; 796 The fields in this state have the following semantics: 798 o The "group_id" field is an application-defined identifier for the 799 group. 801 o The "epoch" field represents the current version of the group key. 803 o The "roster" field contains credentials for the occupied slots in 804 the tree, including the identity and signature public key for the 805 holder of the slot. 807 o The "tree" field contains the public keys corresponding to the 808 nodes of the ratchet tree for this group. The length of this 809 vector MUST be "2*size + 1", where "size" is the length of the 810 roster, since this is the number of nodes in a tree with "size" 811 leaves, according to the structure described in Section 5. 813 o The "transcript" field contains the list of "GroupOperation" 814 messages that led to this state. 816 When a new member is added to the group, an existing member of the 817 group provides the new member with a Welcome message. The Welcome 818 message provides the information the new member needs to initialize 819 its GroupState. 821 Different group operations will have different effects on the group 822 state. These effects are described in their respective subsections 823 of Section 7. The following rules apply to all operations: 825 o The "group_id" field is constant 826 o The "epoch" field increments by one for each GroupOperation that 827 is processed 829 o The "transcript_hash" is updated by a GroupOperation message 830 "operation" in the following way: 832 transcript\_hash\_[n] = Hash(transcript\_hash\_[n-1] || operation) 834 When a new one-member group is created (which requires no 835 GroupOperation), the "transcript_hash" field is set to an all-zero 836 vector of length Hash.length. 838 5.8. Direct Paths 840 As described in Section 5.4, each MLS message needs to transmit node 841 values along the direct path from a leaf to the root. The path 842 contains a public key for the leaf node, and a public key and 843 encrypted secret value for intermediate nodes in the path. In both 844 cases, the path is ordered from the leaf to the root; each node MUST 845 be the parent of its predecessor. 847 struct { 848 DHPublicKey ephemeral_key; 849 opaque ciphertext<0..255>; 850 } ECIESCiphertext; 852 struct { 853 DHPublicKey public_key; 854 ECIESCiphertext node_secrets<0..2^16-1>; 855 } RatchetNode 857 struct { 858 RatchetNode nodes<0..2^16-1>; 859 } DirectPath; 861 The length of the "node\_secrets" vector MUST be zero for the first 862 node in the path. For the remaining elements in the vector, the 863 number of ciphertexts in the "node\_secrets" vector MUST be equal to 864 the length of the resolution of the corresponding copath node. Each 865 ciphertext in the list is the encryption to the corresponding node in 866 the resolution. 868 The ECIESCiphertext values encoding the encrypted secret values are 869 computed as follows: 871 o Generate an ephemeral DH key pair (x, x*G) in the DH group 872 specified by the ciphersuite in use 874 o Compute the shared secret Z with the node's other child 876 o Derive a key and nonce as described below 878 o Encrypt the node's secret value using the AEAD algorithm specified 879 by the ciphersuite in use, with the following inputs: 881 * Key: The key derived from Z 883 * Nonce: The nonce derived from Z 885 * Additional Authenticated Data: The empty octet string 887 * Plaintext: The secret value, without any further formatting 889 o Encode the ECIESCiphertext with the following values: 891 * ephemeral_key: The ephemeral public key x*G 893 * ciphertext: The AEAD output 895 key = HKDF-Expand(Secret, ECIESLabel("key"), Length) 896 nonce = HKDF-Expand(Secret, ECIESLabel("nonce"), Length) 898 Where ECIESLabel is specified as: 900 struct { 901 uint16 length = Length; 902 opaque label<12..255> = "mls10 ecies " + Label; 903 } ECIESLabel; 905 Decryption is performed in the corresponding way, using the private 906 key of the resolution node and the ephemeral public key transmitted 907 in the message. 909 5.9. Key Schedule 911 Group keys are derived using the HKDF-Extract and HKDF-Expand 912 functions as defined in [RFC5869], as well as the functions defined 913 below: 915 Derive-Secret(Secret, Label, State) = 916 HKDF-Expand(Secret, HkdfLabel, Hash.length) 918 Where HkdfLabel is specified as: 920 struct { 921 uint16 length = Length; 922 opaque label<6..255> = "mls10 " + Label; 923 GroupState state = State; 924 } HkdfLabel; 926 The Hash function used by HKDF is the ciphersuite hash algorithm. 927 Hash.length is its output length in bytes. In the below diagram: 929 o HKDF-Extract takes its Salt argument from the top and its IKM 930 argument from the left 932 o Derive-Secret takes its Secret argument from the incoming arrow 934 When processing a handshake message, a participant combines the 935 following information to derive new epoch secrets: 937 o The init secret from the previous epoch 939 o The update secret for the current epoch 941 o The GroupState object for current epoch 943 Given these inputs, the derivation of secrets for an epoch proceeds 944 as shown in the following diagram: 946 init_secret_[n-1] (or 0) 947 | 948 V 949 update_secret -> HKDF-Extract = epoch_secret 950 | 951 +--> Derive-Secret(., "app", GroupState_[n]) 952 | = application_secret 953 | 954 +--> Derive-Secret(., "confirm", GroupState_[n]) 955 | = confirmation_key 956 | 957 V 958 Derive-Secret(., "init", GroupState_[n]) 959 | 960 V 961 init_secret_[n] 963 6. Initialization Keys 965 In order to facilitate asynchronous addition of participants to a 966 group, it is possible to pre-publish initialization keys that provide 967 some public information about a user. UserInitKey messages provide 968 information about a potential group member, that a group member can 969 use to add this user to a group asynchronously. 971 A UserInitKey object specifies what ciphersuites a client supports, 972 as well as providing public keys that the client can use for key 973 derivation and signing. The client's identity key is intended to be 974 stable throughout the lifetime of the group; there is no mechanism to 975 change it. Init keys are intended to be used a very limited number 976 of times, potentially once. (see Section 10.4). 978 The init_keys array MUST have the same length as the cipher_suites 979 array, and each entry in the init_keys array MUST be a public key for 980 the DH group defined by the corresponding entry in the cipher_suites 981 array. 983 The whole structure is signed using the client's identity key. A 984 UserInitKey object with an invalid signature field MUST be considered 985 malformed. The input to the signature computation comprises all of 986 the fields except for the signature field. 988 struct { 989 CipherSuite cipher_suites<0..255>; 990 DHPublicKey init_keys<1..2^16-1>; 991 SignaturePublicKey identity_key; 992 SignatureScheme algorithm; 993 opaque signature<0..2^16-1>; 994 } UserInitKey; 996 7. Handshake Messages 998 Over the lifetime of a group, its state will change for: 1000 o Group initialization 1002 o A current member adding a new participant 1004 o A current participant updating its leaf key 1006 o A current member deleting another current member 1008 In MLS, these changes are accomplished by broadcasting "handshake" 1009 messages to the group. Note that unlike TLS and DTLS, there is not a 1010 consolidated handshake phase to the protocol. Rather, handshake 1011 messages are exchanged throughout the lifetime of a group, whenever a 1012 change is made to the group state. This means an unbounded number of 1013 interleaved application and handshake messages. 1015 An MLS handshake message encapsulates a specific "key exchange" 1016 message that accomplishes a change to the group state. It also 1017 includes a signature by the sender of the message over the GroupState 1018 object representing the state of the group after the change has been 1019 made. 1021 enum { 1022 init(0), 1023 add(1), 1024 update(2), 1025 remove(3), 1026 (255) 1027 } GroupOperationType; 1029 struct { 1030 GroupOperationType msg_type; 1031 select (GroupOperation.msg_type) { 1032 case init: Init; 1033 case add: Add; 1034 case update: Update; 1035 case remove: Remove; 1036 }; 1037 } GroupOperation; 1039 struct { 1040 uint32 prior_epoch; 1041 GroupOperation operation; 1043 uint32 signer_index; 1044 SignatureScheme algorithm; 1045 opaque signature<1..2^16-1>; 1046 opaque confirmation[Hash.length]; 1047 } Handshake; 1049 The high-level flow for processing a Handshake message is as follows: 1051 1. Verify that the "prior_epoch" field of the Handshake message is 1052 equal the "epoch" field of the current GroupState object. 1054 2. Use the "operation" message to produce an updated, provisional 1055 GroupState object incorporating the proposed changes. 1057 3. Look up the public key for slot index "signer_index" from the 1058 roster in the current GroupState object (before the update). 1060 4. Use that public key to verify the "signature" field in the 1061 Handshake message, with the updated GroupState object as input. 1063 5. If the signature fails to verify, discard the updated GroupState 1064 object and consider the Handshake message invalid. 1066 6. Use the "confirmation_key" for the new group state to compute the 1067 finished MAC for this message, as described below, and verify 1068 that it is the same as the "finished_mac" field. 1070 7. If the the above checks are successful, consider the updated 1071 GroupState object as the current state of the group. 1073 The "finished_mac" value is computed over the provisional group state 1074 and the current handshake message (with the confirmation value set to 1075 zero): 1077 struct { 1078 GroupState state; // Provisional group state 1079 Handshake handshake; // Handshake message, confirmation = 0 1080 } ConfirmationData; 1082 confirmation = HMAC(confirmation_key, ConfirmationData) 1084 HMAC [RFC2104] uses the Hash algorithm for the ciphersuite in use. 1086 [[ OPEN ISSUE: The Add and Remove operations create a "double-join" 1087 situation, where a participants leaf key is also known to another 1088 participant. When a participant A is double-joined to another B, 1089 deleting A will not remove them from the conversation, since they 1090 will still hold the leaf key for B. These situations are resolved by 1091 updates, but since operations are asynchronous and participants may 1092 be offline for a long time, the group will need to be able to 1093 maintain security in the presence of double-joins. ]] 1095 [[ OPEN ISSUE: It is not possible for the recipient of a handshake 1096 message to verify that ratchet tree information in the message is 1097 accurate, because each node can only compute the secret and private 1098 key for nodes in its direct path. This creates the possibility that 1099 a malicious participant could cause a denial of service by sending a 1100 handshake message with invalid values for public keys in the ratchet 1101 tree. ]] 1103 7.1. Init 1105 [[ OPEN ISSUE: Direct initialization is currently undefined. A 1106 participant can create a group by initializing its own state to 1107 reflect a group including only itself, then adding the initial 1108 participants. This has computation and communication complexity O(N 1109 log N) instead of the O(N) complexity of direct initialization. ]] 1111 7.2. Add 1113 In order to add a new member to the group, an existing member of the 1114 group must take two actions: 1116 1. Send a Welcome message to the new member 1118 2. Send an Add message to the group (including the new member) 1120 The Welcome message contains the information that the new member 1121 needs to initialize a GroupState object that can be updated to the 1122 current state using the Add message: 1124 struct { 1125 opaque group_id<0..255>; 1126 uint32 epoch; 1127 Credential roster<1..2^32-1>; 1128 PublicKey tree<1..2^32-1>; 1129 GroupOperation transcript<0..2^32-1>; 1130 opaque init_secret<0..255>; 1131 } Welcome; 1133 Note that the "init_secret" in the Welcome message is the 1134 "init_secret" at the output of the key schedule diagram in 1135 Section 5.9. That is, if the "epoch" value in the Welcome message is 1136 "n", then the "init_secret" value is "init_secret_[n]". The new 1137 member can combine this init secret with the update secret 1138 transmitted in the corresponding Add message to get the epoch secret 1139 for the epoch in which it is added. No secrets from prior epochs are 1140 revealed to the new member. 1142 Since the new member is expected to process the Add message for 1143 itself, the Welcome message should reflect the state of the group 1144 before the new user is added. The sender of the Welcome message can 1145 simply copy all fields except the "leaf_secret" from its GroupState 1146 object. 1148 [[ OPEN ISSUE: The Welcome message needs to be sent encrypted for the 1149 new member. This should be done using the public key in the 1150 UserInitKey, either with ECIES or X3DH. ]] 1152 [[ OPEN ISSUE: The Welcome message needs to be synchronized in the 1153 same way as the Add. That is, the Welcome should be sent only if the 1154 Add succeeds, and is not in conflict with another, simultaneous Add. 1155 ]] 1156 An Add message provides existing group members with the information 1157 they need to update their GroupState with information about the new 1158 member: 1160 struct { 1161 UserInitKey init_key; 1162 } Add; 1164 A group member generates this message by requesting a UserInitKey 1165 from the directory for the user to be added, and encoding it into an 1166 Add message. 1168 The new participant processes Welcome and Add messages together as 1169 follows: 1171 o Prepare a new GroupState object based on the Welcome message 1173 o Process the Add message as an existing participant would 1175 An existing participant receiving a Add message first verifies the 1176 signature on the message, then updates its state as follows: 1178 o Increment the size of the group 1180 o Verify the signature on the included UserInitKey; if the signature 1181 verification fails, abort 1183 o Append an entry to the roster containing the credential in the 1184 included UserInitKey 1186 o Update the ratchet tree by adding a new leaf node for the new 1187 member, containing the public key from the UserInitKey in the Add 1188 corresponding to the ciphersuite in use 1190 o Update the ratchet tree by setting to blank all nodes in the 1191 direct path of the new node, except for the leaf (which remains 1192 set to the new member's public key) 1194 The update secret resulting from this change is an all-zero octet 1195 string of length Hash.length. 1197 On receipt of an Add message, new participants SHOULD send an update 1198 immediately to their key. This will help to limit the tree structure 1199 degrading into subtrees, and thus maintain the protocol's efficiency. 1201 7.3. Update 1203 An Update message is sent by a group participant to update its leaf 1204 key pair. This operation provides post-compromise security with 1205 regard to the participant's prior leaf private key. 1207 struct { 1208 DirectPath path; 1209 } Update; 1211 The sender of an Update message creates it in the following way: 1213 o Generate a fresh leaf key pair 1215 o Compute its direct path in the current ratchet tree 1217 An existing participant receiving a Update message first verifies the 1218 signature on the message, then updates its state as follows: 1220 o Update the cached ratchet tree by replacing nodes in the direct 1221 path from the updated leaf using the information contained in the 1222 Update message 1224 The update secret resulting from this change is the secret for the 1225 root node of the ratchet tree. 1227 7.4. Remove 1229 A Remove message is sent by a group member to remove one or more 1230 participants from the group. 1232 struct { 1233 uint32 removed; 1234 DirectPath path; 1235 } Remove; 1237 The sender of a Remove message generates it as as follows: 1239 o Generate a fresh leaf key pair 1241 o Compute its direct path in the current ratchet tree, starting from 1242 the removed leaf 1244 An existing participant receiving a Remove message first verifies the 1245 signature on the message, then verifies its identity proof against 1246 the identity tree held by the participant. The participant then 1247 updates its state as follows: 1249 o Update the roster by setting the credential in the removed slot to 1250 the null optional value 1252 o Update the ratchet tree by replacing nodes in the direct path from 1253 the removed leaf using the information in the Remove message 1255 o Update the ratchet tree by setting to blank all nodes in the 1256 direct path from the removed leaf to the root 1258 The update secret resulting from this change is the secret for the 1259 root node of the ratchet tree after the second step (after the third 1260 step, the root is blank). 1262 8. Sequencing of State Changes 1264 [[ OPEN ISSUE: This section has an initial set of considerations 1265 regarding sequencing. It would be good to have some more detailed 1266 discussion, and hopefully have a mechanism to deal with this issue. 1267 ]] 1269 Each handshake message is premised on a given starting state, 1270 indicated in its "prior_epoch" field. If the changes implied by a 1271 handshake messages are made starting from a different state, the 1272 results will be incorrect. 1274 This need for sequencing is not a problem as long as each time a 1275 group member sends a handshake message, it is based on the most 1276 current state of the group. In practice, however, there is a risk 1277 that two members will generate handshake messages simultaneously, 1278 based on the same state. 1280 When this happens, there is a need for the members of the group to 1281 deconflict the simultaneous handshake messages. There are two 1282 general approaches: 1284 o Have the delivery service enforce a total order 1286 o Have a signal in the message that clients can use to break ties 1288 As long as handshake messages cannot be merged, there is a risk of 1289 starvation. In a sufficiently busy group, a given member may never 1290 be able to send a handshake message, because he always loses to other 1291 members. The degree to which this is a practical problem will depend 1292 on the dynamics of the application. 1294 It might be possible, because of the non-contributivity of 1295 intermediate nodes, that update messages could be applied one after 1296 the other without the Delivery Service having to reject any handshake 1297 message, which would make MLS more resilient regarding the 1298 concurrency of handshake messages. The Messaging system can decide 1299 to choose the order for applying the state changes. Note that there 1300 are certain cases (if no total ordering is applied by the Delivery 1301 Service) where the ordering is important for security, ie. all 1302 updates must be executed before removes. 1304 Regardless of how messages are kept in sequence, implementations MUST 1305 only update their cryptographic state when valid handshake messages 1306 are received. Generation of handshake messages MUST be stateless, 1307 since the endpoint cannot know at that time whether the change 1308 implied by the handshake message will succeed or not. 1310 8.1. Server-Enforced Ordering 1312 With this approach, the delivery service ensures that incoming 1313 messages are added to an ordered queue and outgoing messages are 1314 dispatched in the same order. The server is trusted to resolve 1315 conflicts during race-conditions (when two members send a message at 1316 the same time), as the server doesn't have any additional knowledge 1317 thanks to the confidentiality of the messages. 1319 Messages should have a counter field sent in clear-text that can be 1320 checked by the server and used for tie-breaking. The counter starts 1321 at 0 and is incremented for every new incoming message. If two group 1322 members send a message with the same counter, the first message to 1323 arrive will be accepted by the server and the second one will be 1324 rejected. The rejected message needs to be sent again with the 1325 correct counter number. 1327 To prevent counter manipulation by the server, the counter's 1328 integrity can be ensured by including the counter in a signed message 1329 envelope. 1331 This applies to all messages, not only state changing messages. 1333 8.2. Client-Enforced Ordering 1335 Order enforcement can be implemented on the client as well, one way 1336 to achieve it is to use a two step update protocol: the first client 1337 sends a proposal to update and the proposal is accepted when it gets 1338 50%+ approval from the rest of the group, then it sends the approved 1339 update. Clients which didn't get their proposal accepted, will wait 1340 for the winner to send their update before retrying new proposals. 1342 While this seems safer as it doesn't rely on the server, it is more 1343 complex and harder to implement. It also could cause starvation for 1344 some clients if they keep failing to get their proposal accepted. 1346 8.3. Merging Updates 1348 It is possible in principle to partly address the problem of 1349 concurrent changes by having the recipients of the changes merge 1350 them, rather than having the senders retry. Because the value of 1351 intermediate node is determined by its last updated child, updates 1352 can be merged by recipients as long as the recipients agree on an 1353 order - the only question is which node was last updated. 1355 Recall that the processing of an update proceeds in two steps: 1357 1. Compute updated secret values by hashing up the tree 1359 2. Update the tree with the new secret and public values 1361 To merge an ordered list of updates, a recipient simply performs 1362 these updates in the specified order. 1364 For example, suppose we have a tree in the following configuration: 1366 H(H(D)) 1367 / \ 1368 H(B) H(D) 1369 / \ / \ 1370 A B C D 1372 Now suppose B and C simultaneously decide to update to X and Y, 1373 respectively. They will send out updates of the following form: 1375 Update from B Update from C 1376 ============= ============= 1377 H(H(X)) H(H(Y)) 1378 / \ 1379 H(X) H(Y) 1380 \ / 1381 X Y 1383 Assuming that the ordering agreed by the group says that B's update 1384 should be processed before C's, the other participants in the group 1385 will overwrite the root value for B with the root value from C, and 1386 all arrive at the following state: 1388 H(H(Y)) 1389 / \ 1390 H(X) H(Y) 1391 / \ / \ 1392 A X Y D 1394 9. Message Protection 1396 The primary purpose of the handshake protocol is to provide an 1397 authenticated group key exchange to participants. In order to 1398 protect Application messages sent among those participants, the 1399 Application secret provided by the Handshake key schedule is used to 1400 derive encryption keys for the Message Protection Layer. 1402 Application messages MUST be protected with the Authenticated- 1403 Encryption with Associated-Data (AEAD) encryption scheme associated 1404 with the MLS ciphersuite. Note that "Authenticated" in this context 1405 does not mean messages are known to be sent by a specific participant 1406 but only from a legitimate member of the group. To authenticate a 1407 message from a particular member, signatures are required. Handshake 1408 messages MUST use asymmetric signatures to strongly authenticate the 1409 sender of a message. 1411 Each participant maintains their own chain of Application secrets, 1412 where the first one is derived based on a secret chained from the 1413 Epoch secret. As shown in Section 5.9, the initial Application 1414 secret is bound to the identity of each participant to avoid 1415 collisions and allow support for decryption of reordered messages. 1417 Subsequent Application secrets MUST be rotated for each message sent 1418 in order to provide stronger cryptographic security guarantees. The 1419 Application Key Schedule use this rotation to generate fresh AEAD 1420 encryption keys and nonces used to encrypt and decrypt future 1421 Application messages. In all cases, a participant MUST NOT encrypt 1422 more than expected by the security bounds of the AEAD scheme used. 1424 Note that each change to the Group through a Handshake message will 1425 cause a change of the Group Secret. Hence this change MUST be 1426 applied before encrypting any new Application message. This is 1427 required for confidentiality reasons in order for Members to avoid 1428 receiving messages from the group after leaving, being added to, or 1429 excluded from the Group. 1431 9.1. Application Key Schedule 1433 After computing the initial Application Secret shared by the group, 1434 each Participant creates an initial Participant Application Secret to 1435 be used for its own sending chain: 1437 application_secret 1438 | 1439 V 1440 Derive-Secret(., "app sender", [sender]) 1441 | 1442 V 1443 application_secret_[sender]_[0] 1445 Note that [sender] represent the uint32 value encoding the index of 1446 the participant in the ratchet tree. 1448 Updating the Application secret and deriving the associated AEAD key 1449 and nonce can be summarized as the following Application key schedule 1450 where each participant's Application secret chain looks as follows 1451 after the initial derivation: 1453 application_secret_[sender]_[N-1] 1454 | 1455 +--> HKDF-Expand-Label(.,"nonce", "", nonce_length) 1456 | = write_nonce_[sender]_[N-1] 1457 | 1458 +--> HKDF-Expand-Label(.,"key", "", key_length) 1459 | = write_key_[sender]_[N-1] 1460 V 1461 Derive-Secret(., "app upd","") 1462 | 1463 V 1464 application_secret_[sender]_[N] 1466 The Application context provided together with the previous 1467 Application secret is used to bind the Application messages with the 1468 next key and add some freshness. 1470 [[OPEN ISSUE: The HKDF context field is left empty for now. A proper 1471 security study is needed to make sure that we do not need more 1472 information in the context to achieve the security goals.]] 1474 [[ OPEN ISSUE: At the moment there is no contributivity of 1475 Application secrets chained from the initial one to the next 1476 generation of Epoch secret. While this seems safe because 1477 cryptographic operations using the application secrets can't affect 1478 the group init_secret, it remains to be proven correct. ]] 1480 9.1.1. Updating the Application Secret 1482 The following rules apply to an Application Secret: 1484 o Senders MUST only use the Application Secret once and 1485 monotonically increment the generation of their secret. This is 1486 important to provide Forward Secrecy at the level of Application 1487 messages. An attacker getting hold of a Participant's Application 1488 Secret at generation [N+1] will not be able to derive the 1489 Participant's Application Secret [N] nor the associated AEAD key 1490 and nonce. 1492 o Receivers MUST delete an Application Secret once it has been used 1493 to derive the corresponding AEAD key and nonce as well as the next 1494 Application Secret. Receivers MAY keep the AEAD key and nonce 1495 around for some reasonable period. 1497 o Receivers MUST delete AEAD keys and nonces once they have been 1498 used to successfully decrypt a message. 1500 9.1.2. Application AEAD Key Calculation 1502 The Application AEAD keying material is generated from the following 1503 input values: 1505 o The Application Secret value; 1507 o A purpose value indicating the specific value being generated; 1509 o The length of the key being generated. 1511 Note, that because the identity of the participant using the keys to 1512 send data is included in the initial Application Secret, all 1513 successive updates to the Application secret will implicitly inherit 1514 this ownership. 1516 All the traffic keying material is recomputed whenever the underlying 1517 Application Secret changes. 1519 9.2. Message Encryption and Decryption 1521 The Group participants MUST use the AEAD algorithm associated with 1522 the negotiated MLS ciphersuite to AEAD encrypt and decrypt their 1523 Application messages and sign them as follows: 1525 struct { 1526 opaque content<0..2^32-1>; 1527 opaque signature<0..2^16-1>; 1528 uint8 zeros[length_of_padding]; 1529 } ApplicationPlaintext; 1531 struct { 1532 uint8 group[32]; 1533 uint32 epoch; 1534 uint32 generation; 1535 uint32 sender; 1536 opaque encrypted_content<0..2^32-1>; 1537 } Application; 1539 The Group identifier and epoch allow a device to know which Group 1540 secrets should be used and from which Epoch secret to start computing 1541 other secrets and keys. The participant identifier is used to derive 1542 the participant Application secret chain from the initial shared 1543 Application secret. The application generation field is used to 1544 determine which Application secret should be used from the chain to 1545 compute the correct AEAD keys before performing decryption. 1547 The signature field allows strong authentication of messages: 1549 struct { 1550 uint8 group[32]; 1551 uint32 epoch; 1552 uint32 generation; 1553 uint32 sender; 1554 opaque content<0..2^32-1>; 1555 } MLSSignatureContent; 1557 The signature used in the MLSPlaintext is computed over the 1558 MLSSignatureContent which covers the metadata information about the 1559 current state of the group (group identifier, epoch, generation and 1560 sender's Leaf index) to prevent Group participants from impersonating 1561 other participants. It is also necessary in order to prevent cross- 1562 group attacks. 1564 [[ TODO: A preliminary formal security analysis has yet to be 1565 performed on this authentication scheme.]] 1567 [[ OPEN ISSUE: Currently, the group identifier, epoch and generation 1568 are contained as meta-data of the Signature. A different solution 1569 could be to include the GroupState instead, if more information is 1570 required to achieve the security goals regarding cross-group attacks. 1571 ]] 1573 [[ OPEN ISSUE: Should the padding be required for Handshake messages 1574 ? Can an adversary get more than the position of a participant in the 1575 tree without padding ? Should the base ciphertext block length be 1576 negotiated or is is reasonable to allow to leak a range for the 1577 length of the plaintext by allowing to send a variable number of 1578 ciphertext blocks ? ]] 1580 Application messages SHOULD be padded to provide some resistance 1581 against traffic analysis techniques over encrypted traffic. [CLINIC] 1582 [HCJ16] While MLS might deliver the same payload less frequently 1583 across a lot of ciphertexts than traditional web servers, it might 1584 still provide the attacker enough information to mount an attack. If 1585 Alice asks Bob: "When are we going to the movie ?" the answer 1586 "Wednesday" might be leaked to an adversary by the ciphertext length. 1587 An attacker expecting Alice to answer Bob with a day of the week 1588 might find out the plaintext by correlation between the question and 1589 the length. 1591 Similarly to TLS 1.3, if padding is used, the MLS messages MUST be 1592 padded with zero-valued bytes before AEAD encryption. Upon AEAD 1593 decryption, the length field of the plaintext is used to compute the 1594 number of bytes to be removed from the plaintext to get the correct 1595 data. As the padding mechanism is used to improve protection against 1596 traffic analysis, removal of the padding SHOULD be implemented in a 1597 "constant-time" manner at the MLS layer and above layers to prevent 1598 timing side-channels that would provide attackers with information on 1599 the size of the plaintext. 1601 9.2.1. Delayed and Reordered Application messages 1603 Since each Application message contains the Group identifier, the 1604 epoch and a message counter, a participant can receive messages out 1605 of order. If they are able to retrieve or recompute the correct AEAD 1606 decryption key from currently stored cryptographic material 1607 participants can decrypt these messages. 1609 For usability, MLS Participants might be required to keep the AEAD 1610 key and nonce for a certain amount of time to retain the ability to 1611 decrypt delayed or out of order messages, possibly still in transit 1612 while a decryption is being done. 1614 [[TODO: Describe here or in the Architecture spec the details. 1615 Depending on which Secret or key is kept alive, the security 1616 guarantees will vary.]] 1618 10. Security Considerations 1620 The security goals of MLS are described in [I-D.ietf-mls- 1621 architecture]. We describe here how the protocol achieves its goals 1622 at a high level, though a complete security analysis is outside of 1623 the scope of this document. 1625 10.1. Confidentiality of the Group Secrets 1627 Group secrets are derived from (i) previous group secrets, and (ii) 1628 the root key of a ratcheting tree. Only group members know their 1629 leaf private key in the group, therefore, the root key of the group's 1630 ratcheting tree is secret and thus so are all values derived from it. 1632 Initial leaf keys are known only by their owner and the group 1633 creator, because they are derived from an authenticated key exchange 1634 protocol. Subsequent leaf keys are known only by their owner. 1635 [[TODO: or by someone who replaced them.]] 1637 Note that the long-term identity keys used by the protocol MUST be 1638 distributed by an "honest" authentication service for parties to 1639 authenticate their legitimate peers. 1641 10.2. Authentication 1643 There are two forms of authentication we consider. The first form 1644 considers authentication with respect to the group. That is, the 1645 group members can verify that a message originated from one of the 1646 members of the group. This is implicitly guaranteed by the secrecy 1647 of the shared key derived from the ratcheting trees: if all members 1648 of the group are honest, then the shared group key is only known to 1649 the group members. By using AEAD or appropriate MAC with this shared 1650 key, we can guarantee that a participant in the group (who knows the 1651 shared secret key) has sent a message. 1653 The second form considers authentication with respect to the sender, 1654 meaning the group members can verify that a message originated from a 1655 particular member of the group. This property is provided by digital 1656 signatures on the messages under identity keys. 1658 [[ OPEN ISSUE: Signatures under the identity keys, while simple, have 1659 the side-effect of preclude deniability. We may wish to allow other 1660 options, such as (ii) a key chained off of the identity key, or (iii) 1661 some other key obtained through a different manner, such as a 1662 pairwise channel that provides deniability for the message 1663 contents.]] 1665 10.3. Forward and post-compromise security 1667 Message encryption keys are derived via a hash ratchet, which 1668 provides a form of forward secrecy: learning a message key does not 1669 reveal previous message or root keys. Post-compromise security is 1670 provided by Update operations, in which a new root key is generated 1671 from the latest ratcheting tree. If the adversary cannot derive the 1672 updated root key after an Update operation, it cannot compute any 1673 derived secrets. 1675 10.4. Init Key Reuse 1677 Initialization keys are intended to be used only once and then 1678 deleted. Reuse of init keys is not believed to be inherently 1679 insecure [dhreuse], although it can complicate protocol analyses. 1681 11. IANA Considerations 1683 TODO: Registries for protocol parameters, e.g., ciphersuites 1685 12. Contributors 1687 o Benjamin Beurdouche 1688 INRIA 1689 benjamin.beurdouche@ens.fr 1691 o Karthikeyan Bhargavan 1692 INRIA 1693 karthikeyan.bhargavan@inria.fr 1695 o Cas Cremers 1696 University of Oxford 1697 cas.cremers@cs.ox.ac.uk 1699 o Alan Duric 1700 Wire 1701 alan@wire.com 1703 o Srinivas Inguva 1704 Twitter 1705 singuva@twitter.com 1707 o Albert Kwon 1708 MIT 1709 kwonal@mit.edu 1711 o Eric Rescorla 1712 Mozilla 1713 ekr@rtfm.com 1715 o Thyla van der Merwe 1716 Royal Holloway, University of London 1717 thyla.van.der@merwe.tech 1719 13. References 1721 13.1. Normative References 1723 [IEEE1363] 1724 "IEEE Standard Specifications for Password-Based Public- 1725 Key Cryptographic Techniques", IEEE standard, 1726 DOI 10.1109/ieeestd.2009.4773330, n.d.. 1728 [RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed- 1729 Hashing for Message Authentication", RFC 2104, 1730 DOI 10.17487/RFC2104, February 1997, 1731 . 1733 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1734 Requirement Levels", BCP 14, RFC 2119, 1735 DOI 10.17487/RFC2119, March 1997, 1736 . 1738 [RFC5116] McGrew, D., "An Interface and Algorithms for Authenticated 1739 Encryption", RFC 5116, DOI 10.17487/RFC5116, January 2008, 1740 . 1742 [RFC5869] Krawczyk, H. and P. Eronen, "HMAC-based Extract-and-Expand 1743 Key Derivation Function (HKDF)", RFC 5869, 1744 DOI 10.17487/RFC5869, May 2010, 1745 . 1747 [RFC7748] Langley, A., Hamburg, M., and S. Turner, "Elliptic Curves 1748 for Security", RFC 7748, DOI 10.17487/RFC7748, January 1749 2016, . 1751 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 1752 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 1753 May 2017, . 1755 [RFC8446] Rescorla, E., "The Transport Layer Security (TLS) Protocol 1756 Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, 1757 . 1759 [X962] ANSI, "Public Key Cryptography For The Financial Services 1760 Industry: The Elliptic Curve Digital Signature Algorithm 1761 (ECDSA)", ANSI X9.62, 1998. 1763 13.2. Informative References 1765 [art] Cohn-Gordon, K., Cremers, C., Garratt, L., Millican, J., 1766 and K. Milner, "On Ends-to-Ends Encryption: Asynchronous 1767 Group Messaging with Strong Security Guarantees", January 1768 2018, . 1770 [CLINIC] Miller, B., Huang, L., Joseph, A., and J. Tygar, "I Know 1771 Why You Went to the Clinic: Risks and Realization of HTTPS 1772 Traffic Analysis", Privacy Enhancing Technologies pp. 1773 143-163, DOI 10.1007/978-3-319-08506-7_8, 2014. 1775 [dhreuse] Menezes, A. and B. Ustaoglu, "On reusing ephemeral keys in 1776 Diffie-Hellman key agreement protocols", International 1777 Journal of Applied Cryptography Vol. 2, pp. 154, 1778 DOI 10.1504/ijact.2010.038308, 2010. 1780 [doubleratchet] 1781 Cohn-Gordon, K., Cremers, C., Dowling, B., Garratt, L., 1782 and D. Stebila, "A Formal Security Analysis of the Signal 1783 Messaging Protocol", 2017 IEEE European Symposium on 1784 Security and Privacy (EuroS&P), 1785 DOI 10.1109/eurosp.2017.27, April 2017. 1787 [HCJ16] Husak, M., Čermak, M., Jirsik, T., and P. 1788 Čeleda, "HTTPS traffic analysis and client 1789 identification using passive SSL/TLS fingerprinting", 1790 EURASIP Journal on Information Security Vol. 2016, 1791 DOI 10.1186/s13635-016-0030-7, February 2016. 1793 [I-D.ietf-trans-rfc6962-bis] 1794 Laurie, B., Langley, A., Kasper, E., Messeri, E., and R. 1795 Stradling, "Certificate Transparency Version 2.0", draft- 1796 ietf-trans-rfc6962-bis-29 (work in progress), October 1797 2018. 1799 [keyagreement] 1800 Barker, E., Chen, L., Roginsky, A., and M. Smid, 1801 "Recommendation for Pair-Wise Key Establishment Schemes 1802 Using Discrete Logarithm Cryptography", National Institute 1803 of Standards and Technology report, 1804 DOI 10.6028/nist.sp.800-56ar2, May 2013. 1806 [signal] Perrin(ed), T. and M. Marlinspike, "The Double Ratchet 1807 Algorithm", n.d., 1808 . 1811 Appendix A. Tree Math 1813 One benefit of using left-balanced trees is that they admit a simple 1814 flat array representation. In this representation, leaf nodes are 1815 even-numbered nodes, with the n-th leaf at 2*n. Intermediate nodes 1816 are held in odd-numbered nodes. For example, a 11-element tree has 1817 the following structure: 1819 X 1820 X 1821 X X X 1822 X X X X X 1823 X X X X X X X X X X X 1824 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 1826 This allows us to compute relationships between tree nodes simply by 1827 manipulating indices, rather than having to maintain complicated 1828 structures in memory, even for partial trees. The basic rule is that 1829 the high-order bits of parent and child nodes have the following 1830 relation (where "x" is an arbitrary bit string): 1832 parent=01x => left=00x, right=10x 1834 The following python code demonstrates the tree computations 1835 necessary for MLS. Test vectors can be derived from the diagram 1836 above. 1838 # The largest power of 2 less than n. Equivalent to: 1839 # int(math.floor(math.log(x, 2))) 1840 def log2(x): 1841 if x == 0: 1842 return 0 1844 k = 0 1845 while (x >> k) > 0: 1846 k += 1 1847 return k-1 1849 # The level of a node in the tree. Leaves are level 0, their 1850 # parents are level 1, etc. If a node's children are at different 1851 # level, then its level is the max level of its children plus one. 1852 def level(x): 1853 if x & 0x01 == 0: 1855 return 0 1857 k = 0 1858 while ((x >> k) & 0x01) == 1: 1859 k += 1 1860 return k 1862 # The number of nodes needed to represent a tree with n leaves 1863 def node_width(n): 1864 return 2*(n - 1) + 1 1866 # The index of the root node of a tree with n leaves 1867 def root(n): 1868 w = node_width(n) 1869 return (1 << log2(w)) - 1 1871 # The left child of an intermediate node. Note that because the 1872 # tree is left-balanced, there is no dependency on the size of the 1873 # tree. The child of a leaf node is itself. 1874 def left(x): 1875 k = level(x) 1876 if k == 0: 1877 return x 1879 return x ^ (0x01 << (k - 1)) 1881 # The right child of an intermediate node. Depends on the size of 1882 # the tree because the straightforward calculation can take you 1883 # beyond the edge of the tree. The child of a leaf node is itself. 1884 def right(x, n): 1885 k = level(x) 1886 if k == 0: 1887 return x 1889 r = x ^ (0x03 << (k - 1)) 1890 while r >= node_width(n): 1891 r = left(r) 1892 return r 1894 # The immediate parent of a node. May be beyond the right edge of 1895 # the tree. 1896 def parent_step(x): 1897 k = level(x) 1898 b = (x >> (k + 1)) & 0x01 1899 return (x | (1 << k)) ^ (b << (k + 1)) 1901 # The parent of a node. As with the right child calculation, have 1902 # to walk back until the parent is within the range of the tree. 1904 def parent(x, n): 1905 if x == root(n): 1906 return x 1908 p = parent_step(x) 1909 while p >= node_width(n): 1910 p = parent_step(p) 1911 return p 1913 # The other child of the node's parent. Root's sibling is itself. 1914 def sibling(x, n): 1915 p = parent(x, n) 1916 if x < p: 1917 return right(p, n) 1918 elif x > p: 1919 return left(p) 1921 return p 1923 # The direct path from a node to the root, ordered from the root 1924 # down, not including the root or the terminal node 1925 def direct_path(x, n): 1926 d = [] 1927 p = parent(x, n) 1928 r = root(n) 1929 while p != r: 1930 d.append(p) 1931 p = parent(p, n) 1932 return d 1934 # The copath of the node is the siblings of the nodes on its direct 1935 # path (including the node itself) 1936 def copath(x, n): 1937 d = dirpath(x, n) 1938 if x != sibling(x, n): 1939 d.append(x) 1941 return [sibling(y, n) for y in d] 1943 # Frontier is is the list of full subtrees, from left to right. A 1944 # balance binary tree with n leaves has a full subtree for every 1945 # power of two where n has a bit set, with the largest subtrees 1946 # furthest to the left. For example, a tree with 11 leaves has full 1947 # subtrees of size 8, 2, and 1. 1948 def frontier(n): 1949 st = [1 << k for k in range(log2(n) + 1) if n & (1 << k) != 0] 1950 st = reversed(st) 1951 base = 0 1952 f = [] 1953 for size in st: 1954 f.append(root(size) + base) 1955 base += 2*size 1956 return f 1958 # Leaves are in even-numbered nodes 1959 def leaves(n): 1960 return [2*i for i in range(n)] 1962 # The resolution of a node is the collection of non-blank 1963 # descendants of this node. Here the tree is represented by a list 1964 # of nodes, where blank nodes are represented by None 1965 def resolve(tree, x, n): 1966 if tree[x] != None: 1967 return [x] 1969 if level(x) == 0: 1970 return [] 1972 L = resolve(tree, left(x), n) 1973 R = resolve(tree, right(x, n), n) 1974 return L + R 1976 Authors' Addresses 1978 Richard Barnes 1979 Cisco 1981 Email: rlb@ipv.sx 1983 Jon Millican 1984 Facebook 1986 Email: jmillican@fb.com 1988 Emad Omara 1989 Google 1991 Email: emadomara@google.com 1992 Katriel Cohn-Gordon 1993 University of Oxford 1995 Email: me@katriel.co.uk 1997 Raphael Robert 1998 Wire 2000 Email: raphael@wire.com