AVT WG                                                     P. Zimmermann
Internet-Draft                        Phil Zimmermann and Associates LLC
Expires: August 30, 2006                                A. Johnston, Ed.
                                                              SIPStation
                                                       February 26, 2006


   ZRTP: Extensions to RTP for Diffie-Hellman Key Agreement for SRTP
                      draft-zimmermann-avt-zrtp-00

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on August 30, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   This document defines ZRTP, RTP (Real-time Transport Protocol) header
   extensions for a Diffie-Hellman exchange to agree on a session key
   and parameters for establishing Secure RTP (SRTP) sessions. The ZRTP
   protocol is completely self-contained in RTP and does not require
   support in the signaling protocol or assume a Public Key
   Infrastructure (PKI). For the media session, ZRTP
   provides confidentiality, protection against Man in the Middle (MitM)
   attacks, and, in cases where a secret is available from the signaling
   protocol, authentication.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Protocol Description
     3.1.  Overview
     3.2.  Key Agreement Algorithm
       3.2.1.  Discovery
       3.2.2.  Hash Commitment
       3.2.3.  Diffie-Hellman Exchange
       3.2.4.  Confirmation and Switch to SRTP
     3.3.  Random Number Generation
   4.  RTP Header Extensions
     4.1.  ZRTP Message Formats
       4.1.1.  Message Type Block
       4.1.2.  Message Type Block
       4.1.3.  Cipher Type Block
       4.1.4.  Public Key Type Block
       4.1.5.  SAS Type Block
     4.2.  Hello message
     4.3.  HelloACK message
     4.4.  Commit message
     4.5.  DHPart1 message
     4.6.  DHPart2 message
     4.7.  Confirm1 message
     4.8.  Confirm2 message
     4.9.  Conf2ACK message
     4.10. Error message
     4.11. GoClear message
     4.12. ClearACK message
   5.  Retransmissions
   6.  Short Authentication String
   7.  IANA Considerations
   8.  Security Considerations
   9.  Acknowledgments
   10. Appendix - ZRTP, SIP, and SDP
   11. References
     11.1.  Normative References
     11.2.  Informative References
   Authors' Addresses
   Intellectual Property and Copyright Statements

1.  Introduction

   ZRTP is a key agreement protocol which performs Diffie-Hellman key
   exchange during call setup in-band in the Real-time Transport
   Protocol (RTP) [1] media stream which has been established using some
   other signaling protocol such as Session Initiation Protocol (SIP)
   [11]. This generates a shared secret which is then used to generate
   keys and salt for a Secure RTP (SRTP) [2] session. ZRTP borrows
   ideas from PGPfone [7]. A reference implementation of ZRTP is
   available as Zfone [8].

   The ZRTP protocol has some nice cryptographic features lacking in
   many other approaches to media session encryption. Although it uses
   a public key algorithm, it does not rely on a public key
   infrastructure (PKI). In fact, it does not use persistent public
   keys at all. It uses ephemeral Diffie-Hellman (DH) with hash
   commitment, and allows the detection of Man in the Middle (MitM)
   attacks by displaying a short authentication string for the users to
   read and compare over the phone. It has perfect forward secrecy,
   meaning the keys are destroyed at the end of the call, which
   precludes retroactively compromising the call by future disclosures
   of key material.
   But even if the users are too lazy to bother with
   short authentication strings, we still get fairly decent
   authentication against a MitM attack, based on a form of key
   continuity. It does this by caching some key material to use in the
   next call, to be mixed in with the next call's DH shared secret,
   giving it key continuity properties analogous to SSH. All this is
   done without reliance on a PKI, key certification, trust models,
   certificate authorities, or key management complexity that bedevils
   the email encryption world. It also does not rely on SIP signaling
   for the key management, and in fact does not rely on any servers at
   all. It performs its key agreements and key management in a purely
   peer-to-peer manner over the RTP packet stream.

   Most secure phones rely on a Diffie-Hellman exchange to agree on a
   common session key. But since DH is susceptible to a
   man-in-the-middle (MitM) attack, it is common practice to provide a
   way to authenticate the DH exchange. In some military systems, this
   is done by depending on digital signatures backed by a
   centrally-managed PKI. A decade of industry experience has shown
   that deploying centrally managed PKIs can be a painful and often
   futile experience. PKIs are just too messy, and require too much
   activation energy to get them started. Setting up a PKI requires
   somebody to run it, which is not practical for an equipment
   provider. A service provider like a carrier might venture down this
   path, but even then you have to deal with cross-carrier
   authentication, certificate revocation lists, and other
   complexities. It is much simpler to avoid PKIs altogether,
   especially when developing secure commercial products.
   It is
   therefore more common for commercial secure phones to augment the DH
   exchange with a Short Authentication String (SAS) combined with a
   hash commitment at the start of the key exchange, to shorten the
   length of SAS material that must be read aloud. No PKI is required
   for this approach to authenticating the DH exchange. The AT&T 3600,
   Eric Blossom's COMSEC secure phones [9], PGPfone [7], and CryptoPhone
   [10] are all examples of products that took this simpler lightweight
   approach.

   The main problem with this approach is inattentive users who may not
   execute the voice authentication procedure, or unattended secure
   phone calls to answering machines that cannot execute it.
   Additionally, some people worry about voice spoofing (the "Rich
   Little" attack), and some worry about trying to use it between people
   who don't know each other's voices. This is not as much of a problem
   as it seems, because it isn't necessary that they recognize each
   other by their voice; it's only necessary that they detect that the
   voice used for the SAS procedure matches the voice in the rest of the
   phone call. These concerns are not enough reason to embrace PKIs as
   an alternative, in my opinion.

   A popular and field-proven approach is used by SSH (Secure Shell)
   [12], which Peter Gutmann likes to call the "baby duck" security
   model. SSH establishes a relationship by exchanging public keys in
   the initial session, when we assume no attacker is present, and this
   makes it possible to authenticate all subsequent sessions. A
   successful MitM attacker has to have been present in all sessions all
   the way back to the first one, which is assumed to be difficult for
   the attacker. All this is accomplished without resorting to a
   centrally-managed PKI.

   We use an analogous baby duck security model to authenticate the DH
   exchange in ZRTP.
   We don't need to exchange persistent public keys;
   we can simply cache a shared secret and re-use it to authenticate a
   long series of DH exchanges for secure phone calls over a long period
   of time. If we read aloud just one SAS, and then cache a shared
   secret for later calls to use for authentication, no new voice
   authentication rituals need to be executed. We just have to remember
   we did one already.

   If we ever lose this cached shared secret, it is no longer available
   for authentication of DH exchanges, so we would have to do a new SAS
   procedure and start over with a new cached shared secret. Then we
   could go back to omitting the voice authentication on later calls.

   A particularly compelling reason why this approach is attractive is
   that SAS is easiest to implement when a GUI or some sort of display
   is available, which raises the question of what to do when no display
   is available. We envision some products that implement secure VoIP
   via a local network proxy, which lacks a display in many cases. If
   we take an approach that greatly reduces the need for a SAS in each
   and every call, we can operate in GUI-less products with greater
   ease.

   It's a good idea to force your opponent to have to solve multiple
   problems in order to mount a successful attack. Some examples of
   widely differing problems we might like to present him with are:
   stealing a shared secret from one of the parties, being present on
   the very first session and every subsequent session to carry out an
   active MitM attack, and solving the discrete log problem. We want to
   force the opponent to solve more than one of these problems to
   succeed.

   The protocol can make use of different kinds of shared secrets. Each
   type of shared secret is determined by a different method. All of
   the shared secrets are hashed together to form a session key to
   encrypt the call.
   An attacker must defeat all of the methods in
   order to determine the session key.

   First, there is the shared secret determined entirely by a
   Diffie-Hellman key agreement. It changes with every call, based on
   random numbers. An attacker may attempt a classic DH MitM attack on
   this secret, but we can protect against this by displaying and
   reading aloud a SAS, combined with adding a hash commitment at the
   beginning of the DH exchange.

   Second, there is an evolving shared secret, or ongoing shared secret
   that is automatically changed and refreshed and cached with every new
   session. We will call this the cached shared secret, or sometimes
   the retained shared secret. Each new image of this ongoing secret is
   a non-invertible function of its previous value and the new secret
   derived by the new DH agreement. It's possible that no cached shared
   secret is available, because there were no previous sessions to
   inherit this value from, or because one side loses its cache.

   There are other approaches for key agreement for SRTP that compute a
   shared secret using information in the signaling. For example, [14]
   describes how to carry a MIKEY (Multimedia Internet KEYing) [15]
   payload in SDP [16]. Or [13] describes directly carrying SRTP keying
   and configuration information in SDP. ZRTP does not rely on the
   signaling to compute a shared secret, but if a client does produce a
   shared secret via the signaling, and makes it available to the ZRTP
   protocol, ZRTP can make use of this shared secret to augment the list
   of shared secrets that will be hashed together to form a session key.
   This way, any security weaknesses that might compromise the shared
   secret contributed by the signaling will not harm the final resulting
   session key.

   There may also be a static shared secret that the two parties agree
   on out-of-band in advance. A hashed passphrase would suffice.
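
   As a rough illustration of the idea of hashing several kinds of
   shared secret together (not part of the protocol definition), the
   Python sketch below mixes a DH-derived secret with whichever optional
   secrets happen to exist. The function name mix_session_key, the
   choice of SHA-256, and the plain byte concatenation are assumptions
   made for this example only; the normative computation is the one in
   Section 3.2.3.

```python
import hashlib
from typing import Optional


def mix_session_key(dh_secret: bytes,
                    cached_secret: Optional[bytes] = None,
                    signaling_secret: Optional[bytes] = None) -> bytes:
    """Hash the DH secret together with whichever other secrets exist.

    Hypothetical helper: SHA-256 and simple concatenation are
    illustrative assumptions, not the draft's wire-level definition.
    """
    h = hashlib.sha256()
    h.update(dh_secret)
    for secret in (cached_secret, signaling_secret):
        if secret is not None:  # a secret that is not available is left out
            h.update(secret)
    return h.digest()
```

   In this sketch, a party that holds the cached secret and a party that
   does not will derive different keys from the same DH secret, which is
   the property the key continuity argument relies on.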

   The shared secret provided by the signaling (if available), the
   shared secret computed by DH, and the cached shared secret are all
   hashed together to compute the session key for a call. If the cached
   shared secret is not available, it is omitted from the hash
   computation. If the signaling provides no shared secret, it is also
   omitted from the hash computation.

   No DH MitM attack can succeed if the ongoing shared secret is
   available to the two parties, but not to the attacker. This is
   because the attacker cannot compute a common session key with either
   party without knowing the cached secret component, even if he
   correctly executes a classic DH MitM attack. Mixing in the cached
   shared secret for the session key calculation allows it to act as an
   implicit authenticator to protect the DH exchange, without requiring
   additional explicit HMACs to be computed on the DH parameters. If
   the cached shared secret is available, a MitM attack would be
   instantly detected by the failure to achieve a shared session key,
   resulting in undecryptable packets. The protocol can easily detect
   this. It would be more accurate to say that the MitM attack is not
   merely detected, but thwarted.

   When adding the complexity of additional shared secrets beyond the
   familiar DH key agreement, we must make sure the lack of availability
   of the cached shared secret cannot prevent a call from going through,
   and we must also prevent false alarms that claim an attack was
   detected.

   An added benefit of using these cached shared secrets to mix in with
   the session keys is that it augments the entropy of the session key.
   Even if limits on the size of the DH exchange produce a session key
   with less than 256 bits of real work factor, the added entropy from
   the cached shared secret can bring up all the subsequent session keys
   to the full 256-bit AES key strength, assuming no attacker was
   present in the first call.

   We could have authenticated the DH exchange the same way SSH does it,
   with digital signatures, caching public keys instead of shared
   secrets. But this approach with caching shared secrets seemed a bit
   simpler, and has the added benefit of adding more entropy to the
   session keys.

   The following sections provide an overview of the ZRTP protocol and
   describe the key agreement algorithm and RTP header extensions.

2.  Terminology

   In this document, the key words "MUST", "MUST NOT", "REQUIRED",
   "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
   and "OPTIONAL" are to be interpreted as described in RFC 2119 and
   indicate requirement levels for compliant implementations.

3.  Protocol Description

3.1.  Overview

   This section provides a description of how ZRTP works. This
   description is non-normative in nature but is included to build
   understanding of the protocol.

   ZRTP is negotiated the same way a conventional RTP session is
   negotiated. Using SIP, the AVP/RTP profile is used in SDP. The ZRTP
   protocol begins after two endpoints have utilized a signaling
   protocol such as SIP and are ready to send or have already begun
   sending RTP packets. This specification defines a new RTP extension
   header which is used to carry the ZRTP messages between the
   endpoints.
   Since RTP endpoints ignore unknown extension headers, the
   protocol is fully backwards compatible - a ZRTP endpoint attempting
   to perform key agreement with a non-ZRTP endpoint will simply receive
   normal RTP responses and can then inform the user that a secure
   session is not possible and either continue with the insecure session
   or terminate the session depending on the user's security policy.

   The ZRTP exchange begins at the same time that the first RTP packets
   are exchanged between the endpoints. ZRTP messages can be embedded
   in RTP messages containing actual media samples, or they may be sent
   in separate RTP messages. For example, if the RTP payload or codec
   supports silence or no-op messages, then these can be used for RTP
   transport. If none of these are supported, an RTP packet containing
   comfort noise can be generated to carry a ZRTP message.

   A ZRTP endpoint initiates the exchange by sending a ZRTP Hello
   message to the other endpoint. The purpose of the Hello message is
   to discover if the other endpoint supports the protocol and to see
   what algorithms the two ZRTP endpoints have in common.

   The Hello message contains the SRTP configuration options, and the
   ZID. Each instance of ZRTP has a unique 96-bit random ZRTP ID or ZID
   that is generated once at installation time. It is used to look up
   retained shared secrets in a local cache. A single global ZID for a
   single installation is the simplest way to implement ZIDs, and may be
   required in applications where the encryption is being done by a
   "bump in the cord" proxy that does not know who is being called.
   However, it is specifically not precluded for an implementation to
   use multiple ZIDs, up to the limit of a separate one per callee.
   This then turns it into a long-lived "association ID" that does not
   apply to any other associations between a different pair of parties.
   It is a goal of this protocol to permit both options to interoperate
   freely.

   A response to a ZRTP Hello message is a ZRTP HelloACK message. The
   HelloACK message simply acknowledges receipt of the Hello message and
   indicates support for the ZRTP protocol. Since RTP uses best effort
   UDP transport, ZRTP has retransmission timers in case of lost
   datagrams. There are two timers, both with exponential backoff
   mechanisms. One timer is used for retransmissions of Hello messages
   and the other is used for retransmissions of all other messages after
   receipt of a HelloACK which indicates support of ZRTP by the other
   endpoint.

   After both endpoints exchange Hello and HelloACK messages, the key
   agreement exchange can begin with the ZRTP Commit message. An
   example call flow is shown in Figure 1 below. Note that the order of
   the Hello/HelloACK exchanges in F1/F2 and F3/F4 may be reversed.
   Also, an endpoint that receives a Hello message and wishes to
   immediately begin the ZRTP key agreement can omit the HelloACK and
   send the Commit instead. In Figure 1, this would result in messages
   F2, F3, and F4 being omitted. Note that the endpoint which sends the
   Commit message is considered the initiator of the ZRTP session and
   drives the key agreement exchange.
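
   The ZID handling described in this overview can be pictured with a
   small Python sketch: a 96-bit random ZID created once, and a local
   cache of retained secrets keyed by the peer's ZID. The names
   generate_zid and RetainedSecretCache, and the in-memory dictionary,
   are hypothetical choices for illustration; a real implementation
   would need persistent storage and is free to organize it differently.

```python
import secrets
from typing import Dict, Optional, Tuple

ZID_LENGTH = 12  # 96 bits, as stated in the overview

# Pair of retained secrets held for one peer (newest first); either may be absent.
Retained = Tuple[Optional[bytes], Optional[bytes]]


def generate_zid() -> bytes:
    """Create a random 96-bit ZID; intended to be called once at installation."""
    return secrets.token_bytes(ZID_LENGTH)


class RetainedSecretCache:
    """Hypothetical in-memory cache mapping a peer's ZID to its retained secrets."""

    def __init__(self) -> None:
        self._entries: Dict[bytes, Retained] = {}

    def lookup(self, peer_zid: bytes) -> Retained:
        # An unknown peer simply has no retained secrets yet.
        return self._entries.get(peer_zid, (None, None))

    def store(self, peer_zid: bytes, newest: bytes, previous: Optional[bytes]) -> None:
        self._entries[peer_zid] = (newest, previous)
```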

   Alice                                     Bob
    |                                         |
    | Alice and Bob establish a media session.|
    |                                         |
    |                   RTP                   |
    |<=======================================>|
    |                                         |
    | Hello (ver,cid,hash,cipher,pkt,sas,Alice's ZID) F1
    |---------------------------------------->|
    |              HelloACK F2                |
    |<----------------------------------------|
    | Hello (ver,cid,hash,cipher,pkt,sas,Bob's ZID) F3
    |<----------------------------------------|
    |              HelloACK F4                |
    |---------------------------------------->|
    |                                         |
    |        Bob acts as the initiator        |
    |                                         |
    | Commit (Bob's ZID,hash,cipher,pkt,hvi) F5
    |<----------------------------------------|
    | DHPart1 (pvr,rs1IDr,rs2IDr,sigsIDr,srtpsIDr,other_secretIDr) F6
    |---------------------------------------->|
    | DHPart2 (pvi,rs1IDi,rs2IDi,sigsIDi,srtpsIDi,other_secretIDi) F7
    |<----------------------------------------|
    |                                         |
    | Alice and Bob generate SRTP session key.|
    |                                         |
    |              SRTP begins                |
    |<=======================================>|
    |                                         |
    |  Confirm1 (plaintext,sasflag,hmac) F8   |
    |---------------------------------------->|
    |  Confirm2 (plaintext,sasflag,hmac) F9   |
    |<----------------------------------------|
    |              Conf2ACK F10               |
    |---------------------------------------->|

          Figure 1.  Establishment of a SRTP session using ZRTP

3.2.  Key Agreement Algorithm

   The key agreement algorithm has four phases that are described
   normatively in the following sections.

3.2.1.  Discovery

   During the discovery phase, a ZRTP endpoint discovers if the other
   endpoint supports ZRTP and which ZRTP version, hash, cipher, public
   key type, and SAS algorithms are supported. In addition, each
   endpoint sends and discovers ZIDs. The received ZID is used to
   retrieve previous retained shared secrets, rs1 and rs2. If the
   endpoint has other secrets, then they are also collected.
   The
   signaling secret (sigs) is passed from the signaling protocol used
   to establish the RTP session. For SIP, it is the dialog identifier
   of a Secure SIP (SIPS) session: a string composed of Call-ID, to tag,
   and from tag. From the definitions in RFC 3261 [11]:

      sigs = hash(call-id | to-tag | from-tag)

   Note: the dialog identifier of a non-secure SIP session should not be
   considered a signaling secret as it has no confidentiality
   protection. For the SRTP secret (srtps), it is the SRTP master key
   and salt. This information may have been passed in the signaling
   using MIKEY or SDP Security Descriptions, for example:

      srtps = hash(SRTP master key | SRTP master salt)

   Additional shared secrets can be defined and used as other_secret.
   If no secret of a given type is available, a random value is
   generated and used for that secret to ensure a mismatch in the hash
   comparisons in the DHPart1 and DHPart2 messages. This prevents an
   eavesdropper from knowing how many shared secrets are available
   between the endpoints.

   A Hello message can be sent at any time, but is usually sent at the
   start of an RTP session to determine if the other endpoint supports
   ZRTP, and also if the SRTP implementations are compatible. A Hello
   message is retransmitted using timer T1 and an exponential backoff
   mechanism detailed in Section 5 until the receipt of a HelloACK
   message or a Commit message.

3.2.2.  Hash Commitment

   The hash commitment is performed by the initiator of the ZRTP
   exchange. From the intersection of the algorithms in the sent and
   received Hello messages, the initiator chooses a hash, cipher, public
   key type, and SAS algorithm to be used.

   The key agreement begins with the initiator choosing a fresh random
   Diffie-Hellman (DH) secret value (svi) based on the chosen public key
   type value, and computing the public value.
   (Note that to speed up
   processing, this computation can be done in advance.) For guidance
   on generating random numbers, see the section on Random Number
   Generation. The Diffie-Hellman secret value, svi, SHOULD be twice as
   long as the AES key length. This means, if AES 128 is used, the DH
   secret value SHOULD be 256 bits long. If AES 256 is used, the secret
   value SHOULD be 512 bits long. The public value is computed as:

      pvi = g^svi mod p

   where g and p are determined by the public key type value. The
   initiator then computes a hash, hvi, of the public value using the
   chosen hash algorithm. The hvi includes the set of hash, cipher,
   pkt, and sas types from the responder's Hello message in the
   following order:

      hvi = hash(pvi | hashr1-5 | cipherr1-5 | pktr1-5 | sasr1-5)

   The information from the responder's Hello message is included in the
   hash calculation to prevent a bid-down attack by modification of the
   responder's Hello message.

   Note: If both sides send Commit messages initiating a secure session
   at the same time, the Commit message with the lowest hvi value is
   discarded and the other side is the initiator. This breaks the tie,
   allowing the protocol to proceed from this point with a clear
   definition of who is the initiator and who is the responder.

3.2.3.  Diffie-Hellman Exchange

   The purpose of the Diffie-Hellman exchange is for the two ZRTP
   endpoints to generate a new shared secret, s0. In addition, the
   endpoints discover if they have any shared secrets in common. If
   they do, this exchange allows them to discover how many and agree on
   an ordering for them: s1, s2, etc.

3.2.3.1.  Responder Behavior

   Upon receipt of the Commit message, the responder generates its own
   fresh random DH secret value, svr, and computes the public value.
   (Note that to speed up processing, this computation can be done in
   advance.) For guidance on random number generation, see the section
   on Random Number Generation.
   The Diffie-Hellman secret value, svr,
   SHOULD be twice as long as the AES key length. This means, if AES
   128 is used, the DH secret value SHOULD be 256 bits long. If AES 256
   is used, the secret value SHOULD be 512 bits long.

      pvr = g^svr mod p

   The final shared secret, s0, is calculated by hashing the
   concatenation of the Diffie-Hellman shared secret (DHSS) followed by
   the (possibly empty) set of shared secrets that are actually shared
   between the initiator and responder. For computing the hash, the
   shared secrets are sorted by ascending order of the initiator's
   corresponding shared secret IDs. The remainder of this section
   describes an algorithm to accomplish this.

   First, an HMAC keyed hash is calculated using the first retained
   shared secret, rs1, as the key on the string "Responder" which
   generates a retained secret ID, rs1IDr, which is truncated to 64
   bits. HMACs are calculated in a similar way for additional shared
   secrets:

      rs1IDr = HMAC(rs1, "Responder")

      rs2IDr = HMAC(rs2, "Responder")

      sigsIDr = HMAC(sigs, "Responder")

      srtpsIDr = HMAC(srtps, "Responder")

      other_secretIDr = HMAC(other_secret, "Responder")

   A ZRTP DHPart1 message is generated containing pvr and the set of
   keyed hashes (HMACs) derived from the possibly shared secrets.

   Upon receipt of the DHPart2 message, the responder checks that the
   initiator's public DH value is not equal to 1 or p-1. An attacker
   might inject a false DHPart2 packet with a value of 1 or p-1 for
   g^svi mod p, which would cause a disastrously weak final DH result to
   be computed. If pvi is 1 or p-1, the user should be alerted to the
   attack and the protocol must be aborted. Otherwise, the responder
   then compares the hash of the public DH value in the DHPart2 with the
   hash from the Commit. If they are different (hash(pvi) != hvi), a
   MitM attack is taking place and the user is alerted.
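
   The two checks the responder applies to DHPart2 can be sketched in
   Python as follows. This is a hedged illustration, not the reference
   implementation: the function names, the use of SHA-256, the
   big-endian encoding of pvi, and the responder_hello_fields byte
   string (standing in for the hash, cipher, pkt, and sas lists from the
   responder's Hello, as in the hvi definition of Section 3.2.2) are all
   assumptions made for the example.

```python
import hashlib
import hmac


def int_to_bytes(n: int) -> bytes:
    """Big-endian encoding without leading zero bytes (an assumed encoding)."""
    return n.to_bytes(max(1, (n.bit_length() + 7) // 8), "big")


def commitment(pvi: int, responder_hello_fields: bytes) -> bytes:
    """hvi = hash(pvi | algorithm lists from the responder's Hello)."""
    return hashlib.sha256(int_to_bytes(pvi) + responder_hello_fields).digest()


def check_dhpart2(pvi: int, p: int, hvi: bytes,
                  responder_hello_fields: bytes) -> None:
    """Raise ValueError if the initiator's public value must be rejected."""
    # A public value of 1 or p-1 would force a trivially weak DH result.
    if pvi in (1, p - 1):
        raise ValueError("degenerate DH public value: possible attack")
    # The value revealed in DHPart2 must match the hash sent in the Commit.
    expected = commitment(pvi, responder_hello_fields)
    if not hmac.compare_digest(expected, hvi):
        raise ValueError("hash commitment mismatch: possible MitM attack")
```

   A caller would abort the exchange and alert the user when either
   check raises; the initiator applies the same range check to pvr from
   DHPart1.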

   The responder then calculates the Diffie-Hellman result:

      DHResult = pvi^svr mod p

   The responder then calculates the Diffie-Hellman shared secret:

      DHSS = hash(DHResult)

   The set of five shared secret IDs received from the DHPart2 message
   is stored as set A.

   The responder then calculates the set of secret IDs that are expected
   to be received from the initiator in the DHPart2 message:

      rs1IDi = HMAC(rs1, "Initiator")

      rs2IDi = HMAC(rs2, "Initiator")

      sigsIDi = HMAC(sigs, "Initiator")

      srtpsIDi = HMAC(srtps, "Initiator")

      other_secretIDi = HMAC(other_secret, "Initiator")

   The set (rs1IDi, rs2IDi, sigsIDi, srtpsIDi, other_secretIDi) is set
   B. Set C is the intersection of set A and set B. Set C is then sorted
   in ascending numerical order. Set C will contain between zero and
   five secret IDs. Set D is then created as the actual secrets
   corresponding to the secret IDs in set C in the same order. The set
   D is expanded to 5 values by adding in null secrets: s1, s2, s3, s4,
   and s5. The final shared secret, s0, is calculated by hashing the
   concatenation of the DHSS and the set of non-null shared secrets. As
   a result, the null secrets have no effect on the concatenation
   operation:

      s0 = hash(DHSS | s1 | s2 | s3 | s4 | s5)

3.2.3.2.  Initiator Behavior

   Upon receipt of the DHPart1 message, the initiator checks that the
   responder's public DH value is not equal to 1 or p-1. An attacker
   might inject a false DHPart1 packet with a value of 1 or p-1 for
   g^svr mod p, which would cause a disastrously weak final DH result to
   be computed. If pvr is 1 or p-1, the user should be alerted to the
   attack and the protocol must be aborted.

   If pvr is not 1 or p-1, the initiator looks up any retained shared
   secrets associated with the responder's ZID.
   The final shared secret, s0, is calculated by hashing the
   concatenation of the DHSS followed by the (possibly empty) set of
   shared secrets that are actually shared between the initiator and
   responder. For computing the hash, the shared secrets are sorted by
   ascending order of the initiator's corresponding shared secret IDs.
   The remainder of this section describes an algorithm to accomplish
   this.

   First, an HMAC keyed hash is calculated using the first retained
   shared secret, rs1, as the key on the string "Initiator". This
   generates a retained secret ID, rs1IDi, which is truncated to 64
   bits. HMACs are calculated in a similar way for additional shared
   secrets:

      rs1IDi = HMAC(rs1, "Initiator")

      rs2IDi = HMAC(rs2, "Initiator")

      sigsIDi = HMAC(sigs, "Initiator")

      srtpsIDi = HMAC(srtps, "Initiator")

      other_secretIDi = HMAC(other_secret, "Initiator")

   The initiator then sends a DHPart2 message containing the initiator's
   public DH value and the set of calculated retained secret IDs.

   The initiator calculates the same Diffie-Hellman result using:

      DHResult = pvr^svi mod p

   The initiator then calculates the DH shared secret using:

      DHSS = hash(DHResult)

   The set of five shared secret IDs received in the DHPart1 message is
   stored as set A.

   The initiator then calculates the set of secret IDs that are expected
   to be received from the responder in the DHPart1 message:

      rs1IDr = HMAC(rs1, "Responder")

      rs2IDr = HMAC(rs2, "Responder")

      sigsIDr = HMAC(sigs, "Responder")

      srtpsIDr = HMAC(srtps, "Responder")

      other_secretIDr = HMAC(other_secret, "Responder")

   The set (rs1IDr, rs2IDr, sigsIDr, srtpsIDr, other_secretIDr) is set
   B. Set C is the intersection of set A and set B. Set C will contain
   between zero and five secret IDs. Set D is then created as the
   actual secrets corresponding to the secret IDs in set C.
   Set E is the set of secret IDs, as sent in the DHPart2 message, that
   correspond to the secrets in set D. Set E is then sorted in
   ascending numerical order. Set D is then sorted to the same order as
   the corresponding secret IDs in set E.

   Set D is expanded to 5 values by adding in null secrets: s1, s2,
   s3, s4, and s5. The final shared secret, s0, is calculated by
   hashing the concatenation of the DHSS and the set of non-null shared
   secrets. As a result, the null secrets have no effect on the
   concatenation operation:

      s0 = hash(DHSS | s1 | s2 | s3 | s4 | s5)

3.2.4. Confirmation and Switch to SRTP

   The SRTP master key and master salt are then generated using the
   shared secret. Separate SRTP keys and salts are used in each
   direction for each media stream. Unless otherwise specified, ZRTP
   uses SRTP with no MKI, 32 bit authentication using HMAC-SHA1, AES-CM
   128 or 256 bit key length, 112 bit session salt key length, 2^48 key
   derivation rate, and SRTP prefix length 0.

   The ZRTP initiator encrypts and the ZRTP responder decrypts packets
   by using srtpkeyi and srtpsalti, which are generated by:

      srtpkeyi = HMAC(s0, "Initiator SRTP master key")

      srtpsalti = HMAC(s0, "Initiator SRTP master salt")

   The ZRTP responder encrypts and the ZRTP initiator decrypts packets
   by using srtpkeyr and srtpsaltr, which are generated by:

      srtpkeyr = HMAC(s0, "Responder SRTP master key")

      srtpsaltr = HMAC(s0, "Responder SRTP master salt")

   The HMAC key is generated by:

      hmackey = HMAC(s0, "HMAC key")

   Both sides now discard the rs2 value and store rs1 as rs2. A new rs1
   is calculated from s0:

      rs1 = HMAC(s0, "retained secret")

   The endpoints can now switch to SRTP and begin packet encryption.
   The ZRTP Initiator and Responder use their own keying material for
   the SRTP session. No MKI is used and a 32 bit authentication tag is
   used.
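   The key derivations above can be sketched in Python as follows.
   This is a minimal sketch under stated assumptions, not a normative
   implementation. HMAC-SHA256 is assumed because SHA256 is the only
   Hash Type this document defines. Shortening each HMAC output to the
   SRTP key and salt lengths by keeping the leftmost octets is also an
   assumption, since the text does not say how the output is truncated.
   The names kdf, derive_session_keys, and rotate_retained_secrets are
   hypothetical.

```python
import hashlib
import hmac

SALT_LEN = 14  # 112 bit session salt, in octets


def kdf(s0: bytes, label: str) -> bytes:
    """HMAC(s0, label), with s0 as the key and the ASCII label as the message."""
    return hmac.new(s0, label.encode("ascii"), hashlib.sha256).digest()


def derive_session_keys(s0: bytes, key_len: int = 16) -> dict:
    """Derive per-direction SRTP keys and salts plus the ZRTP HMAC key.

    key_len is 16 for AES 128 or 32 for AES 256. Keeping the leftmost
    octets of each HMAC output is an assumption of this sketch.
    """
    return {
        "srtpkeyi": kdf(s0, "Initiator SRTP master key")[:key_len],
        "srtpsalti": kdf(s0, "Initiator SRTP master salt")[:SALT_LEN],
        "srtpkeyr": kdf(s0, "Responder SRTP master key")[:key_len],
        "srtpsaltr": kdf(s0, "Responder SRTP master salt")[:SALT_LEN],
        "hmackey": kdf(s0, "HMAC key"),
    }


def rotate_retained_secrets(rs1: bytes, s0: bytes) -> tuple:
    """Return (new_rs1, new_rs2): the old rs1 becomes rs2, the old rs2 is dropped."""
    return kdf(s0, "retained secret"), rs1
```

   Using a distinct label for every output means that each key, salt,
   and retained secret is an independent function of s0, so disclosure
   of one derived value does not directly reveal another.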
   The ZRTP Confirm1 and Confirm2 messages are sent for two reasons.
   First, they confirm that all the key agreement calculations were
   successful and the encryption is working, and they enable us to
   automatically detect a DH MitM attack from a reckless attacker who
   does not know the retained shared secret. Second, they enable us to
   transmit the sasflag under cover of SRTP encryption, shielding it
   from a passive observer who would like to know if the human users are
   in the habit of diligently verifying the SAS.

   In the Confirm1 and Confirm2 messages, the sasflag Boolean is
   converted to an octet called sasflagoctet (resulting in either 0x00
   or 0x01). Confirm1 and Confirm2 messages contain an HMAC of some
   known plaintext and the sasflagoctet. The HMAC is explicitly
   included in the payload because we may not always be able to rely on
   the built-in authentication tag in SRTP, which might be configured to
   different sizes, including none.

      hmac = HMAC(hmackey, "known plaintext" | sasflagoctet)

   This information is not carried in the extension header but inserted
   at the start of the SRTP payload.

   The Conf2ACK message completes the exchange.

   The optional GoClear message is used to switch from SRTP back to RTP.
   To avoid relying on the optional SRTP authentication tag, the GoClear
   contains an HMAC of the string "GoClear" computed with the hmackey
   derived from the shared secret:

      clear_hmac = HMAC(hmackey, "GoClear")

   A GoClear message is answered with either a ClearACK message or an
   Error message. An Error message indicates that the ZRTP endpoint does
   not support the GoClear mechanism or that the GoClear has failed
   authentication (the clear_hmac does not validate).

3.3. Random Number Generation

   The ZRTP protocol uses random numbers for cryptographic key material,
   notably for the DH secret exponents, which must be freshly generated
   with each session.
   Whenever a random number is needed, all of the following criteria
   must be satisfied:

   It MUST be derived from a physical entropy source, such as RF noise,
   acoustic noise, thermal noise, high resolution timings of
   environmental events, or other unpredictable physical sources of
   entropy. Chapter 10 of [4] gives a detailed explanation of
   cryptographic grade random numbers and provides guidance for
   collecting suitable entropy. The raw entropy must be distilled and
   processed through a deterministic random bit generator (DRBG).
   Examples of DRBGs may be found in NIST SP 800-90 [5], and in [4].

   It MUST be freshly generated, meaning that it must not have been used
   in a previous calculation.

   It MUST be greater than or equal to two, and less than or equal to
   2^L - 1, where L is the number of random bits required.

   It MUST be chosen with equal probability from the entire available
   number space, e.g., [2, 2^L - 1].

4. RTP Header Extensions

   This specification defines a new RTP header extension used for all
   ZRTP messages. When used, the X bit is set in the RTP header to
   indicate the presence of the RTP header extension.

   Section 5.3.1 in RFC 3550 defines the format of an RTP Header
   extension. The Header extension is appended to the RTP header. The
   first 16 bits are an identifier for the header extension, and the
   following 16 bits are the length of the extension header in 32 bit
   words. All word lengths referenced in this specification follow RFC
   3550 and are 32 bits or 4 octets. All integer fields are carried in
   network byte order, that is, most significant byte (octet) first,
   commonly known as big-endian. Each ZRTP message is carried in a
   single RTP header extension whose identifier has the value 0x505A.

4.1.
ZRTP Message Formats

   ZRTP messages are designed to simplify endpoint parsing requirements
   and to reduce the opportunities for buffer overflow attacks (a good
   goal of any security extension should be to not introduce new attack
   vectors).

   ZRTP uses 8 octet blocks (2 words) to encode many ZRTP parameters.
   These fixed-length blocks are used for Message Type, Hash Type,
   Cipher Type, Public Key Type, and SAS Type. The values in the blocks
   are ASCII strings which are extended with spaces (0x20) to make them
   8 characters long. Currently defined block values are listed in
   Tables 1-5 below. Additional block values may be defined and used.

   ZRTP uses this ASCII encoding to simplify debugging and make it
   "ethereal friendly".

4.1.1. Message Type Block

   Currently eleven Message Type Blocks are defined. They represent the
   set of ZRTP message primitives. ZRTP endpoints MUST support the
   Hello, HelloACK, Commit, DHPart1, DHPart2, Confirm1, Confirm2,
   Conf2ACK, and Error block types. They MAY support GoClear and
   ClearACK.
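   The 8 octet block encoding described in Section 4.1 can be sketched
   as follows. This is an illustrative sketch only. The function names
   encode_block and decode_block are hypothetical, and rejecting names
   longer than 8 characters is a design choice of the sketch rather
   than a requirement stated in this document.

```python
BLOCK_LEN = 8  # every block is 2 words, i.e. 8 octets


def encode_block(name: str) -> bytes:
    """Encode a block value as ASCII, padded with spaces (0x20) to 8 octets."""
    raw = name.encode("ascii")
    if len(raw) > BLOCK_LEN:
        raise ValueError("block value longer than 8 octets: %r" % name)
    return raw.ljust(BLOCK_LEN, b" ")


def decode_block(block: bytes) -> str:
    """Decode an 8 octet block back to its value, dropping the space padding."""
    if len(block) != BLOCK_LEN:
        raise ValueError("a block must be exactly 8 octets")
    return block.decode("ascii").rstrip(" ")
```

   For example, encode_block("Hello") yields the five ASCII letters
   followed by three space octets, while "HelloACK" already fills the
   block and is carried unpadded.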
   Message Type Block | Meaning
   ---------------------------------------------------
   Hello              | Hello Message
                      | defined in Section 4.2
   ---------------------------------------------------
   HelloACK           | HelloACK Message
                      | defined in Section 4.3
   ---------------------------------------------------
   Commit             | Commit Message
                      | defined in Section 4.4
   ---------------------------------------------------
   DHPart1            | DHPart1 Message
                      | defined in Section 4.5
   ---------------------------------------------------
   DHPart2            | DHPart2 Message
                      | defined in Section 4.6
   ---------------------------------------------------
   Confirm1           | Confirm1 Message
                      | defined in Section 4.7
   ---------------------------------------------------
   Confirm2           | Confirm2 Message
                      | defined in Section 4.8
   ---------------------------------------------------
   Conf2ACK           | Conf2ACK Message
                      | defined in Section 4.9
   ---------------------------------------------------
   Error              | Error Message
                      | defined in Section 4.10
   ---------------------------------------------------
   GoClear            | GoClear Message
                      | defined in Section 4.11
   ---------------------------------------------------
   ClearACK           | ClearACK Message
                      | defined in Section 4.12
   ---------------------------------------------------

               Table 1. Message Block Type Values

4.1.2. Hash Type Block

   Only one Hash Type is currently defined, SHA256, and all ZRTP
   endpoints MUST support this hash. Additional Hash Types can be
   registered and used.

   Hash Type Block    | Meaning
   ---------------------------------------------------
   SHA256             | SHA-256 Hash defined in [SHA-256]
   ---------------------------------------------------

               Table 2. Hash Block Type Values

4.1.3. Cipher Type Block

   All ZRTP endpoints MUST support AES128 and MAY support AES256 or
   other Cipher Types. Also, if AES 128 is used, DH3072 should be used.
   If AES 256 is used, DH4096 should be used.

   Cipher Type Block  | Meaning
   ---------------------------------------------------
   AES128             | AES-CM with 128 bit keys
                      | as defined in RFC 3711
   ---------------------------------------------------
   AES256             | AES-CM with 256 bit keys
                      | as defined in RFC 3711
   ---------------------------------------------------

               Table 3. Cipher Block Type Values

4.1.4. Public Key Type Block

   All ZRTP endpoints MUST support DH3072 and MAY support DH4096. ZRTP
   endpoints MUST use the DH generator function g=2. The choice of AES
   key length is coupled to the choice of public key type. If AES 128
   is chosen, DH3072 SHOULD be used. If AES 256 is chosen, DH4096
   SHOULD be used.

   Public Key Type Block | Meaning
   ---------------------------------------------------
   DH3072                | DH with p=3072 bit prime
                         | as defined in RFC 3526
   ---------------------------------------------------
   DH4096                | DH with p=4096 bit prime
                         | as defined in RFC 3526
   ---------------------------------------------------

               Table 4. Public Key Block Type Values

4.1.5. SAS Type Block

   All ZRTP endpoints MAY support the libase32 Short Authentication
   String scheme or other SAS schemes. The optional ZRTP SAS is
   described in Section 6.

   SAS Type Block     | Meaning
   ---------------------------------------------------
   libase32           | Short Authentication String using
                      | libbase32 encoding defined in Section 6.
   ---------------------------------------------------

               Table 5. SAS Block Type Values

4.2. Hello message

   The Hello message has the format shown in Figure 2 below. The header
   extension payload contains the ZRTP version number and the list of
   algorithms supported by SRTP.

   The Hello ZRTP message begins with the ZRTP header extension field
   followed by the length of the header field, counted in 32 bit words.
   Next is a word containing the version (ver) of ZRTP. For this
   specification, the version is the string "0.01". Next is the Client
   Identifier string (cid) which is 15 octets long and identifies the
   vendor and release of the ZRTP software. The Passive bit (P) is a
   Boolean normally set to False. A ZRTP endpoint which is configured
   to never initiate secure sessions is regarded as passive, and would
   set the P bit to True. Next is a list of supported Hash Types,
   Cipher Types, Public Key Types, and SAS Types. Five possible
   algorithms are listed for each using the Blocks defined in Tables 2,
   3, 4, and 5. If fewer than five algorithms are supported, spaces
   (0x20) are used to pad out the 10 words for each type. The last
   parameter is the ZID, the 96 bit long unique identifier for the ZRTP
   endpoint.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=50 words        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              Message Type Block=Hello (2 words)               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       version (1 word)                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                 Client Identifier (15 octets)                 |
   |                                               +-+-+-+-+-+-+-+-+
   |                                               |0 0 0 0 0 0 0|P|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                Hash Type Blocks 1-5 (10 words)                |
   |                             . . .                             |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |               Cipher Type Blocks 1-5 (10 words)               |
   |                             . . .                             |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |             Public Key Type Blocks 1-5 (10 words)             |
   |                             . . .                             |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                SAS Type Blocks 1-5 (10 words)                 |
   |                             . . .
                                                                   |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                         ZID (3 words)                         |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 2. Extension header format for Hello message

4.3. HelloACK message

   The HelloACK message is used to stop retransmissions of a Hello
   message. A HelloACK is sent regardless of whether the version number
   or the algorithm list in the Hello is supported. The receipt of a
   HelloACK stops retransmission of the Hello message. The format is
   shown in Figure 3 below. Note that a Commit message can be sent in
   place of a HelloACK by an initiator.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=2 words         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Message Type Block=HelloACK (2 words)             |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Figure 3. Extension header format for HelloACK message

4.4. Commit message

   The Commit message is sent to initiate the key agreement process
   after receiving a Hello message. The Commit message contains the
   initiator's ZID, a list of selected algorithms (hash, cipher, pkt,
   sas), and hvi, a hash of the public DH value of the initiator and the
   algorithm list from the responder's Hello message. A Commit cannot
   be sent until a Hello message has been received.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=16 words        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              Message Type Block=Commit (2 words)              |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                         ZID (3 words)                         |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Hash Type Block (2 words)                   |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Cipher Type Block (2 words)                  |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Public Key Type Block (2 words)                |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   SAS Type Block (2 words)                    |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                         hvi (8 words)                         |
   |                             . . .                             |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 4. Extension header format for Commit message

4.5. DHPart1 message

   The DHPart1 message begins the DH exchange. The format is shown in
   Figure 5 below. The DHPart1 message is sent if a valid Commit
   message is received. The length of the pvr value depends on the
   Public Key Type chosen. If DH4096 is used, the pvr will be 128
   words (512 octets). If DH3072 is used, it is 96 words (384 octets).

   The next five parameters are HMACs of potential shared secrets used
   in generating the ZRTP secret. The first two, rs1IDr and rs2IDr, are
   the HMACs of the responder's two retained shared secrets, truncated
   to 64 bits. Next is sigsIDr, the HMAC of the responder's signaling
   secret, truncated to 64 bits. Next is srtpsIDr, the HMAC of the
   responder's SRTP secret, truncated to 64 bits.
   The last parameter is the HMAC of an additional shared secret. For
   example, if multiple SRTP secrets are available or some other secret
   is used, it can be used as the other_secret.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|   length=depends on PK Type   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Message Type Block=DHPart1 (2 words)              |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                pvr (length depends on PK Type)                |
   |                             . . .                             |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       rs1IDr (2 words)                        |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       rs2IDr (2 words)                        |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       sigsIDr (2 words)                       |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      srtpsIDr (2 words)                       |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   other_secretIDr (2 words)                   |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 5. Extension header format for DHPart1 message

4.6. DHPart2 message

   The DHPart2 message completes the DH exchange. A DHPart2 message is
   sent if a valid DHPart1 message is received. The length of the pvi
   value depends on the Public Key Type chosen. If DH4096 is used, the
   pvi will be 128 words (512 octets). If DH3072 is used, it is 96
   words (384 octets).

   The next five parameters are HMACs of potential shared secrets used
   in generating the ZRTP secret. The first two, rs1IDi and rs2IDi, are
   the HMACs of the initiator's two retained shared secrets, truncated
   to 64 bits.
   Next is sigsIDi, the HMAC of the initiator's signaling secret,
   truncated to 64 bits. Next is srtpsIDi, the HMAC of the initiator's
   SRTP secret, truncated to 64 bits. The last parameter is the HMAC of
   an additional shared secret. For example, if multiple SRTP secrets
   are available or some other secret is used, it can be included.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|   length=depends on PK Type   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Message Type Block=DHPart2 (2 words)              |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                pvi (length depends on PK Type)                |
   |                             . . .                             |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       rs1IDi (2 words)                        |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       rs2IDi (2 words)                        |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       sigsIDi (2 words)                       |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      srtpsIDi (2 words)                       |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   other_secretIDi (2 words)                   |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 6. Extension header format for DHPart2 message

4.7. Confirm1 message

   The Confirm1 message is sent in response to a valid DHPart2 message
   after the SRTP session key and parameters have been negotiated. As a
   result, it is always sent in an SRTP packet. The format is shown in
   Figure 7 below. The header extension itself has no parameters
   besides the Message Type Block. However, three parameters are
   carried in the SRTP payload.
   The plaintext parameter contains the known plaintext "known
   plaintext". The sasflag (S) is a Boolean bit. The hmac is a keyed
   hash (HMAC) over the known plaintext "known plaintext" and the
   sasflag Boolean converted to the octet 0x00 or 0x01.

   The parameters included in the SRTP payload MUST NOT be allowed to
   pass to the RTP stack or errors may occur with the media stream.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=2 words         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Message Type Block=Confirm1 (2 words)             |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   At the start of the SRTP payload:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                                                               |
   |                     plaintext (31 octets)                     |
   |                                               +-+-+-+-+-+-+-+-+
   |                                               |0 0 0 0 0 0 0|S|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                        hmac (8 words)                         |
   |                             . . .                             |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Figure 7. Extension header format for Confirm1 message

4.8. Confirm2 message

   The Confirm2 message is sent in response to a Confirm1 message after
   the SRTP session key and parameters have been negotiated. As a
   result, it is always sent in an SRTP packet. The format is shown in
   Figure 8 below. The header extension itself has no parameters
   besides the Message Type Block. However, three parameters are
   carried in the SRTP payload. The plaintext parameter contains the
   known plaintext "known plaintext". The sasflag (S) is a Boolean bit.
   The hmac is a keyed hash (HMAC) over the known plaintext "known
   plaintext" and the sasflag Boolean converted to the octet 0x00 or
   0x01.
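   The hmac parameter carried in Confirm1 and Confirm2 can be sketched
   as follows. This is a minimal sketch, not a normative
   implementation. HMAC-SHA256 is assumed: SHA256 is the only Hash Type
   defined in this document, and its 32 octet output matches the 8 word
   hmac field in the figures. The function names confirm_hmac and
   verify_confirm are hypothetical.

```python
import hashlib
import hmac

KNOWN_PLAINTEXT = b"known plaintext"


def confirm_hmac(hmackey: bytes, sasflag: bool) -> bytes:
    """hmac = HMAC(hmackey, "known plaintext" | sasflagoctet)."""
    sasflagoctet = b"\x01" if sasflag else b"\x00"
    return hmac.new(hmackey, KNOWN_PLAINTEXT + sasflagoctet, hashlib.sha256).digest()


def verify_confirm(hmackey: bytes, sasflag: bool, received_hmac: bytes) -> bool:
    """Check a received Confirm1 or Confirm2 hmac in constant time."""
    return hmac.compare_digest(confirm_hmac(hmackey, sasflag), received_hmac)
```

   Because the sasflagoctet is covered by the HMAC, a flag value that
   was altered in transit fails verification even if the SRTP
   authentication tag is disabled.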
   The parameters included in the SRTP payload MUST NOT be allowed to
   pass to the RTP stack or errors may occur with the media stream.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=2 words         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Message Type Block=Confirm2 (2 words)             |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   At the start of the SRTP payload:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                     plaintext (31 octets)                     |
   |                                               +-+-+-+-+-+-+-+-+
   |                                               |0 0 0 0 0 0 0|S|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                        hmac (8 words)                         |
   |                             . . .                             |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Figure 8. Extension header format for Confirm2 message

4.9. Conf2ACK message

   The Conf2ACK message is sent in response to a valid Confirm2 message.
   The format is shown in Figure 9 below. The receipt of a Conf2ACK
   stops retransmission of the Confirm2 message.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=2 words         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Message Type Block=Conf2ACK (2 words)             |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Figure 9. Extension header format for Conf2ACK message

4.10. Error message

   An Error message is sent in response to another ZRTP message which is
   not valid or not supported. The format is shown in Figure 10 below.
   Possible reasons include a missing block or parameter, a chosen
   parameter that is not in the offered list, a checksum failure, or a
   message type block that is not understood. The ZRTP message type
   that generated the error is included in the Message Type Block. This
   message can be sent in response to any ZRTP message except Hello and
   HelloACK and is never acknowledged or retransmitted.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=4 words         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              Message Type Block=Error (2 words)               |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Message Type Block (2 words)                  |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 10. Extension header format for Error message

4.11. GoClear message

   The optional GoClear message is sent to switch from SRTP back to RTP.
   The format is shown in Figure 11 below. The clear_hmac is used to
   authenticate the GoClear message so that bogus GoClear messages
   introduced by an attacker can be detected and discarded. This
   message is retransmitted at 500ms intervals until the receipt of a
   ClearACK message or an Error message.

   After sending a GoClear message, the ZRTP endpoint stops sending SRTP
   packets. When a ClearACK is received, the ZRTP endpoint deletes the
   crypto context for the SRTP session and may then resume sending RTP
   packets. However, if instead an Error message is received, the SRTP
   session resumes as if the GoClear had never been sent.
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=10 words        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Message Type Block=GoClear (2 words)              |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   |                     clear_hmac (8 words)                      |
   |                             . . .                             |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         Figure 11. Extension header format for GoClear message

4.12. ClearACK message

   The optional ClearACK message is sent to acknowledge receipt of a
   GoClear. A ClearACK is only sent if the clear_hmac from the GoClear
   message is authenticated. Otherwise, an Error message is returned.
   The format is shown in Figure 12 below. A ZRTP endpoint that
   receives a GoClear message stops sending SRTP packets, generates a
   ClearACK in response, and deletes the crypto context for the SRTP
   session. Until confirmation from the user is received (e.g. clicking
   a button, pressing a DTMF key, etc.), the ZRTP endpoint MUST NOT
   resume sending RTP packets. The endpoint informs the user that the
   media session has switched to clear mode and waits for confirmation
   from the user. To prevent pinholes from closing or NAT bindings from
   expiring, the ClearACK message should be resent every 5 seconds
   while waiting for confirmation from the user. After confirmation of
   the notification is received from the user, the sending of RTP
   packets may begin.

   Note that if the GoClear/ClearACK mechanism is not supported by a
   ZRTP endpoint, an Error message MUST be sent in response to a GoClear
   message.
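   The receiver's choice between ClearACK and Error can be sketched as
   follows. This is an illustrative sketch only. HMAC-SHA256 is
   assumed (its 32 octet output matches the 8 word clear_hmac field),
   and the function name respond_to_goclear and the supported parameter
   are hypothetical. The user confirmation step and the 5 second
   ClearACK resend are outside the scope of the sketch.

```python
import hashlib
import hmac


def respond_to_goclear(hmackey: bytes, clear_hmac: bytes, supported: bool = True) -> str:
    """Return the Message Type to send in reply to a received GoClear."""
    # An endpoint without GoClear support always answers with Error.
    if not supported:
        return "Error"

    # clear_hmac = HMAC(hmackey, "GoClear"), compared in constant time.
    expected = hmac.new(hmackey, b"GoClear", hashlib.sha256).digest()
    if hmac.compare_digest(expected, clear_hmac):
        return "ClearACK"
    return "Error"
```

   Only the "ClearACK" outcome should lead the endpoint to stop sending
   SRTP packets and delete the crypto context. On "Error" the SRTP
   session continues unchanged.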
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=2 words         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |             Message Type Block=ClearACK (2 words)             |
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        Figure 12. Extension header format for ClearACK message

5. Retransmissions

   ZRTP uses two retransmission timers T1 and T2. T1 is used for
   retransmission of Hello messages, when the support of ZRTP by the
   other endpoint may not be known. T2 is used in retransmissions of
   all the other ZRTP messages with the exception of GoClear. The
   retransmission of GoClear messages is discussed in the section on
   GoClear.

   Practical experience has shown that RTP packet loss at the start of
   an RTP session can be extremely high. Since the entire ZRTP message
   exchange occurs during this period, the retransmission scheme is
   defined to be aggressive. Since ZRTP packets with the exception of
   the DHPart1 and DHPart2 messages are small, this should have minimal
   effect on overall bandwidth utilization of the media session.

   Hello ZRTP requests are retransmitted at an interval that starts at
   T1 and doubles after every retransmission, capping at 200ms. A
   Hello message is retransmitted 20 times before giving up. T1 has a
   recommended value of 50 ms. Retransmission of a Hello ends upon
   receipt of a HelloACK or Commit message.

   Non-Hello ZRTP requests are retransmitted only by the initiator.
   That is, only Commit, DHPart2, and Confirm2 are retransmitted if the
   corresponding messages from the responder (DHPart1, Confirm1, and
   Conf2ACK, respectively) are not received.
   Non-Hello ZRTP messages are retransmitted at an interval that starts
   at T2 and doubles after every retransmission, capping at 600ms.
   Only the ZRTP initiator performs retransmissions. Each message is
   retransmitted 10 times before giving up and resuming a normal RTP
   session. T2 has a default value of 150ms. Each message has a
   response message that stops retransmissions, as shown in Table 6.
   The high value of T2 means that retransmissions will likely only
   occur with packet loss. The receipt of an Error message ends
   retransmission of the message identified in the Error message.

   Message      Acknowledgement Message
   -------      -----------------------
   Hello        HelloACK or Commit
   Commit       DHPart1
   DHPart2      Confirm1
   Confirm2     Conf2ACK
   GoClear      ClearACK

         Table 6. Retransmitted ZRTP Messages and Responses

6. Short Authentication String

   This section discusses the implementation of the optional Short
   Authentication String, or SAS, in ZRTP.

   The Short Authentication String (SAS) value is calculated as the hash
   of both DH public values and the string "Short Authentication
   String".

      sasvalue = hash(pvi | pvr | "Short Authentication String")

   The rendering of the SAS value depends on the SAS Type agreed upon in
   the Commit message. For the SAS Type of libase32, the last 20 bits
   of the sasvalue are rendered as a form of base32 encoding known as
   libbase32 [6]. The purpose of libbase32 is to represent arbitrary
   sequences of octets in a form that is as convenient as possible for
   human users to manipulate. As a result, the choice of characters is
   slightly different from base32 as defined in RFC 3548. The last 20
   bits of the sasvalue result in four libbase32 characters which are
   rendered to both ZRTP endpoints. Other SAS Types may be defined to
   render the SAS value in other ways.
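   The SAS calculation and rendering can be sketched as follows. This
   is a minimal sketch under stated assumptions, not a normative
   implementation. SHA-256 is assumed as the hash. The alphabet is
   taken to be that of z-base-32 and should be checked against
   reference [6], which is authoritative. Reading "the last 20 bits"
   as the 20 least significant bits of the hash, taken as a big-endian
   integer, is also an assumption. The names sas_value and render_sas
   are hypothetical.

```python
import hashlib

# Assumed libbase32 alphabet (z-base-32); verify against reference [6].
ALPHABET = "ybndrfg8ejkmcpqxot1uwisza345h769"


def sas_value(pvi: bytes, pvr: bytes) -> bytes:
    """sasvalue = hash(pvi | pvr | "Short Authentication String")."""
    return hashlib.sha256(pvi + pvr + b"Short Authentication String").digest()


def render_sas(sasvalue: bytes) -> str:
    """Render the last 20 bits of sasvalue as four base32 characters."""
    # The last three octets hold 24 bits; keep the low 20 of them.
    low20 = int.from_bytes(sasvalue[-3:], "big") & 0xFFFFF
    # Four 5 bit groups, most significant group first.
    return "".join(ALPHABET[(low20 >> shift) & 0x1F] for shift in (15, 10, 5, 0))
```

   Both endpoints hash the same pvi and pvr octets, so both render the
   same four characters unless a MitM has substituted a public value on
   one side of the exchange.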
   The sasflag is set based on the user indicating that the SAS
   procedure has been successfully performed.  The sasflag is exchanged
   securely in the Confirm1 and Confirm2 messages of the next session.
   In other words, each party sends the sasflag from the previous
   session in the Confirm message of the current session.  It is
   perfectly reasonable to have a ZRTP endpoint that never sets the
   sasflag, because setting it would require adding complexity to the
   user interface.  The sasflag is not required to be set, but if it is
   available to the client software, it allows the client software to
   indicate to the user that the SAS verify procedure was carried out
   in a previous session.

   Regardless of whether there is a user interface element to allow the
   user to set the sasflag, it is worth caching a shared secret, because
   doing so reduces opportunities for an attacker in the next call.

   If at any time the users carry out the SAS procedure and it actually
   fails to match, then there is a very resourceful man in the middle
   (MitM).  If this is the first call, the MitM was there on the first
   call, which is impressive enough.  If it happens in a later call, it
   means the MitM must also know your cached shared secret, because you
   could not have carried out any voice traffic at all unless the
   session key was correctly computed and is also known to the
   attacker.  This implies the MitM must have been present in all the
   previous sessions, since the initial establishment of the first
   shared secret.  This is indeed a resourceful attacker.
   It also means that if at any time he ceases his participation as a
   MitM on one of your calls, the protocol will detect that the cached
   shared secret is no longer valid, because it was really two
   different shared secrets all along, one of them between Alice and
   the attacker, and the other between the attacker and Bob.  The
   continuity of the cached shared secrets makes it possible for us to
   detect the MitM when he inserts himself into the ongoing
   relationship, as well as when he leaves.  Also, if the attacker
   tries to stay with a long lineage of calls, but fails to execute a
   DH MitM attack for even one missed call, he is permanently excluded.
   He can no longer resynchronize with the chain of cached shared
   secrets.

   Some sort of user interface element (maybe a checkbox) is needed to
   allow the user to tell the software the SAS verify was successful,
   causing the software to set the "SAS verified" flag, which (together
   with our cached shared secret) obviates the need to perform the SAS
   procedure in the next call.  An additional user interface element can
   be provided to let the user tell the software he detected an actual
   SAS mismatch, which indicates a MitM attack.  The software can then
   take appropriate action, clearing the "SAS verified" flag and
   erasing the cached shared secret from this session.  It is up to the
   implementer to decide if this added user interface complexity is
   warranted.

   If the SAS matches, it means there is no MitM, which also implies it
   is now safe to trust a cached shared secret for later calls.
   If inattentive users don't bother to check the SAS, it means we
   don't know whether there is or is not a MitM, so even if we do
   establish a new cached shared secret, there is a risk that our
   potential attacker may have a subsequent opportunity to continue
   inserting himself in the call, until we finally get around to
   checking the SAS.  If the SAS matches, it means no attacker was
   present for any previous session since we started propagating cached
   shared secrets, because this session and all the previous sessions
   were also authenticated with a continuous lineage of shared secrets.

7.  IANA Considerations

   If an IANA registry for RTP extension headers were defined, then the
   value 0x505A would be reserved for ZRTP.

8.  Security Considerations

   This document is all about securely keying SRTP sessions.  As such,
   security is discussed in every section.  The next version of this
   draft will have a summary of those security properties discussed
   throughout the document.

9.  Acknowledgments

   The authors would like to thank Bryce Wilcox for his contributions to
   the design of this protocol, and to thank Jon Callas, Jon Peterson,
   Colin Plumb, and Hal Finney for their helpful comments and
   suggestions.

10.  Appendix - ZRTP, SIP, and SDP

   This section discusses how ZRTP, SIP, and SDP work together.

   SIP UAs which support this specification would include the
   to-be-defined SDP attribute a=zrtp in their SDP offers and answers.
   The presence of this attribute is a hint to another UA that ZRTP is
   supported.  If a UA supports both ZRTP and another approach to
   negotiate an SRTP secret, such as [14] or [13], then the presence of
   the a=zrtp attribute is critical.  If both UAs support ZRTP, they
   will first try ZRTP before attempting SRTP.  If only one endpoint
   supports ZRTP but both support SRTP, then the other method will be
   used instead.

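   As a rough illustration of the hint described above, the sketch
   below scans an SDP body for the attribute and applies the stated
   preference order.  Since a=zrtp is still to be defined, the exact
   attribute syntax matched here (a bare "a=zrtp" line) is an
   assumption, and the function names, the return values, and the
   sample SDP are hypothetical.

```python
def offers_zrtp(sdp: str) -> bool:
    """True if the SDP body carries a bare a=zrtp attribute line (assumed syntax)."""
    return any(line.strip() == "a=zrtp" for line in sdp.splitlines())


def signaled_first_choice(local_zrtp: bool, remote_sdp: str,
                          other_method_shared: bool):
    """Return which keying approach the signaling suggests trying first.

    "zrtp"  - both sides indicate ZRTP, so ZRTP is tried first
    "other" - ZRTP is not indicated by both, but another method is shared
    None    - the signaling indicates no common keying method
    """
    if local_zrtp and offers_zrtp(remote_sdp):
        return "zrtp"
    if other_method_shared:
        return "other"
    return None


# Sample SDP answer, for illustration only.
answer = "\r\n".join([
    "v=0",
    "o=bob 2890844527 2890844527 IN IP4 host.example.com",
    "s=-",
    "c=IN IP4 host.example.com",
    "t=0 0",
    "m=audio 49172 RTP/AVP 0",
    "a=zrtp",
])
print(signaled_first_choice(True, answer, other_method_shared=True))  # zrtp
```

   A result other than "zrtp" reflects only what the signaling says; the
   attribute is a hint, and its absence does not prove that the media
   path lacks ZRTP support.
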
   Note that ZRTP may be implemented without coupling with the SIP
   signaling.  For example, ZRTP can be implemented as a "bump in the
   wire" or as a "bump in the stack" in which RTP sent by the SIP UA is
   converted to ZRTP.  In these cases, the SIP UA will have no knowledge
   of ZRTP and will not include the a=zrtp attribute.  As a result, even
   if the other UA does not indicate support for ZRTP, a ZRTP endpoint
   SHOULD still send Hello messages.

11.  References

11.1.  Normative References

   [1]   Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson,
         "RTP: A Transport Protocol for Real-Time Applications", STD 64,
         RFC 3550, July 2003.

   [2]   Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
         Norrman, "The Secure Real-time Transport Protocol (SRTP)",
         RFC 3711, March 2004.

   [3]   Kivinen, T. and M. Kojo, "More Modular Exponential (MODP)
         Diffie-Hellman groups for Internet Key Exchange (IKE)",
         RFC 3526, May 2003.

   [4]   Ferguson, N. and B. Schneier, "Practical Cryptography", Wiley
         Publishing 2003.

   [5]   Barker, E. and J. Kelsey, "Recommendation for Random Number
         Generation Using Deterministic Random Bit Generators", NIST
         Special Publication 800-90 DRAFT (December 2005).

   [6]   O'Whielacronx, Z., "human-oriented base-32 encoding",
         http://cvs.sourceforge.net/viewcvs.py/libbase32/libbase32/DESIGN?rev=HEAD .

11.2.  Informative References

   [7]   Zimmermann, P., "PGPfone",
         http://www.pgpi.org/products/pgpfone/ .

   [8]   Zimmermann, P., "Zfone", http://www.philzimmermann.com/zfone .

   [9]   Blossom, E., "The VP1 Protocol for Voice Privacy Devices
         Version 1.2", http://www.comsec.com/vp1-protocol.pdf .

   [10]  "CryptoPhone", http://www.cryptophone.de/ .

   [11]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
         Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP:
         Session Initiation Protocol", RFC 3261, June 2002.

   [12]  Ylonen, T. and C. Lonvick, "The Secure Shell (SSH) Protocol
         Architecture", RFC 4251, January 2006.

   [13]  Andreasen, F., "Session Description Protocol Security
         Descriptions for Media Streams",
         draft-ietf-mmusic-sdescriptions-12 (work in progress),
         September 2005.

   [14]  Arkko, J., "Key Management Extensions for Session Description
         Protocol (SDP) and Real Time Streaming Protocol (RTSP)",
         draft-ietf-mmusic-kmgmt-ext-15 (work in progress), June 2005.

   [15]  Arkko, J., Carrara, E., Lindholm, F., Naslund, M., and K.
         Norrman, "MIKEY: Multimedia Internet KEYing", RFC 3830,
         August 2004.

   [16]  Handley, M. and V. Jacobson, "SDP: Session Description
         Protocol", RFC 2327, April 1998.

Authors' Addresses

   Philip Zimmermann
   Phil Zimmermann and Associates LLC

   Email: prz@mit.edu


   Alan Johnston (editor)
   SIPStation
   St. Louis, MO  63124

   Email: alan@sipstation.com

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The Internet Society (2006).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.