idnits 2.17.1 

draft-zimmermann-avt-zrtp-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 16.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 1561.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1538.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1545.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1551.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 455: '... secret value, svi, SHOULD be twice as...'
     RFC 2119 keyword, line 457: '...   secret value SHOULD be 256 bits lon...'
     RFC 2119 keyword, line 458: '...   value SHOULD be 512 bits long....'
     RFC 2119 keyword, line 494: '...   SHOULD be twice as long as the AES ...'
     RFC 2119 keyword, line 495: '... DH secret value SHOULD be 256 bits lo...'
     (19 more instances...)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (February 26, 2006) is 6627 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: 'SHA-256' on line 834

  == Unused Reference: '3' is defined on line 1466, but no explicit reference
     was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. '4'

  -- Possible downref: Non-RFC (?) normative reference: ref. '5'

  -- Possible downref: Non-RFC (?) normative reference: ref. '6'

  -- Obsolete informational reference (is this intentional?): RFC 2327 (ref.
     '16') (Obsoleted by RFC 4566)


     Summary: 4 errors (**), 0 flaws (~~), 3 warnings (==), 12 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	AVT WG                                                     P. Zimmermann
3	Internet-Draft                        Phil Zimmermann and Associates LLC
4	Expires: August 30, 2006                                A. Johnston, Ed.
5	                                                              SIPStation
6	                                                       February 26, 2006

8	   ZRTP: Extensions to RTP for Diffie-Hellman Key Agreement for SRTP
9	                      draft-zimmermann-avt-zrtp-00

11	Status of this Memo

13	   By submitting this Internet-Draft, each author represents that any
14	   applicable patent or other IPR claims of which he or she is aware
15	   have been or will be disclosed, and any of which he or she becomes
16	   aware will be disclosed, in accordance with Section 6 of BCP 79.

18	   Internet-Drafts are working documents of the Internet Engineering
19	   Task Force (IETF), its areas, and its working groups.  Note that
20	   other groups may also distribute working documents as Internet-
21	   Drafts.

23	   Internet-Drafts are draft documents valid for a maximum of six months
24	   and may be updated, replaced, or obsoleted by other documents at any
25	   time.  It is inappropriate to use Internet-Drafts as reference
26	   material or to cite them other than as "work in progress."

28	   The list of current Internet-Drafts can be accessed at
29	   http://www.ietf.org/ietf/1id-abstracts.txt.

31	   The list of Internet-Draft Shadow Directories can be accessed at
32	   http://www.ietf.org/shadow.html.

34	   This Internet-Draft will expire on August 30, 2006.

36	Copyright Notice

38	   Copyright (C) The Internet Society (2006).

40	Abstract

42	   This document defines ZRTP, RTP (Real-time Transport Protocol) header
43	   extensions for a Diffie-Hellman exchange to agree on a session key
44	   and parameters for establishing Secure RTP (SRTP) sessions.  The ZRTP
45	   protocol is completely self-contained in RTP and does not require
46	   support in the signaling protocol or assume a Public Key
47	   Infrastructure (PKI).  For the media session, ZRTP
48	   provides confidentiality, protection against Man in the Middle (MitM)
49	   attacks, and, in cases where a secret is available from the signaling
50	   protocol, authentication.

52	Table of Contents

54	   1.  Introduction
55	   2.  Terminology
56	   3.  Protocol Description
57	     3.1.  Overview
58	     3.2.  Key Agreement Algorithm
59	       3.2.1.  Discovery
60	       3.2.2.  Hash Commitment
61	       3.2.3.  Diffie-Hellman Exchange
62	       3.2.4.  Confirmation and Switch to SRTP
63	     3.3.  Random Number Generation
64	   4.  RTP Header Extensions
65	     4.1.  ZRTP Message Formats
66	       4.1.1.  Message Type Block
67	       4.1.2.  Hash Type Block
68	       4.1.3.  Cipher Type Block
69	       4.1.4.  Public Key Type Block
70	       4.1.5.  SAS Type Block
71	     4.2.  Hello message
72	     4.3.  HelloACK message
73	     4.4.  Commit message
74	     4.5.  DHPart1 message
75	     4.6.  DHPart2 message
76	     4.7.  Confirm1 message
77	     4.8.  Confirm2 message
78	     4.9.  Conf2ACK message
79	     4.10. Error message
80	     4.11. GoClear message
81	     4.12. ClearACK message
82	   5.  Retransmissions
83	   6.  Short Authentication String
84	   7.  IANA Considerations
85	   8.  Security Considerations
86	   9.  Acknowledgments
87	   10. Appendix - ZRTP, SIP, and SDP
88	   11. References
89	     11.1. Normative References
90	     11.2. Informative References
91	   Authors' Addresses
92	   Intellectual Property and Copyright Statements

94	1.  Introduction

96	   ZRTP is a key agreement protocol which performs Diffie-Hellman key
97	   exchange during call setup in-band in the Real-time Transport
98	   Protocol (RTP) [1] media stream which has been established using some
99	   other signaling protocol such as Session Initiation Protocol (SIP)
100	   [11].  This generates a shared secret which is then used to generate
101	   keys and salt for a Secure RTP (SRTP) [2] session.  ZRTP borrows
102	   ideas from PGPfone [7].  A reference implementation of ZRTP is
103	   available as Zfone [8].

105	   The ZRTP protocol has some nice cryptographic features lacking in
106	   many other approaches to media session encryption.  Although it uses
107	   a public key algorithm, it does not rely on a public key
108	   infrastructure (PKI).  In fact, it does not use persistent public
109	   keys at all.  It uses ephemeral Diffie-Hellman (DH) with hash
110	   commitment, and allows the detection of Man in the Middle (MitM)
111	   attacks by displaying a short authentication string for the users to
112	   read and compare over the phone.  It has perfect forward secrecy,
113	   meaning the keys are destroyed at the end of the call, which
114	   precludes retroactively compromising the call by future disclosures
115	   of key material.  But even if the users are too lazy to bother with
116	   short authentication strings, we still get fairly decent
117	   authentication against a MitM attack, based on a form of key
118	   continuity.  It does this by caching some key material to use in the
119	   next call, to be mixed in with the next call's DH shared secret,
120	   giving it key continuity properties analogous to SSH.  All this is
121	   done without reliance on a PKI, key certification, trust models,
122	   certificate authorities, or key management complexity that bedevils
123	   the email encryption world.  It also does not rely on SIP signaling
124	   for the key management, and in fact does not rely on any servers at
125	   all.  It performs its key agreements and key management in a purely
126	   peer-to-peer manner over the RTP packet stream.

128	   Most secure phones rely on a Diffie-Hellman exchange to agree on a
129	   common session key.  But since DH is susceptible to a man-in-the-
130	   middle (MitM) attack, it is common practice to provide a way to
131	   authenticate the DH exchange.  In some military systems, this is done
132	   by depending on digital signatures backed by a centrally-managed PKI.
133	   A decade of industry experience has shown that deploying centrally
134	   managed PKIs can be a painful and often futile experience.  PKIs are
135	   just too messy, and require too much activation energy to get them
136	   started.  Setting up a PKI requires somebody to run it, which is not
137	   practical for an equipment provider.  A service provider like a
138	   carrier might venture down this path, but even then you have to deal
139	   with cross-carrier authentication, certificate revocation lists, and
140	   other complexities.  It is much simpler to avoid PKIs altogether,
141	   especially when developing secure commercial products.  It is
142	   therefore more common for commercial secure phones to augment the DH
143	   exchange with a Short Authentication String (SAS) combined with a
144	   hash commitment at the start of the key exchange, to shorten the
145	   length of SAS material that must be read aloud.  No PKI is required
146	   for this approach to authenticating the DH exchange.  The AT&T 3600,
147	   Eric Blossom's COMSEC secure phones [9], PGPfone [7], and CryptoPhone
148	   [10] are all examples of products that took this simpler lightweight
149	   approach.

151	   The main problem with this approach is inattentive users who may not
152	   execute the voice authentication procedure, or unattended secure
153	   phone calls to answering machines that cannot execute it.
154	   Additionally, some people worry about voice spoofing (the "Rich
155	   Little" attack), and some worry about trying to use it between people
156	   who don't know each other's voices.  This is not as much of a problem
157	   as it seems, because it isn't necessary that they recognize each
158	   other by their voice, it's only necessary that they detect that the
159	   voice used for the SAS procedure matches the voice in the rest of the
160	   phone call.  These concerns are not enough reason to embrace PKIs as
161	   an alternative, in my opinion.

163	   A popular and field-proven approach is used by SSH (Secure Shell)
164	   [12], which Peter Gutmann likes to call the "baby duck" security
165	   model.  SSH establishes a relationship by exchanging public keys in
166	   the initial session, when we assume no attacker is present, and this
167	   makes it possible to authenticate all subsequent sessions.  A
168	   successful MitM attacker has to have been present in all sessions all
169	   the way back to the first one, which is assumed to be difficult for
170	   the attacker.  All this is accomplished without resorting to a
171	   centrally-managed PKI.

173	   We use an analogous baby duck security model to authenticate the DH
174	   exchange in ZRTP.  We don't need to exchange persistent public keys,
175	   we can simply cache a shared secret and re-use it to authenticate a
176	   long series of DH exchanges for secure phone calls over a long period
177	   of time.  If we read aloud just one SAS, and then cache a shared
178	   secret for later calls to use for authentication, no new voice
179	   authentication rituals need to be executed.  We just have to remember
180	   we did one already.

182	   If we ever lose this cached shared secret, it is no longer available
183	   for authentication of DH exchanges, so we would have to do a new SAS
184	   procedure and start over with a new cached shared secret.  Then we
185	   could go back to omitting the voice authentication on later calls.

187	   A particularly compelling reason why this approach is attractive is
188	   that SAS is easiest to implement when a GUI or some sort of display
189	   is available, which raises the question of what to do when no display
190	   is available.  We envision some products that implement secure VoIP
191	   via a local network proxy, which lacks a display in many cases.  If
192	   we take an approach that greatly reduces the need for a SAS in each
193	   and every call, we can operate in GUI-less products with greater
194	   ease.

196	   It's a good idea to force your opponent to have to solve multiple
197	   problems in order to mount a successful attack.  Some examples of
198	   widely differing problems we might like to present him with are:
199	   Stealing a shared secret from one of the parties, being present on
200	   the very first session and every subsequent session to carry out an
201	   active MitM attack, and solving the discrete log problem.  We want to
202	   force the opponent to solve more than one of these problems to
203	   succeed.

205	   The protocol can make use of different kinds of shared secrets.  Each
206	   type of shared secret is determined by a different method.  All of
207	   the shared secrets are hashed together to form a session key to
208	   encrypt the call.  An attacker must defeat all of the methods in
209	   order to determine the session key.

211	   First, there is the shared secret determined entirely by a Diffie-
212	   Hellman key agreement.  It changes with every call, based on random
213	   numbers.  An attacker may attempt a classic DH MitM attack on this
214	   secret, but we can protect against this by displaying and reading
215	   aloud a SAS, combined with adding a hash commitment at the beginning
216	   of the DH exchange.

218	   Second, there is an evolving shared secret, or ongoing shared secret
219	   that is automatically changed and refreshed and cached with every new
220	   session.  We will call this the cached shared secret, or sometimes
221	   the retained shared secret.  Each new image of this ongoing secret is
222	   a non-invertible function of its previous value and the new secret
223	   derived by the new DH agreement.  It's possible that no cached shared
224	   secret is available, because there were no previous sessions to
225	   inherit this value from, or because one side loses its cache.

227	   There are other approaches for key agreement for SRTP that compute a
228	   shared secret using information in the signaling.  For example, [14]
229	   describes how to carry a MIKEY (Multimedia Internet KEYing) [15]
230	   payload in SDP [16].  Or [13] describes directly carrying SRTP keying
231	   and configuration information in SDP.  ZRTP does not rely on the
232	   signaling to compute a shared secret, but if a client does produce a
233	   shared secret via the signaling, and makes it available to the ZRTP
234	   protocol, ZRTP can make use of this shared secret to augment the list
235	   of shared secrets that will be hashed together to form a session key.
236	   This way, any security weaknesses that might compromise the shared
237	   secret contributed by the signaling will not harm the final resulting
238	   session key.

240	   There may also be a static shared secret that the two parties agree
241	   on out-of-band in advance.  A hashed passphrase would suffice.

243	   The shared secret provided by the signaling (if available), the
244	   shared secret computed by DH, and the cached shared secret are all
245	   hashed together to compute the session key for a call.  If the cached
246	   shared secret is not available, it is omitted from the hash
247	   computation.  If the signaling provides no shared secret, it is also
248	   omitted from the hash computation.
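
   As a rough illustration of the combination rule just described, the
   sketch below hashes together whichever secrets are available and
   simply skips the missing ones.  The function name, the use of
   SHA-256, and the order in which the secrets are fed to the hash are
   assumptions made for the example; the normative computation is given
   in the protocol description.

```python
import hashlib
from typing import Optional


def session_key(dh_secret: bytes,
                signaling_secret: Optional[bytes] = None,
                cached_secret: Optional[bytes] = None) -> bytes:
    """Hash together the shared secrets that are actually available.

    Hypothetical helper: SHA-256 and the input order are assumptions.
    """
    h = hashlib.sha256()
    h.update(dh_secret)  # the DH secret is always present
    for secret in (signaling_secret, cached_secret):
        if secret is not None:  # unavailable secrets are omitted
            h.update(secret)
    return h.digest()


# First call between two parties: only the DH secret exists.
first_call_key = session_key(b"dh-shared-secret")
# Later call: a cached secret from the earlier call is mixed in.
later_call_key = session_key(b"dh-shared-secret",
                             cached_secret=b"retained-secret")
```

   An attacker who knows only one of the inputs cannot reproduce the
   digest, which is the property the surrounding text relies on.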

250	   No DH MitM attack can succeed if the ongoing shared secret is
251	   available to the two parties, but not to the attacker.  This is
252	   because the attacker cannot compute a common session key with either
253	   party without knowing the cached secret component, even if he
254	   correctly executes a classic DH MitM attack.  Mixing in the cached
255	   shared secret for the session key calculation allows it to act as an
256	   implicit authenticator to protect the DH exchange, without requiring
257	   additional explicit HMACs to be computed on the DH parameters.  If
258	   the cached shared secret is available, a MitM attack would be
259	   instantly detected by the failure to achieve a shared session key,
260	   resulting in undecryptable packets.  The protocol can easily detect
261	   this.  It would be more accurate to say that the MitM attack is not
262	   merely detected, but thwarted.

264	   When adding the complexity of additional shared secrets beyond the
265	   familiar DH key agreement, we must make sure the lack of availability
266	   of the cached shared secret cannot prevent a call from going through,
267	   and we must also prevent false alarms that claim an attack was
268	   detected.

270	   An added benefit of using these cached shared secrets to mix in with
271	   the session keys is that it augments the entropy of the session key.
272	   Even if limits on the size of the DH exchange produce a session key
273	   with less than 256 bits of real work factor, the added entropy from
274	   the cached shared secret can bring up all the subsequent session keys
275	   to the full 256-bit AES key strength, assuming no attacker was
276	   present in the first call.

278	   We could have authenticated the DH exchange the same way SSH does it,
279	   with digital signatures, caching public keys instead of shared
280	   secrets.  But this approach with caching shared secrets seemed a bit
281	   simpler, and has the added benefit of adding more entropy to the
282	   session keys.

284	   The following sections provide an overview of the ZRTP protocol and
285	   describe the key agreement algorithm and RTP header extensions.

287	2.  Terminology

289	   In this document, the key words "MUST", "MUST NOT", "REQUIRED",
290	   "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
291	   and "OPTIONAL" are to be interpreted as described in RFC 2119 and
292	   indicate requirement levels for compliant implementations.

294	3.  Protocol Description

296	3.1.  Overview

298	   This section provides a description of how ZRTP works.  This
299	   description is non-normative in nature but is included to build
300	   understanding of the protocol.

302	   ZRTP is negotiated the same way a conventional RTP session is
303	   negotiated.  Using SIP, the AVP/RTP profile is used in SDP.  The ZRTP
304	   protocol begins after two endpoints have utilized a signaling
305	   protocol such as SIP and are ready to send or have already begun
306	   sending RTP packets.  This specification defines a new RTP extension
307	   header which is used to carry the ZRTP messages between the
308	   endpoints.  Since RTP endpoints ignore unknown extension headers, the
309	   protocol is fully backwards compatible - a ZRTP endpoint attempting
310	   to perform key agreement with a non-ZRTP endpoint will simply receive
311	   normal RTP responses and can then inform the user that a secure
312	   session is not possible and either continue with the insecure session
313	   or terminate the session depending on the user's security policy.

315	   The ZRTP exchange begins at the same time that the first RTP packets
316	   are exchanged between the endpoints.  ZRTP messages can be embedded
317	   in RTP messages containing actual media samples, or they may be sent
318	   in separate RTP messages.  For example, if the RTP payload or codec
319	   supports silence or no-op messages, then these can be used for RTP
320	   transport.  If none of these are supported, an RTP packet containing
321	   comfort noise can be generated to carry a ZRTP message.

323	   A ZRTP endpoint initiates the exchange by sending a ZRTP Hello
324	   message to the other endpoint.  The purpose of the Hello message is
325	   to discover if the other endpoint supports the protocol and to see
326	   what algorithms the two ZRTP endpoints have in common.

328	   The Hello message contains the SRTP configuration options, and the
329	   ZID.  Each instance of ZRTP has a unique 96-bit random ZRTP ID or ZID
330	   that is generated once at installation time.  It is used to look up
331	   retained shared secrets in a local cache.  A single global ZID for a
332	   single installation is the simplest way to implement ZIDs, and may be
333	   required in applications where the encryption is being done by a
334	   "bump in the cord" proxy that does not know who is being called.
335	   However, it is specifically not precluded for an implementation to
336	   use multiple ZIDs, up to the limit of a separate one per callee.
337	   This then turns it into a long-lived "association ID" that does not
338	   apply to any other associations between a different pair of parties.
339	   It is a goal of this protocol to permit both options to interoperate
340	   freely.
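
   A minimal sketch of the simplest option above, a single global ZID
   per installation used as the lookup key for retained secrets, might
   look as follows.  The class and function names are hypothetical, and
   the in-memory dictionary stands in for whatever persistent storage
   an implementation would actually use.

```python
import secrets
from typing import Dict, Optional, Tuple

ZID_BYTES = 12  # 96 bits


def new_zid() -> bytes:
    """Generate a random 96-bit ZID, intended to be done once at install time."""
    return secrets.token_bytes(ZID_BYTES)


class RetainedSecretCache:
    """Hypothetical cache of retained shared secrets, keyed by the peer's ZID."""

    def __init__(self) -> None:
        self._by_zid: Dict[bytes, Tuple[Optional[bytes], Optional[bytes]]] = {}

    def store(self, peer_zid: bytes, rs1: bytes,
              rs2: Optional[bytes] = None) -> None:
        self._by_zid[peer_zid] = (rs1, rs2)

    def lookup(self, peer_zid: bytes) -> Tuple[Optional[bytes], Optional[bytes]]:
        # An unknown ZID simply means no retained secrets are available.
        return self._by_zid.get(peer_zid, (None, None))


my_zid = new_zid()
cache = RetainedSecretCache()
```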

342	   A response to a ZRTP Hello message is a ZRTP HelloACK message.  The
343	   HelloACK message simply acknowledges receipt of the Hello message and
344	   indicates support for the ZRTP protocol.  Since RTP uses best effort
345	   UDP transport, ZRTP has retransmission timers in case of lost
346	   datagrams.  There are two timers, both with exponential backoff
347	   mechanisms.  One timer is used for retransmissions of Hello messages
348	   and the other is used for retransmissions of all other messages after
349	   receipt of a HelloACK which indicates support of ZRTP by the other
350	   endpoint.

352	   After both endpoints exchange Hello and HelloACK messages, the key
353	   agreement exchange can begin with the ZRTP Commit message.  An
354	   example call flow is shown in Figure 1 below.  Note that the order of
355	   the Hello/HelloACK exchanges in F1/F2 and F3/F4 may be reversed.
356	   Also, an endpoint that receives a Hello message and wishes to
357	   immediately begin the ZRTP key agreement can omit the HelloACK and
358	   send the Commit instead.  In Figure 1, this would result in messages
359	   F2, F3, and F4 being omitted.  Note that the endpoint which sends the
360	   Commit message is considered the initiator of the ZRTP session and
361	   drives the key agreement exchange.

363	   Alice                                      Bob
364	     |                                         |
365	     | Alice and Bob establish a media session.|
366	     |                                         |
367	     |                   RTP                   |
368	     |<=======================================>|
369	     |                                         |
370	     | Hello (ver,cid,hash,cipher,pkt,sas,Alice's ZID) F1
371	     |---------------------------------------->|
372	     |                             HelloACK F2 |
373	     |<----------------------------------------|
374	     | Hello (ver,cid,hash,cipher,pkt,sas,Bob's ZID) F3
375	     |<----------------------------------------|
376	     | HelloACK F4                             |
377	     |---------------------------------------->|
378	     |                                         |
379	     |        Bob acts as the initiator        |
380	     |                                         |
381	     |   Commit (Bob's ZID,hash,cipher,pkt,hvi) F5
382	     |<----------------------------------------|
383	     | DHPart1 (pvr,rs1IDr,rs2IDr,sigsIDr,srtpsIDr,other_secretIDr) F6
384	     |---------------------------------------->|
385	     | DHPart2 (pvi,rs1IDi,rs2IDi,sigsIDi,srtpsIDi,other_secretIDi) F7
386	     |<----------------------------------------|
387	     |                                         |
388	     | Alice and Bob generate SRTP session key.|
389	     |                                         |
390	     |               SRTP begins               |
391	     |<=======================================>|
392	     |                                         |
393	     | Confirm1 (plaintext,sasflag,hmac) F8    |
394	     |---------------------------------------->|
395	     |    Confirm2 (plaintext,sasflag,hmac) F9 |
396	     |<----------------------------------------|
397	     | Conf2ACK F10                            |
398	     |---------------------------------------->|
399	   Figure 1. Establishment of an SRTP session using ZRTP

401	3.2.  Key Agreement Algorithm

403	   The key agreement algorithm has four phases that are described
404	   normatively in the following sections.

406	3.2.1.  Discovery

408	   During the discovery phase, a ZRTP endpoint discovers if the other
409	   endpoint supports ZRTP and which ZRTP version, hash, cipher, public
410	   key type, and sas algorithms are supported.  In addition, each
411	   endpoint sends and discovers ZIDs.  The received ZID is used to
412	   retrieve previous retained shared secrets, rs1 and rs2.  If the
413	   endpoint has other secrets, then they are also collected.  The
414	   signaling secret (sigs) is passed from the signaling protocol used
415	   to establish the RTP session.  For SIP, it is the dialog identifier
416	   of a Secure SIP (SIPS) session: a string composed of Call-ID, to tag,
417	   and from tag.  From the definitions in RFC 3261 [11]:

419	   sigs = hash(call-id | to-tag | from-tag)

421	   Note: the dialog identifier of a non-secure SIP session should not be
422	   considered a signaling secret as it has no confidentiality
423	   protection.  The SRTP secret (srtps) is derived from the SRTP master
424	   key and salt.  This information may have been passed in the signaling
425	   using MIKEY or SDP Security Descriptions, for example:

427	   srtps = hash(SRTP master key | SRTP master salt)

429	   Additional shared secrets can be defined and used as other_secret.
430	   If no secret of a given type is available, a random value is
431	   generated and used for that secret to ensure a mismatch in the hash
432	   comparisons in the DHPart1 and DHPart2 messages.  This prevents an
433	   eavesdropper from knowing how many shared secrets are available
434	   between the endpoints.
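
   The two derivations above, together with the random substitution for
   missing secrets, can be sketched as follows.  This is only an
   illustration: SHA-256 stands in for the negotiated hash, "|" is read
   as plain byte concatenation, and the helper names and the 32-byte
   length of the random filler are assumptions.

```python
import hashlib
import secrets
from typing import Optional


def derive_sigs(call_id: str, to_tag: str, from_tag: str) -> bytes:
    """sigs = hash(call-id | to-tag | from-tag), for a SIPS dialog only."""
    data = (call_id + to_tag + from_tag).encode("utf-8")
    return hashlib.sha256(data).digest()


def derive_srtps(master_key: bytes, master_salt: bytes) -> bytes:
    """srtps = hash(SRTP master key | SRTP master salt)."""
    return hashlib.sha256(master_key + master_salt).digest()


def secret_or_random(secret: Optional[bytes]) -> bytes:
    """Return the secret, or a fresh random value if none is available.

    The random filler guarantees a mismatch in the later ID comparison,
    so an eavesdropper cannot tell which secrets really exist.
    """
    return secret if secret is not None else secrets.token_bytes(32)


sigs = secret_or_random(derive_sigs("a84b4c76e66710", "314159", "1928301774"))
srtps = secret_or_random(None)  # no SRTP secret came from the signaling
```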

436	   A Hello message can be sent at any time, but is usually sent at the
437	   start of an RTP session to determine if the other endpoint supports
438	   ZRTP, and also if the SRTP implementations are compatible.  A Hello
439	   message is retransmitted using timer T1 and an exponential backoff
440	   mechanism detailed in Section 5 until the receipt of a HelloACK
441	   message or a Commit message.
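
   The retransmission behavior can be pictured with the sketch below.
   The actual T1 values are specified in Section 5 and are not repeated
   here; the initial delay, the cap, the doubling factor, and the retry
   count in the example are placeholders chosen only to show the shape
   of an exponential backoff, and the function names are hypothetical.

```python
import time
from typing import Callable, Iterable, Iterator


def backoff_delays_ms(initial_ms: int, cap_ms: int, tries: int) -> Iterator[int]:
    """Yield a doubling delay schedule, limited to cap_ms."""
    delay = initial_ms
    for _ in range(tries):
        yield delay
        delay = min(delay * 2, cap_ms)


def send_hello_until_answered(send_hello: Callable[[], None],
                              answered: Callable[[], bool],
                              delays_ms: Iterable[int],
                              sleep: Callable[[float], None] = time.sleep) -> bool:
    """Resend Hello until a HelloACK or Commit arrives or the schedule ends."""
    for delay in delays_ms:
        send_hello()
        sleep(delay / 1000.0)
        if answered():
            return True
    return False  # peer probably does not support ZRTP
```

   Passing the sleep function as a parameter keeps the loop easy to
   exercise without real delays.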

443	3.2.2.  Hash Commitment

445	   The hash commitment is performed by the initiator of the ZRTP
446	   exchange.  From the intersection of the algorithms in the sent and
447	   received Hello messages, the initiator chooses a hash, cipher, public
448	   key type, and sas algorithm to be used.

450	   The key agreement begins with the initiator choosing a fresh random
451	   Diffie-Hellman (DH) secret value (svi) based on the chosen public key
452	   type value, and computing the public value.  (Note that to speed up
453	   processing, this computation can be done in advance.)  For guidance
454	   on generating random numbers, see the section on Random Number
455	   Generation.  The Diffie-Hellman secret value, svi, SHOULD be twice as
456	   long as the AES key length.  This means, if AES 128 is used, the DH
457	   secret value SHOULD be 256 bits long.  If AES 256 is used, the secret
458	   value SHOULD be 512 bits long.

460	   pvi = g^svi mod p

462	   where g and p are determined by the public key type value.  The
463	   initiator then computes a hash, hvi, of the public value using the
464	   chosen hash algorithm.  The hvi includes the set of hash, cipher,
465	   pkt, and sas types from the responder's Hello message, in this order:

467	   hvi=hash(pvi | hashr1-5 | cipherr1-5 | pktr1-5 | sasr1-5)

469	   The information from the responder's Hello message is included in the
470	   hash calculation to prevent a bid-down attack by modification of the
471	   responder's Hello message.
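
   The initiator's side of the hash commitment might be sketched as
   below.  The group parameters are toy values that are far too small
   for real use (the real g and p come from the public key type), and
   the byte encodings of pvi and of the algorithm lists, the use of
   SHA-256, and all names are assumptions made for the example.

```python
import hashlib
import secrets
from typing import Sequence, Tuple

# Toy group for illustration only; NOT a safe choice of parameters.
P = 2**127 - 1
G = 3


def dh_secret_bits(aes_key_bits: int) -> int:
    """The DH secret value SHOULD be twice as long as the AES key."""
    return 2 * aes_key_bits


def new_dh_pair(aes_key_bits: int, g: int = G, p: int = P) -> Tuple[int, int]:
    """Pick a fresh random secret value and compute pv = g^sv mod p."""
    sv = secrets.randbits(dh_secret_bits(aes_key_bits))
    return sv, pow(g, sv, p)


def compute_hvi(pvi: int, hashes: Sequence[str], ciphers: Sequence[str],
                pkts: Sequence[str], sass: Sequence[str], p: int = P) -> bytes:
    """hvi = hash(pvi | hashr1-5 | cipherr1-5 | pktr1-5 | sasr1-5)."""
    h = hashlib.sha256()
    h.update(pvi.to_bytes((p.bit_length() + 7) // 8, "big"))
    # The lists are the ones received in the responder's Hello message.
    for group in (hashes, ciphers, pkts, sass):
        for name in group:
            h.update(name.encode("ascii"))
    return h.digest()


svi, pvi = new_dh_pair(aes_key_bits=128)
# "H1", "C1", "C2", "K1", "S1" are placeholder algorithm labels.
hvi = compute_hvi(pvi, ["H1"], ["C1", "C2"], ["K1"], ["S1"])
```

   Because the responder's advertised lists are inputs to hvi, a Hello
   message altered in transit yields a different hash on the two sides,
   which is how the bid-down attempt becomes visible.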

473	   Note: If both sides send Commit messages initiating a secure session
474	   at the same time, the Commit message with the lowest hvi value is
475	   discarded and the other side is the initiator.  This breaks the tie,
476	   allowing the protocol to proceed from this point with a clear
477	   definition of who is the initiator and who is the responder.
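
   The tie-break can be expressed in a couple of lines.  The sketch
   assumes that hvi values are compared as unsigned big-endian
   integers, which the text above does not spell out, and the function
   name is hypothetical.  The case of two identical hvi values is not
   addressed by the text; here neither side keeps the role.

```python
def remains_initiator(my_hvi: bytes, peer_hvi: bytes) -> bool:
    """Return True if our Commit survives a simultaneous-Commit collision.

    The Commit with the lowest hvi is discarded, so we stay the
    initiator only if our hvi is the higher of the two.
    """
    return int.from_bytes(my_hvi, "big") > int.from_bytes(peer_hvi, "big")
```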

479	3.2.3.  Diffie-Hellman Exchange

481	   The purpose of the Diffie-Hellman exchange is for the two ZRTP
482	   endpoints to generate a new shared secret, s0.  In addition, the
483	   endpoints discover if they have any shared secrets in common.  If
484	   they do, this exchange allows them to discover how many and agree on
485	   an ordering for them: s1, s2, etc.

487	3.2.3.1.  Responder Behavior

489	   Upon receipt of the Commit message, the responder generates its own
490	   fresh random DH secret value, svr, and computes the public value.
491	   (Note that to speed up processing, this computation can be done in
492	   advance.)  For guidance on random number generation, see the section
493	   on Random Number Generation.  The Diffie-Hellman secret value, svr,
494	   SHOULD be twice as long as the AES key length.  This means, if AES
495	   128 is used, the DH secret value SHOULD be 256 bits long.  If AES 256
496	   is used, the secret value SHOULD be 512 bits long.

498	   pvr = g^svr mod p

500	   The final shared secret, s0, is calculated by hashing the
501	   concatenation of the Diffie-Hellman shared secret (DHSS) followed by
502	   the (possibly empty) set of shared secrets that are actually shared
503	   between the initiator and responder.  For computing the hash, the
504	   shared secrets are sorted by ascending order of the initiator's
505	   corresponding shared secret IDs.  The remainder of this section
506	   describes an algorithm to accomplish this.

508	   First, an HMAC keyed hash is calculated using the first retained
509	   shared secret, rs1, as the key on the string "Responder" which
510	   generates a retained secret ID, rs1IDr, which is truncated to 64
511	   bits.  HMACs are calculated in a similar way for additional shared
512	   secrets:

514	   rs1IDr = HMAC(rs1, "Responder")

516	   rs2IDr = HMAC(rs2, "Responder")

518	   sigsIDr = HMAC(sigs, "Responder")

520	   srtpsIDr = HMAC(srtps, "Responder")

522	   other_secretIDr = HMAC(other_secret, "Responder")
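
   A minimal sketch of the secret ID calculation follows.  It assumes
   HMAC-SHA256 (SHA256 being the only hash type currently defined) and
   assumes that truncation keeps the leftmost 64 bits; the names
   secret_id and responder_ids are hypothetical.

```python
import hashlib
import hmac

def secret_id(secret: bytes, role: str) -> bytes:
    """HMAC the role string under the secret and keep the first 64 bits."""
    digest = hmac.new(secret, role.encode("ascii"), hashlib.sha256).digest()
    return digest[:8]

def responder_ids(secrets_by_name: dict[str, bytes]) -> dict[str, bytes]:
    """Map e.g. {"rs1": ...} to {"rs1IDr": ...} for the DHPart1 message."""
    return {name + "IDr": secret_id(value, "Responder")
            for name, value in secrets_by_name.items()}
```

   Using a different role string on each side means an observer cannot
   match an ID in DHPart1 against an ID in DHPart2 by simple equality.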

524	   A ZRTP DHPart1 message is generated containing pvr and the set of
525	   keyed hashes (HMACs) derived from the possibly shared secrets.

527	   Upon receipt of the DHPart2 message, the responder checks that the
528	   initiator's public DH value is not equal to 1 or p-1.  An attacker
529	   might inject a false DHPart2 packet with a value of 1 or p-1 for
530	   g^svi mod p, which would cause a disastrously weak final DH result to
531	   be computed.  If pvi is 1 or p-1, the user should be alerted of the
532	   attack and the protocol must be aborted.  Otherwise, the responder
533	   computes the hash of the public DH value in the DHPart2 and compares
534	   it with the hash from the Commit.  If they differ (hash(pvi) != hvi),
535	   a MitM attack is taking place and the user is alerted.
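
   The two checks on a received DHPart2 might look like the sketch
   below.  The function names are hypothetical.  The sketch assumes the
   responder recomputes hvi exactly as it was defined for the Commit
   (public value followed by the responder's own Hello algorithm
   blocks, passed in as hello_fields) and that SHA-256 is the chosen
   hash.

```python
import hashlib
import hmac

def check_peer_public_value(pv: int, p: int) -> None:
    """Reject the degenerate public values 1 and p-1."""
    if pv == 1 or pv == p - 1:
        raise ValueError("degenerate DH public value; abort and alert user")

def commit_hash_matches(pvi_octets: bytes, hello_fields: bytes,
                        hvi: bytes) -> bool:
    """Recompute hvi and compare it with the value from the Commit."""
    expected = hashlib.sha256(pvi_octets + hello_fields).digest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(expected, hvi)
```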

537	   The responder then calculates the Diffie-Hellman result:

539	   DHResult = pvi^svr mod p

541	   The responder then calculates the Diffie-Hellman shared secret:

543	   DHSS = hash(DHResult)
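
   The following sketch shows that both endpoints arrive at the same
   DHSS.  It uses a toy prime rather than a ZRTP group, and it assumes
   that DHResult is hashed as a fixed-length big-endian octet string;
   neither choice is taken from this specification.

```python
import hashlib
import secrets

P = 2**127 - 1                      # toy prime, NOT a ZRTP group
G = 2
OCTETS = (P.bit_length() + 7) // 8

def dhss(peer_public: int, own_secret: int) -> bytes:
    """DHSS = hash(peer_public ^ own_secret mod p)."""
    dh_result = pow(peer_public, own_secret, P)
    return hashlib.sha256(dh_result.to_bytes(OCTETS, "big")).digest()

# Each side draws a secret in [2, 2^256 - 1] and publishes g^secret mod p.
svi = 2 + secrets.randbelow(2**256 - 2)
svr = 2 + secrets.randbelow(2**256 - 2)
pvi = pow(G, svi, P)
pvr = pow(G, svr, P)

responder_view = dhss(pvi, svr)     # pvi^svr mod p
initiator_view = dhss(pvr, svi)     # pvr^svi mod p
```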

545	   The set of five shared secret IDs received from the DHPart2 message
546	   is stored as set A.

548	   The responder then calculates the set of secret IDs that are expected
549	   to be received from the initiator in the DHPart2 message:

551	   rs1IDi = HMAC(rs1, "Initiator")

553	   rs2IDi = HMAC(rs2, "Initiator")
554	   sigsIDi = HMAC(sigs, "Initiator")

556	   srtpsIDi = HMAC(srtps, "Initiator")

558	   other_secretIDi = HMAC(other_secret, "Initiator")

560	   The set (rs1IDi, rs2IDi, sigsIDi, srtpsIDi, other_secretIDi) is set
561	   B. Set C is the intersection of set A and set B. Set C is then sorted
562	   in ascending numerical order.  Set C will contain between zero and
563	   five secret IDs.  Set D is then created as the actual secrets
564	   corresponding to the secret IDs in set C in the same order.  The set
565	   D is expanded to 5 values by adding in null secrets: s1, s2, s3, s4,
566	   and s5.  The final shared secret, s0, is calculated by hashing the
567	   concatenation of the DHSS and the set of non-null shared secrets.  As
568	   a result, the null secrets have no effect on the concatenation
569	   operation:

571	   s0 = hash(DHSS | s1 | s2 | s3 | s4 | s5)
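
   The set construction on the responder side might be sketched as
   follows.  HMAC-SHA256, leftmost 64-bit truncation, and the helper
   names are assumptions of this sketch.  Null secrets are modelled by
   simply omitting them, since they do not affect the concatenation.

```python
import hashlib
import hmac

def secret_id(secret: bytes, role: str) -> bytes:
    digest = hmac.new(secret, role.encode("ascii"), hashlib.sha256).digest()
    return digest[:8]               # assumed: keep the leftmost 64 bits

def responder_s0(dhss: bytes, received_ids: list[bytes],
                 own_secrets: list[bytes]) -> bytes:
    # Set B, kept as a map from expected initiator ID to the secret itself.
    expected = {secret_id(s, "Initiator"): s for s in own_secrets}
    # Set C: IDs present in both set A (received) and set B, ascending.
    # Equal-length octet strings sort like big-endian unsigned integers.
    set_c = sorted(expected.keys() & set(received_ids))
    # Set D: the actual secrets, in the same order as set C.
    set_d = [expected[i] for i in set_c]
    return hashlib.sha256(dhss + b"".join(set_d)).digest()
```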

573	3.2.3.2.  Initiator Behavior

575	   Upon receipt of the DHPart1 message, the initiator checks that the
576	   responder's public DH value is not equal to 1 or p-1.  An attacker
577	   might inject a false DHPart1 packet with a value of 1 or p-1 for
578	   g^svr mod p, which would cause a disastrously weak final DH result to
579	   be computed.  If pvr is 1 or p-1, the user should be alerted of the
580	   attack and the protocol must be aborted.

582	   If pvr is not 1 or p-1, the initiator looks up any retained shared
583	   secrets associated with the responder's ZID.  The final shared
584	   secret, s0, is calculated by hashing the concatenation of the DHSS
585	   followed by the (possibly empty) set of shared secrets that are
586	   actually shared between the initiator and responder.  For computing
587	   the hash, the shared secrets are sorted by ascending order of the
588	   initiator's corresponding shared secret IDs.  The remainder of this
589	   section describes an algorithm to accomplish this.

591	   First, an HMAC keyed hash is calculated using the first retained
592	   shared secret, rs1, as the key on the string "Initiator" which
593	   generates a retained secret ID, rs1IDi, which is truncated to 64
594	   bits.  HMACs are calculated in a similar way for additional shared
595	   secrets:

597	   rs1IDi = HMAC(rs1, "Initiator")

599	   rs2IDi = HMAC(rs2, "Initiator")

601	   sigsIDi = HMAC(sigs, "Initiator")
602	   srtpsIDi = HMAC(srtps, "Initiator")

604	   other_secretIDi = HMAC(other_secret, "Initiator")

606	   The initiator then sends a DHPart2 message containing the initiator's
607	   public DH value and the set of calculated retained secret IDs.

609	   The initiator calculates the same Diffie-Hellman result using:

611	   DHResult = pvr^svi mod p

613	   The initiator then calculates the DH shared secret using:

615	   DHSS = hash(DHResult)

617	   The set of five shared secret IDs received in the DHPart1 message is
618	   stored as set A.

620	   The initiator then calculates the set of secret IDs that are expected
621	   to be received from the responder in the DHPart1 message:

623	   rs1IDr = HMAC(rs1, "Responder")

625	   rs2IDr = HMAC(rs2, "Responder")

627	   sigsIDr = HMAC(sigs, "Responder")

629	   srtpsIDr = HMAC(srtps, "Responder")

631	   other_secretIDr = HMAC(other_secret, "Responder")

633	   The set (rs1IDr, rs2IDr, sigsIDr, srtpsIDr, other_secretIDr) is B.
634	   Set C is the intersection of set A and set B. Set C will contain
635	   between zero and five secret IDs.  Set D is then created as the
636	   actual secrets corresponding to the secret IDs in set C. Set E is the
637	   set of secret IDs that corresponds to the secrets in set D sent in
638	   the DHPart2 message.  Set E is then sorted in ascending numerical
639	   order.  Set D is then sorted to the same order as the corresponding
640	   secrets in set E.

642	   The set D is expanded to 5 values by adding in null secrets: s1, s2,
643	   s3, s4, and s5.  The final shared secret, s0, is calculated by
644	   hashing the concatenation of the DHSS and the set of non-null shared
645	   secrets.  As a result, the null secrets have no effect on the
646	   concatenation operation:

648	   s0 = hash(DHSS | s1 | s2 | s3 | s4 | s5)
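
   On the initiator side the matching is done on responder IDs, but the
   ordering is done on the initiator's own IDs (set E).  A sketch under
   the same assumptions as before (HMAC-SHA256, leftmost 64-bit
   truncation, hypothetical helper names):

```python
import hashlib
import hmac

def secret_id(secret: bytes, role: str) -> bytes:
    digest = hmac.new(secret, role.encode("ascii"), hashlib.sha256).digest()
    return digest[:8]               # assumed: keep the leftmost 64 bits

def initiator_s0(dhss: bytes, received_ids: list[bytes],
                 own_secrets: list[bytes]) -> bytes:
    # Set B, kept as a map from expected responder ID to the secret itself.
    expected = {secret_id(s, "Responder"): s for s in own_secrets}
    # Sets C and D: the secrets whose responder IDs arrived in DHPart1.
    set_d = [expected[i] for i in expected.keys() & set(received_ids)]
    # Set E ordering: sort D by the initiator ID sent in DHPart2.
    set_d.sort(key=lambda s: secret_id(s, "Initiator"))
    return hashlib.sha256(dhss + b"".join(set_d)).digest()
```

   Sorting by the initiator's IDs on both sides is what makes the two
   endpoints concatenate the shared secrets in the same order.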

650	3.2.4.  Confirmation and Switch to SRTP

652	   The SRTP master key and master salt are then generated using the
653	   shared secret.  Separate SRTP keys and salts are used in each
654	   direction for each media stream.  Unless otherwise specified, ZRTP
655	   uses SRTP with no MKI, 32 bit authentication using HMAC-SHA1, AES-CM
656	   128 or 256 bit key length, 112 bit session salt key length, 2^48 key
657	   derivation rate, and SRTP prefix length 0.

659	   The ZRTP initiator encrypts and the ZRTP responder decrypts packets
660	   by using srtpkeyi and srtpsalti, which are generated by:

662	   srtpkeyi = HMAC(s0,"Initiator SRTP master key")

664	   srtpsalti = HMAC(s0,"Initiator SRTP master salt")

666	   The ZRTP responder encrypts and the ZRTP initiator decrypts packets
667	   by using srtpkeyr and srtpsaltr, which are generated by:

669	   srtpkeyr = HMAC(s0,"Responder SRTP master key")

671	   srtpsaltr = HMAC(s0,"Responder SRTP master salt")

673	   The HMAC key is generated by:

675	   hmackey = HMAC(s0,"HMAC key")

677	   Both sides now discard the rs2 value and store rs1 as rs2.  A new rs1
678	   is calculated from s0:

680	   rs1 = HMAC (s0, "retained secret")
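
   The derivations above all have the same shape, so they can be
   sketched with one table of labels.  HMAC-SHA256 and the helper names
   are assumptions.  The sketch returns full-length HMAC outputs; how
   they are shortened to the SRTP master key and salt lengths is not
   stated in this section and is therefore not shown.

```python
import hashlib
import hmac

LABELS = {
    "srtpkeyi": "Initiator SRTP master key",
    "srtpsalti": "Initiator SRTP master salt",
    "srtpkeyr": "Responder SRTP master key",
    "srtpsaltr": "Responder SRTP master salt",
    "hmackey": "HMAC key",
}

def derive_keys(s0: bytes) -> dict[str, bytes]:
    """Return one HMAC(s0, label) value per name in LABELS."""
    return {name: hmac.new(s0, label.encode("ascii"), hashlib.sha256).digest()
            for name, label in LABELS.items()}

def rotate_retained_secrets(old_rs1: bytes, s0: bytes) -> tuple[bytes, bytes]:
    """Return (new_rs1, new_rs2): old rs1 becomes rs2, rs1 comes from s0."""
    new_rs1 = hmac.new(s0, b"retained secret", hashlib.sha256).digest()
    return new_rs1, old_rs1
```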

682	   The endpoints can now switch to SRTP and begin packet encryption.
683	   The ZRTP Initiator and Responder use their own keying material for
684	   the SRTP session.  No MKI is used and a 32 bit authentication tag is
685	   used.

687	   The ZRTP Confirm1 and Confirm2 messages are sent for two reasons.
688	   First, they confirm that all the key agreement calculations were
689	   successful and the encryption is working, and they enable us to
690	   automatically detect a DH MitM attack from a reckless attacker who
691	   does not know the retained shared secret.  Second, they enable us to
692	   transmit the SASflag under cover of SRTP encryption, shielding it
693	   from a passive observer who would like to know if the human users are
694	   in the habit of diligently verifying the SAS.

696	   In the Confirm1 and Confirm2 messages, the sasflag Boolean is
697	   converted to an octet called sasflagoctet (resulting in either 0x00
698	   or 0x01).  Confirm1 and Confirm2 messages contain an HMAC of some
699	   known plaintext and the sasflagoctet.  The HMAC is explicitly
700	   included in the payload because we may not always be able to rely on
701	   the built-in authentication tag in SRTP, which might be configured to
702	   different sizes, including none.

704	   hmac = HMAC(hmackey, "known plaintext" | sasflagoctet )

706	   This information is not carried in the extension header but inserted
707	   at the start of the SRTP payload.

709	   The Conf2ACK message completes the exchange.

711	   The optional GoClear message is used to switch from SRTP back to RTP.
712	   To avoid relying on the optional SRTP authentication tag, the GoClear
713	   contains an HMAC of the string "GoClear" computed with the hmackey
714	   derived from the shared secret:

716	   clear_hmac = HMAC(hmackey, "GoClear")

718	   A GoClear message is answered with either a ClearACK message or an
719	   Error message.  An Error indicates that the ZRTP endpoint does not
720	   support the GoClear mechanism or that the GoClear has failed
721	   authentication (the clear_hmac does not validate).
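
   The Confirm and GoClear HMACs might be computed and checked as in
   the sketch below.  HMAC-SHA256, the function names, and the use of a
   constant-time comparison are assumptions of this sketch.

```python
import hashlib
import hmac

def confirm_hmac(hmackey: bytes, sasflag: bool) -> bytes:
    """hmac = HMAC(hmackey, "known plaintext" | sasflagoctet)."""
    sasflagoctet = b"\x01" if sasflag else b"\x00"
    return hmac.new(hmackey, b"known plaintext" + sasflagoctet,
                    hashlib.sha256).digest()

def clear_hmac(hmackey: bytes) -> bytes:
    """clear_hmac = HMAC(hmackey, "GoClear")."""
    return hmac.new(hmackey, b"GoClear", hashlib.sha256).digest()

def goclear_is_authentic(hmackey: bytes, received: bytes) -> bool:
    """True if the received clear_hmac validates under our hmackey."""
    return hmac.compare_digest(clear_hmac(hmackey), received)
```

   A receiver for which goclear_is_authentic returns False would answer
   with an Error message rather than a ClearACK.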

723	3.3.  Random Number Generation

725	   The ZRTP protocol uses random numbers for cryptographic key material,
726	   notably for the DH secret exponents, which must be freshly generated
727	   with each session.  Whenever a random number is needed, all of the
728	   following criteria must be satisfied:

730	   It MUST be derived from a physical entropy source, such as RF noise,
731	   acoustic noise, thermal noise, high resolution timings of
732	   environmental events, or other unpredictable physical sources of
733	   entropy.  Chapter 10 of [4] gives a detailed explanation of
734	   cryptographic grade random numbers and provides guidance for
735	   collecting suitable entropy.  The raw entropy must be distilled and
736	   processed through a deterministic random bit generator (DRBG).
737	   Examples of DRBGs may be found in NIST SP 800-90 [5], and in [4].

739	   It MUST be freshly generated, meaning that it must not have been used
740	   in a previous calculation.

742	   It MUST be greater than or equal to two, and less than or equal to
743	   2^L - 1, where L is the number of random bits required.

745	   It MUST be chosen with equal probability from the entire available
746	   number space, e.g., [2, 2^L - 1].
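
   The range and uniformity criteria can be met by rejection sampling,
   as sketched below.  The sketch relies on Python's secrets module,
   which draws from the operating system's random source; whether that
   source satisfies the physical entropy requirement above depends on
   the platform and is an assumption here.  The function name is
   hypothetical.

```python
import secrets

def random_exponent(bits: int) -> int:
    """Return a uniformly chosen integer in [2, 2^bits - 1]."""
    if bits < 2:
        raise ValueError("need at least 2 bits to reach the value 2")
    while True:
        candidate = secrets.randbits(bits)   # uniform in [0, 2^bits - 1]
        if candidate >= 2:                   # discard 0 and 1, redraw
            return candidate
```

   Redrawing on 0 and 1, rather than adding an offset or reducing
   modulo the range, keeps every remaining value equally likely.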

748	4.  RTP Header Extensions

750	   This specification defines a new RTP header extension used for all
751	   ZRTP messages.  When used, the X bit is set in the RTP header to
752	   indicate the presence of the RTP header extension.

754	   Section 5.3.1 in RFC 3550 defines the format of an RTP Header
755	   extension.  The Header extension is appended to the RTP header.  The
756	   first 16 bits are an identifier for the header extension, and the
757	   following 16 bits are the length of the extension header in 32 bit
758	   words.  All word lengths referenced in this specification follow RFC
759	   3550 and are 32 bits or 4 octets.  All integer fields are carried in
760	   network byte order, that is, most significant byte (octet) first,
761	   commonly known as big-endian.  Each ZRTP message is carried in a
762	   single RTP header extension whose identifier has the value 0x505A.
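
   A receiver might recognise a ZRTP header extension as in the sketch
   below.  The argument ext is assumed to start at the first octet of
   the RTP header extension, and the function name is hypothetical.
   Per RFC 3550, the length field counts 32 bit words and excludes the
   four-octet extension header itself.

```python
import struct

ZRTP_EXTENSION_ID = 0x505A

def parse_zrtp_extension(ext: bytes) -> bytes:
    """Return the extension payload, or raise ValueError if malformed."""
    if len(ext) < 4:
        raise ValueError("truncated extension header")
    # "!HH": two 16-bit unsigned integers in network byte order.
    ext_id, length_words = struct.unpack_from("!HH", ext, 0)
    if ext_id != ZRTP_EXTENSION_ID:
        raise ValueError("not a ZRTP header extension")
    end = 4 + 4 * length_words
    if len(ext) < end:
        raise ValueError("extension shorter than its length field")
    return ext[4:end]
```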

764	4.1.  ZRTP Message Formats

766	   ZRTP messages are designed to simplify endpoint parsing requirements
767	   and to reduce the opportunities for buffer overflow attacks (a good
768	   goal of any security extension should be to not introduce new attack
769	   vectors).

771	   ZRTP uses 8 octet blocks (2 words) to encode many ZRTP parameters.
772	   These fixed-length blocks are used for Message Type, Hash Type,
773	   Cipher Type, Public Key Type, and SAS Type.  The values in the blocks
774	   are ASCII strings which are extended with spaces (0x20) to make them
775	   8 characters long.  Currently defined block values are listed in
776	   Tables 1-5 below.  Additional block values may be defined and used.

778	   ZRTP uses this ASCII encoding to simplify debugging and make it
779	   "ethereal friendly".
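
   Encoding and decoding a block is a matter of space padding, as the
   following sketch shows.  The function names are hypothetical.

```python
def encode_block(name: str) -> bytes:
    """Pad an ASCII block value with spaces (0x20) to 8 octets."""
    raw = name.encode("ascii")
    if len(raw) > 8:
        raise ValueError("block value longer than 8 octets")
    return raw.ljust(8, b" ")

def decode_block(block: bytes) -> str:
    """Strip the trailing space padding from an 8-octet block."""
    if len(block) != 8:
        raise ValueError("a block is exactly 8 octets")
    return block.decode("ascii").rstrip(" ")
```

   Because every block has the same fixed size, a parser can step
   through a message without reading any per-field length.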

781	4.1.1.  Message Type Block

783	   Currently eleven Message Type Blocks are defined - they represent the
784	   set of ZRTP message primitives.  ZRTP endpoints MUST support the
785	   Hello, HelloACK, Commit, DHPart1, DHPart2, Confirm1, Confirm2,
786	   Conf2ACK, and Error block types.  They MAY support GoClear and
787	   ClearACK.

789	    Message Type Block   |  Meaning
790	    ---------------------------------------------------
791	    Hello                |  Hello Message
792	                         |  defined in Section 4.2
793	    ---------------------------------------------------
794	    HelloACK             |  HelloACK Message
795	                         |  defined in Section 4.3
796	    ---------------------------------------------------
797	    Commit               |  Commit Message
798	                         |  defined in Section 4.4
799	    ---------------------------------------------------
800	    DHPart1              |  DHPart1 Message
801	                         |  defined in Section 4.5
802	    ---------------------------------------------------
803	    DHPart2              |  DHPart2 Message
804	                         |  defined in Section 4.6
805	    ---------------------------------------------------
806	    Confirm1             |  Confirm1 Message
807	                         |  defined in Section 4.7
808	    ---------------------------------------------------
809	    Confirm2             |  Confirm2 Message
810	                         |  defined in Section 4.7
811	    ---------------------------------------------------
812	    Conf2ACK             |  Conf2ACK Message
813	                         |  defined in Section 4.8
814	    ---------------------------------------------------
815	    Error                |  Error Message
816	                         |  defined in Section 4.9
817	    ---------------------------------------------------
818	    GoClear              |  GoClear Message
819	                         |  defined in Section 4.10
820	    ---------------------------------------------------
821	    ClearACK             |  ClearACK Message
822	                         |  defined in Section 4.11
823	    ---------------------------------------------------
824	    Table 1. Message Block Type Values

826	4.1.2.  Hash Type Block

828	   Only one Hash Type is currently defined, SHA256, and all ZRTP
829	   endpoints MUST support this hash.  Additional Hash Types can be
830	   registered and used.

832	    Hash Type Block      |  Meaning
833	    ---------------------------------------------------
834	    SHA256               |  SHA-256 Hash defined in [SHA-256]
835	    ---------------------------------------------------
836	    Table 2. Hash Block Type Values

838	4.1.3.  Cipher Type Block

840	   All ZRTP endpoints MUST support AES128 and MAY support AES256 or
841	   other Cipher Types.  Also, if AES 128 is used, DH3072 should be
842	   used.  If AES 256 is used, DH4096 should be used.

844	     Cipher Type Block    |  Meaning
845	    ---------------------------------------------------
846	    AES128                |  AES-CM with 128 bit keys
847	                          |  as defined in RFC 3711
848	    ---------------------------------------------------
849	    AES256                |  AES-CM with 256 bit keys
850	                          |  as defined in RFC 3711
851	    ---------------------------------------------------
852	    Table 3. Cipher Block Type Values

854	4.1.4.  Public Key Type Block

856	   All ZRTP endpoints MUST support DH3072 and MAY support DH4096.  ZRTP
857	   endpoints MUST use the DH generator function g=2.  The choice of AES
858	   key length is coupled to the choice of public key type.  If AES 128
859	   is chosen, DH3072 SHOULD be used.  If AES 256 is chosen, DH4096
860	   SHOULD be used.

862	     Public Key Type Block|  Meaning
863	    ---------------------------------------------------
864	    DH3072                |  DH with p=3072 bit prime
865	                          |  as defined in RFC 3526
866	    ---------------------------------------------------
867	    DH4096                |  DH with p=4096 bit prime
868	                          |  as defined in RFC 3526
869	    ---------------------------------------------------
870	    Table 4. Public Key Block Type Values

872	4.1.5.  SAS Type Block

874	   All ZRTP endpoints MAY support the libase32 Short Authentication
875	   String scheme or other SAS schemes.  The optional ZRTP SAS is
876	   described in Section 6.

878	     SAS Type Block       |  Meaning
879	    ---------------------------------------------------
880	    libase32              |  Short Authentication String using
881	                          |  libbase32 encoding defined in Section 6.
882	    ---------------------------------------------------
883	    Table 5. SAS Block Type Values

885	4.2.  Hello message

887	   The Hello message has the format shown in Figure 2 below.  The header
888	   extension payload contains the ZRTP version number and the list of
889	   algorithms supported by SRTP.

892	   The Hello ZRTP message begins with the ZRTP header extension field
893	   followed by the 32 bit word count of the header field.  Next is a
894	   word containing the version (ver) of ZRTP.  For this specification,
895	   the version is the string "0.01".  Next is the Client Identifier
896	   string (cid) which is 15 octets long and identifies the vendor and
897	   release of the ZRTP software.  The Passive bit (P) is a Boolean
898	   normally set to False.  A ZRTP endpoint which is configured to never
899	   initiate secure sessions is regarded as passive, and would set the P
900	   bit to True.  Next is a list of supported Hash Types, Cipher Types,
901	   public key types, and SAS Type.  Five possible algorithms are listed
902	   for each using the Blocks defined in Tables 2, 3, 4, and 5.  If fewer
903	   than five algorithms are supported, spaces (0x20) are used to pad out
904	   the 10 words for each type.  The last parameter is the ZID, the 96
905	   bit long unique identifier for the ZRTP endpoint.

907	        0                   1                   2                   3
908	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
909	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
910	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=50 words        |
911	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
912	       |              Message Type Block=Hello (2 words)               |
913	       |                                                               |
914	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
915	       |                        version (1 word)                       |
916	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
917	       |                                                               |
918	       |                 Client Identifier (15 octets)                 |
919	       |                                               +-+-+-+-+-+-+-+-+
920	       |                                               |0 0 0 0 0 0 0|P|
921	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
922	       |                                                               |
923	       |                 Hash Type Blocks 1-5 (10 words)               |
924	       |                              . . .                            |
925	       |                                                               |
926	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
927	       |                                                               |
928	       |                Cipher Type Blocks 1-5 (10 words)              |
929	       |                              . . .                            |
930	       |                                                               |
931	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
932	       |                                                               |
933	       |             Public Key Type Blocks 1-5 (10 words)             |
934	       |                              . . .                            |
935	       |                                                               |
936	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
937	       |                                                               |
938	       |                  SAS Type Blocks 1-5 (10 words)               |
939	       |                              . . .                            |
940	       |                                                               |
941	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
942	       |                                                               |
943	       |                         ZID  (3 words)                        |
944	       |                                                               |
945	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
946	   Figure 2. Extension header format for Hello message
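
   The layout in Figure 2 might be assembled as in the sketch below,
   which also confirms that the fields add up to 50 words.  The helper
   names are hypothetical.  Padding a short Client Identifier with
   spaces and placing the P bit in the least significant bit of the
   octet that follows it are this sketch's reading of the figure, not
   explicit statements of the text.

```python
import struct

def block(name: str) -> bytes:
    raw = name.encode("ascii")
    if len(raw) > 8:
        raise ValueError("block value longer than 8 octets")
    return raw.ljust(8, b" ")

def block_list(names: list[str]) -> bytes:
    """Five 8-octet blocks; unused slots are filled with spaces."""
    if len(names) > 5:
        raise ValueError("at most five algorithms per type")
    return b"".join(block(n) for n in names).ljust(40, b" ")

def build_hello(cid: str, passive: bool, hashes: list[str],
                ciphers: list[str], pkts: list[str], sas_types: list[str],
                zid: bytes) -> bytes:
    cid_octets = cid.encode("ascii")
    if len(cid_octets) > 15 or len(zid) != 12:
        raise ValueError("cid is at most 15 octets and ZID exactly 12")
    body = (block("Hello") + b"0.01"
            + cid_octets.ljust(15, b" ") + bytes([1 if passive else 0])
            + block_list(hashes) + block_list(ciphers)
            + block_list(pkts) + block_list(sas_types) + zid)
    # 2 + 1 + 4 + 4 * 10 + 3 = 50 words
    return struct.pack("!HH", 0x505A, len(body) // 4) + body
```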

948	4.3.  HelloACK message

950	   The HelloACK message is used to stop retransmissions of a Hello
951	   message.  A HelloACK is sent regardless of whether the version number
952	   or the algorithm list in the Hello is supported.  The receipt of a
953	   HelloACK stops retransmission of the Hello message.  The format is
954	   shown in Figure 3 below.  Note that a Commit message can be sent in
955	   place of a HelloACK by an initiator.

957	        0                   1                   2                   3
958	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
959	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
960	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|         length=2 words        |
961	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
962	       |              Message Type Block=HelloACK (2 words)            |
963	       |                                                               |
964	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
965	     Figure 3. Extension header format for HelloACK message

967	4.4.  Commit message

969	   The Commit message is sent to initiate the key agreement process
970	   after receiving a Hello message.  The Commit message contains the
971	   initiator's ZID and a list of selected algorithms (hash, cipher, pkt,
972	   sas) and hvi, a hash of the public DH value of the initiator and the
973	   algorithm list from the responder's Hello message.  A Commit cannot
974	   be sent until a Hello message has been received.

976	        0                   1                   2                   3
977	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
978	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
979	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=21 words        |
980	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
981	       |              Message Type Block=Commit (2 words)              |
982	       |                                                               |
983	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
984	       |                                                               |
985	       |                         ZID  (3 words)                        |
986	       |                                                               |
987	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
988	       |                    Hash Type Block (2 words)                  |
989	       |                                                               |
990	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
991	       |                   Cipher Type Block (2 words)                 |
992	       |                                                               |
993	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
994	       |                Public Key Type Block (2 words)                |
995	       |                                                               |
996	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
997	       |                    SAS Type Block (2 words)                   |
998	       |                                                               |
999	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1000	       |                                                               |
1001	       |                             hvi (8 words)                     |
1002	       |                                . . .                          |
1003	       |                                                               |
1004	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1005	     Figure 4. Extension header format for Commit message

1007	4.5.  DHPart1 message

1009	   The DHPart1 message begins the DH exchange.  The format is
1010	   shown in Figure 5 below.  The DHPart1 message is sent if a valid
1011	   Commit message is received.  The length of the pvr value depends on
1012	   the Public Key Type chosen.  If DH4096 is used, the pvr will be 128
1013	   words (512 octets).  If DH3072 is used, it is 96 words (384 octets).

1015	   The next five parameters are HMACs of potential shared secrets used
1016	   in generating the ZRTP secret.  The first two, rs1IDr and rs2IDr, are
1017	   the HMACs of the responder's two retained shared secrets, truncated
1018	   to 64 bits.  Next is sigsIDr, the HMAC of the responder's signaling
1019	   secret, truncated to 64 bits.  Next is srtpsIDr, the HMAC of the
1020	   responder's SRTP secret, truncated to 64 bits.  The last parameter is
1021	   the HMAC of an additional shared secret.  For example, if multiple
1022	   SRTP secrets are available or some other secret is used, it can be
1023	   used as the other_secret.

1025	        0                   1                   2                   3
1026	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1027	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1028	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|   length=depends on PK Type   |
1029	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1030	       |              Message Type Block=DHPart1 (2 words)             |
1031	       |                                                               |
1032	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1033	       |                                                               |
1034	       |                 pvr (length depends on PK Type)               |
1035	       |                               . . .                           |
1036	       |                                                               |
1037	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1038	       |                        rs1IDr (2 words)                       |
1039	       |                                                               |
1040	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1041	       |                        rs2IDr (2 words)                       |
1042	       |                                                               |
1043	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1044	       |                        sigsIDr (2 words)                      |
1045	       |                                                               |
1046	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1047	       |                       srtpsIDr (2 words)                      |
1048	       |                                                               |
1049	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1050	       |                    other_secretIDr (2 words)                  |
1051	       |                                                               |
1052	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1053	      Figure 5. Extension header format for DHPart1 message
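
   The length that depends on the Public Key Type can be worked out by
   adding up the fields in Figure 5.  The sketch below does so under
   the assumption that the length field counts every word after the
   four-octet extension header, as it does in the Hello and HelloACK
   figures, and that the public value is sent as a fixed-length
   big-endian integer.  The names are hypothetical.

```python
PV_WORDS = {"DH3072": 96, "DH4096": 128}

def encode_public_value(pv: int, pk_type: str) -> bytes:
    """Serialise pv to the fixed size required by the public key type."""
    return pv.to_bytes(4 * PV_WORDS[pk_type], "big")

def dhpart_length_words(pk_type: str) -> int:
    """Message Type Block + public value + five 2-word secret IDs."""
    return 2 + PV_WORDS[pk_type] + 5 * 2
```

   Under these assumptions the length field would be 108 words for
   DH3072 and 140 words for DH4096.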

1055	4.6.  DHPart2 message

1057	   The DHPart2 message completes the DH exchange.  A DHPart2 message is
1058	   sent if a valid DHPart1 message is received.  The length of the pvi
1059	   value depends on the Public Key Type chosen.  If DH4096 is used, the
1060	   pvi will be 128 words (512 octets).  If DH3072 is used, it is 96
1061	   words (384 octets).

1063	   The next five parameters are HMACs of potential shared secrets used
1064	   in generating the ZRTP secret.  The first two, rs1IDi and rs2IDi, are
1065	   the HMACs of the initiator's two retained shared secrets, truncated
1066	   to 64 bits.  Next is sigsIDi, the HMAC of the initiator's signaling
1067	   secret, truncated to 64 bits.  Next is srtpsIDi, the HMAC of the
1068	   initiator's SRTP secret, truncated to 64 bits.  The last parameter is
1069	   the HMAC of an additional shared secret.  For example, if multiple
1070	   SRTP secrets are available or some other secret is used, it can be
1071	   included.

1073	        0                   1                   2                   3
1074	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1075	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1076	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|   length=depends on PK Type   |
1077	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1078	       |              Message Type Block=DHPart2 (2 words)             |
1079	       |                                                               |
1080	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1081	       |                                                               |
1082	       |                   pvi (length depends on PK Type)             |
1083	       |                               . . .                           |
1084	       |                                                               |
1085	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1086	       |                        rs1IDi (2 words)                       |
1087	       |                                                               |
1088	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1089	       |                        rs2IDi (2 words)                       |
1090	       |                                                               |
1091	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1092	       |                        sigsIDi (2 words)                      |
1093	       |                                                               |
1094	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1095	       |                       srtpsIDi (2 words)                      |
1096	       |                                                               |
1097	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1098	       |                    other_secretIDi (2 words)                  |
1099	       |                                                               |
1100	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1101	     Figure 6. Extension header format for DHPart2 message
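
   As a hedged illustration of the sizes described above, the following
   Python sketch computes the value of the DHPart2 length field for each
   Public Key Type.  It assumes that the length field counts the 32-bit
   words that follow the first header word, as the values shown in
   Figures 7, 10, and 11 suggest.  The constant and function names are
   illustrative only and are not defined by this specification.

```python
# Word counts are taken from the text and Figure 6; one word is 32 bits.
PUBLIC_VALUE_WORDS = {"DH3072": 96, "DH4096": 128}
MESSAGE_TYPE_WORDS = 2
SECRET_ID_WORDS = 2  # each secret ID is an HMAC truncated to 64 bits
SECRET_ID_FIELDS = (
    "rs1IDi", "rs2IDi", "sigsIDi", "srtpsIDi", "other_secretIDi",
)


def dhpart2_length_words(pk_type: str) -> int:
    """Return the assumed length field (in words) of a DHPart2 message."""
    return (
        MESSAGE_TYPE_WORDS
        + PUBLIC_VALUE_WORDS[pk_type]
        + SECRET_ID_WORDS * len(SECRET_ID_FIELDS)
    )


def dhpart2_length_octets(pk_type: str) -> int:
    """Same quantity expressed in octets (4 octets per word)."""
    return 4 * dhpart2_length_words(pk_type)
```

   Under these assumptions, DH3072 gives 108 words and DH4096 gives 140
   words.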

1103	4.7.  Confirm1 message

1105	   The Confirm1 message is sent in response to a valid DHPart2 message
1106	   after the SRTP session key and parameters have been negotiated.  As a
1107	   result, it is always sent in an SRTP packet.  The format is shown in
1108	   Figure 7 below.  The header extension itself has no parameters
1109	   besides the Message Type Block.  However, three parameters are
1110	   carried in the SRTP payload.  The plaintext parameter contains the
1111	   known plaintext "known plaintext".  The sasflag (S) is a Boolean bit.
1112	   The hmac is an HMAC over the known plaintext "known plaintext" and
1113	   the sasflag Boolean converted to the octet 0x00 or 0x01.

1115	   The parameters included in the SRTP payload MUST NOT be allowed to
1116	   pass to the RTP stack or errors may occur with the media stream.

1118	        0                   1                   2                   3
1119	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1120	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1121	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|         length=2 words        |
1122	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1123	       |              Message Type Block=Confirm1 (2 words)            |
1124	       |                                                               |
1125	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1127	         At the start of the SRTP payload:

1129	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1130	       |                                                               |
1131	       |                                                               |
1132	       |                     plaintext (31 octets)                     |
1133	       |                                               +-+-+-+-+-+-+-+-+
1134	       |                                               |0 0 0 0 0 0 0|S|
1135	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1136	       |                                                               |
1137	       |                         hmac (8 words)                        |
1138	       |                             . . .                             |
1139	       |                                                               |
1140	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1141	     Figure 7. Extension header format for Confirm1 message
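
   The payload layout in Figure 7 can be sketched in Python as follows.
   This is a minimal sketch under several assumptions that this section
   does not state: the hash is taken to be HMAC-SHA256 (matching the
   8-word hmac field), the 31-octet plaintext field is taken to be the
   string "known plaintext" padded with zero octets, and hmac_key is a
   hypothetical stand-in for whatever key the key derivation provides.

```python
import hashlib
import hmac

KNOWN_PLAINTEXT = b"known plaintext"
PLAINTEXT_FIELD_OCTETS = 31
HMAC_OCTETS = 32  # 8 words


def build_confirm_payload(hmac_key: bytes, sasflag: bool) -> bytes:
    """Build the parameters carried at the start of the SRTP payload."""
    flag_octet = b"\x01" if sasflag else b"\x00"
    # Assumption: the plaintext field is zero-padded to 31 octets.
    plaintext_field = KNOWN_PLAINTEXT.ljust(PLAINTEXT_FIELD_OCTETS, b"\x00")
    # Assumption: HMAC-SHA256 over the known plaintext and the flag octet.
    tag = hmac.new(hmac_key, KNOWN_PLAINTEXT + flag_octet,
                   hashlib.sha256).digest()
    return plaintext_field + flag_octet + tag


def parse_confirm_payload(hmac_key: bytes, payload: bytes) -> bool:
    """Verify the parameters and return the sasflag; raise on failure."""
    total = PLAINTEXT_FIELD_OCTETS + 1 + HMAC_OCTETS
    if len(payload) < total:
        raise ValueError("payload too short")
    plaintext_field = payload[:PLAINTEXT_FIELD_OCTETS]
    flag_octet = payload[PLAINTEXT_FIELD_OCTETS:PLAINTEXT_FIELD_OCTETS + 1]
    received = payload[PLAINTEXT_FIELD_OCTETS + 1:total]
    expected = hmac.new(hmac_key, KNOWN_PLAINTEXT + flag_octet,
                        hashlib.sha256).digest()
    if plaintext_field != KNOWN_PLAINTEXT.ljust(PLAINTEXT_FIELD_OCTETS,
                                                b"\x00"):
        raise ValueError("unexpected plaintext")
    if flag_octet not in (b"\x00", b"\x01"):
        raise ValueError("invalid sasflag octet")
    if not hmac.compare_digest(received, expected):
        raise ValueError("hmac mismatch")
    return flag_octet == b"\x01"
```

   A receiver would strip these 64 octets before handing the remaining
   payload to the RTP stack, in line with the requirement above.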

1143	4.8.  Confirm2 message

1145	   The Confirm2 message is sent in response to a Confirm1 message after
1146	   the SRTP session key and parameters have been negotiated.  As a
1147	   result, it is always sent in an SRTP packet.  The format is shown in
1148	   Figure 8 below.  The header extension itself has no parameters
1149	   besides the Message Type Block.  However, three parameters are
1150	   carried in the SRTP payload.  The plaintext parameter contains the
1151	   known plaintext "known plaintext".  The sasflag (S) is a Boolean bit.
1152	   The hmac is an HMAC over the known plaintext "known plaintext" and
1153	   the sasflag Boolean converted to the octet 0x00 or 0x01.

1155	   The parameters included in the SRTP payload MUST NOT be allowed to
1156	   pass to the RTP stack or errors may occur with the media stream.

1158	        0                   1                   2                   3
1159	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1160	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1161	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|         length=2 words        |
1162	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1163	       |              Message Type Block=Confirm2 (2 words)            |
1164	       |                                                               |
1165	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1167	        At the start of the SRTP payload:

1169	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1170	       |                                                               |
1171	       |                     plaintext (31 octets)                     |
1172	       |                                               +-+-+-+-+-+-+-+-+
1173	       |                                               |0 0 0 0 0 0 0|S|
1174	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1175	       |                                                               |
1176	       |                         hmac (8 words)                        |
1177	       |                             . . .                             |
1178	       |                                                               |
1179	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1180	     Figure 8. Extension header format for Confirm2 message

1182	4.9.  Conf2ACK message

1184	   The Conf2ACK message is sent in response to a valid Confirm2 message.
1185	   The format is shown in Figure 9 below.  The receipt of a Conf2ACK
1186	   stops retransmission of the Confirm2 message.

1188	        0                   1                   2                   3
1189	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1190	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1191	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|         length=2 words        |
1192	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1193	       |              Message Type Block=Conf2ACK (2 words)            |
1194	       |                                                               |
1195	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1196	     Figure 9. Extension header format for Conf2ACK message

1198	4.10.  Error message

1200	   An Error message is sent in response to another ZRTP message that is
1201	   not valid or not supported.  The format is shown in Figure 10 below.
1202	   Possible reasons include a missing block or parameter, a chosen
1203	   parameter not in the offered list, a checksum failure, or a message
1204	   type block that is not understood.  The ZRTP message type that
1205	   generated the error is included in the second Message Type Block.
1206	   This message can be sent in response to any ZRTP message except
1207	   Hello and HelloACK and is never acknowledged or retransmitted.

1209	        0                   1                   2                   3
1210	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1211	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1212	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=4 words         |
1213	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1214	       |              Message Type Block=Error (2 words)               |
1215	       |                                                               |
1216	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1217	       |                  Message Type Block  (2 words)                |
1218	       |                                                               |
1219	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1220	     Figure 10. Extension header format for Error message

1222	4.11.  GoClear message

1224	   The optional GoClear message is sent to switch from SRTP back to RTP.
1225	   The format is shown in Figure 11 below.  The clear_hmac is used to
1226	   authenticate the GoClear message so that bogus GoClear messages
1227	   introduced by an attacker can be detected and discarded.  This
1228	   message is retransmitted at 500ms intervals until the receipt of a
1229	   ClearACK message or an Error message.

1231	   After sending a GoClear message, the ZRTP endpoint stops sending SRTP
1232	   packets.  When a ClearACK is received, the ZRTP endpoint deletes the
1233	   crypto context for the SRTP session and may then resume sending RTP
1234	   packets.  However, if instead an Error message is received, the SRTP
1235	   session resumes as if the GoClear had never been sent.

1237	        0                   1                   2                   3
1238	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1239	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1240	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=10 words        |
1241	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1242	       |              Message Type Block=GoClear (2 words)             |
1243	       |                                                               |
1244	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1245	       |                                                               |
1246	       |                       clear_hmac (8 words)                    |
1247	       |                             . . .                             |
1248	       |                                                               |
1249	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1250	     Figure 11. Extension header format for GoClear message
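
   The sender-side behavior described in this section can be sketched
   as a small state machine.  This is an illustrative Python sketch
   only: the class, state, and method names are hypothetical, message
   transmission is modeled by appending to a list, and on_timer() is
   assumed to be called by the application every 500 ms.

```python
from enum import Enum, auto

GOCLEAR_RETRANSMIT_MS = 500  # retransmission interval from the text


class SenderState(Enum):
    SECURE = auto()             # SRTP session active
    AWAITING_CLEARACK = auto()  # GoClear sent, SRTP sending stopped
    CLEAR = auto()              # crypto context deleted, RTP allowed


class GoClearSender:
    def __init__(self):
        self.state = SenderState.SECURE
        self.crypto_context = object()  # placeholder for SRTP state
        self.sent = []                  # record of messages "sent"

    def request_clear(self):
        """Send GoClear and stop sending SRTP packets."""
        if self.state is SenderState.SECURE:
            self.state = SenderState.AWAITING_CLEARACK
            self.sent.append("GoClear")

    def on_timer(self):
        """Retransmit GoClear while no ClearACK or Error has arrived."""
        if self.state is SenderState.AWAITING_CLEARACK:
            self.sent.append("GoClear")

    def on_message(self, message_type: str):
        if self.state is not SenderState.AWAITING_CLEARACK:
            return
        if message_type == "ClearACK":
            self.crypto_context = None  # delete the SRTP crypto context
            self.state = SenderState.CLEAR
        elif message_type == "Error":
            # The SRTP session resumes as if GoClear had never been sent.
            self.state = SenderState.SECURE

    def may_send_srtp(self) -> bool:
        return self.state is SenderState.SECURE

    def may_send_rtp(self) -> bool:
        return self.state is SenderState.CLEAR
```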

1252	4.12.  ClearACK message

1254	   The optional ClearACK message is sent to acknowledge receipt of a
1255	   GoClear.  A ClearACK is only sent if the clear_hmac from the GoClear
1256	   message is authenticated.  Otherwise, an Error message is returned.
1257	   The format is shown in Figure 12 below.  A ZRTP endpoint that
1258	   receives a GoClear message stops sending SRTP packets, generates a
1259	   ClearACK in response, and deletes the crypto context for the SRTP
1260	   session.  The endpoint then renders to the user the information
1261	   that the media session has switched to clear mode and waits for
1262	   confirmation from the user (e.g., clicking a button or pressing a
1263	   DTMF key).  Until that confirmation is received, the ZRTP endpoint
1264	   MUST NOT resume sending RTP packets.  To prevent pinholes from
1265	   closing or NAT bindings from expiring, the ClearACK message should
1266	   be resent every 5 seconds while waiting for confirmation from the
1267	   user.  After confirmation of the notification is received from the
1268	   user, the sending of RTP packets may begin.

1270	   Note that if the GoClear/ClearACK mechanism is not supported by a
1271	   ZRTP endpoint, an Error message MUST be sent in response to a GoClear
1272	   message.

1274	        0                   1                   2                   3
1275	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1276	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1277	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=2 words         |
1278	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1279	       |              Message Type Block=ClearACK (2 words)            |
1280	       |                                                               |
1281	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1282	     Figure 12. Extension header format for ClearACK message
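
   The receiver-side handling of GoClear can be sketched in the same
   way.  This Python sketch is illustrative only: the names are
   hypothetical, the expected clear_hmac is passed in because its
   computation is not covered in this section, and on_timer() is assumed
   to be called by the application every 5 seconds.

```python
import hmac
from typing import Optional

CLEARACK_RESEND_SECONDS = 5  # resend interval from the text


class GoClearReceiver:
    def __init__(self, expected_clear_hmac: bytes,
                 supports_goclear: bool = True):
        self.expected_clear_hmac = expected_clear_hmac
        self.supports_goclear = supports_goclear
        self.crypto_context = object()  # placeholder for SRTP state
        self.srtp_active = True
        self.awaiting_user = False
        self.rtp_allowed = False

    def on_goclear(self, clear_hmac: bytes) -> str:
        """Return the message type to send in response to a GoClear."""
        if not self.supports_goclear:
            return "Error"
        # Constant-time comparison of the received and expected values.
        if not hmac.compare_digest(clear_hmac, self.expected_clear_hmac):
            return "Error"
        self.srtp_active = False    # stop sending SRTP packets
        self.crypto_context = None  # delete the SRTP crypto context
        self.awaiting_user = True   # notify the user and wait
        return "ClearACK"

    def on_timer(self) -> Optional[str]:
        """Resend ClearACK while waiting, to keep NAT bindings alive."""
        return "ClearACK" if self.awaiting_user else None

    def on_user_confirmation(self):
        """Only after the user confirms may RTP packets be sent."""
        if self.awaiting_user:
            self.awaiting_user = False
            self.rtp_allowed = True
```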

1284	5.  Retransmissions

1286	   ZRTP uses two retransmission timers T1 and T2.  T1 is used for
1287	   retransmission of Hello messages, when the support of ZRTP by the
1288	   other endpoint may not be known.  T2 is used in retransmissions of
1289	   all the other ZRTP messages with the exception of GoClear.  The
1290	   retransmission of GoClear messages is discussed in the section on
1291	   GoClear.

1293	   Practical experience has shown that RTP packet loss at the start of
1294	   an RTP session can be extremely high.  Since the entire ZRTP message
1295	   exchange occurs during this period, the retransmission scheme is
1296	   defined to be aggressive.  Since ZRTP packets, with the exception
1297	   of the DHPart1 and DHPart2 messages, are small, this should have
1298	   minimal effect on overall bandwidth utilization of the media session.

1300	   Hello ZRTP requests are retransmitted at an interval that starts at
1301	   T1 and doubles after every retransmission, capping at 200 ms.  A
1302	   Hello message is retransmitted 20 times before giving up.  T1 has a
1303	   recommended value of 50 ms.  Retransmission of a Hello ends upon
1304	   receipt of a HelloACK or Commit message.

1306	   Non-Hello ZRTP requests are retransmitted only by the initiator;
1307	   that is, only Commit, DHPart2, and Confirm2 are retransmitted if
1308	   the corresponding messages from the responder (DHPart1, Confirm1,
1309	   and Conf2ACK, respectively) are not received.  These messages are
1310	   retransmitted at an interval that starts at T2 and doubles after
1311	   every retransmission, capping at 600 ms.  Each message is
1312	   retransmitted 10 times before giving up and resuming a normal
1313	   RTP session.  T2 has a default value of 150 ms.  Each message
1314	   has a response message that stops retransmissions, as shown in
1315	   Table 6.  The high value of T2 means that retransmissions will
1316	   likely occur only with packet loss.  The receipt of an Error
1317	   message ends retransmission of the message identified in the
1318	   Error message.

1320	       Message      Acknowledgement Message
1321	       -------      -----------------------
1322	       Hello        HelloACK or Commit
1323	       Commit       DHPart1
1324	       DHPart2      Confirm1
1325	       Confirm2     Conf2ACK
1326	       GoClear      ClearACK
1327	      Table 6. Retransmitted ZRTP Messages and Responses
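
   The two timer schedules described in this section can be worked out
   with a short Python sketch.  It assumes that the first wait equals
   the initial timer value and that the interval doubles after each
   retransmission until it reaches the cap; the function name is
   illustrative only.

```python
def retransmission_intervals_ms(initial_ms, cap_ms, max_retransmissions):
    """Return the wait, in ms, that precedes each retransmission."""
    intervals = []
    interval = initial_ms
    for _ in range(max_retransmissions):
        intervals.append(interval)
        interval = min(interval * 2, cap_ms)  # double, then apply the cap
    return intervals


# Hello: T1 = 50 ms, capped at 200 ms, 20 retransmissions.
HELLO_INTERVALS_MS = retransmission_intervals_ms(50, 200, 20)

# Commit, DHPart2, Confirm2: T2 = 150 ms, capped at 600 ms,
# 10 retransmissions.
OTHER_INTERVALS_MS = retransmission_intervals_ms(150, 600, 10)
```

   Under these assumptions the Hello schedule is 50, 100, 200, 200, ...
   for a total of 3750 ms before the last retransmission, and the
   non-Hello schedule is 150, 300, 600, 600, ... for a total of 5250 ms.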

1329	6.  Short Authentication String

1331	   This section discusses the implementation of the optional Short
1332	   Authentication String (SAS) in ZRTP.

1334	   The Short Authentication String (SAS) value is calculated as the hash
1335	   of both DH public values and the string "Short Authentication
1336	   String".

1338	   sasvalue = hash(pvi | pvr | "Short Authentication String")

1340	   The rendering of the SAS value depends on the SAS Type agreed upon in
1341	   the Commit message.  For the SAS Type of libase32, the last 20 bits
1342	   of the sasvalue are rendered as a form of base32 encoding known as
1343	   libbase32 [6].  The purpose of libbase32 is to represent arbitrary
1344	   sequences of octets in a form that is as convenient as possible for
1345	   human users to manipulate.  As a result, the choice of characters is
1346	   slightly different from base32 as defined in RFC 3548.  The last 20
1347	   bits of the sasvalue result in four libbase32 characters, which are
1348	   rendered to both ZRTP endpoints.  Other SAS Types may be defined to
1349	   render the SAS value in other ways.
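
   The SAS computation and rendering can be sketched in Python as
   follows.  This is a sketch under stated assumptions rather than a
   definitive implementation: the hash is taken to be SHA-256, pvi and
   pvr are taken to be already serialized as octet strings, and the 20
   bits are taken to map to four characters, five bits each, most
   significant bits first.  The alphabet is the one given in the
   human-oriented base-32 design [6]; the function names are
   illustrative only.

```python
import hashlib

# Alphabet of the human-oriented base-32 encoding described in [6].
ZBASE32_ALPHABET = "ybndrfg8ejkmcpqxot1uwisza345h769"


def sas_value(pvi: bytes, pvr: bytes) -> bytes:
    """sasvalue = hash(pvi | pvr | "Short Authentication String")."""
    # Assumption: the hash function is SHA-256.
    return hashlib.sha256(
        pvi + pvr + b"Short Authentication String"
    ).digest()


def render_sas(sasvalue: bytes) -> str:
    """Render the last 20 bits of sasvalue as four characters."""
    last_20_bits = int.from_bytes(sasvalue[-3:], "big") & 0xFFFFF
    # Assumption: five bits per character, most significant first.
    return "".join(
        ZBASE32_ALPHABET[(last_20_bits >> shift) & 0x1F]
        for shift in (15, 10, 5, 0)
    )
```

   Both endpoints would call render_sas(sas_value(pvi, pvr)) and display
   the resulting four characters to their users for comparison.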

1351	   The sasflag is set based on the user indicating that SAS has been
1352	   successfully performed.  The sasflag is exchanged securely in the
1353	   Confirm1 and Confirm2 messages of the next session.  In other words,
1354	   each party sends the sasflag from the previous session in the Confirm
1355	   message of the current session.  It is perfectly reasonable to have a
1356	   ZRTP endpoint that never sets the sasflag, because it would require
1357	   adding complexity to the user interface to allow the user to set it.
1358	   The sasflag is not required to be set, but if it is available to the
1359	   client software, it allows for the possibility that the client
1360	   software could render to the user that the SAS verify procedure was
1361	   carried out in a previous session.

1363	   Regardless of whether there is a user interface element to allow the
1364	   user to set the sasflag, it is worth caching a shared secret, because
1365	   doing so reduces opportunities for an attacker in the next call.

1367	   If at any time the users carry out the SAS procedure, and it actually
1368	   fails to match, then this means there is a very resourceful man in
1369	   the middle.  If this is the first call, the MitM was there on the
1370	   first call, which is impressive enough.  If it happens in a later
1371	   call, it means the MitM must also know your cached shared
1372	   secret, because you could not have carried out any voice traffic at
1373	   all unless the session key was correctly computed and is also known
1374	   to the attacker.  This implies the MitM must have been present in all
1375	   the previous sessions, since the initial establishment of the first
1376	   shared secret.  This is indeed a resourceful attacker.  It also means
1377	   that if at any time he ceases his participation as a MitM on one of
1378	   your calls, the protocol will detect that the cached shared secret is
1379	   no longer valid, because it was really two different shared secrets
1380	   all along, one of them between Alice and the attacker, and the other
1381	   between the attacker and Bob.  The continuity of the cached shared
1382	   secrets makes it possible for us to detect the MitM when he inserts
1383	   himself into the ongoing relationship, as well as when he leaves.
1384	   Also, if the attacker tries to stay with a long lineage of calls, but
1385	   fails to execute a DH MitM attack for even one missed call, he is
1386	   permanently excluded.  He can no longer resynchronize with the chain
1387	   of cached shared secrets.

1389	   Some sort of user interface element (maybe a checkbox) is needed to
1390	   allow the user to tell the software the SAS verify was successful,
1391	   causing the software to set the "SAS verified" flag, which (together
1392	   with our cached shared secret) obviates the need to perform the SAS
1393	   procedure in the next call.  An additional user interface element can
1394	   be provided to let the user tell the software he detected an actual
1395	   SAS mismatch, which indicates a MitM attack.  The software can then
1396	   take appropriate action, clearing the "SAS verified" flags and
1397	   erasing the cached shared secret from this session.  It is up to the
1398	   implementer to decide if this added user interface complexity is
1399	   warranted.

1401	   If the SAS matches, it means there is no MitM, which also implies it
1402	   is now safe to trust a cached shared secret for later calls.  If
1403	   inattentive users don't bother to check the SAS, it means we don't
1404	   know whether there is or is not a MitM, so even if we do establish a
1405	   new cached shared secret, there is a risk that our potential attacker
1406	   may have a subsequent opportunity to continue inserting himself in
1407	   the call, until we finally get around to checking the SAS.  If the
1408	   SAS matches, it means no attacker was present for any previous
1409	   session since we started propagating cached shared secrets, because
1410	   this session and all the previous sessions were also authenticated
1411	   with a continuous lineage of shared secrets.

1413	7.  IANA Considerations

1415	   If an IANA registry for RTP extension headers were defined, then the
1416	   value 0x505A would be reserved for ZRTP.

1418	8.  Security Considerations

1420	   This document is all about securely keying SRTP sessions.  As such,
1421	   security is discussed in every section.  The next version of this
1422	   draft will have a summary of those security properties discussed
1423	   throughout the document.

1425	9.  Acknowledgments

1427	   The authors would like to thank Bryce Wilcox for his contributions to
1428	   the design of this protocol, and to thank Jon Callas, Jon Peterson,
1429	   Colin Plumb, and Hal Finney for their helpful comments and
1430	   suggestions.

1432	10.  Appendix - ZRTP, SIP, and SDP

1434	   This section discusses how ZRTP, SIP, and SDP work together.

1436	   SIP UAs which support this specification would include the to-be-
1437	   defined SDP attribute a=zrtp in their SDP offers and answers.  The
1438	   presence of this attribute is a hint to another UA that ZRTP is
1439	   supported.  If a UA supports both ZRTP and another approach to
1440	   negotiate an SRTP secret such as [14] or [13], then the presence of
1441	   the a=zrtp attribute is critical.  If both UAs support ZRTP, they
1442	   will first try ZRTP before attempting SRTP.  If only one endpoint
1443	   supports ZRTP but both support SRTP, then the other method will be
1444	   used instead.

1446	   Note that ZRTP may be implemented without coupling with the SIP
1447	   signaling.  For example, ZRTP can be implemented as a "bump in the
1448	   wire" or as a "bump in the stack" in which RTP sent by the SIP UA is
1449	   converted to ZRTP.  In these cases, the SIP UA will have no knowledge
1450	   of ZRTP and will not include the a=zrtp attribute.  As a result, even
1451	   if the other UA does not indicate support for ZRTP, a ZRTP endpoint
1452	   SHOULD still send Hello messages.

1454	11.  References

1456	11.1.  Normative References

1458	   [1]  Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson,
1459	        "RTP: A Transport Protocol for Real-Time Applications", STD 64,
1460	        RFC 3550, July 2003.

1462	   [2]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
1463	        Norrman, "The Secure Real-time Transport Protocol (SRTP)",
1464	        RFC 3711, March 2004.

1466	   [3]  Kivinen, T. and M. Kojo, "More Modular Exponential (MODP)
1467	        Diffie-Hellman groups for Internet Key Exchange (IKE)",
1468	        RFC 3526, May 2003.

1470	   [4]  Ferguson, N. and B. Schneier, "Practical Cryptography", Wiley
1471	        Publishing 2003.

1473	   [5]  Barker, E. and J. Kelsey, "Recommendation for Random Number
1474	        Generation Using Deterministic Random Bit Generators", NIST
1475	        Special Publication 800-90 DRAFT (December 2005).

1477	   [6]  O'Whielacronx, Z., "human-oriented base-32 encoding", http://
1478	        cvs.sourceforge.net/viewcvs.py/libbase32/libbase32/
1479	        DESIGN?rev=HEAD .

1481	11.2.  Informative References

1483	   [7]   Zimmermann, P., "PGPfone",
1484	         http://www.pgpi.org/products/pgpfone/ .

1486	   [8]   Zimmermann, P., "Zfone", http://www.philzimmermann.com/zfone .

1488	   [9]   Blossom, E., "The VP1 Protocol for Voice Privacy Devices
1489	         Version 1.2", http://www.comsec.com/vp1-protocol.pdf .

1491	   [10]  "CryptoPhone", http://www.cryptophone.de/ .

1493	   [11]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
1494	         Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP:
1495	         Session Initiation Protocol", RFC 3261, June 2002.

1497	   [12]  Ylonen, T. and C. Lonvick, "The Secure Shell (SSH) Protocol
1498	         Architecture", RFC 4251, January 2006.

1500	   [13]  Andreasen, F., "Session Description Protocol Security
1501	         Descriptions for Media Streams",
1502	         draft-ietf-mmusic-sdescriptions-12 (work in progress),
1503	         September 2005.

1505	   [14]  Arkko, J., "Key Management Extensions for Session Description
1506	         Protocol (SDP) and Real  Time Streaming Protocol (RTSP)",
1507	         draft-ietf-mmusic-kmgmt-ext-15 (work in progress), June 2005.

1509	   [15]  Arkko, J., Carrara, E., Lindholm, F., Naslund, M., and K.
1510	         Norrman, "MIKEY: Multimedia Internet KEYing", RFC 3830,
1511	         August 2004.

1513	   [16]  Handley, M. and V. Jacobson, "SDP: Session Description
1514	         Protocol", RFC 2327, April 1998.

1516	Authors' Addresses

1518	   Philip Zimmermann
1519	   Phil Zimmermann and Associates LLC

1521	   Email: prz@mit.edu

1523	   Alan Johnston (editor)
1524	   SIPStation
1525	   St. Louis, MO  63124

1527	   Email: alan@sipstation.com

1529	Intellectual Property Statement

1531	   The IETF takes no position regarding the validity or scope of any
1532	   Intellectual Property Rights or other rights that might be claimed to
1533	   pertain to the implementation or use of the technology described in
1534	   this document or the extent to which any license under such rights
1535	   might or might not be available; nor does it represent that it has
1536	   made any independent effort to identify any such rights.  Information
1537	   on the procedures with respect to rights in RFC documents can be
1538	   found in BCP 78 and BCP 79.

1540	   Copies of IPR disclosures made to the IETF Secretariat and any
1541	   assurances of licenses to be made available, or the result of an
1542	   attempt made to obtain a general license or permission for the use of
1543	   such proprietary rights by implementers or users of this
1544	   specification can be obtained from the IETF on-line IPR repository at
1545	   http://www.ietf.org/ipr.

1547	   The IETF invites any interested party to bring to its attention any
1548	   copyrights, patents or patent applications, or other proprietary
1549	   rights that may cover technology that may be required to implement
1550	   this standard.  Please address the information to the IETF at
1551	   ietf-ipr@ietf.org.

1553	Disclaimer of Validity

1555	   This document and the information contained herein are provided on an
1556	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
1557	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
1558	   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
1559	   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1560	   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1561	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1563	Copyright Statement

1565	   Copyright (C) The Internet Society (2006).  This document is subject
1566	   to the rights, licenses and restrictions contained in BCP 78, and
1567	   except as set forth therein, the authors retain all their rights.

1569	Acknowledgment

1571	   Funding for the RFC Editor function is currently provided by the
1572	   Internet Society.