idnits 2.17.1 

draft-zimmermann-avt-zrtp-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 18.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 2228.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2239.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2246.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2252.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 4 instances of too long lines in the document, the longest one
     being 5 characters in excess of 72.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords. 

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (October 22, 2006) is 6389 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  -- Looks like a reference, but probably isn't: 'SHA-256' on line 1137

  == Unused Reference: '5' is defined on line 2130, but no explicit reference
     was found in the text

  == Outdated reference: A later version (-01) exists of
     draft-mcgrew-srtp-big-aes-00

  ** Obsolete normative reference: RFC 3309 (ref. '6') (Obsoleted by RFC 4960)

  ** Obsolete normative reference: RFC 4566 (ref. '11') (Obsoleted by RFC
     8866)

  == Outdated reference: A later version (-02) exists of
     draft-wing-rtpsec-keying-eval-01

  -- Obsolete informational reference (is this intentional?): RFC 4474 (ref.
     '23') (Obsoleted by RFC 8224)


     Summary: 6 errors (**), 0 flaws (~~), 5 warnings (==), 9 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	AVT WG                                                     P. Zimmermann
3	Internet-Draft                                             Zfone Project
4	Intended status: Informational                          A. Johnston, Ed.
5	Expires: April 25, 2007                                            Avaya
6	                                                               J. Callas
7	                                                         PGP Corporation
8	                                                        October 22, 2006

10	   ZRTP: Extensions to RTP for Diffie-Hellman Key Agreement for SRTP
11	                      draft-zimmermann-avt-zrtp-02

13	Status of this Memo

15	   By submitting this Internet-Draft, each author represents that any
16	   applicable patent or other IPR claims of which he or she is aware
17	   have been or will be disclosed, and any of which he or she becomes
18	   aware will be disclosed, in accordance with Section 6 of BCP 79.

20	   Internet-Drafts are working documents of the Internet Engineering
21	   Task Force (IETF), its areas, and its working groups.  Note that
22	   other groups may also distribute working documents as Internet-
23	   Drafts.

25	   Internet-Drafts are draft documents valid for a maximum of six months
26	   and may be updated, replaced, or obsoleted by other documents at any
27	   time.  It is inappropriate to use Internet-Drafts as reference
28	   material or to cite them other than as "work in progress."

30	   The list of current Internet-Drafts can be accessed at
31	   http://www.ietf.org/ietf/1id-abstracts.txt.

33	   The list of Internet-Draft Shadow Directories can be accessed at
34	   http://www.ietf.org/shadow.html.

36	   This Internet-Draft will expire on April 25, 2007.

38	Copyright Notice

40	   Copyright (C) The Internet Society (2006).

42	Abstract

44	   This document defines ZRTP, RTP (Real-time Transport Protocol) header
45	   extensions for a Diffie-Hellman exchange to agree on a session key
46	   and parameters for establishing Secure RTP (SRTP) sessions.  The ZRTP
47	   protocol is completely self-contained in RTP and does not require
48	   support in the signaling protocol or assume a Public Key
49	   Infrastructure (PKI).  For the media session, ZRTP
50	   provides confidentiality, protection against Man in the Middle (MitM)
51	   attacks, and, in cases where a secret is available from the signaling
52	   protocol, authentication.  ZRTP can utilize three Session Description
53	   Protocol (SDP) attributes to provide discovery and authentication
54	   through the signaling channel.

56	Table of Contents

58	   1.  Introduction
59	   2.  Terminology
60	   3.  ZRTP and RTP Keying Requirements
61	   4.  Overview
62	     4.1.  Key Agreement Modes
63	       4.1.1.  Diffie-Hellman Mode
64	       4.1.2.  Multistream Mode
65	   5.  Protocol Description
66	     5.1.  Key Agreement and Derivation Algorithm
67	       5.1.1.  Discovery
68	       5.1.2.  Hash Commitment
69	       5.1.3.  Diffie-Hellman Exchange
70	       5.1.4.  Confirmation and Switch to SRTP
71	     5.2.  Multistream Mode
72	     5.3.  Random Number Generation
73	     5.4.  CRC Protection of Messages
74	     5.5.  ZID and Cache Operation
75	     5.6.  Terminating an SRTP Session or ZRTP Exchange
76	   6.  RTP Header Extension
77	     6.1.  ZRTP Message Formats
78	       6.1.1.  Message Type Block
79	       6.1.2.  Hash Type Block
80	       6.1.3.  Cipher Type Block
81	       6.1.4.  Auth Tag Length Block
82	       6.1.5.  Key Agreement Type Block
83	       6.1.6.  SAS Type Block
84	     6.2.  Hello message
85	     6.3.  HelloACK message
86	     6.4.  Commit message
87	     6.5.  DHPart1 message
88	     6.6.  DHPart2 message
89	     6.7.  Confirm1 message
90	     6.8.  Confirm2 message
91	     6.9.  Conf2ACK message
92	     6.10. GoClear message
93	     6.11. ClearACK message
94	   7.  Retransmissions
95	   8.  Short Authentication String
96	   9.  IANA Considerations
97	   10. Security Considerations
98	   11. Acknowledgments
99	   12. Appendix A - ZRTP, SIP, and SDP
100	   13. Appendix B - The ZRTP Disclosure flag
101	   14. Appendix C - Intermediary ZRTP Devices
102	   15. References
103	     15.1. Normative References
104	     15.2. Informative References
105	   Authors' Addresses
106	   Intellectual Property and Copyright Statements

108	1.  Introduction

110	   ZRTP is a key agreement protocol which performs Diffie-Hellman key
111	   exchange during call setup in-band in the Real-time Transport
112	   Protocol (RTP) [2] media stream which has been established using a
113	   signaling protocol such as Session Initiation Protocol (SIP) [17].
114	   This generates a shared secret which is then used to generate keys
115	   and salt for a Secure RTP (SRTP) [3] session.  ZRTP borrows ideas
116	   from PGPfone [13].  A reference implementation of ZRTP is available
117	   as Zfone [14].

119	   The ZRTP protocol has some nice cryptographic features lacking in
120	   many other approaches to media session encryption.  Although it uses
121	   a public key algorithm, it does not rely on a public key
122	   infrastructure (PKI).  In fact, it does not use persistent public
123	   keys at all.  It uses ephemeral Diffie-Hellman (DH) with hash
124	   commitment, and allows the detection of Man in the Middle (MitM)
125	   attacks by displaying a short authentication string for the users to
126	   read and compare over the phone.  It has perfect forward secrecy,
127	   meaning the keys are destroyed at the end of the call, which
128	   precludes retroactively compromising the call by future disclosures
129	   of key material.  But even if the users are too lazy to bother with
130	   short authentication strings, we still get fairly decent
131	   authentication against a MitM attack, based on a form of key
132	   continuity.  It does this by caching some key material to use in the
133	   next call, to be mixed in with the next call's DH shared secret,
134	   giving it key continuity properties analogous to SSH.  All this is
135	   done without reliance on a PKI, key certification, trust models,
136	   certificate authorities, or key management complexity that bedevils
137	   the email encryption world.  It also does not rely on SIP signaling
138	   for the key management, and in fact does not rely on any servers at
139	   all.  It performs its key agreements and key management in a purely
140	   peer-to-peer manner over the RTP packet stream.

142	   Most secure phones rely on a Diffie-Hellman exchange to agree on a
143	   common session key.  But since DH is susceptible to a man-in-the-
144	   middle (MitM) attack, it is common practice to provide a way to
145	   authenticate the DH exchange.  In some military systems, this is done
146	   by depending on digital signatures backed by a centrally-managed PKI.
147	   A decade of industry experience has shown that deploying centrally
148	   managed PKIs can be a painful and often futile experience.  PKIs are
149	   just too messy, and require too much activation energy to get them
150	   started.  Setting up a PKI requires somebody to run it, which is not
151	   practical for an equipment provider.  A service provider like a
152	   carrier might venture down this path, but even then you have to deal
153	   with cross-carrier authentication, certificate revocation lists, and
154	   other complexities.  It is much simpler to avoid PKIs altogether,
155	   especially when developing secure commercial products.  It is
156	   therefore more common for commercial secure phones in the PSTN world
157	   to augment the DH exchange with a Short Authentication String (SAS)
158	   combined with a hash commitment at the start of the key exchange, to
159	   shorten the length of SAS material that must be read aloud.  No PKI
160	   is required for this approach to authenticating the DH exchange.  The
161	   AT&T 3600, Eric Blossom's COMSEC secure phones [15], PGPfone [13],
162	   and CryptoPhone [16] are all examples of products that took this
163	   simpler lightweight approach.

165	   The main problem with this approach is inattentive users who may not
166	   execute the voice authentication procedure, or unattended secure
167	   phone calls to answering machines that cannot execute it.
168	   Additionally, some people worry about voice spoofing (the "Rich
169	   Little" attack), and some worry about trying to use it between people
170	   who don't know each other's voices.  This is not as much of a problem
171	   as it seems, because it isn't necessary that they recognize each
172	   other by their voice, it's only necessary that they detect that the
173	   voice used for the SAS procedure matches the voice in the rest of the
174	   phone call.  These concerns are not enough reason to embrace PKIs as
175	   an alternative, in my opinion.

177	   A popular and field-proven approach is used by SSH (Secure Shell)
178	   [18], which Peter Gutmann likes to call the "baby duck" security
179	   model.  SSH establishes a relationship by exchanging public keys in
180	   the initial session, when we assume no attacker is present, and this
181	   makes it possible to authenticate all subsequent sessions.  A
182	   successful MitM attacker has to have been present in all sessions all
183	   the way back to the first one, which is assumed to be difficult for
184	   the attacker.  All this is accomplished without resorting to a
185	   centrally-managed PKI.

187	   We use an analogous baby duck security model to authenticate the DH
188	   exchange in ZRTP.  We don't need to exchange persistent public keys,
189	   we can simply cache a shared secret and re-use it to authenticate a
190	   long series of DH exchanges for secure phone calls over a long period
191	   of time.  If we read aloud just one SAS, and then cache a shared
192	   secret for later calls to use for authentication, no new voice
193	   authentication rituals need to be executed.  We just have to remember
194	   we did one already.

196	   If we ever lose this cached shared secret, it is no longer available
197	   for authentication of DH exchanges, so we would have to do a new SAS
198	   procedure and start over with a new cached shared secret.  Then we
199	   could go back to omitting the voice authentication on later calls.

201	   A particularly compelling reason why this approach is attractive is
202	   that SAS is easiest to implement when a GUI or some sort of display
203	   is available, which raises the question of what to do when no display
204	   is available.  We envision some products that implement secure VoIP
205	   via a local network proxy, which lacks a display in many cases.  If
206	   we take an approach that greatly reduces the need for a SAS in each
207	   and every call, we can operate in GUI-less products with greater
208	   ease.

210	   It's a good idea to force your opponent to have to solve multiple
211	   problems in order to mount a successful attack.  Some examples of
212	   widely differing problems we might like to present him with are:
213	   Stealing a shared secret from one of the parties, being present on
214	   the very first session and every subsequent session to carry out an
215	   active MitM attack, and solving the discrete log problem.  We want to
216	   force the opponent to solve more than one of these problems to
217	   succeed.

219	   The protocol can make use of different kinds of shared secrets.  Each
220	   type of shared secret is determined by a different method.  All of
221	   the shared secrets are hashed together to form a session key to
222	   encrypt the call.  An attacker must defeat all of the methods in
223	   order to determine the session key.

225	   First, there is the shared secret determined entirely by a Diffie-
226	   Hellman key agreement.  It changes with every call, based on random
227	   numbers.  An attacker may attempt a classic DH MitM attack on this
228	   secret, but we can protect against this by displaying and reading
229	   aloud a SAS, combined with adding a hash commitment at the beginning
230	   of the DH exchange.

232	   Second, there is an evolving shared secret, or ongoing shared secret
233	   that is automatically changed and refreshed and cached with every new
234	   session.  We will call this the cached shared secret, or sometimes
235	   the retained shared secret.  Each new image of this ongoing secret is
236	   a non-invertible function of its previous value and the new secret
237	   derived by the new DH agreement.  It's possible that no cached shared
238	   secret is available, because there were no previous sessions to
239	   inherit this value from, or because one side loses its cache.

241	   There are other approaches for key agreement for SRTP that compute a
242	   shared secret using information in the signaling.  For example, [20]
243	   describes how to carry a MIKEY (Multimedia Internet KEYing) [21]
244	   payload in SDP [11].  Or [19] describes directly carrying SRTP keying
245	   and configuration information in SDP.  ZRTP does not rely on the
246	   signaling to compute a shared secret, but if a client does produce a
247	   shared secret via the signaling, and makes it available to the ZRTP
248	   protocol, ZRTP can make use of this shared secret to augment the list
249	   of shared secrets that will be hashed together to form a session key.
250	   This way, any security weaknesses that might compromise the shared
251	   secret contributed by the signaling will not harm the final resulting
252	   session key.

254	   There may also be a static shared secret that the two parties agree
255	   on out-of-band in advance.  A hashed passphrase would suffice.

257	   The shared secret provided by the signaling (if available), the
258	   shared secret computed by DH, and the cached shared secret are all
259	   hashed together to compute the session key for a call.  If the cached
260	   shared secret is not available, it is omitted from the hash
261	   computation.  If the signaling provides no shared secret, it is also
262	   omitted from the hash computation.
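
The combination rule above (hash together whichever secrets exist, and
omit the ones that do not) might look like the sketch below.  The
function name session_key, the choice of SHA-256, the ordering of the
inputs, and the length prefixes are assumptions for illustration; the
normative computation is given in Section 5.1.

```python
import hashlib
from typing import Optional


def session_key(dh_secret: bytes,
                cached_secret: Optional[bytes] = None,
                signaling_secret: Optional[bytes] = None) -> bytes:
    """Hash the available shared secrets into one session key (sketch)."""
    h = hashlib.sha256()
    for secret in (signaling_secret, dh_secret, cached_secret):
        if secret is None:
            continue  # an unavailable secret is left out of the hash
        h.update(len(secret).to_bytes(4, "big"))
        h.update(secret)
    return h.digest()


# Both parties hold the cached secret; a MitM holds only a DH result.
party_key = session_key(b"dh result", cached_secret=b"cached value")
mitm_key = session_key(b"dh result")
```

Here party_key and mitm_key differ even though the DH input is the
same, so a party without the cached secret cannot arrive at the
session key.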

264	   No DH MitM attack can succeed if the ongoing shared secret is
265	   available to the two parties, but not to the attacker.  This is
266	   because the attacker cannot compute a common session key with either
267	   party without knowing the cached secret component, even if he
268	   correctly executes a classic DH MitM attack.  Mixing in the cached
269	   shared secret for the session key calculation allows it to act as an
270	   implicit authenticator to protect the DH exchange, without requiring
271	   additional explicit HMACs to be computed on the DH parameters.  If
272	   the cached shared secret is available, a MitM attack would be
273	   instantly detected by the failure to achieve a shared session key,
274	   resulting in undecryptable packets.  The protocol can easily detect
275	   this.  It would be more accurate to say that the MitM attack is not
276	   merely detected, but thwarted.

278	   When adding the complexity of additional shared secrets beyond the
279	   familiar DH key agreement, we must make sure the lack of availability
280	   of the cached shared secret cannot prevent a call from going through,
281	   and we must also prevent false alarms that claim an attack was
282	   detected.

284	   An added benefit of using these cached shared secrets to mix in with
285	   the session keys is that it augments the entropy of the session key.
286	   Even if limits on the size of the DH exchange produce a session key
287	   with less than 256 bits of real work factor, the added entropy from
288	   the cached shared secret can bring up all the subsequent session keys
289	   to the full 256-bit AES key strength, assuming no attacker was
290	   present in the first call.

292	   We could have authenticated the DH exchange the same way SSH does it,
293	   with digital signatures, caching public keys instead of shared
294	   secrets.  But this approach with caching shared secrets seemed a bit
295	   simpler, and has the added benefit of adding more entropy to the
296	   session keys.

298	   The following sections provide an overview of the ZRTP protocol and
299	   describe the key agreement algorithm and the RTP header extensions.

301	2.  Terminology

303	   In this document, the key words "MUST", "MUST NOT", "REQUIRED",
304	   "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
305	   and "OPTIONAL" are to be interpreted as described in RFC 2119 and
306	   indicate requirement levels for compliant implementations [1].

308	3.  ZRTP and RTP Keying Requirements

310	   This section discusses how ZRTP meets the RTP keying requirements
311	   described in [12].  The section numbers referenced are those in that
312	   document.

314	   Due to the in-band key management approach, ZRTP meets the following
315	   requirements: 4.1 Secure Retargeting and Secure Forking and 4.2
316	   Clipping Media Before SDP Answer.

318	   Due to the built-in in-band discovery mechanisms, ZRTP meets the 5.3
319	   Best Effort Encryption requirement.

321	   The use of Diffie-Hellman ensures that ZRTP meets the 5.2 Perfect
322	   Forward Secrecy requirement.

324	   Since the supported SRTP algorithms are not exchanged in the
325	   signaling but in the media path, there is no computational penalty in
326	   allowing additional supported algorithms as described in 5.4
327	   Upgrading Algorithms.

329	   ZRTP does not require the SSRC or ROC be signaled per requirement 4.4
330	   SSRC and ROC.

332	   ZRTP does not currently use certificates for authentication so it
333	   does not meet requirement 5.1 Public Key Infrastructure.  However,
334	   ZRTP could be extended to utilize a certificate to perform a digital
335	   signature over the Diffie-Hellman values exchanged.

337	   ZRTP does not support 4.3 Centralized Keying due to its point-to-
338	   point design.

340	4.  Overview

342	   This section provides a description of how ZRTP works.  This
343	   description is non-normative in nature but is included to build
344	   understanding of the protocol.

346	   ZRTP is negotiated the same way a conventional RTP session is
347	   negotiated in an offer/answer exchange using the AVP/RTP profile.
348	   The ZRTP protocol begins after two endpoints have utilized a
349	   signaling protocol such as SIP and are ready to send or have already
350	   begun sending RTP packets.  This specification defines a new RTP
351	   extension header which is used to carry the ZRTP messages between the
352	   endpoints.  Since RTP endpoints ignore unknown extension headers, the
353	   protocol is fully backwards compatible - a ZRTP endpoint attempting
354	   to perform key agreement with a non-ZRTP endpoint will simply receive
355	   normal RTP responses and can then inform the user that a secure
356	   session is not possible and either continue with the insecure session
357	   or terminate the session depending on the user's security policy.

359	   The ZRTP exchange begins at the same time that the first RTP packets
360	   are exchanged between the endpoints.  A ZRTP message is transported
361	   in an RTP no-op packet.

363	   A ZRTP endpoint initiates the exchange by sending a ZRTP Hello
364	   message to the other endpoint.  The purpose of the Hello message is
365	   to discover if the other endpoint supports the protocol and to see
366	   what algorithms the two ZRTP endpoints have in common.  This
367	   discovery can also be achieved if the a=zrtp attribute is present in
368	   an SDP offer or answer, as described in Appendix A.

370	   The Hello message contains the SRTP configuration options and the
371	   ZID.  Each instance of ZRTP has a unique 96-bit random ZRTP ID or ZID
372	   that is generated once at installation time.  ZIDs are discovered
373	   during the Hello message exchange.  The received ZID is used to look
374	   up, in a local cache, the retained shared secrets from previous ZRTP
375	   sessions with that endpoint.
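
A ZID and the cache it indexes can be pictured with the sketch below.
The names new_zid and SecretCache are hypothetical, and keeping two
retained secrets per peer (rs1 as the newest, rs2 as the one before it)
is an assumption about how the cache rotates; Section 5.5 describes the
actual cache operation.

```python
import secrets
from typing import Dict, Optional, Tuple

ZID_LEN = 12  # 96 bits, as stated above

Retained = Tuple[Optional[bytes], Optional[bytes]]  # (rs1, rs2)


def new_zid() -> bytes:
    """Generate a random 96-bit ZID; done once, at installation time."""
    return secrets.token_bytes(ZID_LEN)


class SecretCache:
    """In-memory stand-in for the local cache keyed by the peer's ZID."""

    def __init__(self) -> None:
        self._by_zid: Dict[bytes, Retained] = {}

    def lookup(self, peer_zid: bytes) -> Retained:
        # An unknown ZID simply yields no retained secrets.
        return self._by_zid.get(peer_zid, (None, None))

    def remember(self, peer_zid: bytes, new_secret: bytes) -> None:
        # Assumed rotation: the newest secret becomes rs1, old rs1 becomes rs2.
        rs1, _ = self.lookup(peer_zid)
        self._by_zid[peer_zid] = (new_secret, rs1)
```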

378	   A response to a ZRTP Hello message is a ZRTP HelloACK message.  The
379	   HelloACK message simply acknowledges receipt of the Hello message and
380	   indicates support for the ZRTP protocol.  Since RTP uses best effort
381	   UDP transport, ZRTP has retransmission timers in case of lost
382	   datagrams.  There are two timers, both with exponential backoff
383	   mechanisms.  One timer is used for retransmissions of Hello messages
384	   and the other is used for retransmissions of all other messages after
385	   receipt of a HelloACK which indicates support of ZRTP by the other
386	   endpoint.
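
As a rough illustration of a retransmission timer with exponential
backoff, the sketch below doubles the interval after every send until
it reaches a cap.  The helper name backoff_schedule and the numbers
used in the example call are placeholders chosen for the illustration;
the actual timer values are specified in Section 7.

```python
from typing import List


def backoff_schedule(initial_ms: int, cap_ms: int, retries: int) -> List[int]:
    """Return the wait before each retransmission, doubling up to a cap."""
    delays = []
    delay = initial_ms
    for _ in range(retries):
        delays.append(delay)
        delay = min(delay * 2, cap_ms)  # exponential growth, then hold
    return delays


# Placeholder values, not taken from this document.
hello_delays = backoff_schedule(initial_ms=50, cap_ms=200, retries=5)
```

With these placeholder inputs the intervals are 50, 100, 200, 200, and
200 milliseconds.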

388	4.1.  Key Agreement Modes

390	   After both endpoints exchange Hello and HelloACK messages, the key
391	   agreement exchange can begin with the ZRTP Commit message.  ZRTP
392	   supports a number of key agreement modes including both Diffie-
393	   Hellman and non-Diffie-Hellman as described in the following
394	   sections.

396	4.1.1.  Diffie-Hellman Mode

398	   An example ZRTP call flow is shown in Figure 1 below.  Note that the
399	   order of the Hello/HelloACK exchanges in F1/F2 and F3/F4 may be
400	   reversed.  That is, either Alice or Bob might send the first Hello
401	   message.  Also, an endpoint that receives a Hello message and wishes
402	   to immediately begin the ZRTP key agreement can omit the HelloACK and
403	   send the Commit instead.  In Figure 1, this would result in messages
404	   F2, F3, and F4 being omitted.  Note that the endpoint which sends the
405	   Commit message is considered the initiator of the ZRTP session and
406	   drives the key agreement exchange.  The Diffie-Hellman public values
407	   are exchanged in the DHPart1 and DHPart2 messages.  SRTP keys and
408	   salts are then calculated along with a ZRTP Session key.

410	   Alice                                      Bob
411	     |                                         |
412	     | Alice and Bob establish a media session.|
413	     |                                         |
414	     |                   RTP                   |
415	     |<=======================================>|
416	     |                                         |
417	     | Hello (version, options, Alice's ZID) F1|
418	     |---------------------------------------->|
419	     |                             HelloACK F2 |
420	     |<----------------------------------------|
421	     | Hello (version, options, Bob's ZID) F3  |
422	     |<----------------------------------------|
423	     | HelloACK F4                             |
424	     |---------------------------------------->|
425	     |                                         |
426	     |        Bob acts as the initiator        |
427	     |                                         |
428	     | Commit (Bob's ZID, options, hvi or nonce) F5
429	     |<----------------------------------------|
430	     | DHPart1 (pvr, shared secret hashes) F6  |
431	     |---------------------------------------->|
432	     | DHPart2 (pvi, shared secret hashes) F7  |
433	     |<----------------------------------------|
434	     |                                         |
435	     | Alice and Bob generate SRTP session key.|
436	     |                                         |
437	     |               SRTP begins               |
438	     |<=======================================>|
439	     |                                         |
440	     | Confirm1 (plaintext, D,S,V flags, hmac) F8
441	     |---------------------------------------->|
442	     | Confirm2 (plaintext, D,S,V flags, hmac) F9
443	     |<----------------------------------------|
444	     | Conf2ACK F10                            |
445	     |---------------------------------------->|

447	   Figure 1. Establishment of an SRTP session using ZRTP
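
The hash commitment in messages F5 through F7 can be illustrated with
a short sketch.  The initiator sends hvi, a hash of its public value
pvi, before it has seen the responder's pvr, and reveals pvi only in
DHPart2; the responder then checks pvi against hvi.  Hashing pvi alone
with SHA-256, and the function names commit_to and check_commitment,
are assumptions made for this example; the exact input to the hash is
defined in Section 5.1.2.

```python
import hashlib
import hmac


def commit_to(pvi: bytes) -> bytes:
    """Initiator: compute the commitment carried in the Commit message."""
    return hashlib.sha256(pvi).digest()


def check_commitment(hvi: bytes, revealed_pvi: bytes) -> bool:
    """Responder: verify that the pvi from DHPart2 matches the earlier hvi."""
    expected = hashlib.sha256(revealed_pvi).digest()
    return hmac.compare_digest(hvi, expected)  # constant-time comparison


# Placeholder bytes stand in for a real DH public value.
pvi = bytes(range(32))
hvi = commit_to(pvi)                      # sent in F5
accepted = check_commitment(hvi, pvi)     # checked after F7
```

Since the initiator is bound to pvi before pvr is known, it cannot
pick its public value after the fact to steer the resulting SAS.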

449	4.1.2.  Multistream Mode

451	   Multistream mode is an alternative key agreement method when two
452	   endpoints have an established SRTP media stream between them and
453	   hence an active ZRTP Session key.  ZRTP can derive multiple SRTP keys
454	   from a single DH exchange.  For example, an established secure voice
455	   call that adds a video stream could use Multistream mode to quickly
456	   initiate the video stream without a second DH exchange.

458	   When Multistream mode is indicated in the Commit message, a call flow
459	   similar to Figure 1 is used, but no DH calculation is performed by
460	   either endpoint and the DHPart1 and DHPart2 messages are omitted.  In
461	   this mode, multiple non-DH ZRTP exchanges can be performed in
462	   parallel between two endpoints.

464	   Alternatively, each stream can be handled independently using the
465	   call flow of Figure 1, resulting in a DH exchange per media stream.
466	   To keep the integrity of the retained shared secrets, only a single
467	   DH exchange can be processed at a time between two endpoints.

469	5.  Protocol Description

471	   ZRTP uses RTP [2] to transport discovery and key agreement messages.
472	   The messages are carried as RTP header extensions as defined in
473	   Section 6.  It is RECOMMENDED to use the no-op RTP/AVP payload type
474	   [7].  No-op packets are ideal for ZRTP transport as it is permissible
475	   to send no-op packets even for media streams marked 'recvonly' or
476	   'inactive'.  Also, no-op packets can be used with any media type.  An
477	   endpoint MAY use a different SSRC for ZRTP messages than for RTP
478	   media.

480	   Note: the use of separate SSRC numbers and hence separate sequence
481	   number space allows for very loose coupling between the ZRTP
482	   application and the RTP media application.

484	   To support best effort encryption [12], ZRTP uses normal RTP/AVP
485	   profile (AVP) media lines in the initial offer/answer exchange.  The
486	   ZRTP SDP attribute flag a=zrtp defined in Appendix A SHOULD be used
487	   in all offers and answers to indicate support for the ZRTP protocol.
488	   In subsequent offer/answer exchanges after a successful ZRTP exchange
489	   has resulted in an SRTP session, the Secure RTP/AVP (SAVP) profile
490	   MAY be used.

492	5.1.  Key Agreement and Derivation Algorithm

494	   The key agreement algorithm has four phases that are described
495	   normatively in the following sections.

497	5.1.1.  Discovery

499	   During the discovery phase, a ZRTP endpoint discovers if the other
500	   endpoint supports ZRTP and which ZRTP version, hash, cipher, auth tag
501	   length, key agreement type, and SAS algorithms are supported.  In
502	   addition, each endpoint sends and discovers ZIDs.  The received ZID
503	   is used to retrieve previous retained shared secrets, rs1 and rs2.
504	   If the endpoint has other secrets, then they are also collected.  The
505	   signaling secret (sigs) is passed from the signaling protocol used
506	   to establish the RTP session.  For SIP, it is the dialog identifier
507	   of a Secure SIP (sips) session: a string composed of Call-ID, to tag,
508	   and from tag.  From the definitions in RFC 3261 [17]:

510	   sigs = hash(call-id | tag1 | tag2)

512	   Note: the dialog identifier of a non-secure SIP session should not be
513	   considered a signaling secret as it has no confidentiality
514	   protection.

516	   For the SRTP secret (srtps), it is the SRTP master key and salt.
517	   This information may have been passed in the signaling using MIKEY or
518	   SDP Security Descriptions, for example:

520	   srtps = hash(SRTP master key | SRTP master salt)

522	   Additional shared secrets can be defined and used as other_secret.
523	   If no secret of a given type is available, a random value is
524	   generated and used for that secret to ensure a mismatch in the hash
525	   comparisons in the DHPart1 and DHPart2 messages.  This prevents an
526	   eavesdropper from knowing how many shared secrets are available
527	   between the endpoints.
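
The secret collection described above can be sketched as follows. This is an illustration only: SHA-256 is assumed as the hash, "|" is taken to mean plain byte concatenation, and the helper names and input values are hypothetical placeholders rather than part of the protocol.

```python
import hashlib
import secrets

def hash_concat(*parts: bytes) -> bytes:
    """SHA-256 over the plain concatenation of the given byte strings."""
    h = hashlib.sha256()
    for part in parts:
        h.update(part)
    return h.digest()

def secret_or_random(secret, length=32):
    """Return the secret, or fresh random bytes when none is available."""
    return secret if secret is not None else secrets.token_bytes(length)

# Placeholder inputs (hypothetical values, not taken from a real dialog).
call_id, tag1, tag2 = b"a84b4c76e66710", b"1928301774", b"a6c85cf"
sigs = hash_concat(call_id, tag1, tag2)
srtps = hash_concat(bytes(16), bytes(14))  # all-zero key and salt stand-ins
other_secret = secret_or_random(None)      # no such secret: random filler
```

Because the filler is random on each side, its keyed hash will not match the peer's, which is the mismatch the text relies on.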

529	   A Hello message can be sent at any time, but is usually sent at the
530	   start of an RTP session to determine if the other endpoint supports
531	   ZRTP, and also if the SRTP implementations are compatible.  A Hello
532	   message is retransmitted using timer T1 and an exponential backoff
533	   mechanism detailed in Section 7 until the receipt of a HelloACK
534	   message or a Commit message.

536	5.1.2.  Hash Commitment

538	   The hash commitment is performed by the initiator of the ZRTP
539	   exchange.  From the intersection of the algorithms in the sent and
540	   received Hello messages, the initiator chooses a hash, cipher, auth
541	   tag length, key agreement type, and SAS algorithm to be used.

543	   A Diffie-Hellman mode is selected by setting the Key Agreement Type
544	   to DH4096 or DH3072 in the Commit.  In this mode, the key agreement
545	   begins with the initiator choosing a fresh random Diffie-Hellman (DH)
546	   secret value (svi) based on the chosen key agreement type value, and
547	   computing the public value.  (Note that to speed up processing, this
548	   computation can be done in advance.)  For guidance on generating
549	   random numbers, see the section on Random Number Generation.  The
550	   Diffie-Hellman secret value, svi, SHOULD be twice as long as the AES
551	   key length.  This means, if AES 128 is used, the DH secret value
552	   SHOULD be 256 bits long.  If AES 256 is used, the secret value SHOULD
553	   be 512 bits long.

555	   pvi = g^svi mod p

557	   where g and p are determined by the key agreement type value.  The
558	   initiator then computes a hash, hvi, over pvi and the sets of hash,
559	   cipher, atl, pkt, and sas types from the responder's Hello, using
560	   the chosen hash algorithm, in the following order:

562	   hvi=hash(pvi | hashr1-5 | cipherr1-5 | atlr1-5 | pktr1-5 | sasr1-5)

564	   The information from the responder's Hello message is included in the
565	   hash calculation to prevent a bid-down attack by modification of the
566	   responder's Hello message.
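
A minimal sketch of the pvi and hvi computation is given here. The modulus and generator are toy values chosen for readability and are NOT the DH3072 or DH4096 groups; SHA-256, the big-endian encoding of pvi, and the name responder_algos (the concatenated algorithm blocks from the responder's Hello) are assumptions of this sketch.

```python
import hashlib
from typing import Tuple

# Toy group for illustration only; NOT the DH3072 or DH4096 parameters.
P = 2**127 - 1  # a Mersenne prime
G = 3
P_LEN = (P.bit_length() + 7) // 8

def commit_values(svi: int, responder_algos: bytes) -> Tuple[int, bytes]:
    """Return (pvi, hvi) for the secret exponent svi."""
    pvi = pow(G, svi, P)  # pvi = g^svi mod p
    # hvi binds pvi to the algorithm lists taken from the responder's Hello.
    hvi = hashlib.sha256(pvi.to_bytes(P_LEN, "big") + responder_algos).digest()
    return pvi, hvi
```

Including the responder's algorithm lists in the hash means that a modified Hello yields a different hvi on the two sides.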

568	   Note: If both sides send Commit messages initiating a secure session
569	   at the same time, the Commit message with the lowest hvi value is
570	   discarded and the other side is the initiator.  This breaks the tie,
571	   allowing the protocol to proceed from this point with a clear
572	   definition of who is the initiator and who is the responder.
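
The tie-break can be sketched as a comparison of the two hvi values. Interpreting hvi as an unsigned big-endian integer is an assumption of this sketch, as is the function name.

```python
def commit_to_discard(local_hvi: bytes, remote_hvi: bytes) -> str:
    """Return which Commit is discarded: the one with the lowest hvi."""
    local = int.from_bytes(local_hvi, "big")
    remote = int.from_bytes(remote_hvi, "big")
    if local == remote:
        raise ValueError("identical hvi values; tie cannot be broken")
    return "local" if local < remote else "remote"
```

If "local" is returned, the local endpoint drops its own Commit and continues as the responder.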

574	   Because the DH exchange affects the state of the retained shared
575	   secret cache, only one in-process ZRTP DH exchange may occur at a
576	   time between two ZRTP endpoints.  Otherwise, race conditions and
577	   cache integrity problems will result.  When multiple media streams
578	   are established in parallel between the same pair of ZRTP endpoints
579	   (determined by the ZIDs in the Hello Messages), only one can be
580	   processed.  Once that exchange completes with Confirm2 and Conf2ACK
581	   messages, another ZRTP DH exchange can begin.  In the event that
582	   Commit messages are sent by both ZRTP endpoints at the same time, but
583	   are received in different media streams, the same resolution rules
584	   apply - the Commit message with the lowest hvi value is discarded and
585	   the other side is the initiator.  The media stream in which the
586	   Commit was sent will proceed through the ZRTP exchange while the
587	   media stream with the discarded Commit must wait for the completion
588	   of the other ZRTP exchange.

590	   Note: This paragraph does not apply when Multistream mode key
591	   agreement is used since the cached shared secrets are not affected.

593	5.1.3.  Diffie-Hellman Exchange

595	   The purpose of the Diffie-Hellman exchange is for the two ZRTP
596	   endpoints to generate a new shared secret, s0.  In addition, the
597	   endpoints discover if they have any shared secrets in common.  If
598	   they do, this exchange allows them to discover how many and agree on
599	   an ordering for them: s1, s2, etc.

601	5.1.3.1.  Responder Behavior

603	   Upon receipt of the Commit message, the responder generates its own
604	   fresh random DH secret value, svr, and computes the public value.
605	   (Note that to speed up processing, this computation can be done in
606	   advance.)  For guidance on random number generation, see the section
607	   on Random Number Generation.  The Diffie-Hellman secret value, svr,
608	   SHOULD be twice as long as the AES key length.  This means, if AES
609	   128 is used, the DH secret value SHOULD be 256 bits long.  If AES 256
610	   is used, the secret value SHOULD be 512 bits long.

612	   pvr = g^svr mod p

614	   The final shared secret, s0, is calculated by hashing the
615	   concatenation of the Diffie-Hellman shared secret (DHSS) followed by
616	   the (possibly empty) set of shared secrets that are actually shared
617	   between the initiator and responder.  For computing the hash, the
618	   shared secrets are sorted by the order of the initiator's
619	   corresponding shared secret IDs.  The remainder of this section
620	   describes an algorithm to accomplish this.

622	   First, an HMAC keyed hash is calculated using the first retained
623	   shared secret, rs1, as the key on the string "Responder" which
624	   generates a retained secret ID, rs1IDr, which is truncated to 64
625	   bits.  HMACs are calculated in a similar way for additional shared
626	   secrets:

628	   rs1IDr = HMAC(rs1, "Responder")

630	   rs2IDr = HMAC(rs2, "Responder")

632	   sigsIDr = HMAC(sigs, "Responder")

634	   srtpsIDr = HMAC(srtps, "Responder")

636	   other_secretIDr = HMAC(other_secret, "Responder")

638	   A ZRTP DHPart1 message is generated containing pvr and the set of
639	   keyed hashes (HMACs) derived from the possibly shared secrets.
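
The secret ID calculation can be sketched as below. HMAC-SHA256 and truncation to the leftmost 64 bits are assumptions; the text above states only that the result is truncated to 64 bits.

```python
import hashlib
import hmac

def secret_id(secret: bytes, role: str) -> bytes:
    """Keyed hash of the role string, truncated to 64 bits (8 octets)."""
    full = hmac.new(secret, role.encode("ascii"), hashlib.sha256).digest()
    return full[:8]  # assumption: keep the leftmost 64 bits

def responder_ids(secrets_by_name: dict) -> dict:
    """Compute an ID for each possibly shared secret, e.g. rs1 -> rs1IDr."""
    return {name + "IDr": secret_id(value, "Responder")
            for name, value in secrets_by_name.items()}
```

The role string keeps the responder's IDs distinct from the initiator's IDs for the same secret.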

641	   Upon receipt of the DHPart2 message, the responder checks that the
642	   initiator's public DH value is not equal to 1 or p-1.  An attacker
643	   might inject a false DHPart2 packet with a value of 1 or p-1 for
644	   g^svi mod p, which would cause a disastrously weak final DH result to
645	   be computed.  If pvi is 1 or p-1, the user should be alerted of the
646	   attack and the protocol exchange must be terminated.  Otherwise, the
647	   responder computes the hash of the public DH value in the DHPart2
648	   and compares it with the hash from the Commit.  If they differ, a
649	   MitM attack is taking place; the user is alerted and the protocol
650	   exchange is terminated.

652	   The responder then calculates the Diffie-Hellman result:

654	   DHResult = pvi^svr mod p

656	   The responder then calculates the Diffie-Hellman shared secret:

658	   DHSS = hash(DHResult)
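
The public value check and the DHResult and DHSS calculations can be sketched together. The modulus is a toy value, NOT one of the ZRTP groups, and SHA-256 with a fixed-length big-endian encoding of DHResult is an assumption of this sketch.

```python
import hashlib

P = 2**127 - 1  # toy modulus for illustration only

def dh_shared_secret(peer_public: int, own_secret: int, p: int = P) -> bytes:
    """Validate the peer's public value, then return DHSS."""
    if peer_public in (1, p - 1):
        # A value of 1 or p-1 forces a trivially weak result.
        raise ValueError("invalid DH public value; possible attack")
    dh_result = pow(peer_public, own_secret, p)
    length = (p.bit_length() + 7) // 8
    return hashlib.sha256(dh_result.to_bytes(length, "big")).digest()
```

Both endpoints arrive at the same DHSS, since pvi^svr mod p and pvr^svi mod p are equal.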

660	   The hmacs of the possible shared secrets received are compared
661	   against the hmacs of the local set of possible shared secrets.

663	   Note: When comparing the signaling secret sigs derived from SIP, both
664	   orderings of to-tag followed by from-tag, and from-tag followed by
665	   to-tag must be tried.

667	   The expected hmac values of the shared secrets are calculated (using
668	   the string "Initiator" instead of "Responder") and compared to the
669	   hmacs received in the DHPart2 message.  The secrets corresponding to
670	   matching hmacs are kept while the secrets corresponding to the non-
671	   matching ones are replaced with a null.  The set of up to five actual
672	   shared secrets are then s1, s2, s3, s4, and s5 - the order is that
673	   chosen by the initiator.  The final shared secret, s0, is calculated
674	   by hashing the concatenation of the DHSS and the set of non-null
675	   shared secrets.  As a result, the null secrets have no effect on the
676	   concatenation operation:

678	   s0 = hash(DHSS | s1 | s2 | s3 | s4 | s5)

680	   For example, consider two ZRTP endpoints who share secrets rs1, rs2,
681	   and a hash of a secret passphrase other_secret.  During the
682	   comparison, rs1ID, rs2ID, and other_secretID will match but sigsID
683	   and srtpsID will not.  As a result, s1 = rs1, s2 = rs2, s5 =
684	   other_secret, while s3 and s4 will be nulls. s0 for this exchange
685	   will be calculated as the hash of the concatenation of DHSS, rs1,
686	   rs2, and other_secret.
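
The matching step and the s0 calculation from the example above can be sketched as follows. HMAC-SHA256, SHA-256, the leftmost-64-bit truncation, and the function names are assumptions; trying both tag orderings for sigs is omitted for brevity.

```python
import hashlib
import hmac

def secret_id(secret: bytes, role: str) -> bytes:
    """64-bit keyed hash of the role string (truncation side is assumed)."""
    return hmac.new(secret, role.encode("ascii"), hashlib.sha256).digest()[:8]

def compute_s0(dhss: bytes, local_secrets: list, received_ids: list,
               peer_role: str) -> bytes:
    """Hash DHSS followed by every secret whose ID matches the peer's."""
    h = hashlib.sha256()
    h.update(dhss)
    for secret, received in zip(local_secrets, received_ids):
        expected = secret_id(secret, peer_role)
        if hmac.compare_digest(expected, received):
            h.update(secret)  # matching secret is kept
        # a non-matching secret is treated as null and adds nothing
    return h.digest()
```

The lists are ordered rs1, rs2, sigs, srtps, other_secret, which is the order chosen by the initiator.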

688	5.1.3.2.  Initiator Behavior

690	   Upon receipt of the DHPart1 message, the initiator checks that the
691	   responder's public DH value is not equal to 1 or p-1.  An attacker
692	   might inject a false DHPart1 packet with a value of 1 or p-1 for
693	   g^svr mod p, which would cause a disastrously weak final DH result to
694	   be computed.  If pvr is 1 or p-1, the user should be alerted of the
695	   attack and the protocol exchange must be terminated.

697	   If pvr is not 1 or p-1, the initiator looks up any retained shared
698	   secrets associated with the responder's ZID.  The final shared
699	   secret, s0, is calculated by hashing the concatenation of the DHSS
700	   followed by the (possibly empty) set of shared secrets that are
701	   actually shared between the initiator and responder.  For computing
702	   the hash, the shared secrets are sorted by the order of the
703	   initiator's corresponding shared secret IDs.  The remainder of this
704	   section describes an algorithm to accomplish this.

706	   First, an HMAC keyed hash is calculated using the first retained
707	   shared secret, rs1, as the key on the string "Initiator" which
708	   generates a retained secret ID, rs1IDi, which is truncated to 64
709	   bits.  HMACs are calculated in a similar way for additional shared
710	   secrets:

712	   rs1IDi = HMAC(rs1, "Initiator")

714	   rs2IDi = HMAC(rs2, "Initiator")

716	   sigsIDi = HMAC(sigs, "Initiator")

718	   srtpsIDi = HMAC(srtps, "Initiator")

720	   other_secretIDi = HMAC(other_secret, "Initiator")

722	   The initiator then sends a DHPart2 message containing the initiator's
723	   public DH value and the set of calculated retained secret IDs.

725	   The initiator calculates the same Diffie-Hellman result using:

727	   DHResult = pvr^svi mod p

729	   The initiator then calculates the DH shared secret using:

731	   DHSS = hash(DHResult)

733	   The initiator then calculates the set of secret IDs that are expected
734	   to be received from the responder in the DHPart1 message:

736	   rs1IDr = HMAC(rs1, "Responder")

738	   rs2IDr = HMAC(rs2, "Responder")

740	   sigsIDr = HMAC(sigs, "Responder")

742	   srtpsIDr = HMAC(srtps, "Responder")

744	   other_secretIDr = HMAC(other_secret, "Responder")

745	   The hmacs of the possible shared secrets received are compared
746	   against the hmacs of the local set of possible shared secrets.

748	   Note: When comparing the signaling secret sigs derived from SIP, both
749	   orderings of to-tag followed by from-tag, and from-tag followed by
750	   to-tag must be tried.

752	   The expected hmac values of the shared secrets are calculated (using
753	   the string "Responder" instead of "Initiator") and compared to the
754	   hmacs received in the DHPart1 message.  The secrets corresponding to
755	   matching hmacs are kept while the secrets corresponding to the non-
756	   matching ones are replaced with a null.  The set of up to five actual
757	   shared secrets are then s1, s2, s3, s4, and s5 - the order is that
758	   chosen by the initiator.  The final shared secret, s0, is calculated
759	   by hashing the concatenation of the DHSS and the set of non-null
760	   shared secrets.  As a result, the null secrets have no effect on the
761	   concatenation operation:

763	   s0 = hash(DHSS | s1 | s2 | s3 | s4 | s5)

765	5.1.4.  Confirmation and Switch to SRTP

767	   The SRTP master key and master salt are then generated using the
768	   shared secret.  Separate SRTP keys and salts are used in each
769	   direction for each media stream.  Unless otherwise specified, ZRTP
770	   uses SRTP with no MKI, 32 bit authentication using HMAC-SHA1, AES-CM
771	   128 or 256 bit key length, 112 bit session salt key length, 2^48 key
772	   derivation rate, and SRTP prefix length 0.

774	   The ZRTP initiator encrypts and the ZRTP responder decrypts packets
775	   by using srtpkeyi and srtpsalti, which are generated by:

777	   srtpkeyi = HMAC(s0,"Initiator SRTP master key")

779	   srtpsalti = HMAC(s0,"Initiator SRTP master salt")

781	   The key and salt values are truncated to the length determined by the
782	   chosen SRTP algorithm.  The ZRTP responder encrypts and the ZRTP
783	   initiator decrypts packets by using srtpkeyr and srtpsaltr, which are
784	   generated by:

786	   srtpkeyr = HMAC(s0,"Responder SRTP master key")

788	   srtpsaltr = HMAC(s0,"Responder SRTP master salt")

790	   A ZRTP Session Key is generated which then allows the ZRTP
791	   Multistream mode to be used to generate SRTP key and salt pairs for
792	   additional concurrent media streams between this pair of ZRTP
793	   endpoints.  If a ZRTP Session Key has already been generated between
794	   this pair of endpoints, no new ZRTP Session Key is calculated.

796	   ZRTPsess = HMAC(s0,"ZRTP Session Key")

798	   The ZRTPsess key is kept for the duration of the call signaling
799	   session between the two ZRTP endpoints.  That is, if there are two
800	   separate calls between the endpoints (in SIP terms, separate SIP
801	   dialogs), then a ZRTP Session Key MUST NOT be used across the two
802	   call signaling sessions.  At the end of the call signaling session,
803	   ZRTPsess is destroyed.

805	   The HMAC keys are generated by:

807	   hmackeyi = HMAC(s0,"Initiator HMAC key")

809	   hmackeyr = HMAC(s0,"Responder HMAC key")

811	   Note that these HMAC keys are used only by ZRTP and not by SRTP.  A
812	   new rs1 is calculated from s0:

814	   rs1 = HMAC (s0, "retained secret")
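
The derivations above all follow one pattern, sketched here. HMAC-SHA256 and leftmost truncation are assumptions; the default lengths correspond to a 128 bit key and a 112 bit salt, and the function names are hypothetical.

```python
import hashlib
import hmac

def derive(s0: bytes, label: str, length: int = 32) -> bytes:
    """HMAC(s0, label), truncated to the requested number of octets."""
    return hmac.new(s0, label.encode("ascii"), hashlib.sha256).digest()[:length]

def derive_all(s0: bytes, key_len: int = 16, salt_len: int = 14) -> dict:
    """Derive the per-direction SRTP keys and the ZRTP-only values."""
    return {
        "srtpkeyi": derive(s0, "Initiator SRTP master key", key_len),
        "srtpsalti": derive(s0, "Initiator SRTP master salt", salt_len),
        "srtpkeyr": derive(s0, "Responder SRTP master key", key_len),
        "srtpsaltr": derive(s0, "Responder SRTP master salt", salt_len),
        "ZRTPsess": derive(s0, "ZRTP Session Key"),
        "hmackeyi": derive(s0, "Initiator HMAC key"),
        "hmackeyr": derive(s0, "Responder HMAC key"),
        "rs1": derive(s0, "retained secret"),
    }
```

Using a distinct label for each output keeps the derived values independent of one another even though they share s0.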

816	   The endpoints can now switch to SRTP and begin packet encryption.
817	   The ZRTP Initiator and Responder use their own keying material for
818	   the SRTP session.  No MKI is used and a 32 bit authentication tag is
819	   used.

821	   The ZRTP Confirm1 and Confirm2 messages are sent for two reasons.
822	   First, they confirm that all the key agreement calculations were
823	   successful and the encryption is working, and they enable automatic
824	   detection of a DH MitM attack from a reckless attacker who does not
825	   know the retained shared secret.  Second, they enable us to transmit
826	   the SAS Verified flag (V) under cover of SRTP encryption, shielding
827	   it from a passive observer who would like to know if the human users
828	   are in the habit of diligently verifying the SAS.

830	   The Confirm1 and Confirm2 messages contain the cache expiration
831	   interval for the newly generated retained shared secret.  Based on
832	   this, both sides now discard the rs2 value and store rs1 as rs2.  The
833	   Confirm1 and Confirm2 messages also contain an HMAC of some known
834	   plaintext and the flagoctet.  The flagoctet is an 8 bit unsigned
835	   integer made up of the Disclosure flag (D), Stay secure flag (S), SAS
836	   Verified flag (V):

838	   flagoctet = D * 2^2 + S * 2^1 + V * 2^0

840	   The HMAC is explicitly included in the payload because we may not
841	   always be able to rely on the built-in authentication tag in SRTP,
842	   which might be configured to different sizes, including none.

844	   hmac = HMAC(hmackey, "known plaintext" | flagoctet )

846	   This information is not carried in the extension header but inserted
847	   at the start of the SRTP payload.
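
The flagoctet and the Confirm HMAC can be sketched as below. HMAC-SHA256 is assumed, and the known plaintext is passed in as a parameter because its exact contents are not fixed by this sketch.

```python
import hashlib
import hmac

def flag_octet(disclosure: bool, stay_secure: bool, sas_verified: bool) -> int:
    """flagoctet = D * 2^2 + S * 2^1 + V * 2^0"""
    return (int(disclosure) << 2) | (int(stay_secure) << 1) | int(sas_verified)

def confirm_hmac(hmackey: bytes, known_plaintext: bytes, flags: int) -> bytes:
    """HMAC over the known plaintext followed by the single flag octet."""
    message = known_plaintext + bytes([flags])
    return hmac.new(hmackey, message, hashlib.sha256).digest()
```

For example, D=1, S=0, V=1 gives a flagoctet of 5.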

849	   The Conf2ACK message completes the exchange.

851	5.2.  Multistream Mode

853	   The Multistream key agreement mode can be used to generate SRTP keys
854	   and salts for additional media streams established between a pair of
855	   endpoints.  Multistream mode cannot be used unless there is an active
856	   SRTP session established between the endpoints which means a ZRTP
857	   Session key is active.  This ZRTP Session key can be used to generate
858	   keys and salts without performing another DH calculation.  In this
859	   mode, the retained shared secret cache is not used or updated.  As a
860	   result, multiple ZRTP Multistream mode exchanges can be processed in
861	   parallel between two endpoints.

863	   This mode is selected by setting the Key Agreement Type to "Multistr"
864	   in the Commit message.  The Cipher Type and Auth Tag Length in
865	   Multistream mode MUST be the same as the values in the initial DH
866	   Mode Commit and MUST be ignored if different, making bid down
867	   impossible.  The SAS Type is ignored as there is no SAS
868	   authentication in this mode.  In place of hvi in the Commit, a
869	   random number, nonce, 32 octets long is chosen.  Its value MUST be
870	   unique for all nonce values chosen for active ZRTP sessions between a
871	   pair of endpoints.  If a Commit is received with a reused nonce
872	   value, the ZRTP exchange MUST be immediately terminated.

874	   Note: Since the nonce is used to calculate different SRTP key and
875	   salt pairs for each media stream, a duplication will result in the
876	   same key and salt being generated for the two media streams.

878	   If a Commit is received selecting Multistream mode, but the responder
879	   does not have a ZRTP Session Key available, the exchange MUST be
880	   terminated.

882	   In Multistream mode, neither the DHPart1 nor the DHPart2 message is
883	   sent.  After the Commit, SRTP begins and the responder sends the
884	   Confirm1 message.  The SRTP key and salt for the initiator and
885	   responder are calculated using the ZRTP Session Key and the nonce
886	   from the Commit message.  For the nth media stream:

888	   s0n = HMAC(ZRTPsess, nonce)

889	   The ZRTP initiator encrypts and the ZRTP responder decrypts packets
890	   for this nth session by using srtpkeyin and srtpsaltin, which are
891	   generated by:

893	   srtpkeyin = HMAC(s0n,"Initiator SRTP master key")

895	   srtpsaltin = HMAC(s0n,"Initiator SRTP master salt")

897	   The key and salt values are truncated to the length determined by the
898	   chosen SRTP algorithm.  The ZRTP responder encrypts and the ZRTP
899	   initiator decrypts packets for this nth stream by using srtpkeyrn and
900	   srtpsaltrn, which are generated by:

902	   srtpkeyrn = HMAC(s0n,"Responder SRTP master key")

904	   srtpsaltrn = HMAC(s0n,"Responder SRTP master salt")

906	   The HMAC keys are generated by:

908	   hmackeyin = HMAC(s0n,"Initiator HMAC key")

910	   hmackeyrn = HMAC(s0n,"Responder HMAC key")
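
A sketch of the Multistream derivation, including the nonce reuse check, is shown here. HMAC-SHA256 is assumed, and the set of used nonces is a hypothetical bookkeeping structure kept per pair of endpoints.

```python
import hashlib
import hmac
import secrets

def new_stream_secret(zrtp_sess: bytes, nonce: bytes, used_nonces: set) -> bytes:
    """Return s0n = HMAC(ZRTPsess, nonce), rejecting a reused nonce."""
    if nonce in used_nonces:
        # Reuse would give two streams the same key and salt.
        raise ValueError("nonce reuse; exchange must be terminated")
    used_nonces.add(nonce)
    return hmac.new(zrtp_sess, nonce, hashlib.sha256).digest()

def fresh_nonce() -> bytes:
    """A 32 octet random nonce for the Commit message."""
    return secrets.token_bytes(32)
```

The per-stream keys, salts, and HMAC keys are then derived from s0n with the same label strings used for s0.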

912	5.3.  Random Number Generation

914	   The ZRTP protocol uses random numbers for cryptographic key material,
915	   notably for the DH secret exponents and nonces, which must be freshly
916	   generated with each session.  Whenever a random number is needed, all
917	   of the following criteria must be satisfied:

919	   It MUST be derived from a physical entropy source, such as RF noise,
920	   acoustic noise, thermal noise, high resolution timings of
921	   environmental events, or other unpredictable physical sources of
922	   entropy.  Chapter 10 of [8] gives a detailed explanation of
923	   cryptographic grade random numbers and provides guidance for
924	   collecting suitable entropy.  The raw entropy must be distilled and
925	   processed through a deterministic random bit generator (DRBG).
926	   Examples of DRBGs may be found in NIST SP 800-90 [9], and in [8].

928	   It MUST be freshly generated, meaning that it must not have been used
929	   in a previous calculation.

931	   It MUST be greater than or equal to two, and less than or equal to
932	   2^L - 1, where L is the number of random bits required.

934	   It MUST be chosen with equal probability from the entire available
935	   number space, e.g., [2, 2^L - 1].
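
The range and uniformity criteria can be met by rejection sampling, sketched below. Python's secrets module draws from the operating system's random source; whether that source satisfies the physical entropy requirement above depends on the platform and is not verified by this sketch.

```python
import secrets

def random_value(bits: int) -> int:
    """Return a uniform random integer in [2, 2^bits - 1]."""
    if bits < 2:
        raise ValueError("need at least 2 bits to reach the value 2")
    while True:
        candidate = secrets.randbits(bits)  # uniform in [0, 2^bits - 1]
        if candidate >= 2:
            return candidate  # rejecting 0 and 1 keeps the rest uniform
```

Redrawing, rather than clamping 0 and 1 up to 2, avoids biasing the distribution toward the low end of the range.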

937	5.4.  CRC Protection of Messages

939	   The ZRTP protocol uses a 32 bit CRC checksum in each ZRTP message as
940	   defined in RFC 3309 [6] to detect transmission errors.  ZRTP packets
941	   are carried by UDP, which carries its own built-in 16-bit checksum
942	   for integrity, but ZRTP does not rely on it.  This is because of the
943	   effect of an undetected transmission error in a ZRTP message.  For
944	   example, an undetected error in the DH exchange could appear to be an
945	   active man-in-the-middle attack.  The psychological effects of a
946	   false announcement of this by ZRTP clients cannot be overstated.
947	   The probability of such a false alarm hinges on a mere 16-bit
948	   checksum that usually protects UDP packets, so more error detection
949	   is needed.  For these reasons, this belt-and-suspenders approach is
950	   used to minimize the chance of a transmission error affecting the
951	   ZRTP key agreement.

953	   The CRC is calculated across the ZRTP message only, including the RTP
954	   Header extension (0x505A) and length field, followed by the ZRTP
955	   message itself, but not including the CRC field.  The CRC does not
956	   include the normal RTP header (V, P, X, CC, M, PT, sequence number,
957	   timestamp, SSRC, CSRC) or payload.  In the Confirm1 and Confirm2
958	   messages, the CRC does not include the fields transported in the
959	   payload (plaintext, flags, hmac).  If a ZRTP message fails the CRC
960	   check, it is silently discarded.
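
RFC 3309 specifies the CRC-32c (Castagnoli) polynomial, which differs from the CRC-32 offered by common zlib bindings. A bitwise sketch follows; the octet order in which the result is placed in the message is not covered here, and the function names are hypothetical.

```python
def crc32c(data: bytes) -> int:
    """Bitwise CRC-32c (reflected polynomial 0x82F63B78)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF

def crc_ok(covered: bytes, received_crc: int) -> bool:
    """True if the covered octets match the CRC carried in the message."""
    return crc32c(covered) == received_crc
```

A message for which crc_ok returns False would be silently discarded.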

962	5.5.  ZID and Cache Operation

964	   Each instance of ZRTP has a unique 96-bit random ZRTP ID or ZID that
965	   is generated once at installation time.  It is used to look up
966	   retained shared secrets in a local cache.  A single global ZID for a
967	   single installation is the simplest way to implement ZIDs.  However,
968	   it is specifically not precluded for an implementation to use
969	   multiple ZIDs, up to the limit of a separate one per callee.  This
970	   then turns it into a long-lived "association ID" that does not apply
971	   to any other associations between a different pair of parties.  It is
972	   a goal of this protocol to permit both options to interoperate
973	   freely.

975	   Each time a new s0 is calculated, a new retained shared secret rs1 is
976	   generated and stored in the cache, indexed by the ZID of the other
977	   endpoint.  The previous retained shared secret is then renamed rs2
978	   and also stored in the cache.  For the new retained shared secret,
979	   each endpoint chooses a cache expiration value which is an unsigned
980	   32 bit integer of the number of seconds that this secret should be
981	   retained in the cache.  The time interval is relative to when the
982	   Confirm1 message is sent or received.

984	   Note: The storage of two retained shared secrets ensures that even
985	   when a Commit is sent close to the expiration time of a retained
986	   shared secret, there is a high probability of the endpoints having at
987	   least one retained shared secret.  The exception to this is if both
988	   retained shared secrets have identical or near identical expiration
989	   times.

991	   The cache intervals are exchanged in the Confirm1 and Confirm2
992	   messages.  The actual cache interval used by both endpoints is the
993	   minimum of the values from the Confirm1 and Confirm2 messages.  A
994	   value of 0 seconds means the secret should not be cached and the
995	   current values of rs1 and rs2 MUST be maintained.  A value of
996	   0xFFFFFFFF means the secret should be cached indefinitely and is the
997	   recommended value.  If the ZRTP exchange results in no new shared
998	   secret generation (i.e., Multistream mode), the field in the Confirm1
999	   and Confirm2 is set to 0xFFFFFFFF and ignored.
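
The cache interval rules can be sketched as a small decision function. The return convention and names are hypothetical; times are in seconds relative to an arbitrary clock.

```python
NO_CACHE = 0
FOREVER = 0xFFFFFFFF

def cache_decision(confirm1_interval: int, confirm2_interval: int,
                   confirm_time: int):
    """Return ('skip', None), ('forever', None), or ('until', expiry)."""
    interval = min(confirm1_interval, confirm2_interval)
    if interval == NO_CACHE:
        return ("skip", None)     # keep the current rs1 and rs2 unchanged
    if interval == FOREVER:
        return ("forever", None)  # cache the new secret indefinitely
    return ("until", confirm_time + interval)
```

Because the minimum is used, either endpoint can shorten the lifetime of the secret or suppress caching, but neither can extend it beyond what the other offered.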

1001	   The expiration times of retained shared secrets are checked at the
1002	   time of their inclusion in a DHPart1 or DHPart2 message.  Expired
1003	   values are not included and are dropped from the cache.

1005	5.6.  Terminating an SRTP Session or ZRTP Exchange

1007	   The GoClear message is used to switch from SRTP to RTP or to
1008	   terminate an in-progress ZRTP exchange.  The GoClear message contains
1009	   a reason string for human purposes and a clear_hmac field.

1011	   When used to switch from SRTP to RTP, ZRTP avoids relying on the
1012	   optional SRTP authentication tag by using an HMAC of the string
1013	   "GoClear" computed with the hmackey derived from the shared secret:

1015	   clear_hmac = HMAC(hmackey, "GoClear")
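
Generating and checking clear_hmac can be sketched as below, assuming HMAC-SHA256 and no truncation of the result; the function names are hypothetical.

```python
import hashlib
import hmac

def make_clear_hmac(hmackey: bytes) -> bytes:
    """clear_hmac = HMAC(hmackey, "GoClear")"""
    return hmac.new(hmackey, b"GoClear", hashlib.sha256).digest()

def goclear_is_authentic(hmackey: bytes, received: bytes) -> bool:
    """Constant-time comparison against the locally computed value."""
    return hmac.compare_digest(make_clear_hmac(hmackey), received)
```

A receiver would send a ClearACK only when goclear_is_authentic returns True.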

1017	   A GoClear message which does not receive a ClearACK response
1018	   indicates that the GoClear has failed authentication (the clear_hmac
1019	   does not validate) and that the session must stay in secure mode.

1021	   When terminating an in-progress ZRTP exchange, no secret hmackey is
1022	   available, so the clear_hmac field is set to all zeros and ignored.
1023	   The reason string SHOULD indicate the reason for the failure (e.g.
1024	   "No Session Key", "Nonce Reuse", "Invalid DH Value").  The
1025	   termination of a ZRTP key agreement exchange results in no updates to
1026	   the cached shared secrets and deletion of all crypto context.

1028	   A ZRTP endpoint that receives a GoClear authenticates the message by
1029	   checking the clear_hmac.  If the message authenticates, the endpoint
1030	   stops sending SRTP packets, generates a ClearACK in response, and
1031	   deletes the crypto context for the SRTP session.  Until confirmation
1032	   from the user is received (e.g. clicking a button, pressing a DTMF
1033	   key, etc.), the ZRTP endpoint MUST NOT resume sending RTP packets.
1034	   The endpoint then renders the reason string and an indication that
1035	   the media session has switched to clear mode to the user and waits
1036	   for confirmation from the user.  To prevent pinholes from closing or
1037	   NAT bindings from expiring, the ClearACK message MAY be resent at
1038	   regular intervals (e.g. every 5 seconds) while waiting for
1039	   confirmation from the user.  After confirmation of the notification
1040	   is received from the user, the sending of RTP packets may begin.

1042	   After sending a GoClear message, the ZRTP endpoint stops sending SRTP
1043	   packets.  When a ClearACK is received, the ZRTP endpoint deletes the
1044	   crypto context for the SRTP session and may then resume sending RTP
1045	   packets.  However, the ZRTP Session key is not deleted unless the
1046	   signaling session is terminated as well.

1048	   A ZRTP endpoint MAY choose not to accept GoClear messages after the
1049	   session has switched to SRTP.  This is indicated in the Confirm1 or
1050	   Confirm2 messages by setting the Stay secure flag (S).

1052	6.  RTP Header Extension

1054	   This specification defines a new RTP header extension used for all
1055	   ZRTP messages.  When used, the X bit is set in the RTP header to
1056	   indicate the presence of the RTP header extension.

1058	   Section 5.3.1 in RFC 3550 defines the format of an RTP Header
1059	   extension.  The Header extension is appended to the RTP header.  The
1060	   first 16 bits are an identifier for the header extension, and the
1061	   following 16 bits give the extension header length in 32 bit words.
1062	   All word lengths referenced in this specification follow RFC 3550 and
1063	   are 32 bits or 4 octets.  All integer fields are carried in network
1064	   byte order, that is, most significant byte (octet) first, commonly
1065	   known as big-endian.  Each ZRTP message is carried in a single RTP
1066	   header extension which has the value of 0x505A.
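
The framing described above can be sketched with the struct module. Per RFC 3550, the length field counts 32 bit words and excludes the four octet identifier and length header. The function names are hypothetical, and the CRC field is left out of this sketch.

```python
import struct

ZRTP_EXT_ID = 0x505A

def build_extension(message: bytes) -> bytes:
    """Prefix a word-aligned message with the identifier and word count."""
    if len(message) % 4 != 0:
        raise ValueError("message must be a multiple of 4 octets")
    return struct.pack("!HH", ZRTP_EXT_ID, len(message) // 4) + message

def parse_extension(data: bytes) -> bytes:
    """Return the message octets, checking the identifier and length."""
    ext_id, words = struct.unpack("!HH", data[:4])
    if ext_id != ZRTP_EXT_ID:
        raise ValueError("not a ZRTP header extension")
    if len(data) < 4 + 4 * words:
        raise ValueError("truncated extension")
    return data[4:4 + 4 * words]
```

The "!" format prefix selects network byte order, matching the big-endian requirement stated above.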

1068	6.1.  ZRTP Message Formats

1070	   ZRTP messages are designed to simplify endpoint parsing requirements
1071	   and to reduce the opportunities for buffer overflow attacks (a good
1072	   goal of any security extension is to avoid introducing new attack
1073	   vectors).

1075	   ZRTP uses 8 octets (2 words) to encode many ZRTP parameters.  These
1076	   fixed-length blocks are used for Message Type, Hash Type, Cipher
1077	   Type, and Key Agreement Type.  For the Authentication Tag Length, 4
1078	   octets are used.  The values in the blocks are ASCII strings which
1079	   are extended with spaces (0x20) to make them 8 characters long.
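
The space-padded block encoding can be sketched as follows; the helper names are hypothetical.

```python
def encode_block(name: str, size: int = 8) -> bytes:
    """ASCII-encode a block value and pad it with spaces (0x20)."""
    raw = name.encode("ascii")
    if len(raw) > size:
        raise ValueError("block value too long")
    return raw.ljust(size, b" ")

def decode_block(block: bytes) -> str:
    """Strip the trailing space padding from a received block."""
    return block.decode("ascii").rstrip(" ")
```

For example, encode_block("Hello") yields the 8 octet block "Hello   ", and encode_block("32", 4) yields the 4 octet Auth Tag Length block "32  ".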

1081	   Currently defined block values are listed in Tables 1-6 below.
1082	   Additional block values may be defined and used.

1084	   ZRTP uses this ASCII encoding to simplify debugging and make it
1085	   "ethereal friendly".

1087	6.1.1.  Message Type Block

1089	   Currently ten Message Type Blocks are defined - they represent the
1090	   set of ZRTP message primitives.  ZRTP endpoints MUST support the
1091	   Hello, HelloACK, Commit, DHPart1, DHPart2, Confirm1, Confirm2,
1092	   Conf2ACK, GoClear and ClearACK block types.

1094	    Message Type Block   |  Meaning
1095	    ---------------------------------------------------
1096	    "Hello   "           |  Hello Message
1097	                         |  defined in Section 6.2
1098	    ---------------------------------------------------
1099	    "HelloACK"           |  HelloACK Message
1100	                         |  defined in Section 6.3
1101	    ---------------------------------------------------
1102	    "Commit  "           |  Commit Message
1103	                         |  defined in Section 6.4
1104	    ---------------------------------------------------
1105	    "DHPart1 "           |  DHPart1 Message
1106	                         |  defined in Section 6.5
1107	    ---------------------------------------------------
1108	    "DHPart2 "           |  DHPart2 Message
1109	                         |  defined in Section 6.6
1110	    ---------------------------------------------------
1111	    "Confirm1"           |  Confirm1 Message
1112	                         |  defined in Section 6.7
1113	    ---------------------------------------------------
1114	    "Confirm2"           |  Confirm2 Message
1115	                         |  defined in Section 6.8
1116	    ---------------------------------------------------
1117	    "Conf2ACK"           |  Conf2ACK Message
1118	                         |  defined in Section 6.9
1119	    ---------------------------------------------------
1120	    "GoClear "           |  GoClear Message
1121	                         |  defined in Section 6.10
1122	    ---------------------------------------------------
1123	    "ClearACK"           |  ClearACK Message
1124	                         |  defined in Section 6.11
1125	    ---------------------------------------------------

1127	    Table 1. Message Block Type Values
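
   The quoted values in Table 1 suggest that each Message Type Block is
   an 8 octet ASCII string, padded on the right with spaces (0x20).  A
   minimal sketch of that encoding follows.  The helper name type_block
   and the dictionary MESSAGE_TYPE_BLOCKS are hypothetical and are not
   defined by this specification.

```python
def type_block(name: str, octets: int = 8) -> bytes:
    """Encode a block name as fixed-width ASCII, right-padded with spaces."""
    raw = name.encode("ascii")
    if len(raw) > octets:
        raise ValueError(f"{name!r} does not fit in {octets} octets")
    return raw.ljust(octets, b" ")


# The ten message primitives listed in Table 1.
MESSAGE_TYPES = [
    "Hello", "HelloACK", "Commit", "DHPart1", "DHPart2",
    "Confirm1", "Confirm2", "Conf2ACK", "GoClear", "ClearACK",
]

# Hypothetical lookup table: message name -> 2 word (8 octet) block.
MESSAGE_TYPE_BLOCKS = {name: type_block(name) for name in MESSAGE_TYPES}
```

   The same helper can produce the 1 word Auth Tag Length Blocks by
   passing octets=4, for example type_block("32", 4).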

1129	6.1.2.  Hash Type Block

1131	   Only one Hash Type is currently defined, SHA256, and all ZRTP
1132	   endpoints MUST support this hash.  Additional Hash Types can be
1133	   registered and used.

1135	    Hash Type Block      |  Meaning
1136	    ---------------------------------------------------
1137	    "SHA256  "           |  SHA-256 Hash defined in [SHA-256]
1138	    ---------------------------------------------------

1140	    Table 2. Hash Block Type Values

1142	6.1.3.  Cipher Type Block

1144	   All ZRTP endpoints MUST support AES128 and MAY support AES256 [4] or
1145	   other Cipher Types.  Also, if AES128 is used, DH3072 should be used.
1146	   If AES256 is used, DH4096 should be used.

1148	     Cipher Type Block    |  Meaning
1149	    ---------------------------------------------------
1150	    "AES128  "            |  AES-CM with 128 bit keys
1151	                          |  as defined in RFC 3711
1152	    ---------------------------------------------------
1153	    "AES256  "            |  AES-CM with 256 bit keys
1154	                          |  as defined in RFC 3711
1155	    ---------------------------------------------------

1157	    Table 3. Cipher Block Type Values

1159	6.1.4.  Auth Tag Length Block

1161	   The Auth Tag Length Block is 4 octets (1 word) long.  All ZRTP
1162	   endpoints MUST support 32 bit and 80 bit authentication tags as
1163	   defined in RFC 3711.

1165	    Auth Tag Length Block |  Meaning
1166	    ---------------------------------------------------
1167	    "32  "                |  32 bit authentication tag
1168	                          |  as defined in RFC 3711
1169	    ---------------------------------------------------
1170	    "80  "                |  80 bit authentication tag
1171	                          |  as defined in RFC 3711
1172	    ---------------------------------------------------
1173	    Table 4. Auth Tag Length Values

1175	6.1.5.  Key Agreement Type Block

1177	   All ZRTP endpoints MUST support DH3072 and MAY support DH4096.  ZRTP
1178	   endpoints MUST use the DH generator function g=2.  The choice of AES
1179	   key length is coupled to the choice of key agreement type.  If AES
1180	   128 is chosen, DH3072 SHOULD be used.  If AES 256 is chosen, DH4096
1181	   SHOULD be used.  ZRTP also defines a non-DH mode, Multistream, which
1182	   MUST be supported.  In Multistream mode, the SRTP key is derived from
1183	   a ZRTP Session key and a nonce.

1185	     Key Agreement Type Block | Meaning
1186	    ---------------------------------------------------
1187	    "DH3072  "                |  DH mode with p=3072 bit prime
1188	                              |  as defined in RFC 3526
1189	    ---------------------------------------------------
1190	    "DH4096  "                |  DH mode with p=4096 bit prime
1191	                              |  as defined in RFC 3526
1192	    ---------------------------------------------------
1193	    "Multistr"                |  Multistream Non-DH mode
1194	                              |  uses ZRTP Session key
1195	    ---------------------------------------------------

1197	    Table 5. Key Agreement Block Type Values
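
   The coupling between cipher and key agreement type described above
   can be sketched as a small lookup.  The function name and the
   fallback rule are assumptions made for illustration: because the
   pairing is a SHOULD and DH3072 is mandatory to implement, this sketch
   falls back to DH3072 when the peer does not offer the preferred type.

```python
from typing import Sequence

# Pairings recommended in Section 6.1.5.
RECOMMENDED_KEY_AGREEMENT = {"AES128": "DH3072", "AES256": "DH4096"}


def pick_key_agreement(cipher: str, peer_supported: Sequence[str]) -> str:
    """Return the key agreement type to use for the chosen cipher."""
    preferred = RECOMMENDED_KEY_AGREEMENT[cipher]
    if preferred in peer_supported:
        return preferred
    # Assumption: fall back to the mandatory-to-implement DH3072.
    return "DH3072"
```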

1199	6.1.6.  SAS Type Block

1201	   All ZRTP endpoints SHOULD support the base32 and base256 Short
1202	   Authentication String schemes or other SAS schemes.  The optional
1203	   ZRTP SAS is described in Section 8.

1205	     SAS Type Block       |  Meaning
1206	    ---------------------------------------------------
1207	    "base32  "            |  Short Authentication String using
1208	                          |  base32 encoding defined in Section 8.
1209	    ---------------------------------------------------
1210	    "base256 "            |  Short Authentication String using
1211	                          |  base 256 encoding defined in Section 8.
1212	    ---------------------------------------------------

1214	    Table 6. SAS Block Type Values

1216	6.2.  Hello message

1218	   The Hello message has the format shown in Figure 2 below.  The header
1219	   extension payload contains the ZRTP version number and the list of
1220	   algorithms supported by the endpoint.

1223	   The Hello ZRTP message begins with the ZRTP header extension field
1224	   followed by the length of the header field, counted in 32 bit words.
1225	   Next is a word containing the version (ver) of ZRTP.  For this
1226	   specification, the version is the string "0.03".  Next is the Client
1227	   Identifier string (cid) which is 31 octets long and identifies the
1228	   vendor and release of the ZRTP software.  The Passive bit (P) is a
1229	   Boolean normally set to False.  A ZRTP endpoint which is configured
1230	   to never initiate secure sessions is regarded as passive, and would
1231	   set the P bit to True.  Next are the lists of supported Hash Types,
1232	   Cipher Types, Auth Tag Lengths, Key Agreement Types, and SAS Types.
1233	   Up to five algorithms are listed for each using the Blocks defined
1234	   in Tables 2, 3, 4, 5, and 6.  If fewer than five algorithms are
1235	   supported, spaces (0x20) are used to pad out the 10 words for each
1236	   type (5 words for the Auth Tag Lengths).  The last parameter is the
1237	   ZID, the 96 bit long unique identifier for the ZRTP endpoint.

1239	        0                   1                   2                   3
1240	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1241	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1242	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=60 words        |
1243	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1244	       |            Message Type Block="Hello   " (2 words)            |
1245	       |                                                               |
1246	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1247	       |                        version (1 word)                       |
1248	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1249	       |                                                               |
1250	       |                 Client Identifier (31 octets)                 |
1251	       |                              . . .                            |
1252	       |                                               +-+-+-+-+-+-+-+-+
1253	       |                                               |0 0 0 0 0 0 0|P|
1254	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1255	       |                                                               |
1256	       |                 Hash Type Blocks 1-5 (10 words)               |
1257	       |                              . . .                            |
1258	       |                                                               |
1259	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1260	       |                                                               |
1261	       |                Cipher Type Blocks 1-5 (10 words)              |
1262	       |                              . . .                            |
1263	       |                                                               |
1264	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1265	       |                                                               |
1266	       |             Auth Tag Length Blocks 1-5 (5 words)              |
1267	       |                              . . .                            |
1268	       |                                                               |
1269	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1270	       |                                                               |
1271	       |             Key Agreement Type Blocks 1-5 (10 words)          |
1272	       |                              . . .                            |
1273	       |                                                               |
1274	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1275	       |                                                               |
1276	       |                  SAS Type Blocks 1-5 (10 words)               |
1277	       |                              . . .                            |
1278	       |                                                               |
1279	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1280	       |                                                               |
1281	       |                         ZID  (3 words)                        |
1282	       |                                                               |
1283	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1284	       |                          CRC (1 word)                         |
1285	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1287	   Figure 2. Extension header format for Hello message
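
   The layout in Figure 2 can be checked by assembling a Hello message
   and confirming that the fields after the first word add up to 60
   words.  The following is a sketch only.  The function names,
   network byte order, space padding of the Client Identifier, and the
   caller-supplied crc value are assumptions; the CRC computation is
   not covered here.

```python
import struct

ZRTP_MAGIC = 0x505A  # bit pattern 0101 0000 0101 1010 from Figure 2


def pad_blocks(names, width, slots=5):
    """Concatenate up to `slots` blocks and pad unused slots with spaces."""
    if len(names) > slots:
        raise ValueError("at most five algorithms per type")
    encoded = [n.encode("ascii") for n in names]
    if any(len(e) > width for e in encoded):
        raise ValueError("block name too long")
    joined = b"".join(e.ljust(width, b" ") for e in encoded)
    return joined.ljust(slots * width, b" ")


def build_hello(version, client_id, passive, hashes, ciphers, auth_tags,
                key_agreements, sas_types, zid, crc=0):
    """Assemble the header extension of a Hello message (hypothetical)."""
    cid = client_id.encode("ascii")
    if len(cid) > 31 or len(zid) != 12 or len(version) != 4:
        raise ValueError("field has the wrong size")
    body = b"".join([
        b"Hello   ",                      # Message Type Block, 2 words
        version.encode("ascii"),          # version, 1 word
        cid.ljust(31, b" "),              # Client Identifier, 31 octets
        bytes([1 if passive else 0]),     # seven zero bits, then P
        pad_blocks(hashes, 8),            # 10 words
        pad_blocks(ciphers, 8),           # 10 words
        pad_blocks(auth_tags, 4),         # 5 words
        pad_blocks(key_agreements, 8),    # 10 words
        pad_blocks(sas_types, 8),         # 10 words
        zid,                              # 3 words
        struct.pack(">I", crc),           # CRC placeholder, 1 word
    ])
    return struct.pack(">HH", ZRTP_MAGIC, len(body) // 4) + body
```

   With one or two algorithms per type, the length field produced by
   this sketch is 60, matching Figure 2.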

1289	6.3.  HelloACK message

1291	   The HelloACK message is used to stop retransmissions of a Hello
1292	   message.  A HelloACK is sent regardless of whether the version number
1293	   or the algorithm list in the Hello is supported.  The format is shown
1294	   in Figure 3 below.  Note that a Commit message can be sent in place
1295	   of a HelloACK by an initiator.

1298	        0                   1                   2                   3
1299	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1300	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1301	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|         length=3 words        |
1302	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1303	       |              Message Type Block="HelloACK" (2 words)          |
1304	       |                                                               |
1305	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1306	       |                          CRC (1 word)                         |
1307	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1309	     Figure 3. Extension header format for HelloACK message

1311	6.4.  Commit message

1313	   The Commit message is sent to initiate the key agreement process
1314	   after receiving a Hello message.  The Commit message contains the
1315	   initiator's ZID and a list of selected algorithms (hash, cipher, atl,
1316	   pkt, sas), the ZRTP mode, and hvi, a hash of the public DH value of
1317	   the initiator and the algorithm list from the responder's Hello
1318	   message.  If a non-DH mode is used, hvi is replaced by a random
1319	   number, nonce.

1321	        0                   1                   2                   3
1322	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1323	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1324	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=23 words        |
1325	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1326	       |              Message Type Block="Commit  " (2 words)          |
1327	       |                                                               |
1328	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1329	       |                                                               |
1330	       |                         ZID  (3 words)                        |
1331	       |                                                               |
1332	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1333	       |                    Hash Type Block (2 words)                  |
1334	       |                                                               |
1335	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1336	       |                   Cipher Type Block (2 words)                 |
1337	       |                                                               |
1338	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1339	       |                 Auth Tag Length Block (1 word)                |
1340	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1341	       |                Key Agreement Type Block (2 words)             |
1342	       |                                                               |
1343	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1344	       |                    SAS Type Block (2 words)                   |
1345	       |                                                               |
1346	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1347	       |                                                               |
1348	       |                        hvi or nonce (8 words)                 |
1349	       |                               . . .                           |
1350	       |                                                               |
1351	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1352	       |                          CRC (1 word)                         |
1353	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1355	    Figure 4. Extension header format for Commit message
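
   Because every field in Figure 4 has a fixed size, a Commit message
   can be unpacked with a single format string.  The sketch below
   assumes network byte order and uses hypothetical names (Commit,
   parse_commit); it does not verify the CRC.

```python
import struct
from collections import namedtuple

# Header word, message type, ZID, five selected blocks, hvi/nonce, CRC.
COMMIT_FORMAT = ">HH8s12s8s8s4s8s8s32sI"

Commit = namedtuple(
    "Commit",
    "magic length msg_type zid hash cipher auth_tag key_agreement sas "
    "hvi_or_nonce crc",
)


def parse_commit(packet: bytes) -> Commit:
    """Split a Commit header extension into its fields (hypothetical)."""
    if len(packet) != struct.calcsize(COMMIT_FORMAT):
        raise ValueError("Commit must be 24 words including the header")
    commit = Commit._make(struct.unpack(COMMIT_FORMAT, packet))
    if commit.msg_type != b"Commit  " or commit.length != 23:
        raise ValueError("not a Commit message")
    return commit
```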

1357	6.5.  DHPart1 message

1359	   The DHPart1 message begins the DH exchange.  The format is shown in
1360	   Figure 5 below.  The DHPart1 message is sent if a valid Commit
1361	   message is received.  The length of the pvr value depends on the Key
1362	   Agreement Type chosen.  If DH4096 is used, the pvr will be 128 words
1363	   (512 octets).  If DH3072 is used, it is 96 words (384 octets).

1365	   The next five parameters are HMACs of potential shared secrets used
1366	   in generating the ZRTP secret.  The first two, rs1IDr and rs2IDr, are
1367	   the HMACs of the responder's two retained shared secrets, truncated
1368	   to 64 bits.  Next is sigsIDr, the HMAC of the responder's signaling
1369	   secret, truncated to 64 bits.  Next is srtpsIDr, the HMAC of the
1370	   responder's SRTP secret, truncated to 64 bits.  The last parameter is
1371	   the HMAC of an additional shared secret.  For example, if multiple
1372	   SRTP secrets are available or some other secret is used, it can be
1373	   used as the other_secret.

1375	        0                   1                   2                   3
1376	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1377	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1378	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|   length=depends on KA Type   |
1379	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1380	       |              Message Type Block="DHPart1 " (2 words)          |
1381	       |                                                               |
1382	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1383	       |                                                               |
1384	       |                 pvr (length depends on KA Type)               |
1385	       |                               . . .                           |
1386	       |                                                               |
1387	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1388	       |                        rs1IDr (2 words)                       |
1389	       |                                                               |
1390	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1391	       |                        rs2IDr (2 words)                       |
1392	       |                                                               |
1393	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1394	       |                        sigsIDr (2 words)                      |
1395	       |                                                               |
1396	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1397	       |                       srtpsIDr (2 words)                      |
1398	       |                                                               |
1399	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1400	       |                    other_secretIDr (2 words)                  |
1401	       |                                                               |
1402	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1403	       |                          CRC (1 word)                         |
1404	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1406	     Figure 5. Extension header format for DHPart1 message
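
   The length field in Figure 5 varies with the Key Agreement Type.
   Assuming it counts every word after the first header word, as the
   fixed-length messages in this section do, the value can be derived
   from the field sizes given above.  The names below are hypothetical.

```python
# Size of the public value (pvr or pvi) in 32 bit words, per Section 6.5.
PUBLIC_VALUE_WORDS = {"DH3072": 96, "DH4096": 128}


def dhpart_length_words(key_agreement: str) -> int:
    """Length field of a DHPart1 or DHPart2 message, in words."""
    message_type = 2      # Message Type Block
    secret_ids = 5 * 2    # rs1ID, rs2ID, sigsID, srtpsID, other_secretID
    crc = 1
    return (message_type + PUBLIC_VALUE_WORDS[key_agreement]
            + secret_ids + crc)
```

   Under that assumption the length is 109 words for DH3072 and 141
   words for DH4096.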

1408	6.6.  DHPart2 message

1410	   The DHPart2 message completes the DH exchange.  A DHPart2 message is
1411	   sent if a valid DHPart1 message is received.  The length of the pvi
1412	   value depends on the Key Agreement Type chosen.  If DH4096 is used,
1413	   the pvi will be 128 words (512 octets).  If DH3072 is used, it is 96
1414	   words (384 octets).

1416	   The next five parameters are HMACs of potential shared secrets used
1417	   in generating the ZRTP secret.  The first two, rs1IDi and rs2IDi, are
1418	   the HMACs of the initiator's two retained shared secrets, truncated
1419	   to 64 bits.  Next is sigsIDi, the HMAC of the initiator's signaling
1420	   secret, truncated to 64 bits.  Next is srtpsIDi, the HMAC of the
1421	   initiator's SRTP secret, truncated to 64 bits.  The last parameter is
1422	   the HMAC of an additional shared secret.  For example, if multiple
1423	   SRTP secrets are available or some other secret is used, it can be
1424	   included.

1426	        0                   1                   2                   3
1427	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1428	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1429	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|   length=depends on KA Type   |
1430	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1431	       |              Message Type Block="DHPart2 " (2 words)          |
1432	       |                                                               |
1433	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1434	       |                                                               |
1435	       |                   pvi (length depends on KA Type)             |
1436	       |                               . . .                           |
1437	       |                                                               |
1438	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1439	       |                        rs1IDi (2 words)                       |
1440	       |                                                               |
1441	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1442	       |                        rs2IDi (2 words)                       |
1443	       |                                                               |
1444	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1445	       |                        sigsIDi (2 words)                      |
1446	       |                                                               |
1447	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1448	       |                       srtpsIDi (2 words)                      |
1449	       |                                                               |
1450	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1451	       |                    other_secretIDi (2 words)                  |
1452	       |                                                               |
1453	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1454	       |                          CRC (1 word)                         |
1455	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1457	     Figure 6. Extension header format for DHPart2 message

1459	6.7.  Confirm1 message

1461	   The Confirm1 message is sent in response to a valid DHPart2 message
1462	   after the SRTP session key and parameters have been negotiated.  As a
1463	   result, it is always sent in an SRTP packet.  The format is shown in
1464	   Figure 7 below.  The header extension itself has no parameters
1465	   besides the Message Type Block and the CRC.  The first 52 octets in
1466	   the SRTP payload are used by ZRTP to securely exchange a number of
1467	   parameters.  The plaintext parameter contains the known plaintext
1468	   "known plaintext".  The Disclosure Flag (D) is a Boolean bit defined
1469	   in Appendix B.  The Stay secure flag (S) is a Boolean bit defined in
1470	   Section 5.6.  The SAS Verified flag (V) is a Boolean bit defined in
1471	   Section 8.

1473	   The cache expiration interval is an unsigned 32 bit integer of the
1474	   number of seconds that the newly generated cached shared secret, rs1,
1475	   should be stored.  The hmac is a keyed hash (HMAC) over the known
1476	   plaintext "known plaintext" and the flag octet.

1478	   The parameters included in the SRTP payload MUST NOT be allowed to
1479	   pass to the RTP stack or errors may occur with the media stream.

1481	        0                   1                   2                   3
1482	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1483	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1484	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|         length=3 words        |
1485	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1486	       |              Message Type Block="Confirm1" (2 words)          |
1487	       |                                                               |
1488	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1489	       |                          CRC (1 word)                         |
1490	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1492	         At the start of the SRTP payload:

1494	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1495	       |                                                               |
1496	       |                                                               |
1497	       |                "known plaintext" (15 octets)                  |
1498	       |                                               +-+-+-+-+-+-+-+-+
1499	       |                                               |0 0 0 0 0|D|S|V|
1500	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1501	       |              cache expiration interval (1 word)               |
1502	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1503	       |                                                               |
1504	       |                         hmac (8 words)                        |
1505	       |                             . . .                             |
1506	       |                                                               |
1507	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1509	     Figure 7. Extension header format for Confirm1 message
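
   The 52 octets that ZRTP places at the start of the SRTP payload can
   be separated from the media as sketched below, so that only the
   remainder is handed to the RTP stack.  The sketch assumes the payload
   has already been decrypted, uses network byte order, and does not
   verify the hmac.  The names ConfirmParams and parse_confirm_params
   are hypothetical.

```python
import struct
from typing import NamedTuple, Tuple

CONFIRM_FORMAT = ">15sBI32s"  # plaintext, flag octet, interval, hmac


class ConfirmParams(NamedTuple):
    disclosure: bool       # D flag
    stay_secure: bool      # S flag
    sas_verified: bool     # V flag
    cache_expiration: int  # seconds
    hmac: bytes            # 8 words


def parse_confirm_params(payload: bytes) -> Tuple[ConfirmParams, bytes]:
    """Return the ZRTP parameters and the remaining media payload."""
    size = struct.calcsize(CONFIRM_FORMAT)  # 52 octets
    if len(payload) < size:
        raise ValueError("payload shorter than 52 octets")
    text, flags, expiry, mac = struct.unpack(CONFIRM_FORMAT, payload[:size])
    if text != b"known plaintext":
        raise ValueError("known plaintext mismatch")
    params = ConfirmParams(
        disclosure=bool(flags & 0x04),
        stay_secure=bool(flags & 0x02),
        sas_verified=bool(flags & 0x01),
        cache_expiration=expiry,
        hmac=mac,
    )
    # Only the octets after the ZRTP parameters belong to the media.
    return params, payload[size:]
```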

1511	6.8.  Confirm2 message

1513	   The Confirm2 message is sent in response to a Confirm1 message after
1514	   the SRTP session key and parameters have been negotiated.  As a
1515	   result, it is always sent in an SRTP packet.  The format is shown in
1516	   Figure 8 below.  The header extension itself has no parameters
1517	   besides the Message Type Block and the CRC.  The first 52 octets in
1518	   the SRTP payload are used by ZRTP to securely exchange a number of
1519	   parameters.  The plaintext parameter contains the known plaintext
1520	   "known plaintext".  The Disclosure Flag (D) is a Boolean bit defined
1521	   in Appendix B.  The Stay secure flag (S) is a Boolean bit defined in
1522	   Section 5.6.  The SAS Verified flag (V) is a Boolean bit defined in
1523	   Section 8.

1525	   The cache expiration interval is an unsigned 32 bit integer of the
1526	   number of seconds that the newly generated cached shared secret, rs1,
1527	   should be stored.  The hmac is a keyed hash (HMAC) over the known
1528	   plaintext "known plaintext" and the flag octet.

1530	   The parameters included in the SRTP payload MUST NOT be allowed to
1531	   pass to the RTP stack or errors may occur with the media stream.

1533	        0                   1                   2                   3
1534	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1535	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1536	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|         length=3 words        |
1537	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1538	       |              Message Type Block="Confirm2" (2 words)          |
1539	       |                                                               |
1540	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1541	       |                          CRC (1 word)                         |
1542	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1544	        At the start of the SRTP payload:

1546	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1547	       |                                                               |
	       |                                                               |
1548	       |                "known plaintext" (15 octets)                  |
1549	       |                                               +-+-+-+-+-+-+-+-+
1550	       |                                               |0 0 0 0 0|D|S|V|
1551	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1552	       |              cache expiration interval (1 word)               |
1553	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1554	       |                                                               |
1555	       |                         hmac (8 words)                        |
1556	       |                             . . .                             |
1557	       |                                                               |
1558	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1560	      Figure 8. Extension header format for Confirm2 message

1562	6.9.  Conf2ACK message

1564	   The Conf2ACK message is sent in response to a valid Confirm2 message.
1565	   The format is shown in Figure 9 below.  The receipt of a Conf2ACK
1566	   stops retransmission of the Confirm2 message.

1568	        0                   1                   2                   3
1569	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1570	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1571	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|         length=3 words        |
1572	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1573	       |              Message Type Block="Conf2ACK" (2 words)          |
1574	       |                                                               |
1575	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1576	       |                          CRC (1 word)                         |
1577	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1579	     Figure 9. Extension header format for Conf2ACK message

1581	6.10.  GoClear message

1583	   The GoClear message is sent to switch from SRTP back to RTP or to
1584	   terminate an in-process ZRTP key agreement exchange.  The format is
1585	   shown in Figure 11 below.  The Reason String is a 16 character string
1586	   which contains the reason for the switch to clear.  If the GoClear is
1587	   sent due to a user interface selection, the reason is "User Request".
1588	   If the GoClear is sent due to a protocol error, the reason phrase is
1589	   generated to describe the reason.  The Reason String can be logged or
1590	   rendered for human consumption.

1592	   If the GoClear is sent to switch from SRTP back to RTP, the
1593	   clear_hmac is used to authenticate the GoClear message so that bogus
1594	   GoClear messages introduced by an attacker can be detected and
1595	   discarded.

1597	        0                   1                   2                   3
1598	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1599	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1600	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=15 words        |
1601	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1602	       |              Message Type Block="GoClear " (2 words)          |
1603	       |                                                               |
1604	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1605	       |                                                               |
1606	       |                      Reason String  (4 words)                 |
1607	       |                                                               |
1608	       |                                                               |
1609	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1610	       |                                                               |
1611	       |                       clear_hmac (8 words)                    |
1612	       |                             . . .                             |
1613	       |                                                               |
1614	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1615	       |                          CRC (1 word)                         |
1616	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1618	     Figure 11. Extension header format for GoClear message

1620	6.11.  ClearACK message

1622	   The ClearACK message is sent to acknowledge receipt of a GoClear.  A
1623	   ClearACK is only sent if the clear_hmac from the GoClear message is
1624	   authenticated.  Otherwise, no response is returned.  The format is
1625	   shown in Figure 12 below.

1627	        0                   1                   2                   3
1628	        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1629	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1630	       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=3 words         |
1631	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1632	       |              Message Type Block="ClearACK" (2 words)          |
1633	       |                                                               |
1634	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1635	       |                          CRC (1 word)                         |
1636	       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1638	     Figure 12. Extension header format for ClearACK message

1640	7.  Retransmissions

1642	   ZRTP uses two retransmission timers T1 and T2.  T1 is used for
1643	   retransmission of Hello messages, when the support of ZRTP by the
1644	   other endpoint may not be known.  T2 is used in retransmissions of
1645	   all the other ZRTP messages with the exception of GoClear.

1647	   Practical experience has shown that RTP packet loss at the start of
1648	   an RTP session can be extremely high.  Since the entire ZRTP message
1649	   exchange occurs during this period, the retransmission scheme is
1650	   defined to be aggressive.  Since ZRTP packets with the exception
1651	   of the DHPart1 and DHPart2 messages are small, this should have
1652	   minimal effect on overall bandwidth utilization of the media session.

1654	   Hello ZRTP requests are retransmitted at an interval that starts at
1655	   T1 and doubles after every retransmission, capping at 200 ms.  A
1656	   Hello message is retransmitted 20 times before giving up.  T1 has a
1657	   recommended value of 50 ms.  Retransmission of a Hello ends upon
1658	   receipt of a HelloACK or Commit message.

1660	   Non-Hello ZRTP requests are retransmitted only by the initiator;
1661	   that is, only Commit, DHPart2, and Confirm2 are retransmitted if the
1662	   corresponding messages from the responder (DHPart1, Confirm1, and
1663	   Conf2ACK, respectively) are not received.  Non-Hello ZRTP messages
1664	   are retransmitted at an interval that starts at T2 and doubles after
1665	   every retransmission, capping at 600 ms.  Each message is
1666	   retransmitted 10 times before giving up and resuming a normal RTP
1667	   session.  T2 has a default value of 150 ms.  Each message has a
1668	   response message that stops retransmissions, as shown in Table 7.
1669	   The high value of T2 means that retransmissions will likely only
1670	   occur with packet loss.
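
   The two backoff schedules can be sketched as follows.  This is an
   illustrative sketch only: the function name is hypothetical, and it
   assumes that the first retransmission is sent after the initial
   timer value (T1 or T2) and that each later wait is double the
   previous one until the cap is reached.

```python
def retransmit_intervals(initial_ms, cap_ms, max_retransmissions):
    """Return the wait, in milliseconds, before each retransmission.

    Assumption: the first wait equals the initial timer value and each
    later wait doubles until it reaches cap_ms.
    """
    intervals = []
    interval = initial_ms
    for _ in range(max_retransmissions):
        intervals.append(interval)
        interval = min(interval * 2, cap_ms)
    return intervals


# Hello messages: T1 = 50 ms, capped at 200 ms, 20 retransmissions.
hello_schedule = retransmit_intervals(50, 200, 20)

# Non-Hello messages: T2 = 150 ms, capped at 600 ms, 10 retransmissions.
other_schedule = retransmit_intervals(150, 600, 10)

# Total time spent retransmitting before giving up, in milliseconds.
hello_total_ms = sum(hello_schedule)
other_total_ms = sum(other_schedule)
```

   Under these assumptions an endpoint gives up on Hello after roughly
   3.75 seconds and on any other message after roughly 5.25 seconds.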

1672	   A GoClear message is retransmitted at 500ms intervals until a
1673	   ClearACK message is received.

1675	       Message      Acknowledgement Message
1676	       -------      -----------------------
1677	       Hello        HelloACK or Commit
1678	       Commit       DHPart1 or Confirm1
1679	       DHPart2      Confirm1
1680	       Confirm1     Confirm2
1681	       Confirm2     Conf2ACK
1682	       GoClear      ClearACK

1684	      Table 7. Retransmitted ZRTP Messages and Responses
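
   An implementation might encode Table 7 as a lookup that decides
   whether a received message stops the retransmission of a pending
   one.  The sketch below is illustrative; the names ACKNOWLEDGEMENTS
   and stops_retransmission are hypothetical and not part of the
   protocol.

```python
# Table 7 as a mapping from a sent message type to the set of received
# message types that stop its retransmission.
ACKNOWLEDGEMENTS = {
    "Hello": {"HelloACK", "Commit"},
    "Commit": {"DHPart1", "Confirm1"},
    "DHPart2": {"Confirm1"},
    "Confirm1": {"Confirm2"},
    "Confirm2": {"Conf2ACK"},
    "GoClear": {"ClearACK"},
}


def stops_retransmission(sent, received):
    """Return True if `received` acknowledges the pending `sent` message."""
    return received in ACKNOWLEDGEMENTS.get(sent, set())
```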

1686	8.  Short Authentication String

1688	   This section discusses the implementation of the Short
1689	   Authentication String (SAS) in ZRTP.

1691	   The Short Authentication String (SAS) value is calculated as the hash
1692	   of both DH public values and the string "Short Authentication
1693	   String".

1695	   sasvalue = last 32 bits of hash(pvi | pvr | "Short Authentication
1696	   String")
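
   The calculation can be sketched as follows.  This section does not
   name the hash function or the byte order, so the sketch assumes
   SHA-256, treats "|" as octet-string concatenation, and reads the
   last four octets of the digest as a big-endian integer.  The
   function name sas_value is hypothetical.

```python
import hashlib

SAS_LABEL = b"Short Authentication String"


def sas_value(pvi: bytes, pvr: bytes) -> int:
    """Return the last 32 bits of hash(pvi | pvr | label) as an integer.

    Assumptions: the hash is SHA-256 and the final four digest octets
    are interpreted in network (big-endian) byte order.
    """
    digest = hashlib.sha256(pvi + pvr + SAS_LABEL).digest()
    return int.from_bytes(digest[-4:], "big")
```

   Both endpoints hold the same pvi and pvr after the DH exchange, so
   each computes the same sasvalue independently.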

1698	   The rendering of the SAS value depends on the SAS Type agreed upon in
1699	   the Commit message.  For the SAS Type of base32, the last 20 bits of
1700	   the sasvalue are rendered as a form of base32 encoding known as
1701	   libbase32 [10].  The purpose of base32 is to represent arbitrary
1702	   sequences of octets in a form that is as convenient as possible for
1703	   human users to manipulate.  As a result, the choice of characters is
1704	   slightly different from base32 as defined in RFC 3548.  The last 20
1705	   bits of the sasvalue result in four base32 characters which are
1706	   rendered to both ZRTP endpoints.  Other SAS Types may be defined to
1707	   render the SAS value in other ways.
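
   A base32 rendering of this kind might look like the sketch below.
   The alphabet is the human-oriented one from the libbase32 design
   [10]; the bit ordering (most significant 5-bit group first) and the
   function name are assumptions made for illustration.

```python
# Human-oriented base32 alphabet from the libbase32 design document.
ZBASE32_ALPHABET = "ybndrfg8ejkmcpqxot1uwisza345h769"


def render_sas_base32(sasvalue: int) -> str:
    """Render the last 20 bits of sasvalue as four base32 characters.

    Assumption: the 20 bits are split into four 5-bit groups, most
    significant group first.
    """
    low20 = sasvalue & 0xFFFFF
    chars = []
    for shift in (15, 10, 5, 0):
        chars.append(ZBASE32_ALPHABET[(low20 >> shift) & 0x1F])
    return "".join(chars)
```

   Only the low 20 bits affect the result, so the upper 12 bits of the
   32-bit sasvalue are ignored by this SAS Type.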

1709	   The SAS SHOULD be rendered to the user.  In addition, the SAS SHOULD
1710	   be sent in a subsequent offer/answer exchange (a re-INVITE in SIP)
1711	   after the completion of ZRTP exchange using the ZRTP SAS SDP
1712	   attributes defined in Appendix A.

1714	   The SAS Verified flag (V) is set based on the user indicating that
1715	   the SAS check succeeded.  The SAS Verified flag is
1716	   exchanged securely in the Confirm1 and Confirm2 messages of the next
1717	   session.  In other words, each party sends the SAS Verified flag from
1718	   the previous session in the Confirm message of the current session.
1719	   It is perfectly reasonable to have a ZRTP endpoint that never sets
1720	   the SAS Verified flag, because it would require adding complexity to
1721	   the user interface to allow the user to set it.  The SAS Verified
1722	   flag is not required to be set, but if it is available to the client
1723	   software, it allows for the possibility that the client software
1724	   could render to the user that the SAS verify procedure was carried
1725	   out in a previous session.

1727	   Regardless of whether there is a user interface element to allow the
1728	   user to set the SAS Verified flag, it is worth caching a shared
1729	   secret, because doing so reduces opportunities for an attacker in the
1730	   next call.

1732	   If at any time the users carry out the SAS procedure, and it actually
1733	   fails to match, then this means there is a very resourceful man in
1734	   the middle.  If this is the first call, the MitM was there on the
1735	   first call, which is impressive enough.  If it happens in a later
1736	   call, it means the MitM must also know your cached shared
1737	   secret, because you could not have carried out any voice traffic at
1738	   all unless the session key was correctly computed and is also known
1739	   to the attacker.  This implies the MitM must have been present in all
1740	   the previous sessions, since the initial establishment of the first
1741	   shared secret.  This is indeed a resourceful attacker.  It also means
1742	   that if at any time he ceases his participation as a MitM on one of
1743	   your calls, the protocol will detect that the cached shared secret is
1744	   no longer valid -- because it was really two different shared secrets
1745	   all along, one of them between Alice and the attacker, and the other
1746	   between the attacker and Bob.  The continuity of the cached shared
1747	   secrets makes it possible for us to detect the MitM when he inserts
1748	   himself into the ongoing relationship, as well as when he leaves.
1749	   Also, if the attacker tries to stay with a long lineage of calls, but
1750	   fails to execute a DH MitM attack for even one missed call, he is
1751	   permanently excluded.  He can no longer resynchronize with the chain
1752	   of cached shared secrets.

1754	   Some sort of user interface element (maybe a checkbox) is needed to
1755	   allow the user to tell the software the SAS verify was successful,
1756	   causing the software to set the SAS Verified flag (V), which
1757	   (together with our cached shared secret) obviates the need to perform
1758	   the SAS procedure in the next call.  An additional user interface
1759	   element can be provided to let the user tell the software he detected
1760	   an actual SAS mismatch, which indicates a MitM attack.  The software
1761	   can then take appropriate action, clearing the SAS Verified flag and
1762	   erasing the cached shared secret from this session.  It is up to the
1763	   implementer to decide if this added user interface complexity is
1764	   warranted.
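
   The two user interface actions described above could drive the
   cache roughly as follows.  This is a sketch under assumptions: the
   PeerCacheEntry type and both handler names are hypothetical, and a
   real cache would hold more state than shown here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeerCacheEntry:
    """Hypothetical per-peer cache record."""
    shared_secret: Optional[bytes] = None
    sas_verified: bool = False  # the SAS Verified flag (V)


def on_sas_confirmed(entry: PeerCacheEntry) -> None:
    """User reports that the SAS matched: set the SAS Verified flag."""
    entry.sas_verified = True


def on_sas_mismatch(entry: PeerCacheEntry) -> None:
    """User reports a mismatch: clear the flag and erase the secret."""
    entry.sas_verified = False
    entry.shared_secret = None
```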

1766	   If the SAS matches, it means there is no MitM, which also implies it
1767	   is now safe to trust a cached shared secret for later calls.  If
1768	   inattentive users don't bother to check the SAS, it means we don't
1769	   know whether there is or is not a MitM, so even if we do establish a
1770	   new cached shared secret, there is a risk that our potential attacker
1771	   may have a subsequent opportunity to continue inserting himself in
1772	   the call, until we finally get around to checking the SAS.  If the
1773	   SAS matches, it means no attacker was present for any previous
1774	   session since we started propagating cached shared secrets, because
1775	   this session and all the previous sessions were also authenticated
1776	   with a continuous lineage of shared secrets.

1778	9.  IANA Considerations

1780	   This specification defines three new SDP [11] attributes in Appendix
1781	   A. The IANA registrations would be as follows:

1783	   Contact name:          Phil Zimmermann <prz@mit.edu>

1785	   Attribute name:        "zrtp".

1787	   Type of attribute:     Session level or Media level.

1789	   Subject to charset:    No.

1791	   Purpose of attribute:  The 'zrtp' flag indicates that a UA supports the
1792	                          ZRTP protocol.

1794	   Allowed attribute values:  None.

1796	   IANA would register the ZRTP SAS SDP attribute:

1798	   Contact name:          Phil Zimmermann <prz@mit.edu>

1800	   Attribute name:        "zrtp-sas".

1802	   Type of attribute:     Media level.

1804	   Subject to charset:    Yes.

1806	   Purpose of attribute:  The 'zrtp-sas' is used to convey the ZRTP SAS
1807	                          string that would be rendered to the users.  The
1808	                          SAS is carried in the same format as it
1809	                          would be rendered.

1811	   Allowed attribute values:  String.

1813	   IANA would register the ZRTP SASvalue SDP attribute:

1815	   Contact name:          Phil Zimmermann <prz@mit.edu>

1817	   Attribute name:        "zrtp-sasvalue".

1819	   Type of attribute:     Media level.

1821	   Subject to charset:    No.

1823	   Purpose of attribute:  The 'zrtp-sasvalue' is used to convey the SASvalue
1824	                          used for deriving the SAS string.  The SAS value is
1825	                          encoded as hexadecimal.

1827	   Allowed attribute values:  Hex.

1829	10.  Security Considerations

1831	   This document is all about securely keying SRTP sessions.  As such,
1832	   security is discussed in every section.  The next version of this
1833	   draft will have a summary of those security properties discussed
1834	   throughout the document.

1836	   The ZRTP SDP attributes convey information through the signaling that
1837	   is already available in clear text through the media channel.  For
1838	   example, the ZRTP flag is equivalent to sending a ZRTP Hello message.
1839	   The SAS is calculated from the public Diffie-Hellman values exchanged
1840	   in the DHPart1 and DHPart2 messages and a known string.  As a result,
1841	   none of the ZRTP SDP attributes require confidentiality from the
1842	   signaling.

1844	   The ZRTP SAS attributes can use the signaling channel as an out-of-
1845	   band authentication mechanism.  This authentication is only useful if
1846	   the signaling channel has end-to-end integrity protection.  Note that
1847	   the SIP Identity header field [23] provides middle-to-end integrity
1848	   protection across SDP message bodies which provides useful protection
1849	   for ZRTP SAS attributes.

1851	11.  Acknowledgments

1853	   The authors would like to thank Bryce Wilcox-O'Hearn for his
1854	   contributions to the design of this protocol, and to thank Jon
1855	   Peterson, Colin Plumb, and Hal Finney for their helpful comments and
1856	   suggestions.  Also thanks to David McGrew, Roni Even, Viktor Krikun,
1857	   Werner Dittmann, Allen Pulsifer, Klaus Peters, and Abhishek Arya for
1858	   their feedback and comments.

1860	12.  Appendix A - ZRTP, SIP, and SDP

1862	   This section discusses how ZRTP, SIP, and SDP work together.

1864	   Note that ZRTP may be implemented without coupling with the SIP
1865	   signaling.  For example, ZRTP can be implemented as a "bump in the
1866	   wire" or as a "bump in the stack" in which RTP sent by the SIP UA is
1867	   converted to ZRTP.  In these cases, the SIP UA will have no knowledge
1868	   of ZRTP.  As a result, the signaling path discovery mechanisms
1869	   introduced in this section should not be definitive - they are a
1870	   hint.  Despite the absence of an indication of ZRTP support in an
1871	   offer or answer, a ZRTP endpoint SHOULD still send Hello messages.

1873	   ZRTP endpoints which have control over the signaling path include
1874	   ZRTP SDP attributes in their SDP offers and answers.  The ZRTP
1875	   attribute, a=zrtp, is a flag to indicate support for ZRTP.  There are
1876	   a number of potential uses for this attribute.  It is useful when
1877	   signaling elements would like to know when ZRTP may be utilized by
1878	   endpoints.  It is also useful if endpoints support multiple methods
1879	   of SRTP key management.  The ZRTP attribute can be used to ensure
1880	   that these key management approaches work together instead of against
1881	   each other.  For example, if only one endpoint supports ZRTP but both
1882	   support another method to key SRTP, then the other method will be
1883	   used instead.  When used in parallel, an SRTP secret carried in an
1884	   a=keymgt [20] or a=crypto [19] attribute can be used as a shared
1885	   secret for the srtp_secret.  The ZRTP attribute is also used to
1886	   signal to an intermediary ZRTP device not to act as a ZRTP endpoint,
1887	   as discussed in Appendix C.

1889	   The a=zrtp attribute can be included at a media level or at the
1890	   session level.  When used at the media level, it indicates that ZRTP
1891	   is supported on this media stream.  When used at the session level,
1892	   it indicates that ZRTP is supported in all media streams in the
1893	   session described by the offer or answer.

1895	   In some scenarios, it is desirable for a signaling intermediary to be
1896	   able to validate the SAS on behalf of the user.  This could be due to
1897	   an endpoint which has a user interface unable to render the SAS.  Or,
1898	   this could be a protection by an organization against lazy users who
1899	   never check the SAS.  Using either the ZRTP SAS or ZRTP SASvalue
1900	   attribute, the SAS check can be performed without requiring the human
1901	   users to speak the SAS.  Note that this check can only be relied on
1902	   if the signaling path has end-to-end integrity protection.

1904	   The ZRTP SAS attribute a=zrtp-sas is a Media level SDP attribute that
1905	   can be used to carry the SAS string which would be identical to that
1906	   rendered to the user.  The value passed depends on the negotiated SAS
1907	   Type.  Since the SAS is not known at the start of a session, the
1908	   a=zrtp-sas attribute will never be present in the initial offer/
1909	   answer exchange.  After the ZRTP exchange has completed, the SAS is
1910	   known and can be exchanged over the signaling using a second offer/
1911	   answer exchange (a re-INVITE in SIP terms).  Note that the SAS is not
1912	   a secret and as such does not need confidentiality protection when
1913	   sent over the signaling path.

1915	   The ZRTP SASvalue attribute a=zrtp-sasvalue can be used to
1916	   send the 32 bit SAS value encoded as hex.  Note that this value is
1917	   not the same as that rendered to the user and is independent of the
1918	   negotiated SAS type.  Since the SAS is not known at the start of a
1919	   session, the a=zrtp-sasvalue attribute will never be present in the
1920	   initial offer/answer exchange.  After the ZRTP exchange has
1921	   completed, the SAS is known and can be exchanged over the signaling
1922	   using a second offer/answer exchange (a re-INVITE in SIP terms).

1924	   The ABNF for the ZRTP attribute is as follows:

1926	        zrtp-attribute    = "a=zrtp"

1928	   The ABNF for the ZRTP SAS attribute is as follows:

1930	        zrtp-sas-attribute    = "a=zrtp-sas:" sas-string

1932	        sas-string            = non-ws-string

1934	        non-ws-string         = 1*(VCHAR/%x80-FF)
1935	                               ;string of visible characters

1937	   The ABNF for the ZRTP SASvalue attribute is as follows:

1939	        zrtp-sasvalue-attribute = "a=zrtp-sasvalue:" sas-value

1941	        sas-value               = 1*(HEXDIG)
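
   The three productions above can be checked with regular
   expressions, as in the sketch below.  The pattern and function
   names are hypothetical.  The sketch assumes that each SDP line has
   already been split off without its line terminator and that octets
   %x80-FF have been mapped one-to-one onto code points U+0080-U+00FF.

```python
import re

# zrtp-attribute = "a=zrtp"
ZRTP_FLAG = re.compile(r"a=zrtp")
# sas-string = 1*(VCHAR / %x80-FF), where VCHAR is %x21-7E
ZRTP_SAS = re.compile(r"a=zrtp-sas:([\x21-\x7e\x80-\xff]+)")
# sas-value = 1*(HEXDIG); ABNF literals are case-insensitive
ZRTP_SASVALUE = re.compile(r"a=zrtp-sasvalue:([0-9A-Fa-f]+)")


def classify_zrtp_attribute(line: str):
    """Return (attribute-name, value) for a ZRTP SDP line, else None."""
    if ZRTP_FLAG.fullmatch(line):
        return ("zrtp", None)
    match = ZRTP_SAS.fullmatch(line)
    if match:
        return ("zrtp-sas", match.group(1))
    match = ZRTP_SASVALUE.fullmatch(line)
    if match:
        return ("zrtp-sasvalue", match.group(1))
    return None
```

   Using fullmatch keeps a=zrtp-sas from being confused with
   a=zrtp-sasvalue, since the former must be followed directly by a
   colon.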

1943	   Example of the ZRTP attribute in an initial SDP offer or answer used
1944	   at the session level:

1946	      v=0
1947	      o=bob 2890844527 2890844527 IN IP4 client.biloxi.example.com
1948	      s=
1949	      c=IN IP4 client.biloxi.example.com
1950	      a=zrtp
1951	      t=0 0
1952	      m=audio 3456 RTP/AVP 97 33
1953	      a=rtpmap:97 iLBC/8000
1954	      a=rtpmap:33 no-op/8000

1956	   Example of the ZRTP SAS and SASvalue attributes in a subsequent SDP
1957	   offer or answer used at the media level.  Note that the a=zrtp
1958	   attribute doesn't provide any additional information when used with
1959	   the SAS and SASvalue attributes but does not do any harm:

1961	      v=0
1962	      o=bob 2890844527 2890844528 IN IP4 client.biloxi.example.com
1963	      s=
1964	      c=IN IP4 client.biloxi.example.com
1965	      a=zrtp
1966	      t=0 0
1967	      m=audio 3456 RTP/AVP 97 33
1968	      a=rtpmap:97 iLBC/8000
1969	      a=rtpmap:33 no-op/8000
1970	      a=zrtp-sas:opz
1971	      a=zrtp-sasvalue:45e387ff

1973	   Another example showing a second media stream being added to the
1974	   session.  A second DH exchange is performed (instead of using the
1975	   Multistream mode) resulting in a second set of ZRTP SAS and SASvalue
1976	   attributes.

1978	      v=0
1979	      o=bob 2890844527 2890844528 IN IP4 client.biloxi.example.com
1980	      s=
1981	      c=IN IP4 client.biloxi.example.com
1982	      a=zrtp
1983	      t=0 0
1984	      m=audio 3456 RTP/AVP 97 33
1985	      a=rtpmap:97 iLBC/8000
1986	      a=rtpmap:33 no-op/8000
1987	      a=zrtp-sas:opz
1988	      a=zrtp-sasvalue:45e387ff
1989	      m=video 51372 RTP/AVP 31 33
1990	      a=rtpmap:31 H261/90000
1991	      a=rtpmap:33 no-op/8000
1992	      a=zrtp-sas:qvj
1993	      a=zrtp-sasvalue:5e017f3a
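
   A signaling element that wants per-stream ZRTP information from a
   description like the one above might extract it as sketched here.
   The function name, the dictionary layout, and the embedded sample
   are illustrative assumptions; the sketch applies a session-level
   a=zrtp flag to every media stream, as described earlier in this
   appendix.

```python
EXAMPLE_SDP = """\
v=0
o=bob 2890844527 2890844528 IN IP4 client.biloxi.example.com
s=
c=IN IP4 client.biloxi.example.com
a=zrtp
t=0 0
m=audio 3456 RTP/AVP 97 33
a=zrtp-sas:opz
a=zrtp-sasvalue:45e387ff
m=video 51372 RTP/AVP 31 33
a=zrtp-sas:qvj
a=zrtp-sasvalue:5e017f3a
"""


def zrtp_attributes_by_stream(sdp: str):
    """Return one dict per m= section with its ZRTP attributes."""
    session_zrtp = False
    streams = []
    current = None  # None while still in the session-level part
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            current = {"media": line[2:].split()[0], "zrtp": False,
                       "sas": None, "sasvalue": None}
            streams.append(current)
        elif line == "a=zrtp":
            if current is None:
                session_zrtp = True
            else:
                current["zrtp"] = True
        elif current is not None and line.startswith("a=zrtp-sas:"):
            current["sas"] = line[len("a=zrtp-sas:"):]
        elif current is not None and line.startswith("a=zrtp-sasvalue:"):
            current["sasvalue"] = line[len("a=zrtp-sasvalue:"):]
    # A session-level flag covers all media streams in the session.
    for stream in streams:
        stream["zrtp"] = stream["zrtp"] or session_zrtp
    return streams


streams = zrtp_attributes_by_stream(EXAMPLE_SDP)
```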

1995	13.  Appendix B - The ZRTP Disclosure flag

1997	   There are no back doors defined in the ZRTP protocol specification.
1998	   The designers of ZRTP would like to discourage back doors in ZRTP-
1999	   enabled products.  However, despite the lack of back doors in the
2000	   actual ZRTP protocol, it must be recognized that a ZRTP implementer
2001	   might still deliberately create a rogue ZRTP-enabled product that
2002	   implements a back door outside the scope of the ZRTP protocol.  For
2003	   example, they could create a product that discloses the SRTP session
2004	   key generated using ZRTP out-of-band to a third party.  They may even
2005	   have a legitimate business reason to do this for some customers.

2007	   For example, some environments have a need to monitor or record
2008	   calls, such as stock brokerage houses who want to discourage insider
2009	   trading, or special high security environments with special needs to
2010	   monitor their own phone calls.  We've all experienced automated
2011	   messages telling us that "This call may be monitored for quality
2012	   assurance".  A ZRTP endpoint in such an environment might
2013	   unilaterally disclose the session key to someone monitoring the call.
2014	   ZRTP-enabled products that perform such out-of-band disclosures of
2015	   the session key can undermine public confidence in the ZRTP protocol,
2016	   unless we do everything we can in the protocol to alert the other
2017	   user that this is happening.

2019	   If one of the parties is using a product that is designed to disclose
2020	   their session key, ZRTP requires them to confess this fact to the
2021	   other party through a protocol message to the other party's ZRTP
2022	   client, which can properly alert that user, perhaps by rendering it
2023	   in a GUI.  The disclosing party does this by sending a Disclosure
2024	   flag (D) in Confirm1 and Confirm2 messages as described in Sections
2025	   6.7 and 6.8.

2027	   Note that the intention here is to have the Disclosure flag identify
2028	   products that are designed to disclose their session keys, not to
2029	   identify which particular calls are compromised on a call-by-call
2030	   basis.  This is an important legal distinction, because most
2031	   government sanctioned wiretap regulations require a VoIP service
2032	   provider to not reveal which particular calls are wiretapped.  But
2033	   there is nothing illegal about revealing that a product is designed
2034	   to be wiretap-friendly.  The ZRTP protocol mandates that such a
2035	   product "out" itself.

2037	   You might be using a ZRTP-enabled product with no back doors, but if
2038	   your own GUI tells you the call is (mostly) secure, except that the
2039	   other party is using a product that is designed in such a way that it
2040	   may have disclosed the session key for monitoring purposes, you might
2041	   ask him what brand of secure telephone he is using, and make a mental
2042	   note not to purchase that brand yourself.  If we create a protocol
2043	   environment that requires such back-doored phones to confess their
2044	   nature, word will spread quickly, and the "unseen hand" of the free
2045	   market will act.  The free market has effectively dealt with this in
2046	   the past.

2048	   Of course, a ZRTP implementer can lie about his product having a back
2049	   door, but the ZRTP standard mandates that ZRTP-compliant products
2050	   MUST adhere to the requirement that a back door be confessed by
2051	   sending the Disclosure flag to the other party.

2053	   There will be inevitable comparisons to Steve Bellovin's 2003 April
2054	   Fools' joke, when he submitted RFC 3514 [22] which defined the "Evil
2055	   bit" in the IPv4 header, for packets with "evil intent".  But we
2056	   submit that a similar idea can actually have some merit for securing
2057	   VoIP.  Sure, one can always imagine that some implementer will not be
2058	   fazed by the rules and will lie, but they would have lied anyway even
2059	   without the Disclosure flag.  There are good reasons to believe that
2060	   it will improve the overall percentage of implementations that at
2061	   least tell us if they put a back door in their products, and may even
2062	   get some of them to decide not to put in a back door at all.  From a
2063	   civic hygiene perspective, we are better off with having the
2064	   Disclosure flag in the protocol.

2066	   If an endpoint stores or logs SRTP keys or information that can be
2067	   used to reconstruct or recover SRTP keys after they are no longer in
2068	   use (i.e., the session is no longer active), or otherwise discloses
2069	   or passes SRTP keys or information that can be used to reconstruct
2070	   or recover SRTP keys to another application or device, the
2071	   Disclosure flag D MUST be set in the Confirm1 or Confirm2 message.
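
   The rule above reduces to a simple predicate.  The sketch below is
   illustrative; the function and parameter names are hypothetical,
   and how a product determines either input is outside the scope of
   this sketch.

```python
def disclosure_flag_required(retains_key_material_after_session: bool,
                             passes_key_material_elsewhere: bool) -> bool:
    """Return True if the Disclosure flag (D) must be set.

    The first argument covers storing or logging SRTP keys, or data
    from which they can be recovered, beyond the life of the session.
    The second covers handing such material to another application or
    device.
    """
    return retains_key_material_after_session or passes_key_material_elsewhere
```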

2073	14.  Appendix C - Intermediary ZRTP Devices

2075	   This section discusses the operation of a ZRTP endpoint which is
2076	   actually an intermediary.  For example, consider a device which
2077	   proxies both signaling and media between endpoints.  There are three
2078	   possible ways in which such a device could support ZRTP.

2080	   An intermediary device can act transparently to the ZRTP protocol.
2081	   To do this, a device MUST pass RTP header extensions and payloads.
2082	   This is the RECOMMENDED behavior for intermediaries as ZRTP and SRTP
2083	   are best when done end-to-end.

2085	   An intermediary device could implement the ZRTP protocol and act as a
2086	   ZRTP endpoint on behalf of non-ZRTP endpoints behind the intermediary
2087	   device.  The intermediary could determine on a call-by-call basis
2088	   whether the endpoint behind it supports ZRTP based on the presence or
2089	   absence of the ZRTP SDP attribute flag (a=zrtp).  For non-ZRTP
2090	   endpoints, the intermediary device could act as the ZRTP endpoint
2091	   using its own ZID and cache.  This approach MUST only be used when
2092	   there is some other security method protecting the confidentiality of
2093	   the media between the intermediary and the inside endpoint, such as
2094	   IPSec or physical security.

2096	   The third mode, which is NOT RECOMMENDED, is for the intermediary
2097	   device to attempt to back-to-back the ZRTP protocol.  In this mode,
2098	   the intermediary would attempt to act as a ZRTP endpoint towards both
2099	   endpoints of the media session.  This approach MUST NOT be used as it
2100	   will always result in a detected Man-in-the-Middle attack and will
2101	   generate alarms on both endpoints and likely result in the immediate
2102	   termination of the session.  It cannot be stated strongly enough that
2103	   there are no usable back-to-back uses for the ZRTP protocol.

2105	   It is possible that an intermediary device acting as a ZRTP endpoint
2106	   might still receive ZRTP Hello and other messages from the inside
2107	   endpoint.  This could occur if there is another inline ZRTP device
2108	   which does not include the ZRTP SDP attribute flag.  If this occurs,
2109	   the intermediary MUST NOT pass these ZRTP messages if it is acting as
2110	   the ZRTP endpoint.
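
   The choice between the first two modes, and the forwarding rule for
   inside ZRTP messages, can be sketched as below.  The function
   names, the mode strings, and the fallback to transparent operation
   when the inside path is unprotected are assumptions made for
   illustration; the back-to-back mode is deliberately not offered.

```python
def choose_intermediary_mode(inside_signals_zrtp: bool,
                             inside_path_protected: bool) -> str:
    """Pick an operating mode for one call.

    inside_signals_zrtp: the inside endpoint's SDP carried a=zrtp.
    inside_path_protected: media between the intermediary and the
    inside endpoint is protected by other means (for example IPSec
    or physical security).
    """
    if inside_signals_zrtp:
        # The endpoint speaks ZRTP itself: stay out of the way.
        return "transparent"
    if inside_path_protected:
        # Act as the ZRTP endpoint with the intermediary's own ZID.
        return "zrtp-endpoint"
    # Without protection on the inside leg, do not terminate ZRTP.
    return "transparent"


def may_forward_inside_zrtp(mode: str) -> bool:
    """ZRTP messages from the inside are dropped in endpoint mode."""
    return mode != "zrtp-endpoint"
```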

2112	15.  References

2114	15.1.  Normative References

2116	   [1]   Bradner, S., "Key words for use in RFCs to Indicate Requirement
2117	         Levels", BCP 14, RFC 2119, March 1997.

2119	   [2]   Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson,
2120	         "RTP: A Transport Protocol for Real-Time Applications", STD 64,
2121	         RFC 3550, July 2003.

2123	   [3]   Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
2124	         Norrman, "The Secure Real-time Transport Protocol (SRTP)",
2125	         RFC 3711, March 2004.

2127	   [4]   McGrew, D., "The use of AES-192 and AES-256 in Secure RTP",
2128	         draft-mcgrew-srtp-big-aes-00 (work in progress), April 2006.

2130	   [5]   Kivinen, T. and M. Kojo, "More Modular Exponential (MODP)
2131	         Diffie-Hellman groups for Internet Key Exchange (IKE)",
2132	         RFC 3526, May 2003.

2134	   [6]   Stone, J., Stewart, R., and D. Otis, "Stream Control
2135	         Transmission Protocol (SCTP) Checksum Change", RFC 3309,
2136	         September 2002.

2138	   [7]   Andreasen, F., "A No-Op Payload Format for RTP",
2139	         draft-wing-avt-rtp-noop-03 (work in progress), May 2005.

2141	   [8]   Ferguson, N. and B. Schneier, "Practical Cryptography", Wiley
2142	         Publishing 2003.

2144	   [9]   Barker, E. and J. Kelsey, "Recommendation for Random Number
2145	         Generation Using Deterministic Random Bit Generators", NIST
2146	         Special Publication 800-90 DRAFT (December 2005).

2148	   [10]  Wilcox, B., "Human-oriented base-32 encoding", http://
2149	         cvs.sourceforge.net/viewcvs.py/libbase32/libbase32/
2150	         DESIGN?rev=HEAD .

2152	   [11]  Handley, M., Jacobson, V., and C. Perkins, "SDP: Session
2153	         Description Protocol", RFC 4566, July 2006.

2155	15.2.  Informative References

2157	   [12]  Audet, F. and D. Wing, "Evaluation of SRTP Keying with SIP",
2158	         draft-wing-rtpsec-keying-eval-01 (work in progress), June 2006.

2160	   [13]  Zimmermann, P., "PGPfone",
2161	         http://www.pgpi.org/products/pgpfone/ .

2163	   [14]  Zimmermann, P., "Zfone", http://www.philzimmermann.com/zfone .

2165	   [15]  Blossom, E., "The VP1 Protocol for Voice Privacy Devices
2166	         Version 1.2", http://www.comsec.com/vp1-protocol.pdf .

2168	   [16]  "CryptoPhone", http://www.cryptophone.de/ .

2170	   [17]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
2171	         Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP:
2172	         Session Initiation Protocol", RFC 3261, June 2002.

2174	   [18]  Ylonen, T. and C. Lonvick, "The Secure Shell (SSH) Protocol
2175	         Architecture", RFC 4251, January 2006.

2177	   [19]  Andreasen, F., Baugher, M., and D. Wing, "Session Description
2178	         Protocol (SDP) Security Descriptions for Media Streams",
2179	         RFC 4568, July 2006.

2181	   [20]  Arkko, J., Lindholm, F., Naslund, M., Norrman, K., and E.
2182	         Carrara, "Key Management Extensions for Session Description
2183	         Protocol (SDP) and Real Time Streaming Protocol (RTSP)",
2184	         RFC 4567, July 2006.

2186	   [21]  Arkko, J., Carrara, E., Lindholm, F., Naslund, M., and K.
2187	         Norrman, "MIKEY: Multimedia Internet KEYing", RFC 3830,
2188	         August 2004.

2190	   [22]  Bellovin, S., "The Security Flag in the IPv4 Header", RFC 3514,
2191	         April 1 2003.

2193	   [23]  Peterson, J. and C. Jennings, "Enhancements for Authenticated
2194	         Identity Management in the Session Initiation Protocol (SIP)",
2195	         RFC 4474, August 2006.

2197	Authors' Addresses

2199	   Philip Zimmermann
2200	   Zfone Project

2202	   Email: prz@mit.edu

2204	   Alan Johnston (editor)
2205	   Avaya
2206	   St. Louis, MO  63124

2208	   Email: alan@sipstation.com

2209	   Jon Callas
2210	   PGP Corporation

2212	   Email: jon@pgp.com

2214	Full Copyright Statement

2216	   Copyright (C) The Internet Society (2006).

2218	   This document is subject to the rights, licenses and restrictions
2219	   contained in BCP 78, and except as set forth therein, the authors
2220	   retain all their rights.

2222	   This document and the information contained herein are provided on an
2223	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
2224	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
2225	   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
2226	   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
2227	   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
2228	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

2230	Intellectual Property

2232	   The IETF takes no position regarding the validity or scope of any
2233	   Intellectual Property Rights or other rights that might be claimed to
2234	   pertain to the implementation or use of the technology described in
2235	   this document or the extent to which any license under such rights
2236	   might or might not be available; nor does it represent that it has
2237	   made any independent effort to identify any such rights.  Information
2238	   on the procedures with respect to rights in RFC documents can be
2239	   found in BCP 78 and BCP 79.

2241	   Copies of IPR disclosures made to the IETF Secretariat and any
2242	   assurances of licenses to be made available, or the result of an
2243	   attempt made to obtain a general license or permission for the use of
2244	   such proprietary rights by implementers or users of this
2245	   specification can be obtained from the IETF on-line IPR repository at
2246	   http://www.ietf.org/ipr.

2248	   The IETF invites any interested party to bring to its attention any
2249	   copyrights, patents or patent applications, or other proprietary
2250	   rights that may cover technology that may be required to implement
2251	   this standard.  Please address the information to the IETF at
2252	   ietf-ipr@ietf.org.

2254	Acknowledgment

2256	   Funding for the RFC Editor function is provided by the IETF
2257	   Administrative Support Activity (IASA).