AVT WG                                                     P. Zimmermann
Internet-Draft                        Phil Zimmermann and Associates LLC
Expires: September 6, 2006                              A. Johnston, Ed.
                                                              SIPStation
                                                               J. Callas
                                                         PGP Corporation
                                                           March 5, 2006


   ZRTP: Extensions to RTP for Diffie-Hellman Key Agreement for SRTP
                      draft-zimmermann-avt-zrtp-01

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on September 6, 2006.

Copyright Notice

   Copyright (C) The Internet Society (2006).

Abstract

   This document defines ZRTP, RTP (Real-time Transport Protocol) header
   extensions for a Diffie-Hellman exchange to agree on a session key
   and parameters for establishing Secure RTP (SRTP) sessions.  The ZRTP
   protocol is completely self-contained in RTP and does not require
   support in the signaling protocol or assume a Public Key


Zimmermann, et al.      Expires September 6, 2006               [Page 1]

Internet-Draft                    ZRTP                        March 2006


   Infrastructure (PKI).  For the media session, ZRTP provides
   confidentiality, protection against Man in the Middle (MitM) attacks,
   and, in cases where a secret is available from the signaling
   protocol, authentication.


Table of Contents

   1.  Introduction . . . . . . . . . . . . . . . . . . . . . . . . .  3
   2.  Terminology  . . . . . . . . . . . . . . . . . . . . . . . . .  7
   3.  Protocol Description . . . . . . . . . . . . . . . . . . . . .  7
     3.1.  Overview . . . . . . . . . . . . . . . . . . . . . . . . .  7
     3.2.  Key Agreement Algorithm  . . . . . . . . . . . . . . . . .  9
       3.2.1.  Discovery  . . . . . . . . . . . . . . . . . . . . . .  9
       3.2.2.  Hash Commitment  . . . . . . . . . . . . . . . . . . . 10
       3.2.3.  Diffie-Hellman Exchange  . . . . . . . . . . . . . . . 11
       3.2.4.  Confirmation and Switch to SRTP  . . . . . . . . . . . 15
     3.3.  Random Number Generation . . . . . . . . . . . . . . . . . 16
   4.  RTP Header Extensions  . . . . . . . . . . . . . . . . . . . . 17
     4.1.  ZRTP Message Formats . . . . . . . . . . . . . . . . . . . 17
       4.1.1.  Message Type Block . . . . . . . . . . . . . . . . . . 17
       4.1.2.  Hash Type Block  . . . . . . . . . . . . . . . . . . . 18
       4.1.3.  Cipher Type Block  . . . . . . . . . . . . . . . . . . 19
       4.1.4.  Public Key Type Block  . . . . . . . . . . . . . . . . 19
       4.1.5.  SAS Type Block . . . . . . . . . . . . . . . . . . . . 19
     4.2.  Hello message  . . . . . . . . . . . . . . . . . . . . . . 20
     4.3.  HelloACK message . . . . . . . . . . . . . . . . . . . . . 21
     4.4.  Commit message . . . . . . . . . . . . . . . . . . . . . . 22
     4.5.  DHPart1 message  . . . . . . . . . . . . . . . . . . . . . 23
     4.6.  DHPart2 message  . . . . . . . . . . . . . . . . . . . . . 24
     4.7.  Confirm1 message . . . . . . . . . . . . . . . . . . . . . 25
     4.8.  Confirm2 message . . . . . . . . . . . . . . . . . . . . . 26
     4.9.  Conf2ACK message . . . . . . . . . . . . . . . . . . . . . 27
     4.10. Error message  . . . . . . . . . . . . . . . . . . . . . . 27
     4.11. GoClear message  . . . . . . . . . . . . . . . . . . . . . 28
     4.12. ClearACK message . . . . . . . . . . . . . . . . . . . . . 29
   5.  Retransmissions  . . . . . . . . . . . . . . . . . . . . . . . 29
   6.  Short Authentication String  . . . . . . . . . . . . . . . . . 30
   7.  IANA Considerations  . . . . . . . . . . . . . . . . . . . . . 32
   8.  Security Considerations  . . . . . . . . . . . . . . . . . . . 32
   9.  Acknowledgments  . . . . . . . . . . . . . . . . . . . . . . . 32
   10. Appendix - ZRTP, SIP, and SDP  . . . . . . . . . . . . . . . . 33
   11. References . . . . . . . . . . . . . . . . . . . . . . . . . . 33
     11.1. Normative References . . . . . . . . . . . . . . . . . . . 33
     11.2. Informative References . . . . . . . . . . . . . . . . . . 34
   Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 35
   Intellectual Property and Copyright Statements . . . . . . . . . . 36


1.  Introduction

   ZRTP is a key agreement protocol that performs a Diffie-Hellman key
   exchange during call setup, in-band in the Real-time Transport
   Protocol (RTP) [1] media stream that has been established using some
   other signaling protocol such as the Session Initiation Protocol
   (SIP) [11].  This generates a shared secret, which is then used to
   generate keys and salt for a Secure RTP (SRTP) [2] session.  ZRTP
   borrows ideas from PGPfone [7].  A reference implementation of ZRTP
   is available as Zfone [8].

   The ZRTP protocol has some nice cryptographic features lacking in
   many other approaches to media session encryption.  Although it uses
   a public key algorithm, it does not rely on a public key
   infrastructure (PKI).  In fact, it does not use persistent public
   keys at all.  It uses ephemeral Diffie-Hellman (DH) with hash
   commitment, and allows the detection of Man in the Middle (MitM)
   attacks by displaying a short authentication string for the users to
   read and compare over the phone.  It has perfect forward secrecy,
   meaning the keys are destroyed at the end of the call, which
   precludes retroactively compromising the call by future disclosures
   of key material.  But even if the users are too lazy to bother with
   short authentication strings, we still get fairly decent
   authentication against a MitM attack, based on a form of key
   continuity.  It does this by caching some key material to use in the
   next call, to be mixed in with the next call's DH shared secret,
   giving it key continuity properties analogous to SSH.  All this is
   done without reliance on a PKI, key certification, trust models,
   certificate authorities, or key management complexity that bedevils
   the email encryption world.  It also does not rely on SIP signaling
   for the key management, and in fact does not rely on any servers at
   all.  It performs its key agreements and key management in a purely
   peer-to-peer manner over the RTP packet stream.

   Most secure phones rely on a Diffie-Hellman exchange to agree on a
   common session key.  But since DH is susceptible to a man-in-the-
   middle (MitM) attack, it is common practice to provide a way to
   authenticate the DH exchange.  In some military systems, this is done
   by depending on digital signatures backed by a centrally-managed PKI.
   A decade of industry experience has shown that deploying centrally
   managed PKIs can be a painful and often futile experience.  PKIs are
   just too messy, and require too much activation energy to get them
   started.  Setting up a PKI requires somebody to run it, which is not
   practical for an equipment provider.  A service provider like a
   carrier might venture down this path, but even then you have to deal
   with cross-carrier authentication, certificate revocation lists, and
   other complexities.  It is much simpler to avoid PKIs altogether,
   especially when developing secure commercial products.  It is
   therefore more common for commercial secure phones to augment the DH
   exchange with a Short Authentication String (SAS) combined with a
   hash commitment at the start of the key exchange, to shorten the
   length of SAS material that must be read aloud.  No PKI is required
   for this approach to authenticating the DH exchange.  The AT&T 3600,
   Eric Blossom's COMSEC secure phones [9], PGPfone [7], and CryptoPhone
   [10] are all examples of products that took this simpler lightweight
   approach.

   The main problem with this approach is inattentive users who may not
   execute the voice authentication procedure, or unattended secure
   phone calls to answering machines that cannot execute it.
   Additionally, some people worry about voice spoofing (the "Rich
   Little" attack), and some worry about trying to use it between people
   who don't know each other's voices.  This is not as much of a problem
   as it seems, because it isn't necessary that they recognize each
   other by their voice; it's only necessary that they detect that the
   voice used for the SAS procedure matches the voice in the rest of the
   phone call.  These concerns are not enough reason to embrace PKIs as
   an alternative, in my opinion.

   A popular and field-proven approach is used by SSH (Secure Shell)
   [12], which Peter Gutmann likes to call the "baby duck" security
   model.  SSH establishes a relationship by exchanging public keys in
   the initial session, when we assume no attacker is present, and this
   makes it possible to authenticate all subsequent sessions.  A
   successful MitM attacker has to have been present in all sessions all
   the way back to the first one, which is assumed to be difficult for
   the attacker.  All this is accomplished without resorting to a
   centrally-managed PKI.

   We use an analogous baby duck security model to authenticate the DH
   exchange in ZRTP.  We don't need to exchange persistent public keys;
   we can simply cache a shared secret and re-use it to authenticate a
   long series of DH exchanges for secure phone calls over a long period
   of time.  If we read aloud just one SAS, and then cache a shared
   secret for later calls to use for authentication, no new voice
   authentication rituals need to be executed.  We just have to remember
   we did one already.

   If we ever lose this cached shared secret, it is no longer available
   for authentication of DH exchanges, so we would have to do a new SAS
   procedure and start over with a new cached shared secret.  Then we
   could go back to omitting the voice authentication on later calls.

   A particularly compelling reason why this approach is attractive is
   that SAS is easiest to implement when a GUI or some sort of display
   is available, which raises the question of what to do when no display
   is available.  We envision some products that implement secure VoIP
   via a local network proxy, which lacks a display in many cases.  If
   we take an approach that greatly reduces the need for a SAS in each
   and every call, we can operate in GUI-less products with greater
   ease.

   It's a good idea to force your opponent to have to solve multiple
   problems in order to mount a successful attack.  Some examples of
   widely differing problems we might like to present him with are:
   stealing a shared secret from one of the parties, being present on
   the very first session and every subsequent session to carry out an
   active MitM attack, and solving the discrete log problem.  We want to
   force the opponent to solve more than one of these problems to
   succeed.

   The protocol can make use of different kinds of shared secrets.
   Each type of shared secret is determined by a different method.  All
   of the shared secrets are hashed together to form a session key to
   encrypt the call.  An attacker must defeat all of the methods in
   order to determine the session key.

   First, there is the shared secret determined entirely by a Diffie-
   Hellman key agreement.  It changes with every call, based on random
   numbers.  An attacker may attempt a classic DH MitM attack on this
   secret, but we can protect against this by displaying and reading
   aloud a SAS, combined with adding a hash commitment at the beginning
   of the DH exchange.

   Second, there is an evolving shared secret, or ongoing shared secret,
   that is automatically changed, refreshed, and cached with every new
   session.  We will call this the cached shared secret, or sometimes
   the retained shared secret.  Each new image of this ongoing secret is
   a non-invertible function of its previous value and the new secret
   derived by the new DH agreement.  It's possible that no cached shared
   secret is available, because there were no previous sessions to
   inherit this value from, or because one side loses its cache.

   There are other approaches to key agreement for SRTP that compute a
   shared secret using information in the signaling.  For example, [14]
   describes how to carry a MIKEY (Multimedia Internet KEYing) [15]
   payload in SDP [16], and [13] describes directly carrying SRTP keying
   and configuration information in SDP.  ZRTP does not rely on the
   signaling to compute a shared secret, but if a client does produce a
   shared secret via the signaling and makes it available to the ZRTP
   protocol, ZRTP can use this shared secret to augment the list of
   shared secrets that will be hashed together to form a session key.
   This way, any security weaknesses that might compromise the shared
   secret contributed by the signaling will not harm the final resulting
   session key.

   There may also be a static shared secret that the two parties agree
   on out-of-band in advance.  A hashed passphrase would suffice.

   The shared secret provided by the signaling (if available), the
   shared secret computed by DH, and the cached shared secret are all
   hashed together to compute the session key for a call.  If the cached
   shared secret is not available, it is omitted from the hash
   computation.  If the signaling provides no shared secret, it is also
   omitted from the hash computation.

   No DH MitM attack can succeed if the ongoing shared secret is
   available to the two parties, but not to the attacker.  This is
   because the attacker cannot compute a common session key with either
   party without knowing the cached secret component, even if he
   correctly executes a classic DH MitM attack.  Mixing in the cached
   shared secret for the session key calculation allows it to act as an
   implicit authenticator to protect the DH exchange, without requiring
   additional explicit HMACs to be computed on the DH parameters.  If
   the cached shared secret is available, a MitM attack would be
   instantly detected by the failure to achieve a shared session key,
   resulting in undecryptable packets.  The protocol can easily detect
   this.  It would be more accurate to say that the MitM attack is not
   merely detected, but thwarted.

   When adding the complexity of additional shared secrets beyond the
   familiar DH key agreement, we must make sure the lack of availability
   of the cached shared secret cannot prevent a call from going through,
   and we must also prevent false alarms that claim an attack was
   detected.

   An added benefit of using these cached shared secrets to mix in with
   the session keys is that it augments the entropy of the session key.
   Even if limits on the size of the DH exchange produce a session key
   with less than 256 bits of real work factor, the added entropy from
   the cached shared secret can bring up all the subsequent session keys
   to the full 256-bit AES key strength, assuming no attacker was
   present in the first call.

   We could have authenticated the DH exchange the same way SSH does it,
   with digital signatures, caching public keys instead of shared
   secrets.  But this approach with caching shared secrets seemed a bit
   simpler, and has the added benefit of adding more entropy to the
   session keys.

   The following sections provide an overview of the ZRTP protocol and
   describe the key agreement algorithm and the RTP header extensions.


2.  Terminology

   In this document, the key words "MUST", "MUST NOT", "REQUIRED",
   "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
   and "OPTIONAL" are to be interpreted as described in RFC 2119 and
   indicate requirement levels for compliant implementations.


3.  Protocol Description

3.1.  Overview

   This section provides a description of how ZRTP works.  This
   description is non-normative in nature but is included to build
   understanding of the protocol.

   ZRTP is negotiated the same way a conventional RTP session is
   negotiated.  Using SIP, the AVP/RTP profile is used in SDP.  The ZRTP
   protocol begins after two endpoints have utilized a signaling
   protocol such as SIP and are ready to send or have already begun
   sending RTP packets.  This specification defines a new RTP extension
   header that is used to carry the ZRTP messages between the
   endpoints.  Since RTP endpoints ignore unknown extension headers, the
   protocol is fully backwards compatible - a ZRTP endpoint attempting
   to perform key agreement with a non-ZRTP endpoint will simply receive
   normal RTP responses and can then inform the user that a secure
   session is not possible and either continue with the insecure session
   or terminate the session depending on the user's security policy.

   The ZRTP exchange begins at the same time that the first RTP packets
   are exchanged between the endpoints.  ZRTP messages can be embedded
   in RTP packets containing actual media samples, or they may be sent
   in separate RTP packets.  For example, if the RTP payload or codec
   supports silence or no-op packets, then these can be used to carry
   ZRTP messages.  If none of these are supported, an RTP packet
   containing comfort noise can be generated to carry a ZRTP message.

   A ZRTP endpoint initiates the exchange by sending a ZRTP Hello
   message to the other endpoint.  The purpose of the Hello message is
   to discover if the other endpoint supports the protocol and to see
   what algorithms the two ZRTP endpoints have in common.

   The Hello message contains the SRTP configuration options, and the
   ZID.  Each instance of ZRTP has a unique 96-bit random ZRTP ID or ZID
   that is generated once at installation time.  It is used to look up
   retained shared secrets in a local cache.  A single global ZID for a
   single installation is the simplest way to implement ZIDs, and may be
   required in applications where the encryption is being done by a
   "bump in the cord" proxy that does not know who is being called.
   However, it is specifically not precluded for an implementation to
   use multiple ZIDs, up to the limit of a separate one per callee.
   This then turns it into a long-lived "association ID" that does not
   apply to any other associations between a different pair of parties.
   It is a goal of this protocol to permit both options to interoperate
   freely.
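
   As a rough illustration of the ZID and cache lookup described above,
   the following Python sketch generates a 96-bit ZID and keeps retained
   secrets in a dictionary keyed by the peer's ZID.  The SecretCache
   class and its method names are hypothetical and are not defined by
   this specification; a real implementation would also need to persist
   the ZID and the cache across restarts.

```python
import secrets

ZID_BYTES = 12  # 96 bits, as stated in the text above


def generate_zid():
    """Return a fresh 96-bit random ZID (generated once per installation)."""
    return secrets.token_bytes(ZID_BYTES)


class SecretCache:
    """Hypothetical in-memory cache of retained secrets, keyed by peer ZID."""

    def __init__(self):
        self._entries = {}

    def store(self, peer_zid, rs1, rs2):
        """Remember the two retained shared secrets for this peer."""
        self._entries[peer_zid] = (rs1, rs2)

    def lookup(self, peer_zid):
        """Return (rs1, rs2) for the peer, or None if nothing is cached."""
        return self._entries.get(peer_zid)
```

   Keying the cache by the received ZID works both for a single global
   ZID and for one ZID per callee, since the lookup only depends on the
   value the peer actually sends.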

   A response to a ZRTP Hello message is a ZRTP HelloACK message.  The
   HelloACK message simply acknowledges receipt of the Hello message and
   indicates support for the ZRTP protocol.  Since RTP uses best effort
   UDP transport, ZRTP has retransmission timers in case of lost
   datagrams.  There are two timers, both with exponential backoff
   mechanisms.  One timer is used for retransmissions of Hello messages
   and the other is used for retransmissions of all other messages after
   receipt of a HelloACK which indicates support of ZRTP by the other
   endpoint.

   After both endpoints exchange Hello and HelloACK messages, the key
   agreement exchange can begin with the ZRTP Commit message.  An
   example call flow is shown in Figure 1 below.  Note that the order of
   the Hello/HelloACK exchanges in F1/F2 and F3/F4 may be reversed.
   Also, an endpoint that receives a Hello message and wishes to
   immediately begin the ZRTP key agreement can omit the HelloACK and
   send the Commit instead.  In Figure 1, this would result in messages
   F2, F3, and F4 being omitted.  Note that the endpoint which sends the
   Commit message is considered the initiator of the ZRTP session and
   drives the key agreement exchange.


   Alice                                      Bob
     |                                         |
     | Alice and Bob establish a media session.|
     |                                         |
     |                   RTP                   |
     |<=======================================>|
     |                                         |
     | Hello (ver,cid,hash,cipher,pkt,sas,Alice's ZID) F1
     |---------------------------------------->|
     |                             HelloACK F2 |
     |<----------------------------------------|
     | Hello (ver,cid,hash,cipher,pkt,sas,Bob's ZID) F3
     |<----------------------------------------|
     | HelloACK F4                             |
     |---------------------------------------->|
     |                                         |
     |        Bob acts as the initiator        |
     |                                         |
     |   Commit (Bob's ZID,hash,cipher,pkt,hvi) F5
     |<----------------------------------------|
     | DHPart1 (pvr,rs1IDr,rs2IDr,sigsIDr,srtpsIDr,other_secretIDr) F6
     |---------------------------------------->|
     | DHPart2 (pvi,rs1IDi,rs2IDi,sigsIDi,srtpsIDi,other_secretIDi) F7
     |<----------------------------------------|
     |                                         |
     | Alice and Bob generate SRTP session key.|
     |                                         |
     |               SRTP begins               |
     |<=======================================>|
     |                                         |
     | Confirm1 (plaintext,sasflag,hmac) F8    |
     |---------------------------------------->|
     |    Confirm2 (plaintext,sasflag,hmac) F9 |
     |<----------------------------------------|
     | Conf2ACK F10                            |
     |---------------------------------------->|
   Figure 1. Establishment of an SRTP session using ZRTP


3.2.  Key Agreement Algorithm

   The key agreement algorithm has four phases that are described
   normatively in the following sections.

3.2.1.  Discovery

   During the discovery phase, a ZRTP endpoint discovers if the other
   endpoint supports ZRTP and which ZRTP version, hash, cipher, public
   key type, and sas algorithms are supported.  In addition, each
   endpoint sends and discovers ZIDs.  The received ZID is used to
   retrieve previous retained shared secrets, rs1 and rs2.  If the
   endpoint has other secrets, then they are also collected.  The
   signaling secret (sigs) is passed from the signaling protocol used
   to establish the RTP session.  For SIP, it is the dialog identifier
   of a Secure SIP (SIPS) session: a string composed of Call-ID, to tag,
   and from tag.  From the definitions in RFC 3261 [11]:

   sigs = hash(call-id | to-tag | from-tag)

   Note: the dialog identifier of a non-secure SIP session should not be
   considered a signaling secret as it has no confidentiality
   protection.

   The SRTP secret (srtps) is derived from the SRTP master key and salt.
   This information may have been passed in the signaling, for example
   using MIKEY or SDP Security Descriptions:

   srtps = hash(SRTP master key | SRTP master salt)

   Additional shared secrets can be defined and used as other_secret.
   If no secret of a given type is available, a random value is
   generated and used for that secret to ensure a mismatch in the hash
   comparisons in the DHPart1 and DHPart2 messages.  This prevents an
   eavesdropper from knowing how many shared secrets are available
   between the endpoints.
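
   The secret derivations above can be sketched in Python as follows.
   This is only an illustration under stated assumptions: SHA-256
   stands in for the negotiated hash, "|" is taken to be plain octet
   concatenation, SIP values are encoded as UTF-8, and the function
   names and the 32-octet size of the random substitute are
   hypothetical.

```python
import hashlib
import secrets


def zrtp_hash(data):
    # Assumption: SHA-256 stands in for the negotiated hash algorithm.
    return hashlib.sha256(data).digest()


def signaling_secret(call_id, to_tag, from_tag):
    # sigs = hash(call-id | to-tag | from-tag), "|" read as concatenation.
    return zrtp_hash((call_id + to_tag + from_tag).encode("utf-8"))


def srtp_secret(master_key, master_salt):
    # srtps = hash(SRTP master key | SRTP master salt)
    return zrtp_hash(master_key + master_salt)


def secret_or_random(secret):
    # If no secret of a given type is available, use a random value so that
    # its ID cannot match the peer's and the number of real secrets is hidden.
    return secret if secret is not None else secrets.token_bytes(32)
```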

   A Hello message can be sent at any time, but is usually sent at the
   start of an RTP session to determine if the other endpoint supports
   ZRTP, and also if the SRTP implementations are compatible.  A Hello
   message is retransmitted using timer T1 and an exponential backoff
   mechanism detailed in Section 5 until the receipt of a HelloACK
   message or a Commit message.
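
   The shape of such an exponential backoff can be sketched as below.
   The actual timer values and limits are those given in Section 5; the
   parameters here (initial, cap, max_sends) are hypothetical
   placeholders chosen only to show the doubling behavior.

```python
def backoff_schedule(initial, cap, max_sends):
    """Yield the delay before each retransmission, doubling up to a cap."""
    delay = initial
    for _ in range(max_sends):
        yield delay
        delay = min(delay * 2, cap)
```

   A sender would stop consuming the schedule as soon as a HelloACK or
   Commit message arrives.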

3.2.2.  Hash Commitment

   The hash commitment is performed by the initiator of the ZRTP
   exchange.  From the intersection of the algorithms in the sent and
   received Hello messages, the initiator chooses a hash, cipher, public
   key type, and sas algorithm to be used.

   The key agreement begins with the initiator choosing a fresh random
   Diffie-Hellman (DH) secret value (svi) based on the chosen public key
   type value, and computing the public value.  (Note that to speed up
   processing, this computation can be done in advance.)  For guidance
   on generating random numbers, see the section on Random Number
   Generation.  The Diffie-Hellman secret value, svi, SHOULD be twice as
   long as the AES key length.  This means, if AES 128 is used, the DH
   secret value SHOULD be 256 bits long.  If AES 256 is used, the secret
   value SHOULD be 512 bits long.

   pvi = g^svi mod p

   where g and p are determined by the public key type value.  The
   initiator then computes a hash, hvi, of the public value using the
   chosen hash algorithm.  The hvi also covers the set of hash, cipher,
   pkt, and sas types from the responder's Hello message, in the
   following order:

   hvi = hash(pvi | hashr1-5 | cipherr1-5 | pktr1-5 | sasr1-5)

   The information from the responder's Hello message is included in the
   hash calculation to prevent a bid-down attack by modification of the
   responder's Hello message.
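
   A minimal Python sketch of the initiator's computation follows.  It
   is illustrative only: the 127-bit prime and generator are toy values
   that are far too small for real use, SHA-256 stands in for the
   chosen hash, pvi is assumed to be encoded as fixed-length big-endian
   octets, and the type values passed in are hypothetical byte strings
   rather than the encodings defined in Section 4.

```python
import hashlib
import secrets

# Toy parameters for illustration only; the real g and p are determined
# by the public key type value.
P = 2**127 - 1
G = 3
PV_LEN = (P.bit_length() + 7) // 8


def make_commit(secret_bits, hello_types):
    """Return (svi, pvi, hvi) for the initiator.

    hello_types lists the responder's hash, cipher, pkt, and sas types
    as byte strings, in that order.
    """
    svi = secrets.randbits(secret_bits)      # fresh random DH secret value
    pvi = pow(G, svi, P)                     # pvi = g^svi mod p
    data = pvi.to_bytes(PV_LEN, "big") + b"".join(hello_types)
    hvi = hashlib.sha256(data).digest()      # commitment sent in Commit
    return svi, pvi, hvi
```

   Only hvi is sent in the Commit message; pvi itself is revealed later
   in DHPart2, which is what makes the hash a commitment.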

   Note: If both sides send Commit messages initiating a secure session
   at the same time, the Commit message with the lowest hvi value is
   discarded and the other side is the initiator.  This breaks the tie,
   allowing the protocol to proceed from this point with a clear
   definition of who is the initiator and who is the responder.
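
   The tie-break can be expressed as a single comparison.  In this
   sketch the function name is hypothetical, and hvi values are assumed
   to be compared as big-endian unsigned integers, which the text above
   does not spell out.

```python
def keep_own_commit(own_hvi, peer_hvi):
    """Return True if this endpoint's Commit survives the tie-break.

    The Commit with the lowest hvi is discarded, so the endpoint with
    the higher hvi remains the initiator.
    """
    return int.from_bytes(own_hvi, "big") > int.from_bytes(peer_hvi, "big")
```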

3.2.3.  Diffie-Hellman Exchange

   The purpose of the Diffie-Hellman exchange is for the two ZRTP
   endpoints to generate a new shared secret, s0.  In addition, the
   endpoints discover if they have any shared secrets in common.  If
   they do, this exchange allows them to discover how many and agree on
   an ordering for them: s1, s2, etc.

3.2.3.1.  Responder Behavior

   Upon receipt of the Commit message, the responder generates its own
   fresh random DH secret value, svr, and computes the public value.
   (Note that to speed up processing, this computation can be done in
   advance.)  For guidance on random number generation, see the section
   on Random Number Generation.  The Diffie-Hellman secret value, svr,
   SHOULD be twice as long as the AES key length.  This means, if AES
   128 is used, the DH secret value SHOULD be 256 bits long.  If AES 256
   is used, the secret value SHOULD be 512 bits long.

   pvr = g^svr mod p

   The final shared secret, s0, is calculated by hashing the
   concatenation of the Diffie-Hellman shared secret (DHSS) followed by
   the (possibly empty) set of shared secrets that are actually shared
   between the initiator and responder.  For computing the hash, the
   shared secrets are sorted by ascending order of the initiator's
   corresponding shared secret IDs.  The remainder of this section
   describes an algorithm to accomplish this.

   First, an HMAC keyed hash is calculated using the first retained
   shared secret, rs1, as the key on the string "Responder".  This
   generates a retained secret ID, rs1IDr, which is truncated to 64
   bits.  HMACs are calculated in a similar way for additional shared
   secrets:

   rs1IDr = HMAC(rs1, "Responder")

   rs2IDr = HMAC(rs2, "Responder")

   sigsIDr = HMAC(sigs, "Responder")

   srtpsIDr = HMAC(srtps, "Responder")

   other_secretIDr = HMAC(other_secret, "Responder")

   A ZRTP DHPart1 message is generated containing pvr and the set of
   keyed hashes (HMACs) derived from the possibly shared secrets.
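
   The secret ID computation can be sketched in Python as below.  Two
   points are assumptions rather than statements of this specification:
   HMAC-SHA-256 stands in for the HMAC based on the negotiated hash,
   and truncation is taken to keep the leftmost 64 bits.  The function
   names are hypothetical.

```python
import hashlib
import hmac


def secret_id(secret, role):
    # The secret is the HMAC key; the role string is the message.
    # Assumption: truncation keeps the leftmost 8 octets (64 bits).
    return hmac.new(secret, role.encode("ascii"), hashlib.sha256).digest()[:8]


def responder_ids(secrets_by_name):
    """Map names such as "rs1" or "sigs" to the IDs carried in DHPart1."""
    return {name: secret_id(value, "Responder")
            for name, value in secrets_by_name.items()}
```

   Because the role string differs, the same secret yields different
   IDs in DHPart1 and DHPart2, so an ID cannot simply be echoed back.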

   Upon receipt of the DHPart2 message, the responder checks that the
   initiator's public DH value is not equal to 1 or p-1.  An attacker
   might inject a false DHPart2 packet with a value of 1 or p-1 for
   g^svi mod p, which would cause a disastrously weak final DH result to
   be computed.  If pvi is 1 or p-1, the user should be alerted to the
   attack and the protocol must be aborted.  Otherwise, the responder
   recomputes the hash commitment from the public DH value in the
   DHPart2 message, as described for hvi in Section 3.2.2, and compares
   it with the hvi from the Commit.  If they are different, a MitM
   attack is taking place and the user is alerted.
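
   The two checks on the DHPart2 message might look like the following
   sketch.  It recomputes hvi as defined in Section 3.2.2, assuming
   SHA-256 and a fixed-length big-endian encoding of pvi.  The ZrtpAbort
   exception, the function name, and the parameter names are
   hypothetical.

```python
import hashlib
import hmac


class ZrtpAbort(Exception):
    """Hypothetical exception signalling that the exchange must stop."""


def check_dhpart2(pvi, p, hvi, hello_types, pv_len):
    # Reject the degenerate public values 1 and p-1 described in the text.
    if pvi in (1, p - 1):
        raise ZrtpAbort("weak public DH value")
    # Recompute the commitment over pvi and the responder's Hello types.
    expected = hashlib.sha256(pvi.to_bytes(pv_len, "big") + hello_types).digest()
    # Constant-time comparison against the hvi received in the Commit.
    if not hmac.compare_digest(expected, hvi):
        raise ZrtpAbort("hash commitment mismatch: possible MitM attack")
```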

   The responder then calculates the Diffie-Hellman result:

   DHResult = pvi^svr mod p

   The responder then calculates the Diffie-Hellman shared secret:

   DHSS = hash(DHResult)
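
   The following sketch shows that both sides arrive at the same DHSS.
   The prime and generator are toy values that are far too small for
   real use, SHA-256 stands in for the negotiated hash, and DHResult is
   assumed to be encoded as fixed-length big-endian octets before
   hashing.

```python
import hashlib
import secrets

# Toy group for illustration only; not a group defined by this document.
P = 2**127 - 1
G = 3
PV_LEN = (P.bit_length() + 7) // 8


def dh_shared_secret(peer_public, own_secret):
    dh_result = pow(peer_public, own_secret, P)    # DHResult
    return hashlib.sha256(dh_result.to_bytes(PV_LEN, "big")).digest()  # DHSS


svi = secrets.randbits(256)   # initiator's secret value
svr = secrets.randbits(256)   # responder's secret value
pvi = pow(G, svi, P)          # sent in DHPart2
pvr = pow(G, svr, P)          # sent in DHPart1

# The responder uses (pvi, svr); the initiator uses (pvr, svi).
assert dh_shared_secret(pvi, svr) == dh_shared_secret(pvr, svi)
```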

   The set of five shared secret IDs received from the DHPart2 message
   is stored as set A.

   The responder then calculates the set of secret IDs that are expected
   to be received from the initiator in the DHPart2 message:

   rs1IDi = HMAC(rs1, "Initiator")

   rs2IDi = HMAC(rs2, "Initiator")

   sigsIDi = HMAC(sigs, "Initiator")

   srtpsIDi = HMAC(srtps, "Initiator")

   other_secretIDi = HMAC(other_secret, "Initiator")

   The set (rs1IDi, rs2IDi, sigsIDi, srtpsIDi, other_secretIDi) is set
   B. Set C is the intersection of set A and set B. Set C is then sorted
   in ascending numerical order.  Set C will contain between zero and
   five secret IDs.  Set D is then created as the actual secrets
   corresponding to the secret IDs in set C in the same order.  The set
   D is expanded to 5 values by adding in null secrets: s1, s2, s3, s4,
   and s5.  The final shared secret, s0, is calculated by hashing the
   concatenation of the DHSS and the set of non-null shared secrets.  As
   a result, the null secrets have no effect on the concatenation
   operation:

   s0 = hash(DHSS | s1 | s2 | s3 | s4 | s5)
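
   The responder's set construction above might look like the sketch
   below.  The names secret_id and responder_s0 are hypothetical, and
   the use of SHA-256 inside the HMAC is an assumption; the truncation
   of each secret ID to 64 bits follows the message descriptions in
   this document.

```python
import hashlib
import hmac


def secret_id(secret: bytes, role: bytes) -> bytes:
    """HMAC(secret, role), truncated to 64 bits (8 octets)."""
    return hmac.new(secret, role, hashlib.sha256).digest()[:8]


def responder_s0(dhss: bytes, own_secrets: list, received_ids: list) -> bytes:
    """Derive s0 on the responder side from DHSS and the matching secrets."""
    # Set A: the secret IDs received in the DHPart2 message.
    set_a = set(received_ids)
    # Set B: the IDs the initiator is expected to send, mapped to the
    # secret behind each ID.
    set_b = {secret_id(s, b"Initiator"): s for s in own_secrets}
    # Set C: the intersection, sorted in ascending numerical order.
    set_c = sorted(set_a.intersection(set_b),
                   key=lambda i: int.from_bytes(i, "big"))
    # Set D: the actual secrets, in the same order as set C.
    set_d = [set_b[i] for i in set_c]
    # Null secrets contribute nothing to the concatenation, so only the
    # matched secrets are appended to DHSS.
    return hashlib.sha256(dhss + b"".join(set_d)).digest()
```

   If no secret IDs match, set D is empty and s0 reduces to hash(DHSS).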

3.2.3.2.  Initiator Behavior

   Upon receipt of the DHPart1 message, the initiator checks that the
   responder's public DH value is not equal to 1 or p-1.  An attacker
   might inject a false DHPart1 packet with a value of 1 or p-1 for
   g^svr mod p, which would cause a disastrously weak final DH result to
   be computed.  If pvr is 1 or p-1, the user should be alerted to the
   attack and the protocol must be aborted.

   If pvr is not 1 or p-1, the initiator looks up any retained shared
   secrets associated with the responder's ZID.  The final shared
   secret, s0, is calculated by hashing the concatenation of the DHSS
   followed by the (possibly empty) set of shared secrets that are
   actually shared between the initiator and responder.  For computing
   the hash, the shared secrets are sorted by ascending order of the
   initiator's corresponding shared secret IDs.  The remainder of this
   section describes an algorithm to accomplish this.

   First, an HMAC keyed hash is calculated using the first retained
   shared secret, rs1, as the key on the string "Initiator" which
   generates a retained secret ID, rs1IDi, which is truncated to 64
   bits.  HMACs are calculated in a similar way for additional shared
   secrets:

   rs1IDi = HMAC(rs1, "Initiator")

   rs2IDi = HMAC(rs2, "Initiator")

   sigsIDi = HMAC(sigs, "Initiator")


   srtpsIDi = HMAC(srtps, "Initiator")

   other_secretIDi = HMAC(other_secret, "Initiator")

   The initiator then sends a DHPart2 message containing the initiator's
   public DH value and the set of calculated retained secret IDs.

   The initiator calculates the same Diffie-Hellman result using:

   DHResult = pvr^svi mod p

   The initiator then calculates the DH shared secret using:

   DHSS = hash(DHResult)

   The set of five shared secret IDs received in the DHPart1 message are
   stored as set A.

   The initiator then calculates the set of secret IDs that are expected
   to be received from the responder in the DHPart1 message:

   rs1IDr = HMAC(rs1, "Responder")

   rs2IDr = HMAC(rs2, "Responder")

   sigsIDr = HMAC(sigs, "Responder")

   srtpsIDr = HMAC(srtps, "Responder")

   other_secretIDr = HMAC(other_secret, "Responder")

   The set (rs1IDr, rs2IDr, sigsIDr, srtpsIDr, other_secretIDr) is B.
   Set C is the intersection of set A and set B. Set C will contain
   between zero and five secret IDs.  Set D is then created as the
   actual secrets corresponding to the secret IDs in set C. Set E is the
   set of secret IDs that corresponds to the secrets in set D sent in
   the DHPart2 message.  Set E is then sorted in ascending numerical
   order.  Set D is then sorted to the same order as the corresponding
   secrets in set E.

   The set D is expanded to 5 values by adding in null secrets: s1, s2,
   s3, s4, and s5.  The final shared secret, s0, is calculated by
   hashing the concatenation of the DHSS and the set of non-null shared
   secrets.  As a result, the null secrets have no effect on the
   concatenation operation:

   s0 = hash(DHSS | s1 | s2 | s3 | s4 | s5)
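
   The initiator's version of the algorithm differs only in how the
   matched secrets are found and ordered: they are matched using the
   responder's IDs but sorted by the initiator's own IDs, so that both
   sides concatenate them in the same order.  A minimal sketch follows,
   with hypothetical names secret_id and initiator_s0 and SHA-256
   assumed for both the hash and the HMAC.

```python
import hashlib
import hmac


def secret_id(secret: bytes, role: bytes) -> bytes:
    """HMAC(secret, role), truncated to 64 bits (8 octets)."""
    return hmac.new(secret, role, hashlib.sha256).digest()[:8]


def initiator_s0(dhss: bytes, own_secrets: list, received_ids: list) -> bytes:
    """Derive s0 on the initiator side from DHSS and the matching secrets."""
    # Set A: the secret IDs received in the DHPart1 message.
    set_a = set(received_ids)
    # Set D: secrets whose expected responder ID (set B) appears in set A.
    set_d = [s for s in own_secrets if secret_id(s, b"Responder") in set_a]
    # Set E is the initiator IDs of those secrets, as sent in DHPart2.
    # Sorting D by those IDs puts it in ascending order of set E.
    set_d.sort(key=lambda s: int.from_bytes(secret_id(s, b"Initiator"), "big"))
    return hashlib.sha256(dhss + b"".join(set_d)).digest()
```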


3.2.4.  Confirmation and Switch to SRTP

   The SRTP master key and master salt are then generated using the
   shared secret.  Separate SRTP keys and salts are used in each
   direction for each media stream.  Unless otherwise specified, ZRTP
   uses SRTP with no MKI, 32 bit authentication using HMAC-SHA1, AES-CM
   128 or 256 bit key length, 112 bit session salt key length, 2^48 key
   derivation rate, and SRTP prefix length 0.

   The ZRTP initiator encrypts and the ZRTP responder decrypts packets
   by using srtpkeyi and srtpsalti, which are generated by:

   srtpkeyi = HMAC(s0,"Initiator SRTP master key")

   srtpsalti = HMAC(s0,"Initiator SRTP master salt")

   The ZRTP responder encrypts and the ZRTP initiator decrypts packets
   by using srtpkeyr and srtpsaltr, which are generated by:

   srtpkeyr = HMAC(s0,"Responder SRTP master key")

   srtpsaltr = HMAC(s0,"Responder SRTP master salt")

   The HMAC key is generated by:

   hmackey = HMAC(s0,"HMAC key")

   Both sides now discard the rs2 value and store rs1 as rs2.  A new rs1
   is calculated from s0:

   rs1 = HMAC(s0, "retained secret")

   The endpoints can now switch to SRTP and begin packet encryption.
   The ZRTP Initiator and Responder use their own keying material for
   the SRTP session.  No MKI is used and a 32 bit authentication tag is
   used.
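
   The derivations in this section can be summarized in a short sketch.
   The helper names (_kdf, derive_keys, rotate_retained) are
   hypothetical and HMAC-SHA256 is assumed.  The sketch returns the full
   HMAC outputs; how they are shortened to the SRTP key and salt lengths
   is not stated in this section and is left out.

```python
import hashlib
import hmac


def _kdf(s0: bytes, label: str) -> bytes:
    """HMAC(s0, label) using SHA-256 (assumed)."""
    return hmac.new(s0, label.encode("ascii"), hashlib.sha256).digest()


def derive_keys(s0: bytes) -> dict:
    """Derive the per-direction SRTP keying material and the HMAC key."""
    return {
        "srtpkeyi": _kdf(s0, "Initiator SRTP master key"),
        "srtpsalti": _kdf(s0, "Initiator SRTP master salt"),
        "srtpkeyr": _kdf(s0, "Responder SRTP master key"),
        "srtpsaltr": _kdf(s0, "Responder SRTP master salt"),
        "hmackey": _kdf(s0, "HMAC key"),
    }


def rotate_retained(s0: bytes, rs1: bytes) -> tuple:
    """Return (new_rs1, new_rs2): the old rs2 is discarded, the old rs1
    becomes rs2, and a new rs1 is derived from s0."""
    return _kdf(s0, "retained secret"), rs1
```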

   The ZRTP Confirm1 and Confirm2 messages are sent for two reasons.
   First, they confirm that all the key agreement calculations were
   successful and the encryption is working, and they enable us to
   automatically detect a DH MitM attack from a reckless attacker who
   does not know the retained shared secret.  Second, they enable us to
   transmit the SASflag under cover of SRTP encryption, shielding it
   from a passive observer who would like to know if the human users are
   in the habit of diligently verifying the SAS.

   In the Confirm1 and Confirm2 messages, the sasflag Boolean is
   converted to an octet called sasflagoctet (resulting in either 0x00
   or 0x01).  Confirm1 and Confirm2 messages contain an HMAC of some
   known plaintext and the sasflagoctet.  The HMAC is explicitly
   included in the payload because we may not always be able to rely on
   the built-in authentication tag in SRTP, which might be configured to
   different sizes, including none.

   hmac = HMAC(hmackey, "known plaintext" | sasflagoctet )

   This information is not carried in the extension header but inserted
   at the start of the SRTP payload.
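
   As an illustration, the Confirm HMAC could be computed and checked as
   below.  The function names are hypothetical and HMAC-SHA256 is
   assumed; its 32 octet output matches the 8 word hmac field shown in
   the Confirm1 and Confirm2 figures.  The comparison uses a
   constant-time function, which is a design choice of this sketch
   rather than a requirement stated in the text.

```python
import hashlib
import hmac

KNOWN_PLAINTEXT = b"known plaintext"


def confirm_hmac(hmackey: bytes, sasflag: bool) -> bytes:
    """hmac = HMAC(hmackey, "known plaintext" | sasflagoctet)."""
    sasflagoctet = b"\x01" if sasflag else b"\x00"
    return hmac.new(hmackey, KNOWN_PLAINTEXT + sasflagoctet,
                    hashlib.sha256).digest()


def verify_confirm(hmackey: bytes, sasflag: bool, received: bytes) -> bool:
    """Check a received hmac against the locally computed value."""
    return hmac.compare_digest(confirm_hmac(hmackey, sasflag), received)
```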

   The Conf2ACK message completes the exchange.

   The optional GoClear message is used to switch from SRTP back to RTP.
   To avoid relying on the optional SRTP authentication tag, the GoClear
   contains an HMAC of the string "GoClear" computed with the hmackey
   derived from the shared secret:

   clear_hmac = HMAC(hmackey, "GoClear")

   A GoClear message receives either a ClearACK message or an Error
   message, which indicates that the ZRTP endpoint does not support the
   GoClear mechanism or that the GoClear has failed authentication (the
   clear_hmac does not validate).
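
   A receiver's handling of GoClear might be sketched as follows.  The
   function name handle_goclear and its goclear_supported parameter are
   hypothetical, the return value is simply the name of the message to
   send back, and HMAC-SHA256 is assumed.

```python
import hashlib
import hmac


def handle_goclear(hmackey: bytes, clear_hmac: bytes,
                   goclear_supported: bool = True) -> str:
    """Return the name of the reply to a GoClear: "ClearACK" or "Error"."""
    # An endpoint that does not implement GoClear answers with Error.
    if not goclear_supported:
        return "Error"
    # clear_hmac = HMAC(hmackey, "GoClear")
    expected = hmac.new(hmackey, b"GoClear", hashlib.sha256).digest()
    if hmac.compare_digest(expected, clear_hmac):
        return "ClearACK"
    # The clear_hmac did not validate.
    return "Error"
```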

3.3.  Random Number Generation

   The ZRTP protocol uses random numbers for cryptographic key material,
   notably for the DH secret exponents, which must be freshly generated
   with each session.  Whenever a random number is needed, all of the
   following criteria must be satisfied:

   It MUST be derived from a physical entropy source, such as RF noise,
   acoustic noise, thermal noise, high resolution timings of
   environmental events, or other unpredictable physical sources of
   entropy.  Chapter 10 of [4] gives a detailed explanation of
   cryptographic grade random numbers and provides guidance for
   collecting suitable entropy.  The raw entropy must be distilled and
   processed through a deterministic random bit generator (DRBG).
   Examples of DRBGs may be found in NIST SP 800-90 [5], and in [4].

   It MUST be freshly generated, meaning that it must not have been used
   in a previous calculation.

   It MUST be greater than or equal to two, and less than or equal to
   2^L - 1, where L is the number of random bits required.

   It MUST be chosen with equal probability from the entire available
   number space, e.g., [2, 2^L - 1].
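
   The range and uniformity criteria can be illustrated with Python's
   secrets module, which draws from the operating system's
   cryptographic random source.  Whether that source satisfies the
   physical entropy requirement above depends on the platform and is an
   assumption of this sketch; the function name random_value is
   hypothetical.

```python
import secrets


def random_value(bits: int) -> int:
    """Return a uniformly chosen integer in [2, 2^bits - 1]."""
    if bits < 2:
        raise ValueError("bits must be at least 2")
    # randbelow(n) is uniform over [0, n - 1]; shifting by 2 maps it
    # onto [2, 2^bits - 1], which holds exactly 2^bits - 2 values.
    return 2 + secrets.randbelow((1 << bits) - 2)
```

   A fresh call is made each time a value is needed, so no value is
   reused across calculations.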


4.  RTP Header Extensions

   This specification defines a new RTP header extension used for all
   ZRTP messages.  When used, the X bit is set in the RTP header to
   indicate the presence of the RTP header extension.

   Section 5.3.1 in RFC 3550 defines the format of an RTP Header
   extension.  The Header extension is appended to the RTP header.  The
   first 16 bits are an identifier for the header extension, and the
   following 16 bits are the length of the extension header in 32 bit
   words.  All word lengths referenced in this specification follow RFC
   3550 and are 32 bits or 4 octets.  All integer fields are carried in
   network byte order, that is, most significant byte (octet) first,
   commonly known as big-endian.  Each ZRTP message is carried in a
   single RTP header extension, which is identified by the value 0x505A.
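
   As a rough sketch, the 4 octet extension header can be written and
   read with Python's struct module.  The function names are
   hypothetical; the length field counts the 32 bit words that follow
   the identifier and length fields, as in RFC 3550.

```python
import struct

ZRTP_EXT_ID = 0x505A


def pack_extension(body: bytes) -> bytes:
    """Prefix a ZRTP message body with the RTP header extension fields."""
    if len(body) % 4:
        raise ValueError("body must be a whole number of 32 bit words")
    # "!HH" = two unsigned 16 bit fields in network byte order.
    return struct.pack("!HH", ZRTP_EXT_ID, len(body) // 4) + body


def unpack_extension(data: bytes) -> bytes:
    """Return the ZRTP message body carried in a header extension."""
    ext_id, words = struct.unpack_from("!HH", data)
    if ext_id != ZRTP_EXT_ID:
        raise ValueError("not a ZRTP header extension")
    body = data[4:4 + 4 * words]
    if len(body) != 4 * words:
        raise ValueError("extension is truncated")
    return body
```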

4.1.  ZRTP Message Formats

   ZRTP messages are designed to simplify endpoint parsing requirements
   and to reduce the opportunities for buffer overflow attacks (a good
   goal of any security extension is to avoid introducing new attack
   vectors).

   ZRTP uses 8 octet blocks (2 words) to encode many ZRTP parameters.
   These fixed-length blocks are used for Message Type, Hash Type,
   Cipher Type, and Public Key Type.  The values in the blocks are ASCII
   strings which are extended with spaces (0x20) to make them 8
   characters long.  Currently defined block values are listed in Tables
   1-4 below.  Additional block values may be defined and used.
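
   The fixed-length block encoding is simple enough to show in a few
   lines.  The helper names encode_block and decode_block are
   hypothetical.

```python
def encode_block(name: str) -> bytes:
    """Encode a block value as 8 ASCII octets, padded with spaces (0x20)."""
    raw = name.encode("ascii")
    if len(raw) > 8:
        raise ValueError("block values are at most 8 characters")
    return raw.ljust(8, b" ")


def decode_block(block: bytes) -> str:
    """Decode an 8 octet block and strip the trailing space padding."""
    if len(block) != 8:
        raise ValueError("a block is exactly 8 octets")
    return block.decode("ascii").rstrip(" ")
```

   Because every block has the same length, a parser can step through a
   message in fixed 8 octet strides without length-prefixed strings.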

   ZRTP uses this ASCII encoding to simplify debugging and make it
   "ethereal friendly".

4.1.1.  Message Type Block

   Currently eleven Message Type Blocks are defined - they represent the
   set of ZRTP message primitives.  ZRTP endpoints MUST support the
   Hello, HelloACK, Commit, DHPart1, DHPart2, Confirm1, Confirm2,
   Conf2ACK, and Error block types.  They MAY support GoClear and
   ClearACK.


    Message Type Block   |  Meaning
    ---------------------------------------------------
    Hello                |  Hello Message
                         |  defined in Section 4.2
    ---------------------------------------------------
    HelloACK             |  HelloACK Message
                         |  defined in Section 4.3
    ---------------------------------------------------
    Commit               |  Commit Message
                         |  defined in Section 4.4
    ---------------------------------------------------
    DHPart1              |  DHPart1 Message
                         |  defined in Section 4.5
    ---------------------------------------------------
    DHPart2              |  DHPart2 Message
                         |  defined in Section 4.6
    ---------------------------------------------------
    Confirm1             |  Confirm1 Message
                         |  defined in Section 4.7
    ---------------------------------------------------
    Confirm2             |  Confirm2 Message
                         |  defined in Section 4.8
    ---------------------------------------------------
    Conf2ACK             |  Conf2ACK Message
                         |  defined in Section 4.9
    ---------------------------------------------------
    Error                |  Error Message
                         |  defined in Section 4.10
    ---------------------------------------------------
    GoClear              |  GoClear Message
                         |  defined in Section 4.11
    ---------------------------------------------------
    ClearACK             |  ClearACK Message
                         |  defined in Section 4.12
    ---------------------------------------------------
    Table 1. Message Block Type Values

4.1.2.  Hash Type Block

   Only one Hash Type is currently defined, SHA256, and all ZRTP
   endpoints MUST support this hash.  Additional Hash Types can be
   registered and used.


    Hash Type Block      |  Meaning
    ---------------------------------------------------
    SHA256               |  SHA-256 Hash defined in [SHA-256]
    ---------------------------------------------------
    Table 2. Hash Block Type Values


4.1.3.  Cipher Type Block

   All ZRTP endpoints MUST support AES128 and MAY support AES256 or
   other Cipher Types.  Also, if AES 128 is used, DH3072 SHOULD be used.
   If AES 256 is used, DH4096 SHOULD be used.


     Cipher Type Block    |  Meaning
    ---------------------------------------------------
    AES128                |  AES-CM with 128 bit keys
                          |  as defined in RFC 3711
    ---------------------------------------------------
    AES256                |  AES-CM with 256 bit keys
                          |  as defined in RFC 3711
    ---------------------------------------------------
    Table 3. Cipher Block Type Values

4.1.4.  Public Key Type Block

   All ZRTP endpoints MUST support DH3072 and MAY support DH4096.  ZRTP
   endpoints MUST use the DH generator function g=2.  The choice of AES
   key length is coupled to the choice of public key type.  If AES 128
   is chosen, DH3072 SHOULD be used.  If AES 256 is chosen, DH4096
   SHOULD be used.


     Public Key Type Block|  Meaning
    ---------------------------------------------------
    DH3072                |  DH with p=3072 bit prime
                          |  as defined in RFC 3526
    ---------------------------------------------------
    DH4096                |  DH with p=4096 bit prime
                          |  as defined in RFC 3526
    ---------------------------------------------------
    Table 4. Public Key Block Type Values

4.1.5.  SAS Type Block

   All ZRTP endpoints MAY support the libase32 Short Authentication
   String scheme or other SAS schemes.  The optional ZRTP SAS is
   described in Section 6.


     SAS Type Block       |  Meaning
    ---------------------------------------------------
    libase32              |  Short Authentication String using
                          |  libbase32 encoding defined in Section 6.
    ---------------------------------------------------
    Table 5. SAS Block Type Values


4.2.  Hello message

   The Hello message has the format shown in Figure 2 below.  The header
   extension payload contains the ZRTP version number and the list of
   algorithms supported by SRTP.

   The Hello ZRTP message begins with the ZRTP header extension field
   followed by the 32 bit word count of the header field.  Next is a
   word containing the version (ver) of ZRTP.  For this specification,
   the version is the string "0.01".  Next is the Client Identifier
   string (cid) which is 15 octets long and identifies the vendor and
   release of the ZRTP software.  The Passive bit (P) is a Boolean
   normally set to False.  A ZRTP endpoint which is configured to never
   initiate secure sessions is regarded as passive, and would set the P
   bit to True.  Next are the lists of supported Hash Types, Cipher
   Types, Public Key Types, and SAS Types.  Five possible algorithms are
   listed for each using the Blocks defined in Tables 2, 3, 4, and 5.
   If fewer than five algorithms are supported, spaces (0x20) are used
   to pad out the 10 words for each type.  The last parameter is the
   ZID, the 96 bit long unique identifier for the ZRTP endpoint.


        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=50 words        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |              Message Type Block=Hello (2 words)               |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                        version (1 word)                       |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                 Client Identifier (15 octets)                 |
       |                                               +-+-+-+-+-+-+-+-+
       |                                               |0 0 0 0 0 0 0|P|
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                 Hash Type Blocks 1-5 (10 words)               |
       |                              . . .                            |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                Cipher Type Blocks 1-5 (10 words)              |
       |                              . . .                            |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |             Public Key Type Blocks 1-5 (10 words)             |
       |                              . . .                            |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                  SAS Type Blocks 1-5 (10 words)               |
       |                              . . .                            |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                         ZID  (3 words)                        |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   Figure 2. Extension header format for Hello message
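
   To make the layout in Figure 2 concrete, the sketch below assembles
   a Hello message and arrives at the 50 word length shown in the
   figure.  All function names are hypothetical.  Padding a short
   Client Identifier with spaces is an assumption, since the text only
   specifies space padding for the algorithm lists.

```python
import struct


def block(name: str) -> bytes:
    """One 8 octet ASCII block, padded with spaces (0x20)."""
    raw = name.encode("ascii")
    if len(raw) > 8:
        raise ValueError("block values are at most 8 characters")
    return raw.ljust(8, b" ")


def type_list(names: list) -> bytes:
    """Up to five blocks, padded with spaces to 10 words (40 octets)."""
    if len(names) > 5:
        raise ValueError("at most five algorithms per type")
    return b"".join(block(n) for n in names).ljust(40, b" ")


def build_hello(cid: bytes, passive: bool, hashes: list, ciphers: list,
                pk_types: list, sas_types: list, zid: bytes) -> bytes:
    """Assemble the Hello header extension laid out in Figure 2."""
    if len(cid) > 15 or len(zid) != 12:
        raise ValueError("cid is at most 15 octets and ZID is 12 octets")
    body = (
        block("Hello")                      # Message Type Block, 2 words
        + b"0.01"                           # version, 1 word
        + cid.ljust(15, b" ")               # Client Identifier (assumed pad)
        + bytes([1 if passive else 0])      # seven zero bits, then P
        + type_list(hashes)
        + type_list(ciphers)
        + type_list(pk_types)
        + type_list(sas_types)
        + zid                               # ZID, 3 words
    )
    return struct.pack("!HH", 0x505A, len(body) // 4) + body
```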


4.3.  HelloACK message

   The HelloACK message is used to stop retransmissions of a Hello
   message.  A HelloACK is sent regardless of whether the version number
   or the algorithm list in the Hello is supported.  The receipt of a
   HelloACK stops retransmission of the Hello message.  The format is
   shown in Figure 3 below.  Note that a Commit message can be sent in
   place of a HelloACK by an initiator.


        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|         length=2 words        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |              Message Type Block=HelloACK (2 words)            |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     Figure 3. Extension header format for HelloACK message


4.4.  Commit message

   The Commit message is sent to initiate the key agreement process
   after receiving a Hello message.  The Commit message contains the
   initiator's ZID and a list of selected algorithms (hash, cipher, pkt,
   sas) and hvi, a hash of the public DH value of the initiator and the
   algorithm list from the responder's Hello message.  A Commit cannot
   be sent until a Hello message has been received.


        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=21 words        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |              Message Type Block=Commit (2 words)              |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                         ZID  (3 words)                        |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                    Hash Type Block (2 words)                  |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                   Cipher Type Block (2 words)                 |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                Public Key Type Block (2 words)                |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                    SAS Type Block (2 words)                   |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                             hvi (8 words)                     |
       |                                . . .                          |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     Figure 4. Extension header format for Commit message

4.5.  DHPart1 message

   The DHPart1 message begins the DH exchange.  The format is
   shown in Figure 5 below.  The DHPart1 message is sent if a valid
   Commit message is received.  The length of the pvr value depends on
   the Public Key Type chosen.  If DH4096 is used, the pvr will be 128
   words (512 octets).  If DH3072 is used, it is 96 words (384 octets).

   The next five parameters are HMACs of potential shared secrets used
   in generating the ZRTP secret.  The first two, rs1IDr and rs2IDr, are
   the HMACs of the responder's two retained shared secrets, truncated
   to 64 bits.  Next is sigsIDr, the HMAC of the responder's signaling
   secret, truncated to 64 bits.  Next is srtpsIDr, the HMAC of the
   responder's SRTP secret, truncated to 64 bits.  The last parameter is
   the HMAC of an additional shared secret.  For example, if multiple
   SRTP secrets are available or some other secret is used, it can be
   used as the other_secret.


        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|   length=depends on PK Type   |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |              Message Type Block=DHPart1 (2 words)             |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                 pvr (length depends on PK Type)               |
       |                               . . .                           |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                        rs1IDr (2 words)                       |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                        rs2IDr (2 words)                       |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                        sigsIDr (2 words)                      |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                       srtpsIDr (2 words)                      |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                    other_secretIDr (2 words)                  |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      Figure 5. Extension header format for DHPart1 message
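
   Since the length of the public value depends on the Public Key Type,
   a parser has to know the negotiated type before it can locate the
   five secret IDs.  A minimal sketch follows; the names PV_OCTETS and
   parse_dhpart are hypothetical, and the input is the message body
   that follows the 4 octet extension header.

```python
# Octet length of the public DH value for each Public Key Type.
PV_OCTETS = {"DH3072": 384, "DH4096": 512}


def parse_dhpart(body: bytes, pk_type: str) -> tuple:
    """Split a DHPart body into (message type, public value, secret IDs)."""
    pv_len = PV_OCTETS[pk_type]
    # Message Type Block + public value + five 64 bit secret IDs.
    expected = 8 + pv_len + 5 * 8
    if len(body) != expected:
        raise ValueError("unexpected DHPart length for " + pk_type)
    msg_type = body[:8].decode("ascii").rstrip(" ")
    # Integer fields are carried in network byte order (big-endian).
    public_value = int.from_bytes(body[8:8 + pv_len], "big")
    ids = [body[o:o + 8] for o in range(8 + pv_len, expected, 8)]
    return msg_type, public_value, ids
```

   With DH3072 the body is 2 + 96 + 10 = 108 words; with DH4096 it is
   2 + 128 + 10 = 140 words.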


4.6.  DHPart2 message

   The DHPart2 message completes the DH exchange.  A DHPart2 message is
   sent if a valid DHPart1 message is received.  The length of the pvi
   value depends on the Public Key Type chosen.  If DH4096 is used, the
   pvi will be 128 words (512 octets).  If DH3072 is used, it is 96
   words (384 octets).

   The next five parameters are HMACs of potential shared secrets used
   in generating the ZRTP secret.  The first two, rs1IDi and rs2IDi, are
   the HMACs of the initiator's two retained shared secrets, truncated
   to 64 bits.  Next is sigsIDi, the HMAC of the initiator's signaling
   secret, truncated to 64 bits.  Next is srtpsIDi, the HMAC of the
   initiator's SRTP secret, truncated to 64 bits.  The last parameter is
   the HMAC of an additional shared secret.  For example, if multiple
   SRTP secrets are available or some other secret is used, it can be
   included.


        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|   length=depends on PK Type   |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |              Message Type Block=DHPart2 (2 words)             |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                   pvi (length depends on PK Type)             |
       |                               . . .                           |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                        rs1IDi (2 words)                       |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                        rs2IDi (2 words)                       |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                        sigsIDi (2 words)                      |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                       srtpsIDi (2 words)                      |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                    other_secretIDi (2 words)                  |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     Figure 6. Extension header format for DHPart2 message

4.7.  Confirm1 message

   The Confirm1 message is sent in response to a valid DHPart2 message
   after the SRTP session key and parameters have been negotiated.  As a
   result, it is always sent in an SRTP packet.  The format is shown in
   Figure 7 below.  The header extension itself has no parameters
   besides the Message Type Block.  However, three parameters are
   carried in the SRTP payload.  The plaintext parameter contains the
   known plaintext "known plaintext".  The sasflag (S) is a Boolean bit.
   The hmac is a hash over the known plaintext "known plaintext" and the
   SASflag Boolean converted to the octet 0x00 or 0x01.

   The parameters included in the SRTP payload MUST NOT be allowed to
   pass to the RTP stack or errors may occur with the media stream.


        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|         length=2 words        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |              Message Type Block=Confirm1 (2 words)            |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

         At the start of the SRTP payload:

       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                                                               |
       |                     plaintext (15 octets)                     |
       |                                               +-+-+-+-+-+-+-+-+
       |                                               |0 0 0 0 0 0 0|S|
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                         hmac (8 words)                        |
       |                             . . .                             |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     Figure 7. Extension header format for Confirm1 message
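
   One way to keep the Confirm parameters away from the RTP stack is to
   split them off the decrypted SRTP payload before the media is passed
   on.  The sketch below assumes the payload has already been decrypted
   and that the hmac is 32 octets (8 words) as drawn in the figure; the
   names are hypothetical.

```python
# 15 octets of plaintext, 1 flag octet, and a 32 octet hmac.
CONFIRM_PREFIX_LEN = 15 + 1 + 32


def split_confirm_payload(payload: bytes) -> tuple:
    """Return (plaintext, sasflag, hmac, media) from a decrypted payload."""
    if len(payload) < CONFIRM_PREFIX_LEN:
        raise ValueError("payload too short for Confirm parameters")
    plaintext = payload[:15]
    # The S bit is the least significant bit of the flag octet.
    sasflag = bool(payload[15] & 0x01)
    mac = payload[16:CONFIRM_PREFIX_LEN]
    # Only the remainder is handed to the RTP stack.
    media = payload[CONFIRM_PREFIX_LEN:]
    return plaintext, sasflag, mac, media
```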


4.8.  Confirm2 message

   The Confirm2 message is sent in response to a Confirm1 message after
   the SRTP session key and parameters have been negotiated.  As a
   result, it is always sent in an SRTP packet.  The format is shown in
   Figure 8 below.  The header extension itself has no parameters
   besides the Message Type Block.  However, three parameters are
   carried in the SRTP payload.  The plaintext parameter contains the
   15-octet string "known plaintext".  The sasflag (S) is a Boolean bit.
   The hmac is computed over the string "known plaintext" followed by
   the sasflag converted to the octet 0x00 or 0x01.

   The parameters included in the SRTP payload MUST NOT be allowed to
   pass to the RTP stack or errors may occur with the media stream.


Zimmermann, et al.      Expires September 6, 2006              [Page 26]


        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|         length=2 words        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |              Message Type Block=Confirm2 (2 words)            |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

        At the start of the SRTP payload:

       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                     plaintext (15 octets)                     |
       |                                               +-+-+-+-+-+-+-+-+
       |                                               |0 0 0 0 0 0 0|S|
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                         hmac (8 words)                        |
       |                             . . .                             |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      Figure 8. Extension header format for Confirm2 message
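
   The payload layout in Figure 8 can be illustrated with a minimal
   Python sketch.  It assumes that the hmac is HMAC-SHA-256 (whose
   32-octet output matches the 8-word field) and that a key is already
   available; the key derivation is not described in this section, so
   hmac_key is a hypothetical parameter, as are the function names.

```python
import hashlib
import hmac

KNOWN_PLAINTEXT = b"known plaintext"  # 15 octets, as in the figure
HMAC_LEN = 32                         # 8 words of 4 octets each


def build_confirm_payload(hmac_key: bytes, sas_verified: bool) -> bytes:
    """Build the octets placed at the start of the SRTP payload."""
    flag = b"\x01" if sas_verified else b"\x00"
    # Assumption: HMAC-SHA-256 over the plaintext followed by the flag octet.
    mac = hmac.new(hmac_key, KNOWN_PLAINTEXT + flag, hashlib.sha256).digest()
    return KNOWN_PLAINTEXT + flag + mac


def strip_confirm_payload(hmac_key: bytes, payload: bytes):
    """Check and remove the Confirm parameters.

    Returns (sas_verified, remaining_payload) so that only the remaining
    octets are passed on to the RTP stack.  Raises ValueError on failure.
    """
    head_len = len(KNOWN_PLAINTEXT) + 1 + HMAC_LEN
    if len(payload) < head_len:
        raise ValueError("payload too short for Confirm parameters")
    plaintext, flag, mac = payload[:15], payload[15:16], payload[16:head_len]
    if plaintext != KNOWN_PLAINTEXT or flag not in (b"\x00", b"\x01"):
        raise ValueError("malformed Confirm parameters")
    expected = hmac.new(hmac_key, plaintext + flag, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):
        raise ValueError("hmac mismatch")
    return flag == b"\x01", payload[head_len:]
```

   Under these assumptions the parameters occupy 48 octets, and
   strip_confirm_payload() returns only the octets that follow them.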


4.9.  Conf2ACK message

   The Conf2ACK message is sent in response to a valid Confirm2 message.
   The format is shown in Figure 9 below.  The receipt of a Conf2ACK
   stops retransmission of the Confirm2 message.


        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|         length=2 words        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |              Message Type Block=Conf2ACK (2 words)            |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     Figure 9. Extension header format for Conf2ACK message

4.10.  Error message

   An Error message is sent in response to another ZRTP message which is
   not valid or not supported.  The format is shown in Figure 10 below.
   Possible reasons include a missing block or parameter, a chosen
   parameter not in the offered list, a checksum failure, or a message
   type block that is not understood.  The ZRTP message type that
   generated the error is included in the second Message Type Block
   shown in the figure.  This message can be sent in response to any
   ZRTP message except Hello and HelloACK and is never acknowledged or
   retransmitted.


        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=4 words         |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |              Message Type Block=Error (2 words)               |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                  Message Type Block  (2 words)                |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     Figure 10. Extension header format for Error message
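
   The fixed part of these extension headers can be sketched in Python
   as follows.  The 16-bit identifier 0x505A and the length counted in
   32-bit words are taken from the figures.  The encoding of the
   Message Type Block as an 8-octet ASCII string padded with spaces is
   an assumption made for this sketch, and the function names are
   hypothetical.

```python
import struct

ZRTP_EXT_ID = 0x505A  # bit pattern 0101 0000 0101 1010 from the figures


def pack_zrtp_extension(message_type: str, body: bytes = b"") -> bytes:
    """Pack the extension header, Message Type Block, and parameters."""
    # Assumption: the 2-word block is ASCII, right-padded with spaces.
    block = message_type.encode("ascii").ljust(8, b" ")
    if len(block) != 8:
        raise ValueError("message type longer than 8 octets")
    if len(body) % 4:
        raise ValueError("parameters must be a whole number of words")
    # The length counts everything after the first word, in 32-bit words.
    length_words = (len(block) + len(body)) // 4
    return struct.pack("!HH", ZRTP_EXT_ID, length_words) + block + body


def unpack_zrtp_extension(data: bytes):
    """Return (message_type, body) or raise ValueError."""
    if len(data) < 12:
        raise ValueError("too short for a ZRTP extension header")
    ext_id, length_words = struct.unpack("!HH", data[:4])
    end = 4 + 4 * length_words
    if ext_id != ZRTP_EXT_ID or length_words < 2 or len(data) < end:
        raise ValueError("not a well-formed ZRTP extension header")
    return data[4:12].decode("ascii").rstrip(" "), data[12:end]
```

   With this sketch a Conf2ACK packs to 12 octets with length=2, and an
   Error carrying a second Message Type Block packs with length=4,
   matching Figures 9 and 10.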


4.11.  GoClear message

   The optional GoClear message is sent to switch from SRTP back to RTP.
   The format is shown in Figure 11 below.  The clear_hmac is used to
   authenticate the GoClear message so that bogus GoClear messages
   introduced by an attacker can be detected and discarded.  This
   message is retransmitted at 500ms intervals until the receipt of a
   ClearACK message or an Error message.

   After sending a GoClear message, the ZRTP endpoint stops sending SRTP
   packets.  When a ClearACK is received, the ZRTP endpoint deletes the
   crypto context for the SRTP session and may then resume sending RTP
   packets.  However, if instead an Error message is received, the SRTP
   session resumes as if the GoClear had never been sent.


        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=10 words        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |              Message Type Block=GoClear (2 words)             |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |                       clear_hmac (8 words)                    |
       |                             . . .                             |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     Figure 11. Extension header format for GoClear message
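
   The behavior of the endpoint that sends GoClear can be summarized
   as a small state machine.  The following Python sketch is
   illustrative only: the class, state, and method names are
   hypothetical, time is passed in as seconds, and the messages are
   represented by their names rather than by encoded packets.

```python
from enum import Enum


class TxState(Enum):
    SECURE = "sending SRTP"
    WAIT_CLEARACK = "GoClear sent, SRTP stopped"
    CLEAR = "may send RTP"


class GoClearSender:
    RETRANSMIT_S = 0.5  # GoClear is retransmitted at 500ms intervals

    def __init__(self, crypto_context):
        self.state = TxState.SECURE
        self.crypto_context = crypto_context
        self.next_retransmit = None

    def start(self, now: float) -> str:
        """Send the first GoClear and stop sending SRTP."""
        self.state = TxState.WAIT_CLEARACK
        self.next_retransmit = now + self.RETRANSMIT_S
        return "GoClear"

    def on_timer(self, now: float):
        """Return "GoClear" when a retransmission is due, else None."""
        if self.state is TxState.WAIT_CLEARACK and now >= self.next_retransmit:
            self.next_retransmit = now + self.RETRANSMIT_S
            return "GoClear"
        return None

    def on_message(self, message_type: str) -> None:
        if self.state is not TxState.WAIT_CLEARACK:
            return
        if message_type == "ClearACK":
            # Delete the SRTP crypto context; RTP may then be sent.
            self.crypto_context = None
            self.state = TxState.CLEAR
            self.next_retransmit = None
        elif message_type == "Error":
            # The SRTP session resumes as if GoClear had never been sent.
            self.state = TxState.SECURE
            self.next_retransmit = None
```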


4.12.  ClearACK message

   The optional ClearACK message is sent to acknowledge receipt of a
   GoClear.  A ClearACK is sent only if the clear_hmac from the GoClear
   message is authenticated.  Otherwise, an Error message is returned.
   The format is shown in Figure 12 below.  A ZRTP endpoint that
   receives a GoClear message stops sending SRTP packets, generates a
   ClearACK in response, and deletes the crypto context for the SRTP
   session.  The endpoint then renders to the user the information that
   the media session has switched to clear mode and waits for
   confirmation from the user (e.g. clicking a button, pressing a DTMF
   key, etc.).  Until that confirmation is received, the ZRTP endpoint
   MUST NOT resume sending RTP packets.  To prevent pinholes from
   closing or NAT bindings from expiring, the ClearACK message should
   be resent every 5 seconds while waiting for confirmation from the
   user.  After confirmation is received from the user, the sending of
   RTP packets may begin.

   Note that if the GoClear/ClearACK mechanism is not supported by a
   ZRTP endpoint, an Error message MUST be sent in response to a GoClear
   message.


        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |0 1 0 1 0 0 0 0 0 1 0 1 1 0 1 0|        length=2 words         |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |              Message Type Block=ClearACK (2 words)            |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
     Figure 12. Extension header format for ClearACK message
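
   The handling of a received GoClear can be sketched in the same
   style.  This Python sketch is illustrative only.  The computation of
   the clear_hmac is not described in this section, so its check is
   represented by a caller-supplied function (verify_clear_hmac); all
   names here are hypothetical.

```python
from enum import Enum


class RxState(Enum):
    SECURE = 1      # SRTP session in progress
    AWAIT_USER = 2  # ClearACK sent, waiting for the user to confirm
    CLEAR = 3       # user confirmed, RTP may be sent


class GoClearReceiver:
    CLEARACK_RESEND_S = 5.0  # keeps pinholes and NAT bindings alive

    def __init__(self, crypto_context, verify_clear_hmac, supported=True):
        self.state = RxState.SECURE
        self.crypto_context = crypto_context
        self.verify_clear_hmac = verify_clear_hmac
        self.supported = supported
        self.next_resend = None

    def on_goclear(self, clear_hmac: bytes, now: float) -> str:
        """Return the name of the message to send in response."""
        if not self.supported or not self.verify_clear_hmac(clear_hmac):
            return "Error"
        if self.state is RxState.SECURE:
            # Stop SRTP, delete the crypto context, and notify the user.
            self.crypto_context = None
            self.state = RxState.AWAIT_USER
            self.next_resend = now + self.CLEARACK_RESEND_S
        return "ClearACK"

    def on_timer(self, now: float):
        """Return "ClearACK" when a resend is due, else None."""
        if self.state is RxState.AWAIT_USER and now >= self.next_resend:
            self.next_resend = now + self.CLEARACK_RESEND_S
            return "ClearACK"
        return None

    def on_user_confirmed(self) -> None:
        if self.state is RxState.AWAIT_USER:
            self.state = RxState.CLEAR
            self.next_resend = None

    def may_send_rtp(self) -> bool:
        # RTP MUST NOT be sent until the user has confirmed.
        return self.state is RxState.CLEAR
```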


5.  Retransmissions

   ZRTP uses two retransmission timers T1 and T2.  T1 is used for
   retransmission of Hello messages, when the support of ZRTP by the
   other endpoint may not be known.  T2 is used in retransmissions of
   all the other ZRTP messages with the exception of GoClear.  The
   retransmission of GoClear messages is discussed in the section on
   GoClear.


   Practical experience has shown that RTP packet loss at the start of
   an RTP session can be extremely high.  Since the entire ZRTP message
   exchange occurs during this period, the retransmission scheme is
   defined to be aggressive.  Since ZRTP packets with the exception
   of the DHPart1 and DHPart2 messages are small, this should have
   minimal effect on overall bandwidth utilization of the media session.

   Hello messages are retransmitted at an interval that starts at T1
   and doubles after every retransmission, capping at 200 ms.  A Hello
   message is retransmitted 20 times before giving up.  T1 has a
   recommended value of 50 ms.  Retransmission of a Hello ends upon
   receipt of a HelloACK or Commit message.

   Non-Hello ZRTP messages are retransmitted only by the initiator;
   that is, only Commit, DHPart2, and Confirm2 are retransmitted if the
   corresponding messages from the responder (DHPart1, Confirm1, and
   Conf2ACK, respectively) are not received.  These messages are
   retransmitted at an interval that starts at T2 and doubles after
   every retransmission, capping at 600 ms.  Each message is
   retransmitted 10 times before giving up and resuming a normal RTP
   session.  T2 has a default value of 150 ms.  Each message has a
   response message that stops retransmissions, as shown in Table 6.
   The high value of T2 means that retransmissions will likely only
   occur with packet loss.  The receipt of an Error message ends
   retransmission of the message identified in the Error message.


       Message      Acknowledgement Message
       -------      -----------------------
       Hello        HelloACK or Commit
       Commit       DHPart1
       DHPart2      Confirm1
       Confirm2     Conf2ACK
       GoClear      ClearACK
      Table 6. Retransmitted ZRTP Messages and Responses
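
   The two backoff schedules described in this section can be
   tabulated with a short Python sketch.  It assumes that the first
   interval is the wait before the first retransmission; the function
   name is hypothetical.

```python
def retransmit_intervals(initial_ms: int, cap_ms: int, retransmissions: int):
    """Return the wait, in ms, before each retransmission."""
    intervals = []
    interval = initial_ms
    for _ in range(retransmissions):
        intervals.append(interval)
        # Double after every retransmission, up to the cap.
        interval = min(interval * 2, cap_ms)
    return intervals


# T1 = 50 ms, cap 200 ms, 20 retransmissions of Hello
hello_schedule = retransmit_intervals(50, 200, 20)

# T2 = 150 ms, cap 600 ms, 10 retransmissions of Commit, DHPart2, Confirm2
other_schedule = retransmit_intervals(150, 600, 10)

print(hello_schedule[:4], sum(hello_schedule))  # [50, 100, 200, 200] 3750
print(other_schedule[:4], sum(other_schedule))  # [150, 300, 600, 600] 5250
```

   Under this reading, an unanswered Hello is abandoned roughly 3.75
   seconds after its last scheduled wait begins to accumulate, and
   other messages after roughly 5.25 seconds.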


6.  Short Authentication String

   This section discusses the implementation of the optional Short
   Authentication String (SAS) in ZRTP.

   The Short Authentication String (SAS) value is calculated as the hash
   of both DH public values and the string "Short Authentication
   String".


   sasvalue = hash(pvi | pvr | "Short Authentication String")

   The rendering of the SAS value depends on the SAS Type agreed upon in
   the Commit message.  For the SAS Type of libase32, the last 20 bits
   of the sasvalue are rendered as a form of base32 encoding known as
   libbase32 [6].  The purpose of libbase32 is to represent arbitrary
   sequences of octets in a form that is as convenient as possible for
   human users to manipulate.  As a result, the choice of characters is
   slightly different from base32 as defined in RFC 3548.  The last 20
   bits of the sasvalue yield four libbase32 characters (5 bits each),
   which are rendered to the users at both ZRTP endpoints.  Other SAS
   Types may be defined to render the SAS value in other ways.
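
   The SAS calculation and rendering can be sketched in Python as
   follows.  Several details are assumptions made for illustration
   only: the hash is taken to be SHA-256, "|" is taken to mean
   concatenation of octet strings, the alphabet is the one published
   for the human-oriented base-32 encoding of [6], and the 20 bits are
   rendered most significant group first.  The function name is
   hypothetical.

```python
import hashlib

# Assumption: alphabet of the human-oriented base-32 encoding in [6].
LIBBASE32_ALPHABET = "ybndrfg8ejkmcpqxot1uwisza345h769"


def sas_libbase32(pvi: bytes, pvr: bytes) -> str:
    """Render the SAS as four characters from the last 20 bits of sasvalue."""
    # sasvalue = hash(pvi | pvr | "Short Authentication String")
    sasvalue = hashlib.sha256(
        pvi + pvr + b"Short Authentication String"
    ).digest()
    # The last 20 bits are the low 20 bits of the final three octets.
    last20 = int.from_bytes(sasvalue[-3:], "big") & 0xFFFFF
    # Four groups of 5 bits, most significant group first (assumption).
    return "".join(
        LIBBASE32_ALPHABET[(last20 >> shift) & 0x1F]
        for shift in (15, 10, 5, 0)
    )
```

   Both endpoints compute the same four characters from the same pair
   of DH public values, so the users can compare them by voice.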

   The sasflag is set based on the user indicating that SAS has been
   successfully performed.  The sasflag is exchanged securely in the
   Confirm1 and Confirm2 messages of the next session.  In other words,
   each party sends the sasflag from the previous session in the Confirm
   message of the current session.  It is perfectly reasonable to have a
   ZRTP endpoint that never sets the sasflag, because it would require
   adding complexity to the user interface to allow the user to set it.
   The sasflag is not required to be set, but if it is available to the
   client software, it allows for the possibility that the client
   software could render to the user that the SAS verify procedure was
   carried out in a previous session.

   Regardless of whether there is a user interface element to allow the
   user to set the sasflag, it is worth caching a shared secret, because
   doing so reduces opportunities for an attacker in the next call.

   If at any time the users carry out the SAS procedure, and it actually
   fails to match, then this means there is a very resourceful man in
   the middle.  If this is the first call, the MitM was there on the
   first call, which is impressive enough.  If it happens in a later
   call, it means the MitM must also know your cached shared secret,
   because you could not have carried out any voice traffic at all
   unless the session key was correctly computed and is also known to
   the attacker.  This implies the MitM must have been present in all
   the previous sessions, since the initial establishment of the first
   shared secret.  This is indeed a resourceful attacker.  It also means
   that if at any time he ceases his participation as a MitM on one of
   your calls, the protocol will detect that the cached shared secret is
   no longer valid, because it was really two different shared secrets
   all along, one of them between Alice and the attacker, and the other
   between the attacker and Bob.  The continuity of the cached shared
   secrets makes it possible for us to detect the MitM when he inserts
   himself into the ongoing relationship, as well as when he leaves.
   Also, if the attacker tries to stay with a long lineage of calls, but
   fails to execute a DH MitM attack for even one missed call, he is
   permanently excluded.  He can no longer resynchronize with the chain
   of cached shared secrets.

   Some sort of user interface element (maybe a checkbox) is needed to
   allow the user to tell the software the SAS verify was successful,
   causing the software to set the "SAS verified" flag, which (together
   with our cached shared secret) obviates the need to perform the SAS
   procedure in the next call.  An additional user interface element can
   be provided to let the user tell the software he detected an actual
   SAS mismatch, which indicates a MitM attack.  The software can then
   take appropriate action, clearing the "SAS verified" flag and erasing
   the cached shared secret from this session.  It is up to the
   implementer to decide if this added user interface complexity is
   warranted.

   If the SAS matches, it means there is no MitM, which also implies it
   is now safe to trust a cached shared secret for later calls.  If
   inattentive users don't bother to check the SAS, it means we don't
   know whether there is or is not a MitM, so even if we do establish a
   new cached shared secret, there is a risk that our potential attacker
   may have a subsequent opportunity to continue inserting himself in
   the call, until we finally get around to checking the SAS.  If the
   SAS matches, it means no attacker was present for any previous
   session since we started propagating cached shared secrets, because
   this session and all the previous sessions were also authenticated
   with a continuous lineage of shared secrets.


7.  IANA Considerations

   If an IANA registry for RTP extension headers were defined, then the
   value 0x505A would be reserved for ZRTP.


8.  Security Considerations

   This document is all about securely keying SRTP sessions.  As such,
   security is discussed in every section.  The next version of this
   draft will have a summary of those security properties discussed
   throughout the document.


9.  Acknowledgments

   The authors would like to thank Bryce Wilcox for his contributions to
   the design of this protocol, and to thank Jon Peterson, Colin Plumb,
   and Hal Finney for their helpful comments and suggestions.


10.  Appendix - ZRTP, SIP, and SDP

   This section discusses how ZRTP, SIP, and SDP work together.

   SIP UAs which support this specification would include the to-be-
   defined SDP attribute a=zrtp in their SDP offers and answers.  The
   presence of this attribute is a hint to another UA that ZRTP is
   supported.  If a UA supports both ZRTP and another approach to
   negotiate an SRTP secret such as [14] or [13], then the presence of
   the a=zrtp attribute is critical.  If both UAs support ZRTP, they
   will first try ZRTP before attempting SRTP.  If only one endpoint
   supports ZRTP but both support SRTP, then the other method will be
   used instead.

   Note that ZRTP may be implemented without coupling with the SIP
   signaling.  For example, ZRTP can be implemented as a "bump in the
   wire" or as a "bump in the stack" in which RTP sent by the SIP UA is
   converted to ZRTP.  In these cases, the SIP UA will have no knowledge
   of ZRTP and will not include the a=zrtp attribute.  As a result, even
   if the other UA does not indicate support for ZRTP, a ZRTP endpoint
   SHOULD still send Hello messages.
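
   The decision described above can be sketched in Python as follows.
   Since the a=zrtp attribute is still to be defined, the exact-match
   test used here is an assumption, and the function names and return
   values are hypothetical.

```python
def sdp_has_zrtp(sdp: str) -> bool:
    """Return True if the SDP body carries the a=zrtp attribute."""
    # Assumption: the attribute appears on a line of its own as "a=zrtp".
    return any(line.strip() == "a=zrtp" for line in sdp.splitlines())


def choose_keying(local_sdp: str, remote_sdp: str, other_method: bool) -> str:
    """Pick which keying method to try first for this session."""
    if sdp_has_zrtp(local_sdp) and sdp_has_zrtp(remote_sdp):
        return "zrtp"
    if other_method:
        # Only one side signals ZRTP, but both support another method.
        return "other"
    # No signaled support: the peer may still be a "bump in the wire",
    # so Hello messages are sent anyway.
    return "zrtp-probe"
```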


11.  References

11.1.  Normative References

   [1]  Schulzrinne, H., Casner, S., Frederick, R., and V. Jacobson,
        "RTP: A Transport Protocol for Real-Time Applications", STD 64,
        RFC 3550, July 2003.

   [2]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
        Norrman, "The Secure Real-time Transport Protocol (SRTP)",
        RFC 3711, March 2004.

   [3]  Kivinen, T. and M. Kojo, "More Modular Exponential (MODP)
        Diffie-Hellman groups for Internet Key Exchange (IKE)",
        RFC 3526, May 2003.

   [4]  Ferguson, N. and B. Schneier, "Practical Cryptography", Wiley
        Publishing 2003.

   [5]  Barker, E. and J. Kelsey, "Recommendation for Random Number
        Generation Using Deterministic Random Bit Generators", NIST
        Special Publication 800-90 DRAFT (December 2005).

   [6]  O'Whielacronx, Z., "human-oriented base-32 encoding",
        http://cvs.sourceforge.net/viewcvs.py/libbase32/libbase32/
        DESIGN?rev=HEAD .

11.2.  Informative References

   [7]   Zimmermann, P., "PGPfone",
         http://www.pgpi.org/products/pgpfone/ .

   [8]   Zimmermann, P., "Zfone", http://www.philzimmermann.com/zfone .

   [9]   Blossom, E., "The VP1 Protocol for Voice Privacy Devices
         Version 1.2", http://www.comsec.com/vp1-protocol.pdf .

   [10]  "CryptoPhone", http://www.cryptophone.de/ .

   [11]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A.,
         Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP:
         Session Initiation Protocol", RFC 3261, June 2002.

   [12]  Ylonen, T. and C. Lonvick, "The Secure Shell (SSH) Protocol
         Architecture", RFC 4251, January 2006.

   [13]  Andreasen, F., "Session Description Protocol Security
         Descriptions for Media Streams",
         draft-ietf-mmusic-sdescriptions-12 (work in progress),
         September 2005.

   [14]  Arkko, J., "Key Management Extensions for Session Description
         Protocol (SDP) and Real Time Streaming Protocol (RTSP)",
         draft-ietf-mmusic-kmgmt-ext-15 (work in progress), June 2005.

   [15]  Arkko, J., Carrara, E., Lindholm, F., Naslund, M., and K.
         Norrman, "MIKEY: Multimedia Internet KEYing", RFC 3830,
         August 2004.

   [16]  Handley, M. and V. Jacobson, "SDP: Session Description
         Protocol", RFC 2327, April 1998.


Authors' Addresses

   Philip Zimmermann
   Phil Zimmermann and Associates LLC

   Email: prz@mit.edu


   Alan Johnston (editor)
   SIPStation
   St. Louis, MO  63124

   Email: alan@sipstation.com


   Jon Callas
   PGP Corporation

   Email: jon@pgp.com


Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed to
   pertain to the implementation or use of the technology described in
   this document or the extent to which any license under such rights
   might or might not be available; nor does it represent that it has
   made any independent effort to identify any such rights.  Information
   on the procedures with respect to rights in RFC documents can be
   found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use of
   such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository at
   http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.


Disclaimer of Validity

   This document and the information contained herein are provided on an
   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.


Copyright Statement

   Copyright (C) The Internet Society (2006).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.


Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.

