idnits 2.17.1 

draft-ietf-sip-media-security-requirements-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 20.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 2122.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2133.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2140.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2146.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  == Line 948 has weird spacing: '...ication  along...'

  == Line 983 has weird spacing: '...RFP) in  the S...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (February 24, 2008) is 5899 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-07) exists of
     draft-ietf-avt-dtls-srtp-01

  == Outdated reference: A later version (-07) exists of
     draft-ietf-mmusic-media-path-middleboxes-00

  == Outdated reference: A later version (-13) exists of
     draft-ietf-mmusic-sdp-capability-negotiation-08

  == Outdated reference: A later version (-09) exists of
     draft-ietf-msec-mikey-applicability-08

  == Outdated reference: A later version (-15) exists of
     draft-ietf-sip-certs-05

  == Outdated reference: A later version (-06) exists of
     draft-mcgrew-srtp-ekt-03

  == Outdated reference: A later version (-04) exists of
     draft-wing-sipping-srtp-key-02

  == Outdated reference: A later version (-22) exists of
     draft-zimmermann-avt-zrtp-04

  -- Obsolete informational reference (is this intentional?): RFC 3388
     (Obsoleted by RFC 5888)

  -- Obsolete informational reference (is this intentional?): RFC 4346
     (Obsoleted by RFC 5246)

  -- Obsolete informational reference (is this intentional?): RFC 4474
     (Obsoleted by RFC 8224)

  -- Obsolete informational reference (is this intentional?): RFC 4492
     (Obsoleted by RFC 8422)


     Summary: 1 error (**), 0 flaws (~~), 11 warnings (==), 11 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	SIP Working Group                                           D. Wing, Ed.
3	Internet-Draft                                                     Cisco
4	Intended status: Informational                                  S. Fries
5	Expires: August 27, 2008                                      Siemens AG
6	                                                           H. Tschofenig
7	                                                  Nokia Siemens Networks
8	                                                                F. Audet
9	                                                                  Nortel
10	                                                       February 24, 2008

12	    Requirements and Analysis of Media Security Management Protocols
13	             draft-ietf-sip-media-security-requirements-03

15	Status of this Memo

17	   By submitting this Internet-Draft, each author represents that any
18	   applicable patent or other IPR claims of which he or she is aware
19	   have been or will be disclosed, and any of which he or she becomes
20	   aware will be disclosed, in accordance with Section 6 of BCP 79.

22	   Internet-Drafts are working documents of the Internet Engineering
23	   Task Force (IETF), its areas, and its working groups.  Note that
24	   other groups may also distribute working documents as Internet-
25	   Drafts.

27	   Internet-Drafts are draft documents valid for a maximum of six months
28	   and may be updated, replaced, or obsoleted by other documents at any
29	   time.  It is inappropriate to use Internet-Drafts as reference
30	   material or to cite them other than as "work in progress."

32	   The list of current Internet-Drafts can be accessed at
33	   http://www.ietf.org/ietf/1id-abstracts.txt.

35	   The list of Internet-Draft Shadow Directories can be accessed at
36	   http://www.ietf.org/shadow.html.

38	   This Internet-Draft will expire on August 27, 2008.

40	Abstract

42	   This document describes requirements for a protocol to negotiate a
43	   security context for SIP-signaled SRTP media.  In addition to the
44	   natural security requirements, this negotiation protocol must
45	   interoperate well with SIP in certain ways.  A number of proposals
46	   have been published and a summary of these proposals is in the
47	   appendix of this document.

49	Table of Contents

51	   1.  Introduction
52	   2.  Terminology
53	   3.  Attack Scenarios
54	   4.  Call Scenarios
55	     4.1.  Clipping Media Before Signaling Answer
56	     4.2.  Retargeting and Forking
57	     4.3.  Shared Key Conferencing
58	     4.4.  Recording
59	     4.5.  PSTN gateway
60	     4.6.  Call Setup Performance
61	     4.7.  Transcoding
62	   5.  Requirements
63	     5.1.  Key Management Protocol Requirements
64	     5.2.  Security Requirements
65	     5.3.  Requirements Outside of the Key Management Protocol
66	   6.  Security Considerations
67	   7.  IANA Considerations
68	   8.  Acknowledgements
69	   9.  References
70	     9.1.  Normative References
71	     9.2.  Informative References
72	   Appendix A.  Overview and Evaluation of Existing Keying
73	                Mechanisms
74	     A.1.  Signaling Path Keying Techniques
75	       A.1.1.  MIKEY-NULL
76	       A.1.2.  MIKEY-PSK
77	       A.1.3.  MIKEY-RSA
78	       A.1.4.  MIKEY-RSA-R
79	       A.1.5.  MIKEY-DHSIGN
80	       A.1.6.  MIKEY-DHHMAC
81	       A.1.7.  MIKEY-ECIES and MIKEY-ECMQV (MIKEY-ECC)
82	       A.1.8.  Security Descriptions with SIPS
83	       A.1.9.  Security Descriptions with S/MIME
84	       A.1.10. SDP-DH (expired)
85	       A.1.11. MIKEYv2 in SDP (expired)
86	       A.1.12. Evaluation Criteria - SIP
87	       A.1.13. Evaluation Criteria - Security
88	     A.2.  Media Path Keying Technique
89	       A.2.1.  ZRTP
90	     A.3.  Signaling and Media Path Keying Techniques
91	       A.3.1.  EKT
92	       A.3.2.  DTLS-SRTP
93	       A.3.3.  MIKEYv2 Inband (expired)
94	   Appendix B.  Out-of-Scope
95	   Appendix C.  Requirement renumbering in -02
96	   Authors' Addresses
97	   Intellectual Property and Copyright Statements

99	1.  Introduction

101	   The work on media security started when the Session Initiation
102	   Protocol (SIP) was still in its infancy.  With the increased SIP
103	   deployment and the availability of new SIP extensions and related
104	   protocols, the need for end-to-end security was re-evaluated.  The
105	   procedure of re-evaluating prior protocol work and design decisions
106	   is not an uncommon strategy and, to some extent, considered necessary
107	   to ensure that the developed protocols indeed meet the previously
108	   envisioned needs for the users on the Internet.

110	   This document summarizes media security requirements, i.e.,
111	   requirements for mechanisms that negotiate security context such as
112	   cryptographic keys and parameters for SRTP.

114	   The organization of this document is as follows: Section 2 introduces
115	   terminology, Section 3 describes various attack scenarios against the
116	   signaling path and media path, Section 4 provides an overview of
117	   possible call scenarios, and Section 5 lists requirements for media
118	   security.  The main part of the document concludes with security
119	   considerations in Section 6, IANA considerations in Section 7, and
120	   acknowledgements in Section 8.  Appendix A lists and compares
121	   available solution proposals.  Within it, Appendix A.1.12 compares
122	   the different approaches regarding their suitability for the SIP
123	   signaling scenarios described in Appendix A, while Appendix A.1.13
124	   provides a comparison regarding security aspects.  Appendix B lists
125	   non-goals for this document.

127	2.  Terminology

129	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
130	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
131	   document are to be interpreted as described in [RFC2119], with the
132	   important qualification that, unless otherwise stated, these terms
133	   apply to the design of the media security key management protocol,
134	   not its implementation or application.

136	   Additionally, the following items are used in this document:

138	   AOR (Address-of-Record):   A SIP or SIPS URI that points to a domain
139	      with a location service that can map the URI to another URI where
140	      the user might be available.  Typically, the location service is
141	      populated through registrations.  An AOR is frequently thought of
142	      as the "public address" of the user.

144	   SSRC:  The 32-bit value that defines the synchronization source, used
145	      in RTP.  These are generally unique, but collisions can occur.

147	   two-time pad:  The use of the same key and the same keystream to
148	      encrypt different data.  For SRTP, a two-time pad occurs if two
149	      senders are using the same key and the same RTP SSRC value.

151	   Perfect Forward Secrecy (PFS):  The property that disclosure of the
152	      long-term secret keying material that is used to derive an agreed
153	      ephemeral key does not compromise the secrecy of agreed keys from
154	      earlier runs.

156	   active adversary:  An active adversary is able to alter data
157	      communication to affect its operation (see also [RFC4949]).

159	   passive adversary:  A passive adversary is able to learn information
160	      from data communication, but not alter that data communication
161	      (see also [RFC4949]).

163	   signaling path:  The signaling path is the route taken by SIP
164	      signaling messages transmitted between the calling and called user
165	      agents.  This can be either direct signaling between the calling
166	      and called user agents or, more commonly, signaling through the
167	      SIP proxy servers that were involved in the call setup.

169	   media path:  The media path is the route taken by media packets
170	      exchanged by the endpoints.  In the simplest case, the endpoints
171	      exchange media directly, and the "media path" is defined by a
172	      quartet of IP addresses and TCP/UDP ports, along with an IP route.
173	      In other cases, this path may include RTP relays, mixers,
174	      transcoders, session border controllers, NATs, or media gateways.
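
The two-time pad defined above can be illustrated with a minimal sketch.
It assumes a cipher that XORs a keystream with the payload, as SRTP's
default counter-mode encryption does; the random keystream, the payload
strings, and the helper name xor_bytes are hypothetical and serve only
as illustration.

```python
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


# Assumption: a random 16-byte string stands in for the keystream that
# two SRTP senders would both produce from the same key and SSRC.
keystream = os.urandom(16)

plaintext_1 = b"hello from Alice"  # 16 bytes, hypothetical payload
plaintext_2 = b"reply from Bob.."  # 16 bytes, hypothetical payload

ciphertext_1 = xor_bytes(plaintext_1, keystream)
ciphertext_2 = xor_bytes(plaintext_2, keystream)

# The shared keystream cancels out: an eavesdropper obtains the XOR of
# the two plaintexts without ever learning the key.
leak = xor_bytes(ciphertext_1, ciphertext_2)
```

Anyone who knows or can guess one of the plaintexts can then recover
the other by XORing it with the leaked value.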

176	3.  Attack Scenarios

178	   The discussion in this section relates to requirements R-PASS-MEDIA,
179	   R-PASS-SIG, R-ASSOC, R-SIG-MEDIA, R-ACT-ACT, and R-ID-BINDING.

181	   This document classifies adversaries according to their access and
182	   their capabilities.  An adversary might have access:

184	   1.  only to the media path,

186	   2.  only to the signaling path,

188	   3.  to the media path and to the signaling path.

190	   An attacker that is located solely along the signaling path, and
191	   does not have access to media (item 2), is not considered in this
192	   document.

194	   There are two different types of adversaries, active and passive.  An
195	   active adversary may need to be active with regard to the
196	   key-exchange-relevant information traveling along the media path or
197	   traveling along the signaling path.

199	   Based on their robustness against the adversary capabilities
200	   described above, we can group security mechanisms using the following
201	   labels.  This list is generally ordered from easiest to compromise
202	   (at the top) to more difficult to compromise:

204	    +---------------+---------+--------------------------------------+
205	    | SIP signaling |  media  |             abbreviation             |
206	    +---------------+---------+--------------------------------------+
207	    |      none     | passive |      no-signaling-passive-media      |
208	    |      none     |  active |       no-signaling-active-media      |
209	    |    passive    | passive |    passive-signaling-passive-media   |
210	    |    passive    |  active |    passive-signaling-active-media    |
211	    |     active    | passive |    active-signaling-passive-media    |
212	    |     active    |  active |     active-signaling-active-media    |
213	    |     active    |  active | active-signaling-active-media-detect |
214	    +---------------+---------+--------------------------------------+

216	   no-signaling-passive-media:
217	      Access to only the media path is sufficient to reveal the content
218	      of the media traffic.

220	   passive-signaling-passive-media:
221	      Passive attack on the signaling and passive attack on the media
222	      path are necessary to reveal the content of the media traffic.

224	   passive-signaling-active-media:
225	      Passive attack on the signaling and active attack on the media
226	      path are necessary to reveal the content of the media traffic.

228	   active-signaling-passive-media:
229	      Active attack on the signaling path and passive attack on the
230	      media path are necessary to reveal the content of the media
231	      traffic.

233	   no-signaling-active-media:
234	      Active attack on the media path is sufficient to reveal the
235	      content of the media traffic.

237	   active-signaling-active-media:
238	      Active attack on both the signaling path and the media path is
239	      necessary to reveal the content of the media traffic.

241	   active-signaling-active-media-detect:
242	      Active attack on both signaling and media path is necessary to
243	      reveal the content of the media traffic (as with active-signaling-
244	      active-media), and the attack is detectable by protocol messages
245	      exchanged between the end points.
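
The classification above can be sketched as a small lookup, shown here
as an illustration only.  The names Access, REQUIRED, DETECTABLE, and
can_reveal_media are hypothetical, and the sketch assumes that an
active adversary can also do everything a passive adversary can.

```python
from enum import IntEnum


class Access(IntEnum):
    """Adversary capability on one path, ordered weakest to strongest."""
    NONE = 0
    PASSIVE = 1
    ACTIVE = 2


# (signaling, media) capability needed to reveal media, per the table.
REQUIRED = {
    "no-signaling-passive-media": (Access.NONE, Access.PASSIVE),
    "no-signaling-active-media": (Access.NONE, Access.ACTIVE),
    "passive-signaling-passive-media": (Access.PASSIVE, Access.PASSIVE),
    "passive-signaling-active-media": (Access.PASSIVE, Access.ACTIVE),
    "active-signaling-passive-media": (Access.ACTIVE, Access.PASSIVE),
    "active-signaling-active-media": (Access.ACTIVE, Access.ACTIVE),
    "active-signaling-active-media-detect": (Access.ACTIVE, Access.ACTIVE),
}

# Only the last class additionally lets the endpoints notice the attack.
DETECTABLE = {"active-signaling-active-media-detect"}


def can_reveal_media(label: str, signaling: Access, media: Access) -> bool:
    """True if the adversary meets the class's needs on both paths.

    Assumption: an active adversary can also act passively, so the
    capabilities are compared with >=.
    """
    need_signaling, need_media = REQUIRED[label]
    return signaling >= need_signaling and media >= need_media
```

For instance, under these assumptions an adversary that is passive on
the signaling path and active on the media path defeats a
passive-signaling-passive-media mechanism but not an
active-signaling-active-media one.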

247	   For example, unencrypted RTP is vulnerable to no-signaling-passive-
248	   media.

250	   As another example, Security Descriptions [RFC4568], when protected
251	   by TLS (as it is commonly implemented and deployed), belongs in the
252	   passive-signaling-passive-media category since the adversary needs to
253	   learn the Security Descriptions key by seeing the SIP signaling
254	   message at a SIP proxy (assuming that the adversary is in control of
255	   the SIP proxy).  The media traffic can be decrypted using that
256	   learned key.

258	   As another example, DTLS-SRTP falls into active-signaling-active-
259	   media category when DTLS-SRTP is used with a public key based
260	   ciphersuite with self-signed certificates and without SIP-Identity
261	   [RFC4474].  An adversary would have to modify the fingerprint that is
262	   sent along the signaling path and subsequently to modify the
263	   certificates carried in the DTLS handshake that travel along the
264	   media path.  If DTLS-SRTP is used with both SIP Identity [RFC4474]
265	   and SIP Connected Identity [RFC4916], the RFC4474 signature protects
266	   both the offer and the answer, and such a system would then belong to
267	   the active-signaling-active-media-detect category (provided, of
268	   course, the signaling path to the RFC4474 authenticator and verifier
269	   is secured as per RFC4474 and the RFC4474 authenticator and verifier
270	   are behaving as per RFC4474).

272	   The above discussion of DTLS-SRTP demonstrates how a single security
273	   protocol can be in different classes depending on the mode in which
274	   it is operated.  Other protocols can achieve similar effect by adding
275	   functions outside of the on-the-wire key management protocol itself.
276	   Although it may be appropriate to deploy lower-classed mechanisms in
277	   some cases, the ultimate security requirement for a media security
278	   negotiation protocol is that it have a mode of operation available in
279	   which it is active-signaling-active-media-detect, protecting against
280	   passive and active attacks and providing detection of such attacks.
281	   That is, there must be a way to use the protocol so that an active
282	   attack is required against both the signaling and media paths, and so
283	   that such attacks are detectable by the endpoints.

285	4.  Call Scenarios

287	   The following subsections describe call scenarios that pose the most
288	   challenge to the key management system for media data in cooperation
289	   with SIP signaling.

291	4.1.  Clipping Media Before Signaling Answer

293	   The discussion in this section relates to requirement R-AVOID-
294	   CLIPPING.

296	   Per the SDP Offer/Answer Model [RFC3264],

298	      "Once the offerer has sent the offer, it MUST be prepared to
299	      receive media for any recvonly streams described by that offer.
300	      It MUST be prepared to send and receive media for any sendrecv
301	      streams in the offer, and send media for any sendonly streams in
302	      the offer (of course, it cannot actually send until the peer
303	      provides an answer with the needed address and port information)."

305	   To meet this requirement with SRTP, the offerer needs to know the
306	   SRTP key for arriving media.  If either endpoint receives encrypted
307	   media before it has access to the associated SRTP key, it cannot play
308	   the media -- causing clipping.
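
The clipping effect described above can be sketched as a simple
timeline.  The function name, the packet interval, and the key arrival
time are hypothetical, and the sketch assumes the receiver does not
buffer packets it cannot yet decrypt.

```python
def split_by_key_arrival(packet_times_ms, key_time_ms):
    """Partition SRTP packet arrival times around the key's arrival.

    Assumption: the receiver does not buffer undecryptable packets, so
    anything that arrives before the key is lost to the listener.
    """
    clipped = [t for t in packet_times_ms if t < key_time_ms]
    playable = [t for t in packet_times_ms if t >= key_time_ms]
    return clipped, playable


# Hypothetical timeline: one packet every 20 ms, key learned at 50 ms.
clipped, playable = split_by_key_arrival([0, 20, 40, 60, 80], 50)
```

In this hypothetical timeline the first three packets cannot be played,
which the listener perceives as a clipped start of the call.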

310	   For key exchange mechanisms that send the answerer's key in SDP, a
311	   SIP provisional response [RFC3261], such as 183 (session progress),
312	   is useful.  However, the 183 messages are not reliable unless both
313	   the calling and called end point support PRACK [RFC3262], use TCP
314	   across all SIP proxies, implement Security Preconditions [RFC5027],
315	   or both ends implement ICE [I-D.ietf-mmusic-ice] and the answerer
316	   implements the reliable provisional response mechanism described in
317	   ICE.  Unfortunately, there is not wide deployment of any of these
318	   techniques and there is industry reluctance to require these
319	   techniques to avoid the problems described in this section.

321	   Note that the receipt of an SDP answer is not always sufficient to
322	   allow media to be played to the offerer.  Sometimes, the offerer must
323	   send media in order to open up firewall holes or NAT bindings before
324	   media can be received.  In this case, even a solution that makes the
325	   key available before the SDP answer arrives will not help.

327	   Fixes to early media (i.e., the media that arrives at the SDP offerer
328	   before the SDP answer arrives) might make the requirements
329	   obsolete, but at the time of writing no progress has been
330	   made.

332	4.2.  Retargeting and Forking

334	   The discussion in this section relates to requirements R-FORK-
335	   RETARGET, R-DISTINCT, R-HERFP, and R-BEST-SECURE.

337	   In SIP, a request sent to a specific AOR but delivered to a different
338	   AOR is called a "retarget".  A typical scenario is a "call
339	   forwarding" feature.  In Figure 1 Alice sends an INVITE in step 1
340	   that is sent to Bob in step 2.  Bob responds with a redirect (SIP
341	   response code 3xx) pointing to Carol in step 3.  This redirect
342	   typically does not propagate back to Alice but only goes to a proxy
343	   (i.e., the retargeting proxy) that sends the original INVITE to Carol
344	   in step 4.

346	                                    +-----+
347	                                    |Alice|
348	                                    +--+--+
349	                                       |
350	                                       | INVITE (1)
351	                                       V
352	                                  +----+----+
353	                                  |  proxy  |
354	                                  ++-+-----++
355	                                   | ^     |
356	                        INVITE (2) | |     | INVITE (4)
357	                    & redirect (3) | |     |
358	                                   V |     V
359	                                  ++-++   ++----+
360	                                  |Bob|   |Carol|
361	                                  +---+   +-----+

363	                           Figure 1: Retargeting

365	   Using retargeting might lead to situations where the UAC does not
366	   know where its request will be going.  This might not immediately
367	   seem like a serious problem; after all, when one places a telephone
368	   call on the PSTN, one never really knows if it will be forwarded to a
369	   different number, who will pick up the line when it rings, and so on.
370	   However, when considering SIP mechanisms for authenticating the
371	   called party, this function can also make it difficult to
372	   differentiate an intermediary that is behaving legitimately from an
373	   attacker.  From this perspective, the main problems with retargeting
374	   are:

376	   Not detectable by the caller:   The originating user agent has no
377	      means of anticipating that the condition will arise, nor any means
378	      of determining that it has occurred until the call has already
379	      been set up.

381	   Not preventable by the caller:  There is no existing mechanism that
382	      might be employed by the originating user agent in order to
383	      guarantee that the call will not be re-targeted.

385	   The mechanism used by SIP for identifying the calling party is SIP
386	   Identity [RFC4474].  However, due to the nature of retargeting SIP
387	   Identity can only identify the calling party (that is, the party that
388	   initiated the SIP request).  Some key exchange mechanisms predate SIP
389	   Identity and include their own identity mechanism (e.g., MIKEY).
390	   However, those built-in identity mechanisms also suffer from the SIP
391	   retargeting problem.  While Connected Identity [RFC4916] allows
392	   positive identification of the called party, the primary difficulty
393	   still remains that the calling party does not know if a mismatched
394	   called party is legitimate (i.e., due to authorized retargeting) or
395	   illegitimate (i.e., due to unauthorized retargeting by an attacker
396	   able to modify SIP signaling).

398	   In SIP, 'forking' is the delivery of a request to multiple locations.
399	   This happens when a single AOR is registered more than once.  An
400	   example of forking is when a user has a desk phone, PC client, and
401	   mobile handset all registered with the same AOR.

403	                                  +-----+
404	                                  |Alice|
405	                                  +--+--+
406	                                     |
407	                                     | INVITE
408	                                     V
409	                               +-----+-----+
410	                               |   proxy   |
411	                               ++---------++
412	                                |         |
413	                         INVITE |         | INVITE
414	                                V         V
415	                             +--+--+   +--+--+
416	                             |Bob-1|   |Bob-2|
417	                             +-----+   +-----+

419	                             Figure 2: Forking

421	   With forking, both Bob-1 and Bob-2 might send back SDP answers in SIP
422	   responses.  Alice will see those intermediate (18x) and final (200)
423	   responses.  It is useful for Alice to be able to associate the SIP
424	   response with the incoming media stream.  Although this association
425	   can be done with ICE [I-D.ietf-mmusic-ice], and ICE is useful to make
426	   this association with RTP, it is not desirable to require ICE to
427	   accomplish this association.

429	   Forking and retargeting are often used together.  For example, a boss
430	   and secretary might have both phones ring (forking) and roll over to
431	   voice mail if neither phone is answered (retargeting).

433	   To maintain security of the media traffic, only the end point that
434	   answers the call should know the SRTP keys for the session.  Forked
435	   and re-targeted calls only reveal sensitive information to non-
436	   responders when the signaling messages contain sensitive information
437	   (e.g., SRTP keys) that is accessible by parties that receive the
438	   offer, but may not respond (i.e., the original recipients in a
439	   retargeted call, or non-answering endpoints in a forked call).  For
440	   key exchange mechanisms that do not provide secure forking or secure
441	   retargeting, one workaround is to re-key immediately after forking or
442	   retargeting.  However, because the originator may not be aware that
443	   the call forked, this workaround requires rekeying immediately after
444	   every session is established.  This doubles the number of messages
445	   processed by the network.
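
The exposure described above can be sketched for the fork in Figure 2.
The function name and its parameters are hypothetical; the sketch
assumes that when the key travels in the offer, every endpoint that
receives the offer can read it.

```python
def non_responders_with_key(offer_recipients, answerer, key_in_offer):
    """Return endpoints that learned the SRTP key but did not answer.

    Assumption: when the key travels in the offer, every endpoint that
    receives the offer can read it.
    """
    if not key_in_offer:
        return set()
    return set(offer_recipients) - {answerer}


# Figure 2's fork: the proxy delivers Alice's INVITE to Bob-1 and Bob-2.
recipients = ["Bob-1", "Bob-2"]
exposed = non_responders_with_key(recipients, "Bob-1", key_in_offer=True)
safe = non_responders_with_key(recipients, "Bob-1", key_in_offer=False)
```

Here Bob-2 never answers the call yet still holds the key whenever the
offer carries it, which is the situation that rekeying after session
establishment is meant to work around.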

447	   Further compounding this problem is a unique feature of SIP: when
448	   forking is used, there is always only one final error response
449	   delivered to the sender of the request.  The forking proxy is
450	   responsible for choosing which final response to forward in the event
451	   that forking results in multiple final error responses being
452	   received by the forking proxy.  This means that if a request is
453	   rejected, say with information that the keying information was
454	   rejected and providing the far end's credentials, it is very possible
455	   that the rejection will never reach the sender.  This problem, called
456	   the Heterogeneous Error Response Forking Problem (HERFP)
457	   [I-D.mahy-sipping-herfp-fix], is difficult to solve in SIP.  Because
458	   we expect the HERFP to continue to be a problem in SIP for the
459	   foreseeable future, a media security system should function even in
460	   the presence of HERFP behavior.
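
The HERFP behavior described above can be sketched as follows.  The
selection rule (forward the lowest status code) is a simplifying
assumption, not the rule any particular proxy is known to use, and the
branch names, the "note" field, and the function name are hypothetical.

```python
def choose_final_response(branch_responses):
    """Pick the single final error response a forking proxy forwards.

    Assumption: the proxy simply prefers the lowest status code; real
    proxies may apply other preferences.
    """
    return min(branch_responses, key=lambda r: r["status"])


# Hypothetical branch results for one forked INVITE.
responses = [
    {"branch": "Bob-1", "status": 486, "reason": "Busy Here"},
    {"branch": "Bob-2", "status": 488, "reason": "Not Acceptable Here",
     "note": "keying information rejected"},
]

forwarded = choose_final_response(responses)
lost = [r for r in responses if r is not forwarded]
```

Under this assumed rule the caller sees only the 486 response, and the
488 response that carried the keying-related rejection never reaches
it.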

462	4.3.  Shared Key Conferencing

464	   The consensus on the RTPSEC mailing list was to concentrate on
465	   unicast, point-to-point sessions.  Thus, there are no requirements
466	   related to shared key conferencing.  This section is retained for
467	   informational purposes.

469	   For efficient scaling, large audio and video conference bridges
470	   operate by encrypting the current speaker's media once and
471	   distributing that stream to the conference attendees.  Typically,
472	   inactive participants receive the same streams -- they hear (or see)
473	   the active speaker(s), and the active speakers receive distinct
474	   streams that don't include themselves.  In order to maintain
475	   confidentiality of such conferences where listeners share a common
476	   key, all listeners must be rekeyed when a listener joins or leaves a
477	   conference.
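
The rekeying rule described above can be sketched as a small class.
The class and attribute names are hypothetical, and generating a fresh
random 16-byte key on every membership change is an illustrative
assumption rather than a mechanism defined by this document.

```python
import os


class SharedKeyConference:
    """Tracks listeners that share one key, rekeying on every change."""

    def __init__(self):
        self.listeners = set()
        self.key = os.urandom(16)  # assumption: 16-byte random key
        self.generation = 0        # counts how many rekeys have occurred

    def _rekey(self):
        # A new key keeps a joiner from reading earlier media and a
        # leaver from reading later media.
        self.key = os.urandom(16)
        self.generation += 1

    def join(self, name):
        self.listeners.add(name)
        self._rekey()

    def leave(self, name):
        self.listeners.discard(name)
        self._rekey()


conference = SharedKeyConference()
conference.join("Alice")
conference.join("Bob")
conference.leave("Alice")
```

After two joins and one leave, the shared key has been replaced three
times, which shows how the rekeying cost grows with membership churn.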

479	   An important use case for mixers/translators is a conference bridge:

481	                                         +----+
482	                             A --- 1 --->|    |
483	                               <-- 2 ----| M  |
484	                                         | I  |
485	                             B --- 3 --->| X  |
486	                               <-- 4 ----| E  |
487	                                         | R  |
488	                             C --- 5 --->|    |
489	                               <-- 6 ----|    |
490	                                         +----+

492	                       Figure 3: Centralized Keying

494	   In the figure above, 1, 3, and 5 are RTP media contributions from
495	   Alice, Bob, and Carol, and 2, 4, and 6 are the RTP flows to those
496	   devices carrying the 'mixed' media.

498	   Several scenarios are possible:

500	   a.  Multiple inbound sessions: 1, 3, and 5 are distinct RTP sessions,

502	   b.  Multiple outbound sessions: 2, 4, and 6 are distinct RTP
503	       sessions,

505	   c.  Single inbound session: 1, 3, and 5 are just different sources
506	       within the same RTP session,

508	   d.  Single outbound session: 2, 4, and 6 are different flows of the
509	       same (multi-unicast) RTP session

511	   If there are multiple inbound sessions and multiple outbound sessions
512	   (scenarios a and b), then every keying mechanism behaves as if the
513	   mixer were an end point and can set up a point-to-point secure
514	   session between the participant and the mixer.  This is the simplest
515	   situation, but is computationally wasteful, since SRTP processing has
516	   to be done independently for each participant.  The use of multiple
517	   inbound sessions (scenario a) doesn't waste computational resources,
518	   though it does consume additional cryptographic context on the mixer
519	   for each participant and has the advantage of non-repudiation of the
520	   originator of the incoming stream.

522	   To support a single outbound session (scenario d), the mixer has to
523	   dictate its encryption key to the participants.  Some keying
524	   mechanisms allow the transmitter to determine its own key, and others
525	   allow the offerer to determine the key for the offerer and answerer.
526	   Depending on how the call is established, the offerer might be a
527	   participant (such as a participant dialing into a conference bridge)
528	   or the offerer might be the mixer (such as a conference bridge
529	   calling a participant).  The use of offerless INVITEs may help some
530	   keying mechanisms reverse the role of offerer/answerer.  A
531	   difficulty, however, is knowing a priori if the role should be
532	   reversed for a particular call.

534	4.4.  Recording

536	   The discussion in this section relates to requirement R-RECORDING.

538	   Some business environments, such as stock brokers, banks, and catalog
539	   call centers, require recording calls with customers.  This is the
540	   familiar "this call is being recorded for quality purposes" heard
541	   during calls to these sorts of businesses.  In these environments,
542	   media recording is typically performed by an intermediate device
543	   (with RTP, this is typically implemented in a 'sniffer').

545	   When performing such call recording with SRTP, the end-to-end
546	   security is compromised.  This compromise is unavoidable because
547	   the operation of the business requires such recording.  It is
548	   desirable that the media security is not unduly compromised by the
549	   media recording.  The endpoint within the organization needs to be
550	   informed that there is an intermediate device and needs to cooperate
551	   with that intermediate device.

553	   This scenario does not place a requirement directly on the key
554	   management protocol.  The requirement could be met directly by the
555	   key management protocol (e.g., MIKEY-NULL or [RFC4568]) or through an
556	   external out-of-band mechanism (e.g., [I-D.wing-sipping-srtp-key]).

558	4.5.  PSTN gateway

560	   The discussion in this section relates to requirement R-PSTN.

562	   It is desirable, even when one leg of a call is on the PSTN, that the
563	   IP leg of the call be protected with SRTP.

565	   A typical use of media security involves two entities having
566	   a VoIP conversation over IP capable networks.  However, there are
567	   cases where the other end of the communication is not connected to an
568	   IP capable network.  In this kind of setting, there needs to be some
569	   kind of gateway at the edge of the IP network which converts the VoIP
570	   conversation to a format understood by the other network.  An example
571	   of such a gateway is a PSTN gateway at the edge of IP and PSTN
572	   networks (such as the architecture described in [RFC3372]).

574	   If media security (e.g., SRTP protection) is employed in this kind of
575	   gateway setting, then media security and the related key management
576	   are terminated at the PSTN gateway.  The other network (e.g., PSTN)
577	   may have its own measures to protect the communication, but this
578	   means that, from a media security point of view, media security is
579	   not employed truly end-to-end between the communicating entities.

581	4.6.  Call Setup Performance

583	   The discussion in this section relates to requirement R-REUSE.

585	   Some devices lack sufficient processing power to perform public key
586	   operations or Diffie-Hellman operations for each call, or prefer to
587	   avoid performing those operations on every call.  The ability to re-
588	   use previous public key or Diffie-Hellman operations can vastly
589	   decrease the call setup delay and processing requirements for such
590	   devices.

592	   In certain devices, it can take a second or two to perform a Diffie-
593	   Hellman operation.  Examples of these devices include handsets, IP
594	   Multimedia Services Identity Module (ISIMs), and PSTN gateways.  PSTN
595	   gateways typically utilize a Digital Signal Processor (DSP) which is
596	   not yet involved with typical DSP operations at the beginning of a
597	   call, thus the DSP could be used to perform the calculation, so as to
598	   avoid having the central host processor perform the calculation.
599	   However, not all PSTN gateways use DSPs (some have only central
600	   processors or their DSPs are incapable of performing the necessary
601	   public key or Diffie-Hellman operation), and handsets lack a
602	   separate, unused processor to perform these operations.

604	   Two scenarios where R-REUSE is useful are calls between an endpoint
605	   and its voicemail server or its PSTN gateway.  In those scenarios
606	   calls are made relatively often and it can be useful for the
607	   voicemail server or PSTN gateway to avoid public key operations for
608	   subsequent calls.

610	   Storing keys across sessions often interferes with perfect forward
611	   secrecy (R-PFS).
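The trade-off between R-REUSE and R-PFS described above can be sketched as follows. This is a minimal illustration, not a mechanism defined by this document: the `ReuseCache` class, the `session_key` helper, and the label `b"srtp-session"` are hypothetical, and the toy group (a 127-bit Mersenne prime) is an assumption chosen only to keep the example short and is far too small for real use.

```python
import hashlib
import hmac
import os

# Toy parameters for illustration only (assumption): a real protocol would
# use a standardized, much larger Diffie-Hellman group.
P = 2**127 - 1
G = 3


class ReuseCache:
    """Hypothetical helper: caches one Diffie-Hellman shared secret per peer."""

    def __init__(self):
        self._secrets = {}
        self.dh_operations = 0  # counts the expensive exponentiations

    def shared_secret(self, peer_id, my_private, peer_public):
        # Only the first call to a given peer pays for the exponentiation.
        if peer_id not in self._secrets:
            self.dh_operations += 1
            value = pow(peer_public, my_private, P)
            self._secrets[peer_id] = value.to_bytes(16, "big")
        return self._secrets[peer_id]


def session_key(shared, call_nonce):
    # Each call derives a fresh key from the cached secret and a new nonce.
    return hmac.new(shared, b"srtp-session" + call_nonce, hashlib.sha256).digest()


alice_private = int.from_bytes(os.urandom(15), "big") + 2
bob_private = int.from_bytes(os.urandom(15), "big") + 2
alice_public = pow(G, alice_private, P)
bob_public = pow(G, bob_private, P)

cache = ReuseCache()
first_call = session_key(
    cache.shared_secret("bob", alice_private, bob_public), os.urandom(16))
second_call = session_key(
    cache.shared_secret("bob", alice_private, bob_public), os.urandom(16))
```

In this sketch the second call skips the Diffie-Hellman operation, but anyone who later obtains the cached secret and observes the nonces can recompute both session keys, which is the interference with perfect forward secrecy noted above.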

613	4.7.  Transcoding

615	   The discussion in this section relates to requirement R-TRANSCODER.

617	   In some environments it is necessary for network equipment to
618	   transcode from one codec (e.g., a highly compressed codec which makes
619	   efficient use of wireless bandwidth) to another codec (e.g., a
620	   standardized codec to a SIP peering interface).  With RTP, a
621	   transcoding function can be performed with the combination of a SIP
622	   B2BUA (to modify the SDP) and a processor to perform the transcoding
623	   between the codecs.  However, with end-to-end secured SRTP, a
624	   transcoding function implemented the same way is a man-in-the-middle
625	   attack, and the key management system prevents its use.

627	   However, such a network-based transcoder can still be realized with
628	   the cooperation and approval of the endpoint, and can provide end-to-
629	   transcoder and transcoder-to-end security.

631	5.  Requirements

633	   This section is divided into several parts: requirements specific to
634	   the key management protocol (Section 5.1), attack scenarios
635	   (Section 5.2), and requirements which can be met inside the key
636	   management protocol or outside of the key management protocol
637	   (Section 5.3).

639	5.1.  Key Management Protocol Requirements

641	   SIP Forking and Retargeting, from Section 4.2:

643	   R-FORK-RETARGET:
644	         The media security key management protocol MUST securely
645	         support forking and retargeting when all endpoints are willing
646	         to use SRTP without causing the call setup to fail.  This
647	         requirement means the endpoints that did not answer the call
648	         MUST NOT learn the SRTP keys (in either direction) used by the
649	         answering endpoint.

651	   R-DISTINCT:
652	         The media security key management protocol MUST be capable of
653	         creating distinct, independent cryptographic contexts for each
654	         endpoint in a forked session.

656	   R-HERFP:
657	         The media security key management protocol MUST function
658	         securely even in the presence of HERFP behavior.

660	   Performance considerations:

662	   R-REUSE:
663	         The media security key management protocol MAY support the re-
664	         use of a previously established security context.

666	               Note: re-use of the security context does not imply re-
667	               use of RTP parameters (e.g., payload type or SSRC).

669	   Media considerations:

671	   R-AVOID-CLIPPING:
672	         The media security key management protocol SHOULD avoid
673	         clipping media before SDP answer without requiring Security
674	         Preconditions [RFC5027].  This requirement comes from
675	         Section 4.1.

677	   R-RTP-VALID:
678	         If SRTP key negotiation is performed over the media path (i.e.,
679	         using the same UDP/TCP ports as media packets), the key
680	         negotiation packets MUST NOT pass the RTP validity check
681	         defined in Appendix A.1 of [RFC3550].

683	   R-ASSOC:
684	         The media security key management protocol SHOULD include a
685	         mechanism for associating key management messages with both the
686	         signaling traffic that initiated the session and with protected
687	         media traffic.  Allowing such an association also allows the
688	         SDP offerer to avoid performing CPU-consuming operations (e.g.,
689	         Diffie-Hellman or public key operations) with attackers that
690	         have not seen the signaling messages.

692	         For example, if using a Diffie-Hellman keying technique with
693	         security preconditions that forks to 20 end points, the call
694	         initiator would get 20 provisional responses containing 20
695	         signed Diffie-Hellman key pairs.  Calculating 20 DH secrets and
696	         validating signatures can be a difficult task depending on the
697	         device capabilities.  Hence, in the case of forking, it is not
698	         desirable to perform a DH or PK operation with every party, but
699	         rather only with the party that answers the call (and incur
700	         some media clipping).  To do this, the signaling and media need
701	         to be associated so the calling party knows which key
702	         management needs to be completed.  This might be done by using
703	         the transport address indicated in the SDP, although NATs can
704	         complicate this association.

706	               Note: due to RTP's design requirements, it is expected
707	               that SRTP receivers will have to perform authentication
708	               of any received SRTP packets.

710	   R-NEGOTIATE:
711	         The media security key management protocol MUST allow a SIP
712	         User Agent to negotiate media security parameters for each
713	         individual session.

715	   R-PSTN:
716	         The media security key management protocol MUST support
717	         termination of media security in a PSTN gateway.  This
718	         requirement is from Section 4.5.
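As an illustration of R-RTP-VALID, the sketch below implements a subset of the RTP header validity checks listed in Appendix A.1 of [RFC3550] (version, payload type, CSRC count, and padding) and shows a key negotiation packet failing them. The function name, the set of known payload types, and the sample packets are assumptions made for this example; the DTLS-style record is used only as one possible kind of media-path key negotiation packet.

```python
# Assumption: only PCMU (0) and PCMA (8) were negotiated for this session.
KNOWN_PAYLOAD_TYPES = {0, 8}


def passes_rtp_validity_check(packet, known_pts=KNOWN_PAYLOAD_TYPES):
    """Subset of the RFC 3550 Appendix A.1 header validity checks."""
    if len(packet) < 12:              # shorter than the fixed RTP header
        return False
    first, second = packet[0], packet[1]
    if first >> 6 != 2:               # RTP version field must equal 2
        return False
    if second in (200, 201):          # must not look like an RTCP SR or RR
        return False
    if (second & 0x7F) not in known_pts:   # payload type must be known
        return False
    header_len = 12 + 4 * (first & 0x0F)   # fixed header plus CSRC list
    if len(packet) < header_len:
        return False
    if first & 0x20:                  # P bit set: last octet is a pad count
        pad = packet[-1]
        if pad == 0 or pad >= len(packet) - header_len:
            return False
    return True


# A plausible RTP packet: version 2, payload type 0, 160 payload octets.
rtp_packet = bytes([0x80, 0x00]) + bytes(10) + bytes(160)

# A DTLS-style handshake record: content type 22, version bytes 0xFE 0xFF.
key_negotiation_packet = bytes([0x16, 0xFE, 0xFF]) + bytes(20)

rtp_ok = passes_rtp_validity_check(rtp_packet)
keying_ok = passes_rtp_validity_check(key_negotiation_packet)
```

Because the first octet of the key negotiation packet carries a version field other than 2, a receiver applying this check can separate it from media, which is the property R-RTP-VALID asks for.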

720	5.2.  Security Requirements

722	   This section describes overall security requirements and specific
723	   requirements from the attack scenarios (Section 3).

725	   Overall security requirements:

727	   R-PFS:
728	         The media security key management protocol MUST be able to
729	         support perfect forward secrecy.

731	   R-COMPUTE:
732	         The media security key management protocol MUST support
733	         offering additional SRTP cipher suites without incurring
734	         significant computational expense.

736	   R-CERTS:
737	         If the media security key management protocol employs
738	         certificates, it MUST be able to make use of both self-signed
739	         and CA-issued certificates.  As an alternative, the media
740	         security key management protocol MAY make use of "bare" public
741	         keys.

743	   R-FIPS:
744	         The media security key management protocol SHOULD use
745	         algorithms that allow FIPS 140-2 [FIPS-140-2] certification.

747	         Note that the United States Government can only purchase and
748	         use crypto implementations that have been validated by the
749	         FIPS-140 [FIPS-140-2] process:

751	         "The FIPS-140 standard is applicable to all Federal agencies
752	         that use cryptographic-based security systems to protect
753	         sensitive information in computer and telecommunication
754	         systems, including voice systems.  The adoption and use of this
755	         standard is available to private and commercial
756	         organizations."[cryptval]

758	         Some commercial organizations, such as banks and defense
759	         contractors, also require or prefer equipment which has been
760	         validated by the FIPS-140 process.

762	   R-DOS:
763	         The media security key management protocol SHOULD NOT introduce
764	         new denial of service vulnerabilities (e.g., the protocol
765	         should not request the endpoint to perform CPU-intensive
766	         operations without the client being able to validate or
767	         authorize the request).

769	   R-EXISTING:
770	         The media security key management protocol SHOULD allow
771	         endpoints to authenticate using pre-existing cryptographic
772	         credentials, e.g., certificates or pre-shared keys.

774	   R-AGILITY:
775	         The media security key management protocol MUST provide crypto-
776	         agility, i.e., the ability to adapt to evolving cryptography
777	         and security requirements (update of cryptographic algorithms
778	         without substantial disruption to deployed implementations).

780	   R-DOWNGRADE:
781	         The media security key management protocol MUST protect cipher
782	         suite negotiation against downgrading attacks.

784	   R-PASS-MEDIA:
785	         The media security key management protocol MUST have a mode
786	         which prevents a passive adversary with access to the media
787	         path from gaining access to keying material used to protect
788	         SRTP media packets.

790	   R-PASS-SIG:
791	         The media security key management protocol MUST have a mode in
792	         which it prevents a passive adversary with access to the
793	         signaling path from gaining access to keying material used to
794	         protect SRTP media packets.

796	   R-SIG-MEDIA:
797	         The media security key management protocol MUST have a mode in
798	         which it defends itself from an attacker that is solely on the
799	         media path and from an attacker that is solely on the signaling
800	         path.  A successful attack refers to the ability for the
801	         adversary to obtain keying material to decrypt the SRTP
802	         encrypted media traffic.

804	   R-ID-BINDING:
805	         The media security key management protocol MUST use identifiers
806	         for endpoints that allow a domain to create signatures over
807	         those identifiers and the From address.

809	               This allows domains to deploy SIP Identity [RFC4474].

811	   R-ACT-ACT:
812	         The media security key management protocol MUST support a mode
813	         of operation that provides active-signaling-active-media-detect
814	         robustness, and MAY support modes of operation that provide
815	         lower levels of robustness (as described in Section 3).

817	               Failing to meet R-ACT-ACT indicates the protocol cannot
818	               provide secure end-to-end media.
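One common way to meet a requirement such as R-DOWNGRADE is to authenticate the negotiation transcript with a key the attacker does not know. The sketch below assumes both endpoints already share such a key; the function names and the transcript encoding are hypothetical and are not taken from any of the keying mechanisms evaluated in this document.

```python
import hashlib
import hmac


def negotiation_tag(key, offered, chosen):
    # MAC over the list of offered suites (as seen by the answerer)
    # and the suite the answerer selected.
    transcript = ",".join(offered).encode() + b"|" + chosen.encode()
    return hmac.new(key, transcript, hashlib.sha256).digest()


def offerer_accepts(key, offered_as_sent, chosen, tag):
    # The offerer recomputes the tag over the list it actually sent.
    expected = negotiation_tag(key, offered_as_sent, chosen)
    return chosen in offered_as_sent and hmac.compare_digest(expected, tag)


key = b"\x01" * 32  # assumption: established earlier by the key exchange
sent = ["AES_CM_128_HMAC_SHA1_80", "AES_CM_128_HMAC_SHA1_32"]

# Honest case: the answerer sees the full list and picks the first suite.
honest_tag = negotiation_tag(key, sent, sent[0])
honest_ok = offerer_accepts(key, sent, sent[0], honest_tag)

# Attack case: an on-path attacker removes the stronger suite in transit,
# so the answerer tags the shortened list it actually received.
tampered = sent[1:]
attacked_tag = negotiation_tag(key, tampered, tampered[0])
attack_detected = not offerer_accepts(key, sent, tampered[0], attacked_tag)
```

The check fails in the attack case because the two sides computed the tag over different offer lists, so the offerer can abort rather than proceed with the weaker suite.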

820	5.3.  Requirements Outside of the Key Management Protocol

822	   The requirements in this section are for an overall VoIP security
823	   system.  These requirements can be met within the key management
824	   protocol itself, or can be solved outside of the key management
825	   protocol itself (e.g., solved in SIP or in SDP).

827	   R-BEST-SECURE:
828	         Even when some end points of a forked or retargeted call are
829	         incapable of using SRTP, a solution MUST be described which
830	         allows the establishment of SRTP associations with SRTP-capable
831	         endpoints and / or RTP associations with non-SRTP-capable
832	         endpoints.  This requirement comes from Section 4.2.

834	   R-OTHER-SIGNALING:
835	         A solution SHOULD be able to negotiate keys for SRTP sessions
836	         created via different call signaling protocols (e.g., between
837	         Jabber, SIP, H.323, MGCP).

839	   R-RECORDING:
840	         A solution SHOULD be described which supports recording of
841	         decrypted media.  This requirement comes from Section 4.4.

843	   R-TRANSCODER:
844	         A solution SHOULD be described which supports intermediate
845	         nodes (e.g., transcoders), terminating or processing media,
846	         between the end points.

848	6.  Security Considerations

850	   This document lists requirements for securing media traffic.  As
851	   such, it addresses security throughout the document.

853	7.  IANA Considerations

855	   This document does not require actions by IANA.

857	8.  Acknowledgements

859	   For contributions to the requirements portion of this document, the
860	   authors would like to thank the active participants of the RTPSEC BoF
861	   and on the RTPSEC mailing list.  The authors would furthermore like
862	   to thank Wolfgang Buecker, Guenther Horn, Peter Howard, Hans-Heinrich
863	   Grusdt, Srinath Thiruvengadam, Martin Euchner, Eric Rescorla, Matt
864	   Lepinski, Dan York, Werner Dittmann, Richard Barnes, Vesa Lehtovirta,
865	   Colin Perkins, Peter Schneider, and Christer Holmberg for their
866	   feedback to this document.

868	   For contributions to the analysis portion of this document, the
869	   authors would like to give special thanks to Steffen Fries and
870	   Dragan Ignjatic for their excellent MIKEY comparison document
871	   [I-D.ietf-msec-mikey-applicability].  The authors would furthermore
872	   like to thank Cullen Jennings, David Oran, David McGrew, Mark
873	   Baugher, Flemming Andreasen, Eric Raymond, Dave Ward, Leo Huang, Eric
874	   Rescorla, Lakshminath Dondeti, Steffen Fries, Alan Johnston, Dragan
875	   Ignjatic and John Elwell for their feedback to this document.

877	   Thanks to Richard Barnes and Peter Schneider for thorough reviews and
878	   suggestions which improved the document considerably.

880	9.  References

882	9.1.  Normative References

884	   [FIPS-140-2]
885	              NIST, "Security Requirements for Cryptographic Modules",
886	              June 2005, <http://csrc.nist.gov/publications/fips/
887	              fips140-2/fips1402.pdf>.

889	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
890	              Requirement Levels", BCP 14, RFC 2119, March 1997.

892	   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
893	              A., Peterson, J., Sparks, R., Handley, M., and E.
894	              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
895	              June 2002.

897	   [RFC3262]  Rosenberg, J. and H. Schulzrinne, "Reliability of
898	              Provisional Responses in Session Initiation Protocol
899	              (SIP)", RFC 3262, June 2002.

901	   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
902	              with Session Description Protocol (SDP)", RFC 3264,
903	              June 2002.

905	   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
906	              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
907	              RFC 3711, March 2004.

909	   [cryptval]
910	              NIST, "Cryptographic Module Validation Program",
911	              December 2006,
912	              <http://csrc.nist.gov/cryptval/140-2APP.htm>.

914	9.2.  Informative References

916	   [I-D.baugher-mmusic-sdp-dh]
917	              Baugher, M. and D. McGrew, "Diffie-Hellman Exchanges for
918	              Multimedia Sessions", draft-baugher-mmusic-sdp-dh-00 (work
919	              in progress), February 2006.

921	   [I-D.dondeti-msec-rtpsec-mikeyv2]
922	              Dondeti, L., "MIKEYv2: SRTP Key Management using MIKEY,
923	              revisited", draft-dondeti-msec-rtpsec-mikeyv2-01 (work in
924	              progress), March 2007.

926	   [I-D.fischl-sipping-media-dtls]
927	              Fischl, J., "Datagram Transport Layer Security (DTLS)
928	              Protocol for Protection of Media  Traffic Established with
929	              the Session Initiation Protocol",
930	              draft-fischl-sipping-media-dtls-03 (work in progress),
931	              July 2007.

933	   [I-D.ietf-avt-dtls-srtp]
934	              McGrew, D. and E. Rescorla, "Datagram Transport Layer
935	              Security (DTLS) Extension to Establish Keys for  Secure
936	              Real-time Transport Protocol (SRTP)",
937	              draft-ietf-avt-dtls-srtp-01 (work in progress),
938	              November 2007.

940	   [I-D.ietf-mmusic-ice]
941	              Rosenberg, J., "Interactive Connectivity Establishment
942	              (ICE): A Protocol for Network Address  Translator (NAT)
943	              Traversal for Offer/Answer Protocols",
944	              draft-ietf-mmusic-ice-19 (work in progress), October 2007.

946	   [I-D.ietf-mmusic-media-path-middleboxes]
947	              Stucker, B. and H. Tschofenig, "Analysis of Middlebox
948	              Interactions for Signaling Protocol Communication  along
949	              the Media Path",
950	              draft-ietf-mmusic-media-path-middleboxes-00 (work in
951	              progress), January 2008.

953	   [I-D.ietf-mmusic-sdp-capability-negotiation]
954	              Andreasen, F., "SDP Capability Negotiation",
955	              draft-ietf-mmusic-sdp-capability-negotiation-08 (work in
956	              progress), December 2007.

958	   [I-D.ietf-msec-mikey-applicability]
959	              Fries, S. and D. Ignjatic, "On the applicability of
960	              various MIKEY modes and extensions",
961	              draft-ietf-msec-mikey-applicability-08 (work in progress),
962	              February 2008.

964	   [I-D.ietf-msec-mikey-ecc]
965	              Milne, A., "ECC Algorithms for MIKEY",
966	              draft-ietf-msec-mikey-ecc-03 (work in progress),
967	              June 2007.

969	   [I-D.ietf-sip-certs]
970	              Jennings, C., Peterson, J., and J. Fischl, "Certificate
971	              Management Service for The Session Initiation Protocol
972	              (SIP)", draft-ietf-sip-certs-05 (work in progress),
973	              February 2008.

975	   [I-D.jennings-sipping-multipart]
976	              Wing, D. and C. Jennings, "Session Initiation Protocol
977	              (SIP) Offer/Answer with Multipart Alternative",
978	              draft-jennings-sipping-multipart-02 (work in progress),
979	              March 2006.

981	   [I-D.mahy-sipping-herfp-fix]
982	              Mahy, R., "A Solution to the Heterogeneous Error Response
983	              Forking Problem (HERFP) in  the Session Initiation
984	              Protocol (SIP)", draft-mahy-sipping-herfp-fix-01 (work in
985	              progress), March 2006.

987	   [I-D.mcgrew-srtp-ekt]
988	              McGrew, D., "Encrypted Key Transport for Secure RTP",
989	              draft-mcgrew-srtp-ekt-03 (work in progress), July 2007.

991	   [I-D.wing-sipping-srtp-key]
992	              Wing, D., Audet, F., Fries, S., and H. Tschofenig,
993	              "Disclosing Secure RTP (SRTP) Session Keys with a SIP
994	              Event Package", draft-wing-sipping-srtp-key-02 (work in
995	              progress), November 2007.

997	   [I-D.zimmermann-avt-zrtp]
998	              Zimmermann, P., "ZRTP: Media Path Key Agreement for Secure
999	              RTP", draft-zimmermann-avt-zrtp-04 (work in progress),
1000	              July 2007.

1002	   [RFC3372]  Vemuri, A. and J. Peterson, "Session Initiation Protocol
1003	              for Telephones (SIP-T): Context and Architectures",
1004	              BCP 63, RFC 3372, September 2002.

1006	   [RFC3388]  Camarillo, G., Eriksson, G., Holler, J., and H.
1007	              Schulzrinne, "Grouping of Media Lines in the Session
1008	              Description Protocol (SDP)", RFC 3388, December 2002.

1010	   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
1011	              Jacobson, "RTP: A Transport Protocol for Real-Time
1012	              Applications", STD 64, RFC 3550, July 2003.

1014	   [RFC3830]  Arkko, J., Carrara, E., Lindholm, F., Naslund, M., and K.
1015	              Norrman, "MIKEY: Multimedia Internet KEYing", RFC 3830,
1016	              August 2004.

1018	   [RFC4346]  Dierks, T. and E. Rescorla, "The Transport Layer Security
1019	              (TLS) Protocol Version 1.1", RFC 4346, April 2006.

1021	   [RFC4474]  Peterson, J. and C. Jennings, "Enhancements for
1022	              Authenticated Identity Management in the Session
1023	              Initiation Protocol (SIP)", RFC 4474, August 2006.

1025	   [RFC4492]  Blake-Wilson, S., Bolyard, N., Gupta, V., Hawk, C., and B.
1026	              Moeller, "Elliptic Curve Cryptography (ECC) Cipher Suites
1027	              for Transport Layer Security (TLS)", RFC 4492, May 2006.

1029	   [RFC4568]  Andreasen, F., Baugher, M., and D. Wing, "Session
1030	              Description Protocol (SDP) Security Descriptions for Media
1031	              Streams", RFC 4568, July 2006.

1033	   [RFC4650]  Euchner, M., "HMAC-Authenticated Diffie-Hellman for
1034	              Multimedia Internet KEYing (MIKEY)", RFC 4650,
1035	              September 2006.

1037	   [RFC4738]  Ignjatic, D., Dondeti, L., Audet, F., and P. Lin, "MIKEY-
1038	              RSA-R: An Additional Mode of Key Distribution in
1039	              Multimedia Internet KEYing (MIKEY)", RFC 4738,
1040	              November 2006.

1042	   [RFC4771]  Lehtovirta, V., Naslund, M., and K. Norrman, "Integrity
1043	              Transform Carrying Roll-Over Counter for the Secure Real-
1044	              time Transport Protocol (SRTP)", RFC 4771, January 2007.

1046	   [RFC4916]  Elwell, J., "Connected Identity in the Session Initiation
1047	              Protocol (SIP)", RFC 4916, June 2007.

1049	   [RFC4949]  Shirey, R., "Internet Security Glossary, Version 2",
1050	              RFC 4949, August 2007.

1052	   [RFC5027]  Andreasen, F. and D. Wing, "Security Preconditions for
1053	              Session Description Protocol (SDP) Media Streams",
1054	              RFC 5027, October 2007.

1056	Appendix A.  Overview and Evaluation of Existing Keying Mechanisms

1058	   Based on how the SRTP keys are exchanged, each SRTP key exchange
1059	   mechanism belongs to one of the following general categories:

1061	A.1.  Signaling Path Keying Techniques

1063	      signaling path:  All the keying is carried in the call signaling
1064	         (SIP or SDP) path.

1066	      media path:  All the keying is carried in the SRTP/SRTCP media
1067	         path, and no signaling whatsoever is carried in the call
1068	         signaling path.

1070	      signaling and media path:  Parts of the keying are carried in the
1071	         SRTP/SRTCP media path, and parts are carried in the call
1072	         signaling (SIP or SDP) path.

1074	   One of the significant benefits of SRTP over other end-to-end
1075	   encryption mechanisms, such as for example IPsec, is that SRTP is
1076	   bandwidth efficient and SRTP retains the header of RTP packets.
1077	   Bandwidth efficiency is vital for VoIP in many scenarios where access
1078	   bandwidth is limited or expensive, and retaining the RTP header is
1079	   important for troubleshooting packet loss, delay, and jitter.

1081	   Related to SRTP's characteristics is the goal that any SRTP keying
1082	   mechanism also be efficient and not cause additional call setup
1083	   delay.  Contributors to additional call setup delay include network
1084	   or database operations: retrieval of certificates and additional SIP
1085	   or media path messages, and computational overhead of establishing
1086	   keys or validating certificates.

1088	   When examining the choice between keying in the signaling path,
1089	   keying in the media path, or keying in both paths, it is important to
1090	   realize the media path is generally 'faster' than the SIP signaling
1091	   path.  The SIP signaling path has computational elements involved
1092	   which parse and route SIP messages.  The media path, on the other
1093	   hand, does not normally have computational elements involved, and
1094	   even when computational elements such as firewalls are involved, they
1095	   cause very little additional delay.  Thus, the media path can be
1096	   useful for exchanging several messages to establish SRTP keys.  A
1097	   disadvantage of keying over the media path is that interworking
1098	   different key exchange mechanisms requires the interworking function
1099	   to be in the media path, rather than just in the signaling path; in
1100	   practice this involvement is probably unavoidable anyway.

1102	A.1.1.  MIKEY-NULL

1104	   MIKEY-NULL [RFC3830] has the offerer indicate the SRTP keys for both
1105	   directions.  The key is sent unencrypted in SDP, which means the SDP
1106	   must be encrypted hop-by-hop (e.g., by using TLS (SIPS)) or end-to-
1107	   end (e.g., by using S/MIME).

1109	   MIKEY-NULL requires one message from offerer to answerer (half a
1110	   round trip), and does not add additional media path messages.

1112	A.1.2.  MIKEY-PSK

1114	   MIKEY-PSK (pre-shared key) [RFC3830] requires that all endpoints
1115	   share one common key.  MIKEY-PSK has the offerer encrypt the SRTP
1116	   keys for both directions using this pre-shared key.

1118	   MIKEY-PSK requires one message from offerer to answerer (half a round
1119	   trip), and does not add additional media path messages.

1121	A.1.3.  MIKEY-RSA

1123	   MIKEY-RSA [RFC3830] has the offerer encrypt the keys for both
1124	   directions using the intended answerer's public key, which is
1125	   obtained from a mechanism outside of MIKEY.

1127	   MIKEY-RSA requires one message from offerer to answerer (half a round
1128	   trip), and does not add additional media path messages.  MIKEY-RSA
1129	   requires the offerer to obtain the intended answerer's certificate.

1131	A.1.4.  MIKEY-RSA-R

1133	   MIKEY-RSA-R [RFC4738] is essentially the same as MIKEY-RSA but
1134	   reverses the role of the offerer and the answerer with regards to
1135	   providing the keys.  That is, the answerer encrypts the keys for both
1136	   directions using the offerer's public key.  Both the offerer and
1137	   answerer validate each other's public keys using standard X.509
1138	   validation techniques.  MIKEY-RSA-R also enables sending certificates
1139	   in the MIKEY message.

1141	   MIKEY-RSA-R requires one message from offerer to answerer, and one
1142	   message from answerer to offerer (full round trip), and does not add
1143	   additional media path messages.  MIKEY-RSA-R requires the offerer
1144	   validate the answerer's certificate.

1146	A.1.5.  MIKEY-DHSIGN

1148	   In MIKEY-DHSIGN [RFC3830] the offerer and answerer derive the key
1149	   from a Diffie-Hellman exchange.  To prevent an active
1150	   man-in-the-middle attack, the DH exchange itself is signed using each
1151	   endpoint's private key, and the associated public keys are validated
1152	   using standard X.509 validation techniques.

1154	   MIKEY-DHSIGN requires one message from offerer to answerer, and one
1155	   message from answerer to offerer (full round trip), and does not add
1156	   additional media path messages.  MIKEY-DHSIGN requires the offerer
1157	   and answerer to validate each other's certificates.  MIKEY-DHSIGN
1158	   also enables sending the answerer's certificate in the MIKEY message.

1160	A.1.6.  MIKEY-DHHMAC

1162	   MIKEY-DHHMAC [RFC4650] uses a pre-shared secret to HMAC the Diffie-
1163	   Hellman exchange, essentially combining aspects of MIKEY-PSK with
1164	   MIKEY-DHSIGN, but without MIKEY-DHSIGN's need for certificate
1165	   authentication.

1167	   MIKEY-DHHMAC requires one message from offerer to answerer, and one
1168	   message from answerer to offerer (full round trip), and does not add
1169	   additional media path messages.

1171	A.1.7.  MIKEY-ECIES and MIKEY-ECMQV (MIKEY-ECC)

1173	   ECC Algorithms For MIKEY [I-D.ietf-msec-mikey-ecc] describes how ECC
1174	   can be used with MIKEY-RSA (using ECDSA signature) and with MIKEY-
1175	   DHSIGN (using a new DH-Group code), and also defines two new ECC-
1176	   based algorithms, Elliptic Curve Integrated Encryption Scheme (ECIES)
1177	   and Elliptic Curve Menezes-Qu-Vanstone (ECMQV).

1179	   With this proposal, the ECDSA signature, MIKEY-ECIES, and MIKEY-ECMQV
1180	   function exactly like MIKEY-RSA, and the new DH-Group code functions
1181	   exactly like MIKEY-DHSIGN.  Therefore these ECC mechanisms are not
1182	   discussed separately in this document.

1184	A.1.8.  Security Descriptions with SIPS

1186	   Security Descriptions [RFC4568] has each side indicate the key it
1187	   will use for transmitting SRTP media, and the keys are sent in the
1188	   clear in SDP.  Security Descriptions relies on hop-by-hop (TLS via
1189	   "SIPS:") encryption to protect the keys exchanged in signaling.

1191	   Security Descriptions requires one message from offerer to answerer,
1192	   and one message from answerer to offerer (full round trip), and does
1193	   not add additional media path messages.

1195	A.1.9.  Security Descriptions with S/MIME

1197	   This keying mechanism is identical to Appendix A.1.8, except that
1198	   rather than protecting the signaling with TLS, the entire SDP is
1199	   encrypted with S/MIME.

1201	A.1.10.  SDP-DH (expired)

1203	   SDP Diffie-Hellman [I-D.baugher-mmusic-sdp-dh] exchanges Diffie-
1204	   Hellman messages in the signaling path to establish session keys.  To
1205	   protect against active man-in-the-middle attacks, the Diffie-Hellman
1206	   exchange needs to be protected with S/MIME, SIPS, or SIP-Identity
1207	   [RFC4474].

1209	   SDP-DH requires one message from offerer to answerer, and one message
1210	   from answerer to offerer (full round trip), and does not add
1211	   additional media path messages.

1213	A.1.11.  MIKEYv2 in SDP (expired)

1215	   MIKEYv2 [I-D.dondeti-msec-rtpsec-mikeyv2] adds mode negotiation to
1216	   MIKEYv1 and removes the time synchronization requirement.  It
1217	   therefore now takes 2 round-trips to complete.  In the first round
1218	   trip, the communicating parties learn each other's identities, agree
1219	   on a MIKEY mode, crypto algorithm, SRTP policy, and exchange nonces
1220	   for replay protection.  In the second round trip, they negotiate
1221	   unicast and/or group SRTP context for SRTP and/or SRTCP.

1223	   Furthermore, MIKEYv2 also defines an in-band negotiation mode as an
1224	   alternative to SDP (see Appendix A.3.3).

1226	A.1.12.  Evaluation Criteria - SIP

1228	   This section considers how each keying mechanism interacts with SIP
1229	   features.

1231	A.1.12.1.  Secure Retargeting and Secure Forking

1233	   Retargeting and forking of signaling requests is described within
1234	   Section 4.2.  The following builds upon this description.

1236	   The following list compares the behavior of secure forking, answering
1237	   association, two-time pads, and secure retargeting for each keying
1238	   mechanism.

1240	      MIKEY-NULL  Secure Forking: No, all AORs see offerer's and
1241	         answerer's keys.  Answer is associated with media by the SSRC
1242	         in MIKEY.  Additionally, a two-time pad occurs if two branches
1243	         choose the same 32-bit SSRC and transmit SRTP packets.

1245	         Secure Retargeting: No, all targets see offerer's and
1246	         answerer's keys.  Suffers from retargeting identity problem.

1248	      MIKEY-PSK
1249	         Secure Forking: No, all AORs see offerer's and answerer's keys.
1250	         Answer is associated with media by the SSRC in MIKEY.  Note
1251	         that all AORs must share the same pre-shared key in order for
1252	         forking to work at all with MIKEY-PSK.  Additionally, a two-
1253	         time pad occurs if two branches choose the same 32-bit SSRC and
1254	         transmit SRTP packets.

1256	         Secure Retargeting: Not secure.  For retargeting to work, the
1257	         final target must possess the correct PSK.  While this is
1258	         likely in scenarios where the call is targeted to another
1259	         device belonging to the same user (forking), it is very
1260	         unlikely that other users will possess that PSK and be able to
1261	         successfully answer that call.

1263	      MIKEY-RSA
1264	         Secure Forking: No, all AORs see offerer's and answerer's keys.
1265	         Answer is associated with media by the SSRC in MIKEY.  Note
1266	         that all AORs must share the same private key in order for
1267	         forking to work at all with MIKEY-RSA.  Additionally, a two-
1268	         time pad occurs if two branches choose the same 32-bit SSRC and
1269	         transmit SRTP packets.

1271	         Secure Retargeting: No.

1273	      MIKEY-RSA-R
1274	         Secure Forking: Yes. Answer is associated with media by the
1275	         SSRC in MIKEY.

1277	         Secure Retargeting: Yes.

1279	      MIKEY-DHSIGN
1280	         Secure Forking: Yes, each forked endpoint negotiates unique
1281	         keys with the offerer for both directions.  Answer is
1282	         associated with media by the SSRC in MIKEY.

1284	         Secure Retargeting: Yes, each target negotiates unique keys
1285	         with the offerer for both directions.

1287	      MIKEYv2 in SDP
1288	         The behavior will depend on which mode is picked.

1290	      MIKEY-DHHMAC
1291	         Secure Forking: Yes, each forked endpoint negotiates unique
1292	         keys with the offerer for both directions.  Answer is
1293	         associated with media by the SSRC in MIKEY.

1295	         Secure Retargeting: Yes, each target negotiates unique keys
1296	         with the offerer for both directions.  Note that for the keys
1297	         to be meaningful, it would require the PSK to be the same for
1298	         all the potential intermediaries, which would only happen
1299	         within a single domain.

1301	      Security Descriptions with SIPS
1302	         Secure Forking: No.  Each forked endpoint sees the offerer's
1303	         key.  Answer is not associated with media.

1305	         Secure Retargeting: No.  Each target sees the offerer's key.

1307	      Security Descriptions with S/MIME
1308	         Secure Forking: No.  Each forked endpoint sees the offerer's
1309	         key.  Answer is not associated with media.

1311	         Secure Retargeting: No.  Each target sees the offerer's key.
1312	         Suffers from retargeting identity problem.

1314	      SDP-DH
1315	         Secure Forking: Yes. Each forked endpoint calculates a unique
1316	         SRTP key.  Answer is not associated with media.

1318	         Secure Retargeting: Yes. The final target calculates a unique
1319	         SRTP key.

1321	      ZRTP
1322	         Secure Forking: Yes. Each forked endpoint calculates a unique
1323	         SRTP key.  As ZRTP isn't signaled in SDP, there is no
1324	         association of the answer with media.

1326	         Secure Retargeting: Yes. The final target calculates a unique
1327	         SRTP key.

1329	      EKT
1330	         Secure Forking: Inherited from the bootstrapping mechanism (the
1331	         specific MIKEY mode or Security Descriptions).  Answer is
1332	         associated with media by the SPI in the EKT protocol.

1335	         Secure Retargeting: Inherited from the bootstrapping mechanism
1336	         (the specific MIKEY mode or Security Descriptions).

1338	      DTLS-SRTP
1339	         Secure Forking: Yes. Each forked endpoint calculates a unique
1340	         SRTP key.  Answer is associated with media by the certificate
1341	         fingerprint in signaling and certificate in the media path.

1343	         Secure Retargeting: Yes. The final target calculates a unique
1344	         SRTP key.

1346	      MIKEYv2 Inband
1347	         The behavior will depend on which mode is picked.
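
	   The two-time pad noted for several mechanisms above can be
	   illustrated with a short sketch.  It is a toy model and not SRTP:
	   the keystream() helper is hypothetical and only stands in for a
	   keystream derived from one master key and one SSRC.  If two forked
	   branches encrypt different media under that same keystream, XORing
	   the two ciphertexts yields the XOR of the two plaintexts, with no
	   knowledge of the key.

```python
import hashlib


def keystream(master_key: bytes, ssrc: int, length: int) -> bytes:
    """Toy keystream (hypothetical helper, NOT the SRTP key derivation)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = master_key + ssrc.to_bytes(4, "big") + counter.to_bytes(4, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:length]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(master_key: bytes, ssrc: int, payload: bytes) -> bytes:
    """Stream-cipher style encryption: payload XOR keystream."""
    return xor_bytes(payload, keystream(master_key, ssrc, len(payload)))


# Two forked branches share the key and happen to pick the same SSRC.
key = b"shared-master-key"
ssrc = 0x1234ABCD
media_a = b"branch A audio.."
media_b = b"branch B audio!!"
cipher_a = encrypt(key, ssrc, media_a)
cipher_b = encrypt(key, ssrc, media_b)

# The eavesdropper needs only the two ciphertexts.
leak = xor_bytes(cipher_a, cipher_b)
```

	   The value of leak equals media_a XOR media_b, which is why a
	   shared key combined with an SSRC collision is treated as a failure
	   of secure forking.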

1349	A.1.12.2.  Clipping Media Before SDP Answer

1351	   Clipping media before receiving the signaling answer is described
1352	   within Section 4.1.  The following builds upon this description.

1354	   Furthermore, the problem of clipping gets compounded when forking is
1355	   used.  For example, if using a Diffie-Hellman keying technique with
1356	   security preconditions that forks to 20 endpoints, the call initiator
1357	   would get 20 provisional responses containing 20 signed Diffie-
1358	   Hellman half keys.  Calculating 20 DH secrets and validating
1359	   signatures can be a difficult task depending on the device
1360	   capabilities.

1362	   The following list compares the behavior of clipping before SDP
1363	   answer for each keying mechanism.

1365	      MIKEY-NULL
1366	         Not clipped.  The offerer provides the answerer's keys.

1368	      MIKEY-PSK
1369	         Not clipped.  The offerer provides the answerer's keys.

1371	      MIKEY-RSA
1372	         Not clipped.  The offerer provides the answerer's keys.

1374	      MIKEY-RSA-R
1375	         Clipped.  The answer contains the answerer's encryption key.

1377	      MIKEY-DHSIGN
1378	         Clipped.  The answer contains the answerer's Diffie-Hellman
1379	         response.

1381	      MIKEY-DHHMAC
1382	         Clipped.  The answer contains the answerer's Diffie-Hellman
1383	         response.

1385	      MIKEYv2 in SDP
1386	         The behavior will depend on which mode is picked.

1388	      Security Descriptions with SIPS
1389	         Clipped.  The answer contains the answerer's encryption key.

1391	      Security Descriptions with S/MIME
1392	         Clipped.  The answer contains the answerer's encryption key.

1394	      SDP-DH
1395	         Clipped.  The answer contains the answerer's Diffie-Hellman
1396	         response.

1398	      ZRTP
1399	         Not clipped because the session initially uses RTP.  While RTP
1400	         is flowing, both ends negotiate SRTP keys in the media path and
1401	         then switch to using SRTP.

1403	      EKT
1404	         Not clipped, as long as the first RTCP packet (containing the
1405	         answerer's key) is not lost in transit.  The answerer sends its
1406	         encryption key in RTCP, which arrives at the same time as (or
1407	         before) the first SRTP packet encrypted with that key.

1409	            Note: RTCP needs to work, in the answerer-to-offerer
1410	            direction, before the offerer can decrypt SRTP media.

1412	      DTLS-SRTP
1413	         No clipping after the DTLS-SRTP handshake has completed.  SRTP
1414	         keys are exchanged in the media path.  The offerer needs to
1415	         wait for the SDP answer to ensure the DTLS-SRTP handshake was
1416	         done with an authorized party.

1418	            If a middlebox interferes with the media path, there can be
1419	            clipping [I-D.ietf-mmusic-media-path-middleboxes].

1421	      MIKEYv2 Inband
1422	         Not clipped.  Keys are exchanged in the media path without
1423	         relying on the signaling path.

1425	A.1.12.3.  Centralized Keying

1427	   Centralized keying is described within Section 4.3.  The following
1428	   builds upon this description.

1430	   The following list describes how each keying mechanism behaves with
1431	   centralized keying (scenario d) and rekeying.

1433	      MIKEY-NULL
1434	         Keying: Yes, if offerer is the mixer.  No, if offerer is the
1435	         participant (end user).

1437	         Rekeying: Yes, via re-INVITE.

1439	      MIKEY-PSK
1440	         Keying: Yes, if offerer is the mixer.  No, if offerer is the
1441	         participant (end user).

1443	         Rekeying: Yes, with a re-INVITE.

1445	      MIKEY-RSA
1446	         Keying: Yes, if offerer is the mixer.  No, if offerer is the
1447	         participant (end user).

1449	         Rekeying: Yes, with a re-INVITE.

1451	      MIKEY-RSA-R
1452	         Keying: No, if offerer is the mixer.  Yes, if offerer is the
1453	         participant (end user).

1455	         Rekeying: n/a

1457	      MIKEY-DHSIGN
1458	         Keying: No; a group-key Diffie-Hellman protocol is not
1459	         supported.

1461	         Rekeying: n/a

1463	      MIKEY-DHHMAC
1464	         Keying: No; a group-key Diffie-Hellman protocol is not
1465	         supported.

1467	         Rekeying: n/a

1469	      MIKEYv2 in SDP
1470	         The behavior will depend on which mode is picked.

1472	      Security Descriptions with SIPS
1473	         Keying: Yes, if offerer is the mixer.  Yes, if offerer is the
1474	         participant.

1476	         Rekeying: Yes, with a re-INVITE.

1478	      Security Descriptions with S/MIME
1479	         Keying: Yes, if offerer is the mixer.  Yes, if offerer is the
1480	         participant.

1482	         Rekeying: Yes, with a re-INVITE.

1484	      SDP-DH
1485	         Keying: No; a group-key Diffie-Hellman protocol is not
1486	         supported.

1488	         Rekeying: n/a

1490	      ZRTP
1491	         Keying: No; a group-key Diffie-Hellman protocol is not
1492	         supported.

1494	         Rekeying: n/a

1496	      EKT
1497	         Keying: Yes. After bootstrapping a KEK using Security
1498	         Descriptions or MIKEY, each member originating an SRTP stream
1499	         can send its SRTP master key, sequence number and ROC via RTCP.

1501	         Rekeying: Yes. EKT allows each sender to transmit its SRTP
1502	         master key to the group via RTCP packets.  Thus, each
1503	         originator of an SRTP stream can rekey at any time.

1505	      DTLS-SRTP
1506	         Keying: Yes, because with the assumed cipher suite,
1507	         TLS_RSA_WITH_3DES_EDE_CBC_SHA, each end indicates its SRTP key.

1509	         Rekeying: via DTLS in the media path.

1511	      MIKEYv2 Inband
1512	         The behavior will depend on which mode is picked.

1514	A.1.12.4.  SSRC and ROC

1516	   In SRTP, a cryptographic context is defined as the SSRC, destination
1517	   network address, and destination transport port number, whereas in
1518	   RTP, a flow is defined as the destination network address and
1519	   destination transport port number.  This results in a problem -- how
1520	   to communicate the SSRC so that the SSRC can be used for the
1521	   cryptographic context.

1523	   Three approaches have emerged for this communication.  One, used by
1524	   all MIKEY modes, is to communicate the SSRCs to the peer in the
1525	   MIKEY exchange.  Another, used by Security Descriptions, is to use
1526	   "late binding" -- that is, any new packet containing a previously-
1527	   unseen SSRC (which arrives at the same destination network address
1528	   and destination transport port number) will create a new
1529	   cryptographic context.  A third approach, common amongst techniques
1530	   with media-path SRTP key establishment, is to require a handshake
1531	   over that media path before SRTP packets are sent.  MIKEY's approach
1532	   changes RTP's SSRC collision detection behavior by requiring RTP to
1533	   pre-establish the SSRC values for each session.
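
	   As a minimal sketch of late binding, assume a receiver that keeps
	   one cryptographic context per (SSRC, destination address,
	   destination port) triple and creates a context the first time a
	   triple is seen.  The class and method names are hypothetical and
	   come from no SRTP implementation; key material is omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# (SSRC, destination network address, destination transport port)
ContextKey = Tuple[int, str, int]


@dataclass
class CryptoContext:
    """Per-source state; real contexts also hold keys and replay state."""
    ssrc: int
    roc: int = 0  # rollover counter starts at 0


@dataclass
class LateBindingReceiver:
    """Hypothetical receiver that binds contexts on first sight of an SSRC."""
    contexts: Dict[ContextKey, CryptoContext] = field(default_factory=dict)

    def context_for(self, ssrc: int, dst_addr: str, dst_port: int) -> CryptoContext:
        key = (ssrc, dst_addr, dst_port)
        if key not in self.contexts:
            # Previously unseen SSRC on this flow: create a new context.
            self.contexts[key] = CryptoContext(ssrc=ssrc)
        return self.contexts[key]


receiver = LateBindingReceiver()
first = receiver.context_for(0x11111111, "192.0.2.1", 5004)
second = receiver.context_for(0x22222222, "192.0.2.1", 5004)
again = receiver.context_for(0x11111111, "192.0.2.1", 5004)
```

	   Here first and again refer to the same context, while second is a
	   separate context on the same RTP flow.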

1535	   Another related issue is that SRTP introduces a rollover counter
1536	   (ROC), which records how many times the SRTP sequence number has
1537	   rolled over.  As the sequence number is used for SRTP's default
1538	   ciphers, it is important that all endpoints know the value of the
1539	   ROC.  The ROC starts at 0 at the beginning of a session.
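
	   SRTP [RFC3711] defines the packet index as i = 2^16 * ROC + SEQ,
	   where SEQ is the 16-bit RTP sequence number.  The sketch below
	   shows why a receiver with the wrong ROC computes a different index
	   than the sender; the function name is illustrative only.

```python
def packet_index(roc: int, seq: int) -> int:
    """SRTP packet index: i = 2^16 * ROC + SEQ (RFC 3711)."""
    if not 0 <= seq < 2 ** 16:
        raise ValueError("SEQ must fit in 16 bits")
    if roc < 0:
        raise ValueError("ROC must be non-negative")
    return (roc << 16) | seq


# The sender has wrapped its sequence number once; a late joiner assumes ROC 0.
sender_index = packet_index(roc=1, seq=5)
late_joiner_index = packet_index(roc=0, seq=5)
```

	   The two indices differ although the sequence number on the wire is
	   identical, so an endpoint that joins after a rollover has to learn
	   the ROC by some means.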

1541	   Some keying mechanisms cause a two-time pad to occur if two endpoints
1542	   of a forked call have an SSRC collision.
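
	   To give a rough sense of how often such a collision could occur,
	   the following sketch computes the birthday-style probability that
	   at least two of n branches pick the same SSRC.  It assumes, for
	   illustration only, that each branch chooses its 32-bit SSRC
	   independently and uniformly at random.

```python
def ssrc_collision_probability(branches: int, ssrc_bits: int = 32) -> float:
    """Probability that at least two branches choose the same SSRC."""
    space = 2 ** ssrc_bits
    no_collision = 1.0
    for i in range(branches):
        # The (i+1)-th branch must avoid the i values already taken.
        no_collision *= (space - i) / space
    return 1.0 - no_collision


# A call forked to 20 endpoints.
p_20 = ssrc_collision_probability(20)
```

	   Under that assumption, the probability for a single call is small,
	   but it applies to every forked call in which the branches share
	   keying material.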

1544	   Note: A proposal has been made to send the ROC value on every Nth
1545	   SRTP packet [RFC4771].  This proposal has not yet been incorporated
1546	   into this document.

1548	   The following list examines handling of SSRC and ROC:

1550	      MIKEY-NULL
1551	         Each endpoint indicates a set of SSRCs and the ROC for SRTP
1552	         packets it transmits.

1554	      MIKEY-PSK
1555	         Each endpoint indicates a set of SSRCs and the ROC for SRTP
1556	         packets it transmits.

1558	      MIKEY-RSA
1559	         Each endpoint indicates a set of SSRCs and the ROC for SRTP
1560	         packets it transmits.

1562	      MIKEY-RSA-R
1563	         Each endpoint indicates a set of SSRCs and the ROC for SRTP
1564	         packets it transmits.

1566	      MIKEY-DHSIGN
1567	         Each endpoint indicates a set of SSRCs and the ROC for SRTP
1568	         packets it transmits.

1570	      MIKEY-DHHMAC
1571	         Each endpoint indicates a set of SSRCs and the ROC for SRTP
1572	         packets it transmits.

1574	      MIKEYv2 in SDP
1575	         Each endpoint indicates a set of SSRCs and the ROC for SRTP
1576	         packets it transmits.

1578	      Security Descriptions with SIPS
1579	         Neither SSRC nor ROC are signaled.  SSRC 'late binding' is
1580	         used.

1582	      Security Descriptions with S/MIME
1583	         Neither SSRC nor ROC are signaled.  SSRC 'late binding' is
1584	         used.

1586	      SDP-DH
1587	         Neither SSRC nor ROC are signaled.  SSRC 'late binding' is
1588	         used.

1590	      ZRTP
1591	         Neither SSRC nor ROC are signaled.  SSRC 'late binding' is
1592	         used.

1594	      EKT
1595	         The SSRC of the SRTCP packet containing an EKT update
1596	         corresponds to the SRTP master key and other parameters within
1597	         that packet.

1599	      DTLS-SRTP
1600	         Neither SSRC nor ROC are signaled.  SSRC 'late binding' is
1601	         used.

1603	      MIKEYv2 Inband
1604	         Each endpoint indicates a set of SSRCs and the ROC for SRTP
1605	         packets it transmits.

1607	A.1.13.  Evaluation Criteria - Security

1609	   This section evaluates each keying mechanism on the basis of its
1610	   security properties.

1612	A.1.13.1.  Distribution and Validation of Public Keys and Certificates

1614	   Using public key cryptography for confidentiality and authentication
1615	   can introduce requirements for two types of systems: (1) a system to
1616	   distribute public keys (often in the form of certificates), and (2) a
1617	   system for validating certificates.  We refer to the former as a key
1618	   distribution system and the latter as an authentication
1619	   infrastructure.  In many cases, a monolithic public key
1620	   infrastructure (PKI) is used to fulfill both of these roles.
1621	   However, these functions can be provided by many other systems.  For
1622	   instance, key distribution may be accomplished by any public
1623	   repository of keys.  Any system in which the two endpoints have
1624	   access to trust anchors and intermediate CA certificates that can be
1625	   used to validate other endpoints' certificates (including a system of
1626	   self-signed certificates) can be used to support certificate
1627	   validation in the below schemes.

1629	   With real-time communications it is desirable to avoid fetching keys
1630	   or certificates that delay call setup; rather it is preferable to
1631	   fetch or validate certificates in such a way that call setup isn't
1632	   delayed.  For example, a certificate can be validated while the phone
1633	   is ringing or can be validated while ring-back tones are being played
1634	   or even while the called party is answering the phone and saying
1635	   "hello".

1637	   SRTP key exchange mechanisms that require a particular authentication
1638	   infrastructure to operate (whether for distribution or validation)
1639	   are gated on the deployment of such an infrastructure available to
1640	   both endpoints.  This means that no media security is achievable
1641	   until such an infrastructure exists.  For SIP, something like sip-
1642	   certs [I-D.ietf-sip-certs] might be used to obtain the certificate of
1643	   a peer.

1645	      Note: Even if sip-certs [I-D.ietf-sip-certs] were deployed, the
1646	      retargeting problem (Appendix A.1.12.1) would still prevent
1647	      successful deployment of keying techniques which require the
1648	      offerer to obtain the actual target's public key.

1650	   The following list compares the requirements introduced by the use of
1651	   public-key cryptography in each keying mechanism, both for public key
1652	   distribution and for certificate validation.

1654	      MIKEY-NULL
1655	         Public-key cryptography is not used.

1657	      MIKEY-PSK
1658	         Public-key cryptography is not used.  Rather, all endpoints
1659	         must have some way to exchange per-endpoint or per-system pre-
1660	         shared keys.

1662	      MIKEY-RSA
1663	         The offerer obtains the intended answerer's public key before
1664	         initiating the call.  This public key is used to encrypt the
1665	         SRTP keys.  There is no defined mechanism for the offerer to
1666	         obtain the answerer's public key, although [I-D.ietf-sip-certs]
1667	         might be viable in the future.

1669	         The offer may also contain a certificate for the offerer, which
1670	         would require an authentication infrastructure in order to be
1671	         validated by the receiver.

1673	      MIKEY-RSA-R
1674	         The offer contains the offerer's certificate, and the answer
1675	         contains the answerer's certificate.  The answerer uses the
1676	         public key in the certificate to encrypt the SRTP keys that
1677	         will be used by the offerer and the answerer.  An
1678	         authentication infrastructure is necessary to validate the
1679	         certificates.

1681	      MIKEY-DHSIGN
1682	         An authentication infrastructure is used to authenticate the
1683	         public key that is included in the MIKEY message.

1685	      MIKEY-DHHMAC
1686	         Public-key cryptography is not used.  Rather, all endpoints
1687	         must have some way to exchange per-endpoint or per-system pre-
1688	         shared keys.

1690	      MIKEYv2 in SDP
1691	         The behavior will depend on which mode is picked.

1693	      Security Descriptions with SIPS
1694	         Public-key cryptography is not used.

1696	      Security Descriptions with S/MIME
1697	         Use of S/MIME requires that the endpoints be able to fetch and
1698	         validate certificates for each other.  The offerer must obtain
1699	         the intended target's certificate and encrypt the SDP offer
1700	         with the public key contained in the target's certificate.  The
1701	         answerer must obtain the offerer's certificate and encrypt the
1702	         SDP answer with the public key contained in the offerer's
1703	         certificate.

1705	      SDP-DH
1706	         Public-key cryptography is not used.

1708	      ZRTP
1709	         Public-key cryptography is not used.

1711	      EKT
1712	         Public-key cryptography is not used by itself, but might be
1713	         used by the EKT bootstrapping keying mechanism (such as certain
1714	         MIKEY modes).

1716	      DTLS-SRTP
1717	         Remote party's certificate is sent in media path, and a
1718	         fingerprint of the same certificate is sent in the signaling
1719	         path.

1721	      MIKEYv2 Inband
1722	         The behavior will depend on which mode is picked.
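
	   The DTLS-SRTP check described in the list above, comparing the
	   certificate seen in the media path against the fingerprint carried
	   in signaling, can be sketched as follows.  The colon-separated
	   upper-case hex form follows the SDP fingerprint attribute of
	   [RFC4572]; the function names and the placeholder certificate
	   bytes are assumptions made for illustration.

```python
import hashlib
import hmac


def sdp_fingerprint(cert_der: bytes) -> str:
    """Format a SHA-256 certificate fingerprint as 'sha-256 AB:CD:...'."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    pairs = ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))
    return "sha-256 " + pairs


def matches_signaled_fingerprint(cert_der: bytes, signaled: str) -> bool:
    """Compare the media-path certificate with the signaled fingerprint."""
    algo, _, value = signaled.strip().partition(" ")
    normalized = algo.lower() + " " + value.upper()
    return hmac.compare_digest(sdp_fingerprint(cert_der), normalized)


# Placeholder bytes standing in for a DER-encoded certificate.
cert_from_dtls = b"placeholder certificate bytes"
signaled_value = sdp_fingerprint(cert_from_dtls)  # as carried in SDP
accepted = matches_signaled_fingerprint(cert_from_dtls, signaled_value)
rejected = matches_signaled_fingerprint(b"some other certificate", signaled_value)
```

	   No certificate validation infrastructure is involved in this
	   comparison; the binding relies on the integrity of the signaling
	   path that carried the fingerprint.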

1724	A.1.13.2.  Perfect Forward Secrecy

1726	   In the context of SRTP, Perfect Forward Secrecy is the property that
1727	   SRTP session keys that protected a previous session are not
1728	   compromised if the static keys belonging to the endpoints are
1729	   compromised.  That is, if someone were to record your encrypted
1730	   session content and later acquired either party's private key, that
1731	   encrypted session content would be safe from decryption if your key
1732	   exchange mechanism had perfect forward secrecy.
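
	   The property can be illustrated with an ephemeral Diffie-Hellman
	   sketch.  The group parameters P and G below are toy values chosen
	   for illustration and are far too small for real use, and the
	   function names are hypothetical.  The point is that the session
	   key depends only on per-session private values, which each side
	   discards after the call; authentication of the exchange is
	   omitted.

```python
import hashlib
import secrets

# Toy group parameters (assumption for illustration; not secure).
P = 2 ** 127 - 1  # a Mersenne prime
G = 3


def ephemeral_keypair():
    """Generate a fresh per-session private/public pair."""
    private = secrets.randbelow(P - 3) + 2  # value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def session_key(own_private: int, peer_public: int) -> bytes:
    """Derive a session key from the Diffie-Hellman shared secret."""
    shared = pow(peer_public, own_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()


offerer_private, offerer_public = ephemeral_keypair()
answerer_private, answerer_public = ephemeral_keypair()

offerer_key = session_key(offerer_private, answerer_public)
answerer_key = session_key(answerer_private, offerer_public)

# Discarding the ephemeral privates leaves nothing long-lived that could
# later be used to recompute this session's key.
del offerer_private, answerer_private
```

	   A long-term key used only to sign such an exchange authenticates
	   it without entering the key derivation, which is why its later
	   compromise does not expose recorded sessions.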

1734	   The following list describes how each key exchange mechanism provides
1735	   PFS.

1737	      MIKEY-NULL
1738	         No PFS.

1740	      MIKEY-PSK
1741	         No PFS.

1743	      MIKEY-RSA
1744	         No PFS.

1746	      MIKEY-RSA-R
1747	         No PFS.

1749	      MIKEY-DHSIGN
1750	         PFS is provided with the Diffie-Hellman exchange.

1752	      MIKEY-DHHMAC
1753	         PFS is provided with the Diffie-Hellman exchange.

1755	      MIKEYv2 in SDP
1756	         The behavior will depend on which mode is picked.

1758	      Security Descriptions with SIPS
1759	         No PFS.

1761	      Security Descriptions with S/MIME
1762	         No PFS.

1764	      SDP-DH
1765	         PFS is provided with the Diffie-Hellman exchange.

1767	      ZRTP
1768	         PFS is provided with the Diffie-Hellman exchange.

1770	      EKT
1771	         No PFS.

1773	      DTLS-SRTP
1774	         PFS is achieved if the negotiated cipher suite includes an
1775	         exponential or discrete-logarithmic key exchange (e.g., Diffie-
1776	         Hellman (DH_RSA from [RFC4346]) or Elliptic Curve Diffie-
1777	         Hellman [RFC4492]).

1779	      MIKEYv2 Inband
1780	         The behavior will depend on which mode is picked.

1782	A.1.13.3.  Best Effort Encryption

1784	   With best effort encryption, SRTP is used with endpoints that support
1785	   SRTP, otherwise RTP is used.

1787	   SIP needs a backwards-compatible best effort encryption in order for
1788	   SRTP to work successfully with SIP retargeting and forking when there
1789	   is a mix of forked or retargeted devices, some of which support SRTP
1790	   and some of which do not.

1792	      Consider the case of Bob, with a phone that only does RTP and a
1793	      voice mail system that supports SRTP and RTP.  If Alice calls Bob
1794	      with an SRTP offer, Bob's RTP-only phone will reject the media
1795	      stream (with an empty "m=" line) because Bob's phone doesn't
1796	      understand SRTP (RTP/SAVP).  Alice's phone will see this rejected
1797	      media stream and may terminate the entire call (BYE) and re-
1798	      initiate the call as RTP-only, or Alice's phone may decide to
1799	      continue with call setup with the SRTP-capable leg (the voice mail
1800	      system).  If Alice's phone decided to re-initiate the call as RTP-
1801	      only, and Bob doesn't answer his phone, Alice will then leave
1802	      voice mail using only RTP, rather than SRTP as expected.

1804	   Currently, several techniques are commonly considered as candidates
1805	   to provide opportunistic encryption:

1807	   multipart/alternative
1808	      [I-D.jennings-sipping-multipart] describes how to form a
1809	      multipart/alternative body part in SIP.  The significant issues
1810	      with this technique are (1) that multipart MIME is incompatible
1811	      with existing SIP proxies, firewalls, Session Border Controllers,
1812	      and endpoints and (2) when forking, the Heterogeneous Error
1813	      Response Forking Problem (HERFP) [I-D.mahy-sipping-herfp-fix]
1814	      causes problems if such non-multipart-capable endpoints were
1815	      involved in the forking.

1817	   SDP Grouping
1818	      A new SDP grouping mechanism (following the idea introduced in
1819	      [RFC3388]) has been discussed which would allow a media line to
1820	      indicate RTP/AVP and another media line to indicate RTP/SAVP,
1821	      allowing non-SRTP-aware endpoints to choose RTP/AVP and SRTP-aware
1822	      endpoints to choose RTP/SAVP.  As of this writing, this SDP
1823	      grouping mechanism has not been published as an Internet Draft.

1825	   session attribute
1826	      With this technique, the endpoints signal their desire to do SRTP
1827	      by signaling RTP (RTP/AVP), and using an attribute ("a=") in the
1828	      SDP.  This technique is entirely backwards compatible with non-
1829	      SRTP-aware endpoints, but doesn't use the RTP/SAVP protocol
1830	      registered by SRTP [RFC3711].

1832	   SDP Capability Negotiation
1833	      SDP Capability Negotiation
1834	      [I-D.ietf-mmusic-sdp-capability-negotiation] provides a backwards-
1835	      compatible mechanism to allow offering both SRTP and RTP in a
1836	      single offer.  This is the preferred technique.

1838	   Probing
1839	      With this technique, the endpoints first establish an RTP session
1840	      using RTP (RTP/AVP).  The endpoints send probe messages, over the
1841	      media path, to determine if the remote endpoint supports their
1842	      keying technique.

1844	   The preferred technique, SDP Capability Negotiation
1845	   [I-D.ietf-mmusic-sdp-capability-negotiation], can be used with all
1846	   key exchange mechanisms.  What remains unique is ZRTP, which can also
1847	   accomplish its best effort encryption by probing (sending ZRTP
1848	   messages over the media path) or by session attribute (see "a=zrtp",
1849	   defined in Section 10 of [I-D.zimmermann-avt-zrtp]).  Current
1850	   implementations of ZRTP use probing.

1852	A.1.13.4.  Upgrading Algorithms

1854	   It is necessary to allow upgrading SRTP encryption and hash
1855	   algorithms, as well as upgrading the cryptographic functions used for
1856	   the key exchange mechanism.  With SIP's offer/answer model, this can
1857	   be computationally expensive because the offer needs to contain all
1858	   combinations of the key exchange mechanisms (all MIKEY modes,
1859	   Security Descriptions) and all SRTP cryptographic suites (AES-128,
1860	   AES-256) and all SRTP cryptographic hash functions (SHA-1, SHA-256)
1861	   that the offerer supports.  In order to do this, the offerer has to
1862	   expend CPU resources to build an offer containing all of this
1863	   information, which becomes computationally prohibitive.

1865	   Thus, it is important to keep the offerer's CPU impact fixed so that
1866	   offering multiple new SRTP encryption and hash functions incurs no
1867	   additional expense.
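
	   The growth in offer size can be made concrete by enumerating the
	   combinations.  The mechanism, cipher, and hash lists below are a
	   small illustrative subset taken from the examples in this section,
	   not a complete inventory, and the "expensive" set merely marks the
	   mechanisms that this document describes as needing a public-key or
	   Diffie-Hellman operation per offered suite.

```python
from itertools import product

key_exchanges = ["MIKEY-RSA", "MIKEY-DHSIGN", "MIKEY-DHHMAC", "Security Descriptions"]
ciphers = ["AES-128", "AES-256"]
hashes = ["SHA-1", "SHA-256"]

# Every combination the offerer would have to place in one offer.
offers = list(product(key_exchanges, ciphers, hashes))

# Mechanisms with a per-suite public-key or Diffie-Hellman operation.
expensive = {"MIKEY-RSA", "MIKEY-DHSIGN", "MIKEY-DHHMAC"}
costly_offers = [offer for offer in offers if offer[0] in expensive]

total = len(offers)
costly = len(costly_offers)
```

	   Adding one more cipher or hash multiplies the total, so a
	   mechanism whose cost is per offered suite scales with the product
	   rather than with the number of mechanisms.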

1869	   The following list describes the CPU effort involved in using each
1870	   key exchange technique.

1872	      MIKEY-NULL
1873	         No significant computational expense.

1875	      MIKEY-PSK
1876	         No significant computational expense.

1878	      MIKEY-RSA
1879	         For each offered SRTP crypto suite, the offerer has to perform
1880	         an RSA operation to encrypt the TGK.

1882	      MIKEY-RSA-R
1883	         For each offered SRTP crypto suite, the offerer has to perform
1884	         a public key operation to sign the MIKEY message.

1886	      MIKEY-DHSIGN
1887	         For each offered SRTP crypto suite, the offerer has to perform
1888	         a Diffie-Hellman operation, and a public key operation to sign
1889	         the Diffie-Hellman output.

1891	      MIKEY-DHHMAC
1892	         For each offered SRTP crypto suite, the offerer has to perform
1893	         a Diffie-Hellman operation.

1895	      MIKEYv2 in SDP
1896	         The behavior will depend on which mode is picked.

1898	      Security Descriptions with SIPS
1899	         No significant computational expense.

1901	      Security Descriptions with S/MIME
1902	         S/MIME requires the offerer and the answerer to encrypt the SDP
1903	         with the other's public key, and to decrypt the received SDP
1904	         with their own private key.

1906	      SDP-DH
1907	         For each offered SRTP crypto suite, the offerer has to perform
1908	         a Diffie-Hellman operation.

1910	      ZRTP
1911	         The offerer has no additional computational expense at all, as
1912	         the offer contains no information about ZRTP or might contain
1913	         "a=zrtp".

1915	      EKT
1916	         The offerer's computational expense depends entirely on the EKT
1917	         bootstrapping mechanism selected (one or more MIKEY modes or
1918	         Security Descriptions).

1920	      DTLS-SRTP
1921	         The offerer has no additional computational expense at all, as
1922	         the offer contains only a fingerprint of the certificate that
1923	         will be presented in the DTLS exchange.

1925	      MIKEYv2 Inband
1926	         The behavior will depend on which mode is picked.

1928	A.2.  Media Path Keying Technique

1930	A.2.1.  ZRTP

1932	   ZRTP [I-D.zimmermann-avt-zrtp] does not exchange information in the
1933	   signaling path (although it's possible for endpoints to indicate
1934	   support for ZRTP with "a=zrtp" in the initial Offer).  In ZRTP the
1935	   keys are exchanged entirely in the media path using a Diffie-Hellman
1936	   exchange.  The advantage to this mechanism is that the signaling
1937	   channel is used only for call setup and the media channel is used to
1938	   establish an encrypted channel -- much like encryption devices on the
1939	   PSTN.  ZRTP uses voice authentication of its Diffie-Hellman exchange
1940	   by having each person read digits to the other person.  Subsequent
1941	   sessions with the same ZRTP endpoint can be authenticated using the
1942	   stored hash of the previously negotiated key rather than voice
1943	   authentication.
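
   The idea of a Diffie-Hellman exchange that is authenticated by
   reading digits aloud can be sketched as follows.  This is a minimal
   sketch under loud assumptions: the group (a toy 127-bit prime), the
   function names, and the way the digits are derived from the shared
   secret are all hypothetical, and none of it reflects the ZRTP message
   formats or the ZRTP derivation of the spoken digits.

```python
import hashlib
import secrets

# Toy parameters for illustration only: this prime is far too small for
# real use, and ZRTP specifies its own Diffie-Hellman groups.
P = 2**127 - 1
G = 3


def dh_keypair():
    """Return (private, public) for one endpoint."""
    private = secrets.randbelow(P - 3) + 2  # uniformly in [2, P - 2]
    return private, pow(G, private, P)


def dh_shared(private, peer_public):
    """Compute the shared secret from our private and the peer's public value."""
    return pow(peer_public, private, P).to_bytes(16, "big")


def short_auth_string(shared):
    """Derive four decimal digits for the two people to compare by voice."""
    digest = hashlib.sha256(shared).digest()
    return f"{int.from_bytes(digest[:4], 'big') % 10000:04d}"
```

   If an attacker substitutes its own public values in the media path,
   the two endpoints end up with different shared secrets, so the digits
   each person reads will almost certainly not match.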

1945	   ZRTP uses 4 media path messages (Hello, Commit, DHPart1, and DHPart2)
1946	   to establish the SRTP key, and 3 media path confirmation messages.
1947	   These initial messages are all sent as non-RTP packets.

1949	      Note that when ZRTP probing is used, unencrypted RTP is
1950	      exchanged until the SRTP keys are established.

1952	A.3.  Signaling and Media Path Keying Techniques

1954	A.3.1.  EKT

1956	   EKT [I-D.mcgrew-srtp-ekt] relies on another SRTP key exchange
1957	   protocol, such as Security Descriptions or MIKEY, for bootstrapping.
1958	   In the initial phase, each member of a conference uses an SRTP key
1959	   exchange protocol to establish a common key encryption key (KEK).
1960	   Each member may use the KEK to securely transport its SRTP master key
1961	   and current SRTP rollover counter (ROC), via RTCP, to the other
1962	   participants in the session.

1964	   EKT requires the offerer to send some parameters (EKT_Cipher, KEK,
1965	   and security parameter index (SPI)) via a bootstrapping protocol
1966	   such as Security Descriptions or MIKEY.  Each answerer sends an SRTCP
1967	   message which contains the answerer's SRTP Master Key, rollover
1968	   counter, and the SRTP sequence number.  Rekeying is done by sending a
1969	   new SRTCP message.  For reliable transport, multiple RTCP messages
1970	   need to be sent.
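
   The role of the KEK can be illustrated with a small sketch that
   protects an SRTP master key, rollover counter, and sequence number
   under a shared KEK.  Everything here is an assumption made for the
   example: the field layout, the 16-bit SPI, the 16-octet nonce, and in
   particular the cipher, which is a stand-in HMAC-SHA-256 keystream
   rather than the EKT cipher.  The sketch provides no integrity
   protection and is not the EKT wire format.

```python
import hashlib
import hmac
import secrets
import struct


def _keystream(kek, nonce, length):
    """Stand-in keystream (HMAC-SHA-256 in counter mode); NOT the EKT cipher."""
    out = b""
    counter = 0
    while len(out) < length:
        block = nonce + struct.pack(">I", counter)
        out += hmac.new(kek, block, hashlib.sha256).digest()
        counter += 1
    return out[:length]


def wrap(kek, spi, master_key, roc, seq):
    """Protect master key, 32-bit ROC, and 16-bit sequence number under the KEK."""
    nonce = secrets.token_bytes(16)
    plaintext = master_key + struct.pack(">IH", roc, seq)
    stream = _keystream(kek, nonce, len(plaintext))
    ciphertext = bytes(a ^ b for a, b in zip(plaintext, stream))
    return struct.pack(">H", spi) + nonce + ciphertext


def unwrap(kek, blob, key_len=16):
    """Recover (spi, master_key, roc, seq) from a blob produced by wrap()."""
    (spi,) = struct.unpack(">H", blob[:2])
    nonce, ciphertext = blob[2:18], blob[18:]
    stream = _keystream(kek, nonce, len(ciphertext))
    plaintext = bytes(a ^ b for a, b in zip(ciphertext, stream))
    roc, seq = struct.unpack(">IH", plaintext[key_len:key_len + 6])
    return spi, plaintext[:key_len], roc, seq
```

   Any participant holding the KEK can recover the sender's master key
   from such a message, which is why the KEK itself has to be
   established through the bootstrapping protocol.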

1972	A.3.2.  DTLS-SRTP

1974	   DTLS-SRTP [I-D.ietf-avt-dtls-srtp] exchanges public key fingerprints
1975	   in SDP [I-D.fischl-sipping-media-dtls] and then establishes a DTLS
1976	   session over the media channel.  The endpoints use the DTLS handshake
1977	   to agree on crypto suites and establish SRTP session keys.  SRTP
1978	   packets are then exchanged between the endpoints.
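
   The fingerprint check can be sketched as follows.  This is a minimal
   sketch: the function names are hypothetical, SHA-256 is assumed as
   the hash, the certificate is assumed to be available as DER-encoded
   bytes, and the "a=fingerprint" attribute syntax comes from the SDP
   specification for TLS-based media rather than from this document.

```python
import hashlib
import hmac


def sdp_fingerprint(cert_der):
    """Format the SHA-256 hash of a DER certificate as an SDP attribute line."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    pairs = ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))
    return f"a=fingerprint:sha-256 {pairs}"


def matches_offer(sdp_line, presented_cert_der):
    """Check the certificate seen in the DTLS handshake against the SDP line."""
    expected = sdp_fingerprint(presented_cert_der)
    return hmac.compare_digest(sdp_line, expected)
```

   An endpoint would compute the line once for its own certificate when
   building the SDP, and the peer would call matches_offer() on the
   certificate it receives during the DTLS handshake.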

1980	   DTLS-SRTP requires one message from offerer to answerer (half round
1981	   trip), and one message from the answerer to offerer (full round trip)
1982	   so the offerer can correlate the SDP answer with the answering
1983	   endpoint.  DTLS-SRTP uses 4 media path messages to establish the SRTP
1984	   key.

1986	   This document assumes DTLS will use TLS_RSA_WITH_3DES_EDE_CBC_SHA as
1987	   its cipher suite, which is the mandatory-to-implement cipher suite in
1988	   TLS [RFC4346].

1990	A.3.3.  MIKEYv2 Inband (expired)

1992	   As described in Appendix A.1.11, MIKEYv2 also defines an in-band
1993	   negotiation mode as an alternative to SDP.  The draft has not yet
1994	   settled what in-band actually means (e.g., UDP, RTP, or RTCP).

1997	Appendix B.  Out-of-Scope

1999	   Discussions concluded that key management for shared-key encryption
2000	   of conferencing is outside the scope of this document.  As the
2001	   priority is point-to-point unicast SRTP session keying, resolving
2002	   shared-key SRTP session keying is left as an item for future
2003	   investigation.

2005	   The compromise of an endpoint that has access to decrypted media
2006	   (e.g., SIP user agent, transcoder, recorder) is out of scope of this
2007	   document.  Such a compromise might be via privilege escalation,
2008	   installation of a virus or trojan horse, or similar attacks.

2010	Appendix C.  Requirement renumbering in -02

2012	   [[RFC Editor: Please delete this section prior to publication.]]
2013	   Previous versions of this document used requirement numbers, which
2014	   were changed to mnemonics as follows:

2016	   R1    R-FORK-RETARGET

2018	   R2    R-BEST-SECURE

2020	   R3    R-DISTINCT

2022	   R4    R-REUSE; changed from 'MAY' to 'protocol MUST support, and
2023	         SHOULD implement'

2025	   R5    R-AVOID-CLIPPING

2027	   R6    R-PASS-MEDIA

2029	   R7    R-PASS-SIG

2031	   R8    R-PFS

2033	   R9    R-COMPUTE

2035	   R10   R-RTP-VALID

2037	   R11   (folded into R4; was reuse previous session)

2039	   R12   R-CERTS

2041	   R13   R-FIPS

2043	   R14   R-ASSOC

2045	   R15   (deleted; was ability to upgrade from RTP to SRTP, but
2046	         requirement was unclear on what it meant)

2048	   R16   R-DOS

2050	   R17   R-SIG-MEDIA

2052	   R18   R-EXISTING

2054	   R19   R-AGILITY

2056	   R20   R-DOWNGRADE
2057	   R21   R-NEGOTIATE

2059	   R23   R-OTHER-SIGNALING

2061	   R23   R-RECORDING (R23 was duplicated in previous versions of the
2062	         document)

2064	   R24   (deleted; was lawful intercept)

2066	   R25   R-TRANSCODER

2068	   R26   R-PSTN

2070	   R27   R-ID-BINDING

2072	   R28   R-ACT-ACT

2074	Authors' Addresses

2076	   Dan Wing (editor)
2077	   Cisco Systems, Inc.
2078	   170 West Tasman Drive
2079	   San Jose, CA  95134
2080	   USA

2082	   Email: dwing@cisco.com

2084	   Steffen Fries
2085	   Siemens AG
2086	   Otto-Hahn-Ring 6
2087	   Munich, Bavaria  81739
2088	   Germany

2090	   Email: steffen.fries@siemens.com

2092	   Hannes Tschofenig
2093	   Nokia Siemens Networks
2094	   Otto-Hahn-Ring 6
2095	   Munich, Bavaria  81739
2096	   Germany

2098	   Email: Hannes.Tschofenig@nsn.com
2099	   URI:   http://www.tschofenig.com
2100	   Francois Audet
2101	   Nortel
2102	   4655 Great America Parkway
2103	   Santa Clara, CA  95054
2104	   USA

2106	   Email: audet@nortel.com

2108	Full Copyright Statement

2110	   Copyright (C) The IETF Trust (2008).

2112	   This document is subject to the rights, licenses and restrictions
2113	   contained in BCP 78, and except as set forth therein, the authors
2114	   retain all their rights.

2116	   This document and the information contained herein are provided on an
2117	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
2118	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
2119	   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
2120	   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
2121	   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
2122	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

2124	Intellectual Property

2126	   The IETF takes no position regarding the validity or scope of any
2127	   Intellectual Property Rights or other rights that might be claimed to
2128	   pertain to the implementation or use of the technology described in
2129	   this document or the extent to which any license under such rights
2130	   might or might not be available; nor does it represent that it has
2131	   made any independent effort to identify any such rights.  Information
2132	   on the procedures with respect to rights in RFC documents can be
2133	   found in BCP 78 and BCP 79.

2135	   Copies of IPR disclosures made to the IETF Secretariat and any
2136	   assurances of licenses to be made available, or the result of an
2137	   attempt made to obtain a general license or permission for the use of
2138	   such proprietary rights by implementers or users of this
2139	   specification can be obtained from the IETF on-line IPR repository at
2140	   http://www.ietf.org/ipr.

2142	   The IETF invites any interested party to bring to its attention any
2143	   copyrights, patents or patent applications, or other proprietary
2144	   rights that may cover technology that may be required to implement
2145	   this standard.  Please address the information to the IETF at
2146	   ietf-ipr@ietf.org.

2148	Acknowledgment

2150	   This document was produced using xml2rfc v1.33pre8 (of
2151	   http://xml.resource.org/) from a source in RFC-2629 XML format.