idnits 2.17.1 

draft-ietf-sip-media-security-requirements-08.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 20.

  -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on
     line 2121.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2132.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2139.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2145.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust Copyright Line does not match the
     current year

  == Line 962 has weird spacing: '...ication  along...'

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (October 30, 2008) is 5650 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Outdated reference: A later version (-07) exists of
     draft-ietf-avt-dtls-srtp-06

  == Outdated reference: A later version (-07) exists of
     draft-ietf-mmusic-media-path-middleboxes-01

  == Outdated reference: A later version (-13) exists of
     draft-ietf-mmusic-sdp-capability-negotiation-09

  == Outdated reference: A later version (-15) exists of
     draft-ietf-sip-certs-06

  == Outdated reference: A later version (-06) exists of
     draft-mcgrew-srtp-ekt-03

  == Outdated reference: A later version (-04) exists of
     draft-wing-sipping-srtp-key-03

  == Outdated reference: A later version (-22) exists of
     draft-zimmermann-avt-zrtp-10

  -- Obsolete informational reference (is this intentional?): RFC 4474
     (Obsoleted by RFC 8224)

  -- Obsolete informational reference (is this intentional?): RFC 4492
     (Obsoleted by RFC 8422)


     Summary: 1 error (**), 0 flaws (~~), 9 warnings (==), 9 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	SIP Working Group                                           D. Wing, Ed.
3	Internet-Draft                                                     Cisco
4	Intended status: Informational                                  S. Fries
5	Expires: May 3, 2009                                          Siemens AG
6	                                                           H. Tschofenig
7	                                                  Nokia Siemens Networks
8	                                                                F. Audet
9	                                                                  Nortel
10	                                                        October 30, 2008

12	    Requirements and Analysis of Media Security Management Protocols
13	             draft-ietf-sip-media-security-requirements-08

15	Status of this Memo

17	   By submitting this Internet-Draft, each author represents that any
18	   applicable patent or other IPR claims of which he or she is aware
19	   have been or will be disclosed, and any of which he or she becomes
20	   aware will be disclosed, in accordance with Section 6 of BCP 79.

22	   Internet-Drafts are working documents of the Internet Engineering
23	   Task Force (IETF), its areas, and its working groups.  Note that
24	   other groups may also distribute working documents as Internet-
25	   Drafts.

27	   Internet-Drafts are draft documents valid for a maximum of six months
28	   and may be updated, replaced, or obsoleted by other documents at any
29	   time.  It is inappropriate to use Internet-Drafts as reference
30	   material or to cite them other than as "work in progress."

32	   The list of current Internet-Drafts can be accessed at
33	   http://www.ietf.org/ietf/1id-abstracts.txt.

35	   The list of Internet-Draft Shadow Directories can be accessed at
36	   http://www.ietf.org/shadow.html.

38	   This Internet-Draft will expire on May 3, 2009.

40	Abstract

42	   This document describes requirements for a protocol to negotiate a
43	   security context for SIP-signaled SRTP media.  In addition to the
44	   natural security requirements, this negotiation protocol must
45	   interoperate well with SIP in certain ways.  A number of proposals
46	   have been published and a summary of these proposals is in the
47	   appendix of this document.

49	Table of Contents

51	   1.  Introduction
52	   2.  Terminology
53	   3.  Attack Scenarios
54	   4.  Call Scenarios and Requirements Considerations
55	     4.1.  Clipping Media Before Signaling Answer
56	     4.2.  Retargeting and Forking
57	     4.3.  Recording
58	     4.4.  PSTN gateway
59	     4.5.  Call Setup Performance
60	     4.6.  Transcoding
61	     4.7.  Upgrading to SRTP
62	     4.8.  Interworking with Other Signaling Protocols
63	     4.9.  Certificates
64	   5.  Requirements
65	     5.1.  Key Management Protocol Requirements
66	     5.2.  Security Requirements
67	     5.3.  Requirements Outside of the Key Management Protocol
68	   6.  Security Considerations
69	   7.  IANA Considerations
70	   8.  Acknowledgements
71	   9.  References
72	     9.1.  Normative References
73	     9.2.  Informative References
74	   Appendix A.  Overview and Evaluation of Existing Keying
75	                Mechanisms
76	     A.1.  Signaling Path Keying Techniques
77	       A.1.1.  MIKEY-NULL
78	       A.1.2.  MIKEY-PSK
79	       A.1.3.  MIKEY-RSA
80	       A.1.4.  MIKEY-RSA-R
81	       A.1.5.  MIKEY-DHSIGN
82	       A.1.6.  MIKEY-DHHMAC
83	       A.1.7.  MIKEY-ECIES and MIKEY-ECMQV (MIKEY-ECC)
84	       A.1.8.  Security Descriptions with SIPS
85	       A.1.9.  Security Descriptions with S/MIME
86	       A.1.10. SDP-DH (expired)
87	       A.1.11. MIKEYv2 in SDP (expired)
88	     A.2.  Media Path Keying Technique
89	       A.2.1.  ZRTP
90	     A.3.  Signaling and Media Path Keying Techniques
91	       A.3.1.  EKT
92	       A.3.2.  DTLS-SRTP
93	       A.3.3.  MIKEYv2 Inband (expired)
94	     A.4.  Evaluation Criteria - SIP
95	       A.4.1.  Secure Retargeting and Secure Forking
96	       A.4.2.  Clipping Media Before SDP Answer
97	       A.4.3.  SSRC and ROC
98	     A.5.  Evaluation Criteria - Security
99	       A.5.1.  Distribution and Validation of Persistent Public
100	               Keys and Certificates
101	       A.5.2.  Perfect Forward Secrecy
102	       A.5.3.  Best Effort Encryption
103	       A.5.4.  Upgrading Algorithms
104	   Appendix B.  Out-of-Scope
105	     B.1.  Shared Key Conferencing
106	   Appendix C.  Requirement renumbering in -02
107	   Authors' Addresses
108	   Intellectual Property and Copyright Statements

110	1.  Introduction

112	   The work on media security started when the Session Initiation
113	   Protocol (SIP) was still in its infancy.  With the increased SIP
114	   deployment and the availability of new SIP extensions and related
115	   protocols, the need for end-to-end security was re-evaluated.  The
116	   procedure of re-evaluating prior protocol work and design decisions
117	   is not an uncommon strategy and, to some extent, considered necessary
118	   to ensure that the developed protocols indeed meet the previously
119	   envisioned needs for the users on the Internet.

121	   This document summarizes media security requirements, i.e.,
122	   requirements for mechanisms that negotiate security context such as
123	   cryptographic keys and parameters for SRTP.

125	   The organization of this document is as follows: Section 2 introduces
126	   terminology, Section 3 describes various attack scenarios against the
127	   signaling path and media path, Section 4 provides an overview of
128	   possible call scenarios, and Section 5 lists requirements for media
129	   security.  The main part of the document concludes with the security
130	   considerations Section 6, IANA considerations Section 7 and an
131	   acknowledgement section in Section 8.  Appendix A lists and compares
132	   available solution proposals.  The following Appendix A.4 compares
133	   the different approaches regarding their suitability for the SIP
134	   signaling scenarios described in Appendix A, while Appendix A.5
135	   provides a comparison regarding security aspects.  Appendix B lists
136	   non-goals for this document.

138	2.  Terminology

140	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
141	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
142	   document are to be interpreted as described in [RFC2119], with the
143	   important qualification that, unless otherwise stated, these terms
144	   apply to the design of the media security key management protocol,
145	   not its implementation or application.

147	   Furthermore, the terminology described in SIP ([RFC3261]) regarding
148	   functions and components is used throughout the document.

150	   Additionally, the following items are used in this document:

152	   AOR (Address-of-Record):   A SIP or SIPS URI that points to a domain
153	      with a location service that can map the URI to another URI where
154	      the user might be available.  Typically, the location service is
155	      populated through registrations.  An AOR is frequently thought of
156	      as the "public address" of the user.

158	   SSRC:  The 32-bit value that defines the synchronization source, used
159	      in RTP.  These are generally unique, but collisions can occur.

161	   two-time pad:  The use of the same key and the same keystream to
162	      encrypt different data.  For SRTP, a two-time pad occurs if two
163	      senders are using the same key and the same RTP SSRC value.

165	   Perfect Forward Secrecy (PFS):  The property that disclosure of the
166	      long-term secret keying material that is used to derive an agreed
167	      ephemeral key does not compromise the secrecy of agreed keys from
168	      earlier runs.

170	   active adversary:  An active adversary is able to alter data
171	      communication to affect its operation (see also [RFC4949]).

173	   passive adversary:  A passive adversary is able to learn information
174	      from data communication, but not alter that data communication
175	      (see also [RFC4949]).

177	   signaling path:  The signaling path is the route taken by SIP
178	      signaling messages transmitted between the calling and called user
179	      agents.  This can be either direct signaling between the calling
180	      and called user agents or, more commonly, signaling that traverses
181	      the SIP proxy servers that were involved in the call setup.

183	   media path:  The media path is the route taken by media packets
184	      exchanged by the endpoints.  In the simplest case, the endpoints
185	      exchange media directly, and the "media path" is defined by a
186	      quartet of IP addresses and TCP/UDP ports, along with an IP route.
187	      In other cases, this path may include RTP relays, mixers,
188	      transcoders, session border controllers, NATs, or media gateways.
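
   The two-time pad defined above can be illustrated with a short
   sketch.  The keystream() function below is a hypothetical stand-in
   (SHA-256 in counter mode), not the SRTP cipher; it is assumed only
   to share the property that the same key and the same SSRC produce
   the same keystream.  Under that assumption, XORing two ciphertexts
   cancels the keystream and exposes the XOR of the two plaintexts.

```python
import hashlib


def keystream(key: bytes, ssrc: int, length: int) -> bytes:
    """Toy keystream (assumption): SHA-256 in counter mode over key and SSRC.

    This is NOT the SRTP keystream generator; it only mimics the property
    that identical (key, SSRC) inputs give an identical keystream.
    """
    out = b""
    counter = 0
    while len(out) < length:
        block_input = key + ssrc.to_bytes(4, "big") + counter.to_bytes(4, "big")
        out += hashlib.sha256(block_input).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


# Two senders that (wrongly) share the same key and the same SSRC value.
key = b"shared-session-key"
ssrc = 0x1234ABCD
p1 = b"alice says hello"
p2 = b"bob says goodbye"

c1 = xor(p1, keystream(key, ssrc, len(p1)))
c2 = xor(p2, keystream(key, ssrc, len(p2)))

# The keystream cancels out: an eavesdropper learns p1 XOR p2 without the key.
leak = xor(c1, c2)
```

   An eavesdropper who knows or guesses one plaintext can then recover
   the other directly, since xor(leak, p1) equals p2.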

190	3.  Attack Scenarios

192	   The discussion in this section relates to requirements R-PASS-MEDIA,
193	   R-PASS-SIG, R-ASSOC, R-SIG-MEDIA, R-ACT-ACT, and R-ID-BINDING.

195	   This document classifies adversaries according to their access and
196	   their capabilities.  An adversary might have access:

198	   1.  only to the media path,

200	   2.  only to the signaling path,

202	   3.  to the media path and to the signaling path.

204	   An attacker that is located solely along the signaling path, and
205	   does not have access to media (item 2), is not considered in this
206	   document.

208	   There are two different types of adversaries, active and passive.  An
209	   active adversary may need to be active with regard to the key
210	   exchange relevant information traveling along the media path or
211	   traveling along the signaling path.

213	   Based on their robustness against the adversary capabilities
214	   described above, we can group security mechanisms using the following
215	   labels.  This list is generally ordered from easiest to compromise
216	   (at the top) to most difficult to compromise (at the bottom):

218	    +---------------+---------+--------------------------------------+
219	    | SIP signaling |  media  |             abbreviation             |
220	    +---------------+---------+--------------------------------------+
221	    |      none     | passive |      no-signaling-passive-media      |
222	    |      none     |  active |       no-signaling-active-media      |
223	    |    passive    | passive |    passive-signaling-passive-media   |
224	    |    passive    |  active |    passive-signaling-active-media    |
225	    |     active    | passive |    active-signaling-passive-media    |
226	    |     active    |  active |     active-signaling-active-media    |
227	    |     active    |  active | active-signaling-active-media-detect |
228	    +---------------+---------+--------------------------------------+

230	   no-signaling-passive-media:
231	      Access to only the media path is sufficient to reveal the content
232	      of the media traffic.

234	   passive-signaling-passive-media:
235	      A passive attack on the signaling path and a passive attack on the
236	      media path are necessary to reveal the content of the media traffic.

238	   passive-signaling-active-media:
239	      A passive attack on the signaling path and an active attack on the
240	      media path are necessary to reveal the content of the media traffic.

242	   active-signaling-passive-media:
243	      An active attack on the signaling path and a passive attack on the
244	      media path are necessary to reveal the content of the media
245	      traffic.

247	   no-signaling-active-media:
248	      Active attack on the media path is sufficient to reveal the
249	      content of the media traffic.

251	   active-signaling-active-media:
252	      Active attack on both the signaling path and the media path is
253	      necessary to reveal the content of the media traffic.

255	   active-signaling-active-media-detect:
256	      Active attack on both signaling and media path is necessary to
257	      reveal the content of the media traffic (as with active-signaling-
258	      active-media), and the attack is detectable by protocol messages
259	      exchanged between the end points.

261	   For example, unencrypted RTP is vulnerable to no-signaling-passive-
262	   media.

264	   As another example, Security Descriptions [RFC4568], when protected
265	   by TLS (as it is commonly implemented and deployed), belongs in the
266	   passive-signaling-passive-media category since the adversary needs to
267	   learn the Security Descriptions key by seeing the SIP signaling
268	   message at a SIP proxy (assuming that the adversary is in control of
269	   the SIP proxy).  The media traffic can be decrypted using that
270	   learned key.

272	   As another example, DTLS-SRTP falls into active-signaling-active-
273	   media category when DTLS-SRTP is used with a public key based
274	   ciphersuite with self-signed certificates and without SIP-Identity
275	   [RFC4474].  An adversary would have to modify the fingerprint that is
276	   sent along the signaling path and subsequently to modify the
277	   certificates carried in the DTLS handshake that travel along the
278	   media path.  If DTLS-SRTP is used with both SIP Identity [RFC4474]
279	   and SIP Connected Identity [RFC4916], the RFC4474 signature protects
280	   both the offer and the answer, and such a system would then belong to
281	   the active-signaling-active-media-detect category (provided, of
282	   course, the signaling path to the RFC4474 authenticator and verifier
283	   is secured as per RFC4474 and the RFC4474 authenticator and verifier
284	   are behaving as per RFC4474).

286	   The above discussion of DTLS-SRTP demonstrates how a single security
287	   protocol can be in different classes depending on the mode in which
288	   it is operated.  Other protocols can achieve similar effect by adding
289	   functions outside of the on-the-wire key management protocol itself.
290	   Although it may be appropriate to deploy lower-classed mechanisms in
291	   some cases, the ultimate security requirement for a media security
292	   negotiation protocol is that it have a mode of operation available
293	   that falls in the active-signaling-active-media-detect class: it
294	   protects against passive and active attacks and allows such attacks
295	   to be detected.  That is, there must be a way to use the protocol so
296	   that an active attack is required against both the signaling and
297	   media paths, and so that such attacks are detectable by the endpoints.
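
   The classification in this section can be captured as a small data
   structure.  The sketch below is illustrative only: the type and
   function names (Access, AttackClass, rank, meets_ultimate_requirement)
   are assumptions introduced here, while the labels and the example
   assignments are taken from the table and the examples in this
   section.

```python
from enum import IntEnum
from typing import Dict, List, NamedTuple


class Access(IntEnum):
    """Adversary capability required on a given path."""
    NONE = 0
    PASSIVE = 1
    ACTIVE = 2


class AttackClass(NamedTuple):
    label: str
    signaling: Access
    media: Access
    detectable: bool  # True only for the "-detect" class


# Rows in the same order as the table: easiest to compromise first.
CLASSES: List[AttackClass] = [
    AttackClass("no-signaling-passive-media", Access.NONE, Access.PASSIVE, False),
    AttackClass("no-signaling-active-media", Access.NONE, Access.ACTIVE, False),
    AttackClass("passive-signaling-passive-media", Access.PASSIVE, Access.PASSIVE, False),
    AttackClass("passive-signaling-active-media", Access.PASSIVE, Access.ACTIVE, False),
    AttackClass("active-signaling-passive-media", Access.ACTIVE, Access.PASSIVE, False),
    AttackClass("active-signaling-active-media", Access.ACTIVE, Access.ACTIVE, False),
    AttackClass("active-signaling-active-media-detect", Access.ACTIVE, Access.ACTIVE, True),
]


def rank(label: str) -> int:
    """Position of a class in the table; a higher rank is harder to compromise."""
    for index, attack_class in enumerate(CLASSES):
        if attack_class.label == label:
            return index
    raise ValueError(f"unknown class: {label}")


def meets_ultimate_requirement(modes: List[str]) -> bool:
    """True if at least one mode of operation is in the detectable class."""
    return any(CLASSES[rank(mode)].detectable for mode in modes)


# Example assignments, as described in the text of this section.
EXAMPLES: Dict[str, str] = {
    "unencrypted RTP": "no-signaling-passive-media",
    "Security Descriptions protected by TLS": "passive-signaling-passive-media",
    "DTLS-SRTP, self-signed certificates, no SIP Identity":
        "active-signaling-active-media",
    "DTLS-SRTP with SIP Identity and Connected Identity":
        "active-signaling-active-media-detect",
}
```

   With this encoding, a protocol whose available modes are given as a
   list of labels satisfies the requirement stated above only if
   meets_ultimate_requirement() returns True for that list.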

299	4.  Call Scenarios and Requirements Considerations

301	   The following subsections describe call scenarios that pose the most
302	   challenge to the key management system for media data in cooperation
303	   with SIP signaling.

305	   Throughout the subsections, requirements are identified by labels
306	   with the prefix "R-".  All of the stated requirements are explained
307	   in detail in Section 5.  The requirements in Section 5 are grouped
308	   according to their association with the key management protocol,
309	   with attack scenarios, and with requirements that can be met either
310	   inside or outside of the key management protocol.

313	4.1.  Clipping Media Before Signaling Answer

315	   The discussion in this section relates to requirements R-AVOID-
316	   CLIPPING and R-ALLOW-RTP.

318	   Per the SDP Offer/Answer Model [RFC3264],

320	      "Once the offerer has sent the offer, it MUST be prepared to
321	      receive media for any recvonly streams described by that offer.
322	      It MUST be prepared to send and receive media for any sendrecv
323	      streams in the offer, and send media for any sendonly streams in
324	      the offer (of course, it cannot actually send until the peer
325	      provides an answer with the needed address and port information)."

327	   To meet this requirement with SRTP, the offerer needs to know the
328	   SRTP key for arriving media.  If either endpoint receives encrypted
329	   media before it has access to the associated SRTP key, it cannot play
330	   the media -- causing clipping.
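
   The clipping effect can be sketched as a simple timeline.  The
   example below is a minimal illustration under stated assumptions:
   the receiver does not buffer undecryptable packets, packets arrive
   every 20 ms, and all names (SrtpPacket, split_clipped) and timing
   values are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class SrtpPacket(NamedTuple):
    arrival_ms: int  # arrival time relative to the first media packet
    seq: int


def split_clipped(
    packets: List[SrtpPacket], key_known_ms: int
) -> Tuple[List[SrtpPacket], List[SrtpPacket]]:
    """Split packets into (played, clipped).

    Assumption: a packet that arrives before the SRTP key is known cannot
    be decrypted and is discarded rather than buffered.
    """
    played = [p for p in packets if p.arrival_ms >= key_known_ms]
    clipped = [p for p in packets if p.arrival_ms < key_known_ms]
    return played, clipped


# Ten packets at an assumed 20 ms packetization interval (0, 20, ..., 180 ms).
PACKET_INTERVAL_MS = 20
packets = [SrtpPacket(arrival_ms=PACKET_INTERVAL_MS * i, seq=i) for i in range(10)]

# The key only becomes available 100 ms after media starts arriving.
played, clipped = split_clipped(packets, key_known_ms=100)
clipped_audio_ms = PACKET_INTERVAL_MS * len(clipped)
```

   In this example the first five packets are lost to clipping.  If
   the key is available before the first packet arrives (key_known_ms
   of 0), nothing is clipped.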

332	   For key exchange mechanisms that send the answerer's key in SDP, a
333	   SIP provisional response [RFC3261], such as 183 (session progress),
334	   is useful.  However, the 183 messages are not reliable unless both
335	   the calling and called end point support PRACK [RFC3262], use TCP
336	   across all SIP proxies, implement Security Preconditions [RFC5027],
337	   or both ends implement ICE [I-D.ietf-mmusic-ice] and the answerer
338	   implements the reliable provisional response mechanism described in
339	   ICE.  Unfortunately, there is not wide deployment of any of these
340	   techniques and there is industry reluctance to require these
341	   techniques to avoid the problems described in this section.

343	   Note that the receipt of an SDP answer is not always sufficient to
344	   allow media to be played to the offerer.  Sometimes, the offerer must
345	   send media in order to open up firewall holes or NAT bindings before
346	   media can be received (for details see
348	   [I-D.ietf-mmusic-media-path-middleboxes]).  In this case, even a
349	   solution that makes the key available before the SDP answer arrives
350	   will not help.

352	   Preventing the arrival of early media (i.e., media that arrives at
353	   the SDP offerer before the SDP answer arrives) might obsolete the
354	   R-AVOID-CLIPPING requirement, but at the time of writing such early
355	   media exists in many normal call scenarios.

357	4.2.  Retargeting and Forking

359	   The discussion in this section relates to requirements R-FORK-
360	   RETARGET, R-DISTINCT, R-HERFP, and R-BEST-SECURE.

362	   In SIP, a request sent to a specific AOR but delivered to a different
363	   AOR is called a "retarget".  A typical scenario is a "call
364	   forwarding" feature.  In Figure 1 Alice sends an INVITE in step 1
365	   that is sent to Bob in step 2.  Bob responds with a redirect (SIP
366	   response code 3xx) pointing to Carol in step 3.  This redirect
367	   typically does not propagate back to Alice but only goes to a proxy
368	   (i.e., the retargeting proxy) that sends the original INVITE to Carol
369	   in step 4.

371	                                    +-----+
372	                                    |Alice|
373	                                    +--+--+
374	                                       |
375	                                       | INVITE (1)
376	                                       V
377	                                  +----+----+
378	                                  |  proxy  |
379	                                  ++-+-----++
380	                                   | ^     |
381	                        INVITE (2) | |     | INVITE (4)
382	                    & redirect (3) | |     |
383	                                   V |     V
384	                                  ++-++   ++----+
385	                                  |Bob|   |Carol|
386	                                  +---+   +-----+

388	                           Figure 1: Retargeting

390	   Using retargeting might lead to situations where the UAC does not
391	   know where its request will be going.  This might not immediately
392	   seem like a serious problem; after all, when one places a telephone
393	   call on the PSTN, one never really knows if it will be forwarded to a
394	   different number, who will pick up the line when it rings, and so on.

396	   However, when considering SIP mechanisms for authenticating the
397	   called party, this function can also make it difficult to
398	   differentiate an intermediary that is behaving legitimately from an
399	   attacker.  From this perspective, the main problems with retargeting
400	   are:

402	   Not detectable by the caller:   The originating user agent has no
403	      means of anticipating that the condition will arise, nor any means
404	      of determining that it has occurred until the call has already
405	      been set up.

407	   Not preventable by the caller:  There is no existing mechanism that
408	      might be employed by the originating user agent in order to
409	      guarantee that the call will not be re-targeted.

411	   The mechanism used by SIP for identifying the calling party is SIP
412	   Identity [RFC4474].  However, due to the nature of retargeting, SIP
413	   Identity can only identify the calling party (that is, the party that
414	   initiated the SIP request).  Some key exchange mechanisms predate SIP
415	   Identity and include their own identity mechanism (e.g., MIKEY).
416	   However, those built-in identity mechanisms also suffer from the SIP
417	   retargeting problem.  While Connected Identity [RFC4916] allows
418	   positive identification of the called party, the primary difficulty
419	   still remains that the calling party does not know if a mismatched
420	   called party is legitimate (i.e., due to authorized retargeting) or
421	   illegitimate (i.e., due to unauthorized retargeting by an attacker
422	   able to modify SIP signaling).

424	   In SIP, 'forking' is the delivery of a request to multiple locations.
425	   This happens when a single AOR is registered more than once.  An
426	   example of forking is when a user has a desk phone, PC client, and
427	   mobile handset all registered with the same AOR.

429	                                  +-----+
430	                                  |Alice|
431	                                  +--+--+
432	                                     |
433	                                     | INVITE
434	                                     V
435	                               +-----+-----+
436	                               |   proxy   |
437	                               ++---------++
438	                                |         |
439	                         INVITE |         | INVITE
440	                                V         V
441	                             +--+--+   +--+--+
442	                             |Bob-1|   |Bob-2|
443	                             +-----+   +-----+

445	                             Figure 2: Forking

447	   With forking, both Bob-1 and Bob-2 might send back SDP answers in SIP
448	   responses.  Alice will see those intermediate (18x) and final (200)
449	   responses.  It is useful for Alice to be able to associate the SIP
450	   response with the incoming media stream.  Although this association
451	   can be done with ICE [I-D.ietf-mmusic-ice], and ICE is useful to make
452	   this association with RTP, it is not desirable to require ICE to
453	   accomplish this association.

455	   Forking and retargeting are often used together.  For example, a boss
456	   and secretary might have both phones ring (forking) and roll over to
457	   voice mail if neither phone is answered (retargeting).

459	   To maintain security of the media traffic, only the end point that
460	   answers the call should know the SRTP keys for the session.  Forked
461	   and re-targeted calls only reveal sensitive information to non-
462	   responders when the signaling messages contain sensitive information
463	   (e.g., SRTP keys) that is accessible by parties that receive the
464	   offer, but may not respond (i.e., the original recipients in a
465	   retargeted call, or non-answering endpoints in a forked call).  For
466	   key exchange mechanisms that do not provide secure forking or secure
467	   retargeting, one workaround is to re-key immediately after forking or
468	   retargeting.  However, because the originator may not be aware that
469	   the call forked, this mechanism requires rekeying immediately after
470	   every session is established.  This doubles the number of messages
471	   processed by the network.
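
   The exposure described above can be sketched as follows.  This is
   an illustrative model only: the function names and endpoint names
   are hypothetical, key_in_offer=False is assumed to mean that the key
   is established solely with the endpoint that answers, and the
   message count simply mirrors the doubling described in the text.

```python
from typing import Iterable, Set


def key_exposure(recipients: Iterable[str], answerer: str, key_in_offer: bool) -> Set[str]:
    """Endpoints that learn the SRTP key without answering the call.

    If the offer itself carries the key, every endpoint that receives the
    forked or retargeted offer learns it.  Otherwise (assumption) only the
    answerer takes part in establishing the key.
    """
    if not key_in_offer:
        return set()
    return set(recipients) - {answerer}


def messages_processed(messages_per_session: int, rekey_every_session: bool) -> int:
    """Message count when every session is rekeyed right after setup.

    Assumption: a rekey exchange costs as many messages as the setup did.
    """
    return messages_per_session * (2 if rekey_every_session else 1)


# Forked call: the offer reaches Bob-1 and Bob-2, and Bob-1 answers.
exposed = key_exposure(["Bob-1", "Bob-2"], answerer="Bob-1", key_in_offer=True)

# Retargeted call: the offer reaches Bob and then Carol, and Carol answers.
exposed_retarget = key_exposure(["Bob", "Carol"], answerer="Carol", key_in_offer=True)
```

   Here Bob-2 (in the forked call) and Bob (in the retargeted call)
   hold the key even though they never answered.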

473	   Further compounding this problem is a unique feature of SIP: when
474	   forking is used, only one final error response is ever delivered to
475	   the sender of the request.  The forking proxy is responsible for
476	   choosing which final response to forward when forking results in
477	   multiple final error responses.  This means that if a request is
479	   rejected, say with information that the keying information was
480	   rejected and providing the far end's credentials, it is very possible
481	   that the rejection will never reach the sender.  This problem, called
482	   the Heterogeneous Error Response Forking Problem (HERFP) [RFC3326],
483	   is difficult to solve in SIP.  Because we expect the HERFP to
484	   continue to be a problem in SIP for the foreseeable future, a media
485	   security system should function even in the presence of HERFP
486	   behavior.
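
   The effect of HERFP can be sketched with a toy forking proxy.  The
   selection rule below is a simplification assumed for illustration
   (prefer a 6xx response, otherwise take the first response received
   in the lowest response class); it is loosely modeled on the proxy
   behavior in [RFC3261] and is not a complete implementation of it.
   The Response type, branch names, and detail text are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Response(NamedTuple):
    branch: str            # which forked branch produced the response
    status: int            # SIP final error status code (3xx-6xx)
    detail: Optional[str]  # information the far end wanted the caller to see


def choose_final_response(responses: List[Response]) -> Response:
    """Pick the single final error response forwarded to the caller.

    Simplified rule (assumption): a 6xx wins if present; otherwise the
    first response received in the lowest response class is forwarded.
    """
    if not responses:
        raise ValueError("no final responses collected")
    global_failures = [r for r in responses if 600 <= r.status < 700]
    if global_failures:
        return global_failures[0]
    lowest_class = min(r.status // 100 for r in responses)
    return next(r for r in responses if r.status // 100 == lowest_class)


# Bob-2 is busy; Bob-1 rejects the keying information and supplies credentials.
collected = [
    Response("Bob-2", 486, None),
    Response("Bob-1", 488, "keying rejected; far end's credentials attached"),
]

forwarded = choose_final_response(collected)
never_seen_by_caller = [r for r in collected if r is not forwarded]
```

   In this run the caller receives only the 486 from Bob-2; the 488
   from Bob-1, together with the credentials it carried, is discarded
   by the proxy.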

488	4.3.  Recording

490	   The discussion in this section relates to requirement R-RECORDING.

492	   Some business environments, such as stock brokers, banks, and catalog
493	   call centers, require recording calls with customers.  This is the
494	   familiar "this call is being recorded for quality purposes" heard
495	   during calls to these sorts of businesses.  In these environments,
496	   media recording is typically performed by an intermediate device
497	   (with RTP, this is typically implemented in a 'sniffer').

499	   When performing such call recording with SRTP, the end-to-end
500	   security is compromised.  This compromise is unavoidable because
501	   the operation of the business requires such recording.  It is
502	   desirable that the media security is not unduly compromised by the
503	   media recording.  The endpoint within the organization needs to be
504	   informed that there is an intermediate device and needs to cooperate
505	   with that intermediate device.

507	   This scenario does not place a requirement directly on the key
508	   management protocol.  The requirement could be met directly by the
509	   key management protocol (e.g., MIKEY-NULL or [RFC4568]) or through an
510	   external out-of-band-mechanism (e.g., [I-D.wing-sipping-srtp-key]).

512	4.4.  PSTN gateway

514	   The discussion in this section relates to requirement R-PSTN.

516	   It is desirable, even when one leg of a call is on the PSTN, that the
517	   IP leg of the call be protected with SRTP.

519	   A typical case of using media security is where two entities are
520	   having a VoIP conversation over IP-capable networks.  However, there
521	   are cases where the other end of the communication is not connected
522	   to an IP-capable network.  In this kind of setting, there needs to be
523	   some kind of gateway at the edge of the IP network that converts the
524	   VoIP conversation to a format understood by the other network.  An
525	   example of such a gateway is a PSTN gateway sitting at the edge of IP
526	   and PSTN networks (such as the architecture described in [RFC3372]).

528	   If media security (e.g., SRTP protection) is employed in this kind of
529	   gateway setting, then media security and the related key management
530	   are terminated at the PSTN gateway.  The other network (e.g., PSTN)
531	   may have its own measures to protect the communication, but this
532	   means that, from a media security point of view, media security is
533	   not employed truly end-to-end between the communicating entities.

535	4.5.  Call Setup Performance

537	   The discussion in this section relates to requirement R-REUSE.

539	   Some devices lack sufficient processing power to perform public key
540	   operations or Diffie-Hellman operations for each call, or prefer to
541	   avoid performing those operations on every call.  The ability to re-
542	   use previous public key or Diffie-Hellman operations can vastly
543	   decrease the call setup delay and processing requirements for such
544	   devices.

546	   In certain devices, it can take a second or two to perform a Diffie-
547	   Hellman operation.  Examples of these devices include handsets, IP
548	   Multimedia Services Identity Modules (ISIMs), and PSTN gateways.
549	   PSTN gateways typically utilize a Digital Signal Processor (DSP)
550	   which is not yet involved with typical DSP operations at the
551	   beginning of a call; thus, the DSP could be used to perform the
552	   calculation, so as to avoid having the central host processor
553	   perform it.  However, not all PSTN gateways use DSPs (some have only
554	   central processors or their DSPs are incapable of performing the
555	   necessary public key or Diffie-Hellman operation), and handsets lack
556	   a separate, unused processor to perform these operations.

558	   Two scenarios where R-REUSE is useful are calls between an endpoint
559	   and its voicemail server or its PSTN gateway.  In those scenarios
560	   calls are made relatively often and it can be useful for the
561	   voicemail server or PSTN gateway to avoid public key operations for
562	   subsequent calls.

564	   Storing keys across sessions often interferes with perfect forward
565	   secrecy (R-PFS).

567	4.6.  Transcoding

569	   The discussion in this section relates to requirement R-TRANSCODER.

571	   In some environments it is necessary for network equipment to
572	   transcode from one codec (e.g., a highly compressed codec which makes
573	   efficient use of wireless bandwidth) to another codec (e.g., a
574	   standardized codec to a SIP peering interface).  With RTP, a
575	   transcoding function can be performed with the combination of a SIP
576	   B2BUA (to modify the SDP) and a processor to perform the transcoding
577	   between the codecs.  However, with end-to-end secured SRTP, a
578	   transcoding function implemented the same way is a man in the middle
579	   attack, and the key management system prevents its use.

581	   However, such a network-based transcoder can still be realized with
582	   the cooperation and approval of the endpoint, and can provide end-to-
583	   transcoder and transcoder-to-end security.

585	4.7.  Upgrading to SRTP

587	   The discussion in this section relates to the requirement R-ALLOW-
588	   RTP.

590	   Legitimate RTP media can be sent to an endpoint for announcements,
591	   colorful ringback tones (e.g., music), advertising, or normal call
592	   progress tones.  The RTP may be received before an associated SDP
593	   answer.  For details on various scenarios, see
594	   [I-D.stucker-sipping-early-media-coping].

596	   While receiving such RTP exposes the calling party to a risk of
597	   receiving malicious RTP from an attacker, SRTP endpoints will need to
598	   receive and play out RTP media in order to be compatible with
599	   deployed systems that send RTP to calling parties.

601	4.8.  Interworking with Other Signaling Protocols

603	   The discussion in this section relates to the requirement R-OTHER-
604	   SIGNALING.

606	   In many environments, some devices are signaled with protocols other
607	   than SIP which do not share SIP's offer/answer model (e.g., [H.248.1])
608	   or do not utilize SDP (e.g., H.323).  In other environments, both
609	   endpoints may be SIP, but may use different key management systems
610	   (e.g., one uses MIKEY-RSA, the other MIKEY-RSA-R).

612	   In these environments, it is desirable to have SRTP -- rather than
613	   RTP -- between the two endpoints.  It is always possible, although
614	   undesirable, to interwork those disparate signaling systems or
615	   disparate key management systems by decrypting and re-encrypting each
616	   SRTP packet in a device in the middle of the network (often the same
617	   device performing the signaling interworking).  This is undesirable
618	   due to the cost and increased attack area, as such an SRTP/SRTP
619	   interworking device is a valuable attack target.

621	   At the time of this writing, interworking is considered important.
622	   Interworking without decryption/encryption of the SRTP, while useful,
623	   is not yet deemed critical because the scale of such SRTP deployments
624	   is, to date, relatively small.

626	4.9.  Certificates

628	   The discussion in this section relates to R-CERTS.

630	   On the Internet and on some private networks, validating another
631	   peer's certificate is often done through a trust anchor -- a list of
632	   Certificate Authorities that are trusted.  It can be difficult or
633	   expensive for a peer to obtain these certificates.  In all cases,
634	   both parties to the call would need to trust the same trust anchor
635	   (i.e., "certificate authority").  For these reasons, it is important
636	   that the media plane key management protocol offer a mechanism that
637	   allows end-users who have no prior association to authenticate to
638	   each other without acquiring credentials from a third party trust
639	   point.  Note that this does not rule out mechanisms in which servers
640	   have certificates and attest to the identities of end-users.

642	5.  Requirements

644	   This section is divided into several parts: requirements specific to
645	   the key management protocol (Section 5.1), attack scenarios
646	   (Section 5.2), and requirements which can be met inside the key
647	   management protocol or outside of the key management protocol
648	   (Section 5.3).

650	5.1.  Key Management Protocol Requirements

652	   SIP Forking and Retargeting, from Section 4.2:

654	   R-FORK-RETARGET:
655	         The media security key management protocol MUST securely
656	         support forking and retargeting when all endpoints are willing
657	         to use SRTP without causing the call setup to fail.  This
658	         requirement means the endpoints that did not answer the call
659	         MUST NOT learn the SRTP keys (in either direction) used by the
660	         answering endpoint.

662	   R-DISTINCT:
663	         The media security key management protocol MUST be capable of
664	         creating distinct, independent cryptographic contexts for each
665	         endpoint in a forked session.

667	   R-HERFP:
668	         The media security key management protocol MUST function
669	         securely even in the presence of HERFP behavior, i.e., the
670	         rejection of key information does not reach the sender.

672	   Performance considerations:

674	   R-REUSE:
675	         The media security key management protocol MAY support the re-
676	         use of a previously established security context.

678	               Note: re-use of the security context does not imply re-
679	               use of RTP parameters (e.g., payload type or SSRC).

681	   Media considerations:

683	   R-AVOID-CLIPPING:
684	         The media security key management protocol SHOULD avoid
685	         clipping media before SDP answer without requiring Security
686	         Preconditions [RFC5027].  This requirement comes from
687	         Section 4.1.

689	   R-RTP-VALID:
690	         If SRTP key negotiation is performed over the media path (i.e.,
691	         using the same UDP/TCP ports as media packets), the key
692	         negotiation packets MUST NOT pass the RTP validity check
693	         defined in Appendix A.1 of [RFC3550].
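
A minimal sketch of how R-RTP-VALID can be checked in practice is given
below.  It implements only a subset of the header checks listed in
Appendix A.1 of [RFC3550] (version, known payload type, CSRC count, and
padding); the function name and the set of known payload types are
assumptions made for this illustration.  A keying packet whose first
two bits are not the RTP version (2) fails the check, which is the
property the requirement asks for.

```python
RTP_VERSION = 2
RTP_FIXED_HEADER_LEN = 12  # octets in the fixed RTP header


def passes_rtp_validity_check(packet: bytes, known_payload_types: set) -> bool:
    """Return True if the packet passes a subset of the RFC 3550 A.1 checks."""
    if len(packet) < RTP_FIXED_HEADER_LEN:
        return False
    first, second = packet[0], packet[1]
    # The two most significant bits of the first octet carry the version.
    if first >> 6 != RTP_VERSION:
        return False
    # The payload type (low 7 bits of the second octet) must be known.
    if (second & 0x7F) not in known_payload_types:
        return False
    # The packet must be long enough to hold the advertised CSRC list.
    header_len = RTP_FIXED_HEADER_LEN + 4 * (first & 0x0F)
    if len(packet) < header_len:
        return False
    # If the P bit is set, the last octet must hold a plausible pad count.
    if first & 0x20 and packet[-1] >= len(packet) - header_len:
        return False
    return True


# A plain RTP packet: version 2, payload type 0, no payload.
rtp_packet = bytes([0x80, 0x00]) + bytes(10)
# A hypothetical keying packet whose first octet is 0x16 (version bits 00).
keying_packet = bytes([0x16]) + bytes(15)

print(passes_rtp_validity_check(rtp_packet, {0}))
print(passes_rtp_validity_check(keying_packet, {0}))
```

A receiver can use the same test to demultiplex keying packets from
media packets that arrive on the same port.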

695	   R-ASSOC:
696	         The media security key management protocol SHOULD include a
697	         mechanism for associating key management messages with both the
698	         signaling traffic that initiated the session and with protected
699	         media traffic.  It is useful to associate key management
700	         messages with call signaling messages, as this allows the SDP
701	         offerer to avoid performing CPU-consuming operations (e.g.,
702	         Diffie-Hellman or public key operations) with attackers that
703	         have not seen the signaling messages.

705	         For example, if using a Diffie-Hellman keying technique with
706	         security preconditions that forks to 20 end points, the call
707	         initiator would get 20 provisional responses containing 20
708	         signed Diffie-Hellman key pairs.  Calculating 20 Diffie-Hellman
709	         secrets and validating signatures can be a difficult task for
710	         some devices.  Hence, in the case of forking, it is not
711	         desirable to perform a Diffie-Hellman operation with every
712	         party, but rather only with the party that answers the call
713	         (and incur some media clipping).  To do this, the signaling and
714	         media need to be associated so the calling party knows which
715	         key management exchange needs to be completed.  This might be
716	         done by using the transport address indicated in the SDP,
717	         although NATs can complicate this association.

719	               Note: due to RTP's design requirements, it is expected
720	               that SRTP receivers will have to perform authentication
721	               of any received SRTP packets.
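
The forking example under R-ASSOC can be sketched as follows.  The
caller records the Diffie-Hellman public value from each provisional
response, indexed by the transport address from the SDP, and performs
the expensive exponentiation only for the endpoint that answers.  The
class and method names are hypothetical, the group parameters are toy
values unsuitable for real use, and signature validation is omitted.

```python
import secrets

# Toy Diffie-Hellman parameters for illustration only (not secure).
P = 2**127 - 1
G = 3


class Caller:
    """Hypothetical call initiator that defers DH until the call is answered."""

    def __init__(self):
        self.private = secrets.randbelow(P - 2) + 1
        self.public = pow(G, self.private, P)
        self.pending = {}       # transport address -> peer DH public value
        self.dh_operations = 0  # counts shared-secret computations

    def on_provisional_response(self, address, peer_public):
        # Only store the value; no shared secret is computed yet.
        self.pending[address] = peer_public

    def on_answer(self, address):
        # Complete the exchange with the answering endpoint only.
        peer_public = self.pending[address]
        self.pending.clear()
        self.dh_operations += 1
        return pow(peer_public, self.private, P)


caller = Caller()
fork_privates = {}
for i in range(20):
    address = ("192.0.2.%d" % (i + 1), 49170)
    fork_privates[address] = secrets.randbelow(P - 2) + 1
    caller.on_provisional_response(address, pow(G, fork_privates[address], P))

shared = caller.on_answer(("192.0.2.7", 49170))
print(caller.dh_operations)
```

As the text notes, NATs can change the observed transport address, so
a deployed system would likely need a more robust association key.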

723	   R-NEGOTIATE:
724	         The media security key management protocol MUST allow a SIP
725	         User Agent to negotiate media security parameters for each
726	         individual session.

728	   R-PSTN:
729	         The media security key management protocol MUST support
730	         termination of media security in a PSTN gateway.  This
731	         requirement is from Section 4.4.

733	5.2.  Security Requirements

735	   This section describes overall security requirements and specific
736	   requirements from the attack scenarios (Section 3).

738	   Overall security requirements:

740	   R-PFS:
741	         The media security key management protocol MUST be able to
742	         support perfect forward secrecy.

744	   R-COMPUTE:
745	         The media security key management protocol MUST support
746	         offering additional SRTP cipher suites without incurring
747	         significant computational expense.

749	   R-CERTS:
750	         The key management protocol MUST NOT require that end-users
751	         obtain credentials (certificates or private keys) from a third-
752	         party trust anchor.

754	   R-FIPS:
755	         The media security key management protocol SHOULD use
756	         algorithms that allow FIPS 140-2 [FIPS-140-2] certification.

758	         The United States Government can only purchase and use crypto
759	         implementations that have been validated by the FIPS-140
760	         [FIPS-140-2] process:

762	               "The FIPS-140 standard is applicable to all Federal
763	               agencies that use cryptographic-based security systems to
764	               protect sensitive information in computer and
765	               telecommunication systems, including voice systems.  The
766	               adoption and use of this standard is available to private
767	               and commercial organizations."

769	         Some commercial organizations, such as banks and defense
770	         contractors, require or prefer equipment which has received the
771	         same validation.

773	   R-DOS:
774	         The media security key management protocol SHOULD NOT introduce
775	         new denial of service vulnerabilities (e.g., the protocol
776	         should not request the endpoint to perform CPU-intensive
777	         operations without the client being able to validate or
778	         authorize the request).

780	   R-EXISTING:
781	         The media security key management protocol SHOULD allow
782	         endpoints to authenticate using pre-existing cryptographic
783	         credentials, e.g., certificates or pre-shared keys.

785	   R-AGILITY:
786	         The media security key management protocol MUST provide crypto-
787	         agility, i.e., the ability to adapt to evolving cryptography
788	         and security requirements (update of cryptographic algorithms
789	         without substantial disruption to deployed implementations).

791	   R-DOWNGRADE:
792	         The media security key management protocol MUST protect cipher
793	         suite negotiation against downgrading attacks.

795	   R-PASS-MEDIA:
796	         The media security key management protocol MUST have a mode
797	         which prevents a passive adversary with access to the media
798	         path from gaining access to keying material used to protect
799	         SRTP media packets.

801	   R-PASS-SIG:
802	         The media security key management protocol MUST have a mode in
803	         which it prevents a passive adversary with access to the
804	         signaling path from gaining access to keying material used to
805	         protect SRTP media packets.

807	   R-SIG-MEDIA:
808	         The media security key management protocol MUST have a mode in
809	         which it defends itself from an attacker that is solely on the
810	         media path and from an attacker that is solely on the signaling
811	         path.  A successful attack refers to the ability for the
812	         adversary to obtain keying material to decrypt the SRTP
813	         encrypted media traffic.

815	   R-ID-BINDING:
816	         The media security key management protocol MUST enable the
817	         media security keys to be cryptographically bound to an
818	         identity of the endpoint.

820	               This allows domains to deploy SIP Identity [RFC4474].

822	   R-ACT-ACT:
823	         The media security key management protocol MUST support a mode
824	         of operation that provides active-signaling-active-media-detect
825	         robustness, and MAY support modes of operation that provide
826	         lower levels of robustness (as described in Section 3).

828	               Failing to meet R-ACT-ACT indicates the protocol cannot
829	               provide secure end-to-end media.

831	5.3.  Requirements Outside of the Key Management Protocol

833	   The requirements in this section are for an overall VoIP security
834	   system.  These requirements can be met within the key management
835	   protocol itself, or can be solved outside of the key management
836	   protocol itself (e.g., solved in SIP or in SDP).

838	   R-BEST-SECURE:
839	         Even when some end points of a forked or retargeted call are
840	         incapable of using SRTP, a solution MUST be described which
841	         allows the establishment of SRTP associations with SRTP-capable
842	         endpoints and / or RTP associations with non-SRTP-capable
843	         endpoints.

845	   R-OTHER-SIGNALING:
846	         A solution SHOULD be able to negotiate keys for SRTP sessions
847	         created via different call signaling protocols (e.g., between
848	         Jabber, SIP, H.323, MGCP).

850	   R-RECORDING:
851	         A solution SHOULD be described which supports recording of
852	         decrypted media.  This requirement comes from Section 4.3.

854	   R-TRANSCODER:
855	         A solution SHOULD be described which supports intermediate
856	         nodes (e.g., transcoders), terminating or processing media,
857	         between the end points.

859	   R-ALLOW-RTP:  A solution SHOULD be described which allows RTP media
860	         to be received by the calling party until SRTP has been
861	         negotiated with the answerer, after which SRTP is preferred
862	         over RTP.

864	6.  Security Considerations

866	   This document lists requirements for securing media traffic.  As
867	   such, it addresses security throughout the document.

869	7.  IANA Considerations

871	   This document does not require actions by IANA.

873	8.  Acknowledgements

875	   For contributions to the requirements portion of this document, the
876	   authors would like to thank the active participants of the RTPSEC BoF
877	   and on the RTPSEC mailing list, and a special thanks to Steffen Fries
878	   and Dragan Ignjatic for their excellent MIKEY comparison [RFC5197]
879	   document.

881	   The authors would furthermore like to thank the following people for
882	   their review, suggestions, and comments: Flemming Andreasen, Richard
883	   Barnes, Mark Baugher, Wolfgang Buecker, Werner Dittmann, Lakshminath
884	   Dondeti, John Elwell, Martin Euchner, Hans-Heinrich Grusdt, Christer
885	   Holmberg, Guenther Horn, Peter Howard, Leo Huang, Dragan Ignjatic,
886	   Cullen Jennings, Alan Johnston, Vesa Lehtovirta, Matt Lepinski, David
887	   McGrew, David Oran, Colin Perkins, Eric Raymond, Eric Rescorla, Peter
888	   Schneider, Srinath Thiruvengadam, Dave Ward, Dan York, and Phil
889	   Zimmermann.

891	9.  References

893	9.1.  Normative References

895	   [FIPS-140-2]
896	              NIST, "Security Requirements for Cryptographic Modules",
897	              June 2005, <http://csrc.nist.gov/publications/fips/
898	              fips140-2/fips1402.pdf>.

900	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
901	              Requirement Levels", BCP 14, RFC 2119, March 1997.

903	   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
904	              A., Peterson, J., Sparks, R., Handley, M., and E.
905	              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
906	              June 2002.

908	   [RFC3262]  Rosenberg, J. and H. Schulzrinne, "Reliability of
909	              Provisional Responses in Session Initiation Protocol
910	              (SIP)", RFC 3262, June 2002.

912	   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
913	              with Session Description Protocol (SDP)", RFC 3264,
914	              June 2002.

916	   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
917	              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
918	              RFC 3711, March 2004.

920	   [cryptval]
921	              NIST, "Cryptographic Module Validation Program",
922	              December 2006,
923	              <http://csrc.nist.gov/cryptval/140-2APP.htm>.

925	9.2.  Informative References

927	   [H.248.1]  ITU, "Gateway control protocol", June 2000,
928	              <http://www.itu.int/rec/T-REC-H.248/e>.

930	   [I-D.baugher-mmusic-sdp-dh]
931	              Baugher, M. and D. McGrew, "Diffie-Hellman Exchanges for
932	              Multimedia Sessions", draft-baugher-mmusic-sdp-dh-00 (work
933	              in progress), February 2006.

935	   [I-D.dondeti-msec-rtpsec-mikeyv2]
936	              Dondeti, L., "MIKEYv2: SRTP Key Management using MIKEY,
937	              revisited", draft-dondeti-msec-rtpsec-mikeyv2-01 (work in
938	              progress), March 2007.

940	   [I-D.fischl-sipping-media-dtls]
941	              Fischl, J., "Datagram Transport Layer Security (DTLS)
942	              Protocol for Protection of Media  Traffic Established with
943	              the Session Initiation Protocol",
944	              draft-fischl-sipping-media-dtls-03 (work in progress),
945	              July 2007.

947	   [I-D.ietf-avt-dtls-srtp]
948	              McGrew, D. and E. Rescorla, "Datagram Transport Layer
949	              Security (DTLS) Extension to Establish Keys for  Secure
950	              Real-time Transport Protocol (SRTP)",
951	              draft-ietf-avt-dtls-srtp-06 (work in progress),
952	              October 2008.

954	   [I-D.ietf-mmusic-ice]
955	              Rosenberg, J., "Interactive Connectivity Establishment
956	              (ICE): A Protocol for Network Address  Translator (NAT)
957	              Traversal for Offer/Answer Protocols",
958	              draft-ietf-mmusic-ice-19 (work in progress), October 2007.

960	   [I-D.ietf-mmusic-media-path-middleboxes]
961	              Stucker, B. and H. Tschofenig, "Analysis of Middlebox
962	              Interactions for Signaling Protocol Communication  along
963	              the Media Path",
964	              draft-ietf-mmusic-media-path-middleboxes-01 (work in
965	              progress), July 2008.

967	   [I-D.ietf-mmusic-sdp-capability-negotiation]
968	              Andreasen, F., "SDP Capability Negotiation",
969	              draft-ietf-mmusic-sdp-capability-negotiation-09 (work in
970	              progress), July 2008.

972	   [I-D.ietf-msec-mikey-ecc]
973	              Milne, A., "ECC Algorithms for MIKEY",
974	              draft-ietf-msec-mikey-ecc-03 (work in progress),
975	              June 2007.

977	   [I-D.ietf-sip-certs]
978	              Jennings, C. and J. Fischl, "Certificate Management
979	              Service for The Session Initiation Protocol (SIP)",
980	              draft-ietf-sip-certs-06 (work in progress), April 2008.

982	   [I-D.ietf-tls-rfc4346-bis]
983	              Dierks, T. and E. Rescorla, "The Transport Layer Security
984	              (TLS) Protocol Version 1.2", draft-ietf-tls-rfc4346-bis-10
985	              (work in progress), March 2008.

987	   [I-D.jennings-sipping-multipart]
988	              Wing, D. and C. Jennings, "Session Initiation Protocol
989	              (SIP) Offer/Answer with Multipart Alternative",
990	              draft-jennings-sipping-multipart-02 (work in progress),
991	              March 2006.

993	   [I-D.mcgrew-srtp-ekt]
994	              McGrew, D., "Encrypted Key Transport for Secure RTP",
995	              draft-mcgrew-srtp-ekt-03 (work in progress), July 2007.

997	   [I-D.stucker-sipping-early-media-coping]
998	              Stucker, B., "Coping with Early Media in the Session
999	              Initiation Protocol (SIP)",
1000	              draft-stucker-sipping-early-media-coping-03 (work in
1001	              progress), October 2006.

1003	   [I-D.wing-sipping-srtp-key]
1004	              Wing, D., Audet, F., Fries, S., Tschofenig, H., and A.
1005	              Johnston, "Secure Media Recording and Transcoding with the
1006	              Session Initiation  Protocol",
1007	              draft-wing-sipping-srtp-key-03 (work in progress),
1008	              February 2008.

1010	   [I-D.zimmermann-avt-zrtp]
1011	              Zimmermann, P., Johnston, A., and J. Callas, "ZRTP: Media
1012	              Path Key Agreement for Secure RTP",
1013	              draft-zimmermann-avt-zrtp-10 (work in progress),
1014	              October 2008.

1016	   [RFC3326]  Schulzrinne, H., Oran, D., and G. Camarillo, "The Reason
1017	              Header Field for the Session Initiation Protocol (SIP)",
1018	              RFC 3326, December 2002.

1020	   [RFC3372]  Vemuri, A. and J. Peterson, "Session Initiation Protocol
1021	              for Telephones (SIP-T): Context and Architectures",
1022	              BCP 63, RFC 3372, September 2002.

1024	   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
1025	              Jacobson, "RTP: A Transport Protocol for Real-Time
1026	              Applications", STD 64, RFC 3550, July 2003.

1028	   [RFC3830]  Arkko, J., Carrara, E., Lindholm, F., Naslund, M., and K.
1029	              Norrman, "MIKEY: Multimedia Internet KEYing", RFC 3830,
1030	              August 2004.

1032	   [RFC4474]  Peterson, J. and C. Jennings, "Enhancements for
1033	              Authenticated Identity Management in the Session
1034	              Initiation Protocol (SIP)", RFC 4474, August 2006.

1036	   [RFC4492]  Blake-Wilson, S., Bolyard, N., Gupta, V., Hawk, C., and B.
1037	              Moeller, "Elliptic Curve Cryptography (ECC) Cipher Suites
1038	              for Transport Layer Security (TLS)", RFC 4492, May 2006.

1040	   [RFC4568]  Andreasen, F., Baugher, M., and D. Wing, "Session
1041	              Description Protocol (SDP) Security Descriptions for Media
1042	              Streams", RFC 4568, July 2006.

1044	   [RFC4650]  Euchner, M., "HMAC-Authenticated Diffie-Hellman for
1045	              Multimedia Internet KEYing (MIKEY)", RFC 4650,
1046	              September 2006.

1048	   [RFC4738]  Ignjatic, D., Dondeti, L., Audet, F., and P. Lin, "MIKEY-
1049	              RSA-R: An Additional Mode of Key Distribution in
1050	              Multimedia Internet KEYing (MIKEY)", RFC 4738,
1051	              November 2006.

1053	   [RFC4771]  Lehtovirta, V., Naslund, M., and K. Norrman, "Integrity
1054	              Transform Carrying Roll-Over Counter for the Secure Real-
1055	              time Transport Protocol (SRTP)", RFC 4771, January 2007.

1057	   [RFC4916]  Elwell, J., "Connected Identity in the Session Initiation
1058	              Protocol (SIP)", RFC 4916, June 2007.

1060	   [RFC4949]  Shirey, R., "Internet Security Glossary, Version 2",
1061	              RFC 4949, August 2007.

1063	   [RFC5027]  Andreasen, F. and D. Wing, "Security Preconditions for
1064	              Session Description Protocol (SDP) Media Streams",
1065	              RFC 5027, October 2007.

1067	   [RFC5197]  Fries, S. and D. Ignjatic, "On the Applicability of
1068	              Various Multimedia Internet KEYing (MIKEY) Modes and
1069	              Extensions", RFC 5197, June 2008.

1071	Appendix A.  Overview and Evaluation of Existing Keying Mechanisms

1073	   Based on how the SRTP keys are exchanged, each SRTP key exchange
1074	   mechanism belongs to one general category:

1076	   signaling path:
1077	        All the keying is carried in the call signaling (SIP or SDP)
1078	        path.

1080	   media path:
1081	        All the keying is carried in the SRTP/SRTCP media path, and no
1082	        signaling whatsoever is carried in the call signaling path.

1084	   signaling and media path:
1085	        Parts of the keying are carried in the SRTP/SRTCP media path,
1086	        and parts are carried in the call signaling (SIP or SDP) path.

1088	   One of the significant benefits of SRTP over other end-to-end
1089	   encryption mechanisms, such as IPsec, is that SRTP is bandwidth
1090	   efficient and retains the header of RTP packets.

1092	   Bandwidth efficiency is vital for VoIP in many scenarios where access
1093	   bandwidth is limited or expensive, and retaining the RTP header is
1094	   important for troubleshooting packet loss, delay, and jitter.

1096	   Related to SRTP's characteristics is a goal that any SRTP keying
1097	   mechanism also be efficient and not cause additional call setup
1098	   delay.  Contributors to additional call setup delay include network
1099	   or database operations: retrieval of certificates and additional SIP
1100	   or media path messages, and computational overhead of establishing
1101	   keys or validating certificates.

1103	   When examining the choice between keying in the signaling path,
1104	   keying in the media path, or keying in both paths, it is important to
1105	   realize the media path is generally 'faster' than the SIP signaling
1106	   path.  The SIP signaling path has computational elements involved
1107	   which parse and route SIP messages.  The media path, on the other
1108	   hand, does not normally have computational elements involved, and
1109	   even when computational elements such as firewalls are involved, they
1110	   cause very little additional delay.  Thus, the media path can be
1111	   useful for exchanging several messages to establish SRTP keys.  A
1112	   disadvantage of keying over the media path is that interworking
1113	   different key exchange mechanisms requires the interworking function
1114	   to be in the media path, rather than just in the signaling path; in
1115	   practice this involvement is probably unavoidable anyway.

1117	A.1.  Signaling Path Keying Techniques

1119	A.1.1.  MIKEY-NULL

1121	   MIKEY-NULL [RFC3830] has the offerer indicate the SRTP keys for both
1122	   directions.  The key is sent unencrypted in SDP, which means the SDP
1123	   must be encrypted hop-by-hop (e.g., by using TLS (SIPS)) or end-to-
1124	   end (e.g., by using S/MIME).

1126	   MIKEY-NULL requires one message from offerer to answerer (half a
1127	   round trip), and does not add additional media path messages.

1129	A.1.2.  MIKEY-PSK

1131	   MIKEY-PSK (pre-shared key) [RFC3830] requires that all endpoints
1132	   share one common key.  MIKEY-PSK has the offerer encrypt the SRTP
1133	   keys for both directions using this pre-shared key.

1135	   MIKEY-PSK requires one message from offerer to answerer (half a round
1136	   trip), and does not add additional media path messages.

1138	A.1.3.  MIKEY-RSA

1140	   MIKEY-RSA [RFC3830] has the offerer encrypt the keys for both
1141	   directions using the intended answerer's public key, which is
1142	   obtained from a mechanism outside of MIKEY.

1144	   MIKEY-RSA requires one message from offerer to answerer (half a round
1145	   trip), and does not add additional media path messages.  MIKEY-RSA
1146	   requires the offerer to obtain the intended answerer's certificate.

1148	A.1.4.  MIKEY-RSA-R

1150	   MIKEY-RSA-R [RFC4738] is essentially the same as MIKEY-RSA but
1151	   reverses the role of the offerer and the answerer with regards to
1152	   providing the keys.  That is, the answerer encrypts the keys for both
1153	   directions using the offerer's public key.  Both the offerer and
1154	   answerer validate each other's public keys using standard X.509
1155	   validation techniques.  MIKEY-RSA-R also enables sending certificates
1156	   in the MIKEY message.

1158	   MIKEY-RSA-R requires one message from offerer to answerer, and one
1159	   message from answerer to offerer (full round trip), and does not add
1160	   additional media path messages.  MIKEY-RSA-R requires the offerer to
1161	   validate the answerer's certificate.

1163	A.1.5.  MIKEY-DHSIGN

1165	   In MIKEY-DHSIGN [RFC3830] the offerer and answerer derive the key
1166	   from a Diffie-Hellman exchange.  In order to prevent an active man-
1167	   in-the-middle attack, the DH exchange itself is signed using each
1168	   endpoint's private key and the associated public keys are validated
1169	   using standard X.509 validation techniques.

1171	   MIKEY-DHSIGN requires one message from offerer to answerer, and one
1172	   message from answerer to offerer (full round trip), and does not add
1173	   additional media path messages.  MIKEY-DHSIGN requires the offerer
1174	   and answerer to validate each other's certificates.  MIKEY-DHSIGN
1175	   also enables sending the answerer's certificate in the MIKEY message.

1177	A.1.6.  MIKEY-DHHMAC

1179	   MIKEY-DHHMAC [RFC4650] uses a pre-shared secret to HMAC the Diffie-
1180	   Hellman exchange, essentially combining aspects of MIKEY-PSK with
1181	   MIKEY-DHSIGN, but without MIKEY-DHSIGN's need for certificate
1182	   authentication.

1184	   MIKEY-DHHMAC requires one message from offerer to answerer, and one
1185	   message from answerer to offerer (full round trip), and does not add
1186	   additional media path messages.
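
   As an illustration only, the following Python sketch shows the idea
   behind MIKEY-DHHMAC: each side sends a Diffie-Hellman public value
   together with an HMAC computed under the pre-shared secret, and a
   received value is accepted only if that HMAC verifies.  The toy group
   (P, G), the function names, and the choice of HMAC-SHA-256 are
   assumptions made for this sketch; it does not reproduce the message
   format defined in [RFC4650].

```python
import hashlib
import hmac
import secrets

# Toy Diffie-Hellman group, for illustration only (assumed values).
# Real deployments use much larger, standardized groups.
P = 2**127 - 1  # a Mersenne prime, far too small for real use
G = 3


def dh_keypair():
    """Return (private, public) for the toy group."""
    private = secrets.randbelow(P - 3) + 2
    return private, pow(G, private, P)


def tag(psk: bytes, public: int) -> bytes:
    """Compute an HMAC over the DH public value with the pre-shared secret."""
    return hmac.new(psk, public.to_bytes(16, "big"), hashlib.sha256).digest()


def accept(psk: bytes, public: int, received_tag: bytes) -> bool:
    """Accept the peer's DH public value only if its HMAC verifies."""
    return hmac.compare_digest(tag(psk, public), received_tag)


def shared_secret(private: int, peer_public: int) -> int:
    """Derive the DH shared secret from our private and the peer's public value."""
    return pow(peer_public, private, P)
```

   In this sketch an on-path attacker who does not hold the pre-shared
   secret cannot substitute its own public value, because the
   accompanying HMAC would fail to verify.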

1188	A.1.7.  MIKEY-ECIES and MIKEY-ECMQV (MIKEY-ECC)

1190	   ECC Algorithms For MIKEY [I-D.ietf-msec-mikey-ecc] describes how ECC
1191	   can be used with MIKEY-RSA (using ECDSA signature) and with MIKEY-
1192	   DHSIGN (using a new DH-Group code), and also defines two new ECC-
1193	   based algorithms, Elliptic Curve Integrated Encryption Scheme (ECIES)
1194	   and Elliptic Curve Menezes-Qu-Vanstone (ECMQV).

1196	   With this proposal, the ECDSA signature, MIKEY-ECIES, and MIKEY-ECMQV
1197	   function exactly like MIKEY-RSA, and the new DH-Group code functions
1198	   exactly like MIKEY-DHSIGN.  Therefore these ECC mechanisms are not
1199	   discussed separately in this document.

1201	A.1.8.  Security Descriptions with SIPS

1203	   Security Descriptions [RFC4568] has each side indicate the key it
1204	   will use for transmitting SRTP media, and the keys are sent in the
1205	   clear in SDP.  Security Descriptions relies on hop-by-hop (TLS via
1206	   "SIPS:") encryption to protect the keys exchanged in signaling.

1208	   Security Descriptions requires one message from offerer to answerer,
1209	   and one message from answerer to offerer (full round trip), and does
1210	   not add additional media path messages.
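
   The following Python sketch is a hedged illustration of why the
   signaling must be protected: the SRTP master key and master salt
   travel base64-encoded in an "a=crypto" attribute [RFC4568], so any
   element that can read the SDP can recover them.  The helper names are
   hypothetical, and the parser handles only a single "inline" key
   without lifetime, MKI, or session parameters.

```python
import base64
import secrets


def make_crypto_attribute(tag: int, suite: str = "AES_CM_128_HMAC_SHA1_80") -> str:
    """Build an SDP crypto attribute carrying a fresh master key and salt."""
    # 128-bit master key followed by a 112-bit master salt.
    key_and_salt = secrets.token_bytes(16 + 14)
    inline = base64.b64encode(key_and_salt).decode("ascii")
    return f"a=crypto:{tag} {suite} inline:{inline}"


def parse_crypto_attribute(line: str):
    """Return (tag, suite, key, salt) from a simple crypto attribute."""
    prefix = "a=crypto:"
    if not line.startswith(prefix):
        raise ValueError("not a crypto attribute")
    tag, suite, rest = line[len(prefix):].split(" ", 2)
    key_params = rest.split()[0]  # ignore any trailing session parameters
    method, _, value = key_params.partition(":")
    if method != "inline":
        raise ValueError("unsupported key method")
    # Drop optional "|lifetime|MKI" fields, then decode key and salt.
    material = base64.b64decode(value.split("|")[0])
    return int(tag), suite, material[:16], material[16:]
```

   Each side would place such an attribute in its own offer or answer to
   announce the key it will use for the SRTP media it transmits.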

1212	A.1.9.  Security Descriptions with S/MIME

1214	   This keying mechanism is identical to Appendix A.1.8, except that
1215	   rather than protecting the signaling with TLS, the entire SDP is
1216	   encrypted with S/MIME.

1218	A.1.10.  SDP-DH (expired)

1220	   SDP Diffie-Hellman [I-D.baugher-mmusic-sdp-dh] exchanges Diffie-
1221	   Hellman messages in the signaling path to establish session keys.  To
1222	   protect against active man-in-the-middle attacks, the Diffie-Hellman
1223	   exchange needs to be protected with S/MIME, SIPS, or SIP Identity
1224	   [RFC4474] and SIP Connected Identity [RFC4916].

1226	   SDP-DH requires one message from offerer to answerer, and one message
1227	   from answerer to offerer (full round trip), and does not add
1228	   additional media path messages.

1230	A.1.11.  MIKEYv2 in SDP (expired)

1232	   MIKEYv2 [I-D.dondeti-msec-rtpsec-mikeyv2] adds mode negotiation to
1233	   MIKEYv1 and removes the time synchronization requirement.  It
1234	   therefore now takes 2 round-trips to complete.  In the first round
1235	   trip, the communicating parties learn each other's identities, agree
1236	   on a MIKEY mode, crypto algorithm, SRTP policy, and exchange nonces
1237	   for replay protection.  In the second round trip, they negotiate
1238	   unicast and/or group SRTP context for SRTP and/or SRTCP.

1240	   Furthermore, MIKEYv2 also defines an in-band negotiation mode as an
1241	   alternative to SDP (see Appendix A.3.3).

1243	A.2.  Media Path Keying Technique

1245	A.2.1.  ZRTP

1247	   ZRTP [I-D.zimmermann-avt-zrtp] does not exchange information in the
1248	   signaling path, although endpoints can exchange a hash of the ZRTP
1249	   Hello message with "a=zrtp-hash" in the initial Offer if it is sent
1250	   over an integrity-protected signaling channel.  This provides some
1251	   useful correlation between the signaling and media layers.  In ZRTP
1252	   the keys are exchanged entirely in the media path
1253	   using a Diffie-Hellman exchange.  The advantage to this mechanism is
1254	   that the signaling channel is used only for call setup and the media
1255	   channel is used to establish an encrypted channel -- much like
1256	   encryption devices on the PSTN.  ZRTP uses voice authentication of
1257	   its Diffie-Hellman exchange by having each person read digits or
1258	   words to the other person.  Subsequent sessions with the same ZRTP
1259	   endpoint can be authenticated using the stored hash of the previously
1260	   negotiated key rather than voice authentication.  ZRTP uses 4 media
1261	   path messages (Hello, Commit, DHPart1, and DHPart2) to establish the
1262	   SRTP key, and 3 media path confirmation messages.  These initial
1263	   messages are all sent as non-RTP packets.

1265	      Note that when ZRTP probing is used, unencrypted RTP can be
1266	      exchanged until the SRTP keys are established.

1268	A.3.  Signaling and Media Path Keying Techniques

1270	A.3.1.  EKT

1272	   EKT [I-D.mcgrew-srtp-ekt] relies on another SRTP key exchange
1273	   protocol, such as Security Descriptions or MIKEY, for bootstrapping.
1274	   In the initial phase, each member of a conference uses an SRTP key
1275	   exchange protocol to establish a common key encryption key (KEK).
1276	   Each member may use the KEK to securely transport its SRTP master key
1277	   and current SRTP rollover counter (ROC), via RTCP, to the other
1278	   participants in the session.

1280	   EKT requires the offerer to send some parameters (EKT_Cipher, KEK,
1281	   and security parameter index (SPI)) via the bootstrapping protocol
1282	   such as Security Descriptions or MIKEY.  Each answerer sends an SRTCP
1283	   message which contains the answerer's SRTP Master Key, rollover
1284	   counter, and the SRTP sequence number.  Rekeying is done by sending a
1285	   new SRTCP message.  For reliable transport, multiple RTCP messages
1286	   need to be sent.

1288	A.3.2.  DTLS-SRTP

1290	   DTLS-SRTP [I-D.ietf-avt-dtls-srtp] exchanges public key fingerprints
1291	   in SDP [I-D.fischl-sipping-media-dtls] and then establishes a DTLS
1292	   session over the media channel.  The endpoints use the DTLS handshake
1293	   to agree on crypto suites and establish SRTP session keys.  SRTP
1294	   packets are then exchanged between the endpoints.

1296	   DTLS-SRTP requires one message from offerer to answerer (half round
1297	   trip), and one message from the answerer to offerer (full round trip)
1298	   so the offerer can correlate the SDP answer with the answering
1299	   endpoint.  DTLS-SRTP uses 4 media path messages to establish the SRTP
1300	   key.
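
   A minimal Python sketch of this correlation follows: the fingerprint
   carried in SDP is a hash of the certificate, and the endpoint checks
   it against the certificate actually presented in the DTLS handshake.
   The function names are hypothetical, the certificate is treated as
   opaque DER bytes, and only SHA-1 and SHA-256 are handled; the DTLS
   handshake itself is not shown.

```python
import hashlib
import hmac

# Mapping between SDP hash labels and hashlib names (subset, for the sketch).
_LABELS = {"sha1": "SHA-1", "sha256": "SHA-256"}
_NAMES = {label: name for name, label in _LABELS.items()}


def fingerprint_attribute(cert_der: bytes, hash_name: str = "sha256") -> str:
    """Format a certificate hash as an SDP fingerprint attribute."""
    digest = hashlib.new(hash_name, cert_der).digest()
    hex_pairs = ":".join(f"{b:02X}" for b in digest)
    return f"a=fingerprint:{_LABELS[hash_name]} {hex_pairs}"


def certificate_matches(sdp_line: str, cert_der: bytes) -> bool:
    """Check the certificate seen in the media path against the SDP fingerprint."""
    label = sdp_line.split(":", 1)[1].split(" ", 1)[0]
    expected = fingerprint_attribute(cert_der, _NAMES[label.upper()])
    return hmac.compare_digest(expected.upper(), sdp_line.strip().upper())
```

   A mismatch indicates that the DTLS peer is not the endpoint that
   produced the SDP, so the session keys should not be used.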

1302	   This document assumes DTLS will use TLS_RSA_WITH_AES_128_CBC_SHA as
1303	   its cipher suite, which is the mandatory-to-implement cipher suite in
1304	   TLS [I-D.ietf-tls-rfc4346-bis].

1306	A.3.3.  MIKEYv2 Inband (expired)

1308	   As noted in Appendix A.1.11, MIKEYv2 also defines an in-band
1309	   negotiation mode as an alternative to SDP.  The draft does not yet
1310	   specify what in-band actually means (e.g., UDP, RTP, or RTCP).

1313	A.4.  Evaluation Criteria - SIP

1315	   This section considers how each keying mechanism interacts with SIP
1316	   features.

1318	A.4.1.  Secure Retargeting and Secure Forking

1320	   Retargeting and forking of signaling requests is described within
1321	   Section 4.2.  The following builds upon this description.

1323	   The following list compares the behavior of secure forking, answering
1324	   association, two-time pads, and secure retargeting for each keying
1325	   mechanism.

1327	      MIKEY-NULL  Secure Forking: No, all AORs see offerer's and
1328	         answerer's keys.  Answer is associated with media by the SSRC
1329	         in MIKEY.  Additionally, a two-time pad occurs if two branches
1330	         choose the same 32-bit SSRC and transmit SRTP packets.

1332	         Secure Retargeting: No, all targets see offerer's and
1333	         answerer's keys.  Suffers from retargeting identity problem.

1335	      MIKEY-PSK
1336	         Secure Forking: No, all AORs see offerer's and answerer's keys.
1337	         Answer is associated with media by the SSRC in MIKEY.  Note
1338	         that all AORs must share the same pre-shared key in order for
1339	         forking to work at all with MIKEY-PSK.  Additionally, a two-
1340	         time pad occurs if two branches choose the same 32-bit SSRC and
1341	         transmit SRTP packets.

1343	         Secure Retargeting: Not secure.  For retargeting to work, the
1344	         final target must possess the correct PSK.  While this is
1345	         likely in scenarios where the call is targeted to another
1346	         device belonging to the same user (forking), it is very
1347	         unlikely that other users will possess that PSK and be able to
1348	         successfully answer that call.

1350	      MIKEY-RSA
1351	         Secure Forking: No, all AORs see offerer's and answerer's keys.
1352	         Answer is associated with media by the SSRC in MIKEY.  Note
1353	         that all AORs must share the same private key in order for
1354	         forking to work at all with MIKEY-RSA.  Additionally, a two-
1355	         time pad occurs if two branches choose the same 32-bit SSRC and
1356	         transmit SRTP packets.

1358	         Secure Retargeting: No.

1360	      MIKEY-RSA-R
1361	         Secure Forking: Yes. Answer is associated with media by the
1362	         SSRC in MIKEY.

1364	         Secure Retargeting: Yes.

1366	      MIKEY-DHSIGN
1367	         Secure Forking: Yes, each forked endpoint negotiates unique
1368	         keys with the offerer for both directions.  Answer is
1369	         associated with media by the SSRC in MIKEY.

1371	         Secure Retargeting: Yes, each target negotiates unique keys
1372	         with the offerer for both directions.

1374	      MIKEYv2 in SDP
1375	         The behavior will depend on which mode is picked.

1377	      MIKEY-DHHMAC
1378	         Secure Forking: Yes, each forked endpoint negotiates unique
1379	         keys with the offerer for both directions.  Answer is
1380	         associated with media by the SSRC in MIKEY.

1382	         Secure Retargeting: Yes, each target negotiates unique keys
1383	         with the offerer for both directions.  Note that for the keys
1384	         to be meaningful, it would require the PSK to be the same for
1385	         all the potential intermediaries, which would only happen
1386	         within a single domain.

1388	      Security Descriptions with SIPS
1389	         Secure Forking: No.  Each forked endpoint sees the offerer's
1390	         key.  Answer is not associated with media.

1392	         Secure Retargeting: No.  Each target sees the offerer's key.

1394	      Security Descriptions with S/MIME
1395	         Secure Forking: No.  Each forked endpoint sees the offerer's
1396	         key.  Answer is not associated with media.

1398	         Secure Retargeting: No.  Each target sees the offerer's key.
1399	         Suffers from retargeting identity problem.

1401	      SDP-DH
1402	         Secure Forking: Yes. Each forked endpoint calculates a unique
1403	         SRTP key.  Answer is not associated with media.

1405	         Secure Retargeting: Yes. The final target calculates a unique
1406	         SRTP key.

1408	      ZRTP
1409	         Secure Forking: Yes. Each forked endpoint calculates a unique
1410	         SRTP key.  With the "a=zrtp-hash" attribute, the media can be
1411	         associated with an answer.

1413	         Secure Retargeting: Yes. The final target calculates a unique
1414	         SRTP key.

1416	      EKT
1417	         Secure Forking: Inherited from the bootstrapping mechanism (the
1418	         specific MIKEY mode or Security Descriptions).  Answer is
1419	         associated with media by the SPI in the EKT protocol.

1422	         Secure Retargeting: Inherited from the bootstrapping mechanism
1423	         (the specific MIKEY mode or Security Descriptions).

1425	      DTLS-SRTP
1426	         Secure Forking: Yes. Each forked endpoint calculates a unique
1427	         SRTP key.  Answer is associated with media by the certificate
1428	         fingerprint in signaling and certificate in the media path.

1430	         Secure Retargeting: Yes. The final target calculates a unique
1431	         SRTP key.

1433	      MIKEYv2 Inband
1434	         The behavior will depend on which mode is picked.

1436	A.4.2.  Clipping Media Before SDP Answer

1438	   Clipping media before receiving the signaling answer is described
1439	   within Section 4.1.  The following builds upon this description.

1441	   Furthermore, the problem of clipping gets compounded when forking is
1442	   used.  For example, if using a Diffie-Hellman keying technique with
1443	   security preconditions that forks to 20 endpoints, the call initiator
1444	   would get 20 provisional responses containing 20 signed Diffie-
1445	   Hellman half keys.  Calculating 20 DH secrets and validating
1446	   signatures can be a difficult task depending on the device
1447	   capabilities.

1449	   The following list compares the behavior of clipping before SDP
1450	   answer for each keying mechanism.

1452	      MIKEY-NULL
1453	         Not clipped.  The offerer provides the answerer's keys.

1455	      MIKEY-PSK
1456	         Not clipped.  The offerer provides the answerer's keys.

1458	      MIKEY-RSA
1459	         Not clipped.  The offerer provides the answerer's keys.

1461	      MIKEY-RSA-R
1462	         Clipped.  The answer contains the answerer's encryption key.

1464	      MIKEY-DHSIGN
1465	         Clipped.  The answer contains the answerer's Diffie-Hellman
1466	         response.

1468	      MIKEY-DHHMAC
1469	         Clipped.  The answer contains the answerer's Diffie-Hellman
1470	         response.

1472	      MIKEYv2 in SDP
1473	         The behavior will depend on which mode is picked.

1475	      Security Descriptions with SIPS
1476	         Clipped.  The answer contains the answerer's encryption key.

1478	      Security Descriptions with S/MIME
1479	         Clipped.  The answer contains the answerer's encryption key.

1481	      SDP-DH
1482	         Clipped.  The answer contains the answerer's Diffie-Hellman
1483	         response.

1485	      ZRTP
1486	         Not clipped because the session initially uses RTP.  While RTP
1487	         is flowing, both ends negotiate SRTP keys in the media path and
1488	         then switch to using SRTP.

1490	      EKT
1491	         Not clipped, as long as the first RTCP packet (containing the
1492	         answerer's key) is not lost in transit.  The answerer sends its
1493	         encryption key in RTCP, which arrives at the same time (or
1494	         before) the first SRTP packet encrypted with that key.

1496	            Note: RTCP needs to work, in the answerer-to-offerer
1497	            direction, before the offerer can decrypt SRTP media.

1499	      DTLS-SRTP
1500	         No clipping after the DTLS-SRTP handshake has completed.  SRTP
1501	         keys are exchanged in the media path.  The offerer needs to
1502	         wait for the SDP answer to ensure that the DTLS-SRTP handshake
1503	         was done with an authorized party.

1505	            If a middlebox interferes with the media path, there can be
1506	            clipping [I-D.ietf-mmusic-media-path-middleboxes].

1508	      MIKEYv2 Inband
1509	         Not clipped.  Keys are exchanged in the media path without
1510	         relying on the signaling path.

1512	A.4.3.  SSRC and ROC

1514	   In SRTP, a cryptographic context is defined as the SSRC, destination
1515	   network address, and destination transport port number.  In RTP, by
1516	   contrast, a flow is defined as the destination network address and
1517	   destination transport port number.  This results in a problem -- how
1518	   to communicate the SSRC so that the SSRC can be used for the
1519	   cryptographic context.

1521	   Three approaches have emerged for this communication.  One, used by
1522	   all MIKEY modes, is to communicate the SSRCs to the peer in the MIKEY
1523	   exchange.  Another, used by Security Descriptions, is to apply "late
1524	   binding" -- that is, any new packet containing a previously-unseen
1525	   SSRC (which arrives at the same destination network address and
1526	   destination transport port number) will create a new cryptographic
1527	   context.  A third approach, common amongst techniques with media-path
1528	   SRTP key establishment, is to require a handshake over that media
1529	   path before SRTP packets are sent.  MIKEY's approach changes RTP's
1530	   SSRC collision detection behavior by requiring RTP to pre-establish
1531	   the SSRC values for each session.
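
   The "late binding" behavior can be pictured with the following Python
   sketch, in which a receiver keeps one cryptographic context per
   (SSRC, destination address, destination port) and creates a new one
   the first time a previously-unseen SSRC arrives.  The class and field
   names are hypothetical and no packet processing is shown.

```python
from dataclasses import dataclass, field


@dataclass
class CryptoContext:
    ssrc: int
    roc: int = 0           # rollover counter starts at 0
    highest_seq: int = -1  # no packet seen yet


@dataclass
class LateBindingReceiver:
    contexts: dict = field(default_factory=dict)

    def context_for(self, ssrc: int, dst_addr: str, dst_port: int) -> CryptoContext:
        """Return the context for this flow and SSRC, creating it on first use."""
        key = (ssrc, dst_addr, dst_port)
        if key not in self.contexts:
            # Previously-unseen SSRC on this address and port: bind it now.
            self.contexts[key] = CryptoContext(ssrc)
        return self.contexts[key]
```

   With this approach no SSRC values need to be signaled in advance,
   which is why RTP's normal SSRC collision handling is unaffected.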

1533	   Another related issue is that SRTP introduces a rollover counter
1534	   (ROC), which records how many times the SRTP sequence number has
1535	   rolled over.  As the sequence number is used for SRTP's default
1536	   ciphers, it is important that all endpoints know the value of the
1537	   ROC.  The ROC starts at 0 at the beginning of a session.
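
   For illustration, SRTP [RFC3711] combines the ROC with the 16-bit
   sequence number into a packet index, i = 2^16 * ROC + SEQ.  The
   Python sketch below shows that calculation and how a sender's ROC
   advances when the sequence number wraps; the function names are
   hypothetical, and the receiver-side index estimation is not shown.

```python
def packet_index(roc: int, seq: int) -> int:
    """Return the SRTP packet index, i = 2^16 * ROC + SEQ."""
    if not 0 <= seq < 2**16:
        raise ValueError("sequence number must fit in 16 bits")
    return (roc << 16) | seq


def sender_advance(roc: int, seq: int):
    """Return (roc, seq) for the next packet a sender transmits."""
    seq = (seq + 1) % 2**16
    if seq == 0:
        # The sequence number wrapped, so the rollover counter increments.
        roc = (roc + 1) % 2**32
    return roc, seq
```

   An endpoint that joins late and assumes a ROC of 0 after a rollover
   has occurred would compute the wrong index and fail to process the
   packet.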

1539	   Some keying mechanisms cause a two-time pad to occur if two endpoints
1540	   of a forked call have an SSRC collision.
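
   The danger of a two-time pad can be sketched in Python as follows.
   The keystream function here is a toy stand-in (a hash of key, SSRC,
   and a block counter), not the SRTP cipher; it is an assumption chosen
   only so that, as in the scenario above, two senders sharing a key and
   an SSRC produce the same keystream.

```python
import hashlib


def keystream(key: bytes, ssrc: int, length: int) -> bytes:
    """Toy keystream that depends only on the key and SSRC (not SRTP's cipher)."""
    out = b""
    counter = 0
    while len(out) < length:
        block = key + ssrc.to_bytes(4, "big") + counter.to_bytes(4, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, ssrc: int, plaintext: bytes) -> bytes:
    """Encrypt by XORing the plaintext with the keystream."""
    return xor(plaintext, keystream(key, ssrc, len(plaintext)))
```

   When two branches encrypt different payloads with the same key and
   SSRC, XORing the two ciphertexts cancels the keystream and yields the
   XOR of the two plaintexts, which an eavesdropper can then analyze
   without knowing the key.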

1542	   Note: A proposal has been made to send the ROC value on every Nth
1543	   SRTP packet [RFC4771].  This proposal has not yet been incorporated
1544	   into this document.

1546	   The following list examines handling of SSRC and ROC:

1548	      MIKEY-NULL
1549	         Each endpoint indicates a set of SSRCs and the ROC for SRTP
1550	         packets it transmits.

1552	      MIKEY-PSK
1553	         Each endpoint indicates a set of SSRCs and the ROC for SRTP
1554	         packets it transmits.

1556	      MIKEY-RSA
1557	         Each endpoint indicates a set of SSRCs and the ROC for SRTP
1558	         packets it transmits.

1560	      MIKEY-RSA-R
1561	         Each endpoint indicates a set of SSRCs and the ROC for SRTP
1562	         packets it transmits.

1564	      MIKEY-DHSIGN
1565	         Each endpoint indicates a set of SSRCs and the ROC for SRTP
1566	         packets it transmits.

1568	      MIKEY-DHHMAC
1569	         Each endpoint indicates a set of SSRCs and the ROC for SRTP
1570	         packets it transmits.

1572	      MIKEYv2 in SDP
1573	         Each endpoint indicates a set of SSRCs and the ROC for SRTP
1574	         packets it transmits.

1576	      Security Descriptions with SIPS
1577	         Neither SSRC nor ROC are signaled.  SSRC 'late binding' is
1578	         used.

1580	      Security Descriptions with S/MIME
1581	         Neither SSRC nor ROC are signaled.  SSRC 'late binding' is
1582	         used.

1584	      SDP-DH
1585	         Neither SSRC nor ROC are signaled.  SSRC 'late binding' is
1586	         used.

1588	      ZRTP
1589	         Neither SSRC nor ROC are signaled.  SSRC 'late binding' is
1590	         used.

1592	      EKT
1593	         The SSRC of the SRTCP packet containing an EKT update
1594	         corresponds to the SRTP master key and other parameters within
1595	         that packet.

1597	      DTLS-SRTP
1598	         Neither SSRC nor ROC are signaled.  SSRC 'late binding' is
1599	         used.

1601	      MIKEYv2 Inband
1602	         Each endpoint indicates a set of SSRCs and the ROC for SRTP
1603	         packets it transmits.

1605	A.5.  Evaluation Criteria - Security

1607	   This section evaluates each keying mechanism on the basis of its
1608	   security properties.

1610	A.5.1.  Distribution and Validation of Persistent Public Keys and
1611	        Certificates

1613	   Using persistent public keys for confidentiality and authentication
1614	   can introduce requirements for two types of systems, often
1615	   implemented using certificates: (1) a system to distribute those
1616	   persistent public key certificates, and (2) a system for validating
1617	   those persistent public keys.  We refer to the former as a key
1618	   distribution system and the latter as an authentication
1619	   infrastructure.  In many cases, a monolithic public key
1620	   infrastructure (PKI) is used to fulfill both of these roles.
1621	   However, these functions can be provided by many other systems.  For
1622	   instance, key distribution may be accomplished by any public
1623	   repository of keys.  Any system in which the two endpoints have
1624	   access to trust anchors and intermediate CA certificates that can be
1625	   used to validate other endpoints' certificates (including a system of
1626	   self-signed certificates) can be used to support certificate
1627	   validation in the below schemes.

1629	   With real-time communications it is desirable to avoid fetching or
1630	   validating certificates that delay call setup.  Rather, it is
1631	   preferable to fetch or validate certificates in such a way that call
1632	   setup is not delayed.  For example, a certificate can be validated
1633	   while the phone is ringing or can be validated while ring-back tones
1634	   are being played or even while the called party is answering the
1635	   phone and saying "hello".  Even better is to avoid fetching or
1636	   validating persistent public keys at all.

1638	   SRTP key exchange mechanisms that require a particular authentication
1639	   infrastructure to operate (whether for distribution or validation)
1640	   are gated on the deployment of such an infrastructure available to
1641	   both endpoints.  This means that no media security is achievable
1642	   until such an infrastructure exists.  For SIP, something like sip-
1643	   certs [I-D.ietf-sip-certs] might be used to obtain the certificate of
1644	   a peer.

1646	      Note: Even if sip-certs [I-D.ietf-sip-certs] were deployed, the
1647	      retargeting problem (Appendix A.4.1) would still prevent
1648	      successful deployment of keying techniques which require the
1649	      offerer to obtain the actual target's public key.

1651	   The following list compares the requirements introduced by the use of
1652	   public-key cryptography in each keying mechanism, both for public key
1653	   distribution and for certificate validation.

1655	      MIKEY-NULL
1656	         Public-key cryptography is not used.

1658	      MIKEY-PSK
1659	         Public-key cryptography is not used.  Rather, all endpoints
1660	         must have some way to exchange per-endpoint or per-system pre-
1661	         shared keys.

1663	      MIKEY-RSA
1664	         The offerer obtains the intended answerer's public key before
1665	         initiating the call.  This public key is used to encrypt the
1666	         SRTP keys.  There is no defined mechanism for the offerer to
1667	         obtain the answerer's public key, although [I-D.ietf-sip-certs]
1668	         might be viable in the future.

1670	         The offer may also contain a certificate for the offerer, which
1671	         would require an authentication infrastructure in order to be
1672	         validated by the receiver.

1674	      MIKEY-RSA-R
1675	         The offer contains the offerer's certificate, and the answer
1676	         contains the answerer's certificate.  The answerer uses the
1677	         public key in the certificate to encrypt the SRTP keys that
1678	         will be used by the offerer and the answerer.  An
1679	         authentication infrastructure is necessary to validate the
1680	         certificates.

1682	      MIKEY-DHSIGN
1683	         An authentication infrastructure is used to authenticate the
1684	         public key that is included in the MIKEY message.

1686	      MIKEY-DHHMAC
1687	         Public-key cryptography is not used.  Rather, all endpoints
1688	         must have some way to exchange per-endpoint or per-system pre-
1689	         shared keys.

1691	      MIKEYv2 in SDP
1692	         The behavior will depend on which mode is picked.

1694	      Security Descriptions with SIPS
1695	         Public-key cryptography is not used.

1697	      Security Descriptions with S/MIME
1698	         Use of S/MIME requires that the endpoints be able to fetch and
1699	         validate certificates for each other.  The offerer must obtain
1700	         the intended target's certificate and encrypt the SDP offer
1701	         with the public key contained in the target's certificate.  The
1702	         answerer must obtain the offerer's certificate and encrypt the
1703	         SDP answer with the public key contained in the offerer's
1704	         certificate.

1706	      SDP-DH
1707	         Public-key cryptography is not used.

1709	      ZRTP
1710	         Public-key cryptography is used (Diffie-Hellman), but without
1711	         dependence on persistent public keys.  Thus, certificates are
1712	         not fetched or validated.

1714	      EKT
1715	         Public-key cryptography is not used by itself, but might be
1716	         used by the EKT bootstrapping keying mechanism (such as certain
1717	         MIKEY modes).

1719	      DTLS-SRTP
1720	         Remote party's certificate is sent in media path, and a
1721	         fingerprint of the same certificate is sent in the signaling
1722	         path.

1724	      MIKEYv2 Inband
1725	         The behavior will depend on which mode is picked.

1727	A.5.2.  Perfect Forward Secrecy

1729	   In the context of SRTP, Perfect Forward Secrecy is the property that
1730	   SRTP session keys that protected a previous session are not
1731	   compromised if the static keys belonging to the endpoints are
1732	   compromised.  That is, if someone were to record your encrypted
1733	   session content and later acquire either party's private key, that
1734	   encrypted session content would be safe from decryption if your key
1735	   exchange mechanism had perfect forward secrecy.

1737	   The following list describes how each key exchange mechanism provides
1738	   PFS.

1740	      MIKEY-NULL
1741	         Not applicable; MIKEY-NULL does not have a long-term secret.

1743	      MIKEY-PSK
1744	         No PFS.

1746	      MIKEY-RSA
1747	         No PFS.

1749	      MIKEY-RSA-R
1750	         No PFS.

1752	      MIKEY-DHSIGN
1753	         PFS is provided with the Diffie-Hellman exchange.

1755	      MIKEY-DHHMAC
1756	         PFS is provided with the Diffie-Hellman exchange.

1758	      MIKEYv2 in SDP
1759	         The behavior will depend on which mode is picked.

1761	      Security Descriptions with SIPS
1762	         Not applicable; Security Descriptions does not have a long-term
1763	         secret.

1765	      Security Descriptions with S/MIME
1766	         Not applicable; Security Descriptions does not have a long-term
1767	         secret.

1769	      SDP-DH
1770	         PFS is provided with the Diffie-Hellman exchange.

1772	      ZRTP
1773	         PFS is provided with the Diffie-Hellman exchange.

1775	      EKT
1776	         No PFS.

1778	      DTLS-SRTP
1779	         PFS is provided if the negotiated cipher suite uses ephemeral
1780	         keys (e.g., Diffie-Hellman (DHE_RSA [I-D.ietf-tls-rfc4346-bis])
1781	         or Elliptic Curve Diffie-Hellman [RFC4492]).

1783	      MIKEYv2 Inband
1784	         The behavior will depend on which mode is picked.

1786	A.5.3.  Best Effort Encryption

1788	   With best effort encryption, SRTP is used with endpoints that support
1789	   SRTP, otherwise RTP is used.

1791	   SIP needs a backwards-compatible best effort encryption in order for
1792	   SRTP to work successfully with SIP retargeting and forking when there
1793	   is a mix of forked or retargeted devices, some that support SRTP and
1794	   some that do not.

1796	      Consider the case of Bob, with a phone that only does RTP and a
1797	      voice mail system that supports SRTP and RTP.  If Alice calls Bob
1798	      with an SRTP offer, Bob's RTP-only phone will reject the media
1799	      stream (with an empty "m=" line) because Bob's phone doesn't
1800	      understand SRTP (RTP/SAVP).  Alice's phone will see this rejected
1801	      media stream and may terminate the entire call (BYE) and re-
1802	      initiate the call as RTP-only, or Alice's phone may decide to
1803	      continue with call setup with the SRTP-capable leg (the voice mail
1804	      system).  If Alice's phone decided to re-initiate the call as RTP-
1805	      only, and Bob doesn't answer his phone, Alice will then leave
1806	      voice mail using only RTP, rather than SRTP as expected.

1808	   Currently, several techniques are commonly considered as candidates
1809	   to provide opportunistic encryption:

1811	   multipart/alternative
1812	      [I-D.jennings-sipping-multipart] describes how to form a
1813	      multipart/alternative body part in SIP.  The significant issues
1814	      with this technique are (1) that multipart MIME is incompatible
1815	      with existing SIP proxies, firewalls, Session Border Controllers,
1816	      and endpoints and (2) when forking, the Heterogeneous Error
1817	      Response Forking Problem (HERFP) [RFC3326] causes problems if such
1818	      non-multipart-capable endpoints were involved in the forking.

1820	   session attribute
1821	      With this technique, the endpoints signal their desire to do SRTP
1822	      by signaling RTP (RTP/AVP), and using an attribute ("a=") in the
1823	      SDP.  This technique is entirely backwards compatible with non-
1824	      SRTP-aware endpoints, but doesn't use the RTP/SAVP protocol
1825	      registered by SRTP [RFC3711].

1827	   SDP Capability Negotiation
1828	      SDP Capability Negotiation
1829	      [I-D.ietf-mmusic-sdp-capability-negotiation] provides a backwards-
1830	      compatible mechanism to allow offering both SRTP and RTP in a
1831	      single offer.  This is the preferred technique.

1833	   Probing
1834	      With this technique, the endpoints first establish an RTP session
1835	      using RTP (RTP/AVP).  The endpoints send probe messages, over the
1836	      media path, to determine if the remote endpoint supports their
1837	      keying technique.  A disadvantage of probing is an active attacker
1838	      can interfere with probes, and until probing completes (and SRTP
1839	      is established) the media is in the clear.

1841	   The preferred technique, SDP Capability Negotiation
1842	   [I-D.ietf-mmusic-sdp-capability-negotiation], can be used with all
1843	   key exchange mechanisms.  What remains unique is ZRTP, which can also
1844	   accomplish its best effort encryption by probing (sending ZRTP
1845	   messages over the media path) or by session attribute (see "a=zrtp-
1846	   hash" in [I-D.zimmermann-avt-zrtp]).  Current implementations of ZRTP
1847	   use probing.
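
   The effect of offering both SRTP and RTP in a single offer can be
   pictured with the hedged Python sketch below, in which the answerer
   picks the first offered transport profile that it supports.  The
   function name and the list-based representation are assumptions for
   illustration; this is not the attribute syntax of SDP Capability
   Negotiation.

```python
from typing import Optional, Sequence, Set


def choose_profile(offered: Sequence[str], supported: Set[str]) -> Optional[str]:
    """Pick the first offered profile the answerer supports.

    The offer is assumed to list profiles in the offerer's preference order.
    """
    for profile in offered:
        if profile in supported:
            return profile
    return None  # no common profile: the media stream is rejected
```

   With an offer of RTP/SAVP followed by RTP/AVP, an SRTP-capable voice
   mail system would select RTP/SAVP while an RTP-only phone would
   select RTP/AVP, so neither forked leg needs to reject the stream.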

1849	A.5.4.  Upgrading Algorithms

1851	   It is necessary to allow upgrading SRTP encryption and hash
1852	   algorithms, as well as upgrading the cryptographic functions used for
1853	   the key exchange mechanism.  With SIP's offer/answer model, this can
1854	   be computationally expensive because the offer needs to contain all
1855	   combinations of the key exchange mechanisms (all MIKEY modes,
1856	   Security Descriptions) and all SRTP cryptographic suites (AES-128,
1857	   AES-256) and all SRTP cryptographic hash functions (SHA-1, SHA-256)
1858	   that the offerer supports.  In order to do this, the offerer has to
1859	   expend CPU resources to build an offer containing all of this
1860	   information, which becomes computationally prohibitive.

1862	   Thus, it is important to keep the offerer's CPU impact fixed so that
1863	   offering multiple new SRTP encryption and hash functions incurs no
1864	   additional expense.
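
   As a rough illustration of how the offer grows, the Python sketch
   below enumerates every combination of key exchange mechanism, cipher,
   and hash function that an offerer might have to include.  The
   function name and the particular lists are assumptions drawn from the
   examples in the text, not a statement of what any implementation
   offers.

```python
from itertools import product


def offer_combinations(key_exchanges, ciphers, hashes):
    """Return every (key exchange, cipher, hash) combination in an offer."""
    return list(product(key_exchanges, ciphers, hashes))


# Hypothetical capability lists, based on the examples named in the text.
KEY_EXCHANGES = ["MIKEY-RSA", "MIKEY-DHSIGN", "MIKEY-DHHMAC", "Security Descriptions"]
CIPHERS = ["AES-128", "AES-256"]
HASHES = ["SHA-1", "SHA-256"]

combinations = offer_combinations(KEY_EXCHANGES, CIPHERS, HASHES)
```

   The number of combinations is the product of the list lengths, so if
   a mechanism needs a public key or Diffie-Hellman operation for each
   one, adding a single new cipher or hash multiplies the offerer's
   work.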

1866	   The following list describes the CPU effort involved in using each
1867	   key exchange technique.

1869	      MIKEY-NULL
1870	         No significant computational expense.

1872	      MIKEY-PSK
1873	         No significant computational expense.

1875	      MIKEY-RSA
1876	         For each offered SRTP crypto suite, the offerer has to perform
1877	         an RSA operation to encrypt the TGK.

1879	      MIKEY-RSA-R
1880	         For each offered SRTP crypto suite, the offerer has to perform
1881	         a public key operation to sign the MIKEY message.

1883	      MIKEY-DHSIGN
1884	         For each offered SRTP crypto suite, the offerer has to perform
1885	         a Diffie-Hellman operation, and a public key operation to sign
1886	         the Diffie-Hellman output.

1888	      MIKEY-DHHMAC
1889	         For each offered SRTP crypto suite, the offerer has to perform
1890	         a Diffie-Hellman operation.

1892	      MIKEYv2 in SDP
1893	         The behavior will depend on which mode is picked.

1895	      Security Descriptions with SIPS
1896	         No significant computational expense.

1898	      Security Descriptions with S/MIME
1899	         S/MIME requires the offerer and the answerer to encrypt the SDP
1900	         with the other's public key, and to decrypt the received SDP
1901	         with their own private key.

1903	      SDP-DH
1904	         For each offered SRTP crypto suite, the offerer has to perform
1905	         a Diffie-Hellman operation.

1907	      ZRTP
1908	         The offerer has no additional computational expense at all, as
1909	         the offer either contains no information about ZRTP or contains
1910	         only "a=zrtp-hash".

1912	      EKT
1913	         The offerer's computational expense depends entirely on the EKT
1914	         bootstrapping mechanism selected (one or more MIKEY modes or
1915	         Security Descriptions).

1917	      DTLS-SRTP
1918	         The offerer has no additional computational expense at all, as
1919	         the offer contains only a fingerprint of the certificate that
1920	         will be presented in the DTLS exchange.

1922	      MIKEYv2 Inband
1923	         The behavior will depend on which mode is picked.
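
	   The list above can be summarized as a small lookup table.  The
	   sketch below is an illustration only: the table OPS_PER_SUITE and
	   the function offerer_public_key_ops are hypothetical names, "no
	   significant expense" is modeled as zero public key operations, and
	   all operations are counted equally even though their real costs
	   differ.

```python
# Public key operations the offerer performs per offered SRTP crypto suite,
# as described in the list above. Techniques whose cost depends on the mode
# or bootstrapping mechanism chosen (MIKEYv2, EKT) are omitted, as is
# Security Descriptions with S/MIME, whose cost the text does not tie to
# the number of crypto suites.
OPS_PER_SUITE = {
    "MIKEY-NULL": 0,
    "MIKEY-PSK": 0,
    "MIKEY-RSA": 1,      # one RSA encryption of the TGK
    "MIKEY-RSA-R": 1,    # one signature over the MIKEY message
    "MIKEY-DHSIGN": 2,   # one Diffie-Hellman plus one signature
    "MIKEY-DHHMAC": 1,   # one Diffie-Hellman
    "Security Descriptions with SIPS": 0,
    "SDP-DH": 1,         # one Diffie-Hellman
    "ZRTP": 0,
    "DTLS-SRTP": 0,
}


def offerer_public_key_ops(technique, suites_offered):
    """Estimate public key operations needed to build one offer."""
    if suites_offered < 0:
        raise ValueError("suites_offered must be non-negative")
    return OPS_PER_SUITE[technique] * suites_offered


for name in ("MIKEY-DHSIGN", "DTLS-SRTP"):
    print(name, offerer_public_key_ops(name, suites_offered=3))
```

	   Techniques with a zero entry keep the offerer's cost fixed no
	   matter how many crypto suites are offered.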

1925	Appendix B.  Out-of-Scope

1927	   The compromise of an endpoint that has access to decrypted media
1928	   (e.g., SIP user agent, transcoder, recorder) is out of scope of this
1929	   document.  Such a compromise might be via privilege escalation,
1930	   installation of a virus or trojan horse, or similar attacks.

1932	B.1.  Shared Key Conferencing

1934	   The consensus on the RTPSEC mailing list was to concentrate on
1935	   unicast, point-to-point sessions.  Thus, there are no requirements
1936	   related to shared key conferencing.  This section is retained for
1937	   informational purposes.

1939	   Large audio and video conference bridges scale most efficiently by
1940	   encrypting the current speaker's media once and distributing that
1941	   stream to the conference attendees.  Typically,
1942	   inactive participants receive the same streams -- they hear (or see)
1943	   the active speaker(s), and the active speakers receive distinct
1944	   streams that don't include themselves.  In order to maintain
1945	   confidentiality of such conferences where listeners share a common
1946	   key, all listeners must be rekeyed when a listener joins or leaves a
1947	   conference.
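
	   The rekeying rule can be sketched as follows.  This is a toy
	   model under stated assumptions: the class SharedKeyConference, its
	   methods, and the 16-byte key length are hypothetical, and no key
	   distribution protocol is shown.

```python
import secrets


class SharedKeyConference:
    """Toy model of a conference whose listeners share one key."""

    KEY_LENGTH = 16  # bytes; an assumed size, not taken from the text

    def __init__(self):
        self.listeners = set()
        self.key = secrets.token_bytes(self.KEY_LENGTH)
        self.rekey_count = 0

    def _rekey(self):
        # A fresh key keeps a departed listener from decrypting later media
        # and a new listener from decrypting earlier media.
        self.key = secrets.token_bytes(self.KEY_LENGTH)
        self.rekey_count += 1

    def join(self, listener):
        self.listeners.add(listener)
        self._rekey()

    def leave(self, listener):
        self.listeners.discard(listener)
        self._rekey()


conference = SharedKeyConference()
conference.join("alice")
conference.join("bob")
conference.leave("alice")
print(conference.rekey_count)  # 3: one rekey per membership change
```

	   Every membership change triggers a rekey of all remaining
	   listeners, so the cost of this approach grows with how often
	   listeners come and go.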

1949	   An important use case for mixers/translators is a conference bridge:

1951	                                         +----+
1952	                             A --- 1 --->|    |
1953	                               <-- 2 ----| M  |
1954	                                         | I  |
1955	                             B --- 3 --->| X  |
1956	                               <-- 4 ----| E  |
1957	                                         | R  |
1958	                             C --- 5 --->|    |
1959	                               <-- 6 ----|    |
1960	                                         +----+

1962	                       Figure 3: Centralized Keying

1964	   In the figure above, 1, 3, and 5 are RTP media contributions from
1965	   Alice, Bob, and Carol, and 2, 4, and 6 are the RTP flows to those
1966	   devices carrying the 'mixed' media.

1968	   Several scenarios are possible:

1970	   a.  Multiple inbound sessions: 1, 3, and 5 are distinct RTP sessions,

1972	   b.  Multiple outbound sessions: 2, 4, and 6 are distinct RTP
1973	       sessions,

1975	   c.  Single inbound session: 1, 3, and 5 are just different sources
1976	       within the same RTP session,

1978	   d.  Single outbound session: 2, 4, and 6 are different flows of the
1979	       same (multi-unicast) RTP session.

1981	   If there are multiple inbound sessions and multiple outbound sessions
1982	   (scenarios a and b), then every keying mechanism behaves as if the
1983	   mixer were an end point and can set up a point-to-point secure
1984	   session between the participant and the mixer.  This is the simplest
1985	   situation, but is computationally wasteful, since SRTP processing has
1986	   to be done independently for each participant.  The use of multiple
1987	   inbound sessions (scenario a) doesn't waste computational resources,
1988	   though it does consume additional cryptographic context on the mixer
1989	   for each participant and has the advantage of data origin
1990	   authentication.
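
	   The cost difference between scenario b and scenario d can be
	   sketched with a simple count.  The function name is hypothetical,
	   and the model assumes, as a simplification, that every participant
	   receives the same mixed packet.

```python
def srtp_encryptions_per_mixed_packet(participants, single_outbound_session):
    """Count SRTP encryptions the mixer performs for one mixed packet.

    Simplification: every participant receives the same mixed packet.
    """
    if participants < 0:
        raise ValueError("participants must be non-negative")
    if participants == 0:
        return 0
    # Scenario d: one shared outbound session, so the packet is encrypted once.
    # Scenario b: one session per participant, so it is encrypted once each.
    return 1 if single_outbound_session else participants


for n in (3, 30, 300):
    print(n,
          srtp_encryptions_per_mixed_packet(n, single_outbound_session=False),
          srtp_encryptions_per_mixed_packet(n, single_outbound_session=True))
```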

1992	   To support a single outbound session (scenario d), the mixer has to
1993	   dictate its encryption key to the participants.  Some keying
1994	   mechanisms allow the transmitter to determine its own key, and others
1995	   allow the offerer to determine the key for the offerer and answerer.
1996	   Depending on how the call is established, the offerer might be a
1997	   participant (such as a participant dialing into a conference bridge)
1998	   or the offerer might be the mixer (such as a conference bridge
1999	   calling a participant).  The use of offerless INVITEs may help some
2000	   keying mechanisms reverse the role of offerer/answerer.  A
2001	   difficulty, however, is knowing a priori if the role should be
2002	   reversed for a particular call.  The significant advantage of a
2003	   single outbound session is that the number of SRTP encryption
2004	   operations remains constant even as the number of participants
2005	   increases.  However, a disadvantage is that data origin
2006	   authentication is lost, allowing any participant to spoof the sender
2007	   (because all participants know the sender's SRTP key).
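
	   The loss of data origin authentication can be demonstrated with
	   a shared authentication key.  The sketch below uses HMAC-SHA1 from
	   the Python standard library as a stand-in; it is not real SRTP
	   packet processing, and the helper names and packet contents are
	   hypothetical.

```python
import hashlib
import hmac
import secrets

# In a single outbound session, the mixer's key is known to all participants.
shared_auth_key = secrets.token_bytes(20)


def make_tag(key, packet):
    """Compute an authentication tag over the packet."""
    return hmac.new(key, packet, hashlib.sha1).digest()


def verify(key, packet, received_tag):
    """Check a tag the way any receiver holding the key would."""
    return hmac.compare_digest(make_tag(key, packet), received_tag)


# A packet legitimately sent by the mixer.
mixer_packet = b"mixed media from the mixer"
mixer_tag = make_tag(shared_auth_key, mixer_packet)

# Carol holds the same key, so she can forge a packet that verifies.
forged_packet = b"media forged by Carol"
forged_tag = make_tag(shared_auth_key, forged_packet)

print(verify(shared_auth_key, mixer_packet, mixer_tag))    # True
print(verify(shared_auth_key, forged_packet, forged_tag))  # True
```

	   Both packets verify, so a receiver cannot tell from the tag
	   alone which holder of the shared key produced a given packet.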

2009	Appendix C.  Requirement renumbering in -02

2011	   [[RFC Editor: Please delete this section prior to publication.]]

2013	   Previous versions of this document used requirement numbers, which
2014	   were changed to mnemonics as follows:

2016	   R1    R-FORK-RETARGET

2018	   R2    R-BEST-SECURE

2020	   R3    R-DISTINCT

2022	   R4    R-REUSE; changed from 'MAY' to 'protocol MUST support, and
2023	         SHOULD implement'

2025	   R5    R-AVOID-CLIPPING

2027	   R6    R-PASS-MEDIA

2029	   R7    R-PASS-SIG

2031	   R8    R-PFS

2033	   R9    R-COMPUTE

2035	   R10   R-RTP-VALID

2037	   R11   (folded into R4; was reuse previous session)

2039	   R12   R-CERTS

2041	   R13   R-FIPS

2043	   R14   R-ASSOC

2045	   R15   R-ALLOW-RTP

2047	   R16   R-DOS

2049	   R17   R-SIG-MEDIA

2051	   R18   R-EXISTING

2053	   R19   R-AGILITY

2055	   R20   R-DOWNGRADE

2057	   R21   R-NEGOTIATE

2059	   R23   R-OTHER-SIGNALING
2060	   R23   R-RECORDING (R23 was duplicated in previous versions of the
2061	         document)

2063	   R24   (deleted; was lawful intercept)

2065	   R25   R-TRANSCODER

2067	   R26   R-PSTN

2069	   R27   R-ID-BINDING

2071	   R28   R-ACT-ACT

2073	Authors' Addresses

2075	   Dan Wing (editor)
2076	   Cisco Systems, Inc.
2077	   170 West Tasman Drive
2078	   San Jose, CA  95134
2079	   USA

2081	   Email: dwing@cisco.com

2083	   Steffen Fries
2084	   Siemens AG
2085	   Otto-Hahn-Ring 6
2086	   Munich, Bavaria  81739
2087	   Germany

2089	   Email: steffen.fries@siemens.com

2091	   Hannes Tschofenig
2092	   Nokia Siemens Networks
2093	   Otto-Hahn-Ring 6
2094	   Munich, Bavaria  81739
2095	   Germany

2097	   Email: Hannes.Tschofenig@nsn.com
2098	   URI:   http://www.tschofenig.priv.at
2099	   Francois Audet
2100	   Nortel
2101	   4655 Great America Parkway
2102	   Santa Clara, CA  95054
2103	   USA

2105	   Email: audet@nortel.com

2107	Full Copyright Statement

2109	   Copyright (C) The IETF Trust (2008).

2111	   This document is subject to the rights, licenses and restrictions
2112	   contained in BCP 78, and except as set forth therein, the authors
2113	   retain all their rights.

2115	   This document and the information contained herein are provided on an
2116	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
2117	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND
2118	   THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS
2119	   OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
2120	   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
2121	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

2123	Intellectual Property

2125	   The IETF takes no position regarding the validity or scope of any
2126	   Intellectual Property Rights or other rights that might be claimed to
2127	   pertain to the implementation or use of the technology described in
2128	   this document or the extent to which any license under such rights
2129	   might or might not be available; nor does it represent that it has
2130	   made any independent effort to identify any such rights.  Information
2131	   on the procedures with respect to rights in RFC documents can be
2132	   found in BCP 78 and BCP 79.

2134	   Copies of IPR disclosures made to the IETF Secretariat and any
2135	   assurances of licenses to be made available, or the result of an
2136	   attempt made to obtain a general license or permission for the use of
2137	   such proprietary rights by implementers or users of this
2138	   specification can be obtained from the IETF on-line IPR repository at
2139	   http://www.ietf.org/ipr.

2141	   The IETF invites any interested party to bring to its attention any
2142	   copyrights, patents or patent applications, or other proprietary
2143	   rights that may cover technology that may be required to implement
2144	   this standard.  Please address the information to the IETF at
2145	   ietf-ipr@ietf.org.

2147	Acknowledgment

2149	   This document was produced using xml2rfc v1.33 (of
2150	   http://xml.resource.org/) from a source in RFC-2629 XML format.