idnits 2.17.1 draft-ietf-sip-media-security-requirements-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 20. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 2122. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2133. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2140. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2146. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year == Line 948 has weird spacing: '...ication along...' == Line 983 has weird spacing: '...RFP) in the S...' -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) 
-- The document date (February 24, 2008) is 5899 days in the past. Is this intentional? Checking references for intended status: Informational ---------------------------------------------------------------------------- == Outdated reference: A later version (-07) exists of draft-ietf-avt-dtls-srtp-01 == Outdated reference: A later version (-07) exists of draft-ietf-mmusic-media-path-middleboxes-00 == Outdated reference: A later version (-13) exists of draft-ietf-mmusic-sdp-capability-negotiation-08 == Outdated reference: A later version (-09) exists of draft-ietf-msec-mikey-applicability-08 == Outdated reference: A later version (-15) exists of draft-ietf-sip-certs-05 == Outdated reference: A later version (-06) exists of draft-mcgrew-srtp-ekt-03 == Outdated reference: A later version (-04) exists of draft-wing-sipping-srtp-key-02 == Outdated reference: A later version (-22) exists of draft-zimmermann-avt-zrtp-04 -- Obsolete informational reference (is this intentional?): RFC 3388 (Obsoleted by RFC 5888) -- Obsolete informational reference (is this intentional?): RFC 4346 (Obsoleted by RFC 5246) -- Obsolete informational reference (is this intentional?): RFC 4474 (Obsoleted by RFC 8224) -- Obsolete informational reference (is this intentional?): RFC 4492 (Obsoleted by RFC 8422) Summary: 1 error (**), 0 flaws (~~), 11 warnings (==), 11 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 SIP Working Group D. Wing, Ed. 3 Internet-Draft Cisco 4 Intended status: Informational S. Fries 5 Expires: August 27, 2008 Siemens AG 6 H. Tschofenig 7 Nokia Siemens Networks 8 F. 
Audet 9 Nortel 10 February 24, 2008 12 Requirements and Analysis of Media Security Management Protocols 13 draft-ietf-sip-media-security-requirements-03 15 Status of this Memo 17 By submitting this Internet-Draft, each author represents that any 18 applicable patent or other IPR claims of which he or she is aware 19 have been or will be disclosed, and any of which he or she becomes 20 aware will be disclosed, in accordance with Section 6 of BCP 79. 22 Internet-Drafts are working documents of the Internet Engineering 23 Task Force (IETF), its areas, and its working groups. Note that 24 other groups may also distribute working documents as Internet- 25 Drafts. 27 Internet-Drafts are draft documents valid for a maximum of six months 28 and may be updated, replaced, or obsoleted by other documents at any 29 time. It is inappropriate to use Internet-Drafts as reference 30 material or to cite them other than as "work in progress." 32 The list of current Internet-Drafts can be accessed at 33 http://www.ietf.org/ietf/1id-abstracts.txt. 35 The list of Internet-Draft Shadow Directories can be accessed at 36 http://www.ietf.org/shadow.html. 38 This Internet-Draft will expire on August 27, 2008. 40 Abstract 42 This document describes requirements for a protocol to negotiate a 43 security context for SIP-signaled SRTP media. In addition to the 44 natural security requirements, this negotiation protocol must 45 interoperate well with SIP in certain ways. A number of proposals 46 have been published and a summary of these proposals is in the 47 appendix of this document. 49 Table of Contents 51 1. Introduction 52 2. Terminology 53 3. Attack Scenarios 54 4. Call Scenarios 55 4.1. Clipping Media Before Signaling Answer 56 4.2. Retargeting and Forking
57 4.3. Shared Key Conferencing 58 4.4. Recording 59 4.5. PSTN gateway 60 4.6. Call Setup Performance 61 4.7. Transcoding 62 5. Requirements 63 5.1. Key Management Protocol Requirements 64 5.2. Security Requirements 65 5.3. Requirements Outside of the Key Management Protocol 66 6. Security Considerations 67 7. IANA Considerations 68 8. Acknowledgements 69 9. References 70 9.1. Normative References 71 9.2. Informative References 72 Appendix A. Overview and Evaluation of Existing Keying 73 Mechanisms 74 A.1. Signaling Path Keying Techniques 75 A.1.1. MIKEY-NULL 76 A.1.2. MIKEY-PSK 77 A.1.3. MIKEY-RSA 78 A.1.4. MIKEY-RSA-R 79 A.1.5. MIKEY-DHSIGN 80 A.1.6. MIKEY-DHHMAC 81 A.1.7. MIKEY-ECIES and MIKEY-ECMQV (MIKEY-ECC) 82 A.1.8. Security Descriptions with SIPS 83 A.1.9. Security Descriptions with S/MIME 84 A.1.10. SDP-DH (expired) 85 A.1.11. MIKEYv2 in SDP (expired)
86 A.1.12. Evaluation Criteria - SIP 87 A.1.13. Evaluation Criteria - Security 88 A.2. Media Path Keying Technique 89 A.2.1. ZRTP 90 A.3. Signaling and Media Path Keying Techniques 91 A.3.1. EKT 92 A.3.2. DTLS-SRTP 93 A.3.3. MIKEYv2 Inband (expired) 94 Appendix B. Out-of-Scope 95 Appendix C. Requirement renumbering in -02 96 Authors' Addresses 97 Intellectual Property and Copyright Statements 99 1. Introduction 101 The work on media security started when the Session Initiation 102 Protocol (SIP) was still in its infancy. With the increased SIP 103 deployment and the availability of new SIP extensions and related 104 protocols, the need for end-to-end security was re-evaluated. The 105 procedure of re-evaluating prior protocol work and design decisions 106 is not an uncommon strategy and, to some extent, considered necessary 107 to ensure that the developed protocols indeed meet the previously 108 envisioned needs for the users on the Internet. 110 This document summarizes media security requirements, i.e., 111 requirements for mechanisms that negotiate security context such as 112 cryptographic keys and parameters for SRTP. 114 The organization of this document is as follows: Section 2 introduces 115 terminology, Section 3 describes various attack scenarios against the 116 signaling path and media path, Section 4 provides an overview about 117 possible call scenarios, Section 5 lists requirements for media 118 security.
The main part of the document concludes with the security 119 considerations Section 6, IANA considerations Section 7 and an 120 acknowledgement section in Section 8. Appendix A lists and compares 121 available solution proposals. The following Appendix A.1.12 compares 122 the different approaches regarding their suitability for the SIP 123 signaling scenarios described in Appendix A, while Appendix A.1.13 124 provides a comparison regarding security aspects. Appendix B lists 125 non-goals for this document. 127 2. Terminology 129 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 130 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 131 document are to be interpreted as described in [RFC2119], with the 132 important qualification that, unless otherwise stated, these terms 133 apply to the design of the media security key management protocol, 134 not its implementation or application. 136 Additionally, the following items are used in this document: 138 AOR (Address-of-Record): A SIP or SIPS URI that points to a domain 139 with a location service that can map the URI to another URI where 140 the user might be available. Typically, the location service is 141 populated through registrations. An AOR is frequently thought of 142 as the "public address" of the user. 144 SSRC: The 32-bit value that defines the synchronization source, used 145 in RTP. These are generally unique, but collisions can occur. 147 two-time pad: The use of the same key and the same keystream to 148 encrypt different data. For SRTP, a two-time pad occurs if two 149 senders are using the same key and the same RTP SSRC value. 151 Perfect Forward Secrecy (PFS): The property that disclosure of the 152 long-term secret keying material that is used to derive an agreed 153 ephemeral key does not compromise the secrecy of agreed keys from 154 earlier runs. 
156 active adversary: An active adversary is able to alter data 157 communication to affect its operation (see also [RFC4949]). 159 passive adversary: A passive adversary is able to learn information 160 from data communication, but not alter that data communication 161 (see also [RFC4949]). 163 signaling path: The signaling path is the route taken by SIP 164 signaling messages transmitted between the calling and called user 165 agents. This can be either direct signaling between the calling 166 and called user agents or, more commonly, signaling through the SIP proxy 167 servers that were involved in the call setup. 169 media path: The media path is the route taken by media packets 170 exchanged by the endpoints. In the simplest case, the endpoints 171 exchange media directly, and the "media path" is defined by a 172 quartet of IP addresses and TCP/UDP ports, along with an IP route. 173 In other cases, this path may include RTP relays, mixers, 174 transcoders, session border controllers, NATs, or media gateways. 176 3. Attack Scenarios 178 The discussion in this section relates to requirements R-PASS-MEDIA, 179 R-PASS-SIG, R-ASSOC, R-SIG-MEDIA, R-ACT-ACT, and R-ID-BINDING. 181 This document classifies adversaries according to their access and 182 their capabilities. An adversary might have access: 184 1. only to the media path, 186 2. only to the signaling path, 188 3. to the media path and to the signaling path. 190 An attacker that can solely be located along the signaling path, and 191 does not have access to media (item 2), is not considered in this 192 document. 194 There are two different types of adversaries, active and passive. An 195 active adversary may need to be active with regard to the key 196 exchange relevant information traveling along the media path or 197 traveling along the signaling path. 199 Based on their robustness against the adversary capabilities 200 described above, we can group security mechanisms using the following 201 labels.
This list is generally ordered from easiest to compromise 202 (at the top) to more difficult to compromise: 204 +---------------+---------+--------------------------------------+ 205 | SIP signaling | media | abbreviation | 206 +---------------+---------+--------------------------------------+ 207 | none | passive | no-signaling-passive-media | 208 | none | active | no-signaling-active-media | 209 | passive | passive | passive-signaling-passive-media | 210 | passive | active | passive-signaling-active-media | 211 | active | passive | active-signaling-passive-media | 212 | active | active | active-signaling-active-media | 213 | active | active | active-signaling-active-media-detect | 214 +---------------+---------+--------------------------------------+ 216 no-signaling-passive-media: 217 Access to only the media path is sufficient to reveal the content 218 of the media traffic. 220 passive-signaling-passive-media: 221 Passive attack on the signaling and passive attack on the media 222 path is necessary to reveal the content of the media traffic. 224 passive-signaling-active-media: 225 Passive attack on the signaling and active attack on the media 226 path is necessary to reveal the content of the media traffic. 228 active-signaling-passive-media: 229 Active attack on the signaling path and passive attack on the 230 media path is necessary to reveal the content of the media 231 traffic. 233 no-signaling-active-media: 234 Active attack on the media path is sufficient to reveal the 235 content of the media traffic. 237 active-signaling-active-media: 238 Active attack on both the signaling path and the media path is 239 necessary to reveal the content of the media traffic. 241 active-signaling-active-media-detect: 242 Active attack on both signaling and media path is necessary to 243 reveal the content of the media traffic (as with active-signaling- 244 active-media), and the attack is detectable by protocol messages 245 exchanged between the end points. 
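The labels defined above follow a regular naming pattern, which can be sketched in a few lines of Python. This is only an illustration: the names Access and category are hypothetical and do not appear in this document, and the sketch merely reproduces the seven rows of the table, rejecting combinations the table does not list.

```python
from enum import Enum


class Access(Enum):
    """Adversary capability required on one path (hypothetical helper type)."""
    NONE = "no"
    PASSIVE = "passive"
    ACTIVE = "active"


def category(signaling: Access, media: Access, detectable: bool = False) -> str:
    """Build the label for the attack needed to reveal the media content."""
    if media is Access.NONE:
        # Every row of the table requires at least passive media access.
        raise ValueError("every category in the table involves the media path")
    if detectable and not (signaling is Access.ACTIVE and media is Access.ACTIVE):
        # The table defines "-detect" only for the active/active row.
        raise ValueError("'-detect' is only defined for active/active")
    label = f"{signaling.value}-signaling-{media.value}-media"
    return label + "-detect" if detectable else label
```

For instance, category(Access.NONE, Access.PASSIVE) yields the label that applies when access to the media path alone is enough to reveal the media.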
247 For example, unencrypted RTP is vulnerable to no-signaling-passive- 248 media. 250 As another example, Security Descriptions [RFC4568], when protected 251 by TLS (as it is commonly implemented and deployed), belongs in the 252 passive-signaling-passive-media category since the adversary needs to 253 learn the Security Descriptions key by seeing the SIP signaling 254 message at a SIP proxy (assuming that the adversary is in control of 255 the SIP proxy). The media traffic can be decrypted using that 256 learned key. 258 As another example, DTLS-SRTP falls into the active-signaling-active- 259 media category when DTLS-SRTP is used with a public key based 260 ciphersuite with self-signed certificates and without SIP-Identity 261 [RFC4474]. An adversary would have to modify the fingerprint that is 262 sent along the signaling path and subsequently modify the 263 certificates carried in the DTLS handshake that travel along the 264 media path. If DTLS-SRTP is used with both SIP Identity [RFC4474] 265 and SIP Connected Identity [RFC4916], the RFC4474 signature protects 266 both the offer and the answer, and such a system would then belong to 267 the active-signaling-active-media-detect category (provided, of 268 course, the signaling path to the RFC4474 authenticator and verifier 269 is secured as per RFC4474 and the RFC4474 authenticator and verifier 270 are behaving as per RFC4474). 272 The above discussion of DTLS-SRTP demonstrates how a single security 273 protocol can be in different classes depending on the mode in which 274 it is operated. Other protocols can achieve a similar effect by adding 275 functions outside of the on-the-wire key management protocol itself.
276 Although it may be appropriate to deploy lower-classed mechanisms in 277 some cases, the ultimate security requirement for a media security 278 negotiation protocol is that it have a mode of operation available in 279 which it falls into the active-signaling-active-media-detect category, which provides protection against 280 passive and active attacks and provides detection of such attacks. 281 That is, there must be a way to use the protocol so that an active 282 attack is required against both the signaling and media paths, and so 283 that such attacks are detectable by the endpoints. 285 4. Call Scenarios 287 The following subsections describe call scenarios that pose the most 288 challenge to the key management system for media data in cooperation 289 with SIP signaling. 291 4.1. Clipping Media Before Signaling Answer 293 The discussion in this section relates to requirement R-AVOID- 294 CLIPPING. 296 Per the SDP Offer/Answer Model [RFC3264], 298 "Once the offerer has sent the offer, it MUST be prepared to 299 receive media for any recvonly streams described by that offer. 300 It MUST be prepared to send and receive media for any sendrecv 301 streams in the offer, and send media for any sendonly streams in 302 the offer (of course, it cannot actually send until the peer 303 provides an answer with the needed address and port information)." 305 To meet this requirement with SRTP, the offerer needs to know the 306 SRTP key for arriving media. If either endpoint receives encrypted 307 media before it has access to the associated SRTP key, it cannot play 308 the media -- causing clipping. 310 For key exchange mechanisms that send the answerer's key in SDP, a 311 SIP provisional response [RFC3261], such as 183 (session progress), 312 is useful.
However, the 183 messages are not reliable unless both 313 the calling and called end point support PRACK [RFC3262], use TCP 314 across all SIP proxies, implement Security Preconditions [RFC5027], 315 or both ends implement ICE [I-D.ietf-mmusic-ice] and the answerer 316 implements the reliable provisional response mechanism described in 317 ICE. Unfortunately, there is not wide deployment of any of these 318 techniques and there is industry reluctance to require these 319 techniques to avoid the problems described in this section. 321 Note that the receipt of an SDP answer is not always sufficient to 322 allow media to be played to the offerer. Sometimes, the offerer must 323 send media in order to open up firewall holes or NAT bindings before 324 media can be received. In this case, even a solution that makes the 325 key available before the SDP answer arrives will not help. 327 Fixes to early media (i.e., the media that arrives at the SDP offerer 328 before the SDP answer arrives) might make this requirement 329 obsolete, but at the time of writing no progress has been 330 made. 332 4.2. Retargeting and Forking 334 The discussion in this section relates to requirements R-FORK- 335 RETARGET, R-DISTINCT, R-HERFP, and R-BEST-SECURE. 337 In SIP, a request sent to a specific AOR but delivered to a different 338 AOR is called a "retarget". A typical scenario is a "call 339 forwarding" feature. In Figure 1 Alice sends an INVITE in step 1 340 that is sent to Bob in step 2. Bob responds with a redirect (SIP 341 response code 3xx) pointing to Carol in step 3. This redirect 342 typically does not propagate back to Alice but only goes to a proxy 343 (i.e., the retargeting proxy) that sends the original INVITE to Carol 344 in step 4.
346 +-----+ 347 |Alice| 348 +--+--+ 349 | 350 | INVITE (1) 351 V 352 +----+----+ 353 | proxy | 354 ++-+-----++ 355 | ^ | 356 INVITE (2) | | | INVITE (4) 357 & redirect (3) | | | 358 V | V 359 ++-++ ++----+ 360 |Bob| |Carol| 361 +---+ +-----+ 363 Figure 1: Retargeting 365 Using retargeting might lead to situations where the UAC does not 366 know where its request will be going. This might not immediately 367 seem like a serious problem; after all, when one places a telephone 368 call on the PSTN, one never really knows if it will be forwarded to a 369 different number, who will pick up the line when it rings, and so on. 370 However, when considering SIP mechanisms for authenticating the 371 called party, this function can also make it difficult to 372 differentiate an intermediary that is behaving legitimately from an 373 attacker. From this perspective, the main problems with retargeting 374 are: 376 Not detectable by the caller: The originating user agent has no 377 means of anticipating that the condition will arise, nor any means 378 of determining that it has occurred until the call has already 379 been set up. 381 Not preventable by the caller: There is no existing mechanism that 382 might be employed by the originating user agent in order to 383 guarantee that the call will not be re-targeted. 385 The mechanism used by SIP for identifying the calling party is SIP 386 Identity [RFC4474]. However, due to the nature of retargeting, SIP 387 Identity can only identify the calling party (that is, the party that 388 initiated the SIP request). Some key exchange mechanisms predate SIP 389 Identity and include their own identity mechanism (e.g., MIKEY). 390 However, those built-in identity mechanisms also suffer from the SIP 391 retargeting problem.
While Connected Identity [RFC4916] allows 392 positive identification of the called party, the primary difficulty 393 still remains that the calling party does not know if a mismatched 394 called party is legitimate (i.e., due to authorized retargeting) or 395 illegitimate (i.e., due to unauthorized retargeting by an attacker 396 able to modify SIP signaling). 398 In SIP, 'forking' is the delivery of a request to multiple locations. 399 This happens when a single AOR is registered more than once. An 400 example of forking is when a user has a desk phone, PC client, and 401 mobile handset all registered with the same AOR. 403 +-----+ 404 |Alice| 405 +--+--+ 406 | 407 | INVITE 408 V 409 +-----+-----+ 410 | proxy | 411 ++---------++ 412 | | 413 INVITE | | INVITE 414 V V 415 +--+--+ +--+--+ 416 |Bob-1| |Bob-2| 417 +-----+ +-----+ 419 Figure 2: Forking 421 With forking, both Bob-1 and Bob-2 might send back SDP answers in SIP 422 responses. Alice will see those intermediate (18x) and final (200) 423 responses. It is useful for Alice to be able to associate the SIP 424 response with the incoming media stream. Although this association 425 can be done with ICE [I-D.ietf-mmusic-ice], and ICE is useful to make 426 this association with RTP, it is not desirable to require ICE to 427 accomplish this association. 429 Forking and retargeting are often used together. For example, a boss 430 and secretary might have both phones ring (forking) and roll over to 431 voice mail if neither phone is answered (retargeting). 433 To maintain security of the media traffic, only the end point that 434 answers the call should know the SRTP keys for the session.
Forked 435 and re-targeted calls only reveal sensitive information to non- 436 responders when the signaling messages contain sensitive information 437 (e.g., SRTP keys) that is accessible by parties that receive the 438 offer, but may not respond (i.e., the original recipients in a 439 retargeted call, or non-answering endpoints in a forked call). For 440 key exchange mechanisms that do not provide secure forking or secure 441 retargeting, one workaround is to re-key immediately after forking or 442 retargeting. However, because the originator may not be aware that 443 the call forked this mechanism requires rekeying immediately after 444 every session is established. This doubles the number of messages 445 processed by the network. 447 Further compounding this problem is a unique feature of SIP that when 448 forking is used, there is always only one final error response 449 delivered to the sender of the request: the forking proxy is 450 responsible for choosing which final response to choose in the event 451 where forking results in multiple final error responses being 452 received by the forking proxy. This means that if a request is 453 rejected, say with information that the keying information was 454 rejected and providing the far end's credentials, it is very possible 455 that the rejection will never reach the sender. This problem, called 456 the Heterogeneous Error Response Forking Problem (HERFP) 457 [I-D.mahy-sipping-herfp-fix], is difficult to solve in SIP. Because 458 we expect the HERFP to continue to be a problem in SIP for the 459 foreseeable future, a media security system should function even in 460 the presence of HERFP behavior. 462 4.3. Shared Key Conferencing 464 The consensus on the RTPSEC mailing list was to concentrate on 465 unicast, point-to-point sessions. Thus, there are no requirements 466 related to shared key conferencing. This section is retained for 467 informational purposes. 
469 For efficient scaling, large audio and video conference bridges 470 operate most efficiently by encrypting the current speaker once and 471 distributing that stream to the conference attendees. Typically, 472 inactive participants receive the same streams -- they hear (or see) 473 the active speaker(s), and the active speakers receive distinct 474 streams that don't include themselves. In order to maintain 475 confidentiality of such conferences where listeners share a common 476 key, all listeners must be rekeyed when a listener joins or leaves a 477 conference. 479 An important use case for mixers/translators is a conference bridge: 481 +----+ 482 A --- 1 --->| | 483 <-- 2 ----| M | 484 | I | 485 B --- 3 --->| X | 486 <-- 4 ----| E | 487 | R | 488 C --- 5 --->| | 489 <-- 6 ----| | 490 +----+ 492 Figure 3: Centralized Keying 494 In the figure above, 1, 3, and 5 are RTP media contributions from 495 Alice, Bob, and Carol, and 2, 4, and 6 are the RTP flows to those 496 devices carrying the 'mixed' media. 498 Several scenarios are possible: 500 a. Multiple inbound sessions: 1, 3, and 5 are distinct RTP sessions, 502 b. Multiple outbound sessions: 2, 4, and 6 are distinct RTP 503 sessions, 505 c. Single inbound session: 1, 3, and 5 are just different sources 506 within the same RTP session, 508 d. Single outbound session: 2, 4, and 6 are different flows of the 509 same (multi-unicast) RTP session 511 If there are multiple inbound sessions and multiple outbound sessions 512 (scenarios a and b), then every keying mechanism behaves as if the 513 mixer were an end point and can set up a point-to-point secure 514 session between the participant and the mixer. This is the simplest 515 situation, but is computationally wasteful, since SRTP processing has 516 to be done independently for each participant.
The use of multiple 517 inbound sessions (scenario a) doesn't waste computational resources, 518 though it does consume additional cryptographic context on the mixer 519 for each participant and has the advantage of non-repudiation of the 520 originator of the incoming stream. 522 To support a single outbound session (scenario d), the mixer has to 523 dictate its encryption key to the participants. Some keying 524 mechanisms allow the transmitter to determine its own key, and others 525 allow the offerer to determine the key for the offerer and answerer. 526 Depending on how the call is established, the offerer might be a 527 participant (such as a participant dialing into a conference bridge) 528 or the offerer might be the mixer (such as a conference bridge 529 calling a participant). The use of offerless INVITEs may help some 530 keying mechanisms reverse the role of offerer/answerer. A 531 difficulty, however, is knowing a priori if the role should be 532 reversed for a particular call. 534 4.4. Recording 536 The discussion in this section relates to requirement R-RECORDING. 538 Some business environments, such as stock brokers, banks, and catalog 539 call centers, require recording calls with customers. This is the 540 familiar "this call is being recorded for quality purposes" heard 541 during calls to these sorts of businesses. In these environments, 542 media recording is typically performed by an intermediate device 543 (with RTP, this is typically implemented in a 'sniffer'). 545 When performing such call recording with SRTP, the end-to-end 546 security is compromised. This is unavoidable, but necessary because 547 the operation of the business requires such recording. It is 548 desirable that the media security is not unduly compromised by the 549 media recording. The endpoint within the organization needs to be 550 informed that there is an intermediate device and needs to cooperate 551 with that intermediate device. 
553 This scenario does not place a requirement directly on the key 554 management protocol. The requirement could be met directly by the 555 key management protocol (e.g., MIKEY-NULL or [RFC4568]) or through an 556 external out-of-band-mechanism (e.g., [I-D.wing-sipping-srtp-key]). 558 4.5. PSTN gateway 560 The discussion in this section relates to requirement R-PSTN. 562 It is desirable, even when one leg of a call is on the PSTN, that the 563 IP leg of the call be protected with SRTP. 565 A typical case of using media security is where two entities are having 566 a VoIP conversation over IP capable networks. However, there are 567 cases where the other end of the communication is not connected to an 568 IP capable network. In this kind of setting, there needs to be some 569 kind of gateway at the edge of the IP network which converts the VoIP 570 conversation to a format understood by the other network. An example 571 of such a gateway is a PSTN gateway sitting at the edge of IP and PSTN 572 networks (such as the architecture described in [RFC3372]). 574 If media security (e.g., SRTP protection) is employed in this kind of 575 gateway-setting, then media security and the related key management 576 is terminated at the PSTN gateway. The other network (e.g., PSTN) 577 may have its own measures to protect the communication, but this 578 means that from a media security point of view, media security is 579 not employed truly end-to-end between the communicating entities. 581 4.6. Call Setup Performance 583 The discussion in this section relates to requirement R-REUSE. 585 Some devices lack sufficient processing power to perform public key 586 operations or Diffie-Hellman operations for each call, or prefer to 587 avoid performing those operations on every call. The ability to re- 588 use previous public key or Diffie-Hellman operations can vastly 589 decrease the call setup delay and processing requirements for such 590 devices.
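As an illustration of the re-use idea, the following Python sketch caches the result of one costly operation per peer and performs only a cheap keyed-hash step for each later call. Everything in it is an assumption made for illustration: the names ReuseCache and session_key are hypothetical, the costly exchange is replaced by a local random stand-in, and the HMAC-SHA256 derivation is not taken from this document or from any of the evaluated key management protocols.

```python
import hashlib
import hmac
import os


class ReuseCache:
    """Caches one long-lived secret per peer so the costly step runs once."""

    def __init__(self):
        self._secrets = {}       # peer identifier -> cached shared secret
        self.expensive_ops = 0   # number of costly operations performed

    def _expensive_key_agreement(self, peer: str) -> bytes:
        # Stand-in for a public key or Diffie-Hellman exchange with `peer`.
        # A real protocol would agree on this secret with the peer.
        self.expensive_ops += 1
        return os.urandom(32)

    def session_key(self, peer: str, session_nonce: bytes) -> bytes:
        # Run the costly operation only on the first call to this peer.
        if peer not in self._secrets:
            self._secrets[peer] = self._expensive_key_agreement(peer)
        # Cheap per-call step: mix the cached secret with a fresh nonce.
        return hmac.new(self._secrets[peer], session_nonce,
                        hashlib.sha256).digest()
```

Two calls to the same peer with different nonces would produce different session keys while the costly step runs only once. Because the cached secret outlives a single call, a design along these lines trades against perfect forward secrecy.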
592 In certain devices, it can take a second or two to perform a Diffie- 593 Hellman operation. Examples of these devices include handsets, IP 594 Multimedia Services Identity Modules (ISIMs), and PSTN gateways. PSTN 595 gateways typically utilize a Digital Signal Processor (DSP) which is 596 not yet involved with typical DSP operations at the beginning of a 597 call, thus the DSP could be used to perform the calculation, so as to 598 avoid having the central host processor perform the calculation. 599 However, not all PSTN gateways use DSPs (some have only central 600 processors or their DSPs are incapable of performing the necessary 601 public key or Diffie-Hellman operation), and handsets lack a 602 separate, unused processor to perform these operations. 604 Two scenarios where R-REUSE is useful are calls between an endpoint 605 and its voicemail server or its PSTN gateway. In those scenarios 606 calls are made relatively often and it can be useful for the 607 voicemail server or PSTN gateway to avoid public key operations for 608 subsequent calls. 610 Storing keys across sessions often interferes with perfect forward 611 secrecy (R-PFS). 613 4.7. Transcoding 615 The discussion in this section relates to requirement R-TRANSCODER. 617 In some environments it is necessary for network equipment to 618 transcode from one codec (e.g., a highly compressed codec which makes 619 efficient use of wireless bandwidth) to another codec (e.g., a 620 standardized codec to a SIP peering interface). With RTP, a 621 transcoding function can be performed with the combination of a SIP 622 B2BUA (to modify the SDP) and a processor to perform the transcoding 623 between the codecs. However, with end-to-end secured SRTP, a 624 transcoding function implemented the same way is a man-in-the-middle 625 attack, and the key management system prevents its use.
627 However, such a network-based transcoder can still be realized with 628 the cooperation and approval of the endpoint, and can provide end-to- 629 transcoder and transcoder-to-end security. 631 5. Requirements 633 This section is divided into several parts: requirements specific to 634 the key management protocol (Section 5.1), attack scenarios 635 (Section 5.2), and requirements which can be met inside the key 636 management protocol or outside of the key management protocol 637 (Section 5.3). 639 5.1. Key Management Protocol Requirements 641 SIP Forking and Retargeting, from Section 4.2: 643 R-FORK-RETARGET: 644 The media security key management protocol MUST securely 645 support forking and retargeting when all endpoints are willing 646 to use SRTP without causing the call setup to fail. This 647 requirement means the endpoints that did not answer the call 648 MUST NOT learn the SRTP keys (in either direction) used by the 649 answering endpoint. 651 R-DISTINCT: 652 The media security key management protocol MUST be capable of 653 creating distinct, independent cryptographic contexts for each 654 endpoint in a forked session. 656 R-HERFP: 657 The media security key management protocol MUST function 658 securely even in the presence of HERFP behavior. 660 Performance considerations: 662 R-REUSE: 663 The media security key management protocol MAY support the re- 664 use of a previously established security context. 666 Note: re-use of the security context does not imply re- 667 use of RTP parameters (e.g., payload type or SSRC). 669 Media considerations: 671 R-AVOID-CLIPPING: 672 The media security key management protocol SHOULD avoid 673 clipping media before SDP answer without requiring Security 674 Preconditions [RFC5027]. This requirement comes from 675 Section 4.1.
677 R-RTP-VALID: 678 If SRTP key negotiation is performed over the media path (i.e., 679 using the same UDP/TCP ports as media packets), the key 680 negotiation packets MUST NOT pass the RTP validity check 681 defined in Appendix A.1 of [RFC3550]. 683 R-ASSOC: 684 The media security key management protocol SHOULD include a 685 mechanism for associating key management messages with both the 686 signaling traffic that initiated the session and with protected 687 media traffic. Allowing such an association also allows the 688 SDP offerer to avoid performing CPU-consuming operations (e.g., 689 Diffie-Hellman or public key operations) with attackers that 690 have not seen the signaling messages. 692 For example, if using a Diffie-Hellman keying technique with 693 security preconditions that forks to 20 end points, the call 694 initiator would get 20 provisional responses containing 20 695 signed Diffie-Hellman key pairs. Calculating 20 DH secrets and 696 validating signatures can be a difficult task depending on the 697 device capabilities. Hence, in the case of forking, it is not 698 desirable to perform a DH or PK operation with every party, but 699 rather only with the party that answers the call (and incur 700 some media clipping). To do this, the signaling and media need 701 to be associated so the calling party knows which key 702 management needs to be completed. This might be done by using 703 the transport address indicated in the SDP, although NATs can 704 complicate this association. 706 Note: due to RTP's design requirements, it is expected 707 that SRTP receivers will have to perform authentication 708 of any received SRTP packets. 710 R-NEGOTIATE: 711 The media security key management protocol MUST allow a SIP 712 User Agent to negotiate media security parameters for each 713 individual session. 715 R-PSTN: 716 The media security key management protocol MUST support 717 termination of media security in a PSTN gateway. 
This 718 requirement is from Section 4.5. 720 5.2. Security Requirements 722 This section describes overall security requirements and specific 723 requirements from the attack scenarios (Section 3). 725 Overall security requirements: 727 R-PFS: 728 The media security key management protocol MUST be able to 729 support perfect forward secrecy. 731 R-COMPUTE: 732 The media security key management protocol MUST support 733 offering additional SRTP cipher suites without incurring 734 significant computational expense. 736 R-CERTS: 737 If the media security key management protocol employs 738 certificates, it MUST be able to make use of both self-signed 739 and CA-issued certificates. As an alternative, the media 740 security key management protocol MAY make use of "bare" public 741 keys. 743 R-FIPS: 744 The media security key management protocol SHOULD use 745 algorithms that allow FIPS 140-2 [FIPS-140-2] certification. 747 Note that the United States Government can only purchase and 748 use crypto implementations that have been validated by the 749 FIPS-140 [FIPS-140-2] process: 751 "The FIPS-140 standard is applicable to all Federal agencies 752 that use cryptographic-based security systems to protect 753 sensitive information in computer and telecommunication 754 systems, including voice systems. The adoption and use of this 755 standard is available to private and commercial 756 organizations."[cryptval] 758 Some commercial organizations, such as banks and defense 759 contractors, also require or prefer equipment which has been 760 validated by the FIPS-140 process. 762 R-DOS: 763 The media security key management protocol SHOULD NOT introduce 764 new denial of service vulnerabilities (e.g., the protocol 765 should not request the endpoint to perform CPU-intensive 766 operations without the client being able to validate or 767 authorize the request).
769 R-EXISTING: 770 The media security key management protocol SHOULD allow 771 endpoints to authenticate using pre-existing cryptographic 772 credentials, e.g., certificates or pre-shared keys. 774 R-AGILITY: 775 The media security key management protocol MUST provide crypto- 776 agility, i.e., the ability to adapt to evolving cryptography 777 and security requirements (update of cryptographic algorithms 778 without substantial disruption to deployed implementations). 780 R-DOWNGRADE: 781 The media security key management protocol MUST protect cipher 782 suite negotiation against downgrading attacks. 784 R-PASS-MEDIA: 785 The media security key management protocol MUST have a mode 786 which prevents a passive adversary with access to the media 787 path from gaining access to keying material used to protect 788 SRTP media packets. 790 R-PASS-SIG: 791 The media security key management protocol MUST have a mode in 792 which it prevents a passive adversary with access to the 793 signaling path from gaining access to keying material used to 794 protect SRTP media packets. 796 R-SIG-MEDIA: 797 The media security key management protocol MUST have a mode in 798 which it defends itself from an attacker that is solely on the 799 media path and from an attacker that is solely on the signaling 800 path. A successful attack refers to the ability for the 801 adversary to obtain keying material to decrypt the SRTP 802 encrypted media traffic. 804 R-ID-BINDING: 805 The media security key management protocol MUST use identifiers 806 for endpoints that allow a domain to create signatures over 807 those identifiers and the From address. 809 This allows domains to deploy SIP Identity [RFC4474]. 811 R-ACT-ACT: 812 The media security key management protocol MUST support a mode 813 of operation that provides active-signaling-active-media-detect 814 robustness, and MAY support modes of operation that provide 815 lower levels of robustness (as described in Section 3).
817 Failing to meet R-ACT-ACT indicates the protocol can not 818 provide secure end-to-end media. 820 5.3. Requirements Outside of the Key Management Protocol 822 The requirements in this section are for an overall VoIP security 823 system. These requirements can be met within the key management 824 protocol itself, or can be solved outside of the key management 825 protocol itself (e.g., solved in SIP or in SDP). 827 R-BEST-SECURE: 828 Even when some end points of a forked or retargeted call are 829 incapable of using SRTP, a solution MUST be described which 830 allows the establishment of SRTP associations with SRTP-capable 831 endpoints and / or RTP associations with non-SRTP-capable 832 endpoints. This requirement comes from Section 4.2. 834 R-OTHER-SIGNALING: 835 A solution SHOULD be able to negotiate keys for SRTP sessions 836 created via different call signaling protocols (e.g., between 837 Jabber, SIP, H.323, MGCP). 839 R-RECORDING: 840 A solution SHOULD be described which supports recording of 841 decrypted media. This requirement comes from Section 4.4. 843 R-TRANSCODER: 844 A solution SHOULD be described which supports intermediate 845 nodes (e.g., transcoders), terminating or processing media, 846 between the end points. 848 6. Security Considerations 850 This document lists requirements for securing media traffic. As 851 such, it addresses security throughout the document. 853 7. IANA Considerations 855 This document does not require actions by IANA. 857 8. Acknowledgements 859 For contributions to the requirements portion of this document, the 860 authors would like to thank the active participants of the RTPSEC BoF 861 and on the RTPSEC mailing list. 
The authors would furthermore like 862 to thank Wolfgang Buecker, Guenther Horn, Peter Howard, Hans-Heinrich 863 Grusdt, Srinath Thiruvengadam, Martin Euchner, Eric Rescorla, Matt 864 Lepinski, Dan York, Werner Dittmann, Richard Barnes, Vesa Lehtovirta, 865 Colin Perkins, Peter Schneider, and Christer Holmberg for their 866 feedback to this document. 868 For contributions to the analysis portion of this document, the 869 authors would like to give special thanks to Steffen Fries and 870 Dragan Ignjatic for their excellent MIKEY comparison document 871 [I-D.ietf-msec-mikey-applicability]. The authors would furthermore 872 like to thank Cullen Jennings, David Oran, David McGrew, Mark 873 Baugher, Flemming Andreasen, Eric Raymond, Dave Ward, Leo Huang, Eric 874 Rescorla, Lakshminath Dondeti, Steffen Fries, Alan Johnston, Dragan 875 Ignjatic and John Elwell for their feedback to this document. 877 Thanks to Richard Barnes and Peter Schneider for thorough reviews and 878 suggestions which improved the document considerably. 880 9. References 882 9.1. Normative References 884 [FIPS-140-2] 885 NIST, "Security Requirements for Cryptographic Modules", 886 June 2005, . 889 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 890 Requirement Levels", BCP 14, RFC 2119, March 1997. 892 [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, 893 A., Peterson, J., Sparks, R., Handley, M., and E. 894 Schooler, "SIP: Session Initiation Protocol", RFC 3261, 895 June 2002. 897 [RFC3262] Rosenberg, J. and H. Schulzrinne, "Reliability of 898 Provisional Responses in Session Initiation Protocol 899 (SIP)", RFC 3262, June 2002. 901 [RFC3264] Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model 902 with Session Description Protocol (SDP)", RFC 3264, 903 June 2002. 905 [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K. 906 Norrman, "The Secure Real-time Transport Protocol (SRTP)", 907 RFC 3711, March 2004.
909 [cryptval] 910 NIST, "Cryptographic Module Validation Program", 911 December 2006, 912 . 914 9.2. Informative References 916 [I-D.baugher-mmusic-sdp-dh] 917 Baugher, M. and D. McGrew, "Diffie-Hellman Exchanges for 918 Multimedia Sessions", draft-baugher-mmusic-sdp-dh-00 (work 919 in progress), February 2006. 921 [I-D.dondeti-msec-rtpsec-mikeyv2] 922 Dondeti, L., "MIKEYv2: SRTP Key Management using MIKEY, 923 revisited", draft-dondeti-msec-rtpsec-mikeyv2-01 (work in 924 progress), March 2007. 926 [I-D.fischl-sipping-media-dtls] 927 Fischl, J., "Datagram Transport Layer Security (DTLS) 928 Protocol for Protection of Media Traffic Established with 929 the Session Initiation Protocol", 930 draft-fischl-sipping-media-dtls-03 (work in progress), 931 July 2007. 933 [I-D.ietf-avt-dtls-srtp] 934 McGrew, D. and E. Rescorla, "Datagram Transport Layer 935 Security (DTLS) Extension to Establish Keys for Secure 936 Real-time Transport Protocol (SRTP)", 937 draft-ietf-avt-dtls-srtp-01 (work in progress), 938 November 2007. 940 [I-D.ietf-mmusic-ice] 941 Rosenberg, J., "Interactive Connectivity Establishment 942 (ICE): A Protocol for Network Address Translator (NAT) 943 Traversal for Offer/Answer Protocols", 944 draft-ietf-mmusic-ice-19 (work in progress), October 2007. 946 [I-D.ietf-mmusic-media-path-middleboxes] 947 Stucker, B. and H. Tschofenig, "Analysis of Middlebox 948 Interactions for Signaling Protocol Communication along 949 the Media Path", 950 draft-ietf-mmusic-media-path-middleboxes-00 (work in 951 progress), January 2008. 953 [I-D.ietf-mmusic-sdp-capability-negotiation] 954 Andreasen, F., "SDP Capability Negotiation", 955 draft-ietf-mmusic-sdp-capability-negotiation-08 (work in 956 progress), December 2007. 958 [I-D.ietf-msec-mikey-applicability] 959 Fries, S. and D. Ignjatic, "On the applicability of 960 various MIKEY modes and extensions", 961 draft-ietf-msec-mikey-applicability-08 (work in progress), 962 February 2008. 
964 [I-D.ietf-msec-mikey-ecc] 965 Milne, A., "ECC Algorithms for MIKEY", 966 draft-ietf-msec-mikey-ecc-03 (work in progress), 967 June 2007. 969 [I-D.ietf-sip-certs] 970 Jennings, C., Peterson, J., and J. Fischl, "Certificate 971 Management Service for The Session Initiation Protocol 972 (SIP)", draft-ietf-sip-certs-05 (work in progress), 973 February 2008. 975 [I-D.jennings-sipping-multipart] 976 Wing, D. and C. Jennings, "Session Initiation Protocol 977 (SIP) Offer/Answer with Multipart Alternative", 978 draft-jennings-sipping-multipart-02 (work in progress), 979 March 2006. 981 [I-D.mahy-sipping-herfp-fix] 982 Mahy, R., "A Solution to the Heterogeneous Error Response 983 Forking Problem (HERFP) in the Session Initiation 984 Protocol (SIP)", draft-mahy-sipping-herfp-fix-01 (work in 985 progress), March 2006. 987 [I-D.mcgrew-srtp-ekt] 988 McGrew, D., "Encrypted Key Transport for Secure RTP", 989 draft-mcgrew-srtp-ekt-03 (work in progress), July 2007. 991 [I-D.wing-sipping-srtp-key] 992 Wing, D., Audet, F., Fries, S., and H. Tschofenig, 993 "Disclosing Secure RTP (SRTP) Session Keys with a SIP 994 Event Package", draft-wing-sipping-srtp-key-02 (work in 995 progress), November 2007. 997 [I-D.zimmermann-avt-zrtp] 998 Zimmermann, P., "ZRTP: Media Path Key Agreement for Secure 999 RTP", draft-zimmermann-avt-zrtp-04 (work in progress), 1000 July 2007. 1002 [RFC3372] Vemuri, A. and J. Peterson, "Session Initiation Protocol 1003 for Telephones (SIP-T): Context and Architectures", 1004 BCP 63, RFC 3372, September 2002. 1006 [RFC3388] Camarillo, G., Eriksson, G., Holler, J., and H. 1007 Schulzrinne, "Grouping of Media Lines in the Session 1008 Description Protocol (SDP)", RFC 3388, December 2002. 1010 [RFC3550] Schulzrinne, H., Casner, S., Frederick, R., and V. 1011 Jacobson, "RTP: A Transport Protocol for Real-Time 1012 Applications", STD 64, RFC 3550, July 2003. 1014 [RFC3830] Arkko, J., Carrara, E., Lindholm, F., Naslund, M., and K. 
1015 Norrman, "MIKEY: Multimedia Internet KEYing", RFC 3830, 1016 August 2004. 1018 [RFC4346] Dierks, T. and E. Rescorla, "The Transport Layer Security 1019 (TLS) Protocol Version 1.1", RFC 4346, April 2006. 1021 [RFC4474] Peterson, J. and C. Jennings, "Enhancements for 1022 Authenticated Identity Management in the Session 1023 Initiation Protocol (SIP)", RFC 4474, August 2006. 1025 [RFC4492] Blake-Wilson, S., Bolyard, N., Gupta, V., Hawk, C., and B. 1026 Moeller, "Elliptic Curve Cryptography (ECC) Cipher Suites 1027 for Transport Layer Security (TLS)", RFC 4492, May 2006. 1029 [RFC4568] Andreasen, F., Baugher, M., and D. Wing, "Session 1030 Description Protocol (SDP) Security Descriptions for Media 1031 Streams", RFC 4568, July 2006. 1033 [RFC4650] Euchner, M., "HMAC-Authenticated Diffie-Hellman for 1034 Multimedia Internet KEYing (MIKEY)", RFC 4650, 1035 September 2006. 1037 [RFC4738] Ignjatic, D., Dondeti, L., Audet, F., and P. Lin, "MIKEY- 1038 RSA-R: An Additional Mode of Key Distribution in 1039 Multimedia Internet KEYing (MIKEY)", RFC 4738, 1040 November 2006. 1042 [RFC4771] Lehtovirta, V., Naslund, M., and K. Norrman, "Integrity 1043 Transform Carrying Roll-Over Counter for the Secure Real- 1044 time Transport Protocol (SRTP)", RFC 4771, January 2007. 1046 [RFC4916] Elwell, J., "Connected Identity in the Session Initiation 1047 Protocol (SIP)", RFC 4916, June 2007. 1049 [RFC4949] Shirey, R., "Internet Security Glossary, Version 2", 1050 RFC 4949, August 2007. 1052 [RFC5027] Andreasen, F. and D. Wing, "Security Preconditions for 1053 Session Description Protocol (SDP) Media Streams", 1054 RFC 5027, October 2007. 1056 Appendix A. Overview and Evaluation of Existing Keying Mechanisms 1058 Based on how the SRTP keys are exchanged, each SRTP key exchange 1059 mechanism belongs to one general category: 1061 A.1. Signaling Path Keying Techniques 1063 signaling path: All the keying is carried in the call signaling 1064 (SIP or SDP) path. 
1066 media path: All the keying is carried in the SRTP/SRTCP media 1067 path, and no signaling whatsoever is carried in the call 1068 signaling path. 1070 signaling and media path: Parts of the keying are carried in the 1071 SRTP/SRTCP media path, and parts are carried in the call 1072 signaling (SIP or SDP) path. 1074 One of the significant benefits of SRTP over other end-to-end 1075 encryption mechanisms, such as for example IPsec, is that SRTP is 1076 bandwidth efficient and SRTP retains the header of RTP packets. 1077 Bandwidth efficiency is vital for VoIP in many scenarios where access 1078 bandwidth is limited or expensive, and retaining the RTP header is 1079 important for troubleshooting packet loss, delay, and jitter. 1081 Related to SRTP's characteristics is a goal that any SRTP keying 1082 mechanism also be efficient and not cause additional call setup 1083 delay. Contributors to additional call setup delay include network 1084 or database operations: retrieval of certificates and additional SIP 1085 or media path messages, and computational overhead of establishing 1086 keys or validating certificates. 1088 When examining the choice between keying in the signaling path, 1089 keying in the media path, or keying in both paths, it is important to 1090 realize the media path is generally 'faster' than the SIP signaling 1091 path. The SIP signaling path has computational elements involved 1092 which parse and route SIP messages. The media path, on the other 1093 hand, does not normally have computational elements involved, and 1094 even when computational elements such as firewalls are involved, they 1095 cause very little additional delay. Thus, the media path can be 1096 useful for exchanging several messages to establish SRTP keys.
A 1097 disadvantage of keying over the media path is that interworking 1098 different key exchange mechanisms requires the interworking function be in the 1099 media path, rather than just in the signaling path; in practice this 1100 involvement is probably unavoidable anyway. 1102 A.1.1. MIKEY-NULL 1104 MIKEY-NULL [RFC3830] has the offerer indicate the SRTP keys for both 1105 directions. The key is sent unencrypted in SDP, which means the SDP 1106 must be encrypted hop-by-hop (e.g., by using TLS (SIPS)) or end-to- 1107 end (e.g., by using S/MIME). 1109 MIKEY-NULL requires one message from offerer to answerer (half a 1110 round trip), and does not add additional media path messages. 1112 A.1.2. MIKEY-PSK 1114 MIKEY-PSK (pre-shared key) [RFC3830] requires that all endpoints 1115 share one common key. MIKEY-PSK has the offerer encrypt the SRTP 1116 keys for both directions using this pre-shared key. 1118 MIKEY-PSK requires one message from offerer to answerer (half a round 1119 trip), and does not add additional media path messages. 1121 A.1.3. MIKEY-RSA 1123 MIKEY-RSA [RFC3830] has the offerer encrypt the keys for both 1124 directions using the intended answerer's public key, which is 1125 obtained from a mechanism outside of MIKEY. 1127 MIKEY-RSA requires one message from offerer to answerer (half a round 1128 trip), and does not add additional media path messages. MIKEY-RSA 1129 requires the offerer to obtain the intended answerer's certificate. 1131 A.1.4. MIKEY-RSA-R 1133 MIKEY-RSA-R [RFC4738] is essentially the same as MIKEY-RSA but 1134 reverses the role of the offerer and the answerer with regards to 1135 providing the keys. That is, the answerer encrypts the keys for both 1136 directions using the offerer's public key. Both the offerer and 1137 answerer validate each other's public keys using standard X.509 1138 validation techniques. MIKEY-RSA-R also enables sending certificates 1139 in the MIKEY message.
1141 MIKEY-RSA-R requires one message from offerer to answerer, and one 1142 message from answerer to offerer (full round trip), and does not add 1143 additional media path messages. MIKEY-RSA-R requires the offerer 1144 to validate the answerer's certificate. 1146 A.1.5. MIKEY-DHSIGN 1148 In MIKEY-DHSIGN [RFC3830] the offerer and answerer derive the key 1149 from a Diffie-Hellman exchange. In order to prevent an active man- 1150 in-the-middle the DH exchange itself is signed using each endpoint's 1151 private key and the associated public keys are validated using 1152 standard X.509 validation techniques. 1154 MIKEY-DHSIGN requires one message from offerer to answerer, and one 1155 message from answerer to offerer (full round trip), and does not add 1156 additional media path messages. MIKEY-DHSIGN requires the offerer 1157 and answerer to validate each other's certificates. MIKEY-DHSIGN 1158 also enables sending the answerer's certificate in the MIKEY message. 1160 A.1.6. MIKEY-DHHMAC 1162 MIKEY-DHHMAC [RFC4650] uses a pre-shared secret to HMAC the Diffie- 1163 Hellman exchange, essentially combining aspects of MIKEY-PSK with 1164 MIKEY-DHSIGN, but without MIKEY-DHSIGN's need for certificate 1165 authentication. 1167 MIKEY-DHHMAC requires one message from offerer to answerer, and one 1168 message from answerer to offerer (full round trip), and does not add 1169 additional media path messages. 1171 A.1.7. MIKEY-ECIES and MIKEY-ECMQV (MIKEY-ECC) 1173 ECC Algorithms For MIKEY [I-D.ietf-msec-mikey-ecc] describes how ECC 1174 can be used with MIKEY-RSA (using ECDSA signature) and with MIKEY- 1175 DHSIGN (using a new DH-Group code), and also defines two new ECC- 1176 based algorithms, Elliptic Curve Integrated Encryption Scheme (ECIES) 1177 and Elliptic Curve Menezes-Qu-Vanstone (ECMQV). 1179 With this proposal, the ECDSA signature, MIKEY-ECIES, and MIKEY-ECMQV 1180 function exactly like MIKEY-RSA, and the new DH-Group code functions 1181 exactly like MIKEY-DHSIGN.
Therefore these ECC mechanisms are not 1182 discussed separately in this document. 1184 A.1.8. Security Descriptions with SIPS 1186 Security Descriptions [RFC4568] has each side indicate the key it 1187 will use for transmitting SRTP media, and the keys are sent in the 1188 clear in SDP. Security Descriptions relies on hop-by-hop (TLS via 1189 "SIPS:") encryption to protect the keys exchanged in signaling. 1191 Security Descriptions requires one message from offerer to answerer, 1192 and one message from answerer to offerer (full round trip), and does 1193 not add additional media path messages. 1195 A.1.9. Security Descriptions with S/MIME 1197 This keying mechanism is identical to Appendix A.1.8, except that 1198 rather than protecting the signaling with TLS, the entire SDP is 1199 encrypted with S/MIME. 1201 A.1.10. SDP-DH (expired) 1203 SDP Diffie-Hellman [I-D.baugher-mmusic-sdp-dh] exchanges Diffie- 1204 Hellman messages in the signaling path to establish session keys. To 1205 protect against active man-in-the-middle attacks, the Diffie-Hellman 1206 exchange needs to be protected with S/MIME, SIPS, or SIP-Identity 1207 [RFC4474]. 1209 SDP-DH requires one message from offerer to answerer, and one message 1210 from answerer to offerer (full round trip), and does not add 1211 additional media path messages. 1213 A.1.11. MIKEYv2 in SDP (expired) 1215 MIKEYv2 [I-D.dondeti-msec-rtpsec-mikeyv2] adds mode negotiation to 1216 MIKEYv1 and removes the time synchronization requirement. It 1217 therefore now takes 2 round-trips to complete. In the first round 1218 trip, the communicating parties learn each other's identities, agree 1219 on a MIKEY mode, crypto algorithm, and SRTP policy, and exchange nonces 1220 for replay protection. In the second round trip, they negotiate 1221 unicast and/or group SRTP context for SRTP and/or SRTCP. 1223 Furthermore, MIKEYv2 also defines an in-band negotiation mode as an 1224 alternative to SDP (see Appendix A.3.3).
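The signaling message counts stated in Appendix A.1.1 through A.1.11 can be collected into one table for comparison. The following is only a sketch: the dictionary name SIGNALING_MESSAGES and the two helper functions are hypothetical, and the numbers are transcribed from the text above (one message is counted as half a round trip). The MIKEY-ECC variants are omitted because they behave like MIKEY-RSA or MIKEY-DHSIGN.

```python
from fractions import Fraction

# Signaling-path messages needed to establish keys, per the descriptions
# in Appendix A.1.1 through A.1.11. One message equals half a round trip.
SIGNALING_MESSAGES = {
    "MIKEY-NULL": 1,
    "MIKEY-PSK": 1,
    "MIKEY-RSA": 1,
    "MIKEY-RSA-R": 2,
    "MIKEY-DHSIGN": 2,
    "MIKEY-DHHMAC": 2,
    "Security Descriptions with SIPS": 2,
    "Security Descriptions with S/MIME": 2,
    "SDP-DH": 2,
    "MIKEYv2 in SDP": 4,
}


def round_trips(mechanism):
    """Return the number of signaling round trips as an exact fraction."""
    return Fraction(SIGNALING_MESSAGES[mechanism], 2)


def mechanisms_within(max_round_trips):
    """List mechanisms needing at most max_round_trips, in table order."""
    limit = Fraction(max_round_trips)
    return [name for name in SIGNALING_MESSAGES if round_trips(name) <= limit]


half_trip = mechanisms_within(Fraction(1, 2))
```

Only the three mechanisms in which the offerer supplies the keys for both directions complete in half a round trip; every mechanism that needs a contribution from the answerer takes at least a full round trip.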
1226 A.1.12. Evaluation Criteria - SIP 1228 This section considers how each keying mechanism interacts with SIP 1229 features. 1231 A.1.12.1. Secure Retargeting and Secure Forking 1233 Retargeting and forking of signaling requests is described within 1234 Section 4.2. The following builds upon this description. 1236 The following list compares the behavior of secure forking, answering 1237 association, two-time pads, and secure retargeting for each keying 1238 mechanism. 1240 MIKEY-NULL Secure Forking: No, all AORs see offerer's and 1241 answerer's keys. Answer is associated with media by the SSRC 1242 in MIKEY. Additionally, a two-time pad occurs if two branches 1243 choose the same 32-bit SSRC and transmit SRTP packets. 1245 Secure Retargeting: No, all targets see offerer's and 1246 answerer's keys. Suffers from retargeting identity problem. 1248 MIKEY-PSK 1249 Secure Forking: No, all AORs see offerer's and answerer's keys. 1250 Answer is associated with media by the SSRC in MIKEY. Note 1251 that all AORs must share the same pre-shared key in order for 1252 forking to work at all with MIKEY-PSK. Additionally, a two- 1253 time pad occurs if two branches choose the same 32-bit SSRC and 1254 transmit SRTP packets. 1256 Secure Retargeting: Not secure. For retargeting to work, the 1257 final target must possess the correct PSK. While this is likely 1258 in scenarios where the call is targeted to another device 1259 belonging to the same user (forking), it is very unlikely that 1260 other users will possess that PSK and be able to successfully 1261 answer that call. 1263 MIKEY-RSA 1264 Secure Forking: No, all AORs see offerer's and answerer's keys. 1265 Answer is associated with media by the SSRC in MIKEY. Note 1266 that all AORs must share the same private key in order for 1267 forking to work at all with MIKEY-RSA. Additionally, a two- 1268 time pad occurs if two branches choose the same 32-bit SSRC and 1269 transmit SRTP packets. 1271 Secure Retargeting: No.
1273 MIKEY-RSA-R 1274 Secure Forking: Yes. Answer is associated with media by the 1275 SSRC in MIKEY. 1277 Secure Retargeting: Yes. 1279 MIKEY-DHSIGN 1280 Secure Forking: Yes, each forked endpoint negotiates unique 1281 keys with the offerer for both directions. Answer is 1282 associated with media by the SSRC in MIKEY. 1284 Secure Retargeting: Yes, each target negotiates unique keys 1285 with the offerer for both directions. 1287 MIKEYv2 in SDP 1288 The behavior will depend on which mode is picked. 1290 MIKEY-DHHMAC 1291 Secure Forking: Yes, each forked endpoint negotiates unique 1292 keys with the offerer for both directions. Answer is 1293 associated with media by the SSRC in MIKEY. 1295 Secure Retargeting: Yes, each target negotiates unique keys 1296 with the offerer for both directions. Note that for the keys 1297 to be meaningful, it would require the PSK to be the same for 1298 all the potential intermediaries, which would only happen 1299 within a single domain. 1301 Security Descriptions with SIPS 1302 Secure Forking: No. Each forked endpoint sees the offerer's 1303 key. Answer is not associated with media. 1305 Secure Retargeting: No. Each target sees the offerer's key. 1307 Security Descriptions with S/MIME 1308 Secure Forking: No. Each forked endpoint sees the offerer's 1309 key. Answer is not associated with media. 1311 Secure Retargeting: No. Each target sees the offerer's key. 1312 Suffers from retargeting identity problem. 1314 SDP-DH 1315 Secure Forking: Yes. Each forked endpoint calculates a unique 1316 SRTP key. Answer is not associated with media. 1318 Secure Retargeting: Yes. The final target calculates a unique 1319 SRTP key. 1321 ZRTP 1322 Secure Forking: Yes. Each forked endpoint calculates a unique 1323 SRTP key. As ZRTP isn't signaled in SDP, there is no 1324 association of the answer with media. 1326 Secure Retargeting: Yes. The final target calculates a unique 1327 SRTP key. 
1329 EKT 1330 Secure Forking: Inherited from the bootstrapping mechanism (the 1331 specific MIKEY mode or Security Descriptions). Answer is 1332 associated with media by the SPI in the EKT protocol. 1335 Secure Retargeting: Inherited from the bootstrapping mechanism 1336 (the specific MIKEY mode or Security Descriptions). 1338 DTLS-SRTP 1339 Secure Forking: Yes. Each forked endpoint calculates a unique 1340 SRTP key. Answer is associated with media by the certificate 1341 fingerprint in signaling and certificate in the media path. 1343 Secure Retargeting: Yes. The final target calculates a unique 1344 SRTP key. 1346 MIKEYv2 Inband 1347 The behavior will depend on which mode is picked. 1349 A.1.12.2. Clipping Media Before SDP Answer 1351 Clipping media before receiving the signaling answer is described 1352 within Section 4.1. The following builds upon this description. 1354 Furthermore, the problem of clipping gets compounded when forking is 1355 used. For example, if using a Diffie-Hellman keying technique with 1356 security preconditions that forks to 20 endpoints, the call initiator 1357 would get 20 provisional responses containing 20 signed Diffie- 1358 Hellman half keys. Calculating 20 DH secrets and validating 1359 signatures can be a difficult task depending on the device 1360 capabilities. 1362 The following list compares the behavior of clipping before SDP 1363 answer for each keying mechanism. 1365 MIKEY-NULL 1366 Not clipped. The offerer provides the answerer's keys. 1368 MIKEY-PSK 1369 Not clipped. The offerer provides the answerer's keys. 1371 MIKEY-RSA 1372 Not clipped. The offerer provides the answerer's keys. 1374 MIKEY-RSA-R 1375 Clipped. The answer contains the answerer's encryption key. 1377 MIKEY-DHSIGN 1378 Clipped. The answer contains the answerer's Diffie-Hellman 1379 response. 1381 MIKEY-DHHMAC 1382 Clipped. The answer contains the answerer's Diffie-Hellman 1383 response.
1385 MIKEYv2 in SDP 1386 The behavior will depend on which mode is picked. 1388 Security Descriptions with SIPS 1389 Clipped. The answer contains the answerer's encryption key. 1391 Security Descriptions with S/MIME 1392 Clipped. The answer contains the answerer's encryption key. 1394 SDP-DH 1395 Clipped. The answer contains the answerer's Diffie-Hellman 1396 response. 1398 ZRTP 1399 Not clipped because the session initially uses RTP. While RTP 1400 is flowing, both ends negotiate SRTP keys in the media path and 1401 then switch to using SRTP. 1403 EKT 1404 Not clipped, as long as the first RTCP packet (containing the 1405 answerer's key) is not lost in transit. The answerer sends its 1406 encryption key in RTCP, which arrives at the same time (or 1407 before) the first SRTP packet encrypted with that key. 1409 Note: RTCP needs to work, in the answerer-to-offerer 1410 direction, before the offerer can decrypt SRTP media. 1412 DTLS-SRTP 1413 No clipping after the DTLS-SRTP handshake has completed. SRTP 1414 keys are exchanged in the media path. Need to wait for SDP 1415 answer to ensure DTLS-SRTP handshake was done with an 1416 authorized party. 1418 If a middlebox interferes with the media path, there can be 1419 clipping [I-D.ietf-mmusic-media-path-middleboxes]. 1421 MIKEYv2 Inband 1422 Not clipped. Keys are exchanged in the media path without 1423 relying on the signaling path. 1425 A.1.12.3. Centralized Keying 1427 Centralized keying is described within Section 4.3. The following 1428 builds upon this description. 1430 The following list describes how each keying mechanism behaves with 1431 centralized keying (scenario d) and rekeying. 1433 MIKEY-NULL 1434 Keying: Yes, if offerer is the mixer. No, if offerer is the 1435 participant (end user). 1437 Rekeying: Yes, via re-INVITE 1439 MIKEY-PSK 1440 Keying: Yes, if offerer is the mixer. No, if offerer is the 1441 participant (end user).
1443 Rekeying: Yes, with a re-INVITE 1445 MIKEY-RSA 1446 Keying: Yes, if offerer is the mixer. No, if offerer is the 1447 participant (end user). 1449 Rekeying: Yes, with a re-INVITE 1451 MIKEY-RSA-R 1452 Keying: No, if offerer is the mixer. Yes, if offerer is the 1453 participant (end user). 1455 Rekeying: n/a 1457 MIKEY-DHSIGN 1458 Keying: No; a group-key Diffie-Hellman protocol is not 1459 supported. 1461 Rekeying: n/a 1463 MIKEY-DHHMAC 1464 Keying: No; a group-key Diffie-Hellman protocol is not 1465 supported. 1467 Rekeying: n/a 1469 MIKEYv2 in SDP 1470 The behavior will depend on which mode is picked. 1472 Security Descriptions with SIPS 1473 Keying: Yes, if offerer is the mixer. Yes, if offerer is the 1474 participant. 1476 Rekeying: Yes, with a re-INVITE. 1478 Security Descriptions with S/MIME 1479 Keying: Yes, if offerer is the mixer. Yes, if offerer is the 1480 participant. 1482 Rekeying: Yes, with a re-INVITE. 1484 SDP-DH 1485 Keying: No; a group-key Diffie-Hellman protocol is not 1486 supported. 1488 Rekeying: n/a 1490 ZRTP 1491 Keying: No; a group-key Diffie-Hellman protocol is not 1492 supported. 1494 Rekeying: n/a 1496 EKT 1497 Keying: Yes. After bootstrapping a KEK using Security 1498 Descriptions or MIKEY, each member originating an SRTP stream 1499 can send its SRTP master key, sequence number and ROC via RTCP. 1501 Rekeying: Yes. EKT supports each sender to transmit its SRTP 1502 master key to the group via RTCP packets. Thus, EKT supports 1503 each originator of an SRTP stream to rekey at any time. 1505 DTLS-SRTP 1506 Keying: Yes, because with the assumed cipher suite, 1507 TLS_RSA_WITH_3DES_EDE_CBC_SHA, each end indicates its SRTP key. 1509 Rekeying: via DTLS in the media path. 1511 MIKEYv2 Inband 1512 The behavior will depend on which mode is picked. 1514 A.1.12.4. SSRC and ROC 1516 In SRTP, a cryptographic context is defined as the SSRC, destination 1517 network address, and destination transport port number. 
In RTP, by contrast, 1518 a flow is defined as the destination network address and destination 1519 transport port number. This results in a problem -- how to 1520 communicate the SSRC so that the SSRC can be used for the 1521 cryptographic context. 1523 Three approaches have emerged for this communication. One, used by all 1524 MIKEY modes, is to communicate the SSRCs to the peer in the MIKEY 1525 exchange. Another, used by Security Descriptions, is to use "late 1526 binding" -- that is, any new packet containing a previously-unseen 1527 SSRC (which arrives at the same destination network address and 1528 destination transport port number) will create a new cryptographic 1529 context. A third approach, common amongst techniques with media-path 1530 SRTP key establishment, is to require a handshake over that media 1531 path before SRTP packets are sent. MIKEY's approach changes RTP's 1532 SSRC collision detection behavior by requiring RTP to pre-establish 1533 the SSRC values for each session. 1535 Another related issue is that SRTP introduces a rollover counter 1536 (ROC), which records how many times the SRTP sequence number has 1537 rolled over. As the sequence number is used for SRTP's default 1538 ciphers, it is important that all endpoints know the value of the 1539 ROC. The ROC starts at 0 at the beginning of a session. 1541 Some keying mechanisms cause a two-time pad to occur if two endpoints 1542 of a forked call have an SSRC collision. 1544 Note: A proposal has been made to send the ROC value on every Nth 1545 SRTP packet [RFC4771]. This proposal has not yet been incorporated 1546 into this document. 1548 The following list examines handling of SSRC and ROC: 1550 MIKEY-NULL 1551 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1552 packets it transmits. 1554 MIKEY-PSK 1555 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1556 packets it transmits.
1558 MIKEY-RSA 1559 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1560 packets it transmits. 1562 MIKEY-RSA-R 1563 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1564 packets it transmits. 1566 MIKEY-DHSIGN 1567 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1568 packets it transmits. 1570 MIKEY-DHHMAC 1571 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1572 packets it transmits. 1574 MIKEYv2 in SDP 1575 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1576 packets it transmits. 1578 Security Descriptions with SIPS 1579 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1580 used. 1582 Security Descriptions with S/MIME 1583 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1584 used. 1586 SDP-DH 1587 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1588 used. 1590 ZRTP 1591 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1592 used. 1594 EKT 1595 The SSRC of the SRTCP packet containing an EKT update 1596 corresponds to the SRTP master key and other parameters within 1597 that packet. 1599 DTLS-SRTP 1600 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1601 used. 1603 MIKEYv2 Inband 1604 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1605 packets it transmits. 1607 A.1.13. Evaluation Criteria - Security 1609 This section evaluates each keying mechanism on the basis of its 1610 security properties. 1612 A.1.13.1. Distribution and Validation of Public Keys and Certificates 1614 Using public key cryptography for confidentiality and authentication 1615 can introduce requirements for two types of systems: (1) a system to 1616 distribute public keys (often in the form of certificates), and (2) a 1617 system for validating certificates. We refer to the former as a key 1618 distribution system and the latter as an authentication 1619 infrastructure. In many cases, a monolithic public key 1620 infrastructure (PKI) is used to fulfill both of these roles.
1621 However, these functions can be provided by many other systems. For 1622 instance, key distribution may be accomplished by any public 1623 repository of keys. Any system in which the two endpoints have 1624 access to trust anchors and intermediate CA certificates that can be 1625 used to validate other endpoints' certificates (including a system of 1626 self-signed certificates) can be used to support certificate 1627 validation in the schemes below. 1629 With real-time communications it is desirable to avoid fetching keys 1630 or certificates that delay call setup; rather it is preferable to 1631 fetch or validate certificates in such a way that call setup isn't 1632 delayed. For example, a certificate can be validated while the phone 1633 is ringing or can be validated while ring-back tones are being played 1634 or even while the called party is answering the phone and saying 1635 "hello". 1637 SRTP key exchange mechanisms that require a particular authentication 1638 infrastructure to operate (whether for distribution or validation) 1639 are gated on the deployment of such an infrastructure available to 1640 both endpoints. This means that no media security is achievable 1641 until such an infrastructure exists. For SIP, something like sip- 1642 certs [I-D.ietf-sip-certs] might be used to obtain the certificate of 1643 a peer. 1645 Note: Even if sip-certs [I-D.ietf-sip-certs] were deployed, the 1646 retargeting problem (Appendix A.1.12.1) would still prevent 1647 successful deployment of keying techniques which require the 1648 offerer to obtain the actual target's public key. 1650 The following list compares the requirements introduced by the use of 1651 public-key cryptography in each keying mechanism, both for public key 1652 distribution and for certificate validation. 1654 MIKEY-NULL 1655 Public-key cryptography is not used. 1657 MIKEY-PSK 1658 Public-key cryptography is not used.
Rather, all endpoints 1659 must have some way to exchange per-endpoint or per-system pre- 1660 shared keys. 1662 MIKEY-RSA 1663 The offerer obtains the intended answerer's public key before 1664 initiating the call. This public key is used to encrypt the 1665 SRTP keys. There is no defined mechanism for the offerer to 1666 obtain the answerer's public key, although [I-D.ietf-sip-certs] 1667 might be viable in the future. 1669 The offer may also contain a certificate for the offerer, which 1670 would require an authentication infrastructure in order to be 1671 validated by the receiver. 1673 MIKEY-RSA-R 1674 The offer contains the offerer's certificate, and the answer 1675 contains the answerer's certificate. The answerer uses the 1676 public key in the certificate to encrypt the SRTP keys that 1677 will be used by the offerer and the answerer. An 1678 authentication infrastructure is necessary to validate the 1679 certificates. 1681 MIKEY-DHSIGN 1682 An authentication infrastructure is used to authenticate the 1683 public key that is included in the MIKEY message. 1685 MIKEY-DHHMAC 1686 Public-key cryptography is not used. Rather, all endpoints 1687 must have some way to exchange per-endpoint or per-system pre- 1688 shared keys. 1690 MIKEYv2 in SDP 1691 The behavior will depend on which mode is picked. 1693 Security Descriptions with SIPS 1694 Public-key cryptography is not used. 1696 Security Descriptions with S/MIME 1697 Use of S/MIME requires that the endpoints be able to fetch and 1698 validate certificates for each other. The offerer must obtain 1699 the intended target's certificate and encrypt the SDP offer 1700 with the public key contained in the target's certificate. The 1701 answerer must obtain the offerer's certificate and encrypt the 1702 SDP answer with the public key contained in the offerer's 1703 certificate. 1705 SDP-DH 1706 Public-key cryptography is not used. 1708 ZRTP 1709 Public-key cryptography is not used.
1711 EKT 1712 Public-key cryptography is not used by itself, but might be 1713 used by the EKT bootstrapping keying mechanism (such as certain 1714 MIKEY modes). 1716 DTLS-SRTP 1717 The remote party's certificate is sent in the media path, and a 1718 fingerprint of the same certificate is sent in the signaling 1719 path. 1721 MIKEYv2 Inband 1722 The behavior will depend on which mode is picked. 1724 A.1.13.2. Perfect Forward Secrecy 1726 In the context of SRTP, Perfect Forward Secrecy is the property that 1727 SRTP session keys that protected a previous session are not 1728 compromised if the static keys belonging to the endpoints are 1729 compromised. That is, if someone were to record your encrypted 1730 session content and later acquire either party's private key, that 1731 encrypted session content would be safe from decryption if your key 1732 exchange mechanism had perfect forward secrecy. 1734 The following list describes how each key exchange mechanism provides 1735 PFS. 1737 MIKEY-NULL 1738 No PFS. 1740 MIKEY-PSK 1741 No PFS. 1743 MIKEY-RSA 1744 No PFS. 1746 MIKEY-RSA-R 1747 No PFS. 1749 MIKEY-DHSIGN 1750 PFS is provided with the Diffie-Hellman exchange. 1752 MIKEY-DHHMAC 1753 PFS is provided with the Diffie-Hellman exchange. 1755 MIKEYv2 in SDP 1756 The behavior will depend on which mode is picked. 1758 Security Descriptions with SIPS 1759 No PFS. 1761 Security Descriptions with S/MIME 1762 No PFS. 1764 SDP-DH 1765 PFS is provided with the Diffie-Hellman exchange. 1767 ZRTP 1768 PFS is provided with the Diffie-Hellman exchange. 1770 EKT 1771 No PFS. 1773 DTLS-SRTP 1774 PFS is achieved if the negotiated cipher suite includes an 1775 exponential or discrete-logarithmic key exchange (e.g., Diffie- 1776 Hellman (DH_RSA from [RFC4346]) or Elliptic Curve Diffie- 1777 Hellman [RFC4492]). 1779 MIKEYv2 Inband 1780 The behavior will depend on which mode is picked. 1782 A.1.13.3.
Best Effort Encryption 1784 With best effort encryption, SRTP is used with endpoints that support 1785 SRTP, otherwise RTP is used. 1787 SIP needs a backwards-compatible best effort encryption in order for 1788 SRTP to work successfully with SIP retargeting and forking when there 1789 is a mix of forked or retargeted devices that support SRTP and don't 1790 support SRTP. 1792 Consider the case of Bob, with a phone that only does RTP and a 1793 voice mail system that supports SRTP and RTP. If Alice calls Bob 1794 with an SRTP offer, Bob's RTP-only phone will reject the media 1795 stream (with an empty "m=" line) because Bob's phone doesn't 1796 understand SRTP (RTP/SAVP). Alice's phone will see this rejected 1797 media stream and may terminate the entire call (BYE) and re- 1798 initiate the call as RTP-only, or Alice's phone may decide to 1799 continue with call setup with the SRTP-capable leg (the voice mail 1800 system). If Alice's phone decided to re-initiate the call as RTP- 1801 only, and Bob doesn't answer his phone, Alice will then leave 1802 voice mail using only RTP, rather than SRTP as expected. 1804 Currently, several techniques are commonly considered as candidates 1805 to provide opportunistic encryption: 1807 multipart/alternative 1808 [I-D.jennings-sipping-multipart] describes how to form a 1809 multipart/alternative body part in SIP. The significant issues 1810 with this technique are (1) that multipart MIME is incompatible 1811 with existing SIP proxies, firewalls, Session Border Controllers, 1812 and endpoints and (2) when forking, the Heterogeneous Error 1813 Response Forking Problem (HERFP) [I-D.mahy-sipping-herfp-fix] 1814 causes problems if such non-multipart-capable endpoints were 1815 involved in the forking. 
1817 SDP Grouping 1818 A new SDP grouping mechanism (following the idea introduced in 1819 [RFC3388]) has been discussed which would allow a media line to 1820 indicate RTP/AVP and another media line to indicate RTP/SAVP, 1821 allowing non-SRTP-aware endpoints to choose RTP/AVP and SRTP-aware 1822 endpoints to choose RTP/SAVP. As of this writing, this SDP 1823 grouping mechanism has not been published as an Internet Draft. 1825 session attribute 1826 With this technique, the endpoints signal their desire to do SRTP 1827 by signaling RTP (RTP/AVP), and using an attribute ("a=") in the 1828 SDP. This technique is entirely backwards compatible with non- 1829 SRTP-aware endpoints, but doesn't use the RTP/SAVP protocol 1830 registered by SRTP [RFC3711]. 1832 SDP Capability Negotiation 1833 SDP Capability Negotiation 1834 [I-D.ietf-mmusic-sdp-capability-negotiation] provides a backwards- 1835 compatible mechanism to allow offering both SRTP and RTP in a 1836 single offer. This is the preferred technique. 1838 Probing 1839 With this technique, the endpoints first establish an RTP session 1840 using RTP (RTP/AVP). The endpoints send probe messages, over the 1841 media path, to determine if the remote endpoint supports their 1842 keying technique. 1844 The preferred technique, SDP Capability Negotiation 1845 [I-D.ietf-mmusic-sdp-capability-negotiation], can be used with all 1846 key exchange mechanisms. What remains unique is ZRTP, which can also 1847 accomplish its best effort encryption by probing (sending ZRTP 1848 messages over the media path) or by session attribute (see "a=zrtp", 1849 defined in Section 10 of [I-D.zimmermann-avt-zrtp]). Current 1850 implementations of ZRTP use probing. 1852 A.1.13.4. Upgrading Algorithms 1854 It is necessary to allow upgrading SRTP encryption and hash 1855 algorithms, as well as upgrading the cryptographic functions used for 1856 the key exchange mechanism. 
With SIP's offer/answer model, this can 1857 be computationally expensive because the offer needs to contain all 1858 combinations of the key exchange mechanisms (all MIKEY modes, 1859 Security Descriptions) and all SRTP cryptographic suites (AES-128, 1860 AES-256) and all SRTP cryptographic hash functions (SHA-1, SHA-256) 1861 that the offerer supports. In order to do this, the offerer has to 1862 expend CPU resources to build an offer containing all of this 1863 information, which can become computationally prohibitive. 1865 Thus, it is important to keep the offerer's CPU impact fixed so that 1866 offering multiple new SRTP encryption and hash functions incurs no 1867 additional expense. 1869 The following list describes the CPU effort involved in using each 1870 key exchange technique. 1872 MIKEY-NULL 1873 No significant computational expense. 1875 MIKEY-PSK 1876 No significant computational expense. 1878 MIKEY-RSA 1879 For each offered SRTP crypto suite, the offerer has to perform 1880 an RSA operation to encrypt the TGK. 1882 MIKEY-RSA-R 1883 For each offered SRTP crypto suite, the offerer has to perform 1884 a public key operation to sign the MIKEY message. 1886 MIKEY-DHSIGN 1887 For each offered SRTP crypto suite, the offerer has to perform 1888 a Diffie-Hellman operation, and a public key operation to sign 1889 the Diffie-Hellman output. 1891 MIKEY-DHHMAC 1892 For each offered SRTP crypto suite, the offerer has to perform 1893 a Diffie-Hellman operation. 1895 MIKEYv2 in SDP 1896 The behavior will depend on which mode is picked. 1898 Security Descriptions with SIPS 1899 No significant computational expense. 1901 Security Descriptions with S/MIME 1902 S/MIME requires the offerer and the answerer to encrypt the SDP 1903 with the other's public key, and to decrypt the received SDP 1904 with their own private key. 1906 SDP-DH 1907 For each offered SRTP crypto suite, the offerer has to perform 1908 a Diffie-Hellman operation.
1910 ZRTP 1911 The offerer has no additional computational expense at all, as 1912 the offer contains no information about ZRTP or might contain 1913 "a=zrtp". 1915 EKT 1916 The offerer's computational expense depends entirely on the EKT 1917 bootstrapping mechanism selected (one or more MIKEY modes or 1918 Security Descriptions). 1920 DTLS-SRTP 1921 The offerer has no additional computational expense at all, as 1922 the offer contains only a fingerprint of the certificate that 1923 will be presented in the DTLS exchange. 1925 MIKEYv2 Inband 1926 The behavior will depend on which mode is picked. 1928 A.2. Media Path Keying Technique 1930 A.2.1. ZRTP 1932 ZRTP [I-D.zimmermann-avt-zrtp] does not exchange information in the 1933 signaling path (although it's possible for endpoints to indicate 1934 support for ZRTP with "a=zrtp" in the initial Offer). In ZRTP the 1935 keys are exchanged entirely in the media path using a Diffie-Hellman 1936 exchange. The advantage to this mechanism is that the signaling 1937 channel is used only for call setup and the media channel is used to 1938 establish an encrypted channel -- much like encryption devices on the 1939 PSTN. ZRTP uses voice authentication of its Diffie-Hellman exchange 1940 by having each person read digits to the other person. Subsequent 1941 sessions with the same ZRTP endpoint can be authenticated using the 1942 stored hash of the previously negotiated key rather than voice 1943 authentication. 1945 ZRTP uses 4 media path messages (Hello, Commit, DHPart1, and DHPart2) 1946 to establish the SRTP key, and 3 media path confirmation messages. 1947 These initial messages are all sent as non-RTP packets. 1949 Note that when ZRTP probing is used, unencrypted RTP is being 1950 exchanged until the SRTP keys are established. 1952 A.3. Signaling and Media Path Keying Techniques 1954 A.3.1.
EKT 1956 EKT [I-D.mcgrew-srtp-ekt] relies on another SRTP key exchange 1957 protocol, such as Security Descriptions or MIKEY, for bootstrapping. 1958 In the initial phase, each member of a conference uses an SRTP key 1959 exchange protocol to establish a common key encryption key (KEK). 1960 Each member may use the KEK to securely transport its SRTP master key 1961 and current SRTP rollover counter (ROC), via RTCP, to the other 1962 participants in the session. 1964 EKT requires the offerer to send some parameters (EKT_Cipher, KEK, 1965 and security parameter index (SPI)) via the bootstrapping protocol 1966 such as Security Descriptions or MIKEY. Each answerer sends an SRTCP 1967 message which contains the answerer's SRTP Master Key, rollover 1968 counter, and the SRTP sequence number. Rekeying is done by sending a 1969 new SRTCP message. For reliable transport, multiple RTCP messages 1970 need to be sent. 1972 A.3.2. DTLS-SRTP 1974 DTLS-SRTP [I-D.ietf-avt-dtls-srtp] exchanges public key fingerprints 1975 in SDP [I-D.fischl-sipping-media-dtls] and then establishes a DTLS 1976 session over the media channel. The endpoints use the DTLS handshake 1977 to agree on crypto suites and establish SRTP session keys. SRTP 1978 packets are then exchanged between the endpoints. 1980 DTLS-SRTP requires one message from offerer to answerer (half round 1981 trip), and one message from the answerer to offerer (full round trip) 1982 so the offerer can correlate the SDP answer with the answering 1983 endpoint. DTLS-SRTP uses 4 media path messages to establish the SRTP 1984 key. 1986 This document assumes DTLS will use TLS_RSA_WITH_3DES_EDE_CBC_SHA as 1987 its cipher suite, which is the mandatory-to-implement cipher suite in 1988 TLS [RFC4346]. 1990 A.3.3. MIKEYv2 Inband (expired) 1992 As defined in Appendix A.1.11, MIKEYv2 also defines an in-band 1993 negotiation mode as an alternative to SDP (see Appendix A.3.3). 
The 1994 details are not sorted out in the draft yet on what in-band actually 1995 means (i.e., UDP, RTP, RTCP, etc.). 1997 Appendix B. Out-of-Scope 1999 Discussions concluded that key management for shared-key encryption 2000 of conferencing is outside the scope of this document. As the 2001 priority is point-to-point unicast SRTP session keying, resolving 2002 shared-key SRTP session keying is deferred to later and left as an 2003 item for future investigations. 2005 The compromise of an endpoint that has access to decrypted media 2006 (e.g., SIP user agent, transcoder, recorder) is out of scope of this 2007 document. Such a compromise might be via privilege escalation, 2008 installation of a virus or trojan horse, or similar attacks. 2010 Appendix C. Requirement renumbering in -02 2012 [[RFC Editor: Please delete this section prior to publication.]] 2013 Previous versions of this document used requirement numbers, which 2014 were changed to mnemonics as follows: 2016 R1 R-FORK-RETARGET 2018 R2 R-BEST-SECURE 2020 R3 R-DISTINCT 2022 R4 R-REUSE; changed from 'MAY' to 'protocol MUST support, and 2023 SHOULD implement' 2025 R5 R-AVOID-CLIPPING 2027 R6 R-PASS-MEDIA 2029 R7 R-PASS-SIG 2031 R8 R-PFS 2033 R9 R-COMPUTE 2035 R10 R-RTP-VALID 2037 R11 (folded into R4; was reuse previous session) 2039 R12 R-CERTS 2041 R13 R-FIPS 2043 R14 R-ASSOC 2045 R15 (deleted; was ability to upgrade from RTP to SRTP, but 2046 requirement was unclear on what it meant) 2048 R16 R-DOS 2050 R17 R-SIG-MEDIA 2052 R18 R-EXISTING 2054 R19 R-AGILITY 2056 R20 R-DOWNGRADE 2057 R21 R-NEGOTIATE 2059 R23 R-OTHER-SIGNALING 2061 R23 R-RECORDING (R23 was duplicated in previous versions of the 2062 document) 2064 R24 (deleted; was lawful intercept) 2066 R25 R-TRANSCODER 2068 R26 R-PSTN 2070 R27 R-ID-BINDING 2072 R28 R-ACT-ACT 2074 Authors' Addresses 2076 Dan Wing (editor) 2077 Cisco Systems, Inc. 
2078 170 West Tasman Drive 2079 San Jose, CA 95134 2080 USA 2082 Email: dwing@cisco.com 2084 Steffen Fries 2085 Siemens AG 2086 Otto-Hahn-Ring 6 2087 Munich, Bavaria 81739 2088 Germany 2090 Email: steffen.fries@siemens.com 2092 Hannes Tschofenig 2093 Nokia Siemens Networks 2094 Otto-Hahn-Ring 6 2095 Munich, Bavaria 81739 2096 Germany 2098 Email: Hannes.Tschofenig@nsn.com 2099 URI: http://www.tschofenig.com 2100 Francois Audet 2101 Nortel 2102 4655 Great America Parkway 2103 Santa Clara, CA 95054 2104 USA 2106 Email: audet@nortel.com 2108 Full Copyright Statement 2110 Copyright (C) The IETF Trust (2008). 2112 This document is subject to the rights, licenses and restrictions 2113 contained in BCP 78, and except as set forth therein, the authors 2114 retain all their rights. 2116 This document and the information contained herein are provided on an 2117 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 2118 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 2119 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS 2120 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 2121 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 2122 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 2124 Intellectual Property 2126 The IETF takes no position regarding the validity or scope of any 2127 Intellectual Property Rights or other rights that might be claimed to 2128 pertain to the implementation or use of the technology described in 2129 this document or the extent to which any license under such rights 2130 might or might not be available; nor does it represent that it has 2131 made any independent effort to identify any such rights. Information 2132 on the procedures with respect to rights in RFC documents can be 2133 found in BCP 78 and BCP 79. 
2135 Copies of IPR disclosures made to the IETF Secretariat and any 2136 assurances of licenses to be made available, or the result of an 2137 attempt made to obtain a general license or permission for the use of 2138 such proprietary rights by implementers or users of this 2139 specification can be obtained from the IETF on-line IPR repository at 2140 http://www.ietf.org/ipr. 2142 The IETF invites any interested party to bring to its attention any 2143 copyrights, patents or patent applications, or other proprietary 2144 rights that may cover technology that may be required to implement 2145 this standard. Please address the information to the IETF at 2146 ietf-ipr@ietf.org. 2148 Acknowledgment 2150 This document was produced using xml2rfc v1.33pre8 (of 2151 http://xml.resource.org/) from a source in RFC-2629 XML format.