idnits 2.17.1 draft-ietf-sip-media-security-requirements-08.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 20. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 2121. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2132. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2139. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2145. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year == Line 962 has weird spacing: '...ication along...' -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (October 30, 2008) is 5650 days in the past. Is this intentional? 
Checking references for intended status: Informational ---------------------------------------------------------------------------- == Outdated reference: A later version (-07) exists of draft-ietf-avt-dtls-srtp-06 == Outdated reference: A later version (-07) exists of draft-ietf-mmusic-media-path-middleboxes-01 == Outdated reference: A later version (-13) exists of draft-ietf-mmusic-sdp-capability-negotiation-09 == Outdated reference: A later version (-15) exists of draft-ietf-sip-certs-06 == Outdated reference: A later version (-06) exists of draft-mcgrew-srtp-ekt-03 == Outdated reference: A later version (-04) exists of draft-wing-sipping-srtp-key-03 == Outdated reference: A later version (-22) exists of draft-zimmermann-avt-zrtp-10 -- Obsolete informational reference (is this intentional?): RFC 4474 (Obsoleted by RFC 8224) -- Obsolete informational reference (is this intentional?): RFC 4492 (Obsoleted by RFC 8422) Summary: 1 error (**), 0 flaws (~~), 9 warnings (==), 9 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 SIP Working Group D. Wing, Ed. 3 Internet-Draft Cisco 4 Intended status: Informational S. Fries 5 Expires: May 3, 2009 Siemens AG 6 H. Tschofenig 7 Nokia Siemens Networks 8 F. Audet 9 Nortel 10 October 30, 2008 12 Requirements and Analysis of Media Security Management Protocols 13 draft-ietf-sip-media-security-requirements-08 15 Status of this Memo 17 By submitting this Internet-Draft, each author represents that any 18 applicable patent or other IPR claims of which he or she is aware 19 have been or will be disclosed, and any of which he or she becomes 20 aware will be disclosed, in accordance with Section 6 of BCP 79. 22 Internet-Drafts are working documents of the Internet Engineering 23 Task Force (IETF), its areas, and its working groups. 
Note that 24 other groups may also distribute working documents as Internet- 25 Drafts. 27 Internet-Drafts are draft documents valid for a maximum of six months 28 and may be updated, replaced, or obsoleted by other documents at any 29 time. It is inappropriate to use Internet-Drafts as reference 30 material or to cite them other than as "work in progress." 32 The list of current Internet-Drafts can be accessed at 33 http://www.ietf.org/ietf/1id-abstracts.txt. 35 The list of Internet-Draft Shadow Directories can be accessed at 36 http://www.ietf.org/shadow.html. 38 This Internet-Draft will expire on May 3, 2009. 40 Abstract 42 This document describes requirements for a protocol to negotiate a 43 security context for SIP-signaled SRTP media. In addition to the 44 natural security requirements, this negotiation protocol must 45 interoperate well with SIP in certain ways. A number of proposals 46 have been published and a summary of these proposals is in the 47 appendix of this document. 49 Table of Contents 51 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 52 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 53 3. Attack Scenarios . . . . . . . . . . . . . . . . . . . . . . . 5 54 4. Call Scenarios and Requirements Considerations . . . . . . . . 8 55 4.1. Clipping Media Before Signaling Answer . . . . . . . . . . 8 56 4.2. Retargeting and Forking . . . . . . . . . . . . . . . . . 9 57 4.3. Recording . . . . . . . . . . . . . . . . . . . . . . . . 12 58 4.4. PSTN gateway . . . . . . . . . . . . . . . . . . . . . . . 12 59 4.5. Call Setup Performance . . . . . . . . . . . . . . . . . . 13 60 4.6. Transcoding . . . . . . . . . . . . . . . . . . . . . . . 13 61 4.7. Upgrading to SRTP . . . . . . . . . . . . . . . . . . . . 14 62 4.8. Interworking with Other Signaling Protocols . . . . . . . 14 63 4.9. Certificates . . . . . . . . . . . . . . . . . . . . . . . 15 64 5. Requirements . . . . . . . . . . . . . . . . . . . . . . . . . 
15 65 5.1. Key Management Protocol Requirements . . . . . . . . . . . 15 66 5.2. Security Requirements . . . . . . . . . . . . . . . . . . 17 67 5.3. Requirements Outside of the Key Management Protocol . . . 19 68 6. Security Considerations . . . . . . . . . . . . . . . . . . . 20 69 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 20 70 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 20 71 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 20 72 9.1. Normative References . . . . . . . . . . . . . . . . . . . 20 73 9.2. Informative References . . . . . . . . . . . . . . . . . . 21 74 Appendix A. Overview and Evaluation of Existing Keying 75 Mechanisms . . . . . . . . . . . . . . . . . . . . . 24 76 A.1. Signaling Path Keying Techniques . . . . . . . . . . . . . 25 77 A.1.1. MIKEY-NULL . . . . . . . . . . . . . . . . . . . . . . 25 78 A.1.2. MIKEY-PSK . . . . . . . . . . . . . . . . . . . . . . 25 79 A.1.3. MIKEY-RSA . . . . . . . . . . . . . . . . . . . . . . 26 80 A.1.4. MIKEY-RSA-R . . . . . . . . . . . . . . . . . . . . . 26 81 A.1.5. MIKEY-DHSIGN . . . . . . . . . . . . . . . . . . . . . 26 82 A.1.6. MIKEY-DHHMAC . . . . . . . . . . . . . . . . . . . . . 26 83 A.1.7. MIKEY-ECIES and MIKEY-ECMQV (MIKEY-ECC) . . . . . . . 27 84 A.1.8. Security Descriptions with SIPS . . . . . . . . . . . 27 85 A.1.9. Security Descriptions with S/MIME . . . . . . . . . . 27 86 A.1.10. SDP-DH (expired) . . . . . . . . . . . . . . . . . . . 27 87 A.1.11. MIKEYv2 in SDP (expired) . . . . . . . . . . . . . . . 27 88 A.2. Media Path Keying Technique . . . . . . . . . . . . . . . 28 89 A.2.1. ZRTP . . . . . . . . . . . . . . . . . . . . . . . . . 28 90 A.3. Signaling and Media Path Keying Techniques . . . . . . . . 28 91 A.3.1. EKT . . . . . . . . . . . . . . . . . . . . . . . . . 28 92 A.3.2. DTLS-SRTP . . . . . . . . . . . . . . . . . . . . . . 29 93 A.3.3. MIKEYv2 Inband (expired) . . . . . . . . . . . . . . . 29 94 A.4. 
Evaluation Criteria - SIP . . . . . . . . . . . . . . . . 29 95 A.4.1. Secure Retargeting and Secure Forking . . . . . . . . 29 96 A.4.2. Clipping Media Before SDP Answer . . . . . . . . . . . 32 97 A.4.3. SSRC and ROC . . . . . . . . . . . . . . . . . . . . . 34 98 A.5. Evaluation Criteria - Security . . . . . . . . . . . . . . 36 99 A.5.1. Distribution and Validation of Persistent Public 100 Keys and Certificates . . . . . . . . . . . . . . . . 36 101 A.5.2. Perfect Forward Secrecy . . . . . . . . . . . . . . . 38 102 A.5.3. Best Effort Encryption . . . . . . . . . . . . . . . . 40 103 A.5.4. Upgrading Algorithms . . . . . . . . . . . . . . . . . 41 104 Appendix B. Out-of-Scope . . . . . . . . . . . . . . . . . . . . 43 105 B.1. Shared Key Conferencing . . . . . . . . . . . . . . . . . 43 106 Appendix C. Requirement renumbering in -02 . . . . . . . . . . . 44 107 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 46 108 Intellectual Property and Copyright Statements . . . . . . . . . . 48 110 1. Introduction 112 The work on media security started when the Session Initiation 113 Protocol (SIP) was still in its infancy. With the increased SIP 114 deployment and the availability of new SIP extensions and related 115 protocols, the need for end-to-end security was re-evaluated. The 116 procedure of re-evaluating prior protocol work and design decisions 117 is not an uncommon strategy and, to some extent, considered necessary 118 to ensure that the developed protocols indeed meet the previously 119 envisioned needs for the users on the Internet. 121 This document summarizes media security requirements, i.e., 122 requirements for mechanisms that negotiate security context such as 123 cryptographic keys and parameters for SRTP. 
125 The organization of this document is as follows: Section 2 introduces 126 terminology, Section 3 describes various attack scenarios against the 127 signaling path and media path, Section 4 provides an overview about 128 possible call scenarios, Section 5 lists requirements for media 129 security. The main part of the document concludes with the security 130 considerations Section 6, IANA considerations Section 7 and an 131 acknowledgement section in Section 8. Appendix A lists and compares 132 available solution proposals. The following Appendix A.4 compares 133 the different approaches regarding their suitability for the SIP 134 signaling scenarios described in Appendix A, while Appendix A.5 135 provides a comparison regarding security aspects. Appendix B lists 136 non-goals for this document. 138 2. Terminology 140 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 141 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 142 document are to be interpreted as described in [RFC2119], with the 143 important qualification that, unless otherwise stated, these terms 144 apply to the design of the media security key management protocol, 145 not its implementation or application. 147 Furthermore, the terminology described in SIP ([RFC3261]) regarding 148 functions and components are used throughout the document 150 Additionally, the following items are used in this document: 152 AOR (Address-of-Record): A SIP or SIPS URI that points to a domain 153 with a location service that can map the URI to another URI where 154 the user might be available. Typically, the location service is 155 populated through registrations. An AOR is frequently thought of 156 as the "public address" of the user. 158 SSRC: The 32-bit value that defines the synchronization source, used 159 in RTP. These are generally unique, but collisions can occur. 161 two-time pad: The use of the same key and the same keystream to 162 encrypt different data. 
For SRTP, a two-time pad occurs if two 163 senders are using the same key and the same RTP SSRC value. 165 Perfect Forward Secrecy (PFS): The property that disclosure of the 166 long-term secret keying material that is used to derive an agreed 167 ephemeral key does not compromise the secrecy of agreed keys from 168 earlier runs. 170 active adversary: An active adversary is able to alter data 171 communication to affect its operation (see also [RFC4949]). 173 passive adversary: A passive adversary is able to learn information 174 from data communication, but not alter that data communication 175 (see also [RFC4949]). 177 signaling path: The signaling path is the route taken by SIP 178 signaling messages transmitted between the calling and called user 179 agents. This can be either direct signaling between the calling 180 and called user agents or, more commonly, signaling that traverses the SIP proxy 181 servers that were involved in the call setup. 183 media path: The media path is the route taken by media packets 184 exchanged by the endpoints. In the simplest case, the endpoints 185 exchange media directly, and the "media path" is defined by a 186 quartet of IP addresses and TCP/UDP ports, along with an IP route. 187 In other cases, this path may include RTP relays, mixers, 188 transcoders, session border controllers, NATs, or media gateways. 190 3. Attack Scenarios 192 The discussion in this section relates to requirements R-PASS-MEDIA, 193 R-PASS-SIG, R-ASSOC, R-SIG-MEDIA, R-ACT-ACT, and R-ID-BINDING. 195 This document classifies adversaries according to their access and 196 their capabilities. An adversary might have access: 198 1. only to the media path, 200 2. only to the signaling path, 202 3. to the media path and to the signaling path. 204 An attacker that is located solely along the signaling path, and 205 does not have access to media (item 2), is not considered in this 206 document. 208 There are two different types of adversaries, active and passive.
An 209 active adversary may need to be active with regard to the key 210 exchange relevant information traveling along the media path or 211 traveling along the signaling path. 213 Based on their robustness against the adversary capabilities 214 described above, we can group security mechanisms using the following 215 labels. This list is generally ordered from easiest to compromise 216 (at the top) to more difficult to compromise:

218 +---------------+---------+--------------------------------------+
219 | SIP signaling | media   | abbreviation                         |
220 +---------------+---------+--------------------------------------+
221 | none          | passive | no-signaling-passive-media           |
222 | none          | active  | no-signaling-active-media            |
223 | passive       | passive | passive-signaling-passive-media      |
224 | passive       | active  | passive-signaling-active-media       |
225 | active        | passive | active-signaling-passive-media       |
226 | active        | active  | active-signaling-active-media        |
227 | active        | active  | active-signaling-active-media-detect |
228 +---------------+---------+--------------------------------------+

230 no-signaling-passive-media: 231 Access to only the media path is sufficient to reveal the content 232 of the media traffic. 234 passive-signaling-passive-media: 235 Passive attack on the signaling and passive attack on the media 236 path is necessary to reveal the content of the media traffic. 238 passive-signaling-active-media: 239 Passive attack on the signaling and active attack on the media 240 path is necessary to reveal the content of the media traffic. 242 active-signaling-passive-media: 243 Active attack on the signaling path and passive attack on the 244 media path is necessary to reveal the content of the media 245 traffic. 247 no-signaling-active-media: 248 Active attack on the media path is sufficient to reveal the 249 content of the media traffic.
251 active-signaling-active-media: 252 Active attack on both the signaling path and the media path is 253 necessary to reveal the content of the media traffic. 255 active-signaling-active-media-detect: 256 Active attack on both signaling and media path is necessary to 257 reveal the content of the media traffic (as with active-signaling- 258 active-media), and the attack is detectable by protocol messages 259 exchanged between the end points. 261 For example, unencrypted RTP is vulnerable to no-signaling-passive- 262 media. 264 As another example, Security Descriptions [RFC4568], when protected 265 by TLS (as it is commonly implemented and deployed), belongs in the 266 passive-signaling-passive-media category since the adversary needs to 267 learn the Security Descriptions key by seeing the SIP signaling 268 message at a SIP proxy (assuming that the adversary is in control of 269 the SIP proxy). The media traffic can be decrypted using that 270 learned key. 272 As another example, DTLS-SRTP falls into the active-signaling-active- 273 media category when DTLS-SRTP is used with a public key based 274 ciphersuite with self-signed certificates and without SIP Identity 275 [RFC4474]. An adversary would have to modify the fingerprint that is 276 sent along the signaling path and subsequently modify the 277 certificates carried in the DTLS handshake that travel along the 278 media path. If DTLS-SRTP is used with both SIP Identity [RFC4474] 279 and SIP Connected Identity [RFC4916], the RFC4474 signature protects 280 both the offer and the answer, and such a system would then belong to 281 the active-signaling-active-media-detect category (provided, of 282 course, the signaling path to the RFC4474 authenticator and verifier 283 is secured as per RFC4474 and the RFC4474 authenticator and verifier 284 are behaving as per RFC4474).
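
The labels in the table above are built mechanically from the attack an adversary must mount on each path. The following is a minimal illustrative sketch of that naming scheme, not part of the draft; the names Attack and classify are hypothetical and chosen only for this example.

```python
from enum import Enum


class Attack(Enum):
    """Kind of attack the adversary must mount on one path (hypothetical type)."""
    NONE = "no"
    PASSIVE = "passive"
    ACTIVE = "active"


def classify(signaling: Attack, media: Attack, detectable: bool = False) -> str:
    """Return the class label from the table for the required attacks."""
    # The table lists no class in which the media path is left untouched.
    if media is Attack.NONE:
        raise ValueError("every class in the table involves the media path")
    # The "-detect" suffix appears only on the active/active row.
    if detectable and not (signaling is Attack.ACTIVE and media is Attack.ACTIVE):
        raise ValueError("the -detect label is only defined for active/active")
    label = f"{signaling.value}-signaling-{media.value}-media"
    return label + "-detect" if detectable else label


# The examples given in the text, expressed with this sketch:
UNENCRYPTED_RTP = classify(Attack.NONE, Attack.PASSIVE)
SDES_OVER_TLS = classify(Attack.PASSIVE, Attack.PASSIVE)
DTLS_SRTP_SELF_SIGNED = classify(Attack.ACTIVE, Attack.ACTIVE)
DTLS_SRTP_WITH_IDENTITY = classify(Attack.ACTIVE, Attack.ACTIVE, detectable=True)
```

The four constants mirror the unencrypted RTP, Security Descriptions, and DTLS-SRTP examples given in this section.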
286 The above discussion of DTLS-SRTP demonstrates how a single security 287 protocol can be in different classes depending on the mode in which 288 it is operated. Other protocols can achieve a similar effect by adding 289 functions outside of the on-the-wire key management protocol itself. 290 Although it may be appropriate to deploy lower-classed mechanisms in 291 some cases, the ultimate security requirement for a media security 292 negotiation protocol is that it have a mode of operation available 293 in the active-signaling-active-media-detect class, which provides protection against passive 294 and active attacks and provides detection of such attacks. That is, 295 there must be a way to use the protocol so that an active attack is 296 required against both the signaling and media paths, and so that such 297 attacks are detectable by the endpoints. 299 4. Call Scenarios and Requirements Considerations 301 The following subsections describe call scenarios that pose the most 302 challenge to the key management system for media data in cooperation 303 with SIP signaling. 305 Throughout the subsections, explicit requirements are identified by 306 labels that begin with "R-". All of the stated 307 requirements are explained in detail in Section 5. The 308 requirements in Section 5 are listed according to their 309 association to the key management protocol, to attack scenarios, and 310 requirements which can be met inside the key management protocol or 311 outside of the key management protocol. 313 4.1. Clipping Media Before Signaling Answer 315 The discussion in this section relates to requirements R-AVOID- 316 CLIPPING and R-ALLOW-RTP. 318 Per the SDP Offer/Answer Model [RFC3264], 320 "Once the offerer has sent the offer, it MUST be prepared to 321 receive media for any recvonly streams described by that offer.
322 It MUST be prepared to send and receive media for any sendrecv 323 streams in the offer, and send media for any sendonly streams in 324 the offer (of course, it cannot actually send until the peer 325 provides an answer with the needed address and port information)." 327 To meet this requirement with SRTP, the offerer needs to know the 328 SRTP key for arriving media. If either endpoint receives encrypted 329 media before it has access to the associated SRTP key, it cannot play 330 the media -- causing clipping. 332 For key exchange mechanisms that send the answerer's key in SDP, a 333 SIP provisional response [RFC3261], such as 183 (session progress), 334 is useful. However, the 183 messages are not reliable unless both 335 the calling and called end points support PRACK [RFC3262], use TCP 336 across all SIP proxies, implement Security Preconditions [RFC5027], 337 or both ends implement ICE [I-D.ietf-mmusic-ice] and the answerer 338 implements the reliable provisional response mechanism described in 339 ICE. Unfortunately, none of these techniques is widely deployed, 340 and there is industry reluctance to require these 341 techniques to avoid the problems described in this section. 343 Note that the receipt of an SDP answer is not always sufficient to 344 allow media to be played to the offerer. Sometimes, the offerer must 345 send media in order to open up firewall holes or NAT bindings before 346 media can be received (for details see 348 [I-D.ietf-mmusic-media-path-middleboxes]). In this case, even a 349 solution that makes the key available before the SDP answer arrives 350 will not help. 352 Preventing the arrival of early media (i.e., media that arrives at 353 the SDP offerer before the SDP answer arrives) might obsolete the 354 R-AVOID-CLIPPING requirement, but at the time of writing such early 355 media exists in many normal call scenarios. 357 4.2.
Retargeting and Forking 359 The discussion in this section relates to requirements R-FORK- 360 RETARGET, R-DISTINCT, R-HERFP, and R-BEST-SECURE. 362 In SIP, a request sent to a specific AOR but delivered to a different 363 AOR is called a "retarget". A typical scenario is a "call 364 forwarding" feature. In Figure 1 Alice sends an INVITE in step 1 365 that is sent to Bob in step 2. Bob responds with a redirect (SIP 366 response code 3xx) pointing to Carol in step 3. This redirect 367 typically does not propagate back to Alice but only goes to a proxy 368 (i.e., the retargeting proxy) that sends the original INVITE to Carol 369 in step 4.

371                     +-----+
372                     |Alice|
373                     +--+--+
374                        |
375                        | INVITE (1)
376                        V
377                   +----+----+
378                   |  proxy  |
379                   ++-+-----++
380                    | ^     |
381         INVITE (2) | |     | INVITE (4)
382     & redirect (3) | |     |
383                    V |     V
384                   ++-++   ++----+
385                   |Bob|   |Carol|
386                   +---+   +-----+

388 Figure 1: Retargeting

390 Using retargeting might lead to situations where the UAC does not 391 know where its request will be going. This might not immediately 392 seem like a serious problem; after all, when one places a telephone 393 call on the PSTN, one never really knows if it will be forwarded to a 394 different number, who will pick up the line when it rings, and so on. 396 However, when considering SIP mechanisms for authenticating the 397 called party, this function can also make it difficult to 398 differentiate an intermediary that is behaving legitimately from an 399 attacker. From this perspective, the main problems with retargeting 400 are: 402 Not detectable by the caller: The originating user agent has no 403 means of anticipating that the condition will arise, nor any means 404 of determining that it has occurred until the call has already 405 been set up. 407 Not preventable by the caller: There is no existing mechanism that 408 might be employed by the originating user agent in order to 409 guarantee that the call will not be re-targeted.
411 The mechanism used by SIP for identifying the calling party is SIP 412 Identity [RFC4474]. However, due to the nature of retargeting, SIP 413 Identity can only identify the calling party (that is, the party that 414 initiated the SIP request). Some key exchange mechanisms predate SIP 415 Identity and include their own identity mechanism (e.g., MIKEY). 416 However, those built-in identity mechanisms also suffer from the SIP 417 retargeting problem. While Connected Identity [RFC4916] allows 418 positive identification of the called party, the primary difficulty 419 still remains that the calling party does not know if a mismatched 420 called party is legitimate (i.e., due to authorized retargeting) or 421 illegitimate (i.e., due to unauthorized retargeting by an attacker 422 able to modify SIP signaling). 424 In SIP, 'forking' is the delivery of a request to multiple locations. 425 This happens when a single AOR is registered more than once. An 426 example of forking is when a user has a desk phone, PC client, and 427 mobile handset all registered with the same AOR.

429              +-----+
430              |Alice|
431              +--+--+
432                 |
433                 | INVITE
434                 V
435           +-----+-----+
436           |   proxy   |
437           ++---------++
438            |         |
439     INVITE |         | INVITE
440            V         V
441         +--+--+   +--+--+
442         |Bob-1|   |Bob-2|
443         +-----+   +-----+

445 Figure 2: Forking

447 With forking, both Bob-1 and Bob-2 might send back SDP answers in SIP 448 responses. Alice will see those intermediate (18x) and final (200) 449 responses. It is useful for Alice to be able to associate the SIP 450 response with the incoming media stream. Although this association 451 can be done with ICE [I-D.ietf-mmusic-ice], and ICE is useful to make 452 this association with RTP, it is not desirable to require ICE to 453 accomplish this association. 455 Forking and retargeting are often used together. For example, a boss 456 and secretary might have both phones ring (forking) and roll over to 457 voice mail if neither phone is answered (retargeting).
459 To maintain security of the media traffic, only the end point that 460 answers the call should know the SRTP keys for the session. Forked 461 and re-targeted calls only reveal sensitive information to non- 462 responders when the signaling messages contain sensitive information 463 (e.g., SRTP keys) that is accessible by parties that receive the 464 offer, but may not respond (i.e., the original recipients in a 465 retargeted call, or non-answering endpoints in a forked call). For 466 key exchange mechanisms that do not provide secure forking or secure 467 retargeting, one workaround is to re-key immediately after forking or 468 retargeting. However, because the originator may not be aware that 469 the call forked this mechanism requires rekeying immediately after 470 every session is established. This doubles the number of messages 471 processed by the network. 473 Further compounding this problem is a unique feature of SIP that when 474 forking is used, there is always only one final error response 475 delivered to the sender of the request: the forking proxy is 476 responsible for choosing which final response to choose in the event 477 where forking results in multiple final error responses being 478 received by the forking proxy. This means that if a request is 479 rejected, say with information that the keying information was 480 rejected and providing the far end's credentials, it is very possible 481 that the rejection will never reach the sender. This problem, called 482 the Heterogeneous Error Response Forking Problem (HERFP) [RFC3326], 483 is difficult to solve in SIP. Because we expect the HERFP to 484 continue to be a problem in SIP for the foreseeable future, a media 485 security system should function even in the presence of HERFP 486 behavior. 488 4.3. Recording 490 The discussion in this section relates to requirement R-RECORDING. 
492 Some business environments, such as stock brokers, banks, and catalog 493 call centers, require recording calls with customers. This is the 494 familiar "this call is being recorded for quality purposes" heard 495 during calls to these sorts of businesses. In these environments, 496 media recording is typically performed by an intermediate device 497 (with RTP, this is typically implemented in a 'sniffer'). 499 When performing such call recording with SRTP, the end-to-end 500 security is compromised. This compromise is unavoidable because 501 the operation of the business requires such recording. It is 502 desirable that the media security is not unduly compromised by the 503 media recording. The endpoint within the organization needs to be 504 informed that there is an intermediate device and needs to cooperate 505 with that intermediate device. 507 This scenario does not place a requirement directly on the key 508 management protocol. The requirement could be met directly by the 509 key management protocol (e.g., MIKEY-NULL or [RFC4568]) or through an 510 external out-of-band mechanism (e.g., [I-D.wing-sipping-srtp-key]). 512 4.4. PSTN gateway 514 The discussion in this section relates to requirement R-PSTN. 516 It is desirable, even when one leg of a call is on the PSTN, that the 517 IP leg of the call be protected with SRTP. 519 A typical case of using media security is one where two entities are having 520 a VoIP conversation over IP-capable networks. However, there are 521 cases where the other end of the communication is not connected to an 522 IP-capable network. In this kind of setting, there needs to be some 523 kind of gateway at the edge of the IP network which converts the VoIP 524 conversation to a format understood by the other network. An example 525 of such a gateway is a PSTN gateway sitting at the edge of IP and PSTN 526 networks (such as the architecture described in [RFC3372]).
528 If media security (e.g., SRTP protection) is employed in this kind of 529 gateway setting, then media security and the related key management 530 are terminated at the PSTN gateway. The other network (e.g., PSTN) 531 may have its own measures to protect the communication, but this 532 means that, from a media security point of view, the media security is 533 not employed truly end-to-end between the communicating entities. 535 4.5. Call Setup Performance 537 The discussion in this section relates to requirement R-REUSE. 539 Some devices lack sufficient processing power to perform public key 540 operations or Diffie-Hellman operations for each call, or prefer to 541 avoid performing those operations on every call. The ability to re- 542 use previous public key or Diffie-Hellman operations can vastly 543 decrease the call setup delay and processing requirements for such 544 devices. 546 In certain devices, it can take a second or two to perform a Diffie- 547 Hellman operation. Examples of these devices include handsets, IP 548 Multimedia Services Identity Modules (ISIMs), and PSTN gateways. PSTN 549 gateways typically utilize a Digital Signal Processor (DSP) which is 550 not yet involved with typical DSP operations at the beginning of a 551 call; thus, the DSP could be used to perform the calculation, so as to 552 avoid having the central host processor perform the calculation. 553 However, not all PSTN gateways use DSPs (some have only central 554 processors or their DSPs are incapable of performing the necessary 555 public key or Diffie-Hellman operation), and handsets lack a 556 separate, unused processor to perform these operations. 558 Two scenarios where R-REUSE is useful are calls between an endpoint 559 and its voicemail server or its PSTN gateway. In those scenarios, 560 calls are made relatively often and it can be useful for the 561 voicemail server or PSTN gateway to avoid public key operations for 562 subsequent calls.
564 Storing keys across sessions often interferes with perfect forward 565 secrecy (R-PFS). 567 4.6. Transcoding 569 The discussion in this section relates to requirement R-TRANSCODER. 571 In some environments it is necessary for network equipment to 572 transcode from one codec (e.g., a highly compressed codec which makes 573 efficient use of wireless bandwidth) to another codec (e.g., a 574 standardized codec to a SIP peering interface). With RTP, a 575 transcoding function can be performed with the combination of a SIP 576 B2BUA (to modify the SDP) and a processor to perform the transcoding 577 between the codecs. However, with end-to-end secured SRTP, a 578 transcoding function implemented the same way is a man-in-the-middle 579 attack, and the key management system prevents its use. 581 However, such a network-based transcoder can still be realized with 582 the cooperation and approval of the endpoint, and can provide end-to- 583 transcoder and transcoder-to-end security. 585 4.7. Upgrading to SRTP 587 The discussion in this section relates to the requirement R-ALLOW- 588 RTP. 590 Legitimate RTP media can be sent to an endpoint for announcements, 591 colorful ringback tones (e.g., music), advertising, or normal call 592 progress tones. The RTP may be received before an associated SDP 593 answer. For details on various scenarios, see 594 [I-D.stucker-sipping-early-media-coping]. 596 While receiving such RTP exposes the calling party to a risk of 597 receiving malicious RTP from an attacker, SRTP endpoints will need to 598 receive and play out RTP media in order to be compatible with 599 deployed systems that send RTP to calling parties. 601 4.8. Interworking with Other Signaling Protocols 603 The discussion in this section relates to the requirement R-OTHER- 604 SIGNALING. 606 In many environments, some devices are signaled with protocols other 607 than SIP which do not share SIP's offer/answer model (e.g., [H.248.1]) 608 or do not utilize SDP (e.g., H.323).
   In other environments, both endpoints may be SIP, but may use
   different key management systems (e.g., one uses MIKEY-RSA, the other
   MIKEY-RSA-R).

   In these environments, it is desirable to have SRTP -- rather than
   RTP -- between the two endpoints. It is always possible, although
   undesirable, to interwork those disparate signaling systems or
   disparate key management systems by decrypting and re-encrypting each
   SRTP packet in a device in the middle of the network (often the same
   device performing the signaling interworking). This is undesirable
   due to the cost and increased attack area, as such an SRTP/SRTP
   interworking device is a valuable attack target.

   At the time of this writing, interworking is considered important.
   Interworking without decryption/encryption of the SRTP, while useful,
   is not yet deemed critical because the scale of such SRTP deployments
   is, to date, relatively small.

4.9. Certificates

   The discussion in this section relates to R-CERTS.

   On the Internet and on some private networks, validating another
   peer's certificate is often done through a trust anchor -- a list of
   Certificate Authorities that are trusted. It can be difficult or
   expensive for a peer to obtain these certificates. In all cases,
   both parties to the call would need to trust the same trust anchor
   (i.e., "certificate authority"). For these reasons, it is important
   that the media plane key management protocol offer a mechanism that
   allows end-users who have no prior association to authenticate to
   each other without acquiring credentials from a third party trust
   point. Note that this does not rule out mechanisms in which servers
   have certificates and attest to the identities of end-users.

5. Requirements

   This section is divided into several parts: requirements specific to
   the key management protocol (Section 5.1), attack scenarios
   (Section 5.2), and requirements which can be met inside the key
   management protocol or outside of the key management protocol
   (Section 5.3).

5.1. Key Management Protocol Requirements

   SIP Forking and Retargeting, from Section 4.2:

      R-FORK-RETARGET:
         The media security key management protocol MUST securely
         support forking and retargeting when all endpoints are willing
         to use SRTP without causing the call setup to fail. This
         requirement means the endpoints that did not answer the call
         MUST NOT learn the SRTP keys (in either direction) used by the
         answering endpoint.

      R-DISTINCT:
         The media security key management protocol MUST be capable of
         creating distinct, independent cryptographic contexts for each
         endpoint in a forked session.

      R-HERFP:
         The media security key management protocol MUST function
         securely even in the presence of HERFP behavior, i.e., the
         rejection of key information does not reach the sender.

   Performance considerations:

      R-REUSE:
         The media security key management protocol MAY support the
         re-use of a previously established security context.

            Note: re-use of the security context does not imply
            re-use of RTP parameters (e.g., payload type or SSRC).

   Media considerations:

      R-AVOID-CLIPPING:
         The media security key management protocol SHOULD avoid
         clipping media before SDP answer without requiring Security
         Preconditions [RFC5027]. This requirement comes from
         Section 4.1.

      R-RTP-VALID:
         If SRTP key negotiation is performed over the media path (i.e.,
         using the same UDP/TCP ports as media packets), the key
         negotiation packets MUST NOT pass the RTP validity check
         defined in Appendix A.1 of [RFC3550].
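
         To make R-RTP-VALID concrete, the following Python sketch
         implements a simplified header check loosely following the
         checks listed in Appendix A.1 of [RFC3550] (version field,
         payload type, header lengths, and padding). The function name,
         the set of known payload types, and the sample key negotiation
         packet are assumptions made for this illustration. It is not a
         complete validator.

```python
RTCP_SR, RTCP_RR = 200, 201  # RTCP packet types that must not be taken for RTP


def passes_rtp_validity_check(packet: bytes, known_payload_types: set) -> bool:
    """Simplified per-packet RTP header check (illustrative sketch only)."""
    if len(packet) < 12:                      # fixed RTP header is 12 octets
        return False
    first, second = packet[0], packet[1]
    if first >> 6 != 2:                       # version field must equal 2
        return False
    if second in (RTCP_SR, RTCP_RR):          # octet would read as RTCP SR/RR
        return False
    if (second & 0x7F) not in known_payload_types:
        return False
    header_len = 12 + 4 * (first & 0x0F)      # add CSRC identifiers
    if len(packet) < header_len:
        return False
    if first & 0x10:                          # X bit: header extension present
        if len(packet) < header_len + 4:
            return False
        ext_words = int.from_bytes(packet[header_len + 2:header_len + 4], "big")
        header_len += 4 + 4 * ext_words
        if len(packet) < header_len:
            return False
    if first & 0x20:                          # P bit: last octet is pad count
        pad = packet[-1]
        if pad == 0 or pad > len(packet) - header_len:
            return False
    return True


# An ordinary RTP packet: version 2, payload type 0, 160 octets of payload.
rtp_packet = bytes([0x80, 0x00]) + bytes(10) + bytes(160)
# Hypothetical key negotiation packet whose first two bits are not "10".
keying_packet = bytes([0x10, 0x00]) + bytes(26)

rtp_ok = passes_rtp_validity_check(rtp_packet, {0})
keying_ok = passes_rtp_validity_check(keying_packet, {0})
```

         In this sketch the RTP packet passes and the hypothetical key
         negotiation packet fails on the version field, which is the
         outcome R-RTP-VALID asks for.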
      R-ASSOC:
         The media security key management protocol SHOULD include a
         mechanism for associating key management messages with both the
         signaling traffic that initiated the session and with protected
         media traffic. It is useful to associate key management
         messages with call signaling messages, as this allows the SDP
         offerer to avoid performing CPU-consuming operations (e.g.,
         Diffie-Hellman or public key operations) with attackers that
         have not seen the signaling messages.

         For example, if using a Diffie-Hellman keying technique with
         security preconditions that forks to 20 end points, the call
         initiator would get 20 provisional responses containing 20
         signed Diffie-Hellman key pairs. Calculating 20 Diffie-Hellman
         secrets and validating signatures can be a difficult task for
         some devices. Hence, in the case of forking, it is not
         desirable to perform a Diffie-Hellman operation with every
         party, but rather only with the party that answers the call
         (and incur some media clipping). To do this, the signaling and
         media need to be associated so the calling party knows which
         key management exchange needs to be completed. This might be
         done by using the transport address indicated in the SDP,
         although NATs can complicate this association.

            Note: due to RTP's design requirements, it is expected
            that SRTP receivers will have to perform authentication
            of any received SRTP packets.

      R-NEGOTIATE:
         The media security key management protocol MUST allow a SIP
         User Agent to negotiate media security parameters for each
         individual session.

      R-PSTN:
         The media security key management protocol MUST support
         termination of media security in a PSTN gateway. This
         requirement is from Section 4.4.

5.2. Security Requirements

   This section describes overall security requirements and specific
   requirements from the attack scenarios (Section 3).
   Overall security requirements:

      R-PFS:
         The media security key management protocol MUST be able to
         support perfect forward secrecy.

      R-COMPUTE:
         The media security key management protocol MUST support
         offering additional SRTP cipher suites without incurring
         significant computational expense.

      R-CERTS:
         The key management protocol MUST NOT require that end-users
         obtain credentials (certificates or private keys) from a
         third-party trust anchor.

      R-FIPS:
         The media security key management protocol SHOULD use
         algorithms that allow FIPS 140-2 [FIPS-140-2] certification.

         The United States Government can only purchase and use crypto
         implementations that have been validated by the FIPS-140
         [FIPS-140-2] process:

            "The FIPS-140 standard is applicable to all Federal
            agencies that use cryptographic-based security systems to
            protect sensitive information in computer and
            telecommunication systems, including voice systems. The
            adoption and use of this standard is available to private
            and commercial organizations."

         Some commercial organizations, such as banks and defense
         contractors, require or prefer equipment which has received the
         same validation.

      R-DOS:
         The media security key management protocol SHOULD NOT introduce
         new denial of service vulnerabilities (e.g., the protocol
         should not request the endpoint to perform CPU-intensive
         operations without the client being able to validate or
         authorize the request).

      R-EXISTING:
         The media security key management protocol SHOULD allow
         endpoints to authenticate using pre-existing cryptographic
         credentials, e.g., certificates or pre-shared keys.
      R-AGILITY:
         The media security key management protocol MUST provide
         crypto-agility, i.e., the ability to adapt to evolving
         cryptography and security requirements (update of cryptographic
         algorithms without substantial disruption to deployed
         implementations).

      R-DOWNGRADE:
         The media security key management protocol MUST protect cipher
         suite negotiation against downgrading attacks.

      R-PASS-MEDIA:
         The media security key management protocol MUST have a mode
         which prevents a passive adversary with access to the media
         path from gaining access to keying material used to protect
         SRTP media packets.

      R-PASS-SIG:
         The media security key management protocol MUST have a mode in
         which it prevents a passive adversary with access to the
         signaling path from gaining access to keying material used to
         protect SRTP media packets.

      R-SIG-MEDIA:
         The media security key management protocol MUST have a mode in
         which it defends itself from an attacker that is solely on the
         media path and from an attacker that is solely on the signaling
         path. A successful attack refers to the ability of the
         adversary to obtain keying material to decrypt the SRTP
         encrypted media traffic.

      R-ID-BINDING:
         The media security key management protocol MUST enable the
         media security keys to be cryptographically bound to an
         identity of the endpoint.

            This allows domains to deploy SIP Identity [RFC4474].

      R-ACT-ACT:
         The media security key management protocol MUST support a mode
         of operation that provides active-signaling-active-media-detect
         robustness, and MAY support modes of operation that provide
         lower levels of robustness (as described in Section 3).

            Failing to meet R-ACT-ACT indicates the protocol cannot
            provide secure end-to-end media.

5.3. Requirements Outside of the Key Management Protocol

   The requirements in this section are for an overall VoIP security
   system. These requirements can be met within the key management
   protocol itself, or can be solved outside of the key management
   protocol itself (e.g., solved in SIP or in SDP).

      R-BEST-SECURE:
         Even when some endpoints of a forked or retargeted call are
         incapable of using SRTP, a solution MUST be described which
         allows the establishment of SRTP associations with SRTP-capable
         endpoints and/or RTP associations with non-SRTP-capable
         endpoints.

      R-OTHER-SIGNALING:
         A solution SHOULD be able to negotiate keys for SRTP sessions
         created via different call signaling protocols (e.g., between
         Jabber, SIP, H.323, MGCP).

      R-RECORDING:
         A solution SHOULD be described which supports recording of
         decrypted media. This requirement comes from Section 4.3.

      R-TRANSCODER:
         A solution SHOULD be described which supports intermediate
         nodes (e.g., transcoders), terminating or processing media,
         between the endpoints.

      R-ALLOW-RTP:
         A solution SHOULD be described which allows RTP media to be
         received by the calling party until SRTP has been negotiated
         with the answerer, after which SRTP is preferred over RTP.

6. Security Considerations

   This document lists requirements for securing media traffic. As
   such, it addresses security throughout the document.

7. IANA Considerations

   This document does not require actions by IANA.

8. Acknowledgements

   For contributions to the requirements portion of this document, the
   authors would like to thank the active participants of the RTPSEC BoF
   and on the RTPSEC mailing list, and a special thanks to Steffen Fries
   and Dragan Ignjatic for their excellent MIKEY comparison [RFC5197]
   document.
   The authors would furthermore like to thank the following people for
   their review, suggestions, and comments: Flemming Andreasen, Richard
   Barnes, Mark Baugher, Wolfgang Buecker, Werner Dittmann, Lakshminath
   Dondeti, John Elwell, Martin Euchner, Hans-Heinrich Grusdt, Christer
   Holmberg, Guenther Horn, Peter Howard, Leo Huang, Dragan Ignjatic,
   Cullen Jennings, Alan Johnston, Vesa Lehtovirta, Matt Lepinski, David
   McGrew, David Oran, Colin Perkins, Eric Raymond, Eric Rescorla, Peter
   Schneider, Srinath Thiruvengadam, Dave Ward, Dan York, and Phil
   Zimmermann.

9. References

9.1. Normative References

   [FIPS-140-2]
              NIST, "Security Requirements for Cryptographic Modules",
              June 2005.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
              A., Peterson, J., Sparks, R., Handley, M., and E.
              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
              June 2002.

   [RFC3262]  Rosenberg, J. and H. Schulzrinne, "Reliability of
              Provisional Responses in Session Initiation Protocol
              (SIP)", RFC 3262, June 2002.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264,
              June 2002.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, March 2004.

   [cryptval]
              NIST, "Cryptographic Module Validation Program",
              December 2006.

9.2. Informative References

   [H.248.1]  ITU, "Gateway control protocol", June 2000.

   [I-D.baugher-mmusic-sdp-dh]
              Baugher, M. and D. McGrew, "Diffie-Hellman Exchanges for
              Multimedia Sessions", draft-baugher-mmusic-sdp-dh-00 (work
              in progress), February 2006.
   [I-D.dondeti-msec-rtpsec-mikeyv2]
              Dondeti, L., "MIKEYv2: SRTP Key Management using MIKEY,
              revisited", draft-dondeti-msec-rtpsec-mikeyv2-01 (work in
              progress), March 2007.

   [I-D.fischl-sipping-media-dtls]
              Fischl, J., "Datagram Transport Layer Security (DTLS)
              Protocol for Protection of Media Traffic Established with
              the Session Initiation Protocol",
              draft-fischl-sipping-media-dtls-03 (work in progress),
              July 2007.

   [I-D.ietf-avt-dtls-srtp]
              McGrew, D. and E. Rescorla, "Datagram Transport Layer
              Security (DTLS) Extension to Establish Keys for Secure
              Real-time Transport Protocol (SRTP)",
              draft-ietf-avt-dtls-srtp-06 (work in progress),
              October 2008.

   [I-D.ietf-mmusic-ice]
              Rosenberg, J., "Interactive Connectivity Establishment
              (ICE): A Protocol for Network Address Translator (NAT)
              Traversal for Offer/Answer Protocols",
              draft-ietf-mmusic-ice-19 (work in progress), October 2007.

   [I-D.ietf-mmusic-media-path-middleboxes]
              Stucker, B. and H. Tschofenig, "Analysis of Middlebox
              Interactions for Signaling Protocol Communication along
              the Media Path",
              draft-ietf-mmusic-media-path-middleboxes-01 (work in
              progress), July 2008.

   [I-D.ietf-mmusic-sdp-capability-negotiation]
              Andreasen, F., "SDP Capability Negotiation",
              draft-ietf-mmusic-sdp-capability-negotiation-09 (work in
              progress), July 2008.

   [I-D.ietf-msec-mikey-ecc]
              Milne, A., "ECC Algorithms for MIKEY",
              draft-ietf-msec-mikey-ecc-03 (work in progress),
              June 2007.

   [I-D.ietf-sip-certs]
              Jennings, C. and J. Fischl, "Certificate Management
              Service for The Session Initiation Protocol (SIP)",
              draft-ietf-sip-certs-06 (work in progress), April 2008.

   [I-D.ietf-tls-rfc4346-bis]
              Dierks, T. and E. Rescorla, "The Transport Layer Security
              (TLS) Protocol Version 1.2", draft-ietf-tls-rfc4346-bis-10
              (work in progress), March 2008.

   [I-D.jennings-sipping-multipart]
              Wing, D. and C. Jennings, "Session Initiation Protocol
              (SIP) Offer/Answer with Multipart Alternative",
              draft-jennings-sipping-multipart-02 (work in progress),
              March 2006.

   [I-D.mcgrew-srtp-ekt]
              McGrew, D., "Encrypted Key Transport for Secure RTP",
              draft-mcgrew-srtp-ekt-03 (work in progress), July 2007.

   [I-D.stucker-sipping-early-media-coping]
              Stucker, B., "Coping with Early Media in the Session
              Initiation Protocol (SIP)",
              draft-stucker-sipping-early-media-coping-03 (work in
              progress), October 2006.

   [I-D.wing-sipping-srtp-key]
              Wing, D., Audet, F., Fries, S., Tschofenig, H., and A.
              Johnston, "Secure Media Recording and Transcoding with the
              Session Initiation Protocol",
              draft-wing-sipping-srtp-key-03 (work in progress),
              February 2008.

   [I-D.zimmermann-avt-zrtp]
              Zimmermann, P., Johnston, A., and J. Callas, "ZRTP: Media
              Path Key Agreement for Secure RTP",
              draft-zimmermann-avt-zrtp-10 (work in progress),
              October 2008.

   [RFC3326]  Schulzrinne, H., Oran, D., and G. Camarillo, "The Reason
              Header Field for the Session Initiation Protocol (SIP)",
              RFC 3326, December 2002.

   [RFC3372]  Vemuri, A. and J. Peterson, "Session Initiation Protocol
              for Telephones (SIP-T): Context and Architectures",
              BCP 63, RFC 3372, September 2002.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC3830]  Arkko, J., Carrara, E., Lindholm, F., Naslund, M., and K.
              Norrman, "MIKEY: Multimedia Internet KEYing", RFC 3830,
              August 2004.

   [RFC4474]  Peterson, J. and C. Jennings, "Enhancements for
              Authenticated Identity Management in the Session
              Initiation Protocol (SIP)", RFC 4474, August 2006.

   [RFC4492]  Blake-Wilson, S., Bolyard, N., Gupta, V., Hawk, C., and B.
              Moeller, "Elliptic Curve Cryptography (ECC) Cipher Suites
              for Transport Layer Security (TLS)", RFC 4492, May 2006.

   [RFC4568]  Andreasen, F., Baugher, M., and D. Wing, "Session
              Description Protocol (SDP) Security Descriptions for Media
              Streams", RFC 4568, July 2006.

   [RFC4650]  Euchner, M., "HMAC-Authenticated Diffie-Hellman for
              Multimedia Internet KEYing (MIKEY)", RFC 4650,
              September 2006.

   [RFC4738]  Ignjatic, D., Dondeti, L., Audet, F., and P. Lin,
              "MIKEY-RSA-R: An Additional Mode of Key Distribution in
              Multimedia Internet KEYing (MIKEY)", RFC 4738,
              November 2006.

   [RFC4771]  Lehtovirta, V., Naslund, M., and K. Norrman, "Integrity
              Transform Carrying Roll-Over Counter for the Secure
              Real-time Transport Protocol (SRTP)", RFC 4771,
              January 2007.

   [RFC4916]  Elwell, J., "Connected Identity in the Session Initiation
              Protocol (SIP)", RFC 4916, June 2007.

   [RFC4949]  Shirey, R., "Internet Security Glossary, Version 2",
              RFC 4949, August 2007.

   [RFC5027]  Andreasen, F. and D. Wing, "Security Preconditions for
              Session Description Protocol (SDP) Media Streams",
              RFC 5027, October 2007.

   [RFC5197]  Fries, S. and D. Ignjatic, "On the Applicability of
              Various Multimedia Internet KEYing (MIKEY) Modes and
              Extensions", RFC 5197, June 2008.

Appendix A. Overview and Evaluation of Existing Keying Mechanisms

   Based on how the SRTP keys are exchanged, each SRTP key exchange
   mechanism belongs to one general category:

      signaling path:
         All the keying is carried in the call signaling (SIP or SDP)
         path.

      media path:
         All the keying is carried in the SRTP/SRTCP media path, and no
         signaling whatsoever is carried in the call signaling path.

      signaling and media path:
         Parts of the keying are carried in the SRTP/SRTCP media path,
         and parts are carried in the call signaling (SIP or SDP) path.
   One of the significant benefits of SRTP over other end-to-end
   encryption mechanisms, such as IPsec, is that SRTP is bandwidth
   efficient and SRTP retains the header of RTP packets.

   Bandwidth efficiency is vital for VoIP in many scenarios where access
   bandwidth is limited or expensive, and retaining the RTP header is
   important for troubleshooting packet loss, delay, and jitter.

   Related to SRTP's characteristics is a goal that any SRTP keying
   mechanism also be efficient and not cause additional call setup
   delay. Contributors to additional call setup delay include network
   or database operations (retrieval of certificates and additional SIP
   or media path messages) and the computational overhead of
   establishing keys or validating certificates.

   When examining the choice between keying in the signaling path,
   keying in the media path, or keying in both paths, it is important to
   realize the media path is generally 'faster' than the SIP signaling
   path. The SIP signaling path has computational elements involved
   which parse and route SIP messages. The media path, on the other
   hand, does not normally have computational elements involved, and
   even when computational elements such as firewalls are involved, they
   cause very little additional delay. Thus, the media path can be
   useful for exchanging several messages to establish SRTP keys. A
   disadvantage of keying over the media path is that interworking
   different key exchange mechanisms requires the interworking function
   to be in the media path, rather than just in the signaling path; in
   practice this involvement is probably unavoidable anyway.

A.1. Signaling Path Keying Techniques

A.1.1. MIKEY-NULL

   MIKEY-NULL [RFC3830] has the offerer indicate the SRTP keys for both
   directions.
   The key is sent unencrypted in SDP, which means the SDP
   must be encrypted hop-by-hop (e.g., by using TLS (SIPS)) or
   end-to-end (e.g., by using S/MIME).

   MIKEY-NULL requires one message from offerer to answerer (half a
   round trip), and does not add additional media path messages.

A.1.2. MIKEY-PSK

   MIKEY-PSK (pre-shared key) [RFC3830] requires that all endpoints
   share one common key. MIKEY-PSK has the offerer encrypt the SRTP
   keys for both directions using this pre-shared key.

   MIKEY-PSK requires one message from offerer to answerer (half a round
   trip), and does not add additional media path messages.

A.1.3. MIKEY-RSA

   MIKEY-RSA [RFC3830] has the offerer encrypt the keys for both
   directions using the intended answerer's public key, which is
   obtained from a mechanism outside of MIKEY.

   MIKEY-RSA requires one message from offerer to answerer (half a round
   trip), and does not add additional media path messages. MIKEY-RSA
   requires the offerer to obtain the intended answerer's certificate.

A.1.4. MIKEY-RSA-R

   MIKEY-RSA-R [RFC4738] is essentially the same as MIKEY-RSA but
   reverses the role of the offerer and the answerer with regards to
   providing the keys. That is, the answerer encrypts the keys for both
   directions using the offerer's public key. Both the offerer and
   answerer validate each other's public keys using standard X.509
   validation techniques. MIKEY-RSA-R also enables sending certificates
   in the MIKEY message.

   MIKEY-RSA-R requires one message from offerer to answerer, and one
   message from answerer to offerer (full round trip), and does not add
   additional media path messages. MIKEY-RSA-R requires the offerer
   to validate the answerer's certificate.

A.1.5. MIKEY-DHSIGN

   In MIKEY-DHSIGN [RFC3830] the offerer and answerer derive the key
   from a Diffie-Hellman exchange.
   To prevent an active man-in-the-middle attack, the DH exchange
   itself is signed using each endpoint's private key and the
   associated public keys are validated using standard X.509 validation
   techniques.

   MIKEY-DHSIGN requires one message from offerer to answerer, and one
   message from answerer to offerer (full round trip), and does not add
   additional media path messages. MIKEY-DHSIGN requires the offerer
   and answerer to validate each other's certificates. MIKEY-DHSIGN
   also enables sending the answerer's certificate in the MIKEY message.

A.1.6. MIKEY-DHHMAC

   MIKEY-DHHMAC [RFC4650] uses a pre-shared secret to HMAC the
   Diffie-Hellman exchange, essentially combining aspects of MIKEY-PSK
   with MIKEY-DHSIGN, but without MIKEY-DHSIGN's need for certificate
   authentication.

   MIKEY-DHHMAC requires one message from offerer to answerer, and one
   message from answerer to offerer (full round trip), and does not add
   additional media path messages.

A.1.7. MIKEY-ECIES and MIKEY-ECMQV (MIKEY-ECC)

   ECC Algorithms For MIKEY [I-D.ietf-msec-mikey-ecc] describes how ECC
   can be used with MIKEY-RSA (using ECDSA signature) and with
   MIKEY-DHSIGN (using a new DH-Group code), and also defines two new
   ECC-based algorithms, Elliptic Curve Integrated Encryption Scheme
   (ECIES) and Elliptic Curve Menezes-Qu-Vanstone (ECMQV).

   With this proposal, the ECDSA signature, MIKEY-ECIES, and MIKEY-ECMQV
   function exactly like MIKEY-RSA, and the new DH-Group code functions
   exactly like MIKEY-DHSIGN. Therefore these ECC mechanisms are not
   discussed separately in this document.

A.1.8. Security Descriptions with SIPS

   Security Descriptions [RFC4568] has each side indicate the key it
   will use for transmitting SRTP media, and the keys are sent in the
   clear in SDP.
   Security Descriptions relies on hop-by-hop (TLS via
   "SIPS:") encryption to protect the keys exchanged in signaling.

   Security Descriptions requires one message from offerer to answerer,
   and one message from answerer to offerer (full round trip), and does
   not add additional media path messages.

A.1.9. Security Descriptions with S/MIME

   This keying mechanism is identical to Appendix A.1.8, except that
   rather than protecting the signaling with TLS, the entire SDP is
   encrypted with S/MIME.

A.1.10. SDP-DH (expired)

   SDP Diffie-Hellman [I-D.baugher-mmusic-sdp-dh] exchanges
   Diffie-Hellman messages in the signaling path to establish session
   keys. To protect against active man-in-the-middle attacks, the
   Diffie-Hellman exchange needs to be protected with S/MIME, SIPS, or
   SIP Identity [RFC4474] and SIP Connected Identity [RFC4916].

   SDP-DH requires one message from offerer to answerer, and one message
   from answerer to offerer (full round trip), and does not add
   additional media path messages.

A.1.11. MIKEYv2 in SDP (expired)

   MIKEYv2 [I-D.dondeti-msec-rtpsec-mikeyv2] adds mode negotiation to
   MIKEYv1 and removes the time synchronization requirement. It
   therefore now takes 2 round trips to complete. In the first round
   trip, the communicating parties learn each other's identities, agree
   on a MIKEY mode, crypto algorithm, and SRTP policy, and exchange
   nonces for replay protection. In the second round trip, they
   negotiate unicast and/or group SRTP context for SRTP and/or SRTCP.

   Furthermore, MIKEYv2 also defines an in-band negotiation mode as an
   alternative to SDP (see Appendix A.3.3).

A.2. Media Path Keying Technique

A.2.1. ZRTP

   ZRTP [I-D.zimmermann-avt-zrtp] does not exchange information in the
   signaling path, although it is possible for endpoints to exchange a
   hash of the ZRTP Hello message with "a=zrtp-hash" in the initial
   Offer if sent over an integrity-protected signaling channel; this
   provides some useful correlation between the signaling and media
   layers. In ZRTP the keys are exchanged entirely in the media path
   using a Diffie-Hellman exchange. The advantage to this mechanism is
   that the signaling channel is used only for call setup and the media
   channel is used to establish an encrypted channel -- much like
   encryption devices on the PSTN. ZRTP uses voice authentication of
   its Diffie-Hellman exchange by having each person read digits or
   words to the other person. Subsequent sessions with the same ZRTP
   endpoint can be authenticated using the stored hash of the previously
   negotiated key rather than voice authentication. ZRTP uses 4 media
   path messages (Hello, Commit, DHPart1, and DHPart2) to establish the
   SRTP key, and 3 media path confirmation messages. These initial
   messages are all sent as non-RTP packets.

      Note that when ZRTP probing is used, unencrypted RTP can be
      exchanged until the SRTP keys are established.

A.3. Signaling and Media Path Keying Techniques

A.3.1. EKT

   EKT [I-D.mcgrew-srtp-ekt] relies on another SRTP key exchange
   protocol, such as Security Descriptions or MIKEY, for bootstrapping.
   In the initial phase, each member of a conference uses an SRTP key
   exchange protocol to establish a common key encryption key (KEK).
   Each member may use the KEK to securely transport its SRTP master key
   and current SRTP rollover counter (ROC), via RTCP, to the other
   participants in the session.
   EKT requires the offerer to send some parameters (EKT_Cipher, KEK,
   and security parameter index (SPI)) via the bootstrapping protocol
   such as Security Descriptions or MIKEY. Each answerer sends an SRTCP
   message which contains the answerer's SRTP Master Key, rollover
   counter, and the SRTP sequence number. Rekeying is done by sending a
   new SRTCP message. For reliable transport, multiple RTCP messages
   need to be sent.

A.3.2. DTLS-SRTP

   DTLS-SRTP [I-D.ietf-avt-dtls-srtp] exchanges public key fingerprints
   in SDP [I-D.fischl-sipping-media-dtls] and then establishes a DTLS
   session over the media channel. The endpoints use the DTLS handshake
   to agree on crypto suites and establish SRTP session keys. SRTP
   packets are then exchanged between the endpoints.

   DTLS-SRTP requires one message from offerer to answerer (half round
   trip), and one message from the answerer to offerer (full round trip)
   so the offerer can correlate the SDP answer with the answering
   endpoint. DTLS-SRTP uses 4 media path messages to establish the SRTP
   key.

   This document assumes DTLS will use TLS_RSA_WITH_AES_128_CBC_SHA as
   its cipher suite, which is the mandatory-to-implement cipher suite in
   TLS [I-D.ietf-tls-rfc4346-bis].

A.3.3. MIKEYv2 Inband (expired)

   As noted in Appendix A.1.11, MIKEYv2 also defines an in-band
   negotiation mode as an alternative to SDP. The draft does not yet
   specify what in-band actually means (i.e., UDP, RTP, RTCP, etc.).

A.4. Evaluation Criteria - SIP

   This section considers how each keying mechanism interacts with SIP
   features.

A.4.1. Secure Retargeting and Secure Forking

   Retargeting and forking of signaling requests are described within
   Section 4.2. The following builds upon this description.
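
   Several entries in the comparison in this section refer to a
   "two-time pad", which arises when two forked branches hold the same
   key and transmit with the same SSRC. The following Python sketch
   illustrates why that matters. It uses a toy SHA-256-based keystream
   as a stand-in for the SRTP cipher; the stand-in, the key, and the
   payloads are assumptions made for this example. The only property
   relied upon is that the keystream depends on the key, the SSRC, and
   the packet index.

```python
import hashlib


def keystream(key: bytes, ssrc: int, index: int, length: int) -> bytes:
    """Toy keystream derived from (key, SSRC, packet index); not real SRTP."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(
            key
            + ssrc.to_bytes(4, "big")
            + index.to_bytes(6, "big")
            + counter.to_bytes(4, "big")
        ).digest()
        counter += 1
    return out[:length]


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, ssrc: int, index: int, payload: bytes) -> bytes:
    return xor(payload, keystream(key, ssrc, index, len(payload)))


shared_key = b"key seen by every forked branch"
payload_a = b"audio from branch A!"
payload_b = b"audio from branch B?"

# Both branches happen to pick the same SSRC and packet index.
cipher_a = encrypt(shared_key, 0x1234ABCD, 1, payload_a)
cipher_b = encrypt(shared_key, 0x1234ABCD, 1, payload_b)

# The keystream cancels out: an eavesdropper learns payload_a XOR payload_b.
leak = xor(cipher_a, cipher_b)

# With a different SSRC the keystreams differ and nothing cancels.
cipher_c = encrypt(shared_key, 0x0BADF00D, 1, payload_b)
no_leak = xor(cipher_a, cipher_c)
```

   In this sketch, XORing the two ciphertexts that share an SSRC yields
   the XOR of the two plaintexts without any knowledge of the key.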
   The following list compares the behavior of secure forking, answering
   association, two-time pads, and secure retargeting for each keying
   mechanism.

      MIKEY-NULL
         Secure Forking: No, all AORs see offerer's and answerer's keys.
         Answer is associated with media by the SSRC in MIKEY.
         Additionally, a two-time pad occurs if two branches choose the
         same 32-bit SSRC and transmit SRTP packets.

         Secure Retargeting: No, all targets see offerer's and
         answerer's keys. Suffers from retargeting identity problem.

      MIKEY-PSK
         Secure Forking: No, all AORs see offerer's and answerer's keys.
         Answer is associated with media by the SSRC in MIKEY. Note
         that all AORs must share the same pre-shared key in order for
         forking to work at all with MIKEY-PSK. Additionally, a
         two-time pad occurs if two branches choose the same 32-bit SSRC
         and transmit SRTP packets.

         Secure Retargeting: Not secure. For retargeting to work, the
         final target must possess the correct PSK. While this is
         likely in scenarios where the call is targeted to another
         device belonging to the same user (forking), it is very
         unlikely that other users will possess that PSK and be able to
         successfully answer that call.

      MIKEY-RSA
         Secure Forking: No, all AORs see offerer's and answerer's keys.
         Answer is associated with media by the SSRC in MIKEY. Note
         that all AORs must share the same private key in order for
         forking to work at all with MIKEY-RSA. Additionally, a
         two-time pad occurs if two branches choose the same 32-bit SSRC
         and transmit SRTP packets.

         Secure Retargeting: No.

      MIKEY-RSA-R
         Secure Forking: Yes. Answer is associated with media by the
         SSRC in MIKEY.

         Secure Retargeting: Yes.

      MIKEY-DHSIGN
         Secure Forking: Yes, each forked endpoint negotiates unique
         keys with the offerer for both directions. Answer is
         associated with media by the SSRC in MIKEY.
1371 Secure Retargeting: Yes, each target negotiates unique keys 1372 with the offerer for both directions. 1374 MIKEYv2 in SDP 1375 The behavior will depend on which mode is picked. 1377 MIKEY-DHHMAC 1378 Secure Forking: Yes, each forked endpoint negotiates unique 1379 keys with the offerer for both directions. Answer is 1380 associated with media by the SSRC in MIKEY. 1382 Secure Retargeting: Yes, each target negotiates unique keys 1383 with the offerer for both directions. Note that for the keys 1384 to be meaningful, it would require the PSK to be the same for 1385 all the potential intermediaries, which would only happen 1386 within a single domain. 1388 Security Descriptions with SIPS 1389 Secure Forking: No. Each forked endpoint sees the offerer's 1390 key. Answer is not associated with media. 1392 Secure Retargeting: No. Each target sees the offerer's key. 1394 Security Descriptions with S/MIME 1395 Secure Forking: No. Each forked endpoint sees the offerer's 1396 key. Answer is not associated with media. 1398 Secure Retargeting: No. Each target sees the offerer's key. 1399 Suffers from retargeting identity problem. 1401 SDP-DH 1402 Secure Forking: Yes. Each forked endpoint calculates a unique 1403 SRTP key. Answer is not associated with media. 1405 Secure Retargeting: Yes. The final target calculates a unique 1406 SRTP key. 1408 ZRTP 1409 Secure Forking: Yes. Each forked endpoint calculates a unique SRTP key. With 1410 the "a=zrtp-hash" attribute, the media can be associated with 1411 an answer. 1413 Secure Retargeting: Yes. The final target calculates a unique 1414 SRTP key. 1416 EKT 1417 Secure Forking: Inherited from the bootstrapping mechanism (the 1418 specific MIKEY mode or Security Descriptions). Answer is 1419 associated with media by the SPI in the EKT protocol. 1422 Secure Retargeting: Inherited from the bootstrapping mechanism 1423 (the specific MIKEY mode or Security Descriptions).
1425 DTLS-SRTP 1426 Secure Forking: Yes. Each forked endpoint calculates a unique 1427 SRTP key. Answer is associated with media by the certificate 1428 fingerprint in signaling and certificate in the media path. 1430 Secure Retargeting: Yes. The final target calculates a unique 1431 SRTP key. 1433 MIKEYv2 Inband 1434 The behavior will depend on which mode is picked. 1436 A.4.2. Clipping Media Before SDP Answer 1438 Clipping media before receiving the signaling answer is described 1439 within Section 4.1. The following builds upon this description. 1441 Furthermore, the problem of clipping gets compounded when forking is 1442 used. For example, if using a Diffie-Hellman keying technique with 1443 security preconditions that forks to 20 endpoints, the call initiator 1444 would get 20 provisional responses containing 20 signed Diffie- 1445 Hellman half keys. Calculating 20 DH secrets and validating 1446 signatures can be a difficult task depending on the device 1447 capabilities. 1449 The following list compares the behavior of clipping before SDP 1450 answer for each keying mechanism. 1452 MIKEY-NULL 1453 Not clipped. The offerer provides the answerer's keys. 1455 MIKEY-PSK 1456 Not clipped. The offerer provides the answerer's keys. 1458 MIKEY-RSA 1459 Not clipped. The offerer provides the answerer's keys. 1461 MIKEY-RSA-R 1462 Clipped. The answer contains the answerer's encryption key. 1464 MIKEY-DHSIGN 1465 Clipped. The answer contains the answerer's Diffie-Hellman 1466 response. 1468 MIKEY-DHHMAC 1469 Clipped. The answer contains the answerer's Diffie-Hellman 1470 response. 1472 MIKEYv2 in SDP 1473 The behavior will depend on which mode is picked. 1475 Security Descriptions with SIPS 1476 Clipped. The answer contains the answerer's encryption key. 1478 Security Descriptions with S/MIME 1479 Clipped. The answer contains the answerer's encryption key. 1481 SDP-DH 1482 Clipped. The answer contains the answerer's Diffie-Hellman 1483 response. 
1485 ZRTP 1486 Not clipped because the session initially uses RTP. While RTP 1487 is flowing, both ends negotiate SRTP keys in the media path and 1488 then switch to using SRTP. 1490 EKT 1491 Not clipped, as long as the first RTCP packet (containing the 1492 answerer's key) is not lost in transit. The answerer sends its 1493 encryption key in RTCP, which arrives at the same time as (or 1494 before) the first SRTP packet encrypted with that key. 1496 Note: RTCP needs to work, in the answerer-to-offerer 1497 direction, before the offerer can decrypt SRTP media. 1499 DTLS-SRTP 1500 No clipping after the DTLS-SRTP handshake has completed. SRTP 1501 keys are exchanged in the media path. Need to wait for SDP 1502 answer to ensure DTLS-SRTP handshake was done with an 1503 authorized party. 1505 If a middlebox interferes with the media path, there can be 1506 clipping [I-D.ietf-mmusic-media-path-middleboxes]. 1508 MIKEYv2 Inband 1509 Not clipped. Keys are exchanged in the media path without 1510 relying on the signaling path. 1512 A.4.3. SSRC and ROC 1514 In SRTP, a cryptographic context is defined as the SSRC, destination 1515 network address, and destination transport port number. In RTP, by contrast, 1516 a flow is defined as the destination network address and destination 1517 transport port number. This results in a problem -- how to 1518 communicate the SSRC so that the SSRC can be used for the 1519 cryptographic context. 1521 Several approaches have emerged for this communication. One, used by all 1522 MIKEY modes, is to communicate the SSRCs to the peer in the MIKEY 1523 exchange. Another, used by Security Descriptions, is to apply "late 1524 binding" -- that is, any new packet containing a previously-unseen 1525 SSRC (which arrives at the same destination network address and 1526 destination transport port number) will create a new cryptographic 1527 context.
Another approach, common amongst techniques with media-path 1528 SRTP key establishment, is to require a handshake over that media 1529 path before SRTP packets are sent. MIKEY's approach changes RTP's 1530 SSRC collision detection behavior by requiring RTP to pre-establish 1531 the SSRC values for each session. 1533 Another related issue is that SRTP introduces a rollover counter 1534 (ROC), which records how many times the SRTP sequence number has 1535 rolled over. As the sequence number is used for SRTP's default 1536 ciphers, it is important that all endpoints know the value of the 1537 ROC. The ROC starts at 0 at the beginning of a session. 1539 Some keying mechanisms cause a two-time pad to occur if two endpoints 1540 of a forked call have an SSRC collision. 1542 Note: A proposal has been made to send the ROC value on every Nth 1543 SRTP packet[RFC4771]. This proposal has not yet been incorporated 1544 into this document. 1546 The following list examines handling of SSRC and ROC: 1548 MIKEY-NULL 1549 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1550 packets it transmits. 1552 MIKEY-PSK 1553 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1554 packets it transmits. 1556 MIKEY-RSA 1557 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1558 packets it transmits. 1560 MIKEY-RSA-R 1561 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1562 packets it transmits. 1564 MIKEY-DHSIGN 1565 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1566 packets it transmits. 1568 MIKEY-DHHMAC 1569 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1570 packets it transmits. 1572 MIKEYv2 in SDP 1573 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1574 packets it transmits. 1576 Security Descriptions with SIPS 1577 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1578 used. 1580 Security Descriptions with S/MIME 1581 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1582 used. 
1584 SDP-DH 1585 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1586 used. 1588 ZRTP 1589 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1590 used. 1592 EKT 1593 The SSRC of the SRTCP packet containing an EKT update 1594 corresponds to the SRTP master key and other parameters within 1595 that packet. 1597 DTLS-SRTP 1598 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1599 used. 1601 MIKEYv2 Inband 1602 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1603 packets it transmits. 1605 A.5. Evaluation Criteria - Security 1607 This section evaluates each keying mechanism on the basis of its 1608 security properties. 1610 A.5.1. Distribution and Validation of Persistent Public Keys and 1611 Certificates 1613 Using persistent public keys for confidentiality and authentication 1614 can introduce requirements for two types of systems, often 1615 implemented using certificates: (1) a system to distribute those 1616 persistent public keys, and (2) a system for validating 1617 those persistent public keys. We refer to the former as a key 1618 distribution system and the latter as an authentication 1619 infrastructure. In many cases, a monolithic public key 1620 infrastructure (PKI) is used to fulfill both of these roles. 1621 However, these functions can be provided by many other systems. For 1622 instance, key distribution may be accomplished by any public 1623 repository of keys. Any system in which the two endpoints have 1624 access to trust anchors and intermediate CA certificates that can be 1625 used to validate other endpoints' certificates (including a system of 1626 self-signed certificates) can be used to support certificate 1627 validation in the below schemes. 1629 With real-time communications it is desirable to avoid fetching or 1630 validating certificates that delay call setup. Rather, it is 1631 preferable to fetch or validate certificates in such a way that call 1632 setup is not delayed.
For example, a certificate can be validated 1633 while the phone is ringing or can be validated while ring-back tones 1634 are being played or even while the called party is answering the 1635 phone and saying "hello". Even better is to avoid fetching or 1636 validating persistent public keys at all. 1638 SRTP key exchange mechanisms that require a particular authentication 1639 infrastructure to operate (whether for distribution or validation) 1640 are gated on the deployment of such an infrastructure available to 1641 both endpoints. This means that no media security is achievable 1642 until such an infrastructure exists. For SIP, something like sip- 1643 certs [I-D.ietf-sip-certs] might be used to obtain the certificate of 1644 a peer. 1646 Note: Even if sip-certs [I-D.ietf-sip-certs] were deployed, the 1647 retargeting problem (Appendix A.4.1) would still prevent 1648 successful deployment of keying techniques which require the 1649 offerer to obtain the actual target's public key. 1651 The following list compares the requirements introduced by the use of 1652 public-key cryptography in each keying mechanism, both for public key 1653 distribution and for certificate validation. 1655 MIKEY-NULL 1656 Public-key cryptography is not used. 1658 MIKEY-PSK 1659 Public-key cryptography is not used. Rather, all endpoints 1660 must have some way to exchange per-endpoint or per-system pre- 1661 shared keys. 1663 MIKEY-RSA 1664 The offerer obtains the intended answerer's public key before 1665 initiating the call. This public key is used to encrypt the 1666 SRTP keys. There is no defined mechanism for the offerer to 1667 obtain the answerer's public key, although [I-D.ietf-sip-certs] 1668 might be viable in the future. 1670 The offer may also contain a certificate for the offerer, which 1671 would require an authentication infrastructure in order to be 1672 validated by the receiver.
1674 MIKEY-RSA-R 1675 The offer contains the offerer's certificate, and the answer 1676 contains the answerer's certificate. The answerer uses the 1677 public key in the certificate to encrypt the SRTP keys that 1678 will be used by the offerer and the answerer. An 1679 authentication infrastructure is necessary to validate the 1680 certificates. 1682 MIKEY-DHSIGN 1683 An authentication infrastructure is used to authenticate the 1684 public key that is included in the MIKEY message. 1686 MIKEY-DHHMAC 1687 Public-key cryptography is not used. Rather, all endpoints 1688 must have some way to exchange per-endpoint or per-system pre- 1689 shared keys. 1691 MIKEYv2 in SDP 1692 The behavior will depend on which mode is picked. 1694 Security Descriptions with SIPS 1695 Public-key cryptography is not used. 1697 Security Descriptions with S/MIME 1698 Use of S/MIME requires that the endpoints be able to fetch and 1699 validate certificates for each other. The offerer must obtain 1700 the intended target's certificate and encrypt the SDP offer 1701 with the public key contained in the target's certificate. The 1702 answerer must obtain the offerer's certificate and encrypt the 1703 SDP answer with the public key contained in the offerer's 1704 certificate. 1706 SDP-DH 1707 Public-key cryptography is not used. 1709 ZRTP 1710 Public-key cryptography is used (Diffie-Hellman), but without 1711 dependence on persistent public keys. Thus, certificates are 1712 not fetched or validated. 1714 EKT 1715 Public-key cryptography is not used by itself, but might be 1716 used by the EKT bootstrapping keying mechanism (such as certain 1717 MIKEY modes). 1719 DTLS-SRTP 1720 Remote party's certificate is sent in media path, and a 1721 fingerprint of the same certificate is sent in the signaling 1722 path. 1724 MIKEYv2 Inband 1725 The behavior will depend on which mode is picked. 1727 A.5.2.
Perfect Forward Secrecy 1729 In the context of SRTP, Perfect Forward Secrecy is the property that 1730 SRTP session keys that protected a previous session are not 1731 compromised if the static keys belonging to the endpoints are 1732 compromised. That is, if someone were to record your encrypted 1733 session content and later acquire either party's private key, that 1734 encrypted session content would be safe from decryption if your key 1735 exchange mechanism had perfect forward secrecy. 1737 The following list describes whether each key exchange mechanism provides 1738 PFS. 1740 MIKEY-NULL 1741 Not applicable; MIKEY-NULL does not have a long-term secret. 1743 MIKEY-PSK 1744 No PFS. 1746 MIKEY-RSA 1747 No PFS. 1749 MIKEY-RSA-R 1750 No PFS. 1752 MIKEY-DHSIGN 1753 PFS is provided with the Diffie-Hellman exchange. 1755 MIKEY-DHHMAC 1756 PFS is provided with the Diffie-Hellman exchange. 1758 MIKEYv2 in SDP 1759 The behavior will depend on which mode is picked. 1761 Security Descriptions with SIPS 1762 Not applicable; Security Descriptions does not have a long-term 1763 secret. 1765 Security Descriptions with S/MIME 1766 Not applicable; Security Descriptions does not have a long-term 1767 secret. 1769 SDP-DH 1770 PFS is provided with the Diffie-Hellman exchange. 1772 ZRTP 1773 PFS is provided with the Diffie-Hellman exchange. 1775 EKT 1776 No PFS. 1778 DTLS-SRTP 1779 PFS is provided if the negotiated cipher suite uses ephemeral 1780 keys (e.g., Diffie-Hellman (DHE_RSA [I-D.ietf-tls-rfc4346-bis]) 1781 or Elliptic Curve Diffie-Hellman [RFC4492]). 1783 MIKEYv2 Inband 1784 The behavior will depend on which mode is picked. 1786 A.5.3. Best Effort Encryption 1788 With best effort encryption, SRTP is used with endpoints that support 1789 SRTP, otherwise RTP is used.
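The best effort rule stated above can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name and the two boolean capability flags are hypothetical, and how an endpoint actually learns the peer's capability (SDP Capability Negotiation, probing, and so on) is outside the sketch. The profile strings are the RTP/SAVP and RTP/AVP profile names used elsewhere in this section.

```python
SRTP_PROFILE = "RTP/SAVP"  # secure profile used by SRTP
RTP_PROFILE = "RTP/AVP"    # plain RTP profile


def best_effort_profile(local_supports_srtp: bool, remote_supports_srtp: bool) -> str:
    """Hypothetical helper: use SRTP only when both endpoints support it.

    Any other combination falls back to plain RTP, which is the
    'best effort' behavior described in the text.
    """
    if local_supports_srtp and remote_supports_srtp:
        return SRTP_PROFILE
    return RTP_PROFILE


# Usage: an SRTP-capable caller reaching an RTP-only phone falls back to RTP.
print(best_effort_profile(True, False))
print(best_effort_profile(True, True))
```

The fallback branch is what distinguishes best effort encryption from mandatory encryption, where the session would fail instead of continuing in the clear.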
1791 SIP needs a backwards-compatible best effort encryption in order for 1792 SRTP to work successfully with SIP retargeting and forking when there 1793 is a mix of forked or retargeted devices that support SRTP and don't 1794 support SRTP. 1796 Consider the case of Bob, with a phone that only does RTP and a 1797 voice mail system that supports SRTP and RTP. If Alice calls Bob 1798 with an SRTP offer, Bob's RTP-only phone will reject the media 1799 stream (with an empty "m=" line) because Bob's phone doesn't 1800 understand SRTP (RTP/SAVP). Alice's phone will see this rejected 1801 media stream and may terminate the entire call (BYE) and re- 1802 initiate the call as RTP-only, or Alice's phone may decide to 1803 continue with call setup with the SRTP-capable leg (the voice mail 1804 system). If Alice's phone decided to re-initiate the call as RTP- 1805 only, and Bob doesn't answer his phone, Alice will then leave 1806 voice mail using only RTP, rather than SRTP as expected. 1808 Currently, several techniques are commonly considered as candidates 1809 to provide opportunistic encryption: 1811 multipart/alternative 1812 [I-D.jennings-sipping-multipart] describes how to form a 1813 multipart/alternative body part in SIP. The significant issues 1814 with this technique are (1) that multipart MIME is incompatible 1815 with existing SIP proxies, firewalls, Session Border Controllers, 1816 and endpoints and (2) when forking, the Heterogeneous Error 1817 Response Forking Problem (HERFP) [RFC3326] causes problems if such 1818 non-multipart-capable endpoints were involved in the forking. 1820 session attribute 1821 With this technique, the endpoints signal their desire to do SRTP 1822 by signaling RTP (RTP/AVP), and using an attribute ("a=") in the 1823 SDP. This technique is entirely backwards compatible with non- 1824 SRTP-aware endpoints, but doesn't use the RTP/SAVP protocol 1825 registered by SRTP [RFC3711]. 
1827 SDP Capability Negotiation 1828 SDP Capability Negotiation 1829 [I-D.ietf-mmusic-sdp-capability-negotiation] provides a backwards- 1830 compatible mechanism to allow offering both SRTP and RTP in a 1831 single offer. This is the preferred technique. 1833 Probing 1834 With this technique, the endpoints first establish an RTP session 1835 using RTP (RTP/AVP). The endpoints send probe messages, over the 1836 media path, to determine if the remote endpoint supports their 1837 keying technique. A disadvantage of probing is an active attacker 1838 can interfere with probes, and until probing completes (and SRTP 1839 is established) the media is in the clear. 1841 The preferred technique, SDP Capability Negotiation 1842 [I-D.ietf-mmusic-sdp-capability-negotiation], can be used with all 1843 key exchange mechanisms. What remains unique is ZRTP, which can also 1844 accomplish its best effort encryption by probing (sending ZRTP 1845 messages over the media path) or by session attribute (see "a=zrtp- 1846 hash" in [I-D.zimmermann-avt-zrtp]). Current implementations of ZRTP 1847 use probing. 1849 A.5.4. Upgrading Algorithms 1851 It is necessary to allow upgrading SRTP encryption and hash 1852 algorithms, as well as upgrading the cryptographic functions used for 1853 the key exchange mechanism. With SIP's offer/answer model, this can 1854 be computationally expensive because the offer needs to contain all 1855 combinations of the key exchange mechanisms (all MIKEY modes, 1856 Security Descriptions) and all SRTP cryptographic suites (AES-128, 1857 AES-256) and all SRTP cryptographic hash functions (SHA-1, SHA-256) 1858 that the offerer supports. In order to do this, the offerer has to 1859 expend CPU resources to build an offer containing all of this 1860 information, which becomes computationally prohibitive.
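To make the growth concrete, the following sketch enumerates the combinations such an offer would have to carry. It is an illustration only: the mechanism, cipher, and hash names are the ones listed in this section, while splitting them into three independent axes and the helper name offer_alternatives are assumptions made for the example, not a description of any SDP encoding.

```python
from itertools import product

# Names taken from this section; treating them as three independent
# axes is an illustrative assumption.
KEY_EXCHANGES = [
    "MIKEY-NULL", "MIKEY-PSK", "MIKEY-RSA", "MIKEY-RSA-R",
    "MIKEY-DHSIGN", "MIKEY-DHHMAC", "Security Descriptions",
]
CIPHERS = ["AES-128", "AES-256"]
HASHES = ["SHA-1", "SHA-256"]


def offer_alternatives(key_exchanges, ciphers, hashes):
    """Hypothetical helper: every (key exchange, cipher, hash) tuple to offer."""
    return list(product(key_exchanges, ciphers, hashes))


alternatives = offer_alternatives(KEY_EXCHANGES, CIPHERS, HASHES)
print(len(alternatives))  # 7 * 2 * 2 = 28
```

Adding one more cipher or hash multiplies the count, and for mechanisms that need a public-key or Diffie-Hellman operation per offered suite, each extra tuple carries real CPU cost for the offerer.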
1862 Thus, it is important to keep the offerer's CPU impact fixed so that 1863 offering multiple new SRTP encryption and hash functions incurs no 1864 additional expense. 1866 The following list describes the CPU effort involved in using each 1867 key exchange technique. 1869 MIKEY-NULL 1870 No significant computational expense. 1872 MIKEY-PSK 1873 No significant computational expense. 1875 MIKEY-RSA 1876 For each offered SRTP crypto suite, the offerer has to perform 1877 an RSA operation to encrypt the TGK. 1879 MIKEY-RSA-R 1880 For each offered SRTP crypto suite, the offerer has to perform 1881 a public key operation to sign the MIKEY message. 1883 MIKEY-DHSIGN 1884 For each offered SRTP crypto suite, the offerer has to perform 1885 a Diffie-Hellman operation, and a public key operation to sign 1886 the Diffie-Hellman output. 1888 MIKEY-DHHMAC 1889 For each offered SRTP crypto suite, the offerer has to perform 1890 a Diffie-Hellman operation. 1892 MIKEYv2 in SDP 1893 The behavior will depend on which mode is picked. 1895 Security Descriptions with SIPS 1896 No significant computational expense. 1898 Security Descriptions with S/MIME 1899 S/MIME requires the offerer and the answerer to encrypt the SDP 1900 with the other's public key, and to decrypt the received SDP 1901 with their own private key. 1903 SDP-DH 1904 For each offered SRTP crypto suite, the offerer has to perform 1905 a Diffie-Hellman operation. 1907 ZRTP 1908 The offerer has no additional computational expense at all, as 1909 the offer contains no information about ZRTP or might contain 1910 "a=zrtp-hash". 1912 EKT 1913 The offerer's computational expense depends entirely on the EKT 1914 bootstrapping mechanism selected (one or more MIKEY modes or 1915 Security Descriptions). 1917 DTLS-SRTP 1918 The offerer has no additional computational expense at all, as 1919 the offer contains only a fingerprint of the certificate that 1920 will be presented in the DTLS exchange.
1922 MIKEYv2 Inband 1923 The behavior will depend on which mode is picked. 1925 Appendix B. Out-of-Scope 1927 The compromise of an endpoint that has access to decrypted media 1928 (e.g., SIP user agent, transcoder, recorder) is out of scope of this 1929 document. Such a compromise might be via privilege escalation, 1930 installation of a virus or trojan horse, or similar attacks. 1932 B.1. Shared Key Conferencing 1934 The consensus on the RTPSEC mailing list was to concentrate on 1935 unicast, point-to-point sessions. Thus, there are no requirements 1936 related to shared key conferencing. This section is retained for 1937 informational purposes. 1939 Large audio and video conference bridges 1940 scale most efficiently by encrypting the current speaker once and 1941 distributing that stream to the conference attendees. Typically, 1942 inactive participants receive the same streams -- they hear (or see) 1943 the active speaker(s), and the active speakers receive distinct 1944 streams that don't include themselves. In order to maintain 1945 confidentiality of such conferences where listeners share a common 1946 key, all listeners must be rekeyed when a listener joins or leaves a 1947 conference. 1949 An important use case for mixers/translators is a conference bridge:
1951               +----+
1952   A --- 1 --->|    |
1953     <-- 2 ----| M  |
1954               | I  |
1955   B --- 3 --->| X  |
1956     <-- 4 ----| E  |
1957               | R  |
1958   C --- 5 --->|    |
1959     <-- 6 ----|    |
1960               +----+
1962 Figure 3: Centralized Keying 1964 In the figure above, 1, 3, and 5 are RTP media contributions from 1965 Alice, Bob, and Carol, and 2, 4, and 6 are the RTP flows to those 1966 devices carrying the 'mixed' media. 1968 Several scenarios are possible: 1970 a. Multiple inbound sessions: 1, 3, and 5 are distinct RTP sessions, 1972 b. Multiple outbound sessions: 2, 4, and 6 are distinct RTP 1973 sessions, 1975 c.
Single inbound session: 1, 3, and 5 are just different sources 1976 within the same RTP session, 1978 d. Single outbound session: 2, 4, and 6 are different flows of the 1979 same (multi-unicast) RTP session 1981 If there are multiple inbound sessions and multiple outbound sessions 1982 (scenarios a and b), then every keying mechanism behaves as if the 1983 mixer were an end point and can set up a point-to-point secure 1984 session between the participant and the mixer. This is the simplest 1985 situation, but is computationally wasteful, since SRTP processing has 1986 to be done independently for each participant. The use of multiple 1987 inbound sessions (scenario a) doesn't waste computational resources, 1988 though it does consume additional cryptographic context on the mixer 1989 for each participant and has the advantage of data origin 1990 authentication. 1992 To support a single outbound session (scenario d), the mixer has to 1993 dictate its encryption key to the participants. Some keying 1994 mechanisms allow the transmitter to determine its own key, and others 1995 allow the offerer to determine the key for the offerer and answerer. 1996 Depending on how the call is established, the offerer might be a 1997 participant (such as a participant dialing into a conference bridge) 1998 or the offerer might be the mixer (such as a conference bridge 1999 calling a participant). The use of offerless INVITEs may help some 2000 keying mechanisms reverse the role of offerer/answerer. A 2001 difficulty, however, is knowing a priori if the role should be 2002 reversed for a particular call. The significant advantage of a 2003 single outbound session is the number of SRTP encryption operations 2004 remains constant even as the number of participants increases. 2005 However, a disadvantage is that data origin authentication is lost, 2006 allowing any participant to spoof the sender (because all 2007 participants know the sender's SRTP key). 2009 Appendix C. 
Requirement renumbering in -02 2011 [[RFC Editor: Please delete this section prior to publication.]] 2013 Previous versions of this document used requirement numbers, which 2014 were changed to mnemonics as follows: 2016 R1 R-FORK-RETARGET 2018 R2 R-BEST-SECURE 2020 R3 R-DISTINCT 2022 R4 R-REUSE; changed from 'MAY' to 'protocol MUST support, and 2023 SHOULD implement' 2025 R5 R-AVOID-CLIPPING 2027 R6 R-PASS-MEDIA 2029 R7 R-PASS-SIG 2031 R8 R-PFS 2033 R9 R-COMPUTE 2035 R10 R-RTP-VALID 2037 R11 (folded into R4; was reuse previous session) 2039 R12 R-CERTS 2041 R13 R-FIPS 2043 R14 R-ASSOC 2045 R15 R-ALLOW-RTP 2047 R16 R-DOS 2049 R17 R-SIG-MEDIA 2051 R18 R-EXISTING 2053 R19 R-AGILITY 2055 R20 R-DOWNGRADE 2057 R21 R-NEGOTIATE 2059 R23 R-OTHER-SIGNALING 2060 R23 R-RECORDING (R23 was duplicated in previous versions of the 2061 document) 2063 R24 (deleted; was lawful intercept) 2065 R25 R-TRANSCODER 2067 R26 R-PSTN 2069 R27 R-ID-BINDING 2071 R28 R-ACT-ACT 2073 Authors' Addresses 2075 Dan Wing (editor) 2076 Cisco Systems, Inc. 2077 170 West Tasman Drive 2078 San Jose, CA 95134 2079 USA 2081 Email: dwing@cisco.com 2083 Steffen Fries 2084 Siemens AG 2085 Otto-Hahn-Ring 6 2086 Munich, Bavaria 81739 2087 Germany 2089 Email: steffen.fries@siemens.com 2091 Hannes Tschofenig 2092 Nokia Siemens Networks 2093 Otto-Hahn-Ring 6 2094 Munich, Bavaria 81739 2095 Germany 2097 Email: Hannes.Tschofenig@nsn.com 2098 URI: http://www.tschofenig.priv.at 2099 Francois Audet 2100 Nortel 2101 4655 Great America Parkway 2102 Santa Clara, CA 95054 2103 USA 2105 Email: audet@nortel.com 2107 Full Copyright Statement 2109 Copyright (C) The IETF Trust (2008). 2111 This document is subject to the rights, licenses and restrictions 2112 contained in BCP 78, and except as set forth therein, the authors 2113 retain all their rights. 
2115 This document and the information contained herein are provided on an 2116 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 2117 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 2118 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS 2119 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 2120 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 2121 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 2123 Intellectual Property 2125 The IETF takes no position regarding the validity or scope of any 2126 Intellectual Property Rights or other rights that might be claimed to 2127 pertain to the implementation or use of the technology described in 2128 this document or the extent to which any license under such rights 2129 might or might not be available; nor does it represent that it has 2130 made any independent effort to identify any such rights. Information 2131 on the procedures with respect to rights in RFC documents can be 2132 found in BCP 78 and BCP 79. 2134 Copies of IPR disclosures made to the IETF Secretariat and any 2135 assurances of licenses to be made available, or the result of an 2136 attempt made to obtain a general license or permission for the use of 2137 such proprietary rights by implementers or users of this 2138 specification can be obtained from the IETF on-line IPR repository at 2139 http://www.ietf.org/ipr. 2141 The IETF invites any interested party to bring to its attention any 2142 copyrights, patents or patent applications, or other proprietary 2143 rights that may cover technology that may be required to implement 2144 this standard. Please address the information to the IETF at 2145 ietf-ipr@ietf.org. 2147 Acknowledgment 2149 This document was produced using xml2rfc v1.33 (of 2150 http://xml.resource.org/) from a source in RFC-2629 XML format.