idnits 2.17.1 draft-ietf-sip-media-security-requirements-07.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 20. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 2109. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 2120. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 2127. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 2133. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year == Line 949 has weird spacing: '...ication along...' -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (June 2, 2008) is 5806 days in the past. Is this intentional? 
Checking references for intended status: Informational ---------------------------------------------------------------------------- == Outdated reference: A later version (-07) exists of draft-ietf-avt-dtls-srtp-02 == Outdated reference: A later version (-07) exists of draft-ietf-mmusic-media-path-middleboxes-00 == Outdated reference: A later version (-13) exists of draft-ietf-mmusic-sdp-capability-negotiation-08 == Outdated reference: A later version (-15) exists of draft-ietf-sip-certs-06 == Outdated reference: A later version (-06) exists of draft-mcgrew-srtp-ekt-03 == Outdated reference: A later version (-04) exists of draft-wing-sipping-srtp-key-03 == Outdated reference: A later version (-22) exists of draft-zimmermann-avt-zrtp-06 -- Obsolete informational reference (is this intentional?): RFC 4474 (Obsoleted by RFC 8224) -- Obsolete informational reference (is this intentional?): RFC 4492 (Obsoleted by RFC 8422) Summary: 1 error (**), 0 flaws (~~), 9 warnings (==), 9 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 SIP Working Group D. Wing, Ed. 3 Internet-Draft Cisco 4 Intended status: Informational S. Fries 5 Expires: December 4, 2008 Siemens AG 6 H. Tschofenig 7 Nokia Siemens Networks 8 F. Audet 9 Nortel 10 June 2, 2008 12 Requirements and Analysis of Media Security Management Protocols 13 draft-ietf-sip-media-security-requirements-07 15 Status of this Memo 17 By submitting this Internet-Draft, each author represents that any 18 applicable patent or other IPR claims of which he or she is aware 19 have been or will be disclosed, and any of which he or she becomes 20 aware will be disclosed, in accordance with Section 6 of BCP 79. 22 Internet-Drafts are working documents of the Internet Engineering 23 Task Force (IETF), its areas, and its working groups. 
Note that 24 other groups may also distribute working documents as Internet- 25 Drafts. 27 Internet-Drafts are draft documents valid for a maximum of six months 28 and may be updated, replaced, or obsoleted by other documents at any 29 time. It is inappropriate to use Internet-Drafts as reference 30 material or to cite them other than as "work in progress." 32 The list of current Internet-Drafts can be accessed at 33 http://www.ietf.org/ietf/1id-abstracts.txt. 35 The list of Internet-Draft Shadow Directories can be accessed at 36 http://www.ietf.org/shadow.html. 38 This Internet-Draft will expire on December 4, 2008. 40 Abstract 42 This document describes requirements for a protocol to negotiate a 43 security context for SIP-signaled SRTP media. In addition to the 44 natural security requirements, this negotiation protocol must 45 interoperate well with SIP in certain ways. A number of proposals 46 have been published and a summary of these proposals is in the 47 appendix of this document. 49 Table of Contents 51 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 52 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 53 3. Attack Scenarios . . . . . . . . . . . . . . . . . . . . . . . 5 54 4. Call Scenarios and Requirements Considerations . . . . . . . . 8 55 4.1. Clipping Media Before Signaling Answer . . . . . . . . . . 8 56 4.2. Retargeting and Forking . . . . . . . . . . . . . . . . . 9 57 4.3. Recording . . . . . . . . . . . . . . . . . . . . . . . . 11 58 4.4. PSTN gateway . . . . . . . . . . . . . . . . . . . . . . . 12 59 4.5. Call Setup Performance . . . . . . . . . . . . . . . . . . 12 60 4.6. Transcoding . . . . . . . . . . . . . . . . . . . . . . . 13 61 4.7. Upgrading to SRTP . . . . . . . . . . . . . . . . . . . . 13 62 4.8. Interworking with Other Signaling Protocols . . . . . . . 14 63 4.9. Certificates . . . . . . . . . . . . . . . . . . . . . . . 14 64 5. Requirements . . . . . . . . . . . . . . . . . . . . . . . . 
. 15 65 5.1. Key Management Protocol Requirements . . . . . . . . . . . 15 66 5.2. Security Requirements . . . . . . . . . . . . . . . . . . 17 67 5.3. Requirements Outside of the Key Management Protocol . . . 19 68 6. Security Considerations . . . . . . . . . . . . . . . . . . . 19 69 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 19 70 8. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 20 71 9. References . . . . . . . . . . . . . . . . . . . . . . . . . . 20 72 9.1. Normative References . . . . . . . . . . . . . . . . . . . 20 73 9.2. Informative References . . . . . . . . . . . . . . . . . . 21 74 Appendix A. Overview and Evaluation of Existing Keying 75 Mechanisms . . . . . . . . . . . . . . . . . . . . . 24 76 A.1. Signaling Path Keying Techniques . . . . . . . . . . . . . 25 77 A.1.1. MIKEY-NULL . . . . . . . . . . . . . . . . . . . . . . 25 78 A.1.2. MIKEY-PSK . . . . . . . . . . . . . . . . . . . . . . 25 79 A.1.3. MIKEY-RSA . . . . . . . . . . . . . . . . . . . . . . 25 80 A.1.4. MIKEY-RSA-R . . . . . . . . . . . . . . . . . . . . . 25 81 A.1.5. MIKEY-DHSIGN . . . . . . . . . . . . . . . . . . . . . 26 82 A.1.6. MIKEY-DHHMAC . . . . . . . . . . . . . . . . . . . . . 26 83 A.1.7. MIKEY-ECIES and MIKEY-ECMQV (MIKEY-ECC) . . . . . . . 26 84 A.1.8. Security Descriptions with SIPS . . . . . . . . . . . 26 85 A.1.9. Security Descriptions with S/MIME . . . . . . . . . . 27 86 A.1.10. SDP-DH (expired) . . . . . . . . . . . . . . . . . . . 27 87 A.1.11. MIKEYv2 in SDP (expired) . . . . . . . . . . . . . . . 27 88 A.2. Media Path Keying Technique . . . . . . . . . . . . . . . 27 89 A.2.1. ZRTP . . . . . . . . . . . . . . . . . . . . . . . . . 27 90 A.3. Signaling and Media Path Keying Techniques . . . . . . . . 28 91 A.3.1. EKT . . . . . . . . . . . . . . . . . . . . . . . . . 28 92 A.3.2. DTLS-SRTP . . . . . . . . . . . . . . . . . . . . . . 28 93 A.3.3. MIKEYv2 Inband (expired) . . . . . . . . . . . . . . . 29 94 A.4. 
Evaluation Criteria - SIP . . . . . . . . . . . . . . . . 29 95 A.4.1. Secure Retargeting and Secure Forking . . . . . . . . 29 96 A.4.2. Clipping Media Before SDP Answer . . . . . . . . . . . 31 97 A.4.3. SSRC and ROC . . . . . . . . . . . . . . . . . . . . . 33 98 A.5. Evaluation Criteria - Security . . . . . . . . . . . . . . 35 99 A.5.1. Distribution and Validation of Persistent Public 100 Keys and Certificates . . . . . . . . . . . . . . . . 35 101 A.5.2. Perfect Forward Secrecy . . . . . . . . . . . . . . . 38 102 A.5.3. Best Effort Encryption . . . . . . . . . . . . . . . . 39 103 A.5.4. Upgrading Algorithms . . . . . . . . . . . . . . . . . 40 104 Appendix B. Out-of-Scope . . . . . . . . . . . . . . . . . . . . 42 105 B.1. Shared Key Conferencing . . . . . . . . . . . . . . . . . 42 106 Appendix C. Requirement renumbering in -02 . . . . . . . . . . . 44 107 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 45 108 Intellectual Property and Copyright Statements . . . . . . . . . . 47 110 1. Introduction 112 The work on media security started when the Session Initiation 113 Protocol (SIP) was still in its infancy. With the increased SIP 114 deployment and the availability of new SIP extensions and related 115 protocols, the need for end-to-end security was re-evaluated. The 116 procedure of re-evaluating prior protocol work and design decisions 117 is not an uncommon strategy and, to some extent, considered necessary 118 to ensure that the developed protocols indeed meet the previously 119 envisioned needs for the users on the Internet. 121 This document summarizes media security requirements, i.e., 122 requirements for mechanisms that negotiate security context such as 123 cryptographic keys and parameters for SRTP. 
125 The organization of this document is as follows: Section 2 introduces 126 terminology, Section 3 describes various attack scenarios against the 127 signaling path and media path, Section 4 provides an overview about 128 possible call scenarios, Section 5 lists requirements for media 129 security. The main part of the document concludes with the security 130 considerations Section 6, IANA considerations Section 7 and an 131 acknowledgement section in Section 8. Appendix A lists and compares 132 available solution proposals. The following Appendix A.4 compares 133 the different approaches regarding their suitability for the SIP 134 signaling scenarios described in Appendix A, while Appendix A.5 135 provides a comparison regarding security aspects. Appendix B lists 136 non-goals for this document. 138 2. Terminology 140 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 141 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 142 document are to be interpreted as described in [RFC2119], with the 143 important qualification that, unless otherwise stated, these terms 144 apply to the design of the media security key management protocol, 145 not its implementation or application. 147 Additionally, the following items are used in this document: 149 AOR (Address-of-Record): A SIP or SIPS URI that points to a domain 150 with a location service that can map the URI to another URI where 151 the user might be available. Typically, the location service is 152 populated through registrations. An AOR is frequently thought of 153 as the "public address" of the user. 155 SSRC: The 32-bit value that defines the synchronization source, used 156 in RTP. These are generally unique, but collisions can occur. 158 two-time pad: The use of the same key and the same keystream to 159 encrypt different data. For SRTP, a two-time pad occurs if two 160 senders are using the same key and the same RTP SSRC value. 
162 Perfect Forward Secrecy (PFS): The property that disclosure of the 163 long-term secret keying material that is used to derive an agreed 164 ephemeral key does not compromise the secrecy of agreed keys from 165 earlier runs. 167 active adversary: An active adversary is able to alter data 168 communication to affect its operation (see also [RFC4949]). 170 passive adversary: A passive adversary is able to learn information 171 from data communication, but not alter that data communication 172 (see also [RFC4949]). 174 signaling path: The signaling path is the route taken by SIP 175 signaling messages transmitted between the calling and called user 176 agents. This can be either direct signaling between the calling 177 and called user agents or, more commonly, signaling through the SIP proxy 178 servers that were involved in the call setup. 180 media path: The media path is the route taken by media packets 181 exchanged by the endpoints. In the simplest case, the endpoints 182 exchange media directly, and the "media path" is defined by a 183 quartet of IP addresses and TCP/UDP ports, along with an IP route. 184 In other cases, this path may include RTP relays, mixers, 185 transcoders, session border controllers, NATs, or media gateways. 187 3. Attack Scenarios 189 The discussion in this section relates to requirements R-PASS-MEDIA, 190 R-PASS-SIG, R-ASSOC, R-SIG-MEDIA, R-ACT-ACT, and R-ID-BINDING. 192 This document classifies adversaries according to their access and 193 their capabilities. An adversary might have access: 195 1. only to the media path, 197 2. only to the signaling path, 199 3. to the media path and to the signaling path. 201 An attacker that is located solely along the signaling path, and 202 does not have access to media (item 2), is not considered in this 203 document. 205 There are two different types of adversaries, active and passive.
An 206 active adversary may need to be active with regard to the key 207 exchange relevant information traveling along the media path or 208 traveling along the signaling path. 210 Based on their robustness against the adversary capabilities 211 described above, we can group security mechanisms using the following 212 labels. This list is generally ordered from easiest to compromise 213 (at the top) to more difficult to compromise: 215 +---------------+---------+--------------------------------------+ 216 | SIP signaling | media | abbreviation | 217 +---------------+---------+--------------------------------------+ 218 | none | passive | no-signaling-passive-media | 219 | none | active | no-signaling-active-media | 220 | passive | passive | passive-signaling-passive-media | 221 | passive | active | passive-signaling-active-media | 222 | active | passive | active-signaling-passive-media | 223 | active | active | active-signaling-active-media | 224 | active | active | active-signaling-active-media-detect | 225 +---------------+---------+--------------------------------------+ 227 no-signaling-passive-media: 228 Access to only the media path is sufficient to reveal the content 229 of the media traffic. 231 passive-signaling-passive-media: 232 Passive attack on the signaling and passive attack on the media 233 path is necessary to reveal the content of the media traffic. 235 passive-signaling-active-media: 236 Passive attack on the signaling and active attack on the media 237 path is necessary to reveal the content of the media traffic. 239 active-signaling-passive-media: 240 Active attack on the signaling path and passive attack on the 241 media path is necessary to reveal the content of the media 242 traffic. 244 no-signaling-active-media: 245 Active attack on the media path is sufficient to reveal the 246 content of the media traffic. 
248 active-signaling-active-media: 249 Active attack on both the signaling path and the media path is 250 necessary to reveal the content of the media traffic. 252 active-signaling-active-media-detect: 253 Active attack on both signaling and media path is necessary to 254 reveal the content of the media traffic (as with active-signaling- 255 active-media), and the attack is detectable by protocol messages 256 exchanged between the end points. 258 For example, unencrypted RTP is vulnerable to no-signaling-passive- 259 media. 261 As another example, Security Descriptions [RFC4568], when protected 262 by TLS (as it is commonly implemented and deployed), belongs in the 263 passive-signaling-passive-media category since the adversary needs to 264 learn the Security Descriptions key by seeing the SIP signaling 265 message at a SIP proxy (assuming that the adversary is in control of 266 the SIP proxy). The media traffic can be decrypted using that 267 learned key. 269 As another example, DTLS-SRTP falls into the active-signaling-active- 270 media category when DTLS-SRTP is used with a public key based 271 ciphersuite with self-signed certificates and without SIP Identity 272 [RFC4474]. An adversary would have to modify the fingerprint that is 273 sent along the signaling path and subsequently modify the 274 certificates carried in the DTLS handshake that travel along the 275 media path. If DTLS-SRTP is used with both SIP Identity [RFC4474] 276 and SIP Connected Identity [RFC4916], the RFC4474 signature protects 277 both the offer and the answer, and such a system would then belong to 278 the active-signaling-active-media-detect category (provided, of 279 course, the signaling path to the RFC4474 authenticator and verifier 280 is secured as per RFC4474 and the RFC4474 authenticator and verifier 281 are behaving as per RFC4474).
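The abbreviations in the table above follow a regular pattern, which can be sketched in code. The following is only an illustration: the names `Access`, `TABLE_ORDER`, and `attack_label` are hypothetical and do not come from this document, and the ordering is the table's own "generally ordered" listing rather than a strict ranking.

```python
from enum import Enum


class Access(Enum):
    """Adversary capability needed on one path (hypothetical helper type)."""
    NONE = "no"
    PASSIVE = "passive"
    ACTIVE = "active"


# Abbreviations in the order the table lists them
# (generally from easiest to compromise to more difficult).
TABLE_ORDER = [
    "no-signaling-passive-media",
    "no-signaling-active-media",
    "passive-signaling-passive-media",
    "passive-signaling-active-media",
    "active-signaling-passive-media",
    "active-signaling-active-media",
    "active-signaling-active-media-detect",
]


def attack_label(signaling: Access, media: Access, detectable: bool = False) -> str:
    """Build the table abbreviation for the access an attack requires."""
    if media is Access.NONE:
        # Every row of the table requires some access to the media path.
        raise ValueError("the table has no row without media-path access")
    if detectable and not (signaling is Access.ACTIVE and media is Access.ACTIVE):
        # The "-detect" suffix appears only on the active/active row.
        raise ValueError("'-detect' is defined only for active signaling and active media")
    label = f"{signaling.value}-signaling-{media.value}-media"
    return label + "-detect" if detectable else label


# Security Descriptions protected by TLS, per the example in the text:
sdes_over_tls = attack_label(Access.PASSIVE, Access.PASSIVE)
```

Here `sdes_over_tls` evaluates to "passive-signaling-passive-media", matching the category the text assigns to Security Descriptions protected by TLS.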
283 The above discussion of DTLS-SRTP demonstrates how a single security 284 protocol can be in different classes depending on the mode in which 285 it is operated. Other protocols can achieve a similar effect by adding 286 functions outside of the on-the-wire key management protocol itself. 287 Although it may be appropriate to deploy lower-classed mechanisms in 288 some cases, the ultimate security requirement for a media security 289 negotiation protocol is that it have a mode of operation available 290 in the active-signaling-active-media-detect class, which provides protection against passive 291 and active attacks and provides detection of such attacks. That is, 292 there must be a way to use the protocol so that an active attack is 293 required against both the signaling and media paths, and so that such 294 attacks are detectable by the endpoints. 296 4. Call Scenarios and Requirements Considerations 298 The following subsections describe call scenarios that pose the most 299 challenge to the key management system for media data in cooperation 300 with SIP signaling. 302 4.1. Clipping Media Before Signaling Answer 304 The discussion in this section relates to requirements R-AVOID- 305 CLIPPING and R-ALLOW-RTP. 307 Per the SDP Offer/Answer Model [RFC3264], 309 "Once the offerer has sent the offer, it MUST be prepared to 310 receive media for any recvonly streams described by that offer. 311 It MUST be prepared to send and receive media for any sendrecv 312 streams in the offer, and send media for any sendonly streams in 313 the offer (of course, it cannot actually send until the peer 314 provides an answer with the needed address and port information)." 316 To meet this requirement with SRTP, the offerer needs to know the 317 SRTP key for arriving media. If either endpoint receives encrypted 318 media before it has access to the associated SRTP key, it cannot play 319 the media -- causing clipping.
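The clipping effect described above can be made concrete with a small timing sketch. It is only an illustration under stated assumptions: the function name `clipped_packets` is hypothetical, the receiver is assumed to discard any SRTP packet it cannot decrypt on arrival (rather than buffer it), and each packet is assumed to carry 20 ms of audio.

```python
PACKET_DURATION_MS = 20  # assumed packetization interval, not taken from the draft


def clipped_packets(arrival_times_ms, key_available_ms):
    """Return arrival times of packets that arrive before the SRTP key is known.

    Assumption: such packets are discarded, so their audio is never played.
    """
    return [t for t in arrival_times_ms if t < key_available_ms]


# Early media starts arriving at t=0; the key only becomes available at t=130 ms.
arrivals = list(range(0, 200, PACKET_DURATION_MS))
lost = clipped_packets(arrivals, key_available_ms=130)
clipped_audio_ms = len(lost) * PACKET_DURATION_MS
```

In this example the packets arriving at 0 through 120 ms are lost, so the first 140 ms of audio would be clipped. A key management protocol that delivers the key before media arrives makes `lost` empty.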
321 For key exchange mechanisms that send the answerer's key in SDP, a 322 SIP provisional response [RFC3261], such as 183 (session progress), 323 is useful. However, the 183 messages are not reliable unless both 324 the calling and called end points support PRACK [RFC3262], use TCP 325 across all SIP proxies, implement Security Preconditions [RFC5027], 326 or both ends implement ICE [I-D.ietf-mmusic-ice] and the answerer 327 implements the reliable provisional response mechanism described in 328 ICE. Unfortunately, none of these techniques is widely deployed, 329 and there is industry reluctance to require these 330 techniques to avoid the problems described in this section. 332 Note that the receipt of an SDP answer is not always sufficient to 333 allow media to be played to the offerer. Sometimes, the offerer must 334 send media in order to open up firewall holes or NAT bindings before 335 media can be received (for details see 336 [I-D.ietf-mmusic-media-path-middleboxes]). In this case, even a 337 solution that makes the key available before the SDP answer arrives 338 will not help. 340 Preventing the arrival of early media (i.e., media that arrives at 341 the SDP offerer before the SDP answer arrives) might obsolete the 342 R-AVOID-CLIPPING requirement, but at the time of writing such early 343 media exists in many normal call scenarios. 345 4.2. Retargeting and Forking 347 The discussion in this section relates to requirements R-FORK- 348 RETARGET, R-DISTINCT, R-HERFP, and R-BEST-SECURE. 350 In SIP, a request sent to a specific AOR but delivered to a different 351 AOR is called a "retarget". A typical scenario is a "call 352 forwarding" feature. In Figure 1, Alice sends an INVITE in step 1 353 that is sent to Bob in step 2. Bob responds with a redirect (SIP 354 response code 3xx) pointing to Carol in step 3.
This redirect 355 typically does not propagate back to Alice but only goes to a proxy 356 (i.e., the retargeting proxy) that sends the original INVITE to Carol 357 in step 4. 359 +-----+ 360 |Alice| 361 +--+--+ 362 | 363 | INVITE (1) 364 V 365 +----+----+ 366 | proxy | 367 ++-+-----++ 368 | ^ | 369 INVITE (2) | | | INVITE (4) 370 & redirect (3) | | | 371 V | V 372 ++-++ ++----+ 373 |Bob| |Carol| 374 +---+ +-----+ 376 Figure 1: Retargeting 378 Using retargeting might lead to situations where the UAC does not 379 know where its request will be going. This might not immediately 380 seem like a serious problem; after all, when one places a telephone 381 call on the PSTN, one never really knows if it will be forwarded to a 382 different number, who will pick up the line when it rings, and so on. 383 However, when considering SIP mechanisms for authenticating the 384 called party, this function can also make it difficult to 385 differentiate an intermediary that is behaving legitimately from an 386 attacker. From this perspective, the main problems with retargeting 387 are: 389 Not detectable by the caller: The originating user agent has no 390 means of anticipating that the condition will arise, nor any means 391 of determining that it has occurred until the call has already 392 been set up. 394 Not preventable by the caller: There is no existing mechanism that 395 might be employed by the originating user agent in order to 396 guarantee that the call will not be re-targeted. 398 The mechanism used by SIP for identifying the calling party is SIP 399 Identity [RFC4474]. However, due to the nature of retargeting, SIP 400 Identity can only identify the calling party (that is, the party that 401 initiated the SIP request). Some key exchange mechanisms predate SIP 402 Identity and include their own identity mechanism (e.g., MIKEY). 403 However, those built-in identity mechanisms also suffer from the SIP 404 retargeting problem.
While Connected Identity [RFC4916] allows 405 positive identification of the called party, the primary difficulty 406 still remains that the calling party does not know if a mismatched 407 called party is legitimate (i.e., due to authorized retargeting) or 408 illegitimate (i.e., due to unauthorized retargeting by an attacker 409 able to modify SIP signaling). 411 In SIP, 'forking' is the delivery of a request to multiple locations. 412 This happens when a single AOR is registered more than once. An 413 example of forking is when a user has a desk phone, PC client, and 414 mobile handset all registered with the same AOR. 416 +-----+ 417 |Alice| 418 +--+--+ 419 | 420 | INVITE 421 V 422 +-----+-----+ 423 | proxy | 424 ++---------++ 425 | | 426 INVITE | | INVITE 427 V V 428 +--+--+ +--+--+ 429 |Bob-1| |Bob-2| 430 +-----+ +-----+ 432 Figure 2: Forking 434 With forking, both Bob-1 and Bob-2 might send back SDP answers in SIP 435 responses. Alice will see those intermediate (18x) and final (200) 436 responses. It is useful for Alice to be able to associate the SIP 437 response with the incoming media stream. Although this association 438 can be done with ICE [I-D.ietf-mmusic-ice], and ICE is useful to make 439 this association with RTP, it is not desirable to require ICE to 440 accomplish this association. 442 Forking and retargeting are often used together. For example, a boss 443 and secretary might have both phones ring (forking) and roll over to 444 voice mail if neither phone is answered (retargeting). 446 To maintain security of the media traffic, only the end point that 447 answers the call should know the SRTP keys for the session.
Forked 448 and re-targeted calls only reveal sensitive information to non- 449 responders when the signaling messages contain sensitive information 450 (e.g., SRTP keys) that is accessible by parties that receive the 451 offer, but may not respond (i.e., the original recipients in a 452 retargeted call, or non-answering endpoints in a forked call). For 453 key exchange mechanisms that do not provide secure forking or secure 454 retargeting, one workaround is to re-key immediately after forking or 455 retargeting. However, because the originator may not be aware that 456 the call forked, this workaround requires rekeying immediately after 457 every session is established. This doubles the number of messages 458 processed by the network. 460 Further compounding this problem is a unique feature of SIP: when 461 forking is used, there is always only one final error response 462 delivered to the sender of the request. The forking proxy is 463 responsible for choosing which final response to forward in the event 464 where forking results in multiple final error responses being 465 received by the forking proxy. This means that if a request is 466 rejected, say with information that the keying information was 467 rejected and providing the far end's credentials, it is very possible 468 that the rejection will never reach the sender. This problem, called 469 the Heterogeneous Error Response Forking Problem (HERFP) [RFC3326], 470 is difficult to solve in SIP. Because we expect the HERFP to 471 continue to be a problem in SIP for the foreseeable future, a media 472 security system should function even in the presence of HERFP 473 behavior. 475 4.3. Recording 477 The discussion in this section relates to requirement R-RECORDING. 479 Some business environments, such as stock brokers, banks, and catalog 480 call centers, require recording calls with customers.
This is the 481 familiar "this call is being recorded for quality purposes" heard 482 during calls to these sorts of businesses. In these environments, 483 media recording is typically performed by an intermediate device 484 (with RTP, this is typically implemented in a 'sniffer'). 486 When performing such call recording with SRTP, the end-to-end 487 security is compromised. This is unavoidable because 488 the operation of the business requires such recording. It is 489 desirable that the media security is not unduly compromised by the 490 media recording. The endpoint within the organization needs to be 491 informed that there is an intermediate device and needs to cooperate 492 with that intermediate device. 494 This scenario does not place a requirement directly on the key 495 management protocol. The requirement could be met directly by the 496 key management protocol (e.g., MIKEY-NULL or [RFC4568]) or through an 497 external out-of-band mechanism (e.g., [I-D.wing-sipping-srtp-key]). 499 4.4. PSTN gateway 501 The discussion in this section relates to requirement R-PSTN. 503 It is desirable, even when one leg of a call is on the PSTN, that the 504 IP leg of the call be protected with SRTP. 506 A typical case of using media security is where two entities are having 507 a VoIP conversation over IP capable networks. However, there are 508 cases where the other end of the communication is not connected to an 509 IP capable network. In this kind of setting, there needs to be some 510 kind of gateway at the edge of the IP network which converts the VoIP 511 conversation to a format understood by the other network. An example 512 of such a gateway is a PSTN gateway sitting at the edge of IP and PSTN 513 networks (such as the architecture described in [RFC3372]). 515 If media security (e.g., SRTP protection) is employed in this kind of 516 gateway-setting, then media security and the related key management 517 are terminated at the PSTN gateway.
The other network (e.g., PSTN) 518 may have its own measures to protect the communication, but this 519 means that, from a media security point of view, media security is 520 not employed truly end-to-end between the communicating entities. 522 4.5. Call Setup Performance 524 The discussion in this section relates to requirement R-REUSE. 526 Some devices lack sufficient processing power to perform public key 527 operations or Diffie-Hellman operations for each call, or prefer to 528 avoid performing those operations on every call. The ability to re- 529 use previous public key or Diffie-Hellman operations can vastly 530 decrease the call setup delay and processing requirements for such 531 devices. 533 In certain devices, it can take a second or two to perform a Diffie- 534 Hellman operation. Examples of these devices include handsets, IP 535 Multimedia Services Identity Modules (ISIMs), and PSTN gateways. PSTN 536 gateways typically utilize a Digital Signal Processor (DSP) which is 537 not yet involved with typical DSP operations at the beginning of a 538 call; thus, the DSP could be used to perform the calculation, so as to 539 avoid having the central host processor perform the calculation. 540 However, not all PSTN gateways use DSPs (some have only central 541 processors or their DSPs are incapable of performing the necessary 542 public key or Diffie-Hellman operation), and handsets lack a 543 separate, unused processor to perform these operations. 545 Two scenarios where R-REUSE is useful are calls between an endpoint 546 and its voicemail server or its PSTN gateway. In those scenarios, 547 calls are made relatively often and it can be useful for the 548 voicemail server or PSTN gateway to avoid public key operations for 549 subsequent calls. 551 Storing keys across sessions often interferes with perfect forward 552 secrecy (R-PFS). 554 4.6. Transcoding 556 The discussion in this section relates to requirement R-TRANSCODER.
558 In some environments it is necessary for network equipment to 559 transcode from one codec (e.g., a highly compressed codec which makes 560 efficient use of wireless bandwidth) to another codec (e.g., a 561 standardized codec to a SIP peering interface). With RTP, a 562 transcoding function can be performed with the combination of a SIP 563 B2BUA (to modify the SDP) and a processor to perform the transcoding 564 between the codecs. However, with end-to-end secured SRTP, a 565 transcoding function implemented the same way is a man-in-the-middle 566 attack, and the key management system prevents its use. 568 However, such a network-based transcoder can still be realized with 569 the cooperation and approval of the endpoint, and can provide end-to- 570 transcoder and transcoder-to-end security. 572 4.7. Upgrading to SRTP 574 The discussion in this section relates to the requirement R-ALLOW- 575 RTP. 577 Legitimate RTP media can be sent to an endpoint for announcements, 578 colorful ringback tones (e.g., music), advertising, or normal call 579 progress tones. The RTP may be received before an associated SDP 580 answer. For details on various scenarios, see 582 [I-D.stucker-sipping-early-media-coping]. 584 While receiving such RTP exposes the calling party to a risk of 585 receiving malicious RTP from an attacker, SRTP endpoints will need to 586 receive and play out RTP media in order to be compatible with 587 deployed systems that send RTP to calling parties. 589 4.8. Interworking with Other Signaling Protocols 591 The discussion in this section relates to the requirement R-OTHER- 592 SIGNALING. 594 In many environments, some devices are signaled with protocols other 595 than SIP which do not share SIP's offer/answer model (e.g., [H.248.1]) 596 or do not utilize SDP (e.g., H.323). In other environments, both 597 endpoints may be SIP, but may use different key management systems 598 (e.g., one uses MIKEY-RSA, the other MIKEY-RSA-R).

   In these environments, it is desirable to have SRTP -- rather than
   RTP -- between the two endpoints. It is always possible, although
   undesirable, to interwork those disparate signaling systems or
   disparate key management systems by decrypting and re-encrypting each
   SRTP packet in a device in the middle of the network (often the same
   device performing the signaling interworking). This is undesirable
   due to the cost and increased attack area, as such an SRTP/SRTP
   interworking device is a valuable attack target.

   At the time of this writing, interworking is considered important.
   Interworking without decryption/encryption of the SRTP, while useful,
   is not yet deemed critical because the scale of such SRTP deployments
   is, to date, relatively small.

4.9.  Certificates

   The discussion in this section relates to R-CERTS.

   On the Internet and on some private networks, validating another
   peer's certificate is often done through a trust anchor -- a list of
   Certificate Authorities that are trusted. It can be difficult or
   expensive for a peer to obtain these certificates. In all cases,
   both parties to the call would need to trust the same trust anchor
   (i.e., "certificate authority"). For these reasons, it is important
   that the media plane key management protocol offer a mechanism that
   allows end-users who have no prior association to authenticate to
   each other without acquiring credentials from a third party trust
   point. Note that this does not rule out mechanisms in which servers
   have certificates and attest to the identities of end-users.

5.  Requirements

   This section is divided into several parts: requirements specific to
   the key management protocol (Section 5.1), attack scenarios
   (Section 5.2), and requirements which can be met inside the key
   management protocol or outside of the key management protocol
   (Section 5.3).

5.1.  Key Management Protocol Requirements

   SIP Forking and Retargeting, from Section 4.2:

   R-FORK-RETARGET:
      The media security key management protocol MUST securely
      support forking and retargeting when all endpoints are willing
      to use SRTP without causing the call setup to fail. This
      requirement means the endpoints that did not answer the call
      MUST NOT learn the SRTP keys (in either direction) used by the
      answering endpoint.

   R-DISTINCT:
      The media security key management protocol MUST be capable of
      creating distinct, independent cryptographic contexts for each
      endpoint in a forked session.

   R-HERFP:
      The media security key management protocol MUST function
      securely even in the presence of HERFP behavior.

   Performance considerations:

   R-REUSE:
      The media security key management protocol MAY support the
      re-use of a previously established security context.

         Note: re-use of the security context does not imply re-use
         of RTP parameters (e.g., payload type or SSRC).

   Media considerations:

   R-AVOID-CLIPPING:
      The media security key management protocol SHOULD avoid
      clipping media before SDP answer without requiring Security
      Preconditions [RFC5027]. This requirement comes from
      Section 4.1.

   R-RTP-VALID:
      If SRTP key negotiation is performed over the media path (i.e.,
      using the same UDP/TCP ports as media packets), the key
      negotiation packets MUST NOT pass the RTP validity check
      defined in Appendix A.1 of [RFC3550].

   R-ASSOC:
      The media security key management protocol SHOULD include a
      mechanism for associating key management messages with both the
      signaling traffic that initiated the session and with protected
      media traffic.
      It is useful to associate key management
      messages with call signaling messages, as this allows the SDP
      offerer to avoid performing CPU-consuming operations (e.g.,
      Diffie-Hellman or public key operations) with attackers that
      have not seen the signaling messages.

      For example, if using a Diffie-Hellman keying technique with
      security preconditions that forks to 20 endpoints, the call
      initiator would get 20 provisional responses containing 20
      signed Diffie-Hellman half keys. Calculating 20 Diffie-Hellman
      secrets and validating signatures can be a difficult task for
      some devices. Hence, in the case of forking, it is not
      desirable to perform a Diffie-Hellman operation with every
      party, but rather only with the party that answers the call
      (and incur some media clipping). To do this, the signaling and
      media need to be associated so the calling party knows which
      key management exchange needs to be completed. This might be
      done by using the transport address indicated in the SDP,
      although NATs can complicate this association.

         Note: due to RTP's design requirements, it is expected
         that SRTP receivers will have to perform authentication
         of any received SRTP packets.

   R-NEGOTIATE:
      The media security key management protocol MUST allow a SIP
      User Agent to negotiate media security parameters for each
      individual session.

   R-PSTN:
      The media security key management protocol MUST support
      termination of media security in a PSTN gateway. This
      requirement is from Section 4.4.

5.2.  Security Requirements

   This section describes overall security requirements and specific
   requirements from the attack scenarios (Section 3).

   Overall security requirements:

   R-PFS:
      The media security key management protocol MUST be able to
      support perfect forward secrecy.

   R-COMPUTE:
      The media security key management protocol MUST support
      offering additional SRTP cipher suites without incurring
      significant computational expense.

   R-CERTS:
      The key management protocol MUST NOT require that end-users
      obtain credentials (certificates or private keys) from a
      third-party trust anchor.

   R-FIPS:
      The media security key management protocol SHOULD use
      algorithms that allow FIPS 140-2 [FIPS-140-2] certification.

      The United States Government can only purchase and use crypto
      implementations that have been validated by the FIPS-140
      [FIPS-140-2] process:

         "The FIPS-140 standard is applicable to all Federal
         agencies that use cryptographic-based security systems to
         protect sensitive information in computer and
         telecommunication systems, including voice systems. The
         adoption and use of this standard is available to private
         and commercial organizations."

      Some commercial organizations, such as banks and defense
      contractors, require or prefer equipment which has received the
      same validation.

   R-DOS:
      The media security key management protocol SHOULD NOT introduce
      new denial of service vulnerabilities (e.g., the protocol
      should not request the endpoint to perform CPU-intensive
      operations without the client being able to validate or
      authorize the request).

   R-EXISTING:
      The media security key management protocol SHOULD allow
      endpoints to authenticate using pre-existing cryptographic
      credentials, e.g., certificates or pre-shared keys.

   R-AGILITY:
      The media security key management protocol MUST provide
      crypto-agility, i.e., the ability to adapt to evolving
      cryptography and security requirements (update of cryptographic
      algorithms without substantial disruption to deployed
      implementations).

   R-DOWNGRADE:
      The media security key management protocol MUST protect cipher
      suite negotiation against downgrading attacks.

   R-PASS-MEDIA:
      The media security key management protocol MUST have a mode
      which prevents a passive adversary with access to the media
      path from gaining access to keying material used to protect
      SRTP media packets.

   R-PASS-SIG:
      The media security key management protocol MUST have a mode in
      which it prevents a passive adversary with access to the
      signaling path from gaining access to keying material used to
      protect SRTP media packets.

   R-SIG-MEDIA:
      The media security key management protocol MUST have a mode in
      which it defends itself from an attacker that is solely on the
      media path and from an attacker that is solely on the signaling
      path. A successful attack refers to the ability for the
      adversary to obtain keying material to decrypt the SRTP
      encrypted media traffic.

   R-ID-BINDING:
      The media security key management protocol MUST enable the
      media security keys to be cryptographically bound to an
      identity of the endpoint.

         This allows domains to deploy SIP Identity [RFC4474].

   R-ACT-ACT:
      The media security key management protocol MUST support a mode
      of operation that provides active-signaling-active-media-detect
      robustness, and MAY support modes of operation that provide
      lower levels of robustness (as described in Section 3).

         Failing to meet R-ACT-ACT indicates the protocol cannot
         provide secure end-to-end media.

5.3.  Requirements Outside of the Key Management Protocol

   The requirements in this section are for an overall VoIP security
   system. These requirements can be met within the key management
   protocol itself, or can be solved outside of the key management
   protocol itself (e.g., solved in SIP or in SDP).

   R-BEST-SECURE:
      Even when some endpoints of a forked or retargeted call are
      incapable of using SRTP, a solution MUST be described which
      allows the establishment of SRTP associations with SRTP-capable
      endpoints and/or RTP associations with non-SRTP-capable
      endpoints.

   R-OTHER-SIGNALING:
      A solution SHOULD be able to negotiate keys for SRTP sessions
      created via different call signaling protocols (e.g., between
      Jabber, SIP, H.323, MGCP).

   R-RECORDING:
      A solution SHOULD be described which supports recording of
      decrypted media. This requirement comes from Section 4.3.

   R-TRANSCODER:
      A solution SHOULD be described which supports intermediate
      nodes (e.g., transcoders), terminating or processing media,
      between the endpoints.

   R-ALLOW-RTP:
      A solution SHOULD be described which allows RTP media
      to be received by the calling party until SRTP has been
      negotiated with the answerer, after which SRTP is preferred
      over RTP.

6.  Security Considerations

   This document lists requirements for securing media traffic. As
   such, it addresses security throughout the document.

7.  IANA Considerations

   This document does not require actions by IANA.

8.  Acknowledgements

   For contributions to the requirements portion of this document, the
   authors would like to thank the active participants of the RTPSEC BoF
   and on the RTPSEC mailing list, and a special thanks to Steffen Fries
   and Dragan Ignjatic for their excellent MIKEY comparison document
   [I-D.ietf-msec-mikey-applicability].

   The authors would furthermore like to thank the following people for
   their review, suggestions, and comments: Flemming Andreasen, Richard
   Barnes, Mark Baugher, Wolfgang Buecker, Werner
   Dittmann, Lakshminath Dondeti, John Elwell, Martin Euchner, Steffen
   Fries, Hans-Heinrich Grusdt, Christer Holmberg, Guenther Horn, Peter
   Howard, Leo Huang, Dragan Ignjatic, Cullen Jennings, Alan Johnston,
   Vesa Lehtovirta, Matt Lepinski, David McGrew, David Oran, Colin
   Perkins, Eric Raymond, Eric Rescorla, Peter Schneider, Srinath
   Thiruvengadam, Dave Ward, Dan York, and Phil Zimmermann.

9.  References

9.1.  Normative References

   [FIPS-140-2]
              NIST, "Security Requirements for Cryptographic Modules",
              June 2005, .

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
              A., Peterson, J., Sparks, R., Handley, M., and E.
              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
              June 2002.

   [RFC3262]  Rosenberg, J. and H. Schulzrinne, "Reliability of
              Provisional Responses in Session Initiation Protocol
              (SIP)", RFC 3262, June 2002.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264,
              June 2002.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, March 2004.

   [cryptval]
              NIST, "Cryptographic Module Validation Program",
              December 2006,
              .

9.2.  Informative References

   [H.248.1]  ITU, "Gateway control protocol", June 2000,
              .

   [I-D.baugher-mmusic-sdp-dh]
              Baugher, M. and D. McGrew, "Diffie-Hellman Exchanges for
              Multimedia Sessions", draft-baugher-mmusic-sdp-dh-00 (work
              in progress), February 2006.

   [I-D.dondeti-msec-rtpsec-mikeyv2]
              Dondeti, L., "MIKEYv2: SRTP Key Management using MIKEY,
              revisited", draft-dondeti-msec-rtpsec-mikeyv2-01 (work in
              progress), March 2007.

   [I-D.fischl-sipping-media-dtls]
              Fischl, J., "Datagram Transport Layer Security (DTLS)
              Protocol for Protection of Media Traffic Established with
              the Session Initiation Protocol",
              draft-fischl-sipping-media-dtls-03 (work in progress),
              July 2007.

   [I-D.ietf-avt-dtls-srtp]
              McGrew, D. and E. Rescorla, "Datagram Transport Layer
              Security (DTLS) Extension to Establish Keys for Secure
              Real-time Transport Protocol (SRTP)",
              draft-ietf-avt-dtls-srtp-02 (work in progress),
              February 2008.

   [I-D.ietf-mmusic-ice]
              Rosenberg, J., "Interactive Connectivity Establishment
              (ICE): A Protocol for Network Address Translator (NAT)
              Traversal for Offer/Answer Protocols",
              draft-ietf-mmusic-ice-19 (work in progress), October 2007.

   [I-D.ietf-mmusic-media-path-middleboxes]
              Stucker, B. and H. Tschofenig, "Analysis of Middlebox
              Interactions for Signaling Protocol Communication along
              the Media Path",
              draft-ietf-mmusic-media-path-middleboxes-00 (work in
              progress), January 2008.

   [I-D.ietf-mmusic-sdp-capability-negotiation]
              Andreasen, F., "SDP Capability Negotiation",
              draft-ietf-mmusic-sdp-capability-negotiation-08 (work in
              progress), December 2007.

   [I-D.ietf-msec-mikey-applicability]
              Fries, S. and D. Ignjatic, "On the applicability of
              various MIKEY modes and extensions",
              draft-ietf-msec-mikey-applicability-09 (work in progress),
              March 2008.

   [I-D.ietf-msec-mikey-ecc]
              Milne, A., "ECC Algorithms for MIKEY",
              draft-ietf-msec-mikey-ecc-03 (work in progress),
              June 2007.

   [I-D.ietf-sip-certs]
              Jennings, C. and J. Fischl, "Certificate Management
              Service for The Session Initiation Protocol (SIP)",
              draft-ietf-sip-certs-06 (work in progress), April 2008.

   [I-D.ietf-tls-rfc4346-bis]
              Dierks, T. and E. Rescorla, "The Transport Layer Security
              (TLS) Protocol Version 1.2", draft-ietf-tls-rfc4346-bis-10
              (work in progress), March 2008.

   [I-D.jennings-sipping-multipart]
              Wing, D. and C. Jennings, "Session Initiation Protocol
              (SIP) Offer/Answer with Multipart Alternative",
              draft-jennings-sipping-multipart-02 (work in progress),
              March 2006.

   [I-D.mcgrew-srtp-ekt]
              McGrew, D., "Encrypted Key Transport for Secure RTP",
              draft-mcgrew-srtp-ekt-03 (work in progress), July 2007.

   [I-D.stucker-sipping-early-media-coping]
              Stucker, B., "Coping with Early Media in the Session
              Initiation Protocol (SIP)",
              draft-stucker-sipping-early-media-coping-03 (work in
              progress), October 2006.

   [I-D.wing-sipping-srtp-key]
              Wing, D., Audet, F., Fries, S., Tschofenig, H., and A.
              Johnston, "Secure Media Recording and Transcoding with the
              Session Initiation Protocol",
              draft-wing-sipping-srtp-key-03 (work in progress),
              February 2008.

   [I-D.zimmermann-avt-zrtp]
              Zimmermann, P., Johnston, A., and J. Callas, "ZRTP: Media
              Path Key Agreement for Secure RTP",
              draft-zimmermann-avt-zrtp-06 (work in progress),
              March 2008.

   [RFC3326]  Schulzrinne, H., Oran, D., and G. Camarillo, "The Reason
              Header Field for the Session Initiation Protocol (SIP)",
              RFC 3326, December 2002.

   [RFC3372]  Vemuri, A. and J. Peterson, "Session Initiation Protocol
              for Telephones (SIP-T): Context and Architectures",
              BCP 63, RFC 3372, September 2002.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC3830]  Arkko, J., Carrara, E., Lindholm, F., Naslund, M., and K.
              Norrman, "MIKEY: Multimedia Internet KEYing", RFC 3830,
              August 2004.

   [RFC4474]  Peterson, J. and C.
              Jennings, "Enhancements for
              Authenticated Identity Management in the Session
              Initiation Protocol (SIP)", RFC 4474, August 2006.

   [RFC4492]  Blake-Wilson, S., Bolyard, N., Gupta, V., Hawk, C., and B.
              Moeller, "Elliptic Curve Cryptography (ECC) Cipher Suites
              for Transport Layer Security (TLS)", RFC 4492, May 2006.

   [RFC4568]  Andreasen, F., Baugher, M., and D. Wing, "Session
              Description Protocol (SDP) Security Descriptions for Media
              Streams", RFC 4568, July 2006.

   [RFC4650]  Euchner, M., "HMAC-Authenticated Diffie-Hellman for
              Multimedia Internet KEYing (MIKEY)", RFC 4650,
              September 2006.

   [RFC4738]  Ignjatic, D., Dondeti, L., Audet, F., and P. Lin,
              "MIKEY-RSA-R: An Additional Mode of Key Distribution in
              Multimedia Internet KEYing (MIKEY)", RFC 4738,
              November 2006.

   [RFC4771]  Lehtovirta, V., Naslund, M., and K. Norrman, "Integrity
              Transform Carrying Roll-Over Counter for the Secure
              Real-time Transport Protocol (SRTP)", RFC 4771,
              January 2007.

   [RFC4916]  Elwell, J., "Connected Identity in the Session Initiation
              Protocol (SIP)", RFC 4916, June 2007.

   [RFC4949]  Shirey, R., "Internet Security Glossary, Version 2",
              RFC 4949, August 2007.

   [RFC5027]  Andreasen, F. and D. Wing, "Security Preconditions for
              Session Description Protocol (SDP) Media Streams",
              RFC 5027, October 2007.

Appendix A.  Overview and Evaluation of Existing Keying Mechanisms

   Based on how the SRTP keys are exchanged, each SRTP key exchange
   mechanism belongs to one general category:

   signaling path:
      All the keying is carried in the call signaling (SIP or SDP)
      path.

   media path:
      All the keying is carried in the SRTP/SRTCP media path, and no
      signaling whatsoever is carried in the call signaling path.

   signaling and media path:
      Parts of the keying are carried in the SRTP/SRTCP media path,
      and parts are carried in the call signaling (SIP or SDP) path.

   One of the significant benefits of SRTP over other end-to-end
   encryption mechanisms, such as IPsec, is that SRTP is
   bandwidth efficient and SRTP retains the header of RTP packets.
   Bandwidth efficiency is vital for VoIP in many scenarios where access
   bandwidth is limited or expensive, and retaining the RTP header is
   important for troubleshooting packet loss, delay, and jitter.

   Related to SRTP's characteristics is the goal that any SRTP keying
   mechanism also be efficient and not cause additional call setup
   delay. Contributors to additional call setup delay include network
   or database operations (retrieval of certificates and additional SIP
   or media path messages) and the computational overhead of
   establishing keys or validating certificates.

   When examining the choice between keying in the signaling path,
   keying in the media path, or keying in both paths, it is important to
   realize the media path is generally 'faster' than the SIP signaling
   path. The SIP signaling path has computational elements involved
   which parse and route SIP messages. The media path, on the other
   hand, does not normally have computational elements involved, and
   even when computational elements such as firewalls are involved, they
   cause very little additional delay. Thus, the media path can be
   useful for exchanging several messages to establish SRTP keys. A
   disadvantage of keying over the media path is that interworking
   different key exchange mechanisms requires the interworking function
   to be in the media path, rather than just in the signaling path; in
   practice this involvement is probably unavoidable anyway.

A.1.  Signaling Path Keying Techniques

A.1.1.  MIKEY-NULL

   MIKEY-NULL [RFC3830] has the offerer indicate the SRTP keys for both
   directions. The key is sent unencrypted in SDP, which means the SDP
   must be encrypted hop-by-hop (e.g., by using TLS (SIPS)) or
   end-to-end (e.g., by using S/MIME).

   MIKEY-NULL requires one message from offerer to answerer (half a
   round trip), and does not add additional media path messages.

A.1.2.  MIKEY-PSK

   MIKEY-PSK (pre-shared key) [RFC3830] requires that all endpoints
   share one common key. MIKEY-PSK has the offerer encrypt the SRTP
   keys for both directions using this pre-shared key.

   MIKEY-PSK requires one message from offerer to answerer (half a round
   trip), and does not add additional media path messages.

A.1.3.  MIKEY-RSA

   MIKEY-RSA [RFC3830] has the offerer encrypt the keys for both
   directions using the intended answerer's public key, which is
   obtained from a mechanism outside of MIKEY.

   MIKEY-RSA requires one message from offerer to answerer (half a round
   trip), and does not add additional media path messages. MIKEY-RSA
   requires the offerer to obtain the intended answerer's certificate.

A.1.4.  MIKEY-RSA-R

   MIKEY-RSA-R [RFC4738] is essentially the same as MIKEY-RSA but
   reverses the role of the offerer and the answerer with regard to
   providing the keys. That is, the answerer encrypts the keys for both
   directions using the offerer's public key. Both the offerer and
   answerer validate each other's public keys using standard X.509
   validation techniques. MIKEY-RSA-R also enables sending certificates
   in the MIKEY message.

   MIKEY-RSA-R requires one message from offerer to answerer, and one
   message from answerer to offerer (full round trip), and does not add
   additional media path messages. MIKEY-RSA-R requires the offerer
   to validate the answerer's certificate.

A.1.5.  MIKEY-DHSIGN

   In MIKEY-DHSIGN [RFC3830] the offerer and answerer derive the key
   from a Diffie-Hellman exchange. In order to prevent an active
   man-in-the-middle attack, the DH exchange itself is signed using
   each endpoint's private key and the associated public keys are
   validated using standard X.509 validation techniques.

   MIKEY-DHSIGN requires one message from offerer to answerer, and one
   message from answerer to offerer (full round trip), and does not add
   additional media path messages. MIKEY-DHSIGN requires the offerer
   and answerer to validate each other's certificates. MIKEY-DHSIGN
   also enables sending the answerer's certificate in the MIKEY message.

A.1.6.  MIKEY-DHHMAC

   MIKEY-DHHMAC [RFC4650] uses a pre-shared secret to HMAC the
   Diffie-Hellman exchange, essentially combining aspects of MIKEY-PSK
   with MIKEY-DHSIGN, but without MIKEY-DHSIGN's need for certificate
   authentication.

   MIKEY-DHHMAC requires one message from offerer to answerer, and one
   message from answerer to offerer (full round trip), and does not add
   additional media path messages.

A.1.7.  MIKEY-ECIES and MIKEY-ECMQV (MIKEY-ECC)

   ECC Algorithms For MIKEY [I-D.ietf-msec-mikey-ecc] describes how ECC
   can be used with MIKEY-RSA (using ECDSA signature) and with
   MIKEY-DHSIGN (using a new DH-Group code), and also defines two new
   ECC-based algorithms, Elliptic Curve Integrated Encryption Scheme
   (ECIES) and Elliptic Curve Menezes-Qu-Vanstone (ECMQV).

   With this proposal, the ECDSA signature, MIKEY-ECIES, and MIKEY-ECMQV
   function exactly like MIKEY-RSA, and the new DH-Group code functions
   exactly like MIKEY-DHSIGN. Therefore these ECC mechanisms are not
   discussed separately in this document.

A.1.8.  Security Descriptions with SIPS

   Security Descriptions [RFC4568] has each side indicate the key it
   will use for transmitting SRTP media, and the keys are sent in the
   clear in SDP. Security Descriptions relies on hop-by-hop (TLS via
   "SIPS:") encryption to protect the keys exchanged in signaling.

   Security Descriptions requires one message from offerer to answerer,
   and one message from answerer to offerer (full round trip), and does
   not add additional media path messages.

A.1.9.  Security Descriptions with S/MIME

   This keying mechanism is identical to Appendix A.1.8, except that
   rather than protecting the signaling with TLS, the entire SDP is
   encrypted with S/MIME.

A.1.10.  SDP-DH (expired)

   SDP Diffie-Hellman [I-D.baugher-mmusic-sdp-dh] exchanges
   Diffie-Hellman messages in the signaling path to establish session
   keys. To protect against active man-in-the-middle attacks, the
   Diffie-Hellman exchange needs to be protected with S/MIME, SIPS, or
   SIP Identity [RFC4474] and SIP Connected Identity [RFC4916].

   SDP-DH requires one message from offerer to answerer, and one message
   from answerer to offerer (full round trip), and does not add
   additional media path messages.

A.1.11.  MIKEYv2 in SDP (expired)

   MIKEYv2 [I-D.dondeti-msec-rtpsec-mikeyv2] adds mode negotiation to
   MIKEYv1 and removes the time synchronization requirement. It
   therefore now takes 2 round trips to complete. In the first round
   trip, the communicating parties learn each other's identities, agree
   on a MIKEY mode, crypto algorithm, and SRTP policy, and exchange
   nonces for replay protection. In the second round trip, they
   negotiate unicast and/or group SRTP context for SRTP and/or SRTCP.

   Furthermore, MIKEYv2 also defines an in-band negotiation mode as an
   alternative to SDP (see Appendix A.3.3).

A.2.  Media Path Keying Technique

A.2.1.  ZRTP

   ZRTP [I-D.zimmermann-avt-zrtp] does not exchange information in the
   signaling path, although it is possible for endpoints to exchange a
   hash of the ZRTP Hello message with "a=zrtp-hash" in the initial
   Offer if sent over an integrity-protected signaling channel; this
   provides some useful correlation between the signaling and media
   layers. In ZRTP the keys are exchanged entirely in the media path
   using a Diffie-Hellman exchange. The advantage to this mechanism is
   that the signaling channel is used only for call setup and the media
   channel is used to establish an encrypted channel -- much like
   encryption devices on the PSTN. ZRTP uses voice authentication of
   its Diffie-Hellman exchange by having each person read digits or
   words to the other person. Subsequent sessions with the same ZRTP
   endpoint can be authenticated using the stored hash of the previously
   negotiated key rather than voice authentication. ZRTP uses 4 media
   path messages (Hello, Commit, DHPart1, and DHPart2) to establish the
   SRTP key, and 3 media path confirmation messages. These initial
   messages are all sent as non-RTP packets.

      Note that when ZRTP probing is used, unencrypted RTP can be
      exchanged until the SRTP keys are established.

A.3.  Signaling and Media Path Keying Techniques

A.3.1.  EKT

   EKT [I-D.mcgrew-srtp-ekt] relies on another SRTP key exchange
   protocol, such as Security Descriptions or MIKEY, for bootstrapping.
   In the initial phase, each member of a conference uses an SRTP key
   exchange protocol to establish a common key encryption key (KEK).
   Each member may use the KEK to securely transport its SRTP master key
   and current SRTP rollover counter (ROC), via RTCP, to the other
   participants in the session.
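
   The data flow described above can be sketched as follows. This is
   only an illustration under stated assumptions: the byte layout, the
   names wrap_for_ekt and unwrap_from_ekt, and the hash-based toy
   cipher are all hypothetical stand-ins, not the cipher or message
   format defined in [I-D.mcgrew-srtp-ekt]. It shows only that every
   member holding the common KEK can recover another member's SRTP
   master key and ROC.

```python
import hashlib
import os
import struct


def _toy_stream(kek: bytes, nonce: bytes, length: int) -> bytes:
    # Hypothetical stand-in for the negotiated EKT cipher.
    # NOT secure and NOT the algorithm from the EKT draft.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(kek + nonce + struct.pack("!I", counter)).digest()
        counter += 1
    return out[:length]


def wrap_for_ekt(kek: bytes, master_key: bytes, roc: int) -> bytes:
    """Protect (master key, ROC) under the shared KEK (assumed layout)."""
    nonce = os.urandom(8)  # fresh per message so the toy stream is not reused
    plaintext = master_key + struct.pack("!I", roc)
    stream = _toy_stream(kek, nonce, len(plaintext))
    return nonce + bytes(p ^ s for p, s in zip(plaintext, stream))


def unwrap_from_ekt(kek: bytes, blob: bytes):
    """Recover (master key, ROC) from a blob made by wrap_for_ekt."""
    nonce, body = blob[:8], blob[8:]
    stream = _toy_stream(kek, nonce, len(body))
    plaintext = bytes(c ^ s for c, s in zip(body, stream))
    return plaintext[:-4], struct.unpack("!I", plaintext[-4:])[0]


# The KEK comes from the bootstrapping protocol (e.g., MIKEY).
kek = bytes(range(16))
sender_master_key = bytes(range(100, 116))
sender_roc = 3

blob = wrap_for_ekt(kek, sender_master_key, sender_roc)  # carried via RTCP
recovered_key, recovered_roc = unwrap_from_ekt(kek, blob)
```

   A participant that does not hold the KEK cannot derive the stream
   and therefore cannot recover the master key; in a real deployment
   that property would come from the cipher selected through the
   bootstrapping protocol rather than from this toy construction.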

   EKT requires the offerer to send some parameters (EKT_Cipher, KEK,
   and security parameter index (SPI)) via the bootstrapping protocol
   such as Security Descriptions or MIKEY. Each answerer sends an SRTCP
   message which contains the answerer's SRTP Master Key, rollover
   counter, and the SRTP sequence number. Rekeying is done by sending a
   new SRTCP message. For reliable transport, multiple RTCP messages
   need to be sent.

A.3.2.  DTLS-SRTP

   DTLS-SRTP [I-D.ietf-avt-dtls-srtp] exchanges public key fingerprints
   in SDP [I-D.fischl-sipping-media-dtls] and then establishes a DTLS
   session over the media channel. The endpoints use the DTLS handshake
   to agree on crypto suites and establish SRTP session keys. SRTP
   packets are then exchanged between the endpoints.

   DTLS-SRTP requires one message from offerer to answerer (half round
   trip), and one message from the answerer to offerer (full round trip)
   so the offerer can correlate the SDP answer with the answering
   endpoint. DTLS-SRTP uses 4 media path messages to establish the SRTP
   key.

   This document assumes DTLS will use TLS_RSA_WITH_AES_128_CBC_SHA as
   its cipher suite, which is the mandatory-to-implement cipher suite in
   TLS [I-D.ietf-tls-rfc4346-bis].

A.3.3.  MIKEYv2 Inband (expired)

   As mentioned in Appendix A.1.11, MIKEYv2 also defines an in-band
   negotiation mode as an alternative to SDP. The draft does not yet
   specify what in-band actually means (i.e., UDP, RTP, RTCP, etc.).

A.4.  Evaluation Criteria - SIP

   This section considers how each keying mechanism interacts with SIP
   features.

A.4.1.  Secure Retargeting and Secure Forking

   Retargeting and forking of signaling requests is described within
   Section 4.2. The following builds upon this description.
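
   The "two-time pad" considered in this section arises when two
   forked branches hold the same SRTP key and choose the same SSRC, so
   that both encrypt with an identical keystream. The sketch below
   illustrates the consequence. The keystream generator is a
   hash-based stand-in rather than SRTP's AES counter mode, and all
   names and values are hypothetical; the only property relied on is
   that the keystream is determined by the key, the SSRC, and the
   packet index.

```python
import hashlib


def toy_keystream(key: bytes, ssrc: int, index: int, length: int) -> bytes:
    # Stand-in keystream generator (NOT SRTP's AES-CM): like SRTP, the
    # output depends only on the key, the SSRC, and the packet index.
    out = b""
    block = 0
    while len(out) < length:
        seed = (key + ssrc.to_bytes(4, "big") + index.to_bytes(6, "big")
                + block.to_bytes(4, "big"))
        out += hashlib.sha256(seed).digest()
        block += 1
    return out[:length]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, ssrc: int, index: int, payload: bytes) -> bytes:
    return xor_bytes(payload, toy_keystream(key, ssrc, index, len(payload)))


# Hypothetical scenario: the offerer's key reached every forked branch,
# and two branches happened to pick the same 32-bit SSRC.
shared_key = bytes(range(16))
ssrc = 0x1234ABCD
index = 1  # same packet index (ROC and sequence number) on both branches

payload_a = b"branch A audio.."
payload_b = b"branch B audio.."

cipher_a = encrypt(shared_key, ssrc, index, payload_a)
cipher_b = encrypt(shared_key, ssrc, index, payload_b)

# An eavesdropper who XORs the two ciphertexts cancels the keystream
# and obtains the XOR of the two plaintexts without knowing the key.
leak = xor_bytes(cipher_a, cipher_b)
```

   Mechanisms in which each branch derives its own key avoid this
   condition, because distinct keys yield distinct keystreams even
   when the SSRC values collide.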

   The following list compares the behavior of secure forking, answering
   association, two-time pads, and secure retargeting for each keying
   mechanism.

   MIKEY-NULL
      Secure Forking: No, all AORs see offerer's and
      answerer's keys. Answer is associated with media by the SSRC
      in MIKEY. Additionally, a two-time pad occurs if two branches
      choose the same 32-bit SSRC and transmit SRTP packets.

      Secure Retargeting: No, all targets see offerer's and
      answerer's keys. Suffers from retargeting identity problem.

   MIKEY-PSK
      Secure Forking: No, all AORs see offerer's and answerer's keys.
      Answer is associated with media by the SSRC in MIKEY. Note
      that all AORs must share the same pre-shared key in order for
      forking to work at all with MIKEY-PSK. Additionally, a
      two-time pad occurs if two branches choose the same 32-bit SSRC
      and transmit SRTP packets.

      Secure Retargeting: Not secure. For retargeting to work, the
      final target must possess the correct PSK. While this is likely
      in scenarios where the call is targeted to another device
      belonging to the same user (forking), it is very unlikely that
      other users will possess that PSK and be able to successfully
      answer that call.

   MIKEY-RSA
      Secure Forking: No, all AORs see offerer's and answerer's keys.
      Answer is associated with media by the SSRC in MIKEY. Note
      that all AORs must share the same private key in order for
      forking to work at all with MIKEY-RSA. Additionally, a
      two-time pad occurs if two branches choose the same 32-bit SSRC
      and transmit SRTP packets.

      Secure Retargeting: No.

   MIKEY-RSA-R
      Secure Forking: Yes. Answer is associated with media by the
      SSRC in MIKEY.

      Secure Retargeting: Yes.

   MIKEY-DHSIGN
      Secure Forking: Yes, each forked endpoint negotiates unique
      keys with the offerer for both directions. Answer is
      associated with media by the SSRC in MIKEY.
1359 Secure Retargeting: Yes, each target negotiates unique keys 1360 with the offerer for both directions. 1362 MIKEYv2 in SDP 1363 The behavior will depend on which mode is picked. 1365 MIKEY-DHHMAC 1366 Secure Forking: Yes, each forked endpoint negotiates unique 1367 keys with the offerer for both directions. Answer is 1368 associated with media by the SSRC in MIKEY. 1370 Secure Retargeting: Yes, each target negotiates unique keys 1371 with the offerer for both directions. Note that for the keys 1372 to be meaningful, it would require the PSK to be the same for 1373 all the potential intermediaries, which would only happen 1374 within a single domain. 1376 Security Descriptions with SIPS 1377 Secure Forking: No. Each forked endpoint sees the offerer's 1378 key. Answer is not associated with media. 1380 Secure Retargeting: No. Each target sees the offerer's key. 1382 Security Descriptions with S/MIME 1383 Secure Forking: No. Each forked endpoint sees the offerer's 1384 key. Answer is not associated with media. 1386 Secure Retargeting: No. Each target sees the offerer's key. 1387 Suffers from retargeting identity problem. 1389 SDP-DH 1390 Secure Forking: Yes. Each forked endpoint calculates a unique 1391 SRTP key. Answer is not associated with media. 1393 Secure Retargeting: Yes. The final target calculates a unique 1394 SRTP key. 1396 ZRTP 1397 Yes. Each forked endpoint calculates a unique SRTP key. With 1398 the "a=zrtp-hash" attribute, the media can be associated with 1399 an answer. 1401 Secure Retargeting: Yes. The final target calculates a unique 1402 SRTP key. 1404 EKT 1405 Secure Forking: Inherited from the bootstrapping mechanism (the 1406 specific MIKEY mode or Security Descriptions). Answer is 1407 associated with media by the SPI in the EKT protocol. Answer 1408 is associated with media by the SPI in the EKT protocol. 1410 Secure Retargeting: Inherited from the bootstrapping mechanism 1411 (the specific MIKEY mode or Security Descriptions). 
1413 DTLS-SRTP 1414 Secure Forking: Yes. Each forked endpoint calculates a unique 1415 SRTP key. Answer is associated with media by the certificate 1416 fingerprint in signaling and certificate in the media path. 1418 Secure Retargeting: Yes. The final target calculates a unique 1419 SRTP key. 1421 MIKEYv2 Inband 1422 The behavior will depend on which mode is picked. 1424 A.4.2. Clipping Media Before SDP Answer 1426 Clipping media before receiving the signaling answer is described 1427 within Section 4.1. The following builds upon this description. 1429 Furthermore, the problem of clipping gets compounded when forking is 1430 used. For example, if using a Diffie-Hellman keying technique with 1431 security preconditions that forks to 20 endpoints, the call initiator 1432 would get 20 provisional responses containing 20 signed Diffie- 1433 Hellman half keys. Calculating 20 DH secrets and validating 1434 signatures can be a difficult task depending on the device 1435 capabilities. 1437 The following list compares the behavior of clipping before SDP 1438 answer for each keying mechanism. 1440 MIKEY-NULL 1441 Not clipped. The offerer provides the answerer's keys. 1443 MIKEY-PSK 1444 Not clipped. The offerer provides the answerer's keys. 1446 MIKEY-RSA 1447 Not clipped. The offerer provides the answerer's keys. 1449 MIKEY-RSA-R 1450 Clipped. The answer contains the answerer's encryption key. 1452 MIKEY-DHSIGN 1453 Clipped. The answer contains the answerer's Diffie-Hellman 1454 response. 1456 MIKEY-DHHMAC 1457 Clipped. The answer contains the answerer's Diffie-Hellman 1458 response. 1460 MIKEYv2 in SDP 1461 The behavior will depend on which mode is picked. 1463 Security Descriptions with SIPS 1464 Clipped. The answer contains the answerer's encryption key. 1466 Security Descriptions with S/MIME 1467 Clipped. The answer contains the answerer's encryption key. 1469 SDP-DH 1470 Clipped. The answer contains the answerer's Diffie-Hellman 1471 response. 
1473 ZRTP 1474 Not clipped because the session initially uses RTP. While RTP 1475 is flowing, both ends negotiate SRTP keys in the media path and 1476 then switch to using SRTP. 1478 EKT 1479 Not clipped, as long as the first RTCP packet (containing the 1480 answerer's key) is not lost in transit. The answerer sends its 1481 encryption key in RTCP, which arrives at the same time (or 1482 before) the first SRTP packet encrypted with that key. 1484 Note: RTCP needs to work, in the answerer-to-offerer 1485 direction, before the offerer can decrypt SRTP media. 1487 DTLS-SRTP 1488 No clipping after the DTLS-SRTP handshake has completed. SRTP 1489 keys are exchanged in the media path. Need to wait for SDP 1490 answer to ensure DTLS-SRTP handshake was done with an 1491 authorized party. 1493 If a middlebox interferes with the media path, there can be 1494 clipping [I-D.ietf-mmusic-media-path-middleboxes]. 1496 MIKEYv2 Inband 1497 Not clipped. Keys are exchanged in the media path without 1498 relying on the signaling path. 1500 A.4.3. SSRC and ROC 1502 In SRTP, a cryptographic context is defined as the SSRC, destination 1503 network address, and destination transport port number. In RTP, by contrast, 1504 a flow is defined as the destination network address and destination 1505 transport port number. This results in a problem -- how to 1506 communicate the SSRC so that the SSRC can be used for the 1507 cryptographic context. 1509 Several approaches have emerged for this communication. One, used by all 1510 MIKEY modes, is to communicate the SSRCs to the peer in the MIKEY 1511 exchange. Another, used by Security Descriptions, is to use "late 1512 binding" -- that is, any new packet containing a previously-unseen 1513 SSRC (which arrives at the same destination network address and 1514 destination transport port number) will create a new cryptographic 1515 context.
Another approach, common amongst techniques with media-path 1516 SRTP key establishment, is to require a handshake over that media 1517 path before SRTP packets are sent. MIKEY's approach changes RTP's 1518 SSRC collision detection behavior by requiring RTP to pre-establish 1519 the SSRC values for each session. 1521 Another related issue is that SRTP introduces a rollover counter 1522 (ROC), which records how many times the SRTP sequence number has 1523 rolled over. As the sequence number is used for SRTP's default 1524 ciphers, it is important that all endpoints know the value of the 1525 ROC. The ROC starts at 0 at the beginning of a session. 1527 Some keying mechanisms cause a two-time pad to occur if two endpoints 1528 of a forked call have an SSRC collision. 1530 Note: A proposal has been made to send the ROC value on every Nth 1531 SRTP packet[RFC4771]. This proposal has not yet been incorporated 1532 into this document. 1534 The following list examines handling of SSRC and ROC: 1536 MIKEY-NULL 1537 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1538 packets it transmits. 1540 MIKEY-PSK 1541 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1542 packets it transmits. 1544 MIKEY-RSA 1545 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1546 packets it transmits. 1548 MIKEY-RSA-R 1549 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1550 packets it transmits. 1552 MIKEY-DHSIGN 1553 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1554 packets it transmits. 1556 MIKEY-DHHMAC 1557 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1558 packets it transmits. 1560 MIKEYv2 in SDP 1561 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1562 packets it transmits. 1564 Security Descriptions with SIPS 1565 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1566 used. 1568 Security Descriptions with S/MIME 1569 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1570 used. 
1572 SDP-DH 1573 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1574 used. 1576 ZRTP 1577 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1578 used. 1580 EKT 1581 The SSRC of the SRTCP packet containing an EKT update 1582 corresponds to the SRTP master key and other parameters within 1583 that packet. 1585 DTLS-SRTP 1586 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1587 used. 1589 MIKEYv2 Inband 1590 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1591 packets it transmits. 1593 A.5. Evaluation Criteria - Security 1595 This section evaluates each keying mechanism on the basis of their 1596 security properties. 1598 A.5.1. Distribution and Validation of Persistent Public Keys and 1599 Certificates 1601 Using persistent public keys for confidentiality and authentication 1602 can introduce requirements for two types of systems, often 1603 implemented using certificates: (1) a system to distribute those 1604 persistent public keys or certificates, and (2) a system for validating 1605 those persistent public keys. We refer to the former as a key 1606 distribution system and the latter as an authentication 1607 infrastructure. In many cases, a monolithic public key 1608 infrastructure (PKI) is used to fulfill both of these roles. 1609 However, these functions can be provided by many other systems. For 1610 instance, key distribution may be accomplished by any public 1611 repository of keys. Any system in which the two endpoints have 1612 access to trust anchors and intermediate CA certificates that can be 1613 used to validate other endpoints' certificates (including a system of 1614 self-signed certificates) can be used to support certificate 1615 validation in the schemes below. 1617 With real-time communications it is desirable to avoid fetching or 1618 validating certificates that delay call setup. Rather, it is 1619 preferable to fetch or validate certificates in such a way that call 1620 setup is not delayed.
For example, a certificate can be validated 1621 while the phone is ringing or can be validated while ring-back tones 1622 are being played or even while the called party is answering the 1623 phone and saying "hello". Even better is to avoid fetching or 1624 validating persistent public keys at all. 1626 SRTP key exchange mechanisms that require a particular authentication 1627 infrastructure to operate (whether for distribution or validation) 1628 are gated on the deployment of such an infrastructure available to 1629 both endpoints. This means that no media security is achievable 1630 until such an infrastructure exists. For SIP, something like sip- 1631 certs [I-D.ietf-sip-certs] might be used to obtain the certificate of 1632 a peer. 1634 Note: Even if sip-certs [I-D.ietf-sip-certs] were deployed, the 1635 retargeting problem (Appendix A.4.1) would still prevent 1636 successful deployment of keying techniques which require the 1637 offerer to obtain the actual target's public key. 1639 The following list compares the requirements introduced by the use of 1640 public-key cryptography in each keying mechanism, both for public key 1641 distribution and for certificate validation. 1643 MIKEY-NULL 1644 Public-key cryptography is not used. 1646 MIKEY-PSK 1647 Public-key cryptography is not used. Rather, all endpoints 1648 must have some way to exchange per-endpoint or per-system pre- 1649 shared keys. 1651 MIKEY-RSA 1652 The offerer obtains the intended answerer's public key before 1653 initiating the call. This public key is used to encrypt the 1654 SRTP keys. There is no defined mechanism for the offerer to 1655 obtain the answerer's public key, although [I-D.ietf-sip-certs] 1656 might be viable in the future. 1658 The offer may also contain a certificate for the offerer, which 1659 would require an authentication infrastructure in order to be 1660 validated by the receiver.
1662 MIKEY-RSA-R 1663 The offer contains the offerer's certificate, and the answer 1664 contains the answerer's certificate. The answerer uses the 1665 public key in the certificate to encrypt the SRTP keys that 1666 will be used by the offerer and the answerer. An 1667 authentication infrastructure is necessary to validate the 1668 certificates. 1670 MIKEY-DHSIGN 1671 An authentication infrastructure is used to authenticate the 1672 public key that is included in the MIKEY message. 1674 MIKEY-DHHMAC 1675 Public-key cryptography is not used. Rather, all endpoints 1676 must have some way to exchange per-endpoint or per-system pre- 1677 shared keys. 1679 MIKEYv2 in SDP 1680 The behavior will depend on which mode is picked. 1682 Security Descriptions with SIPS 1683 Public-key cryptography is not used. 1685 Security Descriptions with S/MIME 1686 Use of S/MIME requires that the endpoints be able to fetch and 1687 validate certificates for each other. The offerer must obtain 1688 the intended target's certificate and encrypt the SDP offer 1689 with the public key contained in the target's certificate. The 1690 answerer must obtain the offerer's certificate and encrypt the 1691 SDP answer with the public key contained in the offerer's 1692 certificate. 1694 SDP-DH 1695 Public-key cryptography is not used. 1697 ZRTP 1698 Public-key cryptography is used (Diffie-Hellman), but without 1699 dependence on persistent public keys. Thus, certificates are 1700 not fetched or validated. 1702 EKT 1703 Public-key cryptography is not used by itself, but might be 1704 used by the EKT bootstrapping keying mechanism (such as certain 1705 MIKEY modes). 1707 DTLS-SRTP 1708 Remote party's certificate is sent in media path, and a 1709 fingerprint of the same certificate is sent in the signaling 1710 path. 1712 MIKEYv2 Inband 1713 The behavior will depend on which mode is picked. 1715 A.5.2.
Perfect Forward Secrecy 1717 In the context of SRTP, Perfect Forward Secrecy is the property that 1718 SRTP session keys that protected a previous session are not 1719 compromised if the static keys belonging to the endpoints are 1720 compromised. That is, if someone were to record your encrypted 1721 session content and later acquire either party's private key, that 1722 encrypted session content would be safe from decryption if your key 1723 exchange mechanism had perfect forward secrecy. 1725 The following list describes whether each key exchange mechanism provides 1726 PFS. 1728 MIKEY-NULL 1729 Not applicable; MIKEY-NULL does not have a long-term secret. 1731 MIKEY-PSK 1732 No PFS. 1734 MIKEY-RSA 1735 No PFS. 1737 MIKEY-RSA-R 1738 No PFS. 1740 MIKEY-DHSIGN 1741 PFS is provided with the Diffie-Hellman exchange. 1743 MIKEY-DHHMAC 1744 PFS is provided with the Diffie-Hellman exchange. 1746 MIKEYv2 in SDP 1747 The behavior will depend on which mode is picked. 1749 Security Descriptions with SIPS 1750 Not applicable; Security Descriptions does not have a long-term 1751 secret. 1753 Security Descriptions with S/MIME 1754 Not applicable; Security Descriptions does not have a long-term 1755 secret. 1757 SDP-DH 1758 PFS is provided with the Diffie-Hellman exchange. 1760 ZRTP 1761 PFS is provided with the Diffie-Hellman exchange. 1763 EKT 1764 No PFS. 1766 DTLS-SRTP 1767 PFS is provided if the negotiated cipher suite uses ephemeral 1768 keys (e.g., Diffie-Hellman (DHE_RSA [I-D.ietf-tls-rfc4346-bis]) 1769 or Elliptic Curve Diffie-Hellman [RFC4492]). 1771 MIKEYv2 Inband 1772 The behavior will depend on which mode is picked. 1774 A.5.3. Best Effort Encryption 1776 With best effort encryption, SRTP is used with endpoints that support 1777 SRTP, otherwise RTP is used.
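The best effort rule stated above can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name and the endpoint names are hypothetical, and the sketch ignores how the capability is actually discovered (that is what the candidate techniques in this section address).

```python
def choose_media_profile(offerer_supports_srtp: bool,
                         answerer_supports_srtp: bool) -> str:
    """Return the RTP profile a best effort policy would settle on.

    SRTP (RTP/SAVP) is used only when both ends support it;
    otherwise the session falls back to plain RTP (RTP/AVP).
    """
    if offerer_supports_srtp and answerer_supports_srtp:
        return "RTP/SAVP"
    return "RTP/AVP"

# Hypothetical forked targets: an RTP-only phone and an SRTP-capable
# voice mail system, both reachable by an SRTP-capable offerer.
targets = {"bob-phone": False, "bob-voicemail": True}
for name, supports_srtp in targets.items():
    print(name, choose_media_profile(True, supports_srtp))
```

The point of the sketch is that the decision is made per answering endpoint, which is why a single offer has to be acceptable to both kinds of device.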
1779 SIP needs a backwards-compatible best effort encryption in order for 1780 SRTP to work successfully with SIP retargeting and forking when there 1781 is a mix of forked or retargeted devices that support SRTP and don't 1782 support SRTP. 1784 Consider the case of Bob, with a phone that only does RTP and a 1785 voice mail system that supports SRTP and RTP. If Alice calls Bob 1786 with an SRTP offer, Bob's RTP-only phone will reject the media 1787 stream (with an empty "m=" line) because Bob's phone doesn't 1788 understand SRTP (RTP/SAVP). Alice's phone will see this rejected 1789 media stream and may terminate the entire call (BYE) and re- 1790 initiate the call as RTP-only, or Alice's phone may decide to 1791 continue with call setup with the SRTP-capable leg (the voice mail 1792 system). If Alice's phone decided to re-initiate the call as RTP- 1793 only, and Bob doesn't answer his phone, Alice will then leave 1794 voice mail using only RTP, rather than SRTP as expected. 1796 Currently, several techniques are commonly considered as candidates 1797 to provide opportunistic encryption: 1799 multipart/alternative 1800 [I-D.jennings-sipping-multipart] describes how to form a 1801 multipart/alternative body part in SIP. The significant issues 1802 with this technique are (1) that multipart MIME is incompatible 1803 with existing SIP proxies, firewalls, Session Border Controllers, 1804 and endpoints and (2) when forking, the Heterogeneous Error 1805 Response Forking Problem (HERFP) [RFC3326] causes problems if such 1806 non-multipart-capable endpoints were involved in the forking. 1808 session attribute 1809 With this technique, the endpoints signal their desire to do SRTP 1810 by signaling RTP (RTP/AVP), and using an attribute ("a=") in the 1811 SDP. This technique is entirely backwards compatible with non- 1812 SRTP-aware endpoints, but doesn't use the RTP/SAVP protocol 1813 registered by SRTP [RFC3711]. 
1815 SDP Capability Negotiation 1816 SDP Capability Negotiation 1817 [I-D.ietf-mmusic-sdp-capability-negotiation] provides a backwards- 1818 compatible mechanism to allow offering both SRTP and RTP in a 1819 single offer. This is the preferred technique. 1821 Probing 1822 With this technique, the endpoints first establish an RTP session 1823 using RTP (RTP/AVP). The endpoints send probe messages, over the 1824 media path, to determine if the remote endpoint supports their 1825 keying technique. A disadvantage of probing is an active attacker 1826 can interfere with probes, and until probing completes (and SRTP 1827 is established) the media is in the clear. 1829 The preferred technique, SDP Capability Negotiation 1830 [I-D.ietf-mmusic-sdp-capability-negotiation], can be used with all 1831 key exchange mechanisms. What remains unique is ZRTP, which can also 1832 accomplish its best effort encryption by probing (sending ZRTP 1833 messages over the media path) or by session attribute (see "a=zrtp- 1834 hash" in [I-D.zimmermann-avt-zrtp]). Current implementations of ZRTP 1835 use probing. 1837 A.5.4. Upgrading Algorithms 1839 It is necessary to allow upgrading SRTP encryption and hash 1840 algorithms, as well as upgrading the cryptographic functions used for 1841 the key exchange mechanism. With SIP's offer/answer model, this can 1842 be computationally expensive because the offer needs to contain all 1843 combinations of the key exchange mechanisms (all MIKEY modes, 1844 Security Descriptions) and all SRTP cryptographic suites (AES-128, 1845 AES-256) and all SRTP cryptographic hash functions (SHA-1, SHA-256) 1846 that the offerer supports. In order to do this, the offerer has to 1847 expend CPU resources to build an offer containing all of this 1848 information which becomes computationally prohibitive.
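To make the growth in offer size concrete, the following sketch enumerates the combinations an offerer would have to include. The particular lists are an assumed example (three key exchange mechanisms, two ciphers, two hash functions drawn from the names mentioned above); the helper name is hypothetical.

```python
from itertools import product

def offer_combinations(key_exchanges, ciphers, hashes):
    """Enumerate every (key exchange, cipher, hash) triple an offer would carry."""
    return list(product(key_exchanges, ciphers, hashes))

# Assumed example capability sets for one offerer.
key_exchanges = ["MIKEY-RSA", "MIKEY-DHSIGN", "Security Descriptions"]
ciphers = ["AES-128", "AES-256"]
hashes = ["SHA-1", "SHA-256"]

combos = offer_combinations(key_exchanges, ciphers, hashes)
print(len(combos))  # 3 * 2 * 2 = 12

# Adding a single new cipher multiplies, rather than adds to, the total.
more = offer_combinations(key_exchanges, ciphers + ["hypothetical-cipher"], hashes)
print(len(more))    # 3 * 3 * 2 = 18
```

For mechanisms that require a public-key or Diffie-Hellman operation per offered suite, each of these entries carries its own cryptographic cost, which is the expense the per-mechanism comparison in this section examines.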
1850 Thus, it is important to keep the offerer's CPU impact fixed so that 1851 offering multiple new SRTP encryption and hash functions incurs no 1852 additional expense. 1854 The following list describes the CPU effort involved in using each 1855 key exchange technique. 1857 MIKEY-NULL 1858 No significant computational expense. 1860 MIKEY-PSK 1861 No significant computational expense. 1863 MIKEY-RSA 1864 For each offered SRTP crypto suite, the offerer has to perform 1865 an RSA operation to encrypt the TGK. 1867 MIKEY-RSA-R 1868 For each offered SRTP crypto suite, the offerer has to perform 1869 a public key operation to sign the MIKEY message. 1871 MIKEY-DHSIGN 1872 For each offered SRTP crypto suite, the offerer has to perform 1873 a Diffie-Hellman operation, and a public key operation to sign 1874 the Diffie-Hellman output. 1876 MIKEY-DHHMAC 1877 For each offered SRTP crypto suite, the offerer has to perform 1878 a Diffie-Hellman operation. 1880 MIKEYv2 in SDP 1881 The behavior will depend on which mode is picked. 1883 Security Descriptions with SIPS 1884 No significant computational expense. 1886 Security Descriptions with S/MIME 1887 S/MIME requires the offerer and the answerer to encrypt the SDP 1888 with the other's public key, and to decrypt the received SDP 1889 with their own private key. 1891 SDP-DH 1892 For each offered SRTP crypto suite, the offerer has to perform 1893 a Diffie-Hellman operation. 1895 ZRTP 1896 The offerer has no additional computational expense at all, as 1897 the offer contains no information about ZRTP or might contain 1898 "a=zrtp-hash". 1900 EKT 1901 The offerer's computational expense depends entirely on the EKT 1902 bootstrapping mechanism selected (one or more MIKEY modes or 1903 Security Descriptions). 1905 DTLS-SRTP 1906 The offerer has no additional computational expense at all, as 1907 the offer contains only a fingerprint of the certificate that 1908 will be presented in the DTLS exchange.
1910 MIKEYv2 Inband 1911 The behavior will depend on which mode is picked. 1913 Appendix B. Out-of-Scope 1915 The compromise of an endpoint that has access to decrypted media 1916 (e.g., SIP user agent, transcoder, recorder) is out of scope of this 1917 document. Such a compromise might be via privilege escalation, 1918 installation of a virus or trojan horse, or similar attacks. 1920 B.1. Shared Key Conferencing 1922 The consensus on the RTPSEC mailing list was to concentrate on 1923 unicast, point-to-point sessions. Thus, there are no requirements 1924 related to shared key conferencing. This section is retained for 1925 informational purposes. 1927 To scale, large audio and video conference bridges 1928 operate most efficiently by encrypting the current speaker once and 1929 distributing that stream to the conference attendees. Typically, 1930 inactive participants receive the same streams -- they hear (or see) 1931 the active speaker(s), and the active speakers receive distinct 1932 streams that don't include themselves. In order to maintain 1933 confidentiality of such conferences where listeners share a common 1934 key, all listeners must be rekeyed when a listener joins or leaves a 1935 conference. 1937 An important use case for mixers/translators is a conference bridge: 1939 +----+ 1940 A --- 1 --->| | 1941 <-- 2 ----| M | 1942 | I | 1943 B --- 3 --->| X | 1944 <-- 4 ----| E | 1945 | R | 1946 C --- 5 --->| | 1947 <-- 6 ----| | 1948 +----+ 1950 Figure 3: Centralized Keying 1952 In the figure above, 1, 3, and 5 are RTP media contributions from 1953 Alice, Bob, and Carol, and 2, 4, and 6 are the RTP flows to those 1954 devices carrying the 'mixed' media. 1956 Several scenarios are possible: 1958 a. Multiple inbound sessions: 1, 3, and 5 are distinct RTP sessions, 1960 b. Multiple outbound sessions: 2, 4, and 6 are distinct RTP 1961 sessions, 1963 c.
Single inbound session: 1, 3, and 5 are just different sources 1964 within the same RTP session, 1966 d. Single outbound session: 2, 4, and 6 are different flows of the 1967 same (multi-unicast) RTP session 1969 If there are multiple inbound sessions and multiple outbound sessions 1970 (scenarios a and b), then every keying mechanism behaves as if the 1971 mixer were an end point and can set up a point-to-point secure 1972 session between the participant and the mixer. This is the simplest 1973 situation, but is computationally wasteful, since SRTP processing has 1974 to be done independently for each participant. The use of multiple 1975 inbound sessions (scenario a) doesn't waste computational resources, 1976 though it does consume additional cryptographic context on the mixer 1977 for each participant and has the advantage of data origin 1978 authentication. 1980 To support a single outbound session (scenario d), the mixer has to 1981 dictate its encryption key to the participants. Some keying 1982 mechanisms allow the transmitter to determine its own key, and others 1983 allow the offerer to determine the key for the offerer and answerer. 1984 Depending on how the call is established, the offerer might be a 1985 participant (such as a participant dialing into a conference bridge) 1986 or the offerer might be the mixer (such as a conference bridge 1987 calling a participant). The use of offerless INVITEs may help some 1988 keying mechanisms reverse the role of offerer/answerer. A 1989 difficulty, however, is knowing a priori if the role should be 1990 reversed for a particular call. The significant advantage of a 1991 single outbound session is the number of SRTP encryption operations 1992 remains constant even as the number of participants increases. 1993 However, a disadvantage is that data origin authentication is lost, 1994 allowing any participant to spoof the sender (because all 1995 participants know the sender's SRTP key). 1997 Appendix C. 
Requirement renumbering in -02 1999 [[RFC Editor: Please delete this section prior to publication.]] 2001 Previous versions of this document used requirement numbers, which 2002 were changed to mnemonics as follows: 2004 R1 R-FORK-RETARGET 2006 R2 R-BEST-SECURE 2008 R3 R-DISTINCT 2010 R4 R-REUSE; changed from 'MAY' to 'protocol MUST support, and 2011 SHOULD implement' 2013 R5 R-AVOID-CLIPPING 2015 R6 R-PASS-MEDIA 2017 R7 R-PASS-SIG 2019 R8 R-PFS 2021 R9 R-COMPUTE 2023 R10 R-RTP-VALID 2025 R11 (folded into R4; was reuse previous session) 2027 R12 R-CERTS 2029 R13 R-FIPS 2030 R14 R-ASSOC 2032 R15 R-ALLOW-RTP 2034 R16 R-DOS 2036 R17 R-SIG-MEDIA 2038 R18 R-EXISTING 2040 R19 R-AGILITY 2042 R20 R-DOWNGRADE 2044 R21 R-NEGOTIATE 2046 R23 R-OTHER-SIGNALING 2048 R23 R-RECORDING (R23 was duplicated in previous versions of the 2049 document) 2051 R24 (deleted; was lawful intercept) 2053 R25 R-TRANSCODER 2055 R26 R-PSTN 2057 R27 R-ID-BINDING 2059 R28 R-ACT-ACT 2061 Authors' Addresses 2063 Dan Wing (editor) 2064 Cisco Systems, Inc. 2065 170 West Tasman Drive 2066 San Jose, CA 95134 2067 USA 2069 Email: dwing@cisco.com 2070 Steffen Fries 2071 Siemens AG 2072 Otto-Hahn-Ring 6 2073 Munich, Bavaria 81739 2074 Germany 2076 Email: steffen.fries@siemens.com 2078 Hannes Tschofenig 2079 Nokia Siemens Networks 2080 Otto-Hahn-Ring 6 2081 Munich, Bavaria 81739 2082 Germany 2084 Email: Hannes.Tschofenig@nsn.com 2085 URI: http://www.tschofenig.priv.at 2087 Francois Audet 2088 Nortel 2089 4655 Great America Parkway 2090 Santa Clara, CA 95054 2091 USA 2093 Email: audet@nortel.com 2095 Full Copyright Statement 2097 Copyright (C) The IETF Trust (2008). 2099 This document is subject to the rights, licenses and restrictions 2100 contained in BCP 78, and except as set forth therein, the authors 2101 retain all their rights. 
2103 This document and the information contained herein are provided on an 2104 "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS 2105 OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE IETF TRUST AND 2106 THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS 2107 OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF 2108 THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED 2109 WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 2111 Intellectual Property 2113 The IETF takes no position regarding the validity or scope of any 2114 Intellectual Property Rights or other rights that might be claimed to 2115 pertain to the implementation or use of the technology described in 2116 this document or the extent to which any license under such rights 2117 might or might not be available; nor does it represent that it has 2118 made any independent effort to identify any such rights. Information 2119 on the procedures with respect to rights in RFC documents can be 2120 found in BCP 78 and BCP 79. 2122 Copies of IPR disclosures made to the IETF Secretariat and any 2123 assurances of licenses to be made available, or the result of an 2124 attempt made to obtain a general license or permission for the use of 2125 such proprietary rights by implementers or users of this 2126 specification can be obtained from the IETF on-line IPR repository at 2127 http://www.ietf.org/ipr. 2129 The IETF invites any interested party to bring to its attention any 2130 copyrights, patents or patent applications, or other proprietary 2131 rights that may cover technology that may be required to implement 2132 this standard. Please address the information to the IETF at 2133 ietf-ipr@ietf.org. 2135 Acknowledgment 2137 This document was produced using xml2rfc v1.33 (of 2138 http://xml.resource.org/) from a source in RFC-2629 XML format.