SIP Working Group                                           D. Wing, Ed.
Internet-Draft                                                     Cisco
Intended status: Informational                                  S. Fries
Expires: July 13, 2009                                        Siemens AG
                                                           H. Tschofenig
                                                  Nokia Siemens Networks
                                                                F. Audet
                                                                  Nortel
                                                         January 9, 2009

    Requirements and Analysis of Media Security Management Protocols
             draft-ietf-sip-media-security-requirements-09

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.
   It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

   This Internet-Draft will expire on July 13, 2009.

Copyright Notice

   Copyright (c) 2009 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

Abstract

   This document describes requirements for a protocol to negotiate a
   security context for SIP-signaled SRTP media. In addition to the
   natural security requirements, this negotiation protocol must
   interoperate well with SIP in certain ways. A number of proposals
   have been published and a summary of these proposals is in the
   appendix of this document.

Table of Contents

   1. Introduction
   2. Terminology
   3. Attack Scenarios
   4. Call Scenarios and Requirements Considerations
     4.1. Clipping Media Before Signaling Answer
     4.2. Retargeting and Forking
     4.3. Recording
     4.4. PSTN gateway
     4.5. Call Setup Performance
     4.6. Transcoding
     4.7. Upgrading to SRTP
     4.8. Interworking with Other Signaling Protocols
     4.9. Certificates
   5. Requirements
     5.1. Key Management Protocol Requirements
     5.2. Security Requirements
     5.3. Requirements Outside of the Key Management Protocol
   6. Security Considerations
   7. IANA Considerations
   8. Acknowledgements
   9. References
     9.1. Normative References
     9.2. Informative References
   Appendix A. Overview and Evaluation of Existing Keying Mechanisms
     A.1. Signaling Path Keying Techniques
       A.1.1. MIKEY-NULL
       A.1.2. MIKEY-PSK
       A.1.3. MIKEY-RSA
       A.1.4. MIKEY-RSA-R
       A.1.5. MIKEY-DHSIGN
       A.1.6. MIKEY-DHHMAC
       A.1.7. MIKEY-ECIES and MIKEY-ECMQV (MIKEY-ECC)
       A.1.8. Security Descriptions with SIPS
       A.1.9. Security Descriptions with S/MIME
       A.1.10. SDP-DH (expired)
       A.1.11. MIKEYv2 in SDP (expired)
     A.2. Media Path Keying Technique
       A.2.1. ZRTP
     A.3. Signaling and Media Path Keying Techniques
       A.3.1. EKT
       A.3.2. DTLS-SRTP
       A.3.3. MIKEYv2 Inband (expired)
     A.4. Evaluation Criteria - SIP
       A.4.1. Secure Retargeting and Secure Forking
       A.4.2. Clipping Media Before SDP Answer
       A.4.3. SSRC and ROC
     A.5. Evaluation Criteria - Security
       A.5.1. Distribution and Validation of Persistent Public
              Keys and Certificates
       A.5.2. Perfect Forward Secrecy
       A.5.3. Best Effort Encryption
       A.5.4. Upgrading Algorithms
   Appendix B. Out-of-Scope
     B.1. Shared Key Conferencing
   Appendix C. Requirement renumbering in -02
   Authors' Addresses

1. Introduction

   The work on media security started when the Session Initiation
   Protocol (SIP) was still in its infancy. With the increased SIP
   deployment and the availability of new SIP extensions and related
   protocols, the need for end-to-end security was re-evaluated. The
   procedure of re-evaluating prior protocol work and design decisions
   is not an uncommon strategy and, to some extent, considered necessary
   to ensure that the developed protocols indeed meet the previously
   envisioned needs for the users on the Internet.

   This document summarizes media security requirements, i.e.,
   requirements for mechanisms that negotiate security context such as
   cryptographic keys and parameters for SRTP.

   The organization of this document is as follows: Section 2 introduces
   terminology, Section 3 describes various attack scenarios against the
   signaling path and media path, Section 4 provides an overview of
   possible call scenarios, and Section 5 lists requirements for media
   security. The main part of the document concludes with security
   considerations (Section 6), IANA considerations (Section 7), and
   acknowledgements (Section 8). Appendix A lists and compares
   available solution proposals. Appendix A.4 compares
   the different approaches regarding their suitability for the SIP
   signaling scenarios described in Appendix A, while Appendix A.5
   provides a comparison regarding security aspects. Appendix B lists
   non-goals for this document.

2. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119], with the
   important qualification that, unless otherwise stated, these terms
   apply to the design of the media security key management protocol,
   not its implementation or application.

   Furthermore, the terminology described in SIP ([RFC3261]) regarding
   functions and components is used throughout the document.

   Additionally, the following items are used in this document:

   AOR (Address-of-Record): A SIP or SIPS URI that points to a domain
      with a location service that can map the URI to another URI where
      the user might be available. Typically, the location service is
      populated through registrations. An AOR is frequently thought of
      as the "public address" of the user.

   SSRC: The 32-bit value that defines the synchronization source, used
      in RTP. These are generally unique, but collisions can occur.

   two-time pad: The use of the same key and the same keystream to
      encrypt different data.
      For SRTP, a two-time pad occurs if two
      senders are using the same key and the same RTP SSRC value.

   Perfect Forward Secrecy (PFS): The property that disclosure of the
      long-term secret keying material that is used to derive an agreed
      ephemeral key does not compromise the secrecy of agreed keys from
      earlier runs.

   active adversary: An active adversary is able to alter data
      communication to affect its operation (see also [RFC4949]).

   passive adversary: A passive adversary is able to learn information
      from data communication, but not alter that data communication
      (see also [RFC4949]).

   signaling path: The signaling path is the route taken by SIP
      signaling messages transmitted between the calling and called user
      agents. This can be either direct signaling between the calling
      and called user agents or, more commonly, signaling that traverses
      the SIP proxy servers involved in the call setup.

   media path: The media path is the route taken by media packets
      exchanged by the endpoints. In the simplest case, the endpoints
      exchange media directly, and the "media path" is defined by a
      quartet of IP addresses and TCP/UDP ports, along with an IP route.
      In other cases, this path may include RTP relays, mixers,
      transcoders, session border controllers, NATs, or media gateways.

   Moreover, as this document discusses requirements for media security,
   the nomenclature R-XXX is used to mark requirements, where XXX is the
   requirement that needs to be met.

3. Attack Scenarios

   The discussion in this section relates to requirements R-PASS-MEDIA,
   R-PASS-SIG, R-ASSOC, R-SIG-MEDIA, R-ACT-ACT, and R-ID-BINDING.

   This document classifies adversaries according to their access and
   their capabilities. An adversary might have access:

   1. only to the media path,

   2. only to the signaling path,

   3. to the media path and to the signaling path.

   An attacker that is located only along the signaling path, and
   does not have access to media (item 2), is not considered in this
   document.

   There are two different types of adversaries, active and passive. An
   active adversary may need to be active with regard to the
   key-exchange-relevant information traveling along the media path or
   traveling along the signaling path.

   Based on their robustness against the adversary capabilities
   described above, we can group security mechanisms using the following
   labels. This list is generally ordered from easiest to compromise
   (at the top) to more difficult to compromise:

   +---------------+---------+--------------------------------------+
   | SIP signaling | media   | abbreviation                         |
   +---------------+---------+--------------------------------------+
   | none          | passive | no-signaling-passive-media           |
   | none          | active  | no-signaling-active-media            |
   | passive       | passive | passive-signaling-passive-media      |
   | passive       | active  | passive-signaling-active-media       |
   | active        | passive | active-signaling-passive-media       |
   | active        | active  | active-signaling-active-media        |
   | active        | active  | active-signaling-active-media-detect |
   +---------------+---------+--------------------------------------+

   no-signaling-passive-media:
      Access to only the media path is sufficient to reveal the content
      of the media traffic.

   passive-signaling-passive-media:
      A passive attack on the signaling path and a passive attack on the
      media path are necessary to reveal the content of the media
      traffic.

   passive-signaling-active-media:
      A passive attack on the signaling path and an active attack on the
      media path are necessary to reveal the content of the media
      traffic.

   active-signaling-passive-media:
      An active attack on the signaling path and a passive attack on the
      media path are necessary to reveal the content of the media
      traffic.

   no-signaling-active-media:
      Active attack on the media path is sufficient to reveal the
      content of the media traffic.

   active-signaling-active-media:
      Active attack on both the signaling path and the media path is
      necessary to reveal the content of the media traffic.

   active-signaling-active-media-detect:
      Active attack on both signaling and media path is necessary to
      reveal the content of the media traffic (as with
      active-signaling-active-media), and the attack is detectable by
      protocol messages exchanged between the end points.

   For example, unencrypted RTP is vulnerable to
   no-signaling-passive-media.

   As another example, Security Descriptions [RFC4568], when protected
   by TLS (as it is commonly implemented and deployed), belongs in the
   passive-signaling-passive-media category since the adversary needs to
   learn the Security Descriptions key by seeing the SIP signaling
   message at a SIP proxy (assuming that the adversary is in control of
   the SIP proxy). The media traffic can be decrypted using that
   learned key.

   As another example, DTLS-SRTP falls into the
   active-signaling-active-media category when DTLS-SRTP is used with a
   public key based ciphersuite with self-signed certificates and
   without SIP Identity [RFC4474]. An adversary would have to modify
   the fingerprint that is sent along the signaling path and
   subsequently modify the certificates carried in the DTLS handshake
   that travel along the media path. If DTLS-SRTP is used with both
   SIP Identity [RFC4474] and SIP Connected Identity [RFC4916], the
   RFC4474 signature protects both the offer and the answer, and such a
   system would then belong to the active-signaling-active-media-detect
   category (provided, of course, the signaling path to the RFC4474
   authenticator and verifier is secured as per RFC4474 and the RFC4474
   authenticator and verifier are behaving as per RFC4474).
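
   The classes and examples in this section can be written down as a
   small ordered enumeration. The following Python sketch is
   illustrative only and is not part of any protocol: the names
   AttackClass, EXAMPLES, and harder_to_compromise are hypothetical,
   and the numeric order simply mirrors the "generally ordered" rows
   of the table in this section.

```python
from enum import IntEnum


class AttackClass(IntEnum):
    """Robustness classes, in the same order as the table in this section.

    A higher value means the adversary needs more capability. The table
    is only "generally ordered", so treat comparisons as a rough guide.
    """
    NO_SIGNALING_PASSIVE_MEDIA = 0
    NO_SIGNALING_ACTIVE_MEDIA = 1
    PASSIVE_SIGNALING_PASSIVE_MEDIA = 2
    PASSIVE_SIGNALING_ACTIVE_MEDIA = 3
    ACTIVE_SIGNALING_PASSIVE_MEDIA = 4
    ACTIVE_SIGNALING_ACTIVE_MEDIA = 5
    ACTIVE_SIGNALING_ACTIVE_MEDIA_DETECT = 6

    @property
    def label(self):
        """Return the abbreviation used in the table."""
        return self.name.lower().replace("_", "-")


# Examples given in this section; the dictionary keys are illustrative.
EXAMPLES = {
    "unencrypted RTP":
        AttackClass.NO_SIGNALING_PASSIVE_MEDIA,
    "Security Descriptions over TLS":
        AttackClass.PASSIVE_SIGNALING_PASSIVE_MEDIA,
    "DTLS-SRTP, self-signed certificates, no SIP Identity":
        AttackClass.ACTIVE_SIGNALING_ACTIVE_MEDIA,
    "DTLS-SRTP with SIP Identity and Connected Identity":
        AttackClass.ACTIVE_SIGNALING_ACTIVE_MEDIA_DETECT,
}


def harder_to_compromise(a, b):
    """True if example a is in a stronger class than example b."""
    return EXAMPLES[a] > EXAMPLES[b]
```

   Note that DTLS-SRTP appears twice with different classes, because
   the class depends on the mode in which the protocol is operated.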

   The above discussion of DTLS-SRTP demonstrates how a single security
   protocol can be in different classes depending on the mode in which
   it is operated. Other protocols can achieve similar effect by adding
   functions outside of the on-the-wire key management protocol itself.
   Although it may be appropriate to deploy lower-classed mechanisms in
   some cases, the ultimate security requirement for a media security
   negotiation protocol is that it have a mode of operation available
   which is detect-attack, providing protection against passive
   and active attacks as well as detection of such attacks. That is,
   there must be a way to use the protocol so that an active attack is
   required against both the signaling and media paths, and so that such
   attacks are detectable by the endpoints.

4. Call Scenarios and Requirements Considerations

   The following subsections describe call scenarios that pose the most
   challenge to the key management system for media data in cooperation
   with SIP signaling.

   Throughout the subsections, the nomenclature R- is used to state an
   explicit requirement. All of the stated requirements are explained
   in detail in Section 5. The requirements in Section 5 are listed
   according to their association with the key management protocol,
   attack scenarios, and requirements which can be met inside the key
   management protocol or outside of the key management protocol.

4.1. Clipping Media Before Signaling Answer

   The discussion in this section relates to requirements
   R-AVOID-CLIPPING and R-ALLOW-RTP.

   Per the SDP Offer/Answer Model [RFC3264],

      "Once the offerer has sent the offer, it MUST be prepared to
      receive media for any recvonly streams described by that offer.
      It MUST be prepared to send and receive media for any sendrecv
      streams in the offer, and send media for any sendonly streams in
      the offer (of course, it cannot actually send until the peer
      provides an answer with the needed address and port information)."

   To meet this requirement with SRTP, the offerer needs to know the
   SRTP key for arriving media. If either endpoint receives encrypted
   media before it has access to the associated SRTP key, it cannot play
   the media -- causing clipping.

   For key exchange mechanisms that send the answerer's key in SDP, a
   SIP provisional response [RFC3261], such as 183 (session progress),
   is useful. However, the 183 messages are not reliable unless both
   the calling and called end point support PRACK [RFC3262], use TCP
   across all SIP proxies, implement Security Preconditions [RFC5027],
   or both ends implement ICE [I-D.ietf-mmusic-ice] and the answerer
   implements the reliable provisional response mechanism described in
   ICE. Unfortunately, there is not wide deployment of any of these
   techniques and there is industry reluctance to require these
   techniques to avoid the problems described in this section.

   Note that the receipt of an SDP answer is not always sufficient to
   allow media to be played to the offerer. Sometimes, the offerer must
   send media in order to open up firewall holes or NAT bindings before
   media can be received (for details see
   [I-D.ietf-mmusic-media-path-middleboxes]). In this case, even a
   solution that makes the key available before the SDP answer arrives
   will not help.

   Preventing the arrival of early media (i.e., media that arrives at
   the SDP offerer before the SDP answer arrives) might obsolete the
   R-AVOID-CLIPPING requirement, but at the time of writing such early
   media exists in many normal call scenarios.

4.2. Retargeting and Forking

   The discussion in this section relates to requirements
   R-FORK-RETARGET, R-DISTINCT, R-HERFP, and R-BEST-SECURE.

   In SIP, a request sent to a specific AOR but delivered to a different
   AOR is called a "retarget". A typical scenario is a "call
   forwarding" feature. In Figure 1 Alice sends an INVITE in step 1
   that is sent to Bob in step 2. Bob responds with a redirect (SIP
   response code 3xx) pointing to Carol in step 3. This redirect
   typically does not propagate back to Alice but only goes to a proxy
   (i.e., the retargeting proxy) that sends the original INVITE to Carol
   in step 4.

                            +-----+
                            |Alice|
                            +--+--+
                               |
                               | INVITE (1)
                               V
                          +----+----+
                          |  proxy  |
                          ++-+-----++
                           | ^     |
                INVITE (2) | |     | INVITE (4)
            & redirect (3) | |     |
                           V |     V
                          ++-++   ++----+
                          |Bob|   |Carol|
                          +---+   +-----+

                         Figure 1: Retargeting

   Using retargeting might lead to situations where the User Agent
   Client (UAC) does not know where its request will be going. This
   might not immediately seem like a serious problem; after all, when
   one places a telephone call on the PSTN, one never really knows if it
   will be forwarded to a different number, who will pick up the line
   when it rings, and so on. However, when considering SIP mechanisms
   for authenticating the called party, this function can also make it
   difficult to differentiate an intermediary that is behaving
   legitimately from an attacker. From this perspective, the main
   problems with retargeting are:

   Not detectable by the caller: The originating user agent has no
      means of anticipating that the condition will arise, nor any means
      of determining that it has occurred until the call has already
      been set up.

   Not preventable by the caller: There is no existing mechanism that
      might be employed by the originating user agent in order to
      guarantee that the call will not be re-targeted.

   The mechanism used by SIP for identifying the calling party is SIP
   Identity [RFC4474]. However, due to the nature of retargeting, SIP
   Identity can only identify the calling party (that is, the party that
   initiated the SIP request). Some key exchange mechanisms predate SIP
   Identity and include their own identity mechanism (e.g., MIKEY).
   However, those built-in identity mechanisms also suffer from the SIP
   retargeting problem. While Connected Identity [RFC4916] allows
   positive identification of the called party, the primary difficulty
   still remains that the calling party does not know if a mismatched
   called party is legitimate (i.e., due to authorized retargeting) or
   illegitimate (i.e., due to unauthorized retargeting by an attacker
   able to modify SIP signaling).

   In SIP, 'forking' is the delivery of a request to multiple locations.
   This happens when a single AOR is registered more than once. An
   example of forking is when a user has a desk phone, PC client, and
   mobile handset all registered with the same AOR.

                            +-----+
                            |Alice|
                            +--+--+
                               |
                               | INVITE
                               V
                         +-----+-----+
                         |   proxy   |
                         ++---------++
                          |         |
                   INVITE |         | INVITE
                          V         V
                       +--+--+   +--+--+
                       |Bob-1|   |Bob-2|
                       +-----+   +-----+

                           Figure 2: Forking

   With forking, both Bob-1 and Bob-2 might send back SDP answers in SIP
   responses. Alice will see those intermediate (18x) and final (200)
   responses. It is useful for Alice to be able to associate the SIP
   response with the incoming media stream. Although this association
   can be done with ICE [I-D.ietf-mmusic-ice], and ICE is useful to make
   this association with RTP, it is not desirable to require ICE to
   accomplish this association.

   Forking and retargeting are often used together. For example, a boss
   and secretary might have both phones ring (forking) and roll over to
   voice mail if neither phone is answered (retargeting).

   To maintain security of the media traffic, only the end point that
   answers the call should know the SRTP keys for the session. Forked
   and re-targeted calls only reveal sensitive information to
   non-responders when the signaling messages contain sensitive
   information (e.g., SRTP keys) that is accessible by parties that
   receive the offer, but may not respond (i.e., the original recipients
   in a retargeted call, or non-answering endpoints in a forked call).
   For key exchange mechanisms that do not provide secure forking or
   secure retargeting, one workaround is to re-key immediately after
   forking or retargeting. However, because the originator may not be
   aware that the call forked, this mechanism requires rekeying
   immediately after every session is established. This doubles the
   number of messages processed by the network.

   Further compounding this problem is a unique feature of SIP that when
   forking is used, there is always only one final error response
   delivered to the sender of the request: the forking proxy is
   responsible for choosing which final response to forward in the event
   where forking results in multiple final error responses being
   received by the forking proxy. This means that if a request is
   rejected, say with information that the keying information was
   rejected and providing the far end's credentials, it is very possible
   that the rejection will never reach the sender. This problem, called
   the Heterogeneous Error Response Forking Problem (HERFP) [RFC3326],
   is difficult to solve in SIP. Because we expect the HERFP to
   continue to be a problem in SIP for the foreseeable future, a media
   security system should function even in the presence of HERFP
   behavior.

4.3. Recording

   The discussion in this section relates to requirement R-RECORDING.

   Some business environments, such as stock brokers, banks, and catalog
   call centers, require recording calls with customers. This is the
   familiar "this call is being recorded for quality purposes" heard
   during calls to these sorts of businesses. In these environments,
   media recording is typically performed by an intermediate device
   (with RTP, this is typically implemented in a 'sniffer').

   When performing such call recording with SRTP, the end-to-end
   security is compromised. This is unavoidable, but necessary because
   the operation of the business requires such recording. It is
   desirable that the media security is not unduly compromised by the
   media recording. The endpoint within the organization needs to be
   informed that there is an intermediate device and needs to cooperate
   with that intermediate device.

   This scenario does not place a requirement directly on the key
   management protocol. The requirement could be met directly by the
   key management protocol (e.g., MIKEY-NULL or [RFC4568]) or through an
   external out-of-band mechanism (e.g., [I-D.wing-sipping-srtp-key]).

4.4. PSTN gateway

   The discussion in this section relates to requirement R-PSTN.

   It is desirable, even when one leg of a call is on the PSTN, that the
   IP leg of the call be protected with SRTP.

   A typical case of using media security is where two entities are
   having a VoIP conversation over IP capable networks. However, there
   are cases where the other end of the communication is not connected
   to an IP capable network. In this kind of setting, there needs to be
   some kind of gateway at the edge of the IP network which converts the
   VoIP conversation to a format understood by the other network. An
   example of such a gateway is a PSTN gateway sitting at the edge of IP
   and PSTN networks (such as the architecture described in [RFC3372]).

   If media security (e.g., SRTP protection) is employed in this kind of
   gateway-setting, then media security and the related key management
   is terminated at the PSTN gateway. The other network (e.g., PSTN)
   may have its own measures to protect the communication, but this
   means that from media security point of view the media security is
   not employed truly end-to-end between the communicating entities.

4.5. Call Setup Performance

   The discussion in this section relates to requirement R-REUSE.

   Some devices lack sufficient processing power to perform public key
   operations or Diffie-Hellman operations for each call, or prefer to
   avoid performing those operations on every call. The ability to
   re-use previous public key or Diffie-Hellman operations can vastly
   decrease the call setup delay and processing requirements for such
   devices.

   In certain devices, it can take a second or two to perform a
   Diffie-Hellman operation. Examples of these devices include
   handsets, IP Multimedia Services Identity Module (ISIMs), and PSTN
   gateways. PSTN gateways typically utilize a Digital Signal Processor
   (DSP) which is not yet involved with typical DSP operations at the
   beginning of a call, thus the DSP could be used to perform the
   calculation, so as to avoid having the central host processor perform
   the calculation. However, not all PSTN gateways use DSPs (some have
   only central processors or their DSPs are incapable of performing the
   necessary public key or Diffie-Hellman operation), and handsets lack
   a separate, unused processor to perform these operations.

   Two scenarios where R-REUSE is useful are calls between an endpoint
   and its voicemail server or its PSTN gateway. In those scenarios
   calls are made relatively often and it can be useful for the
   voicemail server or PSTN gateway to avoid public key operations for
   subsequent calls.

   Storing keys across sessions often interferes with perfect forward
   secrecy (R-PFS).

4.6. Transcoding

   The discussion in this section relates to requirement R-TRANSCODER.

   In some environments it is necessary for network equipment to
   transcode from one codec (e.g., a highly compressed codec which makes
   efficient use of wireless bandwidth) to another codec (e.g., a
   standardized codec to a SIP peering interface). With RTP, a
   transcoding function can be performed with the combination of a SIP
   B2BUA (to modify the SDP) and a processor to perform the transcoding
   between the codecs. However, with end-to-end secured SRTP, a
   transcoding function implemented the same way is a man-in-the-middle
   attack, and the key management system prevents its use.

   However, such a network-based transcoder can still be realized with
   the cooperation and approval of the endpoint, and can provide
   end-to-transcoder and transcoder-to-end security.

4.7. Upgrading to SRTP

   The discussion in this section relates to the requirement
   R-ALLOW-RTP.

   Legitimate RTP media can be sent to an endpoint for announcements,
   colorful ringback tones (e.g., music), advertising, or normal call
   progress tones. The RTP may be received before an associated SDP
   answer. For details on various scenarios, see
   [I-D.stucker-sipping-early-media-coping].

   While receiving such RTP exposes the calling party to a risk of
   receiving malicious RTP from an attacker, SRTP endpoints will need to
   receive and play out RTP media in order to be compatible with
   deployed systems that send RTP to calling parties.

4.8. Interworking with Other Signaling Protocols

   The discussion in this section relates to the requirement
   R-OTHER-SIGNALING.

   In many environments, some devices are signaled with protocols other
   than SIP which do not share SIP's offer/answer model (e.g., [H.248.1])
   or do not utilize SDP (e.g., H.323).
   In other environments, both endpoints may be SIP, but may use
   different key management systems (e.g., one uses MIKEY-RSA, the other
   MIKEY-RSA-R).

   In these environments, it is desirable to have SRTP -- rather than
   RTP -- between the two endpoints. It is always possible, although
   undesirable, to interwork those disparate signaling systems or
   disparate key management systems by decrypting and re-encrypting each
   SRTP packet in a device in the middle of the network (often the same
   device performing the signaling interworking). This is undesirable
   due to the cost and increased attack area, as such an SRTP/SRTP
   interworking device is a valuable attack target.

   At the time of this writing, interworking is considered important.
   Interworking without decryption/encryption of the SRTP, while useful,
   is not yet deemed critical because the scale of such SRTP deployments
   is, to date, relatively small.

4.9. Certificates

   The discussion in this section relates to R-CERTS.

   On the Internet and on some private networks, validating another
   peer's certificate is often done through a trust anchor -- a list of
   Certificate Authorities that are trusted. It can be difficult or
   expensive for a peer to obtain these certificates. In all cases,
   both parties to the call would need to trust the same trust anchor
   (i.e., "certificate authority"). For these reasons, it is important
   that the media plane key management protocol offer a mechanism that
   allows end-users who have no prior association to authenticate to
   each other without acquiring credentials from a third party trust
   point. Note that this does not rule out mechanisms in which servers
   have certificates and attest to the identities of end-users.

5.
Requirements

   This section is divided into several parts: requirements specific to
   the key management protocol (Section 5.1), attack scenarios
   (Section 5.2), and requirements which can be met inside the key
   management protocol or outside of the key management protocol
   (Section 5.3).

5.1. Key Management Protocol Requirements

   SIP Forking and Retargeting, from Section 4.2:

   R-FORK-RETARGET:
      The media security key management protocol MUST securely
      support forking and retargeting when all endpoints are willing
      to use SRTP without causing the call setup to fail. This
      requirement means the endpoints that did not answer the call
      MUST NOT learn the SRTP keys (in either direction) used by the
      answering endpoint.

   R-DISTINCT:
      The media security key management protocol MUST be capable of
      creating distinct, independent cryptographic contexts for each
      endpoint in a forked session.

   R-HERFP:
      The media security key management protocol MUST function
      securely even in the presence of HERFP behavior, i.e., the
      rejection of key information does not reach the sender.

   Performance considerations:

   R-REUSE:
      The media security key management protocol MAY support the
      re-use of a previously established security context.

         Note: re-use of the security context does not imply re-use
         of RTP parameters (e.g., payload type or SSRC).

   Media considerations:

   R-AVOID-CLIPPING:
      The media security key management protocol SHOULD avoid
      clipping media before SDP answer without requiring Security
      Preconditions [RFC5027]. This requirement comes from
      Section 4.1.

   R-RTP-CHECK:
      If SRTP key negotiation is performed over the media path (i.e.,
      using the same UDP/TCP ports as media packets), the key
      negotiation packets MUST NOT pass the RTP validity check
      defined in Appendix A.1 of [RFC3550], so that SRTP negotiation
      packets can be differentiated from RTP packets.
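
      To make the intent of R-RTP-CHECK concrete, the sketch below
      implements only one part of the RTP validity check, the version
      field test, plus a minimum-length test. The function name and
      sample packets are hypothetical. The remaining checks listed in
      Appendix A.1 of [RFC3550] (known payload type, padding and
      extension consistency) are deliberately omitted. The DTLS sample
      assumes a handshake record, whose first octet is the content type
      22.

```python
RTP_VERSION = 2
MIN_RTP_HEADER_LEN = 12  # size of the fixed RTP header in octets


def looks_like_rtp(packet: bytes) -> bool:
    """Partial RTP validity check: minimum length and version field only."""
    if len(packet) < MIN_RTP_HEADER_LEN:
        return False
    version = packet[0] >> 6  # version is the top two bits of the first octet
    return version == RTP_VERSION


# Hypothetical sample packets, zero-padded to 12 octets for illustration.
rtp_packet = bytes([0x80, 0x00]) + bytes(10)            # V=2, payload type 0
dtls_handshake = bytes([0x16, 0xFE, 0xFF]) + bytes(9)   # content type 22, DTLS 1.0
```

      A key negotiation packet whose first octet does not carry version
      2 in its top two bits fails this test, so a receiver sharing one
      port for both kinds of traffic can tell the two apart before
      handing anything to the RTP stack.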
   R-ASSOC:
      The media security key management protocol SHOULD include a
      mechanism for associating key management messages with both the
      signaling traffic that initiated the session and with protected
      media traffic. It is useful to associate key management
      messages with call signaling messages, as this allows the SDP
      offerer to avoid performing CPU-consuming operations (e.g.,
      Diffie-Hellman or public key operations) with attackers that
      have not seen the signaling messages.

      For example, if using a Diffie-Hellman keying technique with
      security preconditions that forks to 20 end points, the call
      initiator would get 20 provisional responses containing 20
      signed Diffie-Hellman key pairs. Calculating 20 Diffie-Hellman
      secrets and validating signatures can be a difficult task for
      some devices. Hence, in the case of forking, it is not
      desirable to perform a Diffie-Hellman operation with every
      party, but rather only with the party that answers the call
      (and incur some media clipping). To do this, the signaling and
      media need to be associated so the calling party knows which
      key management exchange needs to be completed. This might be
      done by using the transport address indicated in the SDP,
      although NATs can complicate this association.

         Note: due to RTP's design requirements, it is expected
         that SRTP receivers will have to perform authentication
         of any received SRTP packets.

   R-NEGOTIATE:
      The media security key management protocol MUST allow a SIP
      User Agent to negotiate media security parameters for each
      individual session. Such negotiation MUST NOT cause a two-time
      pad (Section 9.1 of [RFC3711]).

   R-PSTN:
      The media security key management protocol MUST support
      termination of media security in a PSTN gateway. This
      requirement is from Section 4.4.

5.2.
Security Requirements

   This section describes overall security requirements and specific
   requirements from the attack scenarios (Section 3).

   Overall security requirements:

   R-PFS:
      The media security key management protocol MUST be able to
      support perfect forward secrecy.

   R-COMPUTE:
      The media security key management protocol MUST support
      offering additional SRTP cipher suites without incurring
      significant computational expense.

   R-CERTS:
      The key management protocol MUST NOT require that end-users
      obtain credentials (certificates or private keys) from a
      third-party trust anchor.

   R-FIPS:
      The media security key management protocol SHOULD use
      algorithms that allow FIPS 140-2 [FIPS-140-2] certification or
      similar country-specific certification (e.g., [AISITSEC]).

      The United States Government can only purchase and use crypto
      implementations that have been validated by the FIPS-140
      [FIPS-140-2] process:

         "The FIPS-140 standard is applicable to all Federal
         agencies that use cryptographic-based security systems to
         protect sensitive information in computer and
         telecommunication systems, including voice systems. The
         adoption and use of this standard is available to private
         and commercial organizations."

      Some commercial organizations, such as banks and defense
      contractors, require or prefer equipment which has received the
      same validation.

   R-DOS:
      The media security key management protocol MUST NOT introduce
      any new significant denial of service vulnerabilities (e.g.,
      the protocol should not request the endpoint to perform
      CPU-intensive operations without the client being able to
      validate or authorize the request).

   R-EXISTING:
      The media security key management protocol SHOULD allow
      endpoints to authenticate using pre-existing cryptographic
      credentials, e.g., certificates or pre-shared keys.
   R-AGILITY:
      The media security key management protocol MUST provide
      crypto-agility, i.e., the ability to adapt to evolving
      cryptography and security requirements (update of cryptographic
      algorithms without substantial disruption to deployed
      implementations).

   R-DOWNGRADE:
      The media security key management protocol MUST protect cipher
      suite negotiation against downgrading attacks.

   R-PASS-MEDIA:
      The media security key management protocol MUST have a mode
      which prevents a passive adversary with access to the media
      path from gaining access to keying material used to protect
      SRTP media packets.

   R-PASS-SIG:
      The media security key management protocol MUST have a mode in
      which it prevents a passive adversary with access to the
      signaling path from gaining access to keying material used to
      protect SRTP media packets.

   R-SIG-MEDIA:
      The media security key management protocol MUST have a mode in
      which it defends itself from an attacker that is solely on the
      media path and from an attacker that is solely on the signaling
      path. A successful attack refers to the ability for the
      adversary to obtain keying material to decrypt the SRTP
      encrypted media traffic.

   R-ID-BINDING:
      The media security key management protocol MUST enable the
      media security keys to be cryptographically bound to an
      identity of the endpoint.

         This allows domains to deploy SIP Identity [RFC4474].

   R-ACT-ACT:
      The media security key management protocol MUST support a mode
      of operation that provides active-signaling-active-media-detect
      robustness, and MAY support modes of operation that provide
      lower levels of robustness (as described in Section 3).

         Failing to meet R-ACT-ACT indicates the protocol cannot
         provide secure end-to-end media.

5.3.
Requirements Outside of the Key Management Protocol

   The requirements in this section are for an overall VoIP security
   system. These requirements can be met within the key management
   protocol itself, or can be solved outside of the key management
   protocol itself (e.g., solved in SIP or in SDP).

   R-BEST-SECURE:
      Even when some end points of a forked or retargeted call are
      incapable of using SRTP, a solution MUST be described which
      allows the establishment of SRTP associations with SRTP-capable
      endpoints and/or RTP associations with non-SRTP-capable
      endpoints.

   R-OTHER-SIGNALING:
      A solution SHOULD be able to negotiate keys for SRTP sessions
      created via different call signaling protocols (e.g., between
      Jabber, SIP, H.323, MGCP).

   R-RECORDING:
      A solution SHOULD be described which supports recording of
      decrypted media. This requirement comes from Section 4.3.

   R-TRANSCODER:
      A solution SHOULD be described which supports intermediate
      nodes (e.g., transcoders), terminating or processing media,
      between the end points.

   R-ALLOW-RTP:
      A solution SHOULD be described which allows RTP media to be
      received by the calling party until SRTP has been negotiated
      with the answerer, after which SRTP is preferred over RTP.

6. Security Considerations

   This document lists requirements for securing media traffic. As
   such, it addresses security throughout the document.

7. IANA Considerations

   This document does not require actions by IANA.

8. Acknowledgements

   For contributions to the requirements portion of this document, the
   authors would like to thank the active participants of the RTPSEC BoF
   and on the RTPSEC mailing list, and a special thanks to Steffen Fries
   and Dragan Ignjatic for their excellent MIKEY comparison [RFC5197]
   document.
   The authors would furthermore like to thank the following people for
   their review, suggestions, and comments: Flemming Andreasen, Richard
   Barnes, Mark Baugher, Wolfgang Buecker, Werner Dittmann, Lakshminath
   Dondeti, John Elwell, Martin Euchner, Hans-Heinrich Grusdt, Christer
   Holmberg, Guenther Horn, Peter Howard, Leo Huang, Dragan Ignjatic,
   Cullen Jennings, Alan Johnston, Vesa Lehtovirta, Matt Lepinski, David
   McGrew, David Oran, Colin Perkins, Eric Raymond, Eric Rescorla, Peter
   Schneider, Srinath Thiruvengadam, Dave Ward, Dan York, and Phil
   Zimmermann.

9. References

9.1. Normative References

   [FIPS-140-2]
              NIST, "Security Requirements for Cryptographic Modules",
              June 2005.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
              A., Peterson, J., Sparks, R., Handley, M., and E.
              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
              June 2002.

   [RFC3262]  Rosenberg, J. and H. Schulzrinne, "Reliability of
              Provisional Responses in Session Initiation Protocol
              (SIP)", RFC 3262, June 2002.

   [RFC3264]  Rosenberg, J. and H. Schulzrinne, "An Offer/Answer Model
              with Session Description Protocol (SDP)", RFC 3264,
              June 2002.

   [RFC3711]  Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
              Norrman, "The Secure Real-time Transport Protocol (SRTP)",
              RFC 3711, March 2004.

9.2. Informative References

   [AISITSEC]
              "Anwendungshinweise und Interpretationen (AIS) zu ITSEC",
              January 2002.

   [H.248.1]  ITU, "Gateway control protocol", June 2000.

   [I-D.baugher-mmusic-sdp-dh]
              Baugher, M. and D. McGrew, "Diffie-Hellman Exchanges for
              Multimedia Sessions", draft-baugher-mmusic-sdp-dh-00 (work
              in progress), February 2006.
   [I-D.dondeti-msec-rtpsec-mikeyv2]
              Dondeti, L., "MIKEYv2: SRTP Key Management using MIKEY,
              revisited", draft-dondeti-msec-rtpsec-mikeyv2-01 (work in
              progress), March 2007.

   [I-D.fischl-sipping-media-dtls]
              Fischl, J., "Datagram Transport Layer Security (DTLS)
              Protocol for Protection of Media Traffic Established with
              the Session Initiation Protocol",
              draft-fischl-sipping-media-dtls-03 (work in progress),
              July 2007.

   [I-D.ietf-avt-dtls-srtp]
              McGrew, D. and E. Rescorla, "Datagram Transport Layer
              Security (DTLS) Extension to Establish Keys for Secure
              Real-time Transport Protocol (SRTP)",
              draft-ietf-avt-dtls-srtp-06 (work in progress),
              October 2008.

   [I-D.ietf-mmusic-ice]
              Rosenberg, J., "Interactive Connectivity Establishment
              (ICE): A Protocol for Network Address Translator (NAT)
              Traversal for Offer/Answer Protocols",
              draft-ietf-mmusic-ice-19 (work in progress), October 2007.

   [I-D.ietf-mmusic-media-path-middleboxes]
              Stucker, B. and H. Tschofenig, "Analysis of Middlebox
              Interactions for Signaling Protocol Communication along
              the Media Path",
              draft-ietf-mmusic-media-path-middleboxes-01 (work in
              progress), July 2008.

   [I-D.ietf-mmusic-sdp-capability-negotiation]
              Andreasen, F., "SDP Capability Negotiation",
              draft-ietf-mmusic-sdp-capability-negotiation-09 (work in
              progress), July 2008.

   [I-D.ietf-msec-mikey-ecc]
              Milne, A., "ECC Algorithms for MIKEY",
              draft-ietf-msec-mikey-ecc-03 (work in progress),
              June 2007.

   [I-D.ietf-sip-certs]
              Jennings, C. and J. Fischl, "Certificate Management
              Service for The Session Initiation Protocol (SIP)",
              draft-ietf-sip-certs-07 (work in progress), November 2008.

   [I-D.ietf-tls-rfc4346-bis]
              Dierks, T. and E. Rescorla, "The Transport Layer Security
              (TLS) Protocol Version 1.2", draft-ietf-tls-rfc4346-bis-10
              (work in progress), March 2008.

   [I-D.jennings-sipping-multipart]
              Wing, D.
              and C. Jennings, "Session Initiation Protocol
              (SIP) Offer/Answer with Multipart Alternative",
              draft-jennings-sipping-multipart-02 (work in progress),
              March 2006.

   [I-D.mcgrew-srtp-ekt]
              McGrew, D., "Encrypted Key Transport for Secure RTP",
              draft-mcgrew-srtp-ekt-03 (work in progress), July 2007.

   [I-D.stucker-sipping-early-media-coping]
              Stucker, B., "Coping with Early Media in the Session
              Initiation Protocol (SIP)",
              draft-stucker-sipping-early-media-coping-03 (work in
              progress), October 2006.

   [I-D.wing-sipping-srtp-key]
              Wing, D., Audet, F., Fries, S., Tschofenig, H., and A.
              Johnston, "Secure Media Recording and Transcoding with the
              Session Initiation Protocol",
              draft-wing-sipping-srtp-key-04 (work in progress),
              October 2008.

   [I-D.zimmermann-avt-zrtp]
              Zimmermann, P., Johnston, A., and J. Callas, "ZRTP: Media
              Path Key Agreement for Secure RTP",
              draft-zimmermann-avt-zrtp-11 (work in progress),
              November 2008.

   [RFC3326]  Schulzrinne, H., Oran, D., and G. Camarillo, "The Reason
              Header Field for the Session Initiation Protocol (SIP)",
              RFC 3326, December 2002.

   [RFC3372]  Vemuri, A. and J. Peterson, "Session Initiation Protocol
              for Telephones (SIP-T): Context and Architectures",
              BCP 63, RFC 3372, September 2002.

   [RFC3550]  Schulzrinne, H., Casner, S., Frederick, R., and V.
              Jacobson, "RTP: A Transport Protocol for Real-Time
              Applications", STD 64, RFC 3550, July 2003.

   [RFC3830]  Arkko, J., Carrara, E., Lindholm, F., Naslund, M., and K.
              Norrman, "MIKEY: Multimedia Internet KEYing", RFC 3830,
              August 2004.

   [RFC4474]  Peterson, J. and C. Jennings, "Enhancements for
              Authenticated Identity Management in the Session
              Initiation Protocol (SIP)", RFC 4474, August 2006.

   [RFC4492]  Blake-Wilson, S., Bolyard, N., Gupta, V., Hawk, C., and B.
              Moeller, "Elliptic Curve Cryptography (ECC) Cipher Suites
              for Transport Layer Security (TLS)", RFC 4492, May 2006.

   [RFC4568]  Andreasen, F., Baugher, M., and D. Wing, "Session
              Description Protocol (SDP) Security Descriptions for Media
              Streams", RFC 4568, July 2006.

   [RFC4650]  Euchner, M., "HMAC-Authenticated Diffie-Hellman for
              Multimedia Internet KEYing (MIKEY)", RFC 4650,
              September 2006.

   [RFC4738]  Ignjatic, D., Dondeti, L., Audet, F., and P. Lin,
              "MIKEY-RSA-R: An Additional Mode of Key Distribution in
              Multimedia Internet KEYing (MIKEY)", RFC 4738,
              November 2006.

   [RFC4771]  Lehtovirta, V., Naslund, M., and K. Norrman, "Integrity
              Transform Carrying Roll-Over Counter for the Secure
              Real-time Transport Protocol (SRTP)", RFC 4771,
              January 2007.

   [RFC4916]  Elwell, J., "Connected Identity in the Session Initiation
              Protocol (SIP)", RFC 4916, June 2007.

   [RFC4949]  Shirey, R., "Internet Security Glossary, Version 2",
              RFC 4949, August 2007.

   [RFC5027]  Andreasen, F. and D. Wing, "Security Preconditions for
              Session Description Protocol (SDP) Media Streams",
              RFC 5027, October 2007.

   [RFC5197]  Fries, S. and D. Ignjatic, "On the Applicability of
              Various Multimedia Internet KEYing (MIKEY) Modes and
              Extensions", RFC 5197, June 2008.

Appendix A. Overview and Evaluation of Existing Keying Mechanisms

   Based on how the SRTP keys are exchanged, each SRTP key exchange
   mechanism belongs to one general category:

   signaling path:
      All the keying is carried in the call signaling (SIP or SDP)
      path.

   media path:
      All the keying is carried in the SRTP/SRTCP media path, and no
      signaling whatsoever is carried in the call signaling path.

   signaling and media path:
      Parts of the keying are carried in the SRTP/SRTCP media path,
      and parts are carried in the call signaling (SIP or SDP) path.
   One of the significant benefits of SRTP over other end-to-end
   encryption mechanisms, such as IPsec, is that SRTP is bandwidth
   efficient and retains the header of RTP packets.

   Bandwidth efficiency is vital for VoIP in many scenarios where access
   bandwidth is limited or expensive, and retaining the RTP header is
   important for troubleshooting packet loss, delay, and jitter.

   Related to SRTP's characteristics is a goal that any SRTP keying
   mechanism also be efficient and not cause additional call setup
   delay. Contributors to additional call setup delay include network
   or database operations (retrieval of certificates and additional SIP
   or media path messages) and the computational overhead of
   establishing keys or validating certificates.

   When examining the choice between keying in the signaling path,
   keying in the media path, or keying in both paths, it is important to
   realize the media path is generally 'faster' than the SIP signaling
   path. The SIP signaling path has computational elements involved
   which parse and route SIP messages. The media path, on the other
   hand, does not normally have computational elements involved, and
   even when computational elements such as firewalls are involved, they
   cause very little additional delay. Thus, the media path can be
   useful for exchanging several messages to establish SRTP keys. A
   disadvantage of keying over the media path is that interworking
   different key exchange mechanisms requires the interworking function
   to be in the media path, rather than just in the signaling path; in
   practice this involvement is probably unavoidable anyway.

A.1. Signaling Path Keying Techniques

A.1.1. MIKEY-NULL

   MIKEY-NULL [RFC3830] has the offerer indicate the SRTP keys for both
   directions.
   The key is sent unencrypted in SDP, which means the SDP must be
   encrypted hop-by-hop (e.g., by using TLS (SIPS)) or end-to-end
   (e.g., by using S/MIME).

   MIKEY-NULL requires one message from offerer to answerer (half a
   round trip), and does not add additional media path messages.

A.1.2. MIKEY-PSK

   MIKEY-PSK (pre-shared key) [RFC3830] requires that all endpoints
   share one common key. MIKEY-PSK has the offerer encrypt the SRTP
   keys for both directions using this pre-shared key.

   MIKEY-PSK requires one message from offerer to answerer (half a round
   trip), and does not add additional media path messages.

A.1.3. MIKEY-RSA

   MIKEY-RSA [RFC3830] has the offerer encrypt the keys for both
   directions using the intended answerer's public key, which is
   obtained from a mechanism outside of MIKEY.

   MIKEY-RSA requires one message from offerer to answerer (half a round
   trip), and does not add additional media path messages. MIKEY-RSA
   requires the offerer to obtain the intended answerer's certificate.

A.1.4. MIKEY-RSA-R

   MIKEY-RSA-R [RFC4738] is essentially the same as MIKEY-RSA but
   reverses the role of the offerer and the answerer with regard to
   providing the keys. That is, the answerer encrypts the keys for both
   directions using the offerer's public key. Both the offerer and
   answerer validate each other's public keys using standard X.509
   validation techniques. MIKEY-RSA-R also enables sending certificates
   in the MIKEY message.

   MIKEY-RSA-R requires one message from offerer to answerer, and one
   message from answerer to offerer (full round trip), and does not add
   additional media path messages. MIKEY-RSA-R requires the offerer to
   validate the answerer's certificate.

A.1.5. MIKEY-DHSIGN

   In MIKEY-DHSIGN [RFC3830] the offerer and answerer derive the key
   from a Diffie-Hellman exchange.
   In order to prevent an active man-in-the-middle, the DH exchange
   itself is signed using each endpoint's private key and the associated
   public keys are validated using standard X.509 validation techniques.

   MIKEY-DHSIGN requires one message from offerer to answerer, and one
   message from answerer to offerer (full round trip), and does not add
   additional media path messages. MIKEY-DHSIGN requires the offerer
   and answerer to validate each other's certificates. MIKEY-DHSIGN
   also enables sending the answerer's certificate in the MIKEY message.

A.1.6. MIKEY-DHHMAC

   MIKEY-DHHMAC [RFC4650] uses a pre-shared secret to HMAC the
   Diffie-Hellman exchange, essentially combining aspects of MIKEY-PSK
   with MIKEY-DHSIGN, but without MIKEY-DHSIGN's need for certificate
   authentication.

   MIKEY-DHHMAC requires one message from offerer to answerer, and one
   message from answerer to offerer (full round trip), and does not add
   additional media path messages.

A.1.7. MIKEY-ECIES and MIKEY-ECMQV (MIKEY-ECC)

   ECC Algorithms For MIKEY [I-D.ietf-msec-mikey-ecc] describes how ECC
   can be used with MIKEY-RSA (using ECDSA signature) and with
   MIKEY-DHSIGN (using a new DH-Group code), and also defines two new
   ECC-based algorithms, Elliptic Curve Integrated Encryption Scheme
   (ECIES) and Elliptic Curve Menezes-Qu-Vanstone (ECMQV).

   With this proposal, the ECDSA signature, MIKEY-ECIES, and MIKEY-ECMQV
   function exactly like MIKEY-RSA, and the new DH-Group code functions
   exactly like MIKEY-DHSIGN. Therefore these ECC mechanisms are not
   discussed separately in this document.

A.1.8. Security Descriptions with SIPS

   Security Descriptions [RFC4568] has each side indicate the key it
   will use for transmitting SRTP media, and the keys are sent in the
   clear in SDP.
   Security Descriptions relies on hop-by-hop (TLS via "SIPS:")
   encryption to protect the keys exchanged in signaling.

   Security Descriptions requires one message from offerer to answerer,
   and one message from answerer to offerer (full round trip), and does
   not add additional media path messages.

A.1.9. Security Descriptions with S/MIME

   This keying mechanism is identical to Appendix A.1.8, except that
   rather than protecting the signaling with TLS, the entire SDP is
   encrypted with S/MIME.

A.1.10. SDP-DH (expired)

   SDP Diffie-Hellman [I-D.baugher-mmusic-sdp-dh] exchanges
   Diffie-Hellman messages in the signaling path to establish session
   keys. To protect against active man-in-the-middle attacks, the
   Diffie-Hellman exchange needs to be protected with S/MIME, SIPS, or
   SIP Identity [RFC4474] and SIP Connected Identity [RFC4916].

   SDP-DH requires one message from offerer to answerer, and one message
   from answerer to offerer (full round trip), and does not add
   additional media path messages.

A.1.11. MIKEYv2 in SDP (expired)

   MIKEYv2 [I-D.dondeti-msec-rtpsec-mikeyv2] adds mode negotiation to
   MIKEYv1 and removes the time synchronization requirement. It
   therefore now takes 2 round trips to complete. In the first round
   trip, the communicating parties learn each other's identities, agree
   on a MIKEY mode, crypto algorithm, and SRTP policy, and exchange
   nonces for replay protection. In the second round trip, they
   negotiate unicast and/or group SRTP context for SRTP and/or SRTCP.

   Furthermore, MIKEYv2 also defines an in-band negotiation mode as an
   alternative to SDP (see Appendix A.3.3).

A.2. Media Path Keying Technique

A.2.1.
ZRTP

   ZRTP [I-D.zimmermann-avt-zrtp] does not exchange information in the
   signaling path, although endpoints can exchange a hash of the ZRTP
   Hello message with "a=zrtp-hash" in the initial Offer if it is sent
   over an integrity-protected signaling channel; this provides some
   useful correlation between the signaling and media layers. In ZRTP
   the keys are exchanged entirely in the media path using a
   Diffie-Hellman exchange. The advantage to this mechanism is that the
   signaling channel is used only for call setup and the media channel
   is used to establish an encrypted channel -- much like encryption
   devices on the PSTN. ZRTP uses voice authentication of its
   Diffie-Hellman exchange by having each person read digits or words to
   the other person. Subsequent sessions with the same ZRTP endpoint
   can be authenticated using the stored hash of the previously
   negotiated key rather than voice authentication. ZRTP uses 4 media
   path messages (Hello, Commit, DHPart1, and DHPart2) to establish the
   SRTP key, and 3 media path confirmation messages. These initial
   messages are all sent as non-RTP packets.

      Note that when ZRTP probing is used, unencrypted RTP can be
      exchanged until the SRTP keys are established.

A.3. Signaling and Media Path Keying Techniques

A.3.1. EKT

   EKT [I-D.mcgrew-srtp-ekt] relies on another SRTP key exchange
   protocol, such as Security Descriptions or MIKEY, for bootstrapping.
   In the initial phase, each member of a conference uses an SRTP key
   exchange protocol to establish a common key encryption key (KEK).
   Each member may use the KEK to securely transport its SRTP master key
   and current SRTP rollover counter (ROC), via RTCP, to the other
   participants in the session.
   EKT requires the offerer to send some parameters (EKT_Cipher, KEK,
   and security parameter index (SPI)) via the bootstrapping protocol
   such as Security Descriptions or MIKEY. Each answerer sends an SRTCP
   message which contains the answerer's SRTP Master Key, rollover
   counter, and the SRTP sequence number. Rekeying is done by sending a
   new SRTCP message. For reliable transport, multiple RTCP messages
   need to be sent.

A.3.2. DTLS-SRTP

   DTLS-SRTP [I-D.ietf-avt-dtls-srtp] exchanges public key fingerprints
   in SDP [I-D.fischl-sipping-media-dtls] and then establishes a DTLS
   session over the media channel. The endpoints use the DTLS handshake
   to agree on crypto suites and establish SRTP session keys. SRTP
   packets are then exchanged between the endpoints.

   DTLS-SRTP requires one message from offerer to answerer (half round
   trip), and one message from the answerer to offerer (full round trip)
   so the offerer can correlate the SDP answer with the answering
   endpoint. DTLS-SRTP uses 4 media path messages to establish the SRTP
   key.

   This document assumes DTLS will use TLS_RSA_WITH_AES_128_CBC_SHA as
   its cipher suite, which is the mandatory-to-implement cipher suite in
   TLS [I-D.ietf-tls-rfc4346-bis].

A.3.3. MIKEYv2 Inband (expired)

   As described in Appendix A.1.11, MIKEYv2 also defines an in-band
   negotiation mode as an alternative to SDP. The draft does not yet
   specify what in-band actually means (i.e., UDP, RTP, RTCP, etc.).

A.4. Evaluation Criteria - SIP

   This section considers how each keying mechanism interacts with SIP
   features.

A.4.1. Secure Retargeting and Secure Forking

   Retargeting and forking of signaling requests is described within
   Section 4.2. The following builds upon this description.
   The following list compares the behavior of secure forking, answering
   association, two-time pads, and secure retargeting for each keying
   mechanism.

   MIKEY-NULL
      Secure Forking: No, all AORs see offerer's and answerer's keys.
      Answer is associated with media by the SSRC in MIKEY.
      Additionally, a two-time pad occurs if two branches choose the
      same 32-bit SSRC and transmit SRTP packets.

      Secure Retargeting: No, all targets see offerer's and
      answerer's keys. Suffers from retargeting identity problem.

   MIKEY-PSK
      Secure Forking: No, all AORs see offerer's and answerer's keys.
      Answer is associated with media by the SSRC in MIKEY. Note
      that all AORs must share the same pre-shared key in order for
      forking to work at all with MIKEY-PSK. Additionally, a
      two-time pad occurs if two branches choose the same 32-bit SSRC
      and transmit SRTP packets.

      Secure Retargeting: Not secure. For retargeting to work, the
      final target must possess the correct PSK. While this is likely
      in scenarios where the call is targeted to another device
      belonging to the same user (forking), it is very unlikely that
      other users will possess that PSK and be able to successfully
      answer that call.

   MIKEY-RSA
      Secure Forking: No, all AORs see offerer's and answerer's keys.
      Answer is associated with media by the SSRC in MIKEY. Note
      that all AORs must share the same private key in order for
      forking to work at all with MIKEY-RSA. Additionally, a
      two-time pad occurs if two branches choose the same 32-bit SSRC
      and transmit SRTP packets.

      Secure Retargeting: No.

   MIKEY-RSA-R
      Secure Forking: Yes. Answer is associated with media by the
      SSRC in MIKEY.

      Secure Retargeting: Yes.

   MIKEY-DHSIGN
      Secure Forking: Yes, each forked endpoint negotiates unique
      keys with the offerer for both directions. Answer is
      associated with media by the SSRC in MIKEY.
1385 Secure Retargeting: Yes, each target negotiates unique keys 1386 with the offerer for both directions. 1388 MIKEYv2 in SDP 1389 The behavior will depend on which mode is picked. 1391 MIKEY-DHHMAC 1392 Secure Forking: Yes, each forked endpoint negotiates unique 1393 keys with the offerer for both directions. Answer is 1394 associated with media by the SSRC in MIKEY. 1396 Secure Retargeting: Yes, each target negotiates unique keys 1397 with the offerer for both directions. Note that for the keys 1398 to be meaningful, it would require the PSK to be the same for 1399 all the potential intermediaries, which would only happen 1400 within a single domain. 1402 Security Descriptions with SIPS 1403 Secure Forking: No. Each forked endpoint sees the offerer's 1404 key. Answer is not associated with media. 1406 Secure Retargeting: No. Each target sees the offerer's key. 1408 Security Descriptions with S/MIME 1409 Secure Forking: No. Each forked endpoint sees the offerer's 1410 key. Answer is not associated with media. 1412 Secure Retargeting: No. Each target sees the offerer's key. 1413 Suffers from retargeting identity problem. 1415 SDP-DH 1416 Secure Forking: Yes. Each forked endpoint calculates a unique 1417 SRTP key. Answer is not associated with media. 1419 Secure Retargeting: Yes. The final target calculates a unique 1420 SRTP key. 1422 ZRTP 1423 Secure Forking: Yes. Each forked endpoint calculates a unique SRTP key. With 1424 the "a=zrtp-hash" attribute, the media can be associated with 1425 an answer. 1427 Secure Retargeting: Yes. The final target calculates a unique 1428 SRTP key. 1430 EKT 1431 Secure Forking: Inherited from the bootstrapping mechanism (the 1432 specific MIKEY mode or Security Descriptions). Answer is 1433 associated with media by the SPI in the EKT protocol. 1436 Secure Retargeting: Inherited from the bootstrapping mechanism 1437 (the specific MIKEY mode or Security Descriptions).
1439 DTLS-SRTP 1440 Secure Forking: Yes. Each forked endpoint calculates a unique 1441 SRTP key. Answer is associated with media by the certificate 1442 fingerprint in signaling and certificate in the media path. 1444 Secure Retargeting: Yes. The final target calculates a unique 1445 SRTP key. 1447 MIKEYv2 Inband 1448 The behavior will depend on which mode is picked. 1450 A.4.2. Clipping Media Before SDP Answer 1452 Clipping media before receiving the signaling answer is described 1453 within Section 4.1. The following builds upon this description. 1455 Furthermore, the problem of clipping gets compounded when forking is 1456 used. For example, if using a Diffie-Hellman keying technique with 1457 security preconditions that forks to 20 endpoints, the call initiator 1458 would get 20 provisional responses containing 20 signed Diffie- 1459 Hellman half keys. Calculating 20 DH secrets and validating 1460 signatures can be a difficult task depending on the device 1461 capabilities. 1463 The following list compares the behavior of clipping before SDP 1464 answer for each keying mechanism. 1466 MIKEY-NULL 1467 Not clipped. The offerer provides the answerer's keys. 1469 MIKEY-PSK 1470 Not clipped. The offerer provides the answerer's keys. 1472 MIKEY-RSA 1473 Not clipped. The offerer provides the answerer's keys. 1475 MIKEY-RSA-R 1476 Clipped. The answer contains the answerer's encryption key. 1478 MIKEY-DHSIGN 1479 Clipped. The answer contains the answerer's Diffie-Hellman 1480 response. 1482 MIKEY-DHHMAC 1483 Clipped. The answer contains the answerer's Diffie-Hellman 1484 response. 1486 MIKEYv2 in SDP 1487 The behavior will depend on which mode is picked. 1489 Security Descriptions with SIPS 1490 Clipped. The answer contains the answerer's encryption key. 1492 Security Descriptions with S/MIME 1493 Clipped. The answer contains the answerer's encryption key. 1495 SDP-DH 1496 Clipped. The answer contains the answerer's Diffie-Hellman 1497 response. 
1499 ZRTP 1500 Not clipped because the session initially uses RTP. While RTP 1501 is flowing, both ends negotiate SRTP keys in the media path and 1502 then switch to using SRTP. 1504 EKT 1505 Not clipped, as long as the first RTCP packet (containing the 1506 answerer's key) is not lost in transit. The answerer sends its 1507 encryption key in RTCP, which arrives at the same time as (or 1508 before) the first SRTP packet encrypted with that key. 1510 Note: RTCP needs to work, in the answerer-to-offerer 1511 direction, before the offerer can decrypt SRTP media. 1513 DTLS-SRTP 1514 No clipping after the DTLS-SRTP handshake has completed. SRTP 1515 keys are exchanged in the media path. Need to wait for SDP 1516 answer to ensure DTLS-SRTP handshake was done with an 1517 authorized party. 1519 If a middlebox interferes with the media path, there can be 1520 clipping [I-D.ietf-mmusic-media-path-middleboxes]. 1522 MIKEYv2 Inband 1523 Not clipped. Keys are exchanged in the media path without 1524 relying on the signaling path. 1526 A.4.3. SSRC and ROC 1528 In SRTP, a cryptographic context is defined as the SSRC, destination 1529 network address, and destination transport port number. In RTP, by contrast, 1530 a flow is defined as the destination network address and destination 1531 transport port number. This results in a problem -- how to 1532 communicate the SSRC so that the SSRC can be used for the 1533 cryptographic context. 1535 Two approaches have emerged for this communication. One, used by all 1536 MIKEY modes, is to communicate the SSRCs to the peer in the MIKEY 1537 exchange. Another, used by Security Descriptions, is to apply "late 1538 binding" -- that is, any new packet containing a previously-unseen 1539 SSRC (which arrives at the same destination network address and 1540 destination transport port number) will create a new cryptographic 1541 context.
Another approach, common amongst techniques with media-path 1542 SRTP key establishment, is to require a handshake over that media 1543 path before SRTP packets are sent. MIKEY's approach changes RTP's 1544 SSRC collision detection behavior by requiring RTP to pre-establish 1545 the SSRC values for each session. 1547 Another related issue is that SRTP introduces a rollover counter 1548 (ROC), which records how many times the SRTP sequence number has 1549 rolled over. As the sequence number is used for SRTP's default 1550 ciphers, it is important that all endpoints know the value of the 1551 ROC. The ROC starts at 0 at the beginning of a session. 1553 Some keying mechanisms cause a two-time pad to occur if two endpoints 1554 of a forked call have an SSRC collision. 1556 Note: A proposal has been made to send the ROC value on every Nth 1557 SRTP packet[RFC4771]. This proposal has not yet been incorporated 1558 into this document. 1560 The following list examines handling of SSRC and ROC: 1562 MIKEY-NULL 1563 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1564 packets it transmits. 1566 MIKEY-PSK 1567 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1568 packets it transmits. 1570 MIKEY-RSA 1571 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1572 packets it transmits. 1574 MIKEY-RSA-R 1575 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1576 packets it transmits. 1578 MIKEY-DHSIGN 1579 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1580 packets it transmits. 1582 MIKEY-DHHMAC 1583 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1584 packets it transmits. 1586 MIKEYv2 in SDP 1587 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1588 packets it transmits. 1590 Security Descriptions with SIPS 1591 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1592 used. 1594 Security Descriptions with S/MIME 1595 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1596 used. 
1598 SDP-DH 1599 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1600 used. 1602 ZRTP 1603 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1604 used. 1606 EKT 1607 The SSRC of the SRTCP packet containing an EKT update 1608 corresponds to the SRTP master key and other parameters within 1609 that packet. 1611 DTLS-SRTP 1612 Neither SSRC nor ROC are signaled. SSRC 'late binding' is 1613 used. 1615 MIKEYv2 Inband 1616 Each endpoint indicates a set of SSRCs and the ROC for SRTP 1617 packets it transmits. 1619 A.5. Evaluation Criteria - Security 1621 This section evaluates each keying mechanism on the basis of its 1622 security properties. 1624 A.5.1. Distribution and Validation of Persistent Public Keys and 1625 Certificates 1627 Using persistent public keys for confidentiality and authentication 1628 can introduce requirements for two types of systems, often 1629 implemented using certificates: (1) a system to distribute those 1630 persistent public keys or certificates, and (2) a system for validating 1631 those persistent public keys. We refer to the former as a key 1632 distribution system and the latter as an authentication 1633 infrastructure. In many cases, a monolithic public key 1634 infrastructure (PKI) is used to fulfill both of these roles. 1635 However, these functions can be provided by many other systems. For 1636 instance, key distribution may be accomplished by any public 1637 repository of keys. Any system in which the two endpoints have 1638 access to trust anchors and intermediate CA certificates that can be 1639 used to validate other endpoints' certificates (including a system of 1640 self-signed certificates) can be used to support certificate 1641 validation in the below schemes. 1643 With real-time communications it is desirable to avoid fetching or 1644 validating certificates that delay call setup. Rather, it is 1645 preferable to fetch or validate certificates in such a way that call 1646 setup is not delayed.
For example, a certificate can be validated 1647 while the phone is ringing or can be validated while ring-back tones 1648 are being played or even while the called party is answering the 1649 phone and saying "hello". Even better is to avoid fetching or 1650 validating persistent public keys at all. 1652 SRTP key exchange mechanisms that require a particular authentication 1653 infrastructure to operate (whether for distribution or validation) 1654 are gated on the deployment of such an infrastructure available to 1655 both endpoints. This means that no media security is achievable 1656 until such an infrastructure exists. For SIP, something like sip- 1657 certs [I-D.ietf-sip-certs] might be used to obtain the certificate of 1658 a peer. 1660 Note: Even if sip-certs [I-D.ietf-sip-certs] were deployed, the 1661 retargeting problem (Appendix A.4.1) would still prevent 1662 successful deployment of keying techniques which require the 1663 offerer to obtain the actual target's public key. 1665 The following list compares the requirements introduced by the use of 1666 public-key cryptography in each keying mechanism, both for public key 1667 distribution and for certificate validation. 1669 MIKEY-NULL 1670 Public-key cryptography is not used. 1672 MIKEY-PSK 1673 Public-key cryptography is not used. Rather, all endpoints 1674 must have some way to exchange per-endpoint or per-system pre- 1675 shared keys. 1677 MIKEY-RSA 1678 The offerer obtains the intended answerer's public key before 1679 initiating the call. This public key is used to encrypt the 1680 SRTP keys. There is no defined mechanism for the offerer to 1681 obtain the answerer's public key, although [I-D.ietf-sip-certs] 1682 might be viable in the future. 1684 The offer may also contain a certificate for the offerer, which 1685 would require an authentication infrastructure in order to be 1686 validated by the receiver.
1688 MIKEY-RSA-R 1689 The offer contains the offerer's certificate, and the answer 1690 contains the answerer's certificate. The answerer uses the 1691 public key in the certificate to encrypt the SRTP keys that 1692 will be used by the offerer and the answerer. An 1693 authentication infrastructure is necessary to validate the 1694 certificates. 1696 MIKEY-DHSIGN 1697 An authentication infrastructure is used to authenticate the 1698 public key that is included in the MIKEY message. 1700 MIKEY-DHHMAC 1701 Public-key cryptography is not used. Rather, all endpoints 1702 must have some way to exchange per-endpoint or per-system pre- 1703 shared keys. 1705 MIKEYv2 in SDP 1706 The behavior will depend on which mode is picked. 1708 Security Descriptions with SIPS 1709 Public-key cryptography is not used. 1711 Security Descriptions with S/MIME 1712 Use of S/MIME requires that the endpoints be able to fetch and 1713 validate certificates for each other. The offerer must obtain 1714 the intended target's certificate and encrypt the SDP offer 1715 with the public key contained in the target's certificate. The 1716 answerer must obtain the offerer's certificate and encrypt the 1717 SDP answer with the public key contained in the offerer's 1718 certificate. 1720 SDP-DH 1721 Public-key cryptography is not used. 1723 ZRTP 1724 Public-key cryptography is used (Diffie-Hellman), but without 1725 dependence on persistent public keys. Thus, certificates are 1726 not fetched or validated. 1728 EKT 1729 Public-key cryptography is not used by itself, but might be 1730 used by the EKT bootstrapping keying mechanism (such as certain 1731 MIKEY modes). 1733 DTLS-SRTP 1734 Remote party's certificate is sent in media path, and a 1735 fingerprint of the same certificate is sent in the signaling 1736 path. 1738 MIKEYv2 Inband 1739 The behavior will depend on which mode is picked. 1741 A.5.2.
Perfect Forward Secrecy 1743 In the context of SRTP, Perfect Forward Secrecy is the property that 1744 SRTP session keys that protected a previous session are not 1745 compromised if the static keys belonging to the endpoints are 1746 compromised. That is, if someone were to record your encrypted 1747 session content and later acquire either party's private key, that 1748 encrypted session content would be safe from decryption if your key 1749 exchange mechanism had perfect forward secrecy. 1751 The following list describes whether each key exchange mechanism provides 1752 PFS. 1754 MIKEY-NULL 1755 Not applicable; MIKEY-NULL does not have a long-term secret. 1757 MIKEY-PSK 1758 No PFS. 1760 MIKEY-RSA 1761 No PFS. 1763 MIKEY-RSA-R 1764 No PFS. 1766 MIKEY-DHSIGN 1767 PFS is provided with the Diffie-Hellman exchange. 1769 MIKEY-DHHMAC 1770 PFS is provided with the Diffie-Hellman exchange. 1772 MIKEYv2 in SDP 1773 The behavior will depend on which mode is picked. 1775 Security Descriptions with SIPS 1776 Not applicable; Security Descriptions does not have a long-term 1777 secret. 1779 Security Descriptions with S/MIME 1780 Not applicable; Security Descriptions does not have a long-term 1781 secret. 1783 SDP-DH 1784 PFS is provided with the Diffie-Hellman exchange. 1786 ZRTP 1787 PFS is provided with the Diffie-Hellman exchange. 1789 EKT 1790 No PFS. 1792 DTLS-SRTP 1793 PFS is provided if the negotiated cipher suite uses ephemeral 1794 keys (e.g., Diffie-Hellman (DHE_RSA [I-D.ietf-tls-rfc4346-bis]) 1795 or Elliptic Curve Diffie-Hellman [RFC4492]). 1797 MIKEYv2 Inband 1798 The behavior will depend on which mode is picked. 1800 A.5.3. Best Effort Encryption 1802 With best effort encryption, SRTP is used with endpoints that support 1803 SRTP, otherwise RTP is used.
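The best effort rule stated above can be sketched in a few lines. This is only an illustration, not part of any keying mechanism: the function name choose_profile and its two boolean parameters are hypothetical, and the only values taken from this document are the SDP transport profiles RTP/SAVP (SRTP) and RTP/AVP (RTP).

```python
def choose_profile(local_supports_srtp: bool, remote_supports_srtp: bool) -> str:
    """Pick the SDP transport profile under a best effort policy.

    Hypothetical helper: SRTP (RTP/SAVP) is used only when both
    endpoints support it; otherwise the session falls back to RTP.
    """
    if local_supports_srtp and remote_supports_srtp:
        return "RTP/SAVP"
    return "RTP/AVP"


# Both ends support SRTP, so the media is protected.
secure_leg = choose_profile(True, True)
# The remote end is RTP-only, so the call proceeds in the clear.
fallback_leg = choose_profile(True, False)
```

The fallback branch is exactly what makes best effort encryption vulnerable to downgrade by an active attacker, which is why the choice of signaling technique discussed in the rest of this section matters.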
1805 SIP needs a backwards-compatible best effort encryption in order for 1806 SRTP to work successfully with SIP retargeting and forking when there 1807 is a mix of forked or retargeted devices that support SRTP and don't 1808 support SRTP. 1810 Consider the case of Bob, with a phone that only does RTP and a 1811 voice mail system that supports SRTP and RTP. If Alice calls Bob 1812 with an SRTP offer, Bob's RTP-only phone will reject the media 1813 stream (with an empty "m=" line) because Bob's phone doesn't 1814 understand SRTP (RTP/SAVP). Alice's phone will see this rejected 1815 media stream and may terminate the entire call (BYE) and re- 1816 initiate the call as RTP-only, or Alice's phone may decide to 1817 continue with call setup with the SRTP-capable leg (the voice mail 1818 system). If Alice's phone decided to re-initiate the call as RTP- 1819 only, and Bob doesn't answer his phone, Alice will then leave 1820 voice mail using only RTP, rather than SRTP as expected. 1822 Currently, several techniques are commonly considered as candidates 1823 to provide opportunistic encryption: 1825 multipart/alternative 1826 [I-D.jennings-sipping-multipart] describes how to form a 1827 multipart/alternative body part in SIP. The significant issues 1828 with this technique are (1) that multipart MIME is incompatible 1829 with existing SIP proxies, firewalls, Session Border Controllers, 1830 and endpoints and (2) when forking, the Heterogeneous Error 1831 Response Forking Problem (HERFP) [RFC3326] causes problems if such 1832 non-multipart-capable endpoints were involved in the forking. 1834 session attribute 1835 With this technique, the endpoints signal their desire to do SRTP 1836 by signaling RTP (RTP/AVP), and using an attribute ("a=") in the 1837 SDP. This technique is entirely backwards compatible with non- 1838 SRTP-aware endpoints, but doesn't use the RTP/SAVP protocol 1839 registered by SRTP [RFC3711]. 
1841 SDP Capability Negotiation 1842 SDP Capability Negotiation 1843 [I-D.ietf-mmusic-sdp-capability-negotiation] provides a backwards- 1844 compatible mechanism to allow offering both SRTP and RTP in a 1845 single offer. This is the preferred technique. 1847 Probing 1848 With this technique, the endpoints first establish an RTP session 1849 using RTP (RTP/AVP). The endpoints send probe messages, over the 1850 media path, to determine if the remote endpoint supports their 1851 keying technique. A disadvantage of probing is that an active attacker 1852 can interfere with probes, and until probing completes (and SRTP 1853 is established) the media is in the clear. 1855 The preferred technique, SDP Capability Negotiation 1856 [I-D.ietf-mmusic-sdp-capability-negotiation], can be used with all 1857 key exchange mechanisms. What remains unique is ZRTP, which can also 1858 accomplish its best effort encryption by probing (sending ZRTP 1859 messages over the media path) or by session attribute (see "a=zrtp- 1860 hash" in [I-D.zimmermann-avt-zrtp]). Current implementations of ZRTP 1861 use probing. 1863 A.5.4. Upgrading Algorithms 1865 It is necessary to allow upgrading SRTP encryption and hash 1866 algorithms, as well as upgrading the cryptographic functions used for 1867 the key exchange mechanism. With SIP's offer/answer model, this can 1868 be computationally expensive because the offer needs to contain all 1869 combinations of the key exchange mechanisms (all MIKEY modes, 1870 Security Descriptions) and all SRTP cryptographic suites (AES-128, 1871 AES-256) and all SRTP cryptographic hash functions (SHA-1, SHA-256) 1872 that the offerer supports. In order to do this, the offerer has to 1873 expend CPU resources to build an offer containing all of this 1874 information, which can become computationally prohibitive.
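To make the growth concrete, the following sketch counts the combinations an offerer would have to cover. It is an illustration under stated assumptions: the lists reuse the mechanism and algorithm names from this appendix, the function name enumerate_offer_combinations is hypothetical, and "AES-192" is added purely as a stand-in for a newly introduced cipher.

```python
from itertools import product

# Names taken from this appendix; the grouping into lists is an assumption.
KEY_EXCHANGES = [
    "MIKEY-NULL", "MIKEY-PSK", "MIKEY-RSA", "MIKEY-RSA-R",
    "MIKEY-DHSIGN", "MIKEY-DHHMAC", "Security Descriptions",
]
CIPHERS = ["AES-128", "AES-256"]
HASHES = ["SHA-1", "SHA-256"]


def enumerate_offer_combinations(key_exchanges, ciphers, hashes):
    """Return every (key exchange, cipher, hash) triple an offer must cover."""
    return list(product(key_exchanges, ciphers, hashes))


baseline = enumerate_offer_combinations(KEY_EXCHANGES, CIPHERS, HASHES)
# Adding a single new cipher multiplies the count rather than adding to it.
upgraded = enumerate_offer_combinations(
    KEY_EXCHANGES, CIPHERS + ["AES-192"], HASHES
)

print(len(baseline))  # 7 * 2 * 2 = 28
print(len(upgraded))  # 7 * 3 * 2 = 42
```

If each combination requires its own public-key or Diffie-Hellman operation, as several of the mechanisms evaluated in this section do, the offerer's CPU cost scales with this product.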
1876 Thus, it is important to keep the offerer's CPU impact fixed so that 1877 offering multiple new SRTP encryption and hash functions incurs no 1878 additional expense. 1880 The following list describes the CPU effort involved in using each 1881 key exchange technique. 1883 MIKEY-NULL 1884 No significant computational expense. 1886 MIKEY-PSK 1887 No significant computational expense. 1889 MIKEY-RSA 1890 For each offered SRTP crypto suite, the offerer has to perform 1891 an RSA operation to encrypt the TGK. 1893 MIKEY-RSA-R 1894 For each offered SRTP crypto suite, the offerer has to perform 1895 a public key operation to sign the MIKEY message. 1897 MIKEY-DHSIGN 1898 For each offered SRTP crypto suite, the offerer has to perform 1899 a Diffie-Hellman operation, and a public key operation to sign 1900 the Diffie-Hellman output. 1902 MIKEY-DHHMAC 1903 For each offered SRTP crypto suite, the offerer has to perform 1904 a Diffie-Hellman operation. 1906 MIKEYv2 in SDP 1907 The behavior will depend on which mode is picked. 1909 Security Descriptions with SIPS 1910 No significant computational expense. 1912 Security Descriptions with S/MIME 1913 S/MIME requires the offerer and the answerer to encrypt the SDP 1914 with the other's public key, and to decrypt the received SDP 1915 with their own private key. 1917 SDP-DH 1918 For each offered SRTP crypto suite, the offerer has to perform 1919 a Diffie-Hellman operation. 1921 ZRTP 1922 The offerer has no additional computational expense at all, as 1923 the offer contains no information about ZRTP or might contain 1924 "a=zrtp-hash". 1926 EKT 1927 The offerer's computational expense depends entirely on the EKT 1928 bootstrapping mechanism selected (one or more MIKEY modes or 1929 Security Descriptions). 1931 DTLS-SRTP 1932 The offerer has no additional computational expense at all, as 1933 the offer contains only a fingerprint of the certificate that 1934 will be presented in the DTLS exchange.
1936 MIKEYv2 Inband 1937 The behavior will depend on which mode is picked. 1939 Appendix B. Out-of-Scope 1941 The compromise of an endpoint that has access to decrypted media 1942 (e.g., SIP user agent, transcoder, recorder) is out of scope of this 1943 document. Such a compromise might be via privilege escalation, 1944 installation of a virus or trojan horse, or similar attacks. 1946 B.1. Shared Key Conferencing 1948 The consensus on the RTPSEC mailing list was to concentrate on 1949 unicast, point-to-point sessions. Thus, there are no requirements 1950 related to shared key conferencing. This section is retained for 1951 informational purposes. 1953 Large audio and video conference bridges 1954 scale most efficiently by encrypting the current speaker once and 1955 distributing that stream to the conference attendees. Typically, 1956 inactive participants receive the same streams -- they hear (or see) 1957 the active speaker(s), and the active speakers receive distinct 1958 streams that don't include themselves. In order to maintain 1959 confidentiality of such conferences where listeners share a common 1960 key, all listeners must be rekeyed when a listener joins or leaves a 1961 conference. 1963 An important use case for mixers/translators is a conference bridge:
1965             +----+
1966 A --- 1 --->|    |
1967   <-- 2 ----| M  |
1968             | I  |
1969 B --- 3 --->| X  |
1970   <-- 4 ----| E  |
1971             | R  |
1972 C --- 5 --->|    |
1973   <-- 6 ----|    |
1974             +----+
1976 Figure 3: Centralized Keying 1978 In the figure above, 1, 3, and 5 are RTP media contributions from 1979 Alice, Bob, and Carol, and 2, 4, and 6 are the RTP flows to those 1980 devices carrying the 'mixed' media. 1982 Several scenarios are possible: 1984 a. Multiple inbound sessions: 1, 3, and 5 are distinct RTP sessions, 1986 b. Multiple outbound sessions: 2, 4, and 6 are distinct RTP 1987 sessions, 1989 c.
Single inbound session: 1, 3, and 5 are just different sources 1990 within the same RTP session, 1992 d. Single outbound session: 2, 4, and 6 are different flows of the 1993 same (multi-unicast) RTP session 1995 If there are multiple inbound sessions and multiple outbound sessions 1996 (scenarios a and b), then every keying mechanism behaves as if the 1997 mixer were an end point and can set up a point-to-point secure 1998 session between the participant and the mixer. This is the simplest 1999 situation, but is computationally wasteful, since SRTP processing has 2000 to be done independently for each participant. The use of multiple 2001 inbound sessions (scenario a) doesn't waste computational resources, 2002 though it does consume additional cryptographic context on the mixer 2003 for each participant and has the advantage of data origin 2004 authentication. 2006 To support a single outbound session (scenario d), the mixer has to 2007 dictate its encryption key to the participants. Some keying 2008 mechanisms allow the transmitter to determine its own key, and others 2009 allow the offerer to determine the key for the offerer and answerer. 2010 Depending on how the call is established, the offerer might be a 2011 participant (such as a participant dialing into a conference bridge) 2012 or the offerer might be the mixer (such as a conference bridge 2013 calling a participant). The use of offerless INVITEs may help some 2014 keying mechanisms reverse the role of offerer/answerer. A 2015 difficulty, however, is knowing a priori if the role should be 2016 reversed for a particular call. The significant advantage of a 2017 single outbound session is the number of SRTP encryption operations 2018 remains constant even as the number of participants increases. 2019 However, a disadvantage is that data origin authentication is lost, 2020 allowing any participant to spoof the sender (because all 2021 participants know the sender's SRTP key). 2023 Appendix C. 
Requirement renumbering in -02 2025 [[RFC Editor: Please delete this section prior to publication.]] 2027 Previous versions of this document used requirement numbers, which 2028 were changed to mnemonics as follows: 2030 R1 R-FORK-RETARGET 2032 R2 R-BEST-SECURE 2034 R3 R-DISTINCT 2036 R4 R-REUSE; changed from 'MAY' to 'protocol MUST support, and 2037 SHOULD implement' 2039 R5 R-AVOID-CLIPPING 2041 R6 R-PASS-MEDIA 2043 R7 R-PASS-SIG 2045 R8 R-PFS 2047 R9 R-COMPUTE 2049 R10 R-RTP-CHECK 2051 R11 (folded into R4; was reuse previous session) 2053 R12 R-CERTS 2055 R13 R-FIPS 2057 R14 R-ASSOC 2059 R15 R-ALLOW-RTP 2061 R16 R-DOS 2063 R17 R-SIG-MEDIA 2065 R18 R-EXISTING 2067 R19 R-AGILITY 2069 R20 R-DOWNGRADE 2071 R21 R-NEGOTIATE 2073 R23 R-OTHER-SIGNALING 2074 R23 R-RECORDING (R23 was duplicated in previous versions of the 2075 document) 2077 R24 (deleted; was lawful intercept) 2079 R25 R-TRANSCODER 2081 R26 R-PSTN 2083 R27 R-ID-BINDING 2085 R28 R-ACT-ACT 2087 Authors' Addresses 2089 Dan Wing (editor) 2090 Cisco Systems, Inc. 2091 170 West Tasman Drive 2092 San Jose, CA 95134 2093 USA 2095 Email: dwing@cisco.com 2097 Steffen Fries 2098 Siemens AG 2099 Otto-Hahn-Ring 6 2100 Munich, Bavaria 81739 2101 Germany 2103 Email: steffen.fries@siemens.com 2105 Hannes Tschofenig 2106 Nokia Siemens Networks 2107 Otto-Hahn-Ring 6 2108 Munich, Bavaria 81739 2109 Germany 2111 Email: Hannes.Tschofenig@nsn.com 2112 URI: http://www.tschofenig.priv.at 2113 Francois Audet 2114 Nortel 2115 4655 Great America Parkway 2116 Santa Clara, CA 95054 2117 USA 2119 Email: audet@nortel.com