idnits 2.17.1

draft-jones-perc-private-media-reqts-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'MUST not' in this paragraph:

     The final element is the switching MDD, which is responsible for
     forwarding encrypted media packets and conference control information
     to endpoints in the conference. It is also responsible for conveying
     secured signaling between the endpoints and the key management
     function, acquiring per-hop authentication keys from the KMF, and
     performing per-hop authentication operations for media packets. This
     function might also aggregate conference control information and
     initiate various conference control requests. Forwarding of media
     packets requires that the switching MDD have access to RTP headers or
     header extensions and potentially modify those message elements, but
     the actual media content MUST not be decipherable by the switching MDD.

  -- The document date (July 6, 2015) is 3210 days in the past. Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'TBD' is mentioned on line 746, but not defined

  == Unused Reference: 'RFC3261' is defined on line 782, but no explicit
     reference was found in the text

  -- Obsolete informational reference (is this intentional?): RFC 4474
     (Obsoleted by RFC 8224)

     Summary: 0 errors (**), 0 flaws (~~), 4 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                     P. Jones (Ed.)
Internet Draft                                                 N. Ismail
Intended status: Informational                                 D. Benham
Expires: January 6, 2016                                      N. Buckles
                                                           Cisco Systems
                                                             J. Mattsson
                                                                Ericsson
                                                               R. Barnes
                                                                 Mozilla
                                                            July 6, 2015

    Private Media Requirements in Privacy Enhanced RTP Conferencing
                draft-jones-perc-private-media-reqts-00

Abstract

   This document specifies the requirements for ensuring the privacy and
   integrity of real-time transport protocol (RTP) media flows between
   two or more endpoints communicating through one or more centrally
   located media distribution devices (MDDs).

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 6, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Requirements Language
   3. Terminology
   4. Background
   5. Motivation for Private Media using switching MDDs
      5.1. Switching Media in Cloud Services
      5.2. Private Media Security through Switching
   6. Private Media Trust Model
      6.1. Trusted Elements
      6.2. Untrusted Elements
   7. Goals and Non-Goals
      7.1. Goals
         7.1.1. Ensure End-To-End Confidentiality
         7.1.2. Ensure End-To-End Source Authentication of Media
         7.1.3. Provide a More Efficient Service than "Full-Mesh"
         7.1.4. Support Cloud-Based Conferencing
         7.1.5. Limiting an Endpoint's Access to Content
         7.1.6. Compatibility with the WebRTC Security Architecture
      7.2. Non-Goals
         7.2.1. Securing the Endpoints
         7.2.2. Concealing that Communication Occurs
         7.2.3. Individual Media Source Authentication
         7.2.4. Multicast-based Conferencing
   8. Requirements
   9. IANA Considerations
   10. Security Considerations
   11. References
      11.1. Normative References
      11.2. Informative References
   12. Acknowledgments
   13. Contributors
   Authors' Addresses

1. Introduction

   Users of multimedia communication products and services have privacy
   expectations that are largely satisfied with the use of SRTP
   [RFC3711] and related technologies when communicating point-to-point
   over the Internet. When two or more endpoints communicate through a
   traditional media server, it is necessary for those endpoints to
   share the SRTP master key and salt information with the traditional
   media server so that it can authenticate and decrypt received RTP and
   RTCP packets. The key material is needed so that a traditional media
   server can perform various operations on the media, such as mixing,
   transcoding, and transrating. The traditional media server also
   needs the master key and salt in order to transmit media packets to
   other endpoints in the conference. The need for a traditional media
   server to have the master key represents a security risk.

   Within a corporate or other isolated environment where all
   conferencing resources, including both call control and media
   processing functions, are tightly controlled, this security risk can
   be effectively managed. However, managing this risk is becoming
   increasingly difficult as conferencing resources are deployed in
   networks that are not so strictly managed or controlled, including
   resources on virtualized servers deployed in third-party cloud
   environments.

   There are also existing public voice and video conferencing service
   providers in which users must place full trust by sharing media
   encryption keys in order to use those services. This exposes
   corporations, for example, to a higher risk of being subjected to
   corporate espionage. While it is not the intent of this draft to
   suggest that any existing service provider would permit or condone
   any illicit use of its service, the fact is that security threats can
   come from either internal or external sources and remain undiscovered
   for long periods of time.

   It is possible to ensure real-time transport protocol (RTP) media
   privacy in deployments using one or more centrally located media
   distribution devices (MDDs) with limited changes in the security
   mechanisms used today. This document discusses this possibility in
   more detail and presents a set of requirements that are neutral with
   respect to session signaling protocols.

   This document is focused on ensuring the privacy of RTP media in
   centralized MDD models only. Other types of media are out of scope.
   Other, non-centralized media distribution models are also out of
   scope.

2. Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119]
   when they appear in ALL CAPS.
   These words may also appear in this document in lower case as plain
   English words, absent their normative meanings.

3. Terminology

   Adversary - An unauthorized entity that may attempt to compromise the
   performance of a media distribution device through various means,
   including, but not limited to, the transmission of bogus media
   packets, or that may attempt to gain access to the plaintext of the
   media.

   Media content - The portion of the RTP packet (i.e., the encrypted
   RTP payload) or other packet containing the actual audio, video, or
   other multimedia information that is considered confidential and is
   subject to end-to-end encryption. This does not include, for example,
   RTP headers, RTP header extensions, or RTCP packets.

   Switching media distribution device - A media distribution device
   that does not decrypt RTP media flows or perform processing on the
   media payload, but instead simply forwards the received media from a
   sender to the other endpoints in a multimedia conference. A
   switching media distribution device may modify some portion of the
   RTP header and may often consume and create RTCP messages for
   efficient media handling.

4. Background

   Traditional media servers used for multimedia conferencing would mix,
   transcode, transrate, and/or recompose media flows from one or more
   conference participants' endpoints, sending out a different audio and
   video flow to each endpoint. For audio, this might entail mixing
   some number of input flows that appear to contain audio intended to
   be heard by the other participants, with each endpoint receiving a
   flow that does not contain that participant's own audio.
   For video, the traditional media server may elect to send only video
   showing the current active speaker, a tiled composition of all
   participants or the most recent active speakers, a video flow with
   the active speaker presented prominently with other participants
   presented as thumbnail images, or some other composite arrangement.
   It is also common for audio or video to be transcoded. A typical
   traditional media server is depicted in Figure 1.

                     +-------------------+
      +---+ --{A}--> |                   | <--{C}-- +---+
      | A |          | Media Composition |          | C |
      +---+ <-{BCD}- |                   | -{ABD}-> +---+
                     |    Transcoders    |
      +---+ --{B}--> |    Transraters    | <--{D}-- +---+
      | B |          |                   |          | D |
      +---+ <-{ACD}- |  Decrypt/Encrypt  | -{ABC}-> +---+
                     +-------------------+

                  Figure 1 - Traditional Media Server

   Traditional media servers require a significant amount of processing
   power, which in turn translates into a high cost for conferencing
   hardware manufacturers. Significantly, too, it is very difficult to
   deploy these servers in a cloud environment due to the high
   processing demands, as the specialized hardware found in the
   traditional media server does not exist in a cloud environment.

   To enable the traditional media server to perform its job, the server
   establishes one or more SRTP sessions with each of the conference
   endpoints wherein it is given access to the keys required to decrypt
   and encrypt media flows from and to each endpoint. This means that
   the traditional media server is necessarily a fully trusted entity in
   the communication path. Any time these servers are deployed in a
   network that is not secured, it increases the risk that an adversary
   might gain access to cryptographic key material, allowing the
   adversary to be able to see and listen to ongoing conferences.
   In some instances, depending on how the hardware is designed and how
   keys and certificates are managed, it might be possible for an
   adversary to see and listen to previously recorded conferences or
   future conferences.

   The Secure Real-time Transport Protocol (SRTP) [RFC3711] is a profile
   of RTP, which can provide confidentiality, message authentication,
   and replay protection to the RTP traffic and to the RTP Control
   Protocol (RTCP). Encryption of header extensions in SRTP [RFC6904]
   extends the mechanisms of [RFC3711] to selectively encrypt RTP header
   extensions in SRTP. [RFC3711] and [RFC6904] solve end-to-end use
   cases between two endpoints, and do not consider use cases where a
   sender delivers media to a receiver via a cloud-based conferencing
   service.

5. Motivation for Private Media using switching MDDs

5.1. Switching Media in Cloud Services

   There is a trend in the industry for enterprises to use cloud
   services to host multi-party conferences and meet-me services, either
   exclusively or to meet peak loads on-demand. At the same time, there
   is a shift toward using lightweight, cost-effective switching MDDs in
   cloud services that do not necessarily need to mix audio or
   composite/transcode video. Also fueling the use of such lightweight
   MDDs is the desire to fully exploit virtualized computing resources
   and dynamic scalability potential available in cloud computing
   environments.

   The increased use of cloud services has exposed a problem. There are
   two different trust domains from a media perspective: endpoints and
   other devices in a trusted domain, and MDDs controlled by the cloud
   service in an untrusted domain.
   Other examples of conference devices spread across trusted and
   untrusted domains are likely, but the cloud service trend is
   triggering the urgency to address the need to allow for lightweight
   media conferencing while enabling media privacy at the same time.

   With a switching MDD, each endpoint transmits media as it would with
   a traditional media server. However, the switching MDD merely
   forwards all or a subset of the media to the other endpoints in the
   conference (where at least one other endpoint may be associated with
   a cascaded media distribution device), leaving composition to the
   receiving endpoint. It is also worth noting that, for a switching
   MDD model to work successfully, each endpoint in the conference must
   support the media formats transmitted by all other endpoints in the
   conference. More modern endpoints support multiple codecs and
   formats, making this commercially practical.

   Figure 2 depicts an example of a switching MDD wherein each endpoint
   is receiving the media flows transmitted by each of the other
   endpoints in the conference.

                     +--------------------+
      +---+ --{A}--> |                    | <-{C}--- +---+
      | A | <-{B}--- |   Switching MDD    | --{A}--> | C |
      |   | <-{C}--- |                    | --{B}--> |   |
      +---+ <-{D}--- |                    | --{D}--> +---+
                     |       Packet       |
      +---+ --{B}--> |   Authentication   | <-{D}--- +---+
      | B | <-{A}--- |                    | --{A}--> | D |
      |   | <-{C}--- |                    | --{B}--> |   |
      +---+ <-{D}--- |   Media Privacy    | --{C}--> +---+
                     +--------------------+

             Figure 2 - Switching Media Distribution Device

   Note - The use of multiple arrows directed toward each endpoint is
   not intended to suggest the use of separate RTP sessions.

   By using methods such as those described in [RFC6464], it is possible
   for the switching MDD to transmit the appropriate audio and video
   flows to endpoints without having knowledge of the content of the
   encrypted media.
   The following "Active Speaker Switching" examples help illustrate
   this point.

   In Figure 3, endpoints A, B and D receive the video streams from
   endpoint C, the currently active speaker, which is receiving video
   from endpoint A, the previous active speaker. Later when endpoint B
   becomes the active speaker (Figure 4), endpoints A, C and D will
   start to receive video from B, while endpoint B continues to receive
   video from endpoint C. Finally in Figure 5, endpoint A becomes the
   active speaker.

                     +--------------------+
      +---+ --{A}--> |                    | <--{C}-- +---+
      | A |          |   Switching MDD    |          | C |*
      +---+ <-{C}--- |                    | ---{A}-> +---+
                     |                    |
      +---+ --{B}--> |                    | <--{D}-- +---+
      | B |          |                    |          | D |
      +---+ <-{C}--- |                    | ---{C}-> +---+
                     +--------------------+

             Figure 3 - Endpoint "C" is the Active Speaker

                     +--------------------+
      +---+ --{A}--> |                    | <--{C}-- +---+
      | A |          |   Switching MDD    |          | C |
      +---+ <-{B}--- |                    | ---{B}-> +---+
                     |                    |
      +---+ --{B}--> |                    | <--{D}-- +---+
     *| B |          |                    |          | D |
      +---+ <-{C}--- |                    | ---{B}-> +---+
                     +--------------------+

             Figure 4 - Endpoint "B" is the Active Speaker

                     +--------------------+
      +---+ --{A}--> |                    | <--{C}-- +---+
     *| A |          |   Switching MDD    |          | C |
      +---+ <-{B}--- |                    | ---{A}-> +---+
                     |                    |
      +---+ --{B}--> |                    | <--{D}-- +---+
      | B |          |                    |          | D |
      +---+ <-{A}--- |                    | ---{A}-> +---+
                     +--------------------+

             Figure 5 - Endpoint "A" is the Active Speaker

   Switched media can also enable conferences to scale to include many
   more endpoints simultaneously than would be possible with a
   traditional media server. Like traditional media servers, switching
   MDDs can also be cascaded or interconnected in a meshed topology to
   increase the size of the conference without putting undue burden on
   any particular server.

5.2. Private Media Security through Switching

   A traditional media server, or MCU, establishes an SRTP session with
   each endpoint separately, and needs to decrypt packets containing
   media for presentation to other endpoints. By using a switching MDD,
   it is possible to keep the media encryption keys private to the
   endpoints such that the MDD does not have access to the keys used for
   media encryption. The switching MDD just forwards media received to
   each of the other endpoints in the conference.

   This provides for a significantly improved security model, as one
   can, for example, utilize conferencing resources in the cloud that do
   not have to be trusted. That said, there may be situations where the
   switching MDD needs to modify the RTP packet received from an
   endpoint, such as by adding or removing an RTP header extension,
   modifying the payload type value, etc. It would be the
   responsibility of the switching MDD to ensure that media of the
   expected type and containing the correct information is received by a
   recipient.

   Thus, there is a need to utilize an end-to-end encryption and
   authentication key (or pair of keys) and a hop-by-hop encryption and
   authentication key (or pair of keys). The end-to-end encryption and
   authentication key(s) ensure that media remains private to the
   trusted endpoints. The hop-by-hop authentication key allows the
   switching MDD to authenticate RTP and RTCP packets and to optionally
   modify certain elements of those packets. The hop-by-hop encryption
   key is used to optionally encrypt RTP header extensions and
   optionally encrypt RTCP packets. The current SRTP and related
   specifications do not define use of a dual-key (hop-by-hop and
   end-to-end) approach. However, such an approach is possible and
   would result in ensuring the privacy of media while also enabling the
   more scalable switched conferencing model.

   This dual-key model does necessitate a change in the way that keys
   are managed. However, the topic of key management is outside the
   scope of this requirements document. High-level assumptions, such as
   whether the end-to-end context uses a group key as the SRTP master
   key or individual SRTP master keys (that may be derived or negotiated
   from another group key), are likely to influence the solution derived
   from this document.

6. Private Media Trust Model

   The architectural model suggested in this document enables switching
   MDDs to be hosted in domains in which the network elements may have
   low trust, or where the trustworthiness is uncertain. This does not
   mean that the service provider is completely untrusted; it simply
   means that the level of trust needed to decrypt media is not
   required. This has the benefit of protecting the endpoint's media in
   the case of external attacks against the MDD.

   In this model, certain elements are considered trusted and others are
   considered untrusted. Trust in the context of this document means
   that the element can be in possession of the media encryption key(s)
   for a past, current, or potentially future conference (or portion
   thereof) used to protect media content.

   In the general case, only the endpoint and an associated key
   management function, which may be integrated with the endpoint or in
   a separate stand-alone entity, need to be trusted. However, it is
   recognized that in certain deployments, some elements that are
   classified as untrusted in this document might be placed into the
   trusted domain and thus be considered trusted. One example might be
   a gateway, traditional media server or other MDD in a trusted
   environment connecting endpoints to the same private media
   conference.
   This document does not preclude such deployment combinations, but
   does not rely on them in order to keep the examples and model
   definitions focused on the simple, most general case.

   Each of the elements discussed below has a direct or indirect
   relationship with the others. The following diagram depicts the
   trust relationships described in the following sub-sections and the
   media or signaling interfaces that exist between them, showing the
   trusted elements on the left and untrusted elements on the right.
   Note that this is a functional diagram and elements may be co-located
   or further divided into multiple separate physical entities.
   Further, it is not necessary that every interface exist between all
   elements, such as both an interface from the endpoint and call
   processing function to a key management function, though both are
   possible options.

                            |
                            |
         +----------+       |      +-----------------+
         | Endpoint |       |      | Call Processing |
         +----------+       |      +-----------------+
                            |
                            |
      +----------------+    |      +-----------------+
      | Key Management |    |      | Switching Media |
      |    Function    |    |      |     Server      |
      +----------------+    |      +-----------------+
                            |
           Trusted          |           Untrusted
           Elements         |           Elements
                            |
                            |

       Figure 6 - Relationship of Trusted and Untrusted Elements

6.1. Trusted Elements

   The endpoint is considered a trusted element, as it will be sourcing
   media flows transmitted to other endpoints and will be receiving
   media for rendering. While it is possible for an endpoint to be
   compromised and perform in unexpected ways, such as transmitting a
   decrypted copy of media content to an adversary, such security issues
   and defenses are outside the scope of this document.

   The other trusted element is a key management function (KMF), which
   may be integrated with the endpoints or exist standalone.
   This function is responsible for providing cryptographic keys to the
   endpoints for encrypting and authenticating media content. The KMF
   is also responsible for providing cryptographic keys to the
   conferencing resources, such as the MDD, to enable authentication of
   media packets received by an endpoint. Interaction between the KMF
   and untrusted call processing functions may be necessary to ensure
   endpoints are delivered the appropriate keys. The KMF needs to be
   tightly controlled and managed to prevent exploitation by an
   adversary, as any kind of security compromise of the KMF puts the
   security of the conference at risk.

6.2. Untrusted Elements

   The call processing function is responsible for such things as
   authenticating the user or endpoint for the purpose of joining a
   conference, signing messages, and processing call signaling messages.
   This element is responsible for ensuring the integrity, and
   optionally the confidentiality, of call signaling messages between
   itself, the endpoint, and other network elements. However, it is
   considered an untrusted element for the purposes of this document, as
   it cannot be trusted to have access to or be able to gain access to
   cryptographic key material that provides privacy and integrity of
   media packets.

   There might be several independent call processing functions within
   an enterprise, service provider network, or the Internet that are
   classified as untrusted. Any signaling information that passes
   through these untrusted entities is subject to inspection by those
   entities and might be altered by an adversary.

   Likewise, there may be certain deployment models where the call
   processing function is considered trusted. In such cases, trusted
   call processing functions MUST take responsibility for ensuring the
   integrity of received messages before delivering those to the
   endpoint.
   How signaling message integrity is ensured is outside the scope of
   this document, but might use such methods as defined in [RFC4474].

   The final element is the switching MDD, which is responsible for
   forwarding encrypted media packets and conference control information
   to endpoints in the conference. It is also responsible for conveying
   secured signaling between the endpoints and the key management
   function, acquiring per-hop authentication keys from the KMF, and
   performing per-hop authentication operations for media packets. This
   function might also aggregate conference control information and
   initiate various conference control requests. Forwarding of media
   packets requires that the switching MDD have access to RTP headers or
   header extensions and potentially modify those message elements, but
   the actual media content MUST NOT be decipherable by the switching
   MDD.

   Further, the switching MDD does not have the ability to determine
   whether an endpoint is authorized to have access to media encryption
   keys. Merely joining a conference MUST NOT be interpreted as having
   authority. Media encryption keys are conveyed to the endpoint by the
   KMF in such a way as to prevent the switching MDD from having access
   to those keys.

   It is assumed that an adversary might have access to the switching
   MDD and have the ability to read any of the contents that pass
   through. For this reason, it is untrusted to have access to the
   media encryption keys.

   As with the call processing functions, it is appreciated that there
   may be some deployments wherein the switching MDD is trusted.
   However, for the purposes of this document, the switching MDD is
   considered untrusted so that we can be sure to develop a solution
   that will work even in the most hostile environments.

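   As a rough illustration of forwarding based only on header
   information, the sketch below selects an active speaker from the
   audio level header extension of [RFC6464] (0 is the loudest level and
   127 the quietest, in -dBov, with a "V" voice activity flag) and then
   builds a forwarding plan like the one shown in Figures 3 through 5.
   The class and function names are hypothetical, the MDD is assumed to
   be able to read the audio level for each sender, and smoothing or
   hysteresis of speaker changes is omitted. The encrypted payload is
   carried along but never inspected.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class AudioPacket:
    sender: str
    audio_level: int      # RFC 6464: 0 (loudest) to 127 (quietest), in -dBov
    voice_activity: bool  # RFC 6464 "V" flag
    payload: bytes        # end-to-end encrypted; never inspected below


def pick_active_speaker(latest: Iterable[AudioPacket],
                        current: Optional[str]) -> Optional[str]:
    """Return the loudest sender reporting voice activity, else keep `current`."""
    talking = [p for p in latest if p.voice_activity]
    if not talking:
        return current
    # A lower numeric level means louder audio.
    return min(talking, key=lambda p: p.audio_level).sender


def forwarding_plan(endpoints: Iterable[str], active: str,
                    previous: str) -> Dict[str, str]:
    """Map each receiving endpoint to the sender whose video it is forwarded.

    Everyone receives the active speaker, except the active speaker, who
    receives the previous active speaker.
    """
    return {e: (previous if e == active else active) for e in endpoints}
```

   For example, with endpoint "C" active and "A" the previous speaker,
   forwarding_plan("ABCD", "C", "A") maps A, B and D to "C" and maps C
   to "A", matching Figure 3.
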
   It is expected that a switching MDD performs its role in properly
   forwarding media packets, taking measures to safeguard against replay
   attacks, etc. If an MDD is exploited, an adversary may do such things
   as discard packets, replay packets, or introduce unacceptable delay
   in packet delivery.

7. Goals and Non-Goals

7.1. Goals

7.1.1. Ensure End-To-End Confidentiality

   The content of the communication and all media needs to be
   confidential within the group of entities explicitly invited into the
   conference. An external monitoring adversary should not be able to
   deduce the human-to-human communication that actually occurred from
   capturing the media packets.

   At the same time, it is necessary to allow switching MDDs to
   manipulate certain RTP header fields like the payload type value.

7.1.2. Ensure End-To-End Source Authentication of Media

   In a conference system with multiple endpoints it is vital that the
   media content presented to any of the human participants is from the
   stated endpoint, and not an adversary that attempts to inject
   misleading content. Nor should an adversary be able to fool the
   system into becoming a trusted party in the conference. Only
   explicitly invited parties shall be able to contribute content.

7.1.3. Provide a More Efficient Service than "Full-Mesh"

   A multi-party conference that has the goals of confidentiality and
   source authentication can be established as a "full mesh" (i.e., each
   participating endpoint directly addresses each of the other
   endpoints). However, this has a significant issue with the amount of
   consumed resources in both the uplink and the downlink from each
   endpoint.

   A switched conferencing model would yield the efficiencies desired.

7.1.4. Support Cloud-Based Conferencing

   To achieve cost-effective and scalable conferencing, it must be
   possible to run the MDD instances in a cloud-based virtualized
   environment.

   From a security standpoint, this is a significant issue since the
   virtualized server instance and the underlying hardware and software
   upon which it runs might not be secure from an adversary.

7.1.5. Limiting an Endpoint's Access to Content

   Since an invited endpoint will be provided with the content
   protection keys, the endpoint can decrypt content from time periods
   before and after the endpoint joined the conference. However, this
   is not always desirable. It should be possible to re-key the content
   protection keys every time a participant joins or leaves the
   conference so each particular set of endpoints uses a unique key.

   This also changes the trust level required on the conference roster
   handling at any point and how to keep that accurate and secured.

   It should be noted that timely completion of the re-keying operations
   becomes an obstacle in system design and operation. Thus, it is a
   goal to allow for this possibility when it is deemed essential, but
   it should not be a requirement on a system to re-key each time the
   participant list changes.

7.1.6. Compatibility with the WebRTC Security Architecture

   It is a goal of this work to ensure compatibility with the WebRTC
   security architecture as described in [I.D-rtcweb-security-arch]. As
   an example, local resources that are considered a part of the trusted
   computing base (TCB), such as keying material derived using
   DTLS-SRTP, will remain within the TCB and will not be exposed to
   untrusted entities.


598     The browser is reliant on an external calling service to convey
599     signaling information, which may open the door to a man-in-the-middle
600     attack, such as the conveyance of certificate fingerprints over the
601     interface between the browser and the calling service. However, as
602     described in [I.D-rtcweb-security-arch], the browser may utilize
603     additional services, such as a trusted identity provider, to mitigate
604     such risks.

606     Having said the foregoing, this document does not aim to define
607     requirements for end-to-end security for the WebRTC data channel.

609  7.2. Non-Goals

611  7.2.1. Securing the Endpoints

613     The security of a communication session requires that the endpoints
614     are not compromised and that the users are trustworthy. If not,
615     credentials and decrypted content may be shared with third parties.
616     However, this is hard to prevent through system design. Thus, it
617     should be assumed that the endpoint is secure and the user is
618     trustworthy; how to achieve this is out of scope for this document.

620  7.2.2. Concealing that Communication Occurs

622     A non-goal is to attempt to prevent a pervasive monitoring adversary
623     from knowing that the communication session has occurred. The reason
624     for excluding this as a goal is that it is extremely difficult to
625     achieve, as a pervasive monitoring adversary can be expected to
626     have knowledge of all IP flows that enter or exit local ISPs or
627     cross links that straddle national borders or Internet exchange
628     points. To hide the fact that communication occurred, the flows required
629     to achieve the communication session need to be highly difficult to
630     correlate between different legs of the communication.

632     At this stage this is deemed too difficult to attempt and will need
633     to be a subject for further study. Existing attempts include The
634     Onion Router (Tor), which, it has been claimed, can be monitored,
635     at least partially, by an adversary with sufficient
636     reach.

638     A further consideration is that trying to conceal the fact that
639     communication occurred actually makes it more difficult for network
640     administrators to effectively manage and troubleshoot issues with
641     conference calls.

643  7.2.3. Individual Media Source Authentication

645     Although the endpoints in the conference are authenticated, it is not
646     a goal to provide source authentication of the media at the
647     individual user level; it is sufficient to be able to
648     authenticate media as coming from an invited endpoint or not.

650     There exist solutions that can provide individual media source
651     authentication (e.g., TESLA). However, they affect performance
652     or the security properties provided. Thus, further study is required
653     to determine the impact and resulting security properties if
654     individual source authentication is desired.

656  7.2.4. Multicast-Based Conferencing

658     Using multicast to construct a non-centralized media distribution
659     model is out of scope. This document is focused only on models where
660     endpoints, or other devices, participating in a conference unicast
661     media to a centrally located media distribution device.

663  8. Requirements

665     The following are the security solution requirements for switched
666     conferencing that enable end-to-end media privacy between all
667     endpoints.

669     Note that while some switching MDDs might be fully trusted entities,
670     the intent of this solution and the purpose of these requirements is to
671     address those servers that are not trusted.

673     PM-01: The switching media distribution device MUST be able to switch
674            the media between endpoints in a conference without having
675            access to unencrypted media content.

677     PM-02: The solution MUST maintain all current SRTP security goals,
678            namely the ability to provide for end-to-end confidentiality,
679            provide for hop-by-hop replay protection, and ensure hop-by-
680            hop and end-to-end message integrity.

682     PM-03: The solution MUST extend replay protection to cover each hop in
683            the media path, ensuring both that any received packet is
684            destined for the recipient and that it is not a duplicate.

686     PM-04: Keys used for end-to-end encryption and authentication of RTP
687            payloads and other information deemed unsuitable for access
688            by the switching media distribution device MUST NOT be
689            generated by or accessible to any component that is not
690            trusted.

692     PM-05: The switching media distribution device MUST be allowed to
693            make changes to the RTP header and the RTP header extensions.

695     PM-06: A cryptographic context suitable for enabling end-to-end
696            authenticated encryption MUST be defined.

698     PM-07: The switching media distribution device, or any entity that
699            is not fully trusted, MUST NOT be involved in the user or
700            endpoint authentication for the purpose of media key
701            distribution.

703     PM-08: The switching media distribution device MUST be able to
704            switch an already active RTP stream to a new receiver, while
705            guaranteeing timely synchronization between the RTP
706            security context of the transmitter and its current and new
707            receivers.

709     PM-09: It MUST be possible for the switching media distribution
710            device to determine if a received media packet was
711            transmitted by an endpoint in possession of a valid hop-by-
712            hop key for that conference.

714     PM-10: It MUST be possible for a conference to be optionally re-
715            keyed as desired, such as each time a participant joins or
716            leaves the conference.

718     PM-11: Any solution satisfying this requirements document MUST
719            provide for a means through which WebRTC-compliant endpoints
720            can participate in a switched conference using private media
721            as outlined herein.

723     PM-12: All RTP senders, including the switching media distribution
724            device, MUST adhere to all congestion control requirements
725            imposed by the RTP profile and topology in use, including RTP
726            circuit breakers [I.D-ietf-avtcore-rtp-circuit-breakers].
727            Since the switching media distribution device is
728            unable to perform transcoding or transrating that requires
729            access to the unencrypted media, its reaction to congestion
730            signals is often limited to dropping packets that would
731            otherwise be forwarded in the absence of congestion, and
732            signaling congestion to the RTP source. This is similar to
733            the congestion control behavior of the Media Switching Mixer
734            and Selective Forwarding Middlebox/Unit in
735            [I.D-ietf-avtcore-rtp-topologies-update].

737     PM-13: It MUST be possible for a media distribution device or an
738            endpoint to authenticate a received RTCP packet.

740  9. IANA Considerations

742     There are no IANA considerations for this document.

744  10. Security Considerations

746     [TBD]

748  11. References

750  11.1. Normative References

752     [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
753               Requirement Levels", BCP 14, RFC 2119, March 1997.

755     [RFC3711] Baugher, M., McGrew, D., Naslund, M., Carrara, E., and K.
756               Norrman, "The Secure Real-time Transport Protocol
757               (SRTP)", RFC 3711, March 2004.

759     [RFC6464] Lennox, J., Ivov, E., and E. Marocco, "A Real-time
760               Transport Protocol (RTP) Header Extension for Client-to-
761               Mixer Audio Level Indication", RFC 6464, December 2011.

763     [I.D-rtcweb-security-arch]
764               Rescorla, E., "WebRTC Security Architecture", Work in
765               Progress, March 2015.

767     [RFC6904] Lennox, J., "Encryption of Header Extensions in the Secure
768               Real-time Transport Protocol (SRTP)", RFC 6904, April
769               2013.

771     [I.D-ietf-avtcore-rtp-topologies-update]
772               Westerlund, M., and S. Wenger, "RTP Topologies", Work in
773               Progress, March 2015.

775     [I.D-ietf-avtcore-rtp-circuit-breakers]
776               Perkins, C. S., and V. Singh, "Multimedia Congestion
777               Control: Circuit Breakers for Unicast RTP Sessions", Work
778               in Progress, March 2015.

780  11.2. Informative References

782     [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
783               A., Peterson, J., Sparks, R., Handley, M., and E.
784               Schooler, "SIP: Session Initiation Protocol", RFC 3261,
785               June 2002.

787     [RFC4474] Peterson, J. and C. Jennings, "Enhancements for
788               Authenticated Identity Management in the Session
789               Initiation Protocol (SIP)", RFC 4474, August 2006.

791  12. Acknowledgments

793     The authors would like to thank Marcello Caramma, Matthew Miller,
794     Christian Oien, Magnus Westerlund, Cullen Jennings, Christer
795     Holmberg, Bo Burman, Jonathan Lennox, Suhas Nandakumar, Dan Wing,
796     Roni Even, and Mo Zanaty for their invaluable input.

798  13. Contributors

800     Yi Cheng
801     Ericsson
802     SE-164 80 Stockholm
803     Sweden

805     Phone: +46 10 71 17 589
806     Email: yi.cheng@ericsson.com

808  Authors' Addresses

810     Paul E. Jones
811     Cisco Systems, Inc.
812     7025 Kit Creek Rd.
813     Research Triangle Park, NC 27709
814     USA

816     Phone: +1 919 476 2048
817     Email: paulej@packetizer.com

819     Nermeen Ismail
820     Cisco Systems, Inc.
821     170 W Tasman Dr.
822     San Jose
823     USA

825     Email: nermeen@cisco.com

827     David Benham
828     Cisco Systems, Inc.
829     170 W Tasman Dr.
830     San Jose
831     USA

833     Email: dbenham@cisco.com

835     Nathan Buckles
836     Cisco Systems, Inc.
837     170 W Tasman Dr.
838     San Jose
839     USA

841     Email: nbuckles@cisco.com

843     John Mattsson
844     Ericsson AB
845     SE-164 80 Stockholm
846     Sweden

848     Phone: +46 10 71 43 501
849     Email: john.mattsson@ericsson.com

851     Richard Barnes
852     Mozilla
853     331 E Evelyn Ave.

855     Mountain View
856     USA

858     Email: rlb@ipv.sx