RTC-Web                                                      E. Rescorla
Internet-Draft                                                RTFM, Inc.
Intended status: Standards Track                            June 5, 2011
Expires: December 7, 2011


                  Security Considerations for RTC-Web
                   draft-rescorla-rtcweb-security-00

Abstract

   The Real-Time Communications on the Web (RTC-Web) working group is
   tasked with standardizing protocols for real-time communications
   between Web browsers. The two major use cases for RTC-Web technology
   are real-time audio and/or video calls and direct data transfer.
   Unlike most conventional real-time systems (e.g., SIP-based soft
   phones), RTC-Web communications are directly controlled by some Web
   server, which poses new security challenges. For instance, a Web
   browser might expose a JavaScript API which allows a server to place
   a video call. Unrestricted access to such an API would allow any
   site which a user visited to "bug" a user's computer, capturing any
   activity which passed in front of their camera. This document
   defines the RTC-Web threat model and defines an architecture which
   provides security within that threat model.

Legal

   THIS DOCUMENT AND THE INFORMATION CONTAINED THEREIN ARE PROVIDED ON
   AN "AS IS" BASIS AND THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST, AND THE INTERNET ENGINEERING TASK FORCE, DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION THEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 7, 2011.

Copyright Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  The Browser Threat Model
     3.1.  Access to Local Resources
     3.2.  Same Origin Policy
     3.3.  Bypassing SOP: CORS, WebSockets, and consent to communicate
   4.  Security for RTC-Web Applications
     4.1.  Access to Local Devices
     4.2.  Communications Consent Verification
       4.2.1.  ICE
       4.2.2.  Masking
       4.2.3.  Backward Compatibility
     4.3.  Communications Security
       4.3.1.  Protecting Against Retrospective Compromise
       4.3.2.  Protecting Against During-Call Attack
         4.3.2.1.  Key Continuity
         4.3.2.2.  Short Authentication Strings
   5.  Security Considerations
   6.  Acknowledgements
   7.  References
     7.1.  Normative References
     7.2.  Informative References
   Author's Address

1. Introduction

   The Real-Time Communications on the Web (RTC-Web) working group is
   tasked with standardizing protocols for real-time communications
   between Web browsers. The two major use cases for RTC-Web technology
   are real-time audio and/or video calls and direct data transfer.
   Unlike most conventional real-time systems (e.g., SIP-based [RFC3261]
   soft phones), RTC-Web communications are directly controlled by some
   Web server. A simple case is shown below.

                           +----------------+
                           |                |
                           |   Web Server   |
                           |                |
                           +----------------+
                               ^        ^
                              /          \
                      HTTP   /            \   HTTP
                            /              \
                           /                \
                          v                  v
                       JS API              JS API
                 +-----------+            +-----------+
                 |           |    Media   |           |
                 |  Browser  |<---------->|  Browser  |
                 |           |            |           |
                 +-----------+            +-----------+

                    Figure 1: A simple RTC-Web system

   In the system shown in Figure 1, Alice and Bob both have RTC-Web
   enabled browsers and they visit some Web server which operates a
   calling service. Each of their browsers exposes standardized
   JavaScript calling APIs which are used by the Web server to set up a
   call between Alice and Bob. While this system is topologically
   similar to a conventional SIP-based system (with the Web server
   acting as the signaling service and browsers acting as softphones),
   control has moved to the central Web server; the browser simply
   provides API points that are used by the calling service.
   As with any Web application, the Web server can move logic between
   the server and JavaScript in the browser, but regardless of where the
   code is executing, it is ultimately under control of the server.

   It should be immediately apparent that this type of system poses new
   security challenges beyond those of a conventional VoIP system. In
   particular, it needs to contend with malicious calling services. For
   example, if the calling service can cause the browser to make a call
   at any time to any callee of its choice, then this facility can be
   used to bug a user's computer without their knowledge, simply by
   placing a call to some recording service. More subtly, if the
   exposed APIs allow the server to instruct the browser to send
   arbitrary content, then they can be used to bypass firewalls or mount
   denial of service attacks. Any successful system will need to be
   resistant to this and other attacks.

2. Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3. The Browser Threat Model

   The security requirements for RTC-Web follow directly from the
   requirement that the browser's job is to protect the user. Huang et
   al. [huang-w2sp] summarize the core browser security guarantee as:

      Users can safely visit arbitrary web sites and execute scripts
      provided by those sites.

   It is important to realize that this includes sites hosting arbitrary
   malicious scripts. The motivation for this requirement is simple:
   it is trivial for attackers to divert users to sites of their choice.
   For instance, an attacker can purchase display advertisements which
   direct the user (either automatically or via user clicking) to their
   site, at which point the browser will execute the attacker's scripts.
   Thus, it is important that it be safe to view arbitrarily malicious
   pages. Of course, browsers inevitably have bugs which cause them to
   fall short of this goal, but any new RTC-Web functionality must be
   designed with the intent to meet this standard. The remainder of
   this section provides more background on the existing Web security
   model.

   In this model, then, the browser acts as a TRUSTED COMPUTING BASE
   (TCB) both from the user's perspective and to some extent from the
   server's. While HTML and JS provided by the server can cause the
   browser to execute a variety of actions, those scripts operate in a
   sandbox that isolates them both from the user's computer and from
   each other, as detailed below.

   Conventionally, we distinguish between WEB ATTACKERS, who are able to
   induce you to visit their sites but do not control the network, and
   NETWORK ATTACKERS, who are able to control your network. Network
   attackers correspond to the [RFC3552] "Internet Threat Model". In
   general, it is desirable to build a system which is secure against
   both kinds of attackers, but realistically many sites do not run
   HTTPS [RFC2818] and so our ability to defend against network
   attackers is necessarily somewhat limited. Most of the rest of this
   section is devoted to web attackers, with the assumption that
   protection against network attackers is provided by running HTTPS.

3.1. Access to Local Resources

   While the browser has access to local resources such as keying
   material, files, the camera and the microphone, it strictly limits or
   forbids web servers from accessing those same resources. For
   instance, while it is possible to produce an HTML form which will
   allow file upload, a script cannot do so without user consent and in
   fact cannot even suggest a specific file (e.g., /etc/passwd); the
   user must explicitly select the file and consent to its upload.
   [Note: in many cases browsers are explicitly designed to avoid
   dialogs with the semantics of "click here to screw yourself", as
   extensive research shows that users are prone to consent under such
   circumstances.]

   Similarly, while Flash SWFs can access the camera and microphone,
   they explicitly require that the user consent to that access. In
   addition, some resources simply cannot be accessed from the browser
   at all. For instance, there is no real way to run specific
   executables directly from a script (though the user can of course be
   induced to download executable files and run them).

3.2. Same Origin Policy

   Many other resources are accessible but isolated. For instance,
   while scripts are allowed to make HTTP requests via the
   XMLHttpRequest() API, those requests cannot be made to an arbitrary
   server, but only to the same ORIGIN from which the script came
   [I-D.abarth-origin] (although CORS [CORS] and WebSockets
   [I-D.ietf-hybi-thewebsocketprotocol] provide an escape hatch from
   this restriction, as described below). This SAME ORIGIN POLICY (SOP)
   prevents server A from mounting attacks on server B via the user's
   browser, which protects both the user (e.g., from misuse of his
   credentials) and the server (e.g., from DoS attack).

   More generally, SOP forces scripts from each site to run in their
   own, isolated, sandboxes. While there are techniques to allow them
   to interact, those interactions generally must be mutually consensual
   (by each site) and are limited to certain channels. For instance,
   multiple pages/browser panes from the same origin can read each
   other's JS variables, but pages from different origins--or even
   iframes from different origins on the same page--cannot.

3.3. Bypassing SOP: CORS, WebSockets, and consent to communicate

   While SOP serves an important security function, it also makes it
   inconvenient to write certain classes of applications. In
   particular, mash-ups, in which a script from origin A uses resources
   from origin B, can only be achieved via a certain amount of hackery.
   The W3C Cross-Origin Resource Sharing (CORS) spec [CORS] is a
   response to this demand. In CORS, when a script from origin A
   executes what would otherwise be a forbidden cross-origin request,
   the browser instead contacts the target server to determine whether
   it is willing to allow cross-origin requests from A. If it is so
   willing, the browser then allows the request. This consent
   verification process is designed to safely allow cross-origin
   requests.

   While CORS is designed to allow cross-origin HTTP requests,
   WebSockets [I-D.ietf-hybi-thewebsocketprotocol] allows cross-origin
   establishment of transparent channels. Once a WebSockets connection
   has been established from a script to a site, the script can exchange
   any traffic it likes without being required to frame it as a series
   of HTTP request/response transactions. As with CORS, a WebSockets
   transaction starts with a consent verification stage to avoid
   allowing scripts to simply send arbitrary data to another origin.

   While consent verification is conceptually simple--just do a
   handshake before you start exchanging the real data--experience has
   shown that designing a correct consent verification system is
   difficult. In particular, Huang et al. [huang-w2sp] have shown
   vulnerabilities in the existing Java and Flash consent verification
   techniques and in a simplified version of the WebSockets handshake.
   It is especially important to be wary of CROSS-PROTOCOL attacks
   in which the attacking script generates traffic which is acceptable
   to some non-Web protocol state machine.
   In order to resist this form of attack, WebSockets incorporates a
   masking technique intended to randomize the bits on the wire, thus
   making it more difficult to generate traffic which resembles a given
   protocol.

4. Security for RTC-Web Applications

4.1. Access to Local Devices

   As discussed in Section 1, allowing arbitrary sites to initiate calls
   violates the core Web security guarantee; without some access
   restrictions on local devices, any malicious site could simply bug a
   user. At minimum, then, it MUST NOT be possible for arbitrary sites
   to initiate calls to arbitrary locations without user consent. This
   immediately raises the question, however, of what should be the scope
   of user consent.

   As discussed in Section 3.2, the basic unit of Web sandboxing is the
   origin, and so it is natural to scope consent to origin.
   Specifically, a script from origin A MUST only be allowed to initiate
   communications (and hence to access camera and microphone) if the
   user has specifically authorized access for that origin. It is of
   course technically possible to have coarser-scoped permissions, but
   because the Web model is scoped to origin, this creates a difficult
   mismatch.

   Arguably, origin is not fine-grained enough. Consider the situation
   where Alice visits a site and authorizes it to make a single call.
   If consent is expressed solely in terms of origin, then at any future
   visit to that site (including one induced via mash-up or ad network),
   the site can bug Alice's computer. While in principle Alice could
   grant and then revoke the privilege, in practice privileges
   accumulate; if we are concerned about this attack, something else is
   needed. There are a number of potential countermeasures to this sort
   of issue.

   Individual Consent
      Ask the user for permission for each call.

   Callee-oriented Consent
      Only allow calls to a given user.

   Cryptographic Consent
      Only allow calls to a given set of peer keying material.

   Unfortunately, none of these approaches is really satisfactory.
   Individual consent puts the user's approval in the UI flow for every
   call. Not only does this quickly become annoying but it rapidly
   trains the user to simply click "OK", at which point the consent
   becomes useless.

   The other two options are designed to restrict calls to a given
   target. Unfortunately, callee-oriented consent does not work because
   the malicious site can claim that the user is calling any user of its
   choice. The fix for this is to tie calls to a specific set of
   cryptographic keying material, but that breaks any portability for
   the callee's client, and is thus problematic (see Section 4.3.2.1).

   While this is primarily not a question for the IETF, it should be
   clear that there is no really good answer. In general, if you cannot
   trust the site which you have authorized for calling not to bug you,
   then your security situation is not really ideal. It is RECOMMENDED
   that browsers have explicit (and obvious) indicators that they are in
   a call in order to mitigate this risk.

   The above recommendations provide security against web attackers.
   However, if a legitimate site is fetched over HTTP rather than HTTPS,
   a network attacker can inject code to initiate calls as if it were
   that origin, thus bypassing origin restrictions. Note that this form
   of attack is also possible if a site embeds active content (e.g.,
   JavaScript) that is fetched over HTTP or from an untrusted site,
   because that JavaScript is executed in the security context of the
   page [finer-grained]. Therefore, it is RECOMMENDED that sites which
   embed RTC-Web functionality serve that functionality only over HTTPS
   and that browsers disallow execution of calling functionality in
   origins which contain mixed content.
   Note: this issue is not restricted to PAGES which contain mixed
   content. If a page from a given origin ever loads mixed content then
   it is possible for a network attacker to infect the browser's notion
   of that origin semi-permanently.

4.2. Communications Consent Verification

   As discussed in Section 3.3, allowing web applications unrestricted
   access to the network via the browser introduces the risk of using
   the browser as an attack platform against machines which would not
   otherwise be accessible to the malicious site, for instance because
   they are topologically restricted (e.g., behind a firewall or NAT).
   In order to prevent this form of attack as well as cross-protocol
   attacks, it is important to require that the target of traffic
   explicitly consent to receiving the traffic in question. Until that
   consent has been verified for a given endpoint, traffic other than
   the consent handshake MUST NOT be sent to that endpoint.

4.2.1. ICE

   Verifying receiver consent requires some sort of explicit handshake,
   but conveniently we already need one in order to do NAT
   hole-punching. ICE [RFC5245] includes a handshake designed to verify
   that the receiving element wishes to receive traffic from the sender.
   It is important to remember here that the site initiating ICE is
   presumed malicious; in order for the handshake to be secure the
   receiving element MUST demonstrate receipt/knowledge of some value
   not available to the site (thus preventing it from forging
   responses). In order to achieve this objective with ICE, the STUN
   transaction IDs must be generated by the browser and MUST NOT be made
   available to the initiating script, even via a diagnostic interface.

4.2.2. Masking

   Once consent is verified, there still is some concern about
   misinterpretation attacks as described by Huang et al. [huang-w2sp].
   As long as communication is limited to UDP, this risk is probably
   limited, and so masking is not required for UDP. However, with TCP
   the risk of transparent proxies becomes much more severe. If TCP is
   to be used, then WebSockets-style masking MUST be employed.

4.2.3. Backward Compatibility

   A requirement to use ICE limits compatibility with legacy non-ICE
   clients. It seems unsafe to completely remove the requirement for
   some check, but it might be possible to merely require a one-sided
   check where the legacy client was a STUN responder. It's unclear
   whether that is in fact simpler than doing ICE-Lite.

4.3. Communications Security

   Finally, we consider a problem familiar from the SIP world:
   communications security. For obvious reasons, it MUST be possible
   for the communicating parties to establish a channel which is secure
   against both message recovery and message modification. (See
   [RFC5479] for more details.) This service must be provided for both
   data and voice/video. Ideally the same security mechanisms would be
   used for both types of content. Technology for providing this
   service (for instance, DTLS [RFC4347] and DTLS-SRTP [RFC5763]) is
   well understood. However, we must examine this technology in the
   RTC-Web context, where the threat model is somewhat different.

   In general, it is important to understand that unlike a conventional
   SIP proxy, the calling service (i.e., the Web server) controls not
   only the channel between the communicating endpoints but also the
   application running on the user's browser. While in principle it is
   possible for the browser to cut the calling service out of the loop
   and directly present trusted information (and perhaps get consent),
   practice in modern browsers is to avoid this whenever possible.
   "In-flow" modal dialogs which require the user to consent to specific
   actions are particularly disfavored as human factors research
   indicates that unless they are made extremely invasive, users simply
   agree to them without actually consciously giving consent
   [abarth-rtcweb]. Thus, nearly all the UI will necessarily be
   rendered by the browser but under control of the calling service.
   This likely includes the peer's identity information, which, after
   all, is only meaningful in the context of some calling service.

   This limitation does not mean that preventing attack by the calling
   service is completely hopeless. However, we need to distinguish
   between two classes of attack:

   Retrospective compromise of calling service.
      The calling service is non-malicious during a call but
      subsequently is compromised and wishes to attack an older call.

   During-call attack by calling service.
      The calling service is compromised during the call it wishes to
      attack.

   Providing security against the former type of attack is practical
   using the techniques discussed in Section 4.3.1. However, it is
   extremely difficult to prevent a trusted but malicious calling
   service from actively attacking a user's calls, either by mounting a
   MITM attack or by diverting them entirely. (Note that this attack
   applies equally to a network attacker if communications to the
   calling service are not secured.) We discuss some potential
   approaches and why they are likely to be impractical in
   Section 4.3.2.

4.3.1. Protecting Against Retrospective Compromise

   In a retrospective attack, the calling service was uncompromised
   during the call, but an attacker subsequently wants to recover the
   content of the call. We assume that the attacker has access to the
   protected media stream as well as having full control of the calling
   service.

   If the calling service has access to the traffic keying material (as
   in SDES [RFC4568]), then retrospective attack is trivial. This form
   of attack is particularly serious in the Web context because it is
   standard practice in Web services to run extensive logging and
   monitoring. Thus, it is highly likely that if the traffic key is
   part of any HTTP request it will be logged somewhere and thus subject
   to subsequent compromise. It is this consideration that makes an
   automatic, public key-based key exchange mechanism imperative for
   RTC-Web (this is a good idea for any communications security system),
   and this mechanism SHOULD provide perfect forward secrecy (PFS). The
   signaling channel/calling service can be used to authenticate this
   mechanism.

   In addition, the system MUST NOT provide any APIs either to extract
   long-term keying material or to directly access any stored traffic
   keys. Otherwise, an attacker who subsequently compromised the
   calling service might be able to use those APIs to recover the
   traffic keys and thus compromise the traffic.

4.3.2. Protecting Against During-Call Attack

   Protecting against attacks during a call is a more difficult
   proposition. Even if the calling service cannot directly access
   keying material (as recommended in the previous section), it can
   simply mount a man-in-the-middle attack on the connection, telling
   Alice that she is calling Bob and Bob that he is calling Alice, while
   in fact the calling service is acting as a calling bridge and
   capturing all the traffic. While in theory it is possible to
   construct techniques which protect against this form of attack, in
   practice these techniques all require far too much user intervention
   to be practical, given the user interface constraints described in
   [abarth-rtcweb].

4.3.2.1. Key Continuity

   One natural approach is to use "key continuity".
   While a malicious calling service can present any identity it chooses
   to the user, it cannot produce a private key that maps to a given
   public key. Thus, it is possible for the browser to note a given
   user's public key and generate an alarm whenever that user's key
   changes. SSH [RFC4251] uses a similar technique. (Note that the
   need to avoid explicit user consent on every call precludes the
   browser requiring an immediate manual check of the peer's key.)

   Unfortunately, this sort of key continuity mechanism is far less
   useful in the RTC-Web context. First, much of the virtue of RTC-Web
   (and any Web application) is that it is not bound to any particular
   piece of client software. Thus, it will be not only possible but
   routine for a user to use multiple browsers on different computers
   which will of course have different keying material (SACRED [RFC3760]
   notwithstanding). Thus, users will frequently be alerted to key
   mismatches which are in fact completely legitimate, with the result
   that they are trained to simply click through them. As it is known
   that users routinely will click through far more dire warnings
   [cranor-wolf], it seems extremely unlikely that any key continuity
   mechanism will be effective rather than simply annoying.

   Moreover, it is trivial to bypass even this kind of mechanism.
   Recall that unlike the case of SSH, the browser never directly gets
   the peer's identity from the user. Rather, it is provided by the
   calling service. Even enabling a mechanism of this type would
   require an API to allow the calling service to tell the browser "this
   is a call to user X". All the calling service needs to do to avoid
   triggering a key continuity warning is to tell the browser that "this
   is a call to user Y" where Y is close to X.
   Even if the user actually checks the other side's name (which all
   available evidence indicates is unlikely), this would require (a) the
   browser to use trusted UI to provide the name and (b) the user to not
   be fooled by similar-appearing names.

4.3.2.2. Short Authentication Strings

   ZRTP [RFC6189] uses a "short authentication string" (SAS) which is
   derived from the key agreement protocol. This SAS is designed to be
   read over the voice channel and if confirmed by both sides precludes
   MITM attack. The intention is that the SAS is used once and then key
   continuity (though a different mechanism from that discussed above)
   is used thereafter.

   Unfortunately, the SAS does not offer a practical solution to the
   problem of a compromised calling service. "Voice conversion"
   systems, which modify voice from one speaker to make it sound like
   another, are an active area of research. These systems are already
   good enough to fool both automatic recognition systems
   [farus-conversion] and humans [kain-conversion] in many cases, and
   are of course likely to improve in future, especially in an
   environment where the user just wants to get on with the phone call.
   Thus, even if SAS is effective today, it is likely not to be so for
   much longer. Moreover, it is possible for an attacker who controls
   the browser to allow the SAS to succeed and then simulate call
   failure and reconnect, trusting that the user will not notice that
   the "no SAS" indicator has been set (which seems likely).

   Even were SAS secure if used, it seems exceedingly unlikely that
   users will actually use it. As discussed above, the browser UI
   constraints preclude requiring the SAS exchange prior to completing
   the call and so it must be voluntary; at most the browser will
   provide some UI indicator that the SAS has not yet been checked.
However, it is well known that when faced with optional mechanisms such as fingerprints, users simply do not check them [whitten-johnny]. Thus, it is highly unlikely that users will ever perform the SAS exchange.

Once users have checked the SAS once, key continuity is required to avoid them needing to check it on every call. However, this is problematic for reasons indicated in Section 4.3.2.1. In principle it is of course possible to render a different UI element to indicate that calls are using an unauthenticated set of keying material (recall that the attacker can just present a slightly different name so that the attack shows the same UI as a call to a new device or to someone you haven't called before), but as a practical matter, users simply ignore such indicators even in the rather more dire case of mixed content warnings.

5. Security Considerations

This entire document is about security.

6. Acknowledgements

7. References

7.1. Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997.

7.2. Informative References

[CORS] van Kesteren, A., "Cross-Origin Resource Sharing".

[I-D.abarth-origin] Barth, A., "The Web Origin Concept", draft-abarth-origin-09 (work in progress), November 2010.

[I-D.ietf-hybi-thewebsocketprotocol] Fette, I., "The WebSocket protocol", draft-ietf-hybi-thewebsocketprotocol-07 (work in progress), April 2011.

[RFC2818] Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000.

[RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, "SIP: Session Initiation Protocol", RFC 3261, June 2002.

[RFC3552] Rescorla, E. and B. Korver, "Guidelines for Writing RFC Text on Security Considerations", BCP 72, RFC 3552, July 2003.

[RFC3760] Gustafson, D., Just, M., and M. Nystrom, "Securely Available Credentials (SACRED) - Credential Server Framework", RFC 3760, April 2004.

[RFC4251] Ylonen, T. and C. Lonvick, "The Secure Shell (SSH) Protocol Architecture", RFC 4251, January 2006.

[RFC4347] Rescorla, E. and N. Modadugu, "Datagram Transport Layer Security", RFC 4347, April 2006.

[RFC4568] Andreasen, F., Baugher, M., and D. Wing, "Session Description Protocol (SDP) Security Descriptions for Media Streams", RFC 4568, July 2006.

[RFC5245] Rosenberg, J., "Interactive Connectivity Establishment (ICE): A Protocol for Network Address Translator (NAT) Traversal for Offer/Answer Protocols", RFC 5245, April 2010.

[RFC5479] Wing, D., Fries, S., Tschofenig, H., and F. Audet, "Requirements and Analysis of Media Security Management Protocols", RFC 5479, April 2009.

[RFC5763] Fischl, J., Tschofenig, H., and E. Rescorla, "Framework for Establishing a Secure Real-time Transport Protocol (SRTP) Security Context Using Datagram Transport Layer Security (DTLS)", RFC 5763, May 2010.

[RFC6189] Zimmermann, P., Johnston, A., and J. Callas, "ZRTP: Media Path Key Agreement for Unicast Secure RTP", RFC 6189, April 2011.

[abarth-rtcweb] Barth, A., "Prompting the user is security failure", RTC-Web Workshop.

[cranor-wolf] Sunshine, J., Egelman, S., Almuhimedi, H., Atri, N., and L. Cranor, "Crying Wolf: An Empirical Study of SSL Warning Effectiveness", Proceedings of the 18th USENIX Security Symposium, 2009.

[farus-conversion] Farrus, M., Erro, D., and J. Hernando, "Speaker Recognition Robustness to Voice Conversion".

[finer-grained] Barth, A. and C. Jackson, "Beware of Finer-Grained Origins", W2SP, 2008.

[huang-w2sp] Huang, L-S., Chen, E., Barth, A., Rescorla, E., and C. Jackson, "Talking to Yourself for Fun and Profit", W2SP, 2011.

[kain-conversion] Kain, A. and M. Macon, "Design and Evaluation of a Voice Conversion Algorithm based on Spectral Envelope Mapping and Residual Prediction", Proceedings of ICASSP, May 2001.

[whitten-johnny] Whitten, A. and J. Tygar, "Why Johnny Can't Encrypt: A Usability Evaluation of PGP 5.0", Proceedings of the 8th USENIX Security Symposium, 1999.

Author's Address

Eric Rescorla
RTFM, Inc.
2064 Edgewood Drive
Palo Alto, CA 94303
USA

Phone: +1 650 678 2350
Email: ekr@rtfm.com