idnits 2.17.1 

draft-rescorla-rtcweb-security-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to contain a disclaimer for pre-RFC5378 work, but was
     first submitted on or after 10 November 2008.  The disclaimer is usually
     necessary only for documents that revise or obsolete older RFCs, and that
     take significant amounts of text from those RFCs.  If you can contact all
     authors of the source material and they are willing to grant the BCP78
     rights to the IETF Trust, you can and should remove the disclaimer. 
     Otherwise, the disclaimer is needed and you can ignore this comment. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (June 5, 2011) is 4708 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-17) exists of
     draft-ietf-hybi-thewebsocketprotocol-07

  -- Obsolete informational reference (is this intentional?): RFC 2818
     (Obsoleted by RFC 9110)

  -- Obsolete informational reference (is this intentional?): RFC 4347
     (Obsoleted by RFC 6347)

  -- Obsolete informational reference (is this intentional?): RFC 5245
     (Obsoleted by RFC 8445, RFC 8839)


     Summary: 1 error (**), 0 flaws (~~), 3 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	RTC-Web                                                      E. Rescorla
3	Internet-Draft                                                RTFM, Inc.
4	Intended status:  Standards Track                           June 5, 2011
5	Expires:  December 7, 2011

7	                  Security Considerations for RTC-Web
8	                   draft-rescorla-rtcweb-security-00

10	Abstract

12	   The Real-Time Communications on the Web (RTC-Web) working group is
13	   tasked with standardizing protocols for real-time communications
14	   between Web browsers.  The two major use cases for RTC-Web technology
15	   are real-time audio and/or video calls and direct data transfer.
16	   Unlike most conventional real-time systems (e.g., SIP-based soft
17	   phones), RTC-Web communications are directly controlled by some Web
18	   server, which poses new security challenges.  For instance, a Web
19	   browser might expose a JavaScript API which allows a server to place
20	   a video call.  Unrestricted access to such an API would allow any
21	   site which a user visited to "bug" a user's computer, capturing any
22	   activity which passed in front of their camera.  This document
23	   defines the RTC-Web threat model and defines an architecture which
24	   provides security within that threat model.

26	Legal

28	   THIS DOCUMENT AND THE INFORMATION CONTAINED THEREIN ARE PROVIDED ON
29	   AN "AS IS" BASIS AND THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
30	   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
31	   IETF TRUST, AND THE INTERNET ENGINEERING TASK FORCE, DISCLAIM ALL
32	   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
33	   WARRANTY THAT THE USE OF THE INFORMATION THEREIN WILL NOT INFRINGE
34	   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
35	   FOR A PARTICULAR PURPOSE.

37	Status of this Memo

39	   This Internet-Draft is submitted in full conformance with the
40	   provisions of BCP 78 and BCP 79.

42	   Internet-Drafts are working documents of the Internet Engineering
43	   Task Force (IETF).  Note that other groups may also distribute
44	   working documents as Internet-Drafts.  The list of current Internet-
45	   Drafts is at http://datatracker.ietf.org/drafts/current/.

47	   Internet-Drafts are draft documents valid for a maximum of six months
48	   and may be updated, replaced, or obsoleted by other documents at any
49	   time.  It is inappropriate to use Internet-Drafts as reference
50	   material or to cite them other than as "work in progress."

52	   This Internet-Draft will expire on December 7, 2011.

54	Copyright Notice

56	   Copyright (c) 2011 IETF Trust and the persons identified as the
57	   document authors.  All rights reserved.

59	   This document is subject to BCP 78 and the IETF Trust's Legal
60	   Provisions Relating to IETF Documents
61	   (http://trustee.ietf.org/license-info) in effect on the date of
62	   publication of this document.  Please review these documents
63	   carefully, as they describe your rights and restrictions with respect
64	   to this document.  Code Components extracted from this document must
65	   include Simplified BSD License text as described in Section 4.e of
66	   the Trust Legal Provisions and are provided without warranty as
67	   described in the Simplified BSD License.

69	   This document may contain material from IETF Documents or IETF
70	   Contributions published or made publicly available before November
71	   10, 2008.  The person(s) controlling the copyright in some of this
72	   material may not have granted the IETF Trust the right to allow
73	   modifications of such material outside the IETF Standards Process.
74	   Without obtaining an adequate license from the person(s) controlling
75	   the copyright in such materials, this document may not be modified
76	   outside the IETF Standards Process, and derivative works of it may
77	   not be created outside the IETF Standards Process, except to format
78	   it for publication as an RFC or to translate it into languages other
79	   than English.

81	Table of Contents

83	   1.  Introduction
84	   2.  Terminology
85	   3.  The Browser Threat Model
86	     3.1.  Access to Local Resources
87	     3.2.  Same Origin Policy
88	     3.3.  Bypassing SOP: CORS, WebSockets, and consent to
89	           communicate
90	   4.  Security for RTC-Web Applications
91	     4.1.  Access to Local Devices
92	     4.2.  Communications Consent Verification
93	       4.2.1.  ICE
94	       4.2.2.  Masking
95	       4.2.3.  Backward Compatibility
96	     4.3.  Communications Security
97	       4.3.1.  Protecting Against Retrospective Compromise
98	       4.3.2.  Protecting Against During-Call Attack
99	         4.3.2.1.  Key Continuity
100	         4.3.2.2.  Short Authentication Strings
101	   5.  Security Considerations
102	   6.  Acknowledgements
103	   7.  References
104	     7.1.  Normative References
105	     7.2.  Informative References
106	   Author's Address

108	1.  Introduction

110	   The Real-Time Communications on the Web (RTC-Web) working group is
111	   tasked with standardizing protocols for real-time communications
112	   between Web browsers.  The two major use cases for RTC-Web technology
113	   are real-time audio and/or video calls and direct data transfer.
114	   Unlike most conventional real-time systems (e.g., SIP-based [RFC3261]
115	   soft phones), RTC-Web communications are directly controlled by some
116	   Web server.  A simple case is shown below.

118	                               +----------------+
119	                               |                |
120	                               |   Web Server   |
121	                               |                |
122	                               +----------------+
123	                                   ^        ^
124	                                  /          \
125	                          HTTP   /            \   HTTP
126	                                /              \
127	                               /                \
128	                              v                  v
129	                           JS API              JS API
130	                     +-----------+            +-----------+
131	                     |           |    Media   |           |
132	                     |  Browser  |<---------->|  Browser  |
133	                     |           |            |           |
134	                     +-----------+            +-----------+

136	                     Figure 1: A simple RTC-Web system

138	   In the system shown in Figure 1, Alice and Bob both have RTC-Web
139	   enabled browsers and they visit some Web server which operates a
140	   calling service.  Each of their browsers exposes standardized
141	   JavaScript calling APIs which are used by the Web server to set up a
142	   call between Alice and Bob.  While this system is topologically
143	   similar to a conventional SIP-based system (with the Web server
144	   acting as the signaling service and browsers acting as softphones),
145	   control has moved to the central Web server; the browser simply
146	   provides API points that are used by the calling service.  As with
147	   any Web application, the Web server can move logic between the server
148	   and JavaScript in the browser, but regardless of where the code is
149	   executing, it is ultimately under control of the server.

151	   It should be immediately apparent that this type of system poses new
152	   security challenges beyond those of a conventional VoIP system.  In
153	   particular, it needs to contend with malicious calling services.  For
154	   example, if the calling service can cause the browser to make a call
155	   at any time to any callee of its choice, then this facility can be
156	   used to bug a user's computer without their knowledge, simply by
157	   placing a call to some recording service.  More subtly, if the
158	   exposed APIs allow the server to instruct the browser to send
159	   arbitrary content, then they can be used to bypass firewalls or mount
160	   denial of service attacks.  Any successful system will need to be
161	   resistant to this and other attacks.

163	2.  Terminology

165	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
166	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
167	   document are to be interpreted as described in RFC 2119 [RFC2119].

169	3.  The Browser Threat Model

171	   The security requirements for RTC-Web follow directly from the
172	   requirement that the browser's job is to protect the user.  Huang et
173	   al. [huang-w2sp] summarize the core browser security guarantee as:

175	      Users can safely visit arbitrary web sites and execute scripts
176	      provided by those sites.

178	   It is important to realize that this includes sites hosting arbitrary
179	   malicious scripts.  The motivation for this requirement is simple:
180	   it is trivial for attackers to divert users to sites of their choice.
181	   For instance, an attacker can purchase display advertisements which
182	   direct the user (either automatically or via user clicking) to their
183	   site, at which point the browser will execute the attacker's scripts.
184	   Thus, it is important that it be safe to view arbitrarily malicious
185	   pages.  Of course, browsers inevitably have bugs which cause them to
186	   fall short of this goal, but any new RTC-Web functionality must be
187	   designed with the intent to meet this standard.  The remainder of
188	   this section provides more background on the existing Web security
189	   model.

191	   In this model, then, the browser acts as a TRUSTED COMPUTING BASE
192	   (TCB) both from the user's perspective and to some extent from the
193	   server's.  While HTML and JS provided by the server can cause the
194	   browser to execute a variety of actions, those scripts operate in a
195	   sandbox that isolates them both from the user's computer and from
196	   each other, as detailed below.

198	   Conventionally, we distinguish between WEB ATTACKERS, who are able to
199	   induce you to visit their sites but do not control the network, and
200	   NETWORK ATTACKERS, who are able to control your network.  Network
201	   attackers correspond to the [RFC3552] "Internet Threat Model".  In
202	   general, it is desirable to build a system which is secure against
203	   both kinds of attackers, but realistically many sites do not run
204	   HTTPS [RFC2818] and so our ability to defend against network
205	   attackers is necessarily somewhat limited.  Most of the rest of this
206	   section is devoted to web attackers, with the assumption that
207	   protection against network attackers is provided by running HTTPS.

209	3.1.  Access to Local Resources

211	   While the browser has access to local resources such as keying
212	   material, files, the camera and the microphone, it strictly limits or
213	   forbids web servers from accessing those same resources.  For
214	   instance, while it is possible to produce an HTML form which will
215	   allow file upload, a script cannot do so without user consent and in
216	   fact cannot even suggest a specific file (e.g., /etc/passwd); the
217	   user must explicitly select the file and consent to its upload.
218	   [Note:  in many cases browsers are explicitly designed to avoid
219	   dialogs with the semantics of "click here to screw yourself", as
220	   extensive research shows that users are prone to consent under such
221	   circumstances.]

223	   Similarly, while Flash SWFs can access the camera and microphone,
224	   they explicitly require that the user consent to that access.  In
225	   addition, some resources simply cannot be accessed from the browser
226	   at all.  For instance, there is no real way to run specific
227	   executables directly from a script (though the user can of course be
228	   induced to download executable files and run them).

230	3.2.  Same Origin Policy

232	   Many other resources are accessible but isolated.  For instance,
233	   while scripts are allowed to make HTTP requests via the
234	   XMLHttpRequest() API, those requests are not allowed to be made to any
235	   server, but rather solely to the same ORIGIN from which the script
236	   came [I-D.abarth-origin] (although CORS [CORS] and WebSockets
237	   [I-D.ietf-hybi-thewebsocketprotocol] provide an escape hatch from
238	   this restriction, as described below).  This SAME ORIGIN POLICY (SOP)
239	   prevents server A from mounting attacks on server B via the user's
240	   browser, which protects both the user (e.g., from misuse of his
241	   credentials) and the server (e.g., from DoS attack).

243	   More generally, SOP forces scripts from each site to run in their
244	   own, isolated, sandboxes.  While there are techniques to allow them
245	   to interact, those interactions generally must be mutually consensual
246	   (by each site) and are limited to certain channels.  For instance,
247	   multiple pages/browser panes from the same origin can read each
248	   other's JS variables, but pages from different origins--or even
249	   iframes from different origins on the same page--cannot.
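
   As a rough illustration of the origin comparison underlying SOP, the
   following Python sketch treats an origin as a (scheme, host, port)
   triple.  The function names and the default-port table are
   assumptions made for this example; real browsers apply additional
   rules that are not modeled here.

```python
from urllib.parse import urlsplit

# Default ports, used when a URL omits an explicit port (assumed table).
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """True only if both URLs share scheme, host, and port."""
    return origin(url_a) == origin(url_b)
```

   Under this sketch, a page served over HTTP and one served over HTTPS
   from the same host are different origins, as are two distinct
   subdomains.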

251	3.3.  Bypassing SOP: CORS, WebSockets, and consent to communicate

253	   While SOP serves an important security function, it also makes it
254	   inconvenient to write certain classes of applications.  In
255	   particular, mash-ups, in which a script from origin A uses resources
256	   from origin B, can only be achieved via a certain amount of hackery.
257	   The W3C Cross-Origin Resource Sharing (CORS) spec [CORS] is a
258	   response to this demand.  In CORS, when a script from origin A
259	   executes what would otherwise be a forbidden cross-origin request,
260	   the browser instead contacts the target server to determine whether
261	   it is willing to allow cross-origin requests from A. If it is so
262	   willing, the browser then allows the request.  This consent
263	   verification process is designed to safely allow cross-origin
264	   requests.
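
   The consent decision described above can be sketched as follows.
   This is a deliberately simplified model: it looks only at the
   Access-Control-Allow-Origin response header and ignores preflight
   requests and credentials.  The function name and the dictionary
   representation of headers are assumptions for illustration.

```python
def cors_allows(requesting_origin, response_headers):
    """Decide whether the target server consented to this origin.

    requesting_origin: serialized origin, e.g. "https://a.example".
    response_headers:  dict of response header names to values.
    """
    allowed = response_headers.get("Access-Control-Allow-Origin")
    if allowed is None:
        # The server said nothing, so no consent was given.
        return False
    # Consent is either a wildcard or an exact match on the origin.
    return allowed == "*" or allowed == requesting_origin
```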

266	   While CORS is designed to allow cross-origin HTTP requests,
267	   WebSockets [I-D.ietf-hybi-thewebsocketprotocol] allows cross-origin
268	   establishment of transparent channels.  Once a WebSockets connection
269	   has been established from a script to a site, the script can exchange
270	   any traffic it likes without being required to frame it as a series
271	   of HTTP request/response transactions.  As with CORS, a WebSockets
272	   transaction starts with a consent verification stage to avoid
273	   allowing scripts to simply send arbitrary data to another origin.

275	   While consent verification is conceptually simple--just do a
276	   handshake before you start exchanging the real data--experience has
277	   shown that designing a correct consent verification system is
278	   difficult.  In particular, Huang et al. [huang-w2sp] have shown
279	   vulnerabilities in the existing Java and Flash consent verification
280	   techniques and in a simplified version of the WebSockets handshake.
281	   In particular, it is important to be wary of CROSS-PROTOCOL attacks
282	   in which the attacking script generates traffic which is acceptable
283	   to some non-Web protocol state machine.  In order to resist this form
284	   of attack, WebSockets incorporates a masking technique intended to
285	   randomize the bits on the wire, thus making it more difficult to
286	   generate traffic which resembles a given protocol.
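
   To make the masking idea concrete, the sketch below XORs a payload
   with a fresh 4-byte key chosen by the browser, which is the approach
   taken in the WebSocket protocol as later published (RFC 6455); the
   details in the draft version cited here may differ.  The function
   names are assumptions for this example.

```python
import secrets


def new_masking_key():
    """Pick an unpredictable 4-byte key; the script never chooses it."""
    return secrets.token_bytes(4)


def mask_payload(payload, key):
    """XOR each payload byte with the key, repeating the key as needed.

    Applying the same key twice restores the original payload.
    """
    return bytes(b ^ key[i % 4] for i, b in enumerate(payload))
```

   Because the key is unpredictable to the script, the script cannot
   choose a payload whose on-the-wire bytes spell out, say, an HTTP
   request line.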

288	4.  Security for RTC-Web Applications

290	4.1.  Access to Local Devices

292	   As discussed in Section 1, allowing arbitrary sites to initiate calls
293	   violates the core Web security guarantee; without some access
294	   restrictions on local devices, any malicious site could simply bug a
295	   user.  At minimum, then, it MUST NOT be possible for arbitrary sites
296	   to initiate calls to arbitrary locations without user consent.  This
297	   immediately raises the question, however, of what should be the scope
298	   of user consent.

300	   As discussed in Section 3.2, the basic unit of Web sandboxing is the
301	   origin, and so it is natural to scope consent to origin.
302	   Specifically, a script from origin A MUST only be allowed to initiate
303	   communications (and hence to access camera and microphone) if the
304	   user has specifically authorized access for that origin.  It is of
305	   course technically possible to have coarser-scoped permissions, but
306	   because the Web model is scoped to origin, this creates a difficult
307	   mismatch.
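
   A minimal sketch of origin-scoped consent, assuming origins are
   represented as (scheme, host, port) triples.  The class and method
   names are hypothetical and do not correspond to any browser API.

```python
class ConsentStore:
    """Tracks which origins the user has authorized for calling."""

    def __init__(self):
        self._granted = set()

    def grant(self, origin):
        """Record that the user authorized this origin."""
        self._granted.add(origin)

    def revoke(self, origin):
        """Withdraw a previously granted authorization, if any."""
        self._granted.discard(origin)

    def may_access_devices(self, origin):
        """Camera and microphone are reachable only after a grant."""
        return origin in self._granted
```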

309	   Arguably, origin is not fine-grained enough.  Consider the situation
310	   where Alice visits a site and authorizes it to make a single call.
311	   If consent is expressed solely in terms of origin, then at any future
312	   visit to that site (including one induced via mash-up or ad network),
313	   the site can bug Alice's computer.  While in principle Alice could
314	   grant and then revoke the privilege, in practice privileges
315	   accumulate; if we are concerned about this attack, something else is
316	   needed.  There are a number of potential countermeasures to this sort
317	   of issue.

319	   Individual Consent
320	      Ask the user for permission for each call.

322	   Callee-oriented Consent
323	      Only allow calls to a given user.

325	   Cryptographic Consent
326	      Only allow calls to a given set of peer keying material.

328	   Unfortunately, none of these approaches is really satisfactory.
329	   Individual consent puts the user's approval in the UI flow for every
330	   call.  Not only does this quickly become annoying but it rapidly
331	   trains the user to simply click "OK", at which point the consent
332	   becomes useless.

334	   The other two options are designed to restrict calls to a given
335	   target.  Unfortunately, callee-oriented consent does not work because
336	   the malicious site can claim that the user is calling any user of its
337	   choice.  The fix for this is to tie calls to a specific set of
338	   cryptographic keying material, but that breaks any portability for
339	   the callee's client, and is thus problematic (see Section 4.3.2.1).

341	   While this is primarily not a question for the IETF, it should be clear
342	   that there is no really good answer.  In general, if you cannot trust
343	   the site which you have authorized for calling not to bug you then
344	   your security situation is not really ideal.  It is RECOMMENDED that
345	   browsers have explicit (and obvious) indicators that they are in a
346	   call in order to mitigate this risk.

348	   The above recommendations provide security against web attackers.
349	   However, if a legitimate site is fetched over HTTP rather than HTTPS,
350	   a network attacker can inject code to initiate calls as if it were
351	   that origin, thus bypassing origin restrictions.  Note that this form
352	   of attack is also possible if a site embeds active content (e.g.,
353	   JavaScript) that is fetched over HTTP or from an untrusted site,
354	   because that JavaScript is executed in the security context of the
355	   page [finer-grained].  Therefore, it is RECOMMENDED that sites which
356	   embed RTC-Web functionality serve that functionality only over HTTPS
357	   and that browsers disallow execution of calling functionality in
358	   origins which contain mixed content.  Note:  this issue is not
359	   restricted to PAGES which contain mixed content.  If a page from a
360	   given origin ever loads mixed content then it is possible for a
361	   network attacker to infect the browser's notion of that origin semi-
362	   permanently.
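
   One way to picture the recommendation above is a tracker that taints
   an entire origin, not just a single page, once it has loaded any
   insecure subresource.  This is only a sketch: the class name is
   hypothetical, every non-HTTPS load is treated as active content, and
   the taint is never cleared.

```python
from urllib.parse import urlsplit


class MixedContentTracker:
    """Remembers which origins have ever loaded insecure subresources."""

    def __init__(self):
        self._tainted = set()

    def record_subresource(self, page_origin, subresource_url):
        # Any non-HTTPS load taints the whole origin, not just this page.
        if urlsplit(subresource_url).scheme.lower() != "https":
            self._tainted.add(page_origin)

    def calling_allowed(self, page_origin):
        """Allow calling only for untainted HTTPS origins."""
        scheme = page_origin[0]
        return scheme == "https" and page_origin not in self._tainted
```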

364	4.2.  Communications Consent Verification

366	   As discussed in Section 3.3, allowing web applications unrestricted
367	   access to the network via the browser introduces the risk of using
368	   the browser as an attack platform against machines which would not
369	   otherwise be accessible to the malicious site, for instance because
370	   they are topologically restricted (e.g., behind a firewall or NAT).
371	   In order to prevent this form of attack as well as cross-protocol
372	   attacks it is important to require that the target of traffic
373	   explicitly consent to receiving the traffic in question.  Until that
374	   consent has been verified for a given endpoint, traffic other than
375	   the consent handshake MUST NOT be sent to that endpoint.
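
   The rule in the preceding paragraph amounts to a gate in front of
   the browser's send path.  The sketch below assumes endpoints are
   (address, port) pairs; the class and method names are hypothetical.

```python
class ConsentGate:
    """Blocks all non-handshake traffic to unverified endpoints."""

    def __init__(self):
        self._verified = set()

    def mark_verified(self, endpoint):
        """Called once the consent handshake with endpoint succeeds."""
        self._verified.add(endpoint)

    def may_send(self, endpoint, is_consent_handshake):
        # Handshake packets are always allowed; anything else requires
        # that this specific endpoint has already consented.
        return is_consent_handshake or endpoint in self._verified
```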

377	4.2.1.  ICE

379	   Verifying receiver consent requires some sort of explicit handshake,
380	   but conveniently we already need one in order to do NAT hole-
381	   punching.  ICE [RFC5245] includes a handshake designed to verify that
382	   the receiving element wishes to receive traffic from the sender.  It
383	   is important to remember here that the site initiating ICE is
384	   presumed malicious; in order for the handshake to be secure the
385	   receiving element MUST demonstrate receipt/knowledge of some value
386	   not available to the site (thus preventing it from forging
387	   responses).  In order to achieve this objective with ICE, the STUN
388	   transaction IDs must be generated by the browser and MUST NOT be made
389	   available to the initiating script, even via a diagnostic interface.
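
   The following sketch illustrates why the transaction ID must stay
   inside the browser.  It assumes the 96-bit STUN transaction ID and
   models only the "echo a secret value" aspect of the check; it does
   not implement STUN or ICE.  The class and method names are
   hypothetical.

```python
import hmac
import secrets


class ConsentCheck:
    """One outstanding consent check toward a candidate peer."""

    def __init__(self):
        # 96 random bits chosen by the browser; never exposed to script.
        self._transaction_id = secrets.token_bytes(12)

    def outgoing_transaction_id(self):
        """Value placed in the request on the wire (not given to JS)."""
        return self._transaction_id

    def verify_response(self, echoed_id):
        """Accept only a response that echoes the secret value."""
        return hmac.compare_digest(self._transaction_id, echoed_id)
```

   A site that cannot read the transaction ID has to guess all 96 bits
   to forge a response on behalf of an endpoint that never saw the
   request.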

391	4.2.2.  Masking

393	   Once consent is verified, there still is some concern about
394	   misinterpretation attacks as described by Huang et al. [huang-w2sp].
395	   As long as communication is limited to UDP, this risk is
396	   probably limited, and thus masking is not required for UDP.  However,
397	   with TCP the risk of transparent proxies becomes much more severe.
398	   If TCP is to be used, then WebSockets-style masking MUST be employed.

400	4.2.3.  Backward Compatibility

402	   A requirement to use ICE limits compatibility with legacy non-ICE
403	   clients.  It seems unsafe to completely remove the requirement for
404	   some check, but it might be possible to merely require a one-sided
405	   check where the legacy client was a STUN responder.  It's unclear
406	   whether that is in fact simpler than doing ICE-Lite.

408	4.3.  Communications Security

410	   Finally, we consider a problem familiar from the SIP world:
411	   communications security.  For obvious reasons, it MUST be possible
412	   for the communicating parties to establish a channel which is secure
413	   against both message recovery and message modification.  (See
414	   [RFC5479] for more details.)  This service must be provided for both
415	   data and voice/video.  Ideally the same security mechanisms would be
416	   used for both types of content.  Technology for providing this
417	   service (for instance, DTLS [RFC4347] and DTLS-SRTP [RFC5763]) is
418	   well understood.  However, we must examine this technology in the
419	   RTC-Web context, where the threat model is somewhat different.

421	   In general, it is important to understand that unlike a conventional
422	   SIP proxy, the calling service (i.e., the Web server) controls not
423	   only the channel between the communicating endpoints but also the
424	   application running on the user's browser.  While in principle it is
425	   possible for the browser to cut the calling service out of the loop
426	   and directly present trusted information (and perhaps get consent),
427	   practice in modern browsers is to avoid this whenever possible.  "In-
428	   flow" modal dialogs which require the user to consent to specific
429	   actions are particularly disfavored as human factors research
430	   indicates that unless they are made extremely invasive, users simply
431	   agree to them without actually consciously giving consent
432	   [abarth-rtcweb].  Thus, nearly all the UI will necessarily be
433	   rendered by the browser but under control of the calling service.
434	   This likely includes the peer's identity information, which, after
435	   all, is only meaningful in the context of some calling service.

437	   This limitation does not mean that preventing attack by the calling
438	   service is completely hopeless.  However, we need to distinguish
439	   between two classes of attack:

441	   Retrospective compromise of calling service.
442	      The calling service is non-malicious during a call but
443	      subsequently is compromised and wishes to attack an older call.

445	   During-call attack by calling service.
446	      The calling service is compromised during the call it wishes to
447	      attack.

449	   Providing security against the former type of attack is practical
450	   using the techniques discussed in Section 4.3.1.  However, it is
451	   extremely difficult to prevent a trusted but malicious calling
452	   service from actively attacking a user's calls, either by mounting a
453	   MITM attack or by diverting them entirely.  (Note that this attack
454	   applies equally to a network attacker if communications to the
455	   calling service are not secured.)  We discuss some potential
456	   approaches and why they are likely to be impractical in
457	   Section 4.3.2.

459	4.3.1.  Protecting Against Retrospective Compromise

461	   In a retrospective attack, the calling service was uncompromised
462	   during the call, but an attacker subsequently wants to recover
463	   the content of the call.  We assume that the attacker has access to
464	   the protected media stream as well as having full control of the
465	   calling service.

467	   If the calling service has access to the traffic keying material (as
468	   in SDES [RFC4568]), then retrospective attack is trivial.  This form
469	   of attack is particularly serious in the Web context because it is
470	   standard practice in Web services to run extensive logging and
471	   monitoring.  Thus, it is highly likely that if the traffic key is
472	   part of any HTTP request it will be logged somewhere and thus subject
473	   to subsequent compromise.  It is this consideration that makes an
474	   automatic, public key-based key exchange mechanism imperative for
475	   RTC-Web (this is a good idea for any communications security system)
476	   and this mechanism SHOULD provide perfect forward secrecy (PFS).  The
477	   signaling channel/calling service can be used to authenticate this
478	   mechanism.
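
   To illustrate what forward secrecy buys here, the sketch below runs
   an ephemeral Diffie-Hellman exchange in which only public values
   would cross the calling service.  The group is a toy chosen so the
   example stays short (2**127 - 1 is prime but far too small for real
   use), the exchange is unauthenticated, and all names are
   assumptions.  A real deployment would rely on an established
   protocol such as DTLS rather than code like this.

```python
import hashlib
import secrets

# Toy parameters for illustration only; NOT secure.
P = 2**127 - 1
G = 3


def ephemeral_keypair():
    """Generate a fresh private exponent and matching public value."""
    private = secrets.randbelow(P - 3) + 2  # uniformly in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def traffic_key(own_private, peer_public):
    """Derive a traffic key from the shared Diffie-Hellman secret."""
    shared = pow(peer_public, own_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()
```

   If both sides discard their private exponents after the call, a
   later compromise of the calling service, which only ever saw the
   public values, does not directly reveal the traffic key.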

480	   In addition, the system MUST NOT provide any APIs either to extract
481	   long-term keying material or to directly access any stored traffic
482	   keys.  Otherwise, an attacker who subsequently compromised the
483	   calling service might be able to use those APIs to recover the
484	   traffic keys and thus compromise the traffic.

486	4.3.2.  Protecting Against During-Call Attack

488	   Protecting against attacks during a call is a more difficult
489	   proposition.  Even if the calling service cannot directly access
490	   keying material (as recommended in the previous section), it can
491	   simply mount a man-in-the-middle attack on the connection, telling
492	   Alice that she is calling Bob and Bob that he is calling Alice, while
493	   in fact the calling service is acting as a calling bridge and
494	   capturing all the traffic.  While in theory it is possible to
495	   construct techniques which protect against this form of attack, in
496	   practice these techniques all require far too much user intervention
497	   to be practical, given the user interface constraints described in
498	   [abarth-rtcweb].

500	4.3.2.1.  Key Continuity

502	   One natural approach is to use "key continuity".  While a malicious
503	   calling service can present any identity it chooses to the user, it
504	   cannot produce a private key that maps to a given public key.  Thus,
505	   it is possible for the browser to note a given user's public key and
506	   generate an alarm whenever that user's key changes.  SSH [RFC4251]
507	   uses a similar technique.  (Note that the need to avoid explicit user
508	   consent on every call precludes the browser requiring an immediate
509	   manual check of the peer's key.)

511	   Unfortunately, this sort of key continuity mechanism is far less
512	   useful in the RTC-Web context.  First, much of the virtue of RTC-Web
513	   (and any Web application) is that it is not bound to a particular piece
514	   of client software.  Thus, it will be not only possible but routine
515	   for a user to use multiple browsers on different computers which will
516	   of course have different keying material (SACRED [RFC3760]
517	   notwithstanding.)  Thus, users will frequently be alerted to key
518	   mismatches which are in fact completely legitimate, with the result
519	   that they are trained to simply click through them.  As it is known
520	   that users routinely will click through far more dire warnings
521	   [cranor-wolf], it seems extremely unlikely that any key continuity
522	   mechanism will be effective rather than simply annoying.

524	   Moreover, it is trivial to bypass even this kind of mechanism.
525	   Recall that unlike the case of SSH, the browser never directly gets
526	   the peer's identity from the user.  Rather, it is provided by the
527	   calling service.  Even enabling a mechanism of this type would
528	   require an API to allow the calling service to tell the browser "this
529	   is a call to user X".  All the calling service needs to do to avoid
530	   triggering a key continuity warning is to tell the browser that "this
531	   is a call to user Y" where Y is close to X. Even if the user actually
532	   checks the other side's name (which all available evidence indicates
533	   is unlikely), this would require (a) the browser to have trusted UI
534	   to display the name and (b) the user to not be fooled by
535	   similar-appearing names.
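
   The bypass can be sketched in a few lines of Python.  The pins
   table, the check_key function, and the example identities are
   hypothetical; the only point is that the lookup key is whatever
   string the calling service chooses to assert.

```python
import hashlib

# Maps an identity string (as asserted by the calling service) to the
# fingerprint of the first key seen for it.
pins = {}


def check_key(asserted_identity: str, public_key: bytes) -> str:
    fp = hashlib.sha256(public_key).hexdigest()
    # setdefault() stores fp on first use, otherwise returns the old pin.
    pinned = pins.setdefault(asserted_identity, fp)
    return "ok" if pinned == fp else "WARN: key changed"


# An earlier, honest call establishes the pin for Bob.
first = check_key("bob@example.com", b"bob-key")

# An attack under the same name trips the continuity check...
same_name = check_key("bob@example.com", b"attacker-key")

# ...but the service can instead assert a lookalike name, which has no
# pin yet and is therefore treated like any first call.
lookalike = check_key("b0b@example.com", b"attacker-key")

print(first, same_name, lookalike, sep="\n")
```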

537	4.3.2.2.  Short Authentication Strings

539	   ZRTP [RFC6189] uses a "short authentication string" (SAS) which is
540	   derived from the key agreement protocol.  This SAS is designed to be
541	   read over the voice channel and, if confirmed by both sides,
542	   precludes a MITM attack.  The intention is that the SAS is used once
543	   and then key continuity (though a different mechanism from that
544	   discussed above) is used thereafter.
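
   The idea can be illustrated with a deliberately simplified Python
   sketch.  It is not ZRTP's actual SAS derivation: here the SAS is
   assumed to be a truncated SHA-256 hash over the two public values
   each endpoint saw during key agreement, and the sas function and
   all key values are hypothetical.

```python
import hashlib


def sas(public_a: bytes, public_b: bytes, digits: int = 4) -> str:
    """Short string bound to the public values seen in the exchange."""
    transcript = hashlib.sha256(public_a + b"|" + public_b).hexdigest()
    return transcript[:digits]


alice_pub, bob_pub, mallory_pub = b"alice-pub", b"bob-pub", b"mallory-pub"

# Direct call: both endpoints saw the same pair of public values, so
# the strings they read to each other agree.
direct_alice = sas(alice_pub, bob_pub)
direct_bob = sas(alice_pub, bob_pub)
print(direct_alice == direct_bob)

# Bridged call: each endpoint actually exchanged keys with Mallory, so
# the two endpoints hash different transcripts.
bridged_alice = sas(alice_pub, mallory_pub)
bridged_bob = sas(mallory_pub, bob_pub)
print(bridged_alice, bridged_bob)
```

   Because the SAS is short, two different transcripts can collide
   with small probability; a real protocol has to prevent the attacker
   from searching for such a collision.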

546	   Unfortunately, the SAS does not offer a practical solution to the
547	   problem of a compromised calling service.  "Voice conversion"
548	   systems, which modify voice from one speaker to make it sound like
549	   another, are an active area of research.  These systems are already
550	   good enough to fool both automatic recognition systems
551	   [farus-conversion] and humans [kain-conversion] in many cases, and
552	   are of course likely to improve in future, especially in an
553	   environment where the user just wants to get on with the phone call.
554	   Thus, even if SAS is effective today, it is likely not to be so for
555	   much longer.  Moreover, it is possible for an attacker who controls
556	   the browser to allow the SAS to succeed and then simulate call
557	   failure and reconnect, trusting that the user will not notice that
558	   the "no SAS" indicator has been set (which seems likely).

560	   Even were SAS secure if used, it seems exceedingly unlikely that
561	   users will actually use it.  As discussed above, the browser UI
562	   constraints preclude requiring the SAS exchange prior to completing
563	   the call and so it must be voluntary; at most the browser will
564	   provide some UI indicator that the SAS has not yet been checked.
565	   However, it is well known that when faced with optional mechanisms
566	   such as fingerprints, users simply do not check them
567	   [whitten-johnny].  Thus, it is highly unlikely that users will ever
568	   perform the SAS exchange.

570	   Once users have checked the SAS once, key continuity is required to
571	   avoid them needing to check it on every call.  However, this is
572	   problematic for reasons indicated in Section 4.3.2.1.  In principle
573	   it is of course possible to render a different UI element to indicate
574	   that calls are using an unauthenticated set of keying material
575	   (recall that the attacker can just present a slightly different name
576	   so that the attack shows the same UI as a call to a new device or to
577	   someone you haven't called before) but as a practical matter, users
578	   simply ignore such indicators even in the rather more dire case of
579	   mixed content warnings.

581	5.  Security Considerations

583	   This entire document is about security.

585	6.  Acknowledgements

587	7.  References

589	7.1.  Normative References

591	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
592	              Requirement Levels", BCP 14, RFC 2119, March 1997.

594	7.2.  Informative References

596	   [CORS]     van Kesteren, A., "Cross-Origin Resource Sharing".

598	   [I-D.abarth-origin]
599	              Barth, A., "The Web Origin Concept",
600	              draft-abarth-origin-09 (work in progress), November 2010.

602	   [I-D.ietf-hybi-thewebsocketprotocol]
603	              Fette, I., "The WebSocket protocol",
604	              draft-ietf-hybi-thewebsocketprotocol-07 (work in
605	              progress), April 2011.

607	   [RFC2818]  Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000.

609	   [RFC3261]  Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston,
610	              A., Peterson, J., Sparks, R., Handley, M., and E.
611	              Schooler, "SIP: Session Initiation Protocol", RFC 3261,
612	              June 2002.

614	   [RFC3552]  Rescorla, E. and B. Korver, "Guidelines for Writing RFC
615	              Text on Security Considerations", BCP 72, RFC 3552,
616	              July 2003.

618	   [RFC3760]  Gustafson, D., Just, M., and M. Nystrom, "Securely
619	              Available Credentials (SACRED) - Credential Server
620	              Framework", RFC 3760, April 2004.

622	   [RFC4251]  Ylonen, T. and C. Lonvick, "The Secure Shell (SSH)
623	              Protocol Architecture", RFC 4251, January 2006.

625	   [RFC4347]  Rescorla, E. and N. Modadugu, "Datagram Transport Layer
626	              Security", RFC 4347, April 2006.

628	   [RFC4568]  Andreasen, F., Baugher, M., and D. Wing, "Session
629	              Description Protocol (SDP) Security Descriptions for Media
630	              Streams", RFC 4568, July 2006.

632	   [RFC5245]  Rosenberg, J., "Interactive Connectivity Establishment
633	              (ICE): A Protocol for Network Address Translator (NAT)
634	              Traversal for Offer/Answer Protocols", RFC 5245,
635	              April 2010.

637	   [RFC5479]  Wing, D., Fries, S., Tschofenig, H., and F. Audet,
638	              "Requirements and Analysis of Media Security Management
639	              Protocols", RFC 5479, April 2009.

641	   [RFC5763]  Fischl, J., Tschofenig, H., and E. Rescorla, "Framework
642	              for Establishing a Secure Real-time Transport Protocol
643	              (SRTP) Security Context Using Datagram Transport Layer
644	              Security (DTLS)", RFC 5763, May 2010.

646	   [RFC6189]  Zimmermann, P., Johnston, A., and J. Callas, "ZRTP: Media
647	              Path Key Agreement for Unicast Secure RTP", RFC 6189,
648	              April 2011.

650	   [abarth-rtcweb]
651	              Barth, A., "Prompting the user is security failure",  RTC-
652	              Web Workshop.

654	   [cranor-wolf]
655	              Sunshine, J., Egelman, S., Almuhimedi, H., Atri, N., and
656	              L. Cranor, "Crying Wolf: An Empirical Study of SSL Warning
657	              Effectiveness",  Proceedings of the 18th USENIX Security
658	              Symposium, 2009.

660	   [farus-conversion]
661	              Farrus, M., Erro, D., and J. Hernando, "Speaker
662	              Recognition Robustness to Voice Conversion".

664	   [finer-grained]
665	              Barth, A. and C. Jackson, "Beware of Finer-Grained
666	              Origins",  W2SP, 2008.

668	   [huang-w2sp]
669	              Huang, L-S., Chen, E., Barth, A., Rescorla, E., and C.
670	              Jackson, "Talking to Yourself for Fun and Profit",  W2SP,
671	              2011.

673	   [kain-conversion]
674	              Kain, A. and M. Macon, "Design and Evaluation of a Voice
675	              Conversion Algorithm based on Spectral Envelope Mapping
676	              and Residual Prediction",  Proceedings of ICASSP, May
677	              2001.

679	   [whitten-johnny]
680	              Whitten, A. and J. Tygar, "Why Johnny Can't Encrypt: A
681	              Usability Evaluation of PGP 5.0",  Proceedings of the 8th
682	              USENIX Security Symposium, 1999.

684	Author's Address

686	   Eric Rescorla
687	   RTFM, Inc.
688	   2064 Edgewood Drive
689	   Palo Alto, CA  94303
690	   USA

692	   Phone:  +1 650 678 2350
693	   Email:  ekr@rtfm.com