2 MMUSIC M. Thomson 3 Internet-Draft Microsoft 4 Intended status: Standards Track October 19, 2013 5 Expires: April 22, 2014
7 Using Interactive Connectivity Establishment (ICE) in Web Real-Time 8 Communications (WebRTC) 9 draft-thomson-mmusic-ice-webrtc-01
11 Abstract
13 Interactive Connectivity Establishment (ICE) has been selected as the 14 basis for establishing peer-to-peer UDP flows between Web Real-Time 15 Communication (WebRTC) clients. Using an unmodified ICE 16 implementation in this context enables the use of the web platform as 17 a denial of service platform. The risks and complications arising 18 from this choice are discussed. A modified algorithm for sending ICE 19 connectivity checks from the web platform is described.
21 Status of This Memo
23 This Internet-Draft is submitted in full conformance with the 24 provisions of BCP 78 and BCP 79.
26 Internet-Drafts are working documents of the Internet Engineering 27 Task Force (IETF). Note that other groups may also distribute 28 working documents as Internet-Drafts. The list of current Internet- 29 Drafts is at http://datatracker.ietf.org/drafts/current/.
31 Internet-Drafts are draft documents valid for a maximum of six months 32 and may be updated, replaced, or obsoleted by other documents at any 33 time. It is inappropriate to use Internet-Drafts as reference 34 material or to cite them other than as "work in progress." 36 This Internet-Draft will expire on April 22, 2014. 38 Copyright Notice 40 Copyright (c) 2013 IETF Trust and the persons identified as the 41 document authors. All rights reserved. 43 This document is subject to BCP 78 and the IETF Trust's Legal 44 Provisions Relating to IETF Documents 45 (http://trustee.ietf.org/license-info) in effect on the date of 46 publication of this document. Please review these documents 47 carefully, as they describe your rights and restrictions with respect 48 to this document. Code Components extracted from this document must 49 include Simplified BSD License text as described in Section 4.e of 50 the Trust Legal Provisions and are provided without warranty as 51 described in the Simplified BSD License. 53 Table of Contents 55 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 56 1.1. Conventions and Terminology . . . . . . . . . . . . . . . 4 57 2. ICE in a Web Browser . . . . . . . . . . . . . . . . . . . . 4 58 2.1. Factors Influencing DoS Capacity . . . . . . . . . . . . 4 59 2.1.1. Pacing of Connectivity Checks . . . . . . . . . . . . 5 60 2.1.2. Retransmission of Connectivity Checks . . . . . . . . 5 61 2.1.3. Connectivity Check Size . . . . . . . . . . . . . . . 6 62 2.2. Denial of Service Magnitude . . . . . . . . . . . . . . . 6 63 3. Modified ICE Algorithm . . . . . . . . . . . . . . . . . . . 7 64 3.1. Trickled and Peer Reflexive Candidates . . . . . . . . . 9 65 3.2. Multiple ICE Agents . . . . . . . . . . . . . . . . . . . 10 66 3.2.1. Introducing Artificial Contention . . . . . . . . . . 11 67 3.2.2. Origin-First Round-Robin . . . . . . . . . . . . . . 11 68 3.2.3. Inter-Agent Candidate Pair Freezing . . . . . . . . . 11 69 3.2.4. Delayed ICE Agent Start . . . . . . . . . . . . . . . 12 70 4. Further Reducing the Impact of Attacks . . . . . . . . . . . 12 71 4.1. Bandwidth Rate Limiting . . . . . . . . . . . . . . . . . 12 72 4.2. Malicious Application Penalties . . . . . . . . . . . . . 13 73 4.3. Limited Concurrent Access to ICE . . . . . . . . . . . . 13 74 5. Negotiating Algorithm Use . . . . . . . . . . . . . . . . . . 13 75 6. Security Considerations . . . . . . . . . . . . . . . . . . . 14 76 7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 14 77 8. References . . . . . . . . . . . . . . . . . . . . . . . . . 14 78 8.1. Normative References . . . . . . . . . . . . . . . . . . 14 79 8.2. Informative References . . . . . . . . . . . . . . . . . 14 80 Appendix A. Defining Legitimate Uses of ICE . . . . . . . . . . 15 81 A.1. Candidate Pair Count . . . . . . . . . . . . . . . . . . 15 82 A.2. Connectivity Check Size . . . . . . . . . . . . . . . . . 16 83 A.3. Rate Calculations . . . . . . . . . . . . . . . . . . . . 16 84 A.4. Comparison: G.711 Audio . . . . . . . . . . . . . . . . . 16 85 A.5. Recommended Rate Limits . . . . . . . . . . . . . . . . . 17 86 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 17 88 1. Introduction 90 ICE [RFC5245] describes a process whereby peers establish a bi- 91 directional UDP flow. This process has been adopted for use in Web 92 Real-Time Communications (WebRTC) for establishing flows to and from 93 web browsers ([I-D.ietf-rtcweb-overview]). 
95 Properties of ICE are also critical to the security of WebRTC (see 96 Section 4.2.1 of [I-D.ietf-rtcweb-security]).
98 The design of RFC 5245 does not fully consider the threat models 99 enabled by the web environment. In particular, the following 100 assumptions are not valid in a web context:
102 o A one-time consent to communicate is sufficient, and revocation of 103 consent is not necessary.
105 o Signaling and control originates from actors that always operate 106 in good faith.
108 o Only one ICE processing context operates at any one time.
110 Implementations of ICE that are technically compliant with the 111 algorithm described in RFC 5245 potentially expose controls to web 112 applications that can be exploited.
114 In the web context, an attacker is able to provide code (usually 115 JavaScript) that is executed by hosts in a sandbox. The 116 protections of the sandbox are critical, both for protecting the host 117 running the sandbox, and for protecting the Internet as a whole from 118 bad actors.
120 The exposure of ICE features in the web browser could allow attackers 121 to generate denial of service (DoS) traffic far in excess of the 122 bandwidth needed to deploy the JavaScript. A small (1KB) file can 123 potentially generate many megabytes of connectivity checks in a short 124 period, representing an amplification factor far greater than other 125 similar amplification attacks (for instance, DNS reflection attacks).
127 Mounting this sort of DoS attack does not rely on anything other than 128 inducing a host to download and execute JavaScript. This is 129 generally easy to accomplish, making it possible to conscript 130 large numbers of traffic sources.
132 The issue regarding the one-time consent to communicate has already 133 been identified as a serious problem for WebRTC. 134 [I-D.muthu-behave-consent-freshness] describes a limit on the time 135 that consent remains valid, requiring that communications consent be 136 continuously refreshed.
138 This document first describes the characteristics of ICE as they 139 relate to the web and the way that these characteristics can be 140 exploited. In order to address the issues arising from allowing web 141 applications to initiate and control ICE processing, a modified 142 algorithm is described, plus additional measures that can be employed 143 to reduce the amount of traffic an attacker can produce.
145 1.1. Conventions and Terminology
147 In cases where normative language needs to be emphasized, this 148 document falls back on established shorthands for expressing 149 interoperability requirements on implementations: the capitalized 150 words "MUST", "MUST NOT", "SHOULD" and "MAY". The meaning of these 151 is described in [RFC2119].
153 2. ICE in a Web Browser
155 A web browser provides an API that applications can use to 156 instantiate and control an ICE agent. The web application is 157 responsible for providing the ICE agent with signaling that it might 158 need to operate successfully, as well as configuration information 159 regarding TURN [RFC5766] or STUN [RFC5389] servers.
161 In the web context, a browser treats the web application as being 162 potentially hostile, providing access to features in a controlled 163 fashion. Therefore, some of the information that an ICE agent might 164 depend on in other contexts has to be regarded as potentially suspect 165 when provided by a web application.
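   As an illustration, a web application supplies inputs along the
   following lines when it drives an ICE agent through the W3C
   RTCPeerConnection API.  This is a sketch only: the API shown
   postdates this document, and remoteSdp and remoteCandidate stand in
   for values received over the application's signaling channel.

      // Every value below originates from (potentially hostile)
      // application code: the TURN/STUN configuration, the remote
      // username fragment and password inside the SDP, and each
      // remote candidate that the application adds.
      const pc = new RTCPeerConnection({
        iceServers: [
          { urls: "turn:turn.example.net", username: "u", credential: "p" },
        ],
      });

      async function applySignaling(remoteSdp: string, remoteCandidate: string) {
        // The remote description carries the username fragment and
        // password that the ICE agent places in connectivity checks.
        await pc.setRemoteDescription({ type: "offer", sdp: remoteSdp });
        await pc.setLocalDescription(await pc.createAnswer());
        // Each additional remote candidate creates more candidate
        // pairs for the agent to check.
        await pc.addIceCandidate({ candidate: remoteCandidate, sdpMid: "0" });
      }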
167 2.1. Factors Influencing DoS Capacity
169 There are several parameters that affect the characteristics of DoS 170 attacks that can be mounted using ICE. These include:
172 o The number of candidate pairs that are created. An attacker can 173 add extra remote candidates to inflate this number to the maximum 174 supported. RFC 5245 recommends a default maximum of 100 candidate 175 pairs. Reducing this limit directly reduces DoS potential, though 176 it could affect success in some legitimate scenarios (see the 177 calculations in Appendix A).
179 o The time between consecutive connectivity checks. Pacing of 180 checks is discussed at length in Section 2.1.1.
182 o The total number and timing of retransmissions for each candidate 183 pair. Section 2.1.2 discusses the implications of 184 retransmissions.
186 o The size of connectivity check packets. Size considerations are 187 described in Section 2.1.3.
189 o The number of ICE agents that can be operated concurrently. RFC 190 5245 does not consider scenarios like WebRTC where multiple 191 agents can operate at the same time. The web security model 192 allows for cases where multiple agents can be created 193 concurrently, often with a further restriction that a browser not 194 leak information between agents.
196 2.1.1. Pacing of Connectivity Checks
198 ICE [RFC5245] describes a scheme for pacing connectivity checks. 199 There are two primary reasons that are cited:
201 o Pacing the initial connectivity checks for a given candidate pair 202 allows middleboxes sufficient time to establish bindings. 203 Empirical evidence suggests that failing to allow at least 20 204 milliseconds between initial connectivity checks risks the 205 bindings being dropped at some middleboxes.
207 o Pacing limits the potential for connectivity checks to generate 208 network congestion. Section 16.1 of [RFC5245] describes a formula 209 for calculating the time between connectivity checks (Ta) that is 210 based on the expected bandwidth of the real-time session that is 211 being established.
213 In the web context, information about the expected bandwidth used by 214 the session comes from the web application. Since the web 215 application has to be regarded as potentially malicious, information 216 about expected media bandwidth cannot be used to determine the pacing 217 of connectivity checks. A fixed minimum interval between 218 connectivity checks becomes the primary mechanism for limiting the 219 ability of web applications to generate packets that are potentially 220 congestion inducing.
222 Increasing the pacing interval directly reduces the amount of 223 congestion that connectivity checks can generate, though this only 224 reduces the peak bitrate that can be induced - the same amount of 225 traffic is generated over a longer period. The cost of this is 226 extended session setup times, an area where recent efforts have 227 focused on reducing delay.
229 2.1.2. Retransmission of Connectivity Checks
231 The initial retransmission timer (RTO) can also be increased with 232 similar effect to increasing the pacing timer. Furthermore, there is 233 a strong desire to reduce the recommended value of the RTO in ICE 234 from 500 milliseconds to values more reflective of common round trip 235 times in well-connected locations, which might be as low as 50 236 milliseconds.
238 More relevant is the total number of connectivity check 239 retransmissions that an implementation attempts for each candidate 240 pair.
Each additional retransmission directly increases the duration 241 and magnitude of a DoS attack. Following the exponential backoff 242 recommended by RFC 5245 does extend the time between retransmissions, 243 which could reduce the rate of connectivity checks after several 244 retransmissions, but this depends on the initial retransmission time 245 out (RTO). 247 Reducing the number of retransmissions has the effect of reducing the 248 probability of the check succeeding. The selection of a total 249 retransmission count is a trade-off of success rates against the 250 potential for abuse. 252 2.1.3. Connectivity Check Size 254 As currently specified, an attacker is only able to influence the 255 size of the USERNAME attribute. [RFC5389] restricts USERNAME to a 256 maximum size of 512 octets; the Session Description Protocol (SDP) 257 signaling described in [RFC5245] limits the size of the username 258 fragment an attacker can set to 256 bytes. 260 A browser could reduce its username fragment to as little as 4 bytes, 261 limiting the overall size of the attribute to 261 bytes. A small 262 username fragment does limit the collision resilience of the field, 263 which is a property that is important for detecting other forms of 264 attack (see Section 5.7.3 of [I-D.ietf-rtcweb-security-arch]). 266 There is also the potential for new modifications to ICE that 267 increase the packet size. For instance [I-D.martinsen-mmusic-malice] 268 provides an attacker with direct control over the bytes that are 269 included in connectivity checks. 271 2.2. Denial of Service Magnitude 273 A malicious application is able to influence connectivity checking by 274 altering the set of remote candidates and by changing the remote 275 username fragment. The default maximum sizes for remote username 276 fragment (256 bytes) and number of candidate pairs (100) described in 277 RFC 5245 can be exploited by an attacker to increase the number and 278 size of packets. Assuming an inter-check timer of the minimum of 20 279 milliseconds, plus a minimal 28 bytes of IPv4 and UDP overhead, this 280 results in an attacker being able to induce approximately 144kbps for 281 every ICE agent it is able to instantiate. 283 This rate is significantly higher than the minimal rate of 20kbps 284 that a typical compressed voice stream generates. By comparison, a 285 G.711 audio stream, which cannot be rate limited in response to 286 network congestion, but is generally regarded as safe to send to a 287 willing target, generates about 74kbps. 289 ICE does not allow for any congestion feedback (other than ECN 290 [RFC3168]), so this rate could conceivably be sustained for some 291 time, though after several seconds the time between retries 292 increases, reducing the check rate unless the application is able to 293 instantiate another ICE agent. 295 Some existing ICE implementations could generate about 3 or more 296 times the basic rate of connectivity checks over a short period. 297 These implementations do not pace retransmission of connectivity 298 checks, resulting in significantly higher connectivity check rates 299 during early rounds of retransmission. 301 These implementations are ignoring the advice on calculating a 302 minimum RTO from Section 16.1 of [RFC5245]. However, the shorter 303 RTO allows ICE to complete much faster, which is a significant 304 advantage. 
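   For reference, the arithmetic behind the approximately 144kbps
   figure above is straightforward.  The sketch below reproduces it;
   only the 256-byte remote username fragment, the 28 bytes of IPv4 and
   UDP overhead, and the 20 millisecond pacing floor come from the
   text, while the remaining STUN overhead is an assumed round number.

      // Rough reproduction of the per-agent peak rate.
      const usernameAttribute = 4 + 256 + 1 + 4; // TLV header, remote frag, ":", local frag
      const otherStunOverhead = 67;              // header, MESSAGE-INTEGRITY, FINGERPRINT,
                                                 // remaining attributes (assumed)
      const ipUdpOverhead = 28;                  // IPv4 + UDP
      const bytesPerCheck = usernameAttribute + otherStunOverhead + ipUdpOverhead; // 360
      const checksPerSecond = 1000 / 20;         // one check per 20 ms
      const kbps = (bytesPerCheck * checksPerSecond * 8) / 1000;  // 144 kbps per ICE agent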
306 Implementations that do not limit the number of ICE agents that can 307 be instantiated, and that fail to enforce rate limits 308 globally, create a further multiplicative factor on the basic rate.
310 3. Modified ICE Algorithm
312 This section describes an algorithm that ensures proper global pacing 313 of connectivity checks. This limits the ability of any single 314 attacker to generate a high rate of connectivity checks. This only 315 limits the peak data rate that results from connectivity checks, 316 reducing the intensity of DoS attacks.
318 Measures that reduce the overall duration of attacks are described in 319 Section 4.
321 The modified algorithm for ICE does not alter the way that candidate 322 pairs are selected, prioritized, frozen or signaled. It only affects 323 the generation of connectivity checks. This algorithm affects 324 candidate pairs in either of the "Waiting" or "In-Progress" states 325 only (see Section 5.7.4 of [RFC5245]).
327 The ICE agent maintains two queues for candidate pairs.
329 waiting queue: The first is a prioritized list of candidate pairs in 330 the "Waiting" state. The waiting queue is simply a prioritized 331 list of all the candidate pairs in the check list (see Section 5.7 332 of [RFC5245]) that are in the "Waiting" state. As candidate pairs 333 enter the "Waiting" state, they are added to the waiting queue. 334 As each candidate pair is added, it is prioritized relative to all 335 the other candidate pairs in the waiting queue.
337 check queue: The second is for outstanding connectivity checks. 338 Each entry in this list represents a connectivity check for a 339 given candidate pair. Each entry also includes a counter 340 representing the number of connectivity checks that have been sent 341 on this candidate pair.
343 The ICE agent maintains two types of timer: a pacing timer and a 344 retransmission timer. There is only one pacing timer, though there 345 can be multiple retransmission timers running concurrently.
347 The first candidate pair that arrives in the waiting queue starts the 348 pacing timer. The pacing timer runs as long as there are items in 349 any queue, ending if the timer expires when there are no entries in 350 either queue. The pacing timer resumes if an entry is added to 351 either queue and the timer is not already running.
353 Each time the pacing timer expires, the ICE agent performs the 354 following steps:
356 1. If there are items on the waiting queue, but no items on the 357 check queue, the first candidate pair is taken from the waiting 358 queue.
360 a. The candidate pair transitions from "Waiting" to "In- 361 Progress".
363 b. A check counter is associated with the candidate pair, 364 initialized with a zero value.
366 c. The candidate pair is added to the check queue. This could 367 result in a connectivity check being sent immediately if the 368 check queue is currently empty.
370 2. If there are items in the check queue, the ICE agent removes the 371 first item and performs a connectivity check on the identified 372 candidate pair.
374 a. The check counter associated with the candidate pair is 375 incremented by one.
377 b. Based on the value of the check counter, a retransmission 378 timer is scheduled for the candidate pair. The 379 retransmission timer is not scheduled if the check counter 380 exceeds the maximum number of checks configured for the ICE 381 agent.
383 c. If the retransmission timer expires without the connectivity 384 check succeeding, the candidate pair is returned to the end 385 of the check queue along with the higher check counter.
387 d. The retransmission timer is cancelled if the connectivity 388 check succeeds. The process for handling successful checks 389 in Section 7.1.3.2 of [RFC5245] is followed.
391 3. If no connectivity checks were sent, the pacing timer is stopped.
393 An important characteristic of this algorithm is that it - as much as 394 possible - prefers retransmission of connectivity checks over the 395 initiation of new connectivity checks. This ensures that once an 396 initial connectivity check has established any necessary middlebox 397 bindings, subsequent retries are not delayed excessively, which could 398 cause the binding to time out. However, the global pacing can cause 399 the time between retransmission of connectivity checks to be extended 400 as the check queue occasionally fills.
402 Favoring retransmission over initial checks directly contradicts the 403 guidance on RTO selection in Section 16.1 of [RFC5245]. This is 404 necessary due to the delays induced by potential interactions between 405 multiple ICE agents, which might otherwise cause retries to be 406 significantly delayed. Improvements to candidate prioritization are 407 expected to reduce the impact of this change.
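   The following sketch shows one way the queues and timers described
   above might fit together for a single ICE agent.  It is illustrative
   only: the names are not defined by this document, STUN transmission
   and RTO selection are placeholders, and cancellation of the
   retransmission timer on success (step 2d) is elided.

      type PairState = "Waiting" | "In-Progress";

      interface CandidatePair {
        priority: number;      // 64-bit pair priority in a real implementation
        state: PairState;
        checkCount: number;
      }

      class PacedChecker {
        private waiting: CandidatePair[] = [];     // prioritized waiting queue
        private checkQueue: CandidatePair[] = [];  // outstanding checks (FIFO)
        private pacingTimer?: ReturnType<typeof setInterval>;

        constructor(private pacingMs = 20, private maxChecks = 5,
                    private rtoMs = 500) {}

        // Pairs entering the "Waiting" state are queued by priority;
        // the first arrival starts the pacing timer.
        addWaitingPair(pair: CandidatePair) {
          pair.state = "Waiting";
          this.waiting.push(pair);
          this.waiting.sort((a, b) => b.priority - a.priority);
          this.startPacingTimerIfStopped();
        }

        private onPacingTimer() {
          // Step 1: with nothing outstanding, promote the highest
          // priority waiting pair into the check queue.
          if (this.checkQueue.length === 0 && this.waiting.length > 0) {
            const pair = this.waiting.shift()!;
            pair.state = "In-Progress";
            pair.checkCount = 0;
            this.checkQueue.push(pair);
          }
          // Step 2: send one connectivity check for the first queued pair.
          const pair = this.checkQueue.shift();
          if (!pair) {
            // Step 3: nothing was sent, so the pacing timer stops.
            clearInterval(this.pacingTimer!);
            this.pacingTimer = undefined;
            return;
          }
          pair.checkCount += 1;                    // step 2a
          this.sendStunBindingRequest(pair);
          if (pair.checkCount < this.maxChecks) {  // step 2b
            setTimeout(() => this.onRetransmissionTimeout(pair), this.rtoMs);
          }
        }

        // Step 2c: an expired retransmission timer requeues the pair.
        private onRetransmissionTimeout(pair: CandidatePair) {
          this.checkQueue.push(pair);
          this.startPacingTimerIfStopped();
        }

        private startPacingTimerIfStopped() {
          if (!this.pacingTimer) {
            this.pacingTimer = setInterval(() => this.onPacingTimer(), this.pacingMs);
          }
        }

        private sendStunBindingRequest(pair: CandidatePair) { /* elided */ }
      }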
409 3.1. Trickled and Peer Reflexive Candidates
411 Trickled ICE candidates [I-D.ivov-mmusic-trickle-ice] generate 412 candidate pairs after connectivity checking has commenced. In order 413 to avoid trickled candidates negatively affecting the chances of a 414 connectivity check succeeding, connectivity checks on newly appearing 415 candidate pairs must be prioritized below any existing connectivity 416 check.
418 Trickled candidates are in many respects identical to peer reflexive 419 candidates. Both arrive after the algorithm has commenced.
421 In either case, as new candidates arrive (or are discovered), they 422 are paired as normal (Section 5.7.1 of [RFC5245]), and - if 423 appropriate - entered into the "Waiting" state. This causes the 424 candidate pair to enter the waiting queue. Candidate pairs in the 425 waiting queue are not ordered based on arrival time; they are ordered 426 based on priority alone.
428 Trickling regular candidates does introduce the potential for a 429 mismatch in the ordering of candidate pairs between peers, since 430 trickled candidates will appear on the sending side well before the 431 receiving side can act upon them, resulting in the sending peer 432 potentially commencing checks much earlier than the receiving peer. 433 This is particularly important given the possibility that 434 retransmissions of connectivity checks can block the progress of a 435 candidate pair from the "Waiting" state into the "In-Progress" state, 436 resulting in potentially large differences in the commencement time 437 for any given candidate pair.
439 A trickle ICE implementation MAY choose not to immediately enqueue 440 local candidates as they are discovered to allow some time for 441 trickle signaling to propagate in order to increase the probability 442 that checks remain synchronized.
444 3.2. Multiple ICE Agents
446 In a system that has potentially more than one ICE agent, it is 447 important that connectivity checks from any given ICE agent cannot be 448 blocked or starved by other ICE agents. It is also important that an 449 attacker is unable to circumvent any limits by instantiating multiple 450 ICE agents.
452 To that end, a single pacing timer is maintained globally whenever 453 multiple ICE agents are operated. Each time the pacing timer fires, 454 the global context selects ICE agents in a round-robin fashion. In 455 addition to ensuring a global rate limit, this selection method 456 ensures that no single ICE agent is completely starved.
458 In a shared context, ICE agents do not stop or start the pacing timer 459 unless they are the first or last ICE agent to be active. The first 460 ICE agent to commence checking starts the global timer; the last ICE 461 agent to cancel the timer causes the global timer to be cancelled. 462 At all other times, "starting" the pacing timer for an ICE agent 463 simply adds the ICE agent to the set of agents that can be selected; 464 "stopping" the pacing timer removes the ICE agent from the set of ICE 465 agents that are in consideration.
467 A global pacing timer causes each individual ICE agent to execute 468 checks more slowly than a lone ICE agent would. Where there are many 469 candidate pairs to test, this could have a negative impact on the 470 synchronization of checks between peers, which in turn 471 reduces success rates. Peers with asymmetric 472 contention can have lower priority candidate pairs started on the 473 less contended peer long before the contended peer is able to 474 commence checking, which can result in those checks failing.
476 Several measures are suggested for mitigating the impact of 477 contention: artificial contention, origin-first distribution, inter- 478 agent candidate pair freezing, and delayed start. However, it is 479 important to note that similar artificial constraints have 480 classically been quickly circumvented on the web if they have overly 481 negative performance consequences.
483 3.2.1. Introducing Artificial Contention
485 In cases where there is zero contention, artificial contention can be 486 introduced to ensure a certain minimum effective pacing timer. In 487 effect, this would increase the basic pacing timer from 20ms by a 488 minimum multiple for any single ICE agent. Artificial contention 489 results in intervals during which no checks are sent at all, 490 spacing out genuine connectivity checks.
492 For instance, contention could be increased to a minimum of 3 ICE 493 agents. Assuming a 20ms basic interval, the first ICE agent would be 494 able to send connectivity checks every 60ms, as though it were 495 contending with two other ICE agents. Adding another ICE agent would 496 have no effect on this rate. It would only be if a fourth ICE agent 497 were added that all ICE agents would be reduced to sending checks at 498 80ms intervals.
500 This has the advantage of ensuring that a lightly contended client 501 has the same rate of checking as a client with only a small number of 502 ICE agents, so that checks are more likely to be synchronized.
504 3.2.2. Origin-First Round-Robin
506 In a system such as a browser, there are potentially competing 507 interests sharing the same limited resources. In this type of 508 context, each competing user - in the browser, this is an origin 509 [RFC6454] - can first be selected using a round-robin or similar 510 allocation scheme.
512 Thus, as a first step, selection is performed from the set of origins 513 that have an active ICE agent. Once an origin is selected, agents 514 are selected from within that origin. This ensures that no single 515 origin can receive more than a proportional share of the access to 516 connectivity checking.
518 This is particularly important if multiple users (or origins) are 519 each able to create multiple ICE agents. Selecting based on users 520 first prevents a single origin from monopolizing access to 521 connectivity checks.
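   A sketch of how the shared pacing timer might combine origin-first
   selection with the artificial contention floor from Section 3.2.1
   follows.  The class, method names and data structures are
   illustrative rather than defined by this document; the per-agent
   behavior invoked on each tick is the algorithm from Section 3.

      interface IceAgentLike { onPacingTimer(): void; }

      class GlobalPacer {
        private byOrigin = new Map<string, IceAgentLike[]>();
        private agentCursor = new Map<string, number>();
        private originCursor = 0;
        private tick = 0;

        constructor(private minContention = 3) {}

        register(origin: string, agent: IceAgentLike) {
          const list = this.byOrigin.get(origin) ?? [];
          list.push(agent);
          this.byOrigin.set(origin, list);
        }

        // Called every 20 ms while at least one agent is registered.
        onPacingTimer() {
          const origins = [...this.byOrigin.keys()];
          const totalAgents = [...this.byOrigin.values()]
            .reduce((n, list) => n + list.length, 0);
          if (totalAgents === 0) return;

          // Artificial contention: with fewer than `minContention`
          // agents, some ticks deliberately send nothing, so a lone
          // agent still checks only every 60 ms.
          const slots = Math.max(this.minContention, totalAgents);
          if ((this.tick++ % slots) >= totalAgents) return;

          // Origin-first round-robin, then round-robin within the
          // selected origin, so no origin exceeds a proportional share.
          const origin = origins[this.originCursor++ % origins.length];
          const agents = this.byOrigin.get(origin)!;
          const cursor = this.agentCursor.get(origin) ?? 0;
          this.agentCursor.set(origin, cursor + 1);
          agents[cursor % agents.length].onPacingTimer();
        }
      }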
523 3.2.3. Inter-Agent Candidate Pair Freezing
524 In some cases, it might be necessary to instantiate multiple ICE 525 agents from the same application, between the same two peers. An ICE 526 agent MAY place candidate pairs in the "Frozen" state based on 527 candidate pairs with the same foundation being "Waiting" or "In- 528 Progress" on another ICE agent. This reduces the overall demand for 529 connectivity checks without any significant negative effect on the 530 chances that ICE succeeds.
532 In the browser context, information about the success of connectivity 533 checks must not be allowed to leak between different origins. Such 534 leakage could reveal activities on another tab, violating 535 the origin security model of the browser. Thus, any inter-agent 536 freezing logic MUST be constrained to ICE agents that operate in the 537 same origin.
539 3.2.4. Delayed ICE Agent Start
541 In cases where there is high contention for access to connectivity 542 checking, it might be preferable to delay the start of connectivity 543 checks for an ICE agent rather than have the effective pacing timer 544 increased.
546 4. Further Reducing the Impact of Attacks
548 A global pacing timer allows a web application to determine whether 549 another domain is currently establishing an ICE transport, simply by 550 observing the pacing of connectivity checks that it requests. 551 Section 3.2.1 describes a method that allows a limited number of ICE 552 agents to operate without being detectable.
554 The algorithm and the measures it describes are based on an 555 assumption that ICE agents are created legitimately. Even with these 556 measures, it is possible to generate a steady amount of bandwidth 557 toward arbitrary hosts. The remainder of this section is dedicated 558 to additional measures that might be employed to reduce the ability 559 of malicious users to generate unwanted connectivity checks over 560 time.
562 4.1. Bandwidth Rate Limiting
564 A measure of the bandwidth generated by connectivity checks can be 565 maintained, on both a global and a per-origin basis. As this number 566 increases, the browser can reduce the rate of connectivity checks. 567 This reduction might be achieved either by increasing the duration of 568 the pacing timer or by skipping occasional connectivity checks.
570 Appendix A includes some simple calculations and recommendations on 571 what might be appropriate limits to set on the bandwidth used by 572 connectivity checks.
574 4.2. Malicious Application Penalties
576 An attacker that only wishes to generate traffic is unlikely to 577 provide valid candidates for two reasons:
579 o a successful connectivity check is likely to cause the ICE agent 580 to terminate further checking
582 o serving connectivity checks requires the dedication of greater 583 resources by the attacker
585 A long sequence of unsuccessful connectivity checks is therefore a 586 likely indicator of an attack. An ICE agent could choose to reduce 587 the rate at which connectivity checks are generated for an 588 application that has a large number of failed checks.
590 Any measure that penalizes unsuccessful checks will have to allow 591 for some failures. Even legitimate uses of ICE can result in 592 significant numbers of failed connectivity checks. For instance, an 593 implementation that exclusively prioritizes IPv6 over IPv4 on a 594 network with broken IPv6 will legitimately see a large number of 595 failures. Similarly, if a remote peer is behind a NAT, prior to the 596 commencement of checking by that peer all connectivity checks are 597 likely to be discarded by the NAT.
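   One way to express such a penalty while still tolerating legitimate
   failure patterns is sketched below.  The sample size, the failure
   threshold and the scaling factor are illustrative assumptions, not
   recommendations.

      // Per-origin failure tracking for connectivity checks.
      class FailurePenalty {
        private sent = 0;
        private succeeded = 0;

        record(success: boolean) {
          this.sent += 1;
          if (success) this.succeeded += 1;
        }

        // Multiplier applied to the pacing interval for this origin.
        pacingMultiplier(minSample = 50, allowedFailureRatio = 0.9): number {
          if (this.sent < minSample) return 1;      // too little data to judge
          const failureRatio = 1 - this.succeeded / this.sent;
          // Only slow an origin whose checks almost never succeed,
          // leaving room for broken IPv6 or a peer behind a NAT.
          return failureRatio > allowedFailureRatio ? 4 : 1;
        }
      }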
599 4.3. Limited Concurrent Access to ICE
601 Setting an absolute maximum on the number of ICE agents that can be 602 instantiated could overly constrain legitimate applications that 603 depend on having multiple active sessions. However, limiting 604 concurrent access to active ICE agents by delaying the start of 605 connectivity checking, as described in Section 3.2.4, might allow an 606 implementation to reduce the ability of a single origin to generate 607 unwanted connectivity checks.
609 5. Negotiating Algorithm Use
611 The algorithm defined in Section 3 could cause some ICE agents to 612 perform checks in a very different order than an 613 unmodified ICE agent would. Failing to coordinate when checks occur 614 reduces the probability that ICE is successful.
616 TODO: Determine whether an ice-options token that enables negotiation 617 of this algorithm is appropriate, or whether something more 618 definitive is required, since an answerer could negotiate an ice- 619 options token away. Note that WebRTC implementations probably will 620 not be able to accept a session that does not use this algorithm.
622 6. Security Considerations
624 This entire document is about security.
626 7. Acknowledgements
628 The bulk of the algorithm described in this document came out of a 629 discussion with Emil Ivov and Pal-Erik Martinsen. Eric Rescorla and 630 Bernard Aboba provided some feedback regarding the DoS considerations 631 and possible mitigations.
633 8. References
635 8.1. Normative References
637 [I-D.ivov-mmusic-trickle-ice] 638 Ivov, E., Rescorla, E., and J. Uberti, "Trickle ICE: 639 Incremental Provisioning of Candidates for the Interactive 640 Connectivity Establishment (ICE) Protocol", draft-ivov- 641 mmusic-trickle-ice-01 (work in progress), March 2013.
643 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 644 Requirement Levels", BCP 14, RFC 2119, March 1997.
646 [RFC5245] Rosenberg, J., "Interactive Connectivity Establishment 647 (ICE): A Protocol for Network Address Translator (NAT) 648 Traversal for Offer/Answer Protocols", RFC 5245, April 649 2010.
651 8.2. Informative References
653 [I-D.ietf-rtcweb-overview] 654 Alvestrand, H., "Overview: Real Time Protocols for Browser- 655 based Applications", draft-ietf-rtcweb-overview-08 (work 656 in progress), September 2013.
658 [I-D.ietf-rtcweb-security-arch] 659 Rescorla, E., "WebRTC Security Architecture", draft-ietf- 660 rtcweb-security-arch-07 (work in progress), July 2013.
662 [I-D.ietf-rtcweb-security] 663 Rescorla, E., "Security Considerations for WebRTC", draft- 664 ietf-rtcweb-security-05 (work in progress), July 2013.
666 [I-D.martinsen-mmusic-malice] 667 Penno, R., Martinsen, P., Wing, D., and A. Zamfir, "Meta- 668 data Attribute signaLling with ICE", draft-martinsen- 669 mmusic-malice-00 (work in progress), July 2013.
671 [I-D.muthu-behave-consent-freshness] 672 Perumal, M., Wing, D., R, R., and T.
Reddy, "STUN Usage 673 for Consent Freshness", draft-muthu-behave-consent- 674 freshness-04 (work in progress), July 2013. 676 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 677 of Explicit Congestion Notification (ECN) to IP", RFC 678 3168, September 2001. 680 [RFC5389] Rosenberg, J., Mahy, R., Matthews, P., and D. Wing, 681 "Session Traversal Utilities for NAT (STUN)", RFC 5389, 682 October 2008. 684 [RFC5766] Mahy, R., Matthews, P., and J. Rosenberg, "Traversal Using 685 Relays around NAT (TURN): Relay Extensions to Session 686 Traversal Utilities for NAT (STUN)", RFC 5766, April 2010. 688 [RFC6454] Barth, A., "The Web Origin Concept", RFC 6454, December 689 2011. 691 Appendix A. Defining Legitimate Uses of ICE 693 Limiting the bandwidth generated by connectivity checks depends on 694 knowing how much ICE could use under normal circumstances. This 695 ensures any absolute limit doesn't adversely affect a legitimate use 696 of ICE. 698 Any calculation should allow for slightly abnormal configurations 699 that might generate higher than average data rates. Otherwise, an 700 average might adversely affect legitimate users. The intent is to 701 avoid having legitimate uses concerned with the limit. 703 A.1. Candidate Pair Count 705 Our sample legitimate user has 2 local network interfaces. This can 706 result in as many as 14 candidates, 8 of them IPv4 plus 6 IPv6. Each 707 interface has 1 IPv4 address, an IPv6 address, plus a link-local IPv6 708 address. Assuming a different public IPv4 NAT address for each 709 interface and IP version (using either NAT4-4 or NAT6-4 as 710 appropriate) other than the link local addresses, this adds another 4 711 addresses. In addition to this, two TURN servers might be contacted 712 by either IPv4 or IPv6, providing 4 more addresses. 714 Two peers with this configuration will generate 100 candidate pairs, 715 since only IPv4 candidates are paired with IPv4 candidates. 717 Assuming that all candidates are checked once before ICE completes on 718 a second round of checks, there are in excess of 100 connectivity 719 checks sent. Even at the fastest permitted pacing, this means that 720 ICE completes in at least 2 seconds, plus the round trip time. 722 A.2. Connectivity Check Size 724 The STUN message used for a connectivity check can vary, but making 725 some reasonable assumptions, it is likely to be 149 or 169 bytes on 726 the wire (plus network layer encapsulation). This makes the 727 following assumptions: 729 IP Header: 20 bytes (IPv4) or 40 bytes (IPv6) with no extensions 731 UDP Header: 8 bytes 733 STUN Header: 20 bytes 735 USE-CANDIDATE Attribute: 4 bytes 737 CONTROLLED or CONTROLLING Attribute: 4 bytes 739 PRIORITY Attribute: 4 bytes 741 MESSAGE-INTEGRITY Attribute: 24 bytes 743 FINGERPRINT Attribute: 8 bytes 745 USER Attribute: 49 bytes carries two 20 character username fragments 747 A.3. Rate Calculations 749 Assuming a 150 byte connectivity check and a global pacing timer of 750 20ms, this produces 60kbps at peak (68kpbs for IPv6). 752 For 100 candidate pairs, with at most 5 connectivity checks on each 753 pair, this peak could be sustained for 10 seconds by a single ICE 754 agent. 756 The question is: is this a tolerable rate? 758 A.4. Comparison: G.711 Audio 760 G.711 audio is commonly used without any congestion feedback 761 mechanisms in place - primarily because it is unflexible and unable 762 to scale its network usage in response to congestion signals. 
790 Author's Address
792 Martin Thomson 793 Microsoft 794 3210 Porter Drive 795 Palo Alto, CA 94304 796 US
798 Email: martin.thomson@skype.net