2	RTCWeb                                                        M. Isomaki
3	Internet-Draft                                                     Nokia
4	Intended status: Standards Track                            July 9, 2012
5	Expires: January 10, 2013

7	                RTCweb Considerations for Mobile Devices
8	                     draft-isomaki-rtcweb-mobile-00

10	Abstract

12	   Web Real-time Communications (WebRTC) aims to provide web-based
13	   applications real-time and peer-to-peer communication capabilities.
14	   In many cases those applications are run in mobile devices connected
15	   to different types of mobile networks.  This document gives an
16	   overview of the issues and challenges in implementing and deploying
17	   WebRTC in mobile environments.  It also gives guidance on how to
18	   overcome those challenges.

20	Status of this Memo

22	   This Internet-Draft is submitted in full conformance with the
23	   provisions of BCP 78 and BCP 79.

25	   Internet-Drafts are working documents of the Internet Engineering
26	   Task Force (IETF).  Note that other groups may also distribute
27	   working documents as Internet-Drafts.  The list of current Internet-
28	   Drafts is at http://datatracker.ietf.org/drafts/current/.

30	   Internet-Drafts are draft documents valid for a maximum of six months
31	   and may be updated, replaced, or obsoleted by other documents at any
32	   time.  It is inappropriate to use Internet-Drafts as reference
33	   material or to cite them other than as "work in progress."

35	   This Internet-Draft will expire on January 10, 2013.

37	Copyright Notice

39	   Copyright (c) 2012 IETF Trust and the persons identified as the
40	   document authors.  All rights reserved.

42	   This document is subject to BCP 78 and the IETF Trust's Legal
43	   Provisions Relating to IETF Documents
44	   (http://trustee.ietf.org/license-info) in effect on the date of
45	   publication of this document.  Please review these documents
46	   carefully, as they describe your rights and restrictions with respect
47	   to this document.  Code Components extracted from this document must
48	   include Simplified BSD License text as described in Section 4.e of
49	   the Trust Legal Provisions and are provided without warranty as
50	   described in the Simplified BSD License.

52	Table of Contents

54	   1.  Introduction
55	   2.  Common mobile networks and their properties
56	   3.  Specific issues and how to deal with them
57	     3.1.  Persistent connectivity to the Calling Site
58	     3.2.  Media and Data channels
59	     3.3.  Recovery from interface switching
60	     3.4.  Congestion avoidance
61	   4.  Security Considerations
62	   5.  Acknowledgements
63	   6.  References
64	   Author's Address

66	1.  Introduction

68	   Web Real-time Communications (WebRTC) provides web-based applications
69	   real-time and peer-to-peer communication capabilities.  The
70	   applications can set up communication sessions that carry audio,
71	   video, or any application-specific data.  To be reachable for
72	   incoming session setups or other messages, the applications must
73	   keep persistent connectivity with their "calling site".

75	   In the last few years, mobile devices, such as smartphones or
76	   tablets, have become relatively powerful in terms of processing and
77	   memory.  Their browsers are becoming close to their desktop
78	   counterparts.  So, from that perspective, it is feasible to run
79	   WebRTC applications on them.  However, power consumption and the
80	   highly diverse nature of connectivity remain specific challenges.
81	   A lot of work is being done to address these challenges in, e.g.,
82	   radio technologies and hardware components, but by far the most
83	   important factor is still how the applications, protocols, and
84	   application programming interfaces are designed.

86	   Section 2 of this document gives an overview of the characteristics
87	   of different mobile networks as background for further discussion.
88	   Section 3 introduces the specific issues that WebRTC protocols and
89	   applications should take into consideration to be mobile-friendly.

91	   The current version of the document lacks all references and a lot
92	   of details.  It may contain some errors.  Its purpose is to draw
93	   attention to the topics it raises and start discussion about them.

95	2.  Common mobile networks and their properties

97	   The most relevant mobile networks for WebRTC at the moment are Wi-Fi
98	   and the different variants of cellular technologies.

100	   Many characteristics of the cellular networks are covered in Section
101	   3 in the context of the particular issue under discussion.  The
102	   following is a very brief description of the power consumption
103	   related properties of WCDMA/HSPA networks.  The details vary, but
104	   similar principles apply to other cellular networks, at least GPRS/
105	   EDGE and LTE.

107	   In simplified terms, the WCDMA/HSPA radio can be in three different
108	   types of states: a power-save state (IDLE, Cell_PCH, URA_PCH), a
109	   shared channel state (Cell_FACH), or a dedicated channel state
110	   (Cell_DCH).  The power-save states consume about two orders of
111	   magnitude less power than the dedicated channel state, while the
112	   shared channel state is somewhere in between.  The state machine
113	   works so that if a device has only small packets (up to ~200-500
114	   bytes) to send or receive, it is allocated a shared channel that
115	   operates at a low data rate.  If there is more traffic (even a
116	   single full-size IP packet), a dedicated channel is allocated.
117	   Starting from the power-save state, the channel allocation
118	   typically takes between 0.5 and 2 seconds, depending on the network
119	   and the exact power-save state.  Only after that is the first
120	   packet actually sent.  If two cellular devices were to exchange
121	   packets with each other starting from the power-save state, the
122	   initial IP-level RTT could easily be 3-4 seconds.

124	   The channel is kept for some time after the last packet has been sent
125	   or received.  The dedicated channel drops to power-save via the
126	   shared channel.  The timers from dedicated to shared and from shared
127	   to power-save are network dependent, but typically between 5 and 30
128	   seconds.  So, in some networks sending a single ping every 30
129	   seconds is enough to keep the power consumption constantly at the
130	   maximum level, while in others the power-save state is entered much
131	   faster.  The total radio power consumption depends less on the
132	   overall volume of traffic than on how long a dedicated or shared
133	   channel is active.  For instance, a 1 kB keep-alive sent every 30
134	   seconds for an hour (~100 kB of traffic in total) consumes much more
135	   power (even an order of magnitude more) than a single 10 MB
136	   download, assuming the download finishes in a minute or two.
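
   As a rough illustration of this comparison, the following Python
   sketch counts how long the radio stays out of the power-save state
   for each traffic pattern.  It is a simplified model, not a
   measurement: the function name, the single 30-second inactivity
   timer, and the two-minute download time are assumptions picked from
   the ranges given above.

```python
def radio_active_seconds(bursts, tail_s):
    """Return seconds the radio spends outside the power-save state.

    bursts is a list of (start_s, duration_s) traffic bursts.  The model
    assumes (hypothetically) that the radio stays on a channel for
    tail_s seconds after the last packet of each burst.
    """
    total = 0.0
    window_start = window_end = None
    for start, duration in sorted(bursts):
        end = start + duration + tail_s
        if window_end is None or start > window_end:
            # Radio had already dropped to power-save: close the old window.
            if window_end is not None:
                total += window_end - window_start
            window_start, window_end = start, end
        else:
            # Radio was still on a channel: the burst extends the window.
            window_end = max(window_end, end)
    if window_end is not None:
        total += window_end - window_start
    return total


# One small keep-alive every 30 seconds for an hour (120 packets).
keepalives = [(t, 0.0) for t in range(0, 3600, 30)]
# One 10 MB download, assumed to finish in two minutes.
download = [(0, 120.0)]

keepalive_on = radio_active_seconds(keepalives, tail_s=30.0)
download_on = radio_active_seconds(download, tail_s=30.0)
print(keepalive_on, download_on, keepalive_on / download_on)
```

   Under these assumptions the keep-alive pattern keeps the radio on
   for the whole hour (3600 seconds), while the download needs about
   150 seconds of radio time.  With a shorter inactivity timer the gap
   narrows, which is why the result is network dependent.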

138	   Applications have no control over the radio states, but the
139	   operating system and the radio modem software can influence them.
140	   In newer specifications (and devices and networks) it is possible
141	   for the device to explicitly ask for the radio channel to be
142	   released immediately after the last packet.  For instance, if the
143	   device somehow knew that no new packets would be sent for some
144	   time, it could send such a request and save power.

146	   The bottom line is that applications and protocols should keep the
147	   intervals between traffic as long as possible, giving the radio as
148	   much low-power time as possible.  Intervals of more than a few
149	   seconds may help, and intervals longer than 30 seconds will
150	   definitely help.  On the other hand, the initial RTT after an idle
151	   interval will be long.  This issue is covered in Sections 3.1 and
152	   3.2.

154	   The second key characteristic of cellular networks is that they have
155	   long buffers and run the link layer in "acknowledged" mode, meaning
156	   all lost packets are retransmitted.  As a result, TCP traffic easily
157	   creates long queuing delays that ruin real-time traffic.  This is
158	   covered in Section 3.4.

160	   The third characteristic is that mobile devices often change networks
161	   on the fly, typically between cellular and Wi-Fi.  Most devices run
162	   only a single interface at a time.  From a networking perspective,
163	   this means that the device's IP address changes and, e.g., all its
164	   TCP connections are lost.  This is covered in Section 3.3.

166	3.  Specific issues and how to deal with them

168	3.1.  Persistent connectivity to the Calling Site

170	   Many WebRTC apps want to be reachable for incoming sessions (JSEP
171	   Offers) or other types of asynchronous messages.  For this purpose
172	   they need some kind of a persistent communication channel with their
173	   "Calling Site".  Two standard approaches for this are WebSockets and
174	   HTTP long-polling.  In both of these cases a TCP connection is used
175	   as the underlying transport.

177	   Most cellular networks have a firewall preventing incoming TCP
178	   connections, even when they allocate public IPv4 or IPv6 addresses.
179	   NATs are also becoming more common with the exhaustion of the IPv4
180	   address space.  The firewall and NAT timers for TCP can range between
181	   1 and 60 minutes, depending on the network.  To keep the TCP
182	   connection alive, the application needs to send some kind of keep-
183	   alive packets frequently enough to avoid the timeout.

185	   If the WebRTC app intends to run for long periods of time (even
186	   when the user is not actively interacting with it), it is of utmost
187	   importance to keep this keep-alive traffic as infrequent as possible.
188	   Every wake-up of the radio consumes a significant amount of power,
189	   even if it is needed just for sending and receiving a couple of IP
190	   packets.  It makes a huge difference whether there are, for instance,
191	   6 or 60 of these wake-ups every hour.  A naive application may simply
192	   send keep-alives frequently enough for all possible networks, which
193	   leads to unacceptable power consumption.  A smarter application
194	   will try to determine a suitable interval for the network it is
195	   using, and can save a lot of power in networks with longer timers.
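
   One way an application might determine a suitable interval is to
   search between a short interval that is known to work and a long one
   that is assumed to fail.  The Python sketch below is only an
   illustration of that idea: the class and function names, the bounds
   of 1 and 60 minutes (taken from the timer range mentioned above),
   and the 60-second precision are assumptions.  In a real application
   each failed probe costs a dropped connection, during which the app
   is unreachable, so the number of probes should stay small.

```python
from dataclasses import dataclass


@dataclass
class KeepAliveProbe:
    """Binary search for the longest keep-alive interval that works."""
    good_s: float = 60.0     # longest interval known to keep the connection up
    bad_s: float = 3600.0    # shortest interval assumed to exceed the timer
    precision_s: float = 60.0

    def next_interval(self):
        # Stop probing once the bounds are close; use the safe bound.
        if self.bad_s - self.good_s <= self.precision_s:
            return self.good_s
        return (self.good_s + self.bad_s) / 2

    def report(self, interval_s, survived):
        # Tighten whichever bound the last keep-alive round tested.
        if survived:
            self.good_s = max(self.good_s, interval_s)
        else:
            self.bad_s = min(self.bad_s, interval_s)


def settle(probe, network_timeout_s, max_rounds=20):
    """Simulate probing against a network with a fixed idle timer."""
    for _ in range(max_rounds):
        interval = probe.next_interval()
        if interval == probe.good_s:
            break
        probe.report(interval, survived=interval < network_timeout_s)
    return probe.next_interval()


interval = settle(KeepAliveProbe(), network_timeout_s=900.0)
print(round(interval), "s between keep-alives,",
      round(3600 / interval, 1), "wake-ups per hour")
```

   In this simulated network with a 15-minute timer the search settles
   just below the timer, at roughly four wake-ups per hour, instead of
   the 60 per hour that a fixed one-minute keep-alive would cause.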

197	   There are further strategies to manage the keep-alives so that they
198	   consume the least amount of power.  It is best to keep the keep-
199	   alive messages as small as possible.  HSPA/WCDMA networks have a
200	   special shared radio channel (FACH) that can carry small amounts of
201	   traffic.  Its power consumption is typically less than half that of
202	   the dedicated channel.  Depending on the network, a packet of a
203	   couple of hundred bytes will usually require only FACH, while a
204	   thousand-byte packet will activate the dedicated channel.  So, a
205	   WebSocket Ping/Pong is better than an HTTP POST or GET with all the
206	   cookies and other headers attached.  If there are multiple
207	   applications or connections to be kept alive, the browser or the
208	   underlying platform should synchronize their keep-alives, so that
209	   the radio is woken only once per cycle.
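
   The synchronization could, for example, round each connection's
   next keep-alive down to a shared time grid, so that keep-alives
   that fall due at about the same time are sent in the same radio
   wake-up.  The following Python sketch shows the idea under stated
   assumptions; the function name, the 60-second grid, and the example
   intervals are hypothetical and not part of any browser API.

```python
def aligned_send_time(last_sent_s, max_interval_s, tick_s):
    """Latest shared wake-up slot that still meets this connection's deadline."""
    deadline = last_sent_s + max_interval_s
    slot = (deadline // tick_s) * tick_s   # round down to the shared grid
    # If no grid slot falls after the last send, fall back to the deadline.
    return slot if slot > last_sent_s else deadline


# Three connections: (time of last keep-alive, longest safe interval), seconds.
connections = [(0, 300), (7, 320), (13, 340)]

unaligned = {last + interval for last, interval in connections}
aligned = {aligned_send_time(last, interval, tick_s=60)
           for last, interval in connections}
print(len(unaligned), "wake-ups without alignment,", len(aligned), "with")
```

   Rounding down rather than up means a keep-alive is never sent later
   than its own deadline, at the cost of sending some of them slightly
   early.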

211	   The most efficient approach would be to multiplex the initial
212	   incoming messages for all applications over the same TCP connection.
213	   This would require the use of some kind of gateway service in the
214	   network.  Such "notification" services are available on many
215	   platforms, but at the moment they are not typically available to
216	   browsers or web applications.  It would be useful to standardize or
217	   develop JavaScript APIs for this purpose.  There is W3C work on
218	   Server-Sent Events.  Also, the Open Mobile Alliance (OMA) has started
219	   work on standardized "notification" services.  Whether the services
220	   are standards-based or proprietary, the most important step would
221	   be to give WebRTC and other Web applications access to them.
222	   Such services are always subject to privacy concerns, so at minimum
223	   the messages passed over them should be end-to-end encrypted.
224	   (Traffic analysis threats would still remain.)

226	3.2.  Media and Data channels

228	   Real-time media (audio, video) is typically sent and/or received
229	   continuously while the media channel is established.  This means the
230	   radio needs to be on constantly, and there is little the application
231	   can do to save power.  (Choosing a hardware-accelerated video codec
232	   over a non-HW-supported one is one thing the application may be able
233	   to influence.)  At least in LTE there are techniques called
234	   Discontinuous Transmission/Reception (DTX, DRX) that operate even on
235	   a timescale of tens of milliseconds and can affect power
236	   consumption, e.g., for VoIP.  It is an open issue whether WebRTC
237	   stacks can somehow be optimized for them.

239	   The Data Channel, however, may often be low-volume or even idle for
240	   long periods of time.  For instance, an IM connection may be idle for
241	   minutes or even hours.  There can be many apps that want to keep such
242	   a connection available just in case there is some traffic to be sent
243	   or received infrequently.  The WebRTC Data Channel is based on SCTP
244	   over DTLS over UDP.  This means it needs keep-alives on the order of
245	   every 30 seconds in cellular networks, so the radio will be active
246	   most of the time even if no user traffic is sent.  Keeping such a
247	   channel open for a long time is impractical due to power consumption.

249	   Applications can choose different strategies to deal with this
250	   problem.  One approach is to avoid Data Channels completely for low-
251	   volume or infrequent traffic and send it via the Web servers over
252	   HTTP or WebSockets.  This is probably the best approach.  A second
253	   approach is to tear down the Data Channel after some timeout and re-
254	   establish it only when new traffic needs to be sent.  This may create
255	   some lag in sending the first message after the interval.  A third
256	   option is to transport the Data Channel over TCP, e.g., using an as
257	   yet undefined "HTTP tunneling fallback" mechanism.  This would be
258	   almost identical to the first approach, except that logically the
259	   application would still be using a WebRTC Data Channel.  It is not
260	   yet clear whether this is feasible, because ICE consent refreshes
261	   may also need to occur frequently (every 30 seconds?).  They are
262	   sent end-to-end, so one side of the Data Channel cannot by itself
263	   even affect their rate.
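
   The second strategy (tear down after a timeout, re-establish on
   demand) amounts to a small piece of bookkeeping in the application.
   The Python sketch below models only that bookkeeping; the class
   name, the method names, and the 60-second idle timeout are
   assumptions for illustration, and the actual channel setup and
   teardown calls are left out.

```python
from dataclasses import dataclass


@dataclass
class IdleTeardownPolicy:
    idle_timeout_s: float = 60.0   # assumed value; tune per application
    last_traffic_s: float = 0.0
    channel_open: bool = True

    def on_timer(self, now_s):
        """Close the channel once it has been idle long enough."""
        idle_s = now_s - self.last_traffic_s
        if self.channel_open and idle_s >= self.idle_timeout_s:
            self.channel_open = False
        return self.channel_open

    def on_send(self, now_s):
        """Return True if the channel must be re-established first."""
        needs_setup = not self.channel_open
        self.channel_open = True
        self.last_traffic_s = now_s
        return needs_setup


policy = IdleTeardownPolicy()
policy.on_send(10.0)                   # traffic at t=10 s, channel already open
still_open = policy.on_timer(40.0)     # idle for 30 s: keep the channel
closed_later = policy.on_timer(75.0)   # idle for 65 s: tear it down
reopen = policy.on_send(500.0)         # next message pays the setup lag
```

   The value returned by on_send() marks exactly the messages that
   suffer the extra lag mentioned above, which makes the trade-off
   between power and first-message delay easy to measure.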

265	3.3.  Recovery from interface switching

267	   Most mobile platforms support Internet connectivity over only one
268	   interface at a time.  In practice this is either a cellular or a
269	   Wi-Fi interface.  From a radio hardware perspective there is no
270	   need for such a limitation, but it is driven by simplicity and power
271	   saving.  The devices typically have a hard-coded or
272	   configurable priority order for different networks.  The most common
273	   policy is that any known Wi-Fi network is always preferred over any
274	   cellular network, but even more complex policies are possible.

276	   When the device detects a higher priority network than the one
277	   currently in use, it will by default attach to that network
278	   automatically.  After a successful attachment to the new network, the
279	   device turns the old network (and interface) off.  On most platforms
280	   applications have no control over this.  In a typical situation the
281	   switch-over leads to a change of IP address; for instance, all TCP
282	   connections are disconnected, and any state tied to them needs to
283	   be recreated.

285	   It is important that WebRTC applications are made robust enough to
286	   survive this behavior.  Many native applications deal with it by
287	   listening to "disconnect" and "reconnect" events through the APIs
288	   they are using.  For a WebRTC app the first priority is to re-
289	   establish its "signaling" connectivity to the "Calling Site".  If
290	   that connectivity is based on a WebSocket, the application needs to
291	   react to the "onerror" event through the WebSocket API, establish a
292	   new connection, and set up all state related to it.  (For example, if
293	   the application was using SIP over WebSockets, it might have to re-
294	   REGISTER on the SIP level.)  If the disconnect was caused by
295	   interface switching and the switch-over succeeded cleanly, it would
296	   be possible to set up the new connection immediately.  In some cases
297	   the disconnect could last longer, and the application would have to
298	   retry the connection until connectivity is regained.
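
   A retry schedule that covers both cases could try once immediately
   and then back off.  The Python sketch below generates such a
   schedule.  Exponential backoff is an assumption made here for
   illustration (the text above only requires retrying until
   connectivity is regained), as are the function name and the 1-second
   base and 60-second cap.

```python
from itertools import islice


def reconnect_delays(base_s=1.0, factor=2.0, cap_s=60.0):
    """Yield the wait before each reconnect attempt, in seconds."""
    # First attempt is immediate: after a clean interface switch the
    # new network is usually already usable.
    yield 0.0
    delay = base_s
    while True:
        yield min(delay, cap_s)
        # Back off so that a long outage does not wake the radio often.
        delay = min(delay * factor, cap_s)


first_attempts = list(islice(reconnect_delays(), 8))
print(first_attempts)
```

   Capping the delay bounds how long the application stays unreachable
   after connectivity returns, while the growing delays limit radio
   wake-ups during a long outage.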

300	   It would be advisable to make the reconnect step as lightweight as
301	   possible in terms of RTTs required.  For the browser and the web
302	   application platform, it is important that the "disconnect" event
303	   gets propagated to the applications as fast as possible.

305	   For HTTP long-polling, it would similarly be important to notice that
306	   the underlying TCP connection has become stale, and a new poll needs
307	   to be sent as quickly as possible.

309	   The application may also attempt to update any peer-to-peer sessions
310	   it has at the time of the switch-over.  At this point of RTCWeb
311	   standardization it is not yet clear how much control over this the
312	   protocols and APIs will expose.  There are many layers on which the
313	   recovery can be done.  It is possible to try to deal with it using
314	   ICE.  This would require knowing when the currently used ICE
315	   candidate becomes unusable because it is bound to a removed interface.
316	   The failure of ICE connectivity checks provides that information, but
317	   possibly after some delay.  (Frequent connectivity checks are not an
318	   issue as long as media is actively sent or received, but would be
319	   costly over an idle or low-volume media channel, such as a Data
320	   Channel.  If media traffic is infrequent, the speed of detection may
321	   not be that critical for user experience anyway.)  If an interface
322	   really became unusable, it would be better to have an explicit event
323	   to signal that all ICE candidates bound to it are likely unusable as
324	   well, so the application could act immediately.  If a new interface
325	   became available, the application could restart ICE and start using
326	   the new candidates gathered.
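
   The explicit "interface removed" handling suggested above can be
   pictured as a simple filter over the candidate list.  The Python
   sketch below is purely illustrative: the Candidate type, the
   function name, and the interface names are hypothetical, and no
   such event exists in the PeerConnection API discussed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    interface: str   # local interface the candidate is bound to
    address: str


def on_interface_removed(candidates, removed_interface):
    """Split candidates into (still usable, unusable) after an interface loss."""
    usable = [c for c in candidates if c.interface != removed_interface]
    unusable = [c for c in candidates if c.interface == removed_interface]
    return usable, unusable


candidates = [
    Candidate("wlan0", "192.0.2.10"),
    Candidate("rmnet0", "198.51.100.7"),
]
usable, unusable = on_interface_removed(candidates, "wlan0")
# With no usable candidates left, the application would restart ICE.
needs_ice_restart = not usable
```

   On a device that runs only one interface at a time the usable list
   would normally be empty after a switch, so the application would
   restart ICE as soon as the new interface is up.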

328	   The PeerConnection API offers a few events for these purposes, at
329	   least "icechange" and "renegotiationneeded".  With these the
330	   application can learn about problems with the currently used
331	   candidates.  There is also a method "updateIce" by which the
332	   application can restart the ICE candidate gathering process.  It is,
333	   however, not yet entirely clear how these event handlers and methods
334	   are best used to deal with an interface change, or whether they are
335	   even a feasible tool for dealing with it.  It is also important to
336	   note that no new offers or answers can be sent or received until
337	   the "signaling channel" (e.g., the WebSocket connection) has been
338	   re-established.

340	   If the lower-level instruments fail, the application could create a
341	   new PeerConnection, and recreate the media channels.  This would be a
342	   heavier operation, but in some cases it might still be better than
343	   leaving the recovery entirely to the user, i.e. explicitly making a
344	   new call from the UI.

346	   There are certain things that the underlying platform (operating
347	   system, connection manager, etc.) can also implement to make interface
348	   switching smoother for the applications.  One possibility would be to
349	   keep the old interface available for a short duration even after a
350	   new higher priority interface becomes available.  This would allow
351	   applications to deal with the change in a more proactive fashion.
352	   There are also protocols such as Multipath TCP that could be used to
353	   switch, e.g., WebSocket connections to a new interface without always
354	   relying on application support.

356	3.4.  Congestion avoidance

358	   Cellular mobile networks have notoriously large buffers.  Their link
359	   layers also typically operate in an "acknowledged" mode, meaning that
360	   lost frames (or packets) are retransmitted.  Retransmission
361	   creates head-of-line blocking in the queue.  This means packets are
362	   seldom lost, but delays grow large.  The individual users or
363	   endpoints are often isolated from each other so that the network
364	   capacity is divided among them more or less evenly.  However, all
365	   traffic to and from the same endpoint ends up in the same queue.  In
366	   the WebRTC context this means that plain TCP traffic will easily ruin
367	   real-time traffic due to the buffering.

369	   WebRTC protocols should be designed to avoid this.  If Data Channels
370	   transfer a lot of data in parallel with the real-time streams, they
371	   should not use loss-driven (TCP-style) congestion control algorithms
372	   but something that reacts to queue growth much faster.  The IETF
373	   LEDBAT WG may have something to offer for this case.  If the browser
374	   wants to protect its real-time streams in general against all TCP
375	   (HTTP, WebSocket) traffic, it might be best for it to also restrict
376	   the number of simultaneous TCP connections in use, for instance to
377	   retrieve a website.  The HTTP 2.0 work done in the IETF HTTPBIS WG
378	   should prove helpful in this case.
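
   To show what "reacting to queue growth" can mean in practice, the
   Python sketch below adjusts a congestion window from measured
   one-way delay, in the style of the delay-based algorithms studied
   in the LEDBAT WG.  It is a simplified sketch, not a conforming
   implementation: the function name, the 100 ms delay target, the
   gain of 1, and the 1200-byte segment size are assumptions.

```python
def delay_based_cwnd(cwnd_bytes, bytes_acked, base_delay_ms, current_delay_ms,
                     target_ms=100.0, gain=1.0, mss=1200, min_cwnd=2400):
    """Return the new congestion window after one acknowledgement."""
    # Queuing delay is estimated as the delay above the lowest delay seen.
    queuing_delay_ms = current_delay_ms - base_delay_ms
    # Positive while the queue is below target, negative once it exceeds it.
    off_target = (target_ms - queuing_delay_ms) / target_ms
    cwnd = cwnd_bytes + gain * off_target * bytes_acked * mss / cwnd_bytes
    return max(cwnd, min_cwnd)


# Queue nearly empty (10 ms above base delay): the window grows.
grown = delay_based_cwnd(12000, 1200, base_delay_ms=50, current_delay_ms=60)
# Queue building up (200 ms above base delay): the window shrinks,
# without waiting for a packet loss that deep buffers may never produce.
shrunk = delay_based_cwnd(12000, 1200, base_delay_ms=50, current_delay_ms=250)
print(grown, shrunk)
```

   The key difference from loss-driven control is that the sender
   backs off as soon as its own traffic starts to fill the shared
   queue, which is what protects the delay of the real-time streams.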

380	   Cellular networks also have built-in Quality of Service
381	   mechanisms that can be used to differentiate service for different
382	   packet flows.  These are not widely used in HSPA/WCDMA, but LTE may
383	   change the situation to some extent.  The QoS policy is enforced by
384	   the network, and requires a contract with the operator.  It is thus
385	   likely only available for services with some relation to the access
386	   operator.  How the WebRTC application or the browser deals with that
387	   is TBD.  Technically, DiffServ marking is probably the only dynamic
388	   way to indicate the priority of a particular flow.

390	4.  Security Considerations

392	   Not explicitly covered in this version.

394	5.  Acknowledgements

396	   Bernard Aboba and Goeran Eriksson provided useful comments on the
397	   document.  Dan Druta has worked on Web notifications in the context
398	   of WebRTC.

400	6.  References

402	Author's Address

404	   Markus Isomaki
405	   Nokia
406	   Keilalahdentie 2-4
407	   FI-02150 Espoo
408	   Finland

410	   Email: markus.isomaki@nokia.com