RTCWeb                                                        M. Isomaki
Internet-Draft                                                     Nokia
Intended status: Standards Track                            July 9, 2012
Expires: January 10, 2013


                RTCWeb Considerations for Mobile Devices
                     draft-isomaki-rtcweb-mobile-00

Abstract

   Web Real-time Communications (WebRTC) aims to provide web-based
   applications with real-time and peer-to-peer communication
   capabilities.  In many cases those applications run on mobile devices
   connected to different types of mobile networks.  This document gives
   an overview of the issues and challenges in implementing and
   deploying WebRTC in mobile environments.  It also gives guidance on
   how to overcome those challenges.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 10, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of


Isomaki                 Expires January 10, 2013                [Page 1]

Internet-Draft              RTCWeb for Mobile                  July 2012


   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.


Table of Contents

   1.  Introduction
   2.  Common mobile networks and their properties
   3.  Specific issues and how to deal with them
     3.1.  Persistent connectivity to the Calling Site
     3.2.  Media and Data channels
     3.3.  Recovery from interface switching
     3.4.  Congestion avoidance
   4.  Security Considerations
   5.  Acknowledgements
   6.  References
   Author's Address


1.  Introduction

   Web Real-time Communications (WebRTC) provides web-based applications
   with real-time and peer-to-peer communication capabilities.  The
   applications can set up communication sessions that can carry audio,
   video, or any application-specific data.  To be reachable for
   incoming session setups or other messages, the applications must
   keep persistent connectivity with their "calling site".

   In the last few years, mobile devices such as smartphones and
   tablets have become relatively powerful in terms of processing and
   memory.  Their browsers are approaching the capabilities of their
   desktop counterparts.  So, from that perspective, it is feasible to
   run WebRTC applications on them.  However, power consumption and the
   highly diverse nature of connectivity remain specific challenges.  A
   lot of work is being done to address these challenges in, e.g., radio
   technologies and hardware components, but by far the most important
   factor is still how the applications, protocols, and application
   programming interfaces are designed.

   Section 2 of this document gives an overview of the characteristics
   of different mobile networks as background for further discussion.
   Section 3 introduces the specific issues that WebRTC protocols and
   applications should take into consideration to be mobile-friendly.

   The current version of the document lacks references and many
   details.  It may contain some errors.  Its purpose is to draw
   attention to the topics it raises and to start discussion about them.


2.  Common mobile networks and their properties

   The most relevant mobile networks for WebRTC at the moment are Wi-Fi
   and the different variants of cellular technologies.

   Many characteristics of the cellular networks are covered in Section
   3 in the context of the particular issue under discussion.  The
   following is a very brief description of the power consumption
   related properties of WCDMA/HSPA networks.  The details vary, but
   similar principles apply to other cellular networks, at least GPRS/
   EDGE and LTE.

   In simplified terms, the WCDMA/HSPA radio can be in three different
   types of states: a power-save state (IDLE, Cell_PCH, URA_PCH), a
   shared channel state (Cell_FACH), or a dedicated channel state
   (Cell_DCH).  The power-save states consume about two orders of
   magnitude less power than the dedicated channel state, while the
   shared channel state is somewhere in the middle.  The state machine
   works so that if a device has only small packets (up to ~200-500
   bytes) to send or receive, it will allocate a shared channel, which
   operates at a low data rate.  If there is more traffic (even a single
   full-size IP packet), a dedicated channel is allocated.  Starting
   from the power-save state, the channel allocation typically takes
   somewhere between 0.5 and 2 seconds, depending on the network and the
   exact power-save state.  Only after that is the first packet actually
   sent.  If two cellular devices were to exchange packets with each
   other starting from the power-save state, the initial IP-level RTT
   could easily be 3-4 seconds.
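
   As a rough illustration of where the 3-4 second figure can come from,
   the sketch below adds up the delays of the first round trip between
   two idle devices.  It is a simplified model, not a measurement: the
   function name, the 1.5 second channel setup times, and the 0.2 second
   steady-state RTT are assumptions picked from within the ranges given
   above.

```python
def initial_rtt_seconds(setup_a, setup_b, steady_rtt):
    """Estimate the first IP-level RTT between two devices in power-save.

    setup_a:    channel allocation delay of the sender (0.5-2 s above)
    setup_b:    channel allocation delay of the receiver, which is also
                assumed to start from a power-save state
    steady_rtt: RTT once both radio channels are up (assumed value)
    """
    # The request has to wait for both radios to wake up; the reply then
    # travels over channels that are already allocated.
    return setup_a + setup_b + steady_rtt


# Assumed mid-range values, not measurements.
estimate = initial_rtt_seconds(setup_a=1.5, setup_b=1.5, steady_rtt=0.2)
print(f"estimated initial RTT: {estimate:.1f} s")
```

   With both setup delays at the 2 second upper bound the estimate
   exceeds 4 seconds, which is why the first packet after an idle period
   deserves special attention.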

   The channel is kept for some time after the last packet has been sent
   or received.  The dedicated channel drops to power-save via the
   shared channel.  The timers from dedicated to shared and from shared
   to power-save are network dependent, but typically somewhere between
   5 and 30 seconds.  So, in some networks sending a single ping every
   30 seconds is enough to keep the power consumption constantly at the
   maximum level, while in others the power-save state is entered much
   faster.  The total radio power consumption does not actually depend
   so much on the overall volume of traffic as on how long a dedicated
   or shared channel is active.  So, for instance, a 1 kB keep-alive
   sent every 30 seconds for an hour (about 120 kB of traffic in total)
   consumes much more power (even an order of magnitude more!) than a
   single 10 MB download, assuming the download finishes in a minute or
   two.
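
   The comparison above can be checked with a small model that counts
   how long the radio stays out of the power-save state.  This is a
   minimal sketch under stated assumptions: a single inactivity timer
   stands in for the dedicated-to-shared and shared-to-power-save timers,
   and the function name, the 30 second timer, and the 90 second
   download time are illustrative values, not taken from any network.

```python
def radio_active_seconds(bursts, tail_seconds):
    """Return how long the radio stays out of the power-save state.

    bursts:       list of (start_time, transfer_duration) in seconds
    tail_seconds: assumed inactivity timer before the radio drops back
                  to power-save after the last packet
    """
    total = 0.0
    window_start = window_end = None
    for start, duration in sorted(bursts):
        end = start + duration + tail_seconds
        if window_end is not None and start <= window_end:
            # Radio is still up from the previous burst: extend the window.
            window_end = max(window_end, end)
        else:
            if window_end is not None:
                total += window_end - window_start
            window_start, window_end = start, end
    if window_end is not None:
        total += window_end - window_start
    return total


# A keep-alive every 30 s for an hour versus one 90 s download,
# both with an assumed 30 s inactivity timer.
keepalives = [(t, 0) for t in range(0, 3600, 30)]
download = [(0, 90)]
print(radio_active_seconds(keepalives, 30))  # radio never sleeps
print(radio_active_seconds(download, 30))
```

   In this model the keep-alives hold the radio up for the whole hour,
   while the download occupies it for two minutes, even though the
   download moves far more data.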

   The applications have no control over the radio states, but the
   Operating System and the Radio Modem software can do something about
   them.  In the newer specifications (and devices and networks) it is
   possible for the device to explicitly ask for the radio channel to be
   released immediately after the last packet.  For instance, if the
   device were somehow to know that no new packets are to be sent for
   some time, it could do such signaling and save power.

   The bottom line is that applications and protocols should keep the
   intervals between traffic as long as possible, giving the radio as
   much low-power time as possible.  Intervals of more than a few
   seconds may help, and intervals longer than 30 seconds will
   definitely help.  On the other hand, the initial RTT after an
   interval will be long.  This issue is covered in Sections 3.1 and
   3.2.

   The other key characteristic of cellular networks is that they have
   long buffers and run the link layer in "acknowledged" mode, meaning
   all lost packets are retransmitted.  This means TCP will easily
   create long delays and ruin real-time traffic.  This is covered in
   Section 3.4.

   The third characteristic is that mobile devices often change networks
   on the fly, typically between cellular and Wi-Fi.  Most devices only
   run a single interface at a time.  From a networking perspective this
   means that the device's IP address changes and, e.g., all its TCP
   connections are lost.  This is covered in Section 3.3.


3.  Specific issues and how to deal with them

3.1.  Persistent connectivity to the Calling Site

   Many WebRTC apps want to be reachable for incoming sessions (JSEP
   Offers) or other types of asynchronous messages.  For this purpose
   they need some kind of a persistent communication channel with their
   "Calling Site".  Two standard approaches for this are WebSockets and
   HTTP long-polling.  In both of these cases a TCP connection is used
   as the underlying transport.

   Most cellular networks have a firewall preventing incoming TCP
   connections, even when they allocate public IPv4 or IPv6 addresses.
   NATs are also becoming more common with the exhaustion of the IPv4
   address space.  The firewall and NAT timers for TCP can range between
   1 and 60 minutes, depending on the network.  To keep the TCP
   connection alive, the application needs to send some kind of
   keep-alive packets frequently enough to avoid the timeout.

   If the WebRTC app intends to run for long periods of time (even
   when the user is not actively interacting with it), it is of utmost
   importance to keep this keep-alive traffic as infrequent as possible.
   Every wake-up of the radio consumes a significant amount of power,
   even if it is needed just for sending and receiving a couple of IP
   packets.  It makes a huge difference whether there are, for instance,
   6 or 60 of these wake-ups every hour.  A naive application may want
   to make sure it sends frequently enough for all possible networks.
   That leads to unacceptable power consumption.  A smarter application
   will try to figure out a suitable timeout for the network it is
   using, and can save a lot of power in networks with longer timers.
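
   One way an application might figure out a suitable interval is to
   probe for it.  The sketch below is a hypothetical illustration, not
   an established algorithm from any WebRTC specification: it performs a
   binary search between an interval assumed to be safe (30 seconds) and
   one assumed to be too long (60 minutes, the upper end of the range
   mentioned above).  The class name, its parameters, and the simulated
   15 minute firewall timer are all assumptions.

```python
class KeepAliveTuner:
    """Binary search for the longest keep-alive interval that still works."""

    def __init__(self, shortest=30, longest=3600, resolution=60):
        self.good = shortest      # longest interval seen to survive
        self.bad = longest        # shortest interval seen (or assumed) to fail
        self.resolution = resolution

    def next_interval(self):
        """Interval in seconds to wait before the next keep-alive."""
        if self.bad - self.good <= self.resolution:
            return self.good      # converged: stay on the safe side
        return (self.good + self.bad) // 2

    def report(self, interval, survived):
        """Record whether the connection survived an idle gap of `interval`."""
        if survived:
            self.good = max(self.good, interval)
        else:
            self.bad = min(self.bad, interval)


# Simulated network whose firewall drops idle TCP state after 15 minutes.
tuner = KeepAliveTuner()
for _ in range(20):
    interval = tuner.next_interval()
    tuner.report(interval, survived=interval < 900)
print(tuner.next_interval())
```

   Every failed probe costs a reconnection, so a real application would
   probably cache the result per network rather than search again on
   every start.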

   There are further strategies to manage the keep-alives so that they
   consume the least amount of power.  It is best to keep the keep-alive
   messages as small as possible.  HSPA/WCDMA networks have a special
   shared radio channel (FACH) that can carry small amounts of traffic.
   Its power consumption is typically less than half of that of the
   dedicated channel.  Depending on the network, a packet of a couple of
   hundred bytes will usually only require FACH, while a thousand-byte
   packet will require the dedicated channel to be activated.  So, a
   WebSocket PING-PONG is better than an HTTP POST or GET with all the
   Cookies and other headers attached.  If there are multiple
   applications or connections to be kept alive, the Browser or the
   underlying platform should offer some kind of synchronization for
   them, so that the radio is woken only once per cycle.
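
   One way a platform could provide such synchronization is to let
   keep-alives that fall due close together share a single radio
   wake-up.  The following sketch groups pending deadlines for that
   purpose.  Sending a keep-alive slightly early is harmless, so each
   group fires at its earliest deadline.  The function name and the 30
   second window are assumptions made for illustration.

```python
def coalesce_wakeups(deadlines, window):
    """Group keep-alive deadlines into shared radio wake-ups.

    deadlines: times (in seconds) at which individual keep-alives are due
    window:    a deadline joins the current group if it falls within
               `window` seconds after that group's wake-up time

    Returns a list of (wakeup_time, [deadlines served]) tuples.  No
    keep-alive is ever sent later than its own deadline.
    """
    wakeups = []
    for due in sorted(deadlines):
        if wakeups and due - wakeups[-1][0] <= window:
            wakeups[-1][1].append(due)      # piggyback on the open group
        else:
            wakeups.append((due, [due]))    # needs a wake-up of its own
    return wakeups


# Five timers from different connections collapse into two wake-ups.
for when, served in coalesce_wakeups([400, 110, 100, 125, 410], window=30):
    print(when, served)
```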

   The most efficient approach would be to multiplex the initial
   incoming messages for all applications over the same TCP connection.
   This would require the use of some kind of gateway service in the
   network.  Such "notification" services are available on many
   platforms, but at the moment they are not typically available to
   browsers or web applications.  It would be useful to standardize or
   develop JavaScript APIs for this purpose.  There is W3C work on
   Server-sent events.  Also, the Open Mobile Alliance (OMA) has started
   work on standardized "notification" services.  Whether the services
   are standards-based or proprietary, the most important step would be
   to give WebRTC and other Web applications access to them.  Such
   services are always subject to privacy concerns, so at minimum the
   messages passed over them should be end-to-end encrypted.  (Traffic
   analysis threats would still remain.)

3.2.  Media and Data channels

   Real-time media (audio, video) is typically sent and/or received
   constantly as long as the media channel is established.  This means
   the radio needs to be on constantly, and there is little the
   application can do to preserve power.  (Choosing a
   hardware-accelerated video codec over a non-HW-supported one is one
   thing the application may be able to influence.)  At least in LTE
   there are techniques called Discontinuous Transmission/Reception
   (DTX, DRX) that operate even in the timeframe of tens of
   milliseconds and can affect power consumption, e.g., for VoIP.  It is
   an open issue whether WebRTC stacks can be somehow optimized for
   them.

   The Data Channel may, however, often be low-volume or even idle for
   long periods of time.  For instance, an IM connection may be idle for
   minutes or even hours.  There can be many apps that want to keep such
   a connection available just in case there is some traffic to be sent
   or received infrequently.  The WebRTC Data Channel is based on SCTP
   over DTLS over UDP.  This means it needs keep-alives on the order of
   every 30 seconds in cellular networks, meaning the radio will be
   active most of the time even if no user traffic is sent.  Keeping
   such a channel open for a long time is not feasible due to power
   consumption.

   Applications can choose different strategies to deal with this
   problem.  One approach is to avoid Data Channels completely for
   low-volume or infrequent traffic and send it via the Web servers over
   HTTP or WebSockets.  This is probably the best approach.  A second
   approach is to tear down the Data Channel after some timeout and
   re-establish it only when new traffic needs to be sent.  This may
   create some lag in sending the first message after the interval.  The
   third option is to transport the Data Channel over TCP, e.g. using an
   as-yet-undefined "HTTP tunneling fallback" mechanism.  This would be
   almost identical to the first approach, except that logically the
   application would still be using a WebRTC Data Channel.  It is not
   yet clear whether this will be feasible, because ICE consent
   refreshes may need to occur frequently as well (every 30 seconds?).
   They are sent end-to-end, so one side of the Data Channel cannot by
   itself even affect their rate.
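
   The tear-down-and-re-establish strategy can be sketched as a small
   idle-timeout policy.  This is only an illustration of the bookkeeping
   involved: the class and method names are hypothetical, the 60 second
   timeout is an assumed value, and the actual opening and closing of a
   Data Channel is left out.

```python
class IdleChannelPolicy:
    """Track when an idle Data Channel should be closed and reopened."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout
        self.is_open = False
        self.last_activity = None
        self.reopen_count = 0

    def on_send(self, now):
        """Note outgoing traffic; return True if the channel must be
        (re-)established before the message can go out."""
        reopened = not self.is_open
        if reopened:
            self.is_open = True
            self.reopen_count += 1
        self.last_activity = now
        return reopened

    def on_tick(self, now):
        """Return True if the channel should be torn down at time `now`."""
        if self.is_open and now - self.last_activity >= self.idle_timeout:
            self.is_open = False
            return True
        return False


policy = IdleChannelPolicy(idle_timeout=60)
print(policy.on_send(0))     # first message: channel has to be set up
print(policy.on_tick(100))   # idle past the timeout: tear it down
print(policy.on_send(500))   # next message pays the setup lag again
```

   The timeout trades power for latency: a short value releases the
   radio sooner, but more messages then pay the re-establishment lag.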

3.3.  Recovery from interface switching

   Most mobile platforms support Internet connectivity over only one
   interface at a time.  In practice this is either a cellular or a
   Wi-Fi interface.  From a radio hardware perspective there would be no
   need for such a limitation, but it is driven by simplicity and power
   preservation.  The devices typically have a hard-coded or
   configurable priority order for different networks.  The most common
   policy is that any known Wi-Fi network is always preferred over any
   cellular network, but more complex policies are also possible.

   When the device detects a higher priority network than the one
   currently in use, it will by default attach to that network
   automatically.  After a successful attachment to the new network, the
   device turns the old network (and interface) off.  On most platforms
   applications have no control over this.  In a typical situation the
   switch-over leads to a change of IP address; for instance, all TCP
   connections become disconnected, and any state tied to them needs to
   be recreated.

   It is important that WebRTC applications are made robust enough to
   survive this behavior.  Many native applications deal with it by
   listening to "disconnect" and "reconnect" events through the APIs
   they are using.  For WebRTC apps the first priority is to
   re-establish their "signaling" connectivity to the "Calling Site".
   If that connectivity is based on a WebSocket, the application needs
   to react to the "onerror" and "onclose" events through the WebSocket
   API, establish a new connection, and set up all state related to it.
   (Say, if the application was using SIP over WebSockets, it might have
   to re-REGISTER on the SIP level.)  If the disconnect was caused by
   interface switching and the switch-over succeeded cleanly, it would
   be possible to set up the new connection immediately.  In some cases
   the disconnect could last longer, and the application would have to
   retry the connection until connectivity is regained.
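
   A retry schedule for this could look like the sketch below.  The
   first attempt is immediate, which covers a clean switch-over; later
   attempts back off exponentially up to a cap, so that a long outage
   does not keep waking the radio.  The function name and the base,
   factor, and cap values are assumptions, and a deployed client would
   probably also add random jitter.

```python
from itertools import islice


def reconnect_delays(base=1.0, factor=2.0, cap=60.0):
    """Yield the wait (in seconds) before each reconnection attempt."""
    yield 0.0                     # try at once after the disconnect
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


# Waits before the first nine attempts.
print(list(islice(reconnect_delays(), 9)))
```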

   It would be advisable to make the reconnect step as lightweight as
   possible in terms of RTTs required.  For the browser and the web
   application platform, it is important that the "disconnect" event
   gets propagated to the applications as fast as possible.


   For HTTP long-polling, it would similarly be important to notice that
   the underlying TCP connection has become stale, and a new poll needs
   to be sent as quickly as possible.

   The application may also attempt to update any peer-to-peer sessions
   it has at the time of the switch-over.  At this point of RTCWeb
   standardization it is not yet clear how much control over this the
   protocols and APIs will expose.  There are many layers on which the
   recovery can be done.  It is possible to try to deal with it using
   ICE.  This would require knowing when the currently used ICE
   candidate becomes unusable, as it is bound to a removed interface.
   The failure of ICE connectivity checks provides that information, but
   possibly after some delay.  (Frequent connectivity checks are not an
   issue as long as media is actively sent or received, but would be
   costly over an idle or low-volume media channel, such as a Data
   Channel.  If media traffic is infrequent, the speed of detection may
   not be that critical for user experience anyway.)  If an interface
   really became unusable, it would be better to have an explicit event
   to signal that all ICE candidates bound to it are likely unusable as
   well, so the application could act immediately.  If a new interface
   became available, the application could restart ICE and start using
   the new candidates gathered.
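
   The decision logic described above can be sketched as follows.  It
   assumes the platform delivers explicit "interface removed" and
   "interface added" events, which is the capability this section argues
   for and not something current APIs are known to provide.  The
   Candidate type, the interface labels, and the returned action names
   are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    address: str
    interface: str   # e.g. "wifi" or "cellular" (hypothetical labels)


def plan_recovery(selected, candidates, removed=(), added=()):
    """Decide how to react to an interface change.

    Returns one of "keep", "switch", "restart_ice" or "wait".
    """
    if selected.interface not in removed:
        return "keep"                 # current path is unaffected
    # Every candidate bound to a removed interface is presumed unusable.
    usable = [c for c in candidates if c.interface not in removed]
    if usable:
        return "switch"               # another gathered candidate survives
    if added:
        return "restart_ice"          # gather candidates on the new interface
    return "wait"                     # no connectivity at the moment


wifi = Candidate("192.0.2.10", "wifi")
print(plan_recovery(wifi, [wifi], removed={"wifi"}, added={"cellular"}))
```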

   The PeerConnection API offers a few events for these purposes, at
   least "icechange" and "renegotiationneeded".  With these the
   application can learn about problems with the currently used
   candidates.  There is also a method "updateIce" by which the
   application can restart the ICE candidate gathering process.  It is
   however not yet entirely clear how these event handlers and methods
   should best be used to deal with an interface change, and whether
   they are even a feasible tool for dealing with it.  It is also
   important to note that no new offers or answers could be sent or
   received until the "signaling channel" (e.g. the WebSocket
   connection) was first re-established.

   If the lower-level instruments fail, the application could create a
   new PeerConnection, and recreate the media channels.  This would be a
   heavier operation, but in some cases it might still be better than
   leaving the recovery entirely to the user, i.e. explicitly making a
   new call from the UI.

   There are certain things that the underlying platform (Operating
   System, Connection Manager etc.) can also implement to make interface
   switching smoother for the applications.  One possibility would be to
   keep the old interface available for a short duration even after a
   new higher priority interface becomes available.  This would allow
   applications to deal with the change in a more proactive fashion.
   There are also protocols such as Multipath TCP that could be used to
   switch e.g. WebSocket connections to a new interface without always
   resorting to application support.

3.4.  Congestion avoidance

   Cellular mobile networks have notoriously large buffers.  Their link
   layers also typically operate in an "acknowledged" mode, meaning that
   lost frames (or packets) are retransmitted.  Retransmission creates
   head-of-line blocking in the queue.  This means packets are seldom
   lost, but delays grow large.  Individual users or endpoints are
   often isolated from each other so that the network capacity is
   divided among them more or less evenly.  However, all traffic to and
   from the same endpoint ends up in the same queue.  In the WebRTC
   context this means that plain TCP traffic will easily ruin real-time
   traffic due to the buffering.

   WebRTC protocols should be designed to avoid this.  If Data Channels
   transfer a lot of data in parallel with the real-time streams, they
   should not use loss-driven (TCP) congestion control algorithms but
   something that reacts to queue growth much faster.  The IETF LEDBAT
   WG may have something to offer for this case.  If the browser wants
   to protect its real-time streams in general against all TCP (HTTP,
   WebSocket) traffic, it might be best for it to also restrict the
   number of simultaneous TCP connections in use, for instance to
   retrieve a website.  The HTTP 2.0 work done in the IETF HTTPBIS WG
   should prove helpful in this case.
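
   To illustrate what reacting to queue growth means, the sketch below
   shows a delay-based window update loosely modeled on the approach
   taken by LEDBAT.  It is a simplified illustration, not a complete or
   conformant implementation: the queuing delay is assumed to be
   measured elsewhere (as the current one-way delay minus the lowest
   delay observed), and the function name and default values are
   assumptions.

```python
def update_cwnd(cwnd, bytes_acked, queuing_delay_ms,
                target_ms=100.0, gain=1.0, mss=1460, min_cwnd=2 * 1460):
    """Return the new congestion window in bytes after one acknowledgement."""
    # Positive while the queue is below the target delay, negative above it.
    off_target = (target_ms - queuing_delay_ms) / target_ms
    cwnd += gain * off_target * bytes_acked * mss / cwnd
    return max(cwnd, min_cwnd)


cwnd = 10 * 1460
print(update_cwnd(cwnd, 1460, queuing_delay_ms=50))    # queue short: grow
print(update_cwnd(cwnd, 1460, queuing_delay_ms=200))   # queue long: shrink
```

   Unlike a loss-driven algorithm, this backs off as soon as the queue
   grows past the target, well before the large buffers of a cellular
   network would overflow and drop a packet.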

   Cellular networks also have built-in Quality of Service mechanisms
   that can be used to differentiate service for different packet
   flows.  These are not widely used in HSPA/WCDMA, but LTE may change
   the situation to some extent.  The QoS policy is enforced by the
   network, and requires a contract with the operator.  It is thus
   likely only available for services with some relation to the access
   operator.  How the WebRTC application or the browser deals with that
   is TBD.  Technically, DiffServ marking is probably the only dynamic
   approach to indicate the priority of a particular flow.
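
   For reference, DiffServ marking of an outgoing flow can be sketched
   with the standard socket API as shown below.  A web application has
   no direct access to sockets, so this only illustrates what a browser
   or native media stack might do on the application's behalf.  The
   helper names are hypothetical, the choice of the Expedited Forwarding
   code point is an assumption, and whether the network honors the
   marking depends entirely on operator policy.

```python
import socket

EF_DSCP = 46  # Expedited Forwarding code point, commonly used for voice


def dscp_to_tos(dscp):
    """Place a 6-bit DSCP value in the upper bits of the IPv4 TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    return dscp << 2


def open_marked_udp_socket(dscp=EF_DSCP):
    """Create a UDP socket whose outgoing IPv4 packets carry `dscp`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock


print(hex(dscp_to_tos(EF_DSCP)))
```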


4.  Security Considerations

   Not explicitly covered in this version.


5.  Acknowledgements

   Bernard Aboba and Goeran Eriksson provided useful comments on the
   document.  Dan Druta has worked on Web notifications in the context
   of WebRTC.


6.  References


Author's Address

   Markus Isomaki
   Nokia
   Keilalahdentie 2-4
   FI-02150 Espoo
   Finland

   Email: markus.isomaki@nokia.com

