RTCWeb                                                        M. Isomaki
Internet-Draft                                                     Nokia
Intended status: Standards Track                            July 9, 2012
Expires: January 10, 2013


                RTCweb Considerations for Mobile Devices
                     draft-isomaki-rtcweb-mobile-00

Abstract

   Web Real-time Communications (WebRTC) aims to provide web-based
   applications with real-time and peer-to-peer communication
   capabilities. In many cases those applications run on mobile
   devices connected to different types of mobile networks. This
   document gives an overview of the issues and challenges in
   implementing and deploying WebRTC in mobile environments. It also
   gives guidance on how to overcome those challenges.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 10, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include
   Simplified BSD License text as described in Section 4.e of the
   Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Common mobile networks and their properties
   3.  Specific issues and how to deal with them
     3.1.  Persistent connectivity to the Calling Site
     3.2.  Media and Data channels
     3.3.  Recovery from interface switching
     3.4.  Congestion avoidance
   4.  Security Considerations
   5.  Acknowledgements
   6.  References
   Author's Address

1.  Introduction

   Web Real-time Communications (WebRTC) provides web-based
   applications with real-time and peer-to-peer communication
   capabilities. The applications can set up communication sessions
   that carry audio, video, or any application-specific data. To be
   reachable for incoming session setups or other messages, the
   applications must keep persistent connectivity with their "calling
   site".

   In the last few years, mobile devices such as smartphones and
   tablets have become relatively powerful in terms of processing and
   memory, and their browsers are approaching their desktop
   counterparts. From that perspective, it is feasible to run WebRTC
   applications on them. However, power consumption and the highly
   diverse nature of connectivity remain specific challenges. A lot of
   work is being done to address these challenges in e.g.
   radio technologies and hardware components, but by far the most
   important factor is still how the applications, protocols, and
   application programming interfaces are designed.

   Section 2 of this document gives an overview of the characteristics
   of different mobile networks as background for further discussion.
   Section 3 introduces the specific issues that WebRTC protocols and
   applications should take into consideration to be mobile-friendly.

   The current version of the document lacks all references and many
   details, and it may contain errors. Its purpose is to draw
   attention to the topics it raises and to start discussion about
   them.

2.  Common mobile networks and their properties

   The most relevant mobile networks for WebRTC at the moment are Wi-Fi
   and the different variants of cellular technologies.

   Many characteristics of cellular networks are covered in Section 3
   in the context of the particular issue under discussion. The
   following is a very brief description of the power consumption
   related properties of WCDMA/HSPA networks. The details vary, but
   similar principles apply to other cellular networks, at least
   GPRS/EDGE and LTE.

   In simplified terms, the WCDMA/HSPA radio can be in three different
   types of states: a power-save state (IDLE, Cell_PCH, URA_PCH), a
   shared channel state (Cell_FACH), or a dedicated channel state
   (Cell_DCH). The power-save states consume about two orders of
   magnitude less power than the dedicated channel state, while the
   shared channel state is somewhere in between. The state machine
   works so that if a device has only small packets (up to roughly
   200-500 bytes) to send or receive, it is allocated a shared channel
   that operates at a low data rate. If there is more traffic (even a
   single full-size IP packet), a dedicated channel is allocated.
   Starting from the power-save state, the channel allocation typically
   takes between 0.5 and 2 seconds, depending on the network and the
   exact power-save state. Only after that is the first packet
   actually sent. If two cellular devices were to exchange packets
   with each other starting from the power-save state, the initial
   IP-level RTT could easily be 3-4 seconds.

   The channel is kept for some time after the last packet has been
   sent or received. The dedicated channel drops to power-save via the
   shared channel. The timers from dedicated to shared and from shared
   to power-save are network dependent, but typically between 5 and 30
   seconds. So, in some networks sending a single ping every 30
   seconds is enough to keep the power consumption constantly at the
   maximum level, while in others the power-save state is entered much
   faster. The total radio power consumption depends less on the
   overall volume of traffic than on how long a dedicated or shared
   channel is active. For instance, a 1 kB keep-alive sent every 30
   seconds for an hour (about 120 kB of traffic in total) consumes much
   more power (even an order of magnitude more) than a single 10 MB
   download, assuming the download finishes in a minute or two.

   The applications have no control over the radio states, but the
   Operating System and the Radio Modem software can influence them.
   In the newer specifications (and devices and networks) it is
   possible for the device to explicitly ask for the radio channel to
   be released immediately after the last packet. For instance, if the
   device somehow knew that no new packets were to be sent for some
   time, it could do such signaling and save power.

   The bottom line is that applications and protocols should keep the
   intervals between traffic as long as possible, giving the radio as
   much low-power time as possible.
   Intervals of more than a few seconds may help, and intervals longer
   than 30 seconds will definitely help. On the other hand, the
   initial RTT after such an interval will be long. This issue is
   covered in Sections 3.1 and 3.2.

   The other key characteristic of cellular networks is that they have
   long buffers and run the link layer in "acknowledged" mode, meaning
   that all lost packets are retransmitted. As a result, TCP traffic
   easily creates long delays and ruins real-time traffic. This is
   covered in Section 3.4.

   The third characteristic is that mobile devices often change
   networks on the fly, typically between cellular and Wi-Fi. Most
   devices only run a single interface at a time. From a networking
   perspective this means that the device's IP address changes and,
   for example, all its TCP connections are lost. This is covered in
   Section 3.3.

3.  Specific issues and how to deal with them

3.1.  Persistent connectivity to the Calling Site

   Many WebRTC apps want to be reachable for incoming sessions (JSEP
   Offers) or other types of asynchronous messages. For this purpose
   they need some kind of persistent communication channel with their
   "Calling Site". Two standard approaches for this are WebSockets and
   HTTP long-polling. In both cases a TCP connection is used as the
   underlying transport.

   Most cellular networks have a firewall preventing incoming TCP
   connections, even when they allocate public IPv4 or IPv6 addresses.
   NATs are also becoming more common with the exhaustion of the IPv4
   address space. The firewall and NAT timers for TCP can range
   between 1 and 60 minutes, depending on the network. To keep the TCP
   connection alive, the application needs to send some kind of
   keep-alive packets frequently enough to avoid the timeout.
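
   As a rough illustration of why the keep-alive interval matters, the
   following Python sketch estimates how long keep-alives alone keep
   the radio out of the power-save state. It is not part of any WebRTC
   API: the function names, the 0.8 safety factor, and the 10-second
   inactivity tail are assumptions chosen for illustration, and channel
   setup and transmission time are ignored.

```python
def keepalive_interval_s(nat_timeout_s: float, safety: float = 0.8) -> float:
    """Pick a keep-alive interval safely below the NAT/firewall timeout.

    The safety factor is an illustrative assumption, not a recommendation.
    """
    if nat_timeout_s <= 0 or not 0 < safety <= 1:
        raise ValueError("timeout must be positive and 0 < safety <= 1")
    return nat_timeout_s * safety


def radio_active_s_per_hour(interval_s: float, tail_s: float) -> float:
    """Rough radio-on time per hour caused by keep-alives alone.

    Each keep-alive is assumed to keep the radio out of power-save for
    tail_s seconds (the network's inactivity timers), but never longer
    than the keep-alive interval itself.
    """
    if interval_s <= 0 or tail_s < 0:
        raise ValueError("interval must be positive, tail non-negative")
    wakeups_per_hour = 3600.0 / interval_s
    return wakeups_per_hour * min(tail_s, interval_s)


if __name__ == "__main__":
    # Compare networks with 1, 10 and 60 minute NAT/firewall timers.
    for timeout_min in (1, 10, 60):
        interval = keepalive_interval_s(timeout_min * 60)
        active = radio_active_s_per_hour(interval, tail_s=10)
        print(f"NAT timeout {timeout_min:2d} min: keep-alive every "
              f"{interval:.0f} s, radio active about {active:.1f} s/hour")
```

   Under these assumptions, a 1-minute timer keeps the radio active for
   roughly 750 seconds per hour, while a 60-minute timer needs only
   about 12.5 seconds per hour.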

   If the WebRTC app intends to run for long periods of time (even
   when the user is not actively interacting with it), it is of utmost
   importance to keep this keep-alive traffic as infrequent as
   possible. Every wake-up of the radio consumes a significant amount
   of power, even if it is needed just for sending and receiving a
   couple of IP packets. It makes a huge difference whether there are,
   for instance, 6 or 60 of these wake-ups every hour. A naive
   application may want to make sure it sends frequently enough for
   all possible networks, which leads to unacceptable power
   consumption. A smarter application will try to figure out a
   suitable timeout for the network it is using, and can save a lot of
   power in networks with longer timers.

   There are further strategies to manage the keep-alives so that they
   consume the least amount of power. It is best to send keep-alive
   messages that are as small as possible. HSPA/WCDMA networks have a
   special shared radio channel (FACH) that can carry small amounts of
   traffic. Its power consumption is typically less than half of that
   of the dedicated channel. Depending on the network, a packet of a
   couple of hundred bytes will usually only require FACH, while a
   thousand-byte packet will require the dedicated channel to be
   activated. So, a WebSocket PING-PONG is better than an HTTP POST or
   GET with all the Cookies and other headers attached. If there are
   multiple applications or connections to be kept alive, the Browser
   or the underlying platform should offer some kind of
   synchronization for them, so that the radio is woken only once per
   cycle.

   The most efficient approach would be to multiplex the initial
   incoming messages for all applications over the same TCP connection.
   This would require the use of some kind of gateway service in the
   network.
   Such "notification" services are available on many platforms, but
   at the moment they are not typically available to browsers or web
   applications. It would be useful to standardize or develop
   JavaScript APIs for this purpose. There is W3C work on Server-Sent
   Events. Also, the Open Mobile Alliance (OMA) has started work on
   standardized "notification" services. Whether the services are
   standards-based or proprietary, the most relevant task would be to
   give WebRTC and other Web applications access to them. Such
   services are always subject to privacy concerns, so at minimum the
   messages passed over them should be end-to-end encrypted. (Traffic
   analysis threats would still remain.)

3.2.  Media and Data channels

   Real-time media (audio, video) is typically sent and/or received
   constantly while the media channel is established. This means the
   radio needs to be on constantly, and there is little the
   application can do to save power. (Choosing a hardware-accelerated
   video codec over a non-HW-supported one is one thing the application
   may be able to influence.) At least in LTE there are techniques
   called Discontinuous Transmission/Reception (DTX, DRX) that operate
   even on a timescale of tens of milliseconds and can affect power
   consumption, e.g. for VoIP. It is an open issue whether WebRTC
   stacks can somehow be optimized for them.

   The Data Channel may, however, often be low-volume or even idle for
   long periods of time. For instance, an IM connection may be idle
   for minutes or even hours. There can be many apps that want to keep
   such a connection available just in case there is some traffic to
   be sent or received infrequently. The WebRTC Data Channel is based
   on SCTP over DTLS over UDP. This means it needs keep-alives on the
   order of every 30 seconds in cellular networks, so the radio will be
   active most of the time even if no user traffic is sent.
   It is not feasible to keep such a channel open for a long time due
   to the power consumption.

   Applications can choose different strategies to deal with this
   problem. One approach is to avoid Data Channels completely for
   low-volume or infrequent traffic and send it via the Web servers
   over HTTP or WebSockets. This is probably the best approach. The
   second approach is to tear down the Data Channel after some timeout
   and re-establish it only when new traffic needs to be sent. This
   may create some lag in sending the first message after the
   interval. The third option is to transport the Data Channel over
   TCP, e.g. using a yet undefined "HTTP tunneling fallback"
   mechanism. This would be almost identical to the first approach,
   except that logically the application would still be using a WebRTC
   Data Channel. It is not yet clear whether this will be feasible,
   because ICE consent refreshes may also need to occur frequently
   (every 30 seconds?). They are sent end-to-end, so one side of the
   Data Channel cannot by itself even affect their rate.

3.3.  Recovery from interface switching

   Most mobile platforms support Internet connectivity over only one
   interface at a time. In practice this is either a cellular or a
   Wi-Fi interface. From the radio hardware perspective there would be
   no need for such a limitation, but it is driven by simplicity and
   power saving. The devices typically have a hard-coded or
   configurable priority order for different networks. The most common
   policy is that any known Wi-Fi network is always preferred over any
   cellular network, but more complex policies are also possible.

   When the device detects a higher priority network than the one
   currently in use, it will by default attach to that network
   automatically. After a successful attachment to the new network,
   the device turns the old network (and interface) off.
   On most platforms applications have no control over this. In a
   typical situation the switch-over leads to a change of IP address,
   so that, for instance, all TCP connections become disconnected and
   any state tied to them needs to be recreated.

   It is important that WebRTC applications are made robust enough to
   survive this behavior. Many native applications deal with it by
   listening to "disconnect" and "reconnect" events through the APIs
   they are using. For a WebRTC app the first priority is to
   re-establish its "signaling" connectivity to the "Calling Site". If
   that connectivity is based on a WebSocket, the application needs to
   react to the "onerror" event through the WebSocket API, establish a
   new connection, and set up all state related to it. (Say, if the
   application was using SIP over WebSockets, it might have to
   re-REGISTER on the SIP level.) If the disconnect was caused by
   interface switching and the switch-over succeeded cleanly, it would
   be possible to set up the new connection immediately. In some cases
   the disconnect could last longer, and the application would have to
   retry the connection until connectivity is regained.

   It would be advisable to make the reconnect step as lightweight as
   possible in terms of RTTs required. For the browser and the web
   application platform, it is important that the "disconnect" event
   gets propagated to the applications as fast as possible.

   For HTTP long-polling, it would similarly be important to notice
   that the underlying TCP connection has become stale, so that a new
   poll can be sent as quickly as possible.

   The application may also attempt to update any peer-to-peer sessions
   it has at the time of the switch-over. At this point of RTCWeb
   standardization it is not yet clear how much control over this the
   protocols and APIs will offer. There are many layers on which the
   recovery can be done.
   It is possible to try to deal with it using ICE. This would require
   knowing when the currently used ICE candidate becomes unusable
   because it is bound to a removed interface. The failure of ICE
   connectivity checks provides that information, but possibly after
   some delay. (Frequent connectivity checks are not an issue as long
   as media is actively sent or received, but would be costly over an
   idle or low-volume media channel, such as a Data Channel. If media
   traffic is infrequent, the speed of detection may not be that
   critical for user experience anyway.) If an interface really became
   unusable, it would be better to have an explicit event to signal
   that all ICE candidates bound to it are likely unusable as well, so
   that the application could act immediately. If a new interface
   became available, the application could restart ICE and start using
   the new candidates gathered.

   The PeerConnection API offers a few events for these purposes, at
   least "icechange" and "renegotiationneeded". With these the
   application can learn about problems with the currently used
   candidates. There is also a method "updateIce" by which the
   application can restart the ICE candidate gathering process. It is,
   however, not yet entirely clear how these event handlers and methods
   should best be used to deal with an interface change, or whether
   they are even a feasible tool for dealing with it. It is also
   important to note that no new offers or answers can be sent or
   received until the "signaling channel" (e.g. the WebSocket
   connection) has been re-established.

   If the lower-level instruments fail, the application could create a
   new PeerConnection and recreate the media channels. This would be a
   heavier operation, but in some cases it might still be better than
   leaving the recovery entirely to the user, i.e. having the user
   explicitly make a new call from the UI.
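
   The layered recovery described in this section can be sketched as an
   escalation from the lightest step to the heaviest. The Python
   sketch below only models the ordering; the three callbacks and the
   returned labels are hypothetical placeholders for whatever the
   application actually does, not calls from the WebSocket or
   PeerConnection APIs.

```python
from typing import Callable

# A recovery step reports True on success (hypothetical callback type).
Step = Callable[[], bool]


def recover_session(reconnect_signaling: Step,
                    restart_ice: Step,
                    recreate_peer_connection: Step) -> str:
    """Try recovery steps from lightest to heaviest and report the outcome."""
    # Offers and answers cannot be exchanged without the signaling
    # channel, so nothing else is attempted until it is back.
    if not reconnect_signaling():
        return "retry-signaling-later"
    # Lightest media-level repair: gather new candidates and restart ICE.
    if restart_ice():
        return "ice-restarted"
    # Heavier fallback: a new PeerConnection and new media channels.
    if recreate_peer_connection():
        return "peer-connection-recreated"
    # Last resort: the user has to make a new call from the UI.
    return "user-action-needed"


if __name__ == "__main__":
    # Example: signaling recovers, ICE restart fails, recreation works.
    outcome = recover_session(lambda: True, lambda: False, lambda: True)
    print(outcome)
```

   Keeping the order explicit makes it easy to stop early: when the
   signaling channel is still down, the heavier steps are not attempted
   at all and the application simply retries later.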

   There are certain things that the underlying platform (Operating
   System, Connection Manager etc.) can also implement to make
   interface switching smoother for the applications. One possibility
   would be to keep the old interface available for a short duration
   even after a new higher priority interface becomes available. This
   would allow applications to deal with the change in a more proactive
   fashion. There are also protocols such as Multipath TCP that could
   be used to switch e.g. WebSocket connections to a new interface
   without always resorting to application support.

3.4.  Congestion avoidance

   Cellular mobile networks have notoriously large buffers. Their link
   layers also typically operate in an "acknowledged" mode, meaning
   that lost frames (or packets) are retransmitted. Retransmission
   creates head-of-line blocking in the queue. This means packets are
   seldom lost, but delays grow large. The individual users or
   endpoints are often isolated from each other so that the network
   capacity is divided among them more or less evenly. However, all
   traffic to and from the same endpoint ends up in the same queue. In
   the WebRTC context this means that plain TCP traffic will easily
   ruin real-time traffic due to the buffering.

   WebRTC protocols should be designed to avoid this. If Data Channels
   transfer a lot of data in parallel with the real-time streams, they
   should not use loss-driven (TCP) congestion control algorithms but
   something that reacts to queue growth much faster. The IETF LEDBAT
   WG may have something to offer for this case. If the browser wants
   to protect its real-time streams in general against all TCP (HTTP,
   WebSocket) traffic, it might be best for it to also restrict the
   number of simultaneous TCP connections in use, for instance to
   retrieve a website. The HTTP 2.0 work done in the IETF HTTPBIS WG
   should prove helpful in this case.
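
   To illustrate what "reacting to queue growth" could mean, the
   following Python sketch adjusts a sending rate from measured delay
   rather than from loss. It is a deliberately simplified, rate-based
   controller loosely in the spirit of delay-based schemes, not the
   algorithm of any particular specification. The function name, the
   100 ms target, the gain, and the minimum rate are all assumptions
   made for illustration.

```python
def adjust_rate_kbps(rate_kbps: float,
                     base_delay_ms: float,
                     current_delay_ms: float,
                     target_ms: float = 100.0,
                     gain: float = 0.5,
                     min_rate_kbps: float = 16.0) -> float:
    """Return a new sending rate based on the estimated queuing delay.

    base_delay_ms is the lowest delay seen on the path (empty queue);
    anything above it is treated as queuing delay caused by buffering.
    """
    if target_ms <= 0:
        raise ValueError("target_ms must be positive")
    queuing_delay_ms = max(0.0, current_delay_ms - base_delay_ms)
    # +1 when the queue is empty, 0 at the target, negative above it.
    off_target = (target_ms - queuing_delay_ms) / target_ms
    # Bound a single step so one bad sample cannot collapse the rate.
    off_target = max(-1.0, min(1.0, off_target))
    new_rate = rate_kbps * (1.0 + gain * off_target)
    return max(min_rate_kbps, new_rate)


if __name__ == "__main__":
    rate = 1000.0
    # Delay samples in ms; the queue builds up and then drains again.
    for delay in (50, 80, 150, 250, 400, 150, 60):
        rate = adjust_rate_kbps(rate, base_delay_ms=50, current_delay_ms=delay)
        print(f"delay {delay:3d} ms: rate {rate:.0f} kbit/s")
```

   Unlike a loss-driven sender, this controller backs off as soon as
   the delay exceeds the target, long before a large cellular buffer
   would overflow and drop a packet.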

   Cellular networks also have built-in Quality of Service mechanisms
   that can be used to differentiate service for different packet
   flows. These are not widely used in HSPA/WCDMA, but LTE may change
   the situation to some extent. The QoS policy is enforced by the
   network, and requires a contract with the operator. It is thus
   likely only available for services with some relation to the access
   operator. How the WebRTC application or the browser deals with that
   is TBD. Technically, DiffServ marking is probably the only dynamic
   approach to indicate the priority of a particular flow.

4.  Security Considerations

   Not explicitly covered in this version.

5.  Acknowledgements

   Bernard Aboba and Goeran Eriksson provided useful comments on the
   document. Dan Druta has worked on Web notifications in the context
   of WebRTC.

6.  References

Author's Address

   Markus Isomaki
   Nokia
   Keilalahdentie 2-4
   FI-02150 Espoo
   Finland

   Email: markus.isomaki@nokia.com