ALTO                                                            C. Xiong
Internet-Draft                                                  Y. Zhang
Intended status: Standards Track                                 Tencent
Expires: 9 July 2021                                             R. Yang
                                                         Yale University
                                                                   G. Li
                                                                    CMRI
                                                                  Y. Lei
                                                                  Y. Han
                                                                 Tencent
                                                          5 January 2021

                  MoWIE for Network Aware Application
            draft-huang-alto-mowie-for-network-aware-app-02

Abstract

   With the rapid deployment of 5G networks worldwide, cloud-based
   interactive services such as cloud gaming have gained substantial
   attention and are regarded as potential killer applications.
   To ensure users' quality of experience (QoE), a cloud interactive
   service may require not only high bandwidth (e.g., for
   high-resolution media transmission) but also low delay (e.g., low
   latency and low lagging).  However, the bandwidth and delay
   experienced by a mobile and wireless user can be dynamic, as a
   function of many factors, and unhandled changes can substantially
   compromise users' QoE.  In this document, we investigate
   network-aware applications (NAA), which realize cloud-based
   interactive services with improved QoE through efficient use of
   Mobile and Wireless Information Exposure (MoWIE).  In particular,
   this document demonstrates, through realistic evaluations, that
   mobile network information such as the MCS (Modulation and Coding
   Scheme) can effectively expose the dynamicity of the underlying
   network and can be made available to applications through MoWIE;
   using such information, the applications can then adapt key control
   knobs such as media codec scheme, encapsulation, and application
   logical function to minimize QoE degradation.  Based on the
   evaluations, we discuss how MoWIE can be a systematic extension of
   the ALTO protocol, to expose more lower-layer and finer-grained
   network dynamics.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 9 July 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction of Network-aware Applications
   2.  Use Cases of Network-Aware Application (NAA)
     2.1.  Cloud Gaming
     2.2.  Low Delay Live Show
     2.3.  Cloud VR
     2.4.  Performance Requirements of these Use Cases
   3.  Current (Indirect) Technologies on NAA
     3.1.  Video Compression Based on ROI (Region of Interest)
     3.2.  AI-based Adaptive Bitrate
   4.  Preliminary QoE Improvement Based on MoWIE
     4.1.  MoWIE Architecture and Network Information exposure
     4.2.  RAN assisted TCP optimization based on MoWIE
     4.3.  NAA QoE Test based on MoWIE
     4.4.  ROI Detection with Network Information
     4.5.  Adaptive Bitrate with Network Capability Exposure
     4.6.  Analysis of the Experiments
   5.  Standardization Considerations of MoWIE as an Extension to ALTO
   6.  IANA Considerations
   7.  Security Considerations
   8.  Acknowledgments
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction of Network-aware Applications

   With the rapid and wide deployment of 5G networks worldwide, more
   and more applications are moving to the remote cloud, e.g., cloud
   office, cloud education, and cloud gaming.  Some new applications
   are created and hosted in the remote cloud, e.g., cloud AR/VR/MR.
   What's more, many traditionally niche interactive applications are
   becoming widely used in daily business with the help of the mobile
   network and the cloud, e.g., cloud video conferencing.  In
   particular, during the coronavirus pandemic in 2020, many people had
   to stay at home and work or study remotely, and the usage of cloud
   applications, including cloud-based online courses, cloud-based
   conferencing, and cloud gaming, surged significantly.

   To provide acceptable QoE to end users via the mobile network, the
   cloud application needs to know the mobile network status, e.g.,
   delay, bandwidth, and jitter, to dynamically balance the generated
   media traffic and the rendering/mixing in the cloud.  Currently, the
   application treats the network as a black box and continuously uses
   client or server measurements to detect the network characteristics,
   and then adaptively changes its parameters as well as its logical
   function.  However, when only application information is utilized,
   the application cannot guarantee a good QoE in some cases.  First,
   information from the application side may be delayed.
   When a user enters a place with poor network coverage, such as an
   elevator or an underground garage, the application does not receive
   such information immediately.  As a result, the buffer of a video
   application has a high chance of running out.  Then the screen
   freezes and the user's QoE is harmed.  Second, the application has
   no information about other users in the cell.  Thus, it cannot know
   how many resources it can get or when this will change.  If other
   users enter the cell and compete for resources, the application
   layer may misjudge the available resources and request a high
   bitrate.  Then the delay increases and QoE drops.  Some
   network-layer information, such as physical resource block (PRB)
   information and utilization rate, can help to describe how many
   resources the user will get and how many users are competing for
   them.  Such information is helpful for predicting network
   conditions when streaming video.  However, the application cannot
   obtain this kind of information yet.

   Mobile networks have long pursued standard solutions for obtaining
   network dynamics indicators that can be used by applications.  In
   3GPP, many IP-based QoE mechanisms are reused.  ECN [RFC3168] is
   supported by the 4G radio base station (eNB) to provide CE
   (Congestion Encountered) information to the IMS application so that
   it can perform Adaptive Bitrate (ABR) [TS26.114].  The application
   can lower the bit rate after receiving the CE indication, but does
   not know the exact bit rate to select.  DSCP [RFC2474] is used to
   differentiate the QoS class and paging strategy [TS23.501];
   normally, the application cannot dynamically change the DSCP to
   improve the bit rate based on the network status.
   DASH [MPEGDASH] is an MPEG standard widely used by applications to
   estimate the throughput of the network based on the current
   throughput and buffering state and to adaptively select the next
   segment of the video stream with a suitable bitrate in order to
   avoid re-buffering.  SAND-DASH [TS26.247] defines a mechanism by
   which the network/server can provide the available throughput to
   the application; in that case, the DASH application can select a
   better bitrate.

   In 5G cellular networks, network capability exposure has been
   specified, which allows the 5G system to expose the establishment
   of QoS Flows with AF-provided QoS requirements, user device
   location, and network status to third-party application servers
   modeled as an AF (Application Function) [TS23.501].  In such a
   case, the AF can request the 5G system to establish a dedicated QoS
   Flow to transport an IP flow with the AF-provided QoS requirements.
   The 5G system can also provide QNC (QoS Notification Control) to
   the AF if the GBR (Guaranteed Bitrate) of the established GBR QoS
   Flow cannot be fulfilled, and the AF can change the bitrate after
   receiving the QNC notification.  But the AF still does not know
   which bitrate to select.  So 5G enhances the QNC by providing a
   list of AQPs (Alternative QoS Profiles).  With this enhancement,
   the 5G network provides the subset of AQPs it currently supports
   together with the QNC, and the AF selects a bit rate from those
   AQPs; in this way, the GBR can be fulfilled again if the radio
   state of the user changes.  QoS prediction is realized by a network
   function inside the 5GC that collects and analyzes the status and
   parameters from the 5G network entities and delivers the analytics
   results to entities such as the application server.  However, both
   the network capability exposure and the QoS prediction solutions
   are designed for the 5G access and core network and cannot cover
   the whole end-to-end network.
   How to enable applications to be aware of the lower-layer networks
   in the Internet scenario is an important area for both industrial
   and academic researchers.

2.  Use Cases of Network-Aware Application (NAA)

   There are three typical NAAs, cloud gaming, low delay live show,
   and cloud VR, whose QoE can be largely enhanced with the help of
   MoWIE.

2.1.  Cloud Gaming

   As mentioned above, cloud gaming has become increasingly popular
   recently.  This kind of game requires low-latency and highly
   reliable transmission of motion tracking data from the user to the
   gaming server in the cloud, as well as low-latency and
   high-data-rate transmission of processed visual content from the
   gaming server to the user device.  Cloud gaming is regarded as a
   major killer application as well as a major traffic contributor for
   wireless and cellular networks, including 5G.  The major advantages
   of cloud gaming are an easy and quick start (little or no need to
   download and install large software packages on the user device)
   and lower cost and processing load on the user device; it is also
   regarded as an anti-cheating measure.  Thus, this kind of gaming
   becomes a competitive replacement for console gaming using a
   cheaper PC or laptop.  To support high-quality cloud gaming
   services, the application needs to get information from the network
   layer, e.g., the data rate value or range that the lower layer can
   provide, in order to perform rendering and encoding, during which
   the application in the cloud can adopt different parameters to
   adjust the size of the visual content produced within a time
   period.

2.2.  Low Delay Live Show

   In 2019, over 500 million active users were using online personal
   live show services in China, and 4 million viewers were
   simultaneously online watching a celebrity's show.  Low delay live
   show requires close interaction between the application and the
   network.
   Compared with conventional broadcast services, this service is
   interactive, which means the audience can be involved and can
   provide feedback to the anchor.  For example, a gaming show
   broadcasts the game play to the whole audience, and it also
   requires in-game interaction between the anchor and the audience.
   A delay lower than 100 ms is desired.  If the delay is too large,
   the user experience degrades undesirably, especially in a
   large-scale show.  To lower the latency and provide size-adjustable
   show content, the application also requires real-time lower-layer
   information.

2.3.  Cloud VR

   The data volume of cloud VR is large and depends on parameter
   settings such as DoF (Degrees of Freedom), resolution, and the
   rendering and compression algorithms adopted.  The rendering can be
   performed at the cloud/network side or split between the cloud and
   the user device.  Because the latency requirement of cloud VR can
   be as low as 20 ms, the application may need to interact with the
   network to get information about segmentation or transport blocks,
   and this lower-layer information may depend on different layer 2
   and layer 3 wireless protocol designs.

2.4.  Performance Requirements of these Use Cases

   The above services have different bandwidth, latency, and lagging
   requirements, which are characterized as parameter ranges.  A range
   is used because such requirements depend on a group of parameter
   settings, including resolution, frame rate, and compression
   mechanism.  We consider 1080p~4K as the resolution range, 60~120
   FPS (frames per second) as the frame rate range, and H.265 as an
   example compression algorithm.  The end-to-end latency requirement
   depends not only on the FPS but also on the nature of the service,
   i.e., whether it is weakly or strongly interactive.

   With typical parameter settings, cloud gaming generally needs a
   bandwidth of 20~60 Mbps.  We also consider that significant lagging
   occurs when the latency is larger than 40~200 ms, depending on the
   type of game (e.g., 40 ms for first-person shooter games, 80 ms for
   action games, and 200 ms for puzzle games).  To avoid a bad user
   experience, the lagging rate should be as low as possible, ideally
   zero for optimal QoE.  For low latency live show, 20~50 Mbps of
   bandwidth may be needed and the end-to-end latency requirement is
   less than 100 ms.  Cloud VR service generally requires 100~500 Mbps
   of bandwidth and 20~50 ms of end-to-end latency.  Note that these
   values depend on the parameter settings and are provided to
   illustrate the order of magnitude of these parameters for the
   aforementioned use cases.  These value ranges may be updated
   according to specific scenarios and requirements.

3.  Current (Indirect) Technologies on NAA

   Applications have tried to increase QoE by using information
   captured at the application layer, such as bitrate, buffer status,
   and packet loss rate, to guess the network dynamics.  For example,
   adaptive bitrate (ABR) and buffer control methods have been
   proposed to reduce delay, and application-layer forward error
   correction (AL-FEC) schemes to mitigate packet loss.  This document
   focuses on two novel approaches that have achieved good performance
   in practice.  One is video encoding based on ROI; the other is
   reinforcement-learning-based adaptive bitrate.

3.1.  Video Compression Based on ROI (Region of Interest)

   The foveation mechanism [Saccadic] of the Human Visual System
   indicates that only a small foveal region captures most visual
   attention at high resolution, while the peripheral regions receive
   little attention at low resolution.
   The regions that attract users' attention most are called the
   regions of interest (ROI) [Fahad].

   To predict human attention or ROI, saliency detection has been
   widely studied in recent years, with many applications in object
   recognition, object segmentation, action recognition, image
   captioning, image/video compression, etc.

   Since a video contains regions of interest, the cloud server can
   encode the ROI at a higher rate and the other regions at a lower
   rate.  As a result, the overall rate of the video is reduced
   without harming the viewing experience.

   This method detects the ROI and re-allocates the coding scheme
   between the regions of interest and the other regions in order to
   save bandwidth without sacrificing the user's QoE.  In recent
   years, the ever-increasing video size has become a big problem for
   applications.  The data rate of a cloud gaming video at 1080p can
   reach 25 Mbps, which places a huge burden on the network, even a 5G
   network.  ROI-based video compression methods are mainly applied in
   high-concurrency networks to relieve the burden on the network and
   keep QoE in an acceptable range.

   However, current methods use application information, such as
   application rate and application buffer size, as indicators to
   roughly adjust the algorithm in interactive video services.  Such
   information can hardly reflect the real-time network status
   precisely.  Therefore, it is hard to balance QoE and bandwidth
   saving in real-time scenarios.  More direct information would help
   these ROI methods improve their performance.

3.2.  AI-based Adaptive Bitrate

   This method intends to reduce lagging while ensuring acceptable
   picture quality.

   Applications such as live video streaming and cloud gaming employ
   adaptive bitrate (ABR) algorithms to optimize user QoE [MPC][CS2P].
   Despite the abundance of recently proposed schemes,
   state-of-the-art AI-based ABR algorithms suffer from a key
   limitation: they use fixed control rules based on simplified or
   inaccurate models of the deployment environment.  As a result,
   existing schemes inevitably fail to achieve optimal performance
   across a broad set of network conditions and QoE objectives.

   A reinforcement-learning-based ABR algorithm named Pensieve was
   proposed recently [Hongzi].  Unlike traditional ABR algorithms that
   use fixed heuristics or inaccurate system models, Pensieve's ABR
   algorithms are generated using observations of the resulting
   performance of past decisions across a large number of video
   streaming experiments.  This allows Pensieve to optimize its policy
   for different network characteristics and QoE metrics directly from
   experience.  Over a broad set of network conditions and QoE
   metrics, Pensieve has been shown to outperform existing ABR
   algorithms by 12%~25%.

   For this method and the methods built upon it, it has been shown
   that all information that can reflect the performance, such as
   rate, download time, buffer size, or network-level information, is
   useful for reinforcement learning.  Since these data reflect the
   network dynamics, they have been used to help applications decide
   how to change the rate and improve users' QoE.

   However, all these data are obtained from the client side or the
   server side.  In reality, it is not easy to obtain such data in an
   effective and efficient way.  The lack of a standardized approach
   for acquiring these data makes it difficult for different
   applications to use them in large-scale deployments.  Meanwhile,
   these data, which reflect the real-time network status, change
   rapidly and randomly, which makes them hard to characterize with a
   theoretical model.

   To summarize, current practices can make some improvements by
   indirectly measuring the network status and reacting in the
   application.  However, the network status data obtained this way is
   not rich, direct, or real-time, and it lacks predictability,
   especially in mobile and wireless network scenarios, which results
   in long reaction delays or large QoE fluctuations.

4.  Preliminary QoE Improvement Based on MoWIE

4.1.  MoWIE Architecture and Network Information exposure

   The fundamental idea of MoWIE is to deliver network information
   from the network to applications, on demand and periodically,
   helping the service provider to apply better policy control and
   improve the user experience.

   A possible MoWIE architecture includes three core components: the
   Client Application, the Mobile Network, and the Application Server.

   The raw data are first collected from the radio network and the
   core network; the collected data are then further processed, and
   the resulting network information is exposed to the Application
   Server.  These functions are defined as the network information
   service (NIS), and the NIS can be deployed at MEC (Mobile Edge
   Computing).  The application server can send an NIS request for
   UE/cell-level information and obtain an NIS response carrying
   network information from the mobile network.  After pre-processing
   the user data, the application server makes use of the network
   information to perform analytics and directly influence application
   functions, e.g., bit rate and data amount.
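
   As an illustration only, the NIS request/response exchange
   described above could be modeled as follows.  This is a minimal
   sketch, not a defined MoWIE or ALTO message format: the class
   names, the field names, and the in-memory
   NetworkInformationService are hypothetical, and the fields simply
   mirror the cell-level and UE-level items listed in this section.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CellInfo:
    """Cell-level items (hypothetical field names)."""
    dl_prbs_occupied: int      # downlink PRBs occupied in the sampling period
    dl_mac_rate_mbps: float    # downlink MAC data rate per cell


@dataclass
class UeInfo:
    """UE-level items (hypothetical field names, no privacy information)."""
    ul_sinr_db: float
    mcs_index: int
    pdcp_buffer_packets: int
    dl_pdcp_sdu_packets: int
    pdcp_sdu_packets_lost: int
    dl_mac_rate_mbps: float


@dataclass
class NisRequest:
    """Request sent by the application server (hypothetical shape)."""
    cell_id: str
    ue_ids: List[str] = field(default_factory=list)


@dataclass
class NisResponse:
    """Response returned by the NIS (hypothetical shape)."""
    cell: Optional[CellInfo]
    ues: Dict[str, UeInfo]


class NetworkInformationService:
    """In-memory stand-in for an NIS deployed at the edge (assumption)."""

    def __init__(self) -> None:
        self._cells: Dict[str, CellInfo] = {}
        self._ues: Dict[str, Dict[str, UeInfo]] = {}

    def report(self, cell_id: str, cell: CellInfo,
               ues: Dict[str, UeInfo]) -> None:
        """Store processed measurements collected from the network."""
        self._cells[cell_id] = cell
        self._ues[cell_id] = dict(ues)

    def handle(self, request: NisRequest) -> NisResponse:
        """Return cell info plus info for the requested UEs that are known."""
        known = self._ues.get(request.cell_id, {})
        selected = {u: known[u] for u in request.ue_ids if u in known}
        return NisResponse(cell=self._cells.get(request.cell_id),
                           ues=selected)
```

   In this sketch, an application server would call handle() with the
   cell and UE identifiers it is interested in and feed the returned
   values into its own analytics; unknown cells or UEs are simply
   omitted from the response.
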

   Typically, the network information includes the following two
   types of information:

   Cell level information:

   *  The number of downlink PRBs (Physical Resource Blocks) occupied
      during the sampling period;

   *  The downlink MAC data rate per cell.

   UE level information (without privacy information):

   *  The uplink SINR (Signal to Interference plus Noise Ratio);

   *  The index of the MCS (Modulation and Coding Scheme);

   *  The number of packets occupied in the PDCP buffer;

   *  The number of downlink PDCP SDU packets;

   *  The number of PDCP SDU packets lost;

   *  The downlink MAC data rate per UE.

4.2.  RAN assisted TCP optimization based on MoWIE

   RAN information is used to assist TCP sending window adjustment
   rather than traditional transport-layer measurement and
   acknowledgement.  The RAN proactively predicts the available radio
   bandwidth and the buffer status per UE at a time granularity of the
   RTT level (e.g., 100 ms) and then piggybacks such information in
   TCP ACKs.

   We have conducted a trial in a real mobile network.  It is observed
   that for UEs with good SINR, the throughput is improved by nearly
   100%, and UEs with medium SINR achieve a gain of approximately 50%.

4.3.  NAA QoE Test based on MoWIE

   Different from traditional video streaming, cloud gaming has no
   buffer to accommodate and re-arrange the received data.  It must
   display the stream as soon as it is received.  Any late data is of
   no use to the player.  According to our measurements, cloud gaming
   does not perform well in the existing public 4G network.  The
   end-to-end delay, including the codec delay, is often greater than
   100 ms from a gaming client in Shenzhen to a gaming server in
   Shanghai.  Here the delay is defined as the total delay from the
   user's operation instruction to the display of the response picture
   on the user's screen.  Once the network fluctuates, users
   experience an even longer delay.

   The poor user experience is caused not only by the relatively low
   network throughput, but also by the fact that the server cannot
   adapt the application logical policies (e.g., codec scheme and data
   bitrate).

   The popularity of 4K and even higher resolutions and increasing FPS
   for cloud gaming and AR/VR services requires both high bandwidth
   and low latency in wireless and cellular networks.  Increasing
   resolution incurs a higher encoding and decoding delay.  However,
   users' tolerance of delay does not increase with the resolution,
   which means the application needs to adapt to the network dynamics
   more efficiently.  The higher the resolution, the larger the range
   of rate adaptation that can be used.

   In this section, we conduct experiments based on the methods
   described in Section 3 to improve the QoE of cloud gaming.  The
   performance of network-aware and native non-network-aware
   mechanisms is compared.

4.4.  ROI Detection with Network Information

   The first experiment is based on ROI detection.  We investigate
   the impact of network awareness.

   Saliency detection methods have successfully reduced the size of
   videos and improved users' QoE in video downloading [Saliency].
   However, they are not effective when applied to real-time
   interactive streaming such as cloud gaming.

   A more accurate saliency region detection algorithm needs more
   time to obtain its result.  When users suffer from a poorly
   performing network in cloud gaming, such precise detection adds
   more delay to the system and, as a result, harms the final QoE.

   If the application can learn the network conditions in real time,
   it can choose the algorithm based on how much delay the system can
   tolerate.
   If the network condition is good enough, it can adopt an algorithm
   with a deeper learning network, and the added delay will not be
   perceived by the end users.  Thus, it can save substantial
   bandwidth without harming the QoE.  On the other hand, in a network
   with bad conditions, the server can use the fastest method to avoid
   extra delay.

   We conduct experiments to show how network information influences
   the overall QoE and bandwidth saving in ROI detection.

   The following four methods are compared:

   1)  The original video, without any ROI method.  This acts as the
       baseline.

   2)  A quick saliency detection and encoding method, which is not
       accurate in some cases.  It adds only 10 ms of delay.

   3)  A relatively accurate saliency detection method.  In general,
       the more precise an algorithm is, the more time it takes to
       produce its result.  The complexity of the picture also
       influences the detection time and accuracy.  For our test
       video, we adopt a method that adds a delay of about 40~70 ms.

   4)  The application server in the cloud has the current bandwidth
       information, which is derived from the wireless LAN NIC.  This
       is a simulation in which all the collected bandwidth traces are
       already known to the server.  Thus, it can use the bandwidth
       traces to compute the transmission delay.  The server can then
       select the saliency detection algorithm based on this
       information and encode the video.

   Although future bandwidth prediction is not always accurate in a
   real environment, this assumption does not influence the final
   results much.  Since in cloud gaming the server encodes the stream
   based on ROI information frame by frame instead of at the
   granularity of chunks, the future bandwidth prediction window does
   not have to be long.
   Therefore, even if the server can only get the bandwidth or delay
   prediction for a short time window, it can still use this method
   with network information.

   Test environment:

   A 720p game video segment with a rate of 6.8 Mbps.  This is not a
   very high bandwidth requirement for cloud gaming; it simply shows
   how such a stream benefits from MoWIE.  Cases with higher bandwidth
   requirements will benefit more if the bandwidth fluctuates
   strongly.

   The three networks are all wireless networks whose available
   bandwidth varies frequently:

   Network 1:  The overall network condition is not very good.  The
      average network bandwidth is 7.1 Mbps, but it fluctuates
      continuously, and the minimum is only 3.9 Mbps.

   Network 2:  The overall network condition is good, with an average
      network bandwidth of 12 Mbps and a minimum of 6.4 Mbps.

   Network 3:  The network fluctuates dramatically, with an average
      network bandwidth of 8.4 Mbps and a minimum network bandwidth of
      3.7 Mbps.

   Test content:

   The four methods are applied to the original video under each of
   the three networks.  After re-encoding based on the saliency
   detection, we calculate the new QoE and the saved bandwidth.  The
   results are shown in Figure 4-1.  The QoE value is the MOS as
   standardized by the ITU.

   +---+-----------------+-----------------+-----------------+
   |   |    Network 1    |    Network 2    |    Network 3    |
   +---+---+-------------+---+-------------+---+-------------+
   |   |QoE|  BW Saving  |QoE|  BW Saving  |QoE|  BW Saving  |
   +---+---+-------------+---+-------------+---+-------------+
   | 1 |3.8|      0      |4.8|      0      |4.3|      0      |
   +---+---+-------------+---+-------------+---+-------------+
   | 2 |3.8|     5%      |4.8|     9%      |4.3|     7%      |
   +---+---+-------------+---+-------------+---+-------------+
   | 3 |2.2|    2.1%     |4.6|     38%     |3.1|     34%     |
   +---+---+-------------+---+-------------+---+-------------+
   | 4 |3.6|     9%      |4.7|     33%     |4.3|     25%     |
   +---+---+-------------+---+-------------+---+-------------+

                 Figure 4-1: QoE and Bandwidth Saving

   Conclusion:

   It can be seen that methods that do not rely directly on network
   information, such as Method 2 and Method 3, have certain
   limitations.

   Though Method 2 is simple and fast, it can only detect a small
   part of the region of interest accurately.  Thus, even if the
   network condition is very good, it can only save a small amount of
   bandwidth, and it sometimes detects the ROI incorrectly.  When the
   detection misses the actual ROI, the QoE is reduced.

   For Method 3, the algorithm is complicated and can correctly
   detect the user's area of interest, so it can re-allocate the
   encoding scheme and save a lot of bandwidth.  However, the
   algorithm introduces a higher delay.  When the user's network
   condition is poor, the extra delay makes the user's QoE even worse.
   Although bandwidth is saved, the user experience is seriously
   affected.

   Method 4 is based on the application's awareness of the network.
   If the application knows certain network information, it can
   balance the complexity of the algorithm (which introduces delay)
   against the accuracy of the algorithm (which saves bandwidth)
   according to the actual network conditions.
   As can be seen from the experiment, Method 4 can preserve the
   user's QoE and save substantial bandwidth at the same time.

4.5.  Adaptive Bitrate with Network Capability Exposure

   This experiment performs AI-based rate adaptation using the
   network information provided by the cellular base station (eNB) in
   a cellular network.

   Tencent has launched real-network testing of NAA-enabled cloud
   gaming in the China Mobile LTE network, with an enhancement in the
   eNB to support base station information exposure.

   To enable the NAA mechanism, some cellular network information is
   collected from eNBs at an adaptive interval based on the change
   rate of the network status.  This information is categorized into
   two levels, i.e., cell level and UE level.  Cell level information
   is common to all the UEs under a serving LTE cell, and UE level
   information is specific to each UE.  The 3GPP LTE specifications
   specify how the PDCP (Packet Data Convergence Protocol), RLC (Radio
   Link Control), MAC (Medium Access Control), and PHY (Physical)
   protocols operate, and this information consists of essential
   statistics from these protocol layers.

   Note that in the NAA mechanism, since the network information
   comes from the eNB, and the eNB has real-time radio link quality
   statistics and layer 1 and layer 2 operation information, the NAA
   mechanism can expose rich information to the upper layer.  For
   example, it can differentiate packet loss from congestion, which is
   very helpful to applications in practice.

   To compare the cases with and without NAA, the cloud gaming test
   environment is set up with 1080p resolution and a bitrate of around
   20 Mbps.

   Test scenarios 1~5 are as follows.

   Test scenario 1:  Weak network.  In this scenario the radio link
      quality is low, e.g., in a cell edge area, and the bandwidth is
      not sufficient to serve cloud gaming.
   Test scenario 2: User competition.  In this scenario the number of
   users is large, so the cellular network bandwidth cannot serve all
   the cloud gaming users.

   Test scenarios 3-5: Other scenarios with random user movement traces
   and user distributions.

   Test method: To simplify the comparison, we use only the MCS (MCS
   index) information derived from the eNB [TS38.214].  The information
   is provided directly to the application, and the application then
   adjusts the bit rate according to this information.  Here, the MCS
   index indicates the modulation (e.g., QPSK, 16QAM) and the coding
   rate used during physical layer transmission, which relate to the
   actual data rate per UE.  The benchmark method adopts a constant bit
   rate, without any information to help it predict the network
   condition.  We run both methods in these scenarios and observe the
   reduction of delay when the eNB data is used.

   For all scenarios, the lagging rate is the performance indicator.
   In our experiments, we assume that lagging happens when the
   transmission delay of a frame is greater than 200 ms, and the
   lagging rate is defined as the ratio between the number of frames
   whose transmission delay is greater than 200 ms and the total number
   of frames.

   +-------------+--------------------------+
   |Test Scenario| Reduction of Lagging Rate|
   +-------------+--------------------------+
   |      1      |           46%            |
   +-------------+--------------------------+
   |      2      |           21%            |
   +-------------+--------------------------+
   |      3      |           37%            |
   +-------------+--------------------------+
   |      4      |           56%            |
   +-------------+--------------------------+
   |      5      |           32%            |
   +-------------+--------------------------+

                 Figure 4-2: Reduction of Lagging Rate

   It can be clearly seen that, with the MCS information, the
   application can adjust the bit rate to decrease the lagging rate and
   thereby significantly improve the user QoE.
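
   The lagging-rate metric and the MCS-driven bit rate adjustment used
   in this experiment can be sketched as follows.  This is only an
   illustrative sketch, not the implementation used in the tests: the
   200 ms threshold comes from the text above, but the function names,
   the MCS-to-bitrate step table, and the reading of "reduction of
   lagging rate" as a reduction relative to the constant bit rate
   benchmark are all assumptions made for illustration.

```python
from typing import Sequence

LAG_THRESHOLD_MS = 200.0  # lagging threshold stated in the text


def lagging_rate(frame_delays_ms: Sequence[float],
                 threshold_ms: float = LAG_THRESHOLD_MS) -> float:
    """Fraction of frames whose transmission delay exceeds the threshold."""
    if not frame_delays_ms:
        return 0.0
    lagging = sum(1 for d in frame_delays_ms if d > threshold_ms)
    return lagging / len(frame_delays_ms)


def lagging_rate_reduction(baseline: float, with_naa: float) -> float:
    """Reduction of the lagging rate relative to the baseline.

    Assumption: the reduction is relative to the benchmark lagging rate;
    the draft does not state this explicitly.
    """
    if baseline == 0:
        return 0.0
    return (baseline - with_naa) / baseline


# Hypothetical step table: (minimum MCS index, target bitrate in Mbps).
# The thresholds and bitrates are illustrative, not measured values.
HYPOTHETICAL_MCS_BITRATE_STEPS = [(0, 4.0), (10, 8.0), (17, 14.0), (23, 20.0)]


def select_bitrate_mbps(mcs_index: int) -> float:
    """Pick the bitrate of the highest step whose minimum MCS is reached."""
    bitrate = HYPOTHETICAL_MCS_BITRATE_STEPS[0][1]
    for min_mcs, mbps in HYPOTHETICAL_MCS_BITRATE_STEPS:
        if mcs_index >= min_mcs:
            bitrate = mbps
    return bitrate
```

   For example, lagging_rate([100, 250, 300, 150]) counts two of four
   frames above 200 ms and yields 0.5.  A step table is used here only
   because it is easy to read; the experiment itself describes an
   AI-based rate adaptation, which this sketch does not reproduce.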
   In the weak network scenario, NAA reduces the lagging rate by 46%.

4.6.  Analysis of the Experiments

   The experiments above demonstrate the performance gain of NAA with
   MoWIE.

   Although application information can also help to predict the
   network condition and has already been used in adaptive bit rate
   methods, in many cases it is less sensitive than eNB information
   when conditions first begin to change.  For example, when more users
   enter the cell, the PRB information is the first to reflect that
   each user may get less bandwidth.  The application information,
   however, can react only after a decreasing trend in the bitrate has
   appeared.  That is to say, the lower layer network information is
   more direct.

   Without MoWIE, the application cannot obtain the lower layer network
   information directly and must instead probe "blindly" to adapt to
   the dynamics of the lower layer network, which cannot meet the
   requirements of cloud interactive applications such as cloud gaming,
   low-delay live shows and Cloud VR.

   The more real-time network resource status the application can
   learn, the better it can predict how much network resource it can
   use within a prediction time window.  However, there is a tradeoff
   between the frequency of network information collection and the
   load and feasibility of that collection for the network devices.  In
   principle, the network status reporting should itself be
   lightweight, e.g., by properly controlling the reporting interval
   and the number of bits needed to convey the reported information
   elements.  In our experiments, the network status information is
   obtained at an adaptive interval based on the rate of change of the
   network status, in order to provide good prediction with less load
   introduced in the network.  In fact, not all scenarios need very
   frequent information collection.
   If some information changes only within a very small range and does
   not influence the final decision, it is unnecessary to report it all
   the time.  However, if its value varies beyond the preset threshold,
   the application is informed immediately.

   The distribution of the exposed data, and its impact on the
   performance gain of different algorithms, need further study.  This
   draft is intended to give guidance on what kind of data needs to be
   exposed during the initial deployment of these mechanisms.

   In our current cloud gaming deployment, application information can
   help to reduce the lagging rate by about 50%.  The remaining 50% of
   the room for improvement can be addressed by network information
   exposure with MoWIE.  In fact, the effects of the two layers of
   information can be cumulative.  However, due to current deployment
   limitations, we cannot collect the application information and the
   eNB information at the same time.  Thus, in this version of the
   draft we compare the performance with and without MoWIE; we do not
   compare the application information assisted mode with the network
   information assisted mode.  This is ongoing work.  Since both
   application and eNB information can reflect network variation, we
   will compare the performance of the application information assisted
   mode, the network information assisted mode, and the mode that uses
   both layers of information.

5.  Standardization Considerations of MoWIE as an Extension to ALTO

   The previous mechanisms may also work with IEEE 802.11 standards
   (e.g., EHT), helping the SP to better understand the network
   environment between the AP and STAs.  Because 802.11 devices operate
   in unlicensed spectrum and are easily influenced by adjacent
   unlicensed devices, duty cycle and related CQI information (e.g.,
   MCS and bandwidth)
   are considered very important network information here.

   MoWIE can be a realistic, important extension to ALTO to serve the
   aforementioned use cases in the setting of the newer generation (5G)
   of cellular networks, which are completely open IP-based networks in
   which routers/UPFs with IP connectivity will be deployed much closer
   to the users.  One may consider not only the aforementioned cloud-
   based multimedia applications, but also other latency-sensitive
   applications such as connected vehicles and autonomous driving.

   Extending ALTO with MoWIE, therefore, may allow ALTO to expose lower
   layer network information to ensure higher application QoE for a
   wide spectrum of applications.

   One possible approach to standardizing the distribution of the
   network information used in the evaluations is to piggyback such
   information in the datapath.  One issue with the datapath method is
   that MoWIE intends to convey more complex and richer information
   than current methods, and piggybacking such information in the
   datapath consumes a lot of datapath resources.  On the other hand,
   the datapath-based method can deliver frequently changing network
   information, and it makes it much easier to synchronize the network
   information and the user data on the same time scale.  Normally,
   there is less user data in the uplink direction, and the free
   "space" within the MTU can be used to piggyback the network
   information to the application; in that case there is no need to
   create a second communication channel between the application and
   the network.  However, the datapath design may allow only limited
   privacy management, which is very important in MoWIE.  The
   application cannot trust the network information if there is no
   message authentication mechanism for the piggybacked network
   information.
   How the network inserts the network information into the data packet
   is also challenging, since many transport layer protocols are
   encrypted and integrity protected.  Another method is to create an
   associated path aligned with the datapath.  Like ICMP for IP and
   RTCP for RTP, this second path can be used to provide additional
   information associated with the datapath.  But creating such a
   second path is a big change to the transport protocols in wide use
   today, and many applications would also need to change, so this
   second path is also challenging.

   In 3GPP, network information exposure based on a control plane
   mechanism was introduced in 4G and 5G systems.  We mainly discuss an
   ALTO extension-based design for tackling this problem.
   Specifically, the MoWIE extension will reuse existing ALTO
   mechanisms, including the information resource directory, extensible
   performance metrics and calendaring, and unified properties.  It
   also requires modular, reusable extensions, which we plan to specify
   in detail in a separate document.  Below is an overview of key
   considerations; security considerations are in Section 7.

   *  Network information selection and binding consideration: Instead
      of hardcoding only specific network information, a modular design
      of MoWIE gives an ALTO client the ability to select only the
      relevant information (e.g., the cell DLOccupyPRBNum metric and
      the UE MCS) and then request it accordingly.  The existing ALTO
      information resource directory is a starting point, but the
      design needs to be generic, to provide abstraction for ease of
      use and extensibility.  The security mechanisms of the existing
      ALTO protocol should also be extended to enforce proper
      authorization.

   *  Compact network information encoding consideration: One benefit of
      ALTO is its high-level JSON based encoding.
      When the update frequency increases, however, the existing base
      protocol and existing extensions (in particular the SSE
      extension) may incur high bandwidth and processing overhead.
      Hence, the encoding and processing overhead of MoWIE should be
      considered.

   *  Stability and reliability consideration: A key benefit of the
      MoWIE extension is the ability to allow more flexible, better
      coordinated control.  Any control mechanism, however, should
      integrate fundamental overhead, stability and reliability
      mechanisms.

6.  IANA Considerations

   This document has no actions for IANA.

7.  Security Considerations

   The collection and distribution of MoWIE information should consider
   the security requirements of information privacy, information
   integrity protection, and authentication on both sides.  Since the
   network status is not directly related to any specific user, there
   is currently no privacy issue.  However, the information transmitted
   to the application can pass through many middleboxes and can be
   changed by a man in the middle.  To protect the network information,
   end-to-end encryption and integrity protection are needed.  Also,
   the network needs to ensure, through authentication, that the
   information is exposed only to the right applications.  These
   security requirements can be implemented with TLS and other security
   mechanisms.

8.  Acknowledgments

   The authors would like to thank Huang Wei for his contribution to the
   previous drafts.

9.  References

9.1.  Normative References

   [RFC2474]  Nichols, K., Blake, S., Baker, F., and D. Black,
              "Definition of the Differentiated Services Field (DS
              Field) in the IPv4 and IPv6 Headers", RFC 2474,
              DOI 10.17487/RFC2474, December 1998.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001.

9.2.  Informative References

   [CS2P]     Sun, Y., Yin, X., Jiang, J., Sekar, V., Lin, F., Wang, N.,
              Liu, T., and B. Sinopoli, "CS2P: Improving Video Bitrate
              Selection and Adaptation with Data-Driven Throughput
              Prediction", DOI 10.1145/2934872.2934898, 2016.

   [Fahad]    Fazal Elahi Guraya, F., Alaya Cheikh, F., and V. Medina,
              "A Novel Visual Saliency Model for Surveillance Video
              Compression", DOI 10.1109/SITIS.2011.84, 2011.

   [Hongzi]   Mao, H., Netravali, R., and M. Alizadeh, "Neural Adaptive
              Video Streaming with Pensieve",
              DOI 10.1145/3098822.3098843, 2017.

   [MPC]      Yin, X., Jindal, A., Sekar, V., and B. Sinopoli, "A
              Control-Theoretic Approach for Dynamic Adaptive Video
              Streaming over HTTP", DOI 10.1145/2785956.2787486, 2015.

   [MPEGDASH] ISO/IEC, "ISO/IEC 23009, Dynamic Adaptive Streaming over
              HTTP", 2020.

   [Saccadic] Matin, E., "Saccadic suppression: A review and an
              analysis", DOI 10.1037/h0037368, 1974.

   [Saliency] Guo, C. and L. Zhang, "A Novel Multiresolution
              Spatiotemporal Saliency Detection Model and Its
              Applications in Image and Video Compression",
              DOI 10.1109/TIP.2009.2030969, 2017.

   [TS23.501] 3rd Generation Partnership Project (3GPP), "System
              architecture for the 5G System (5GS)", 2020.

   [TS26.114] 3rd Generation Partnership Project (3GPP), "IP Multimedia
              Subsystem (IMS); Multimedia telephony; Media handling and
              interaction", 2020.

   [TS26.247] 3rd Generation Partnership Project (3GPP), "Progressive
              Download and Dynamic Adaptive Streaming over HTTP (3GP-
              DASH)", 2020.

   [TS38.214] 3rd Generation Partnership Project (3GPP), "NR; Physical
              layer procedures for data", 2020.

Authors' Addresses

   Chunshan Xiong
   Tencent
   Flat 9, No. 10 West Building, Xi Bei Wang East Road
   Beijing
   100090
   China

   Email: chunshxiong@tencent.com


   Yunfei Zhang
   Tencent
   Flat 9, No. 10 West Building, Xi Bei Wang East Road
   Beijing
   100090
   China

   Email: yanniszhang@tencent.com


   Y. Richard Yang
   Yale University
   Watson 208A, 51 Prospect Street
   New Haven, CT 06511
   United States of America

   Email: yang.r.yang@yale.edu


   Gang Li
   China Mobile Research Institute
   No.32, Xuanwumenxi Ave, Xicheng District
   Beijing
   100053
   China

   Email: ligangyf@chinamobile.com


   Yixue Lei
   Tencent
   Flat 9, No. 10 West Building, Xi Bei Wang East Road
   Beijing
   100090
   China

   Email: yixuelei@tencent.com


   Yunbo Han
   Tencent
   Tencent Building, No. 10000 Shennan Avenue, Nanshan District
   Shenzhen
   518000
   China

   Email: yunbohan@tencent.com