MOPS                                                         R. Krishna
Internet-Draft                              InterDigital Europe Limited
Intended status: Informational                                A. Rahman
Expires: September 26, 2021            InterDigital Communications, LLC
                                                         March 25, 2021

 Media Operations Use Case for an Augmented Reality Application on Edge
                        Computing Infrastructure
                     draft-ietf-mops-ar-use-case-00

Abstract

   A use case describing the transmission over the Internet of an
   application with several characteristics unique to Augmented
   Reality (AR) applications is presented for the consideration of the
   Media Operations (MOPS) Working Group.  One key requirement
   identified is that the Adaptive Bit Rate (ABR) algorithms' current
   use of policies based on heuristics and models is inadequate for AR
   applications running on Edge Computing infrastructure.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   This Internet-Draft will expire on September 26, 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.
Table of Contents

   1.  Introduction
   2.  Conventions used in this document
   3.  Use Case
       3.1.  Processing of Scenes
       3.2.  Generation of Images
   4.  Requirements
   5.  Informative References
   Authors' Addresses

1.  Introduction

   The MOPS draft [I-D.ietf-mops-streaming-opcons] provides an
   overview of operational networking issues that pertain to Quality
   of Experience (QoE) in the delivery of video and other high-bitrate
   media over the Internet.  However, it does not cover the
   increasingly large number of applications with Augmented Reality
   (AR) characteristics or their requirements on ABR algorithms, so
   the discussion in this draft complements the overview presented in
   that draft [I-D.ietf-mops-streaming-opcons].

   Future AR applications will bring several requirements for the
   Internet and for the mobile devices running these applications.  AR
   applications require real-time processing of video streams to
   recognize specific objects.  The result is then used to overlay
   information on the video displayed to the user.  In addition, some
   AR applications will also require the generation of new video
   frames to be played to the user.  Both the real-time processing of
   video streams and the generation of overlay information are
   computationally intensive tasks that generate heat [DEV_HEAT_1],
   [DEV_HEAT_2] and drain battery power [BATT_DRAIN] on the AR mobile
   device.  Consequently, in order to run future applications with AR
   characteristics on mobile devices, computationally intensive tasks
   need to be offloaded to resources provided by Edge Computing.

   Edge Computing is an emerging paradigm in which computing resources
   and storage are made available to mobile devices and sensors in
   close network proximity, at the edge of the Internet [EDGE_1],
   [EDGE_2].

   Adaptive Bit Rate (ABR) algorithms currently base their bit-rate
   selection policies on heuristics or models of the deployment
   environment that do not account for the environment's dynamic
   nature in use cases such as the one presented in this document.
   Consequently, ABR algorithms perform sub-optimally in such
   deployments [ABR_1].

2.  Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in [RFC2119].

3.  Use Case

   We now describe a use case that involves an application with the
   characteristics of AR systems.  Consider a group of tourists being
   guided around the historical site of the Tower of London.  As they
   move around the site and within the historical buildings, they can
   watch and listen to historical scenes in 3D that are generated by
   the AR application and then overlaid by their AR headsets onto
   their real-world view.  The headset continuously updates this view
   as they move around.

   The AR application first processes, in real time, the scene that
   the walking tourist is watching and identifies the objects that
   will be targeted for the overlay of high-resolution videos.  It
   then generates, in real time, high-resolution 3D images of
   historical scenes matched to the tourist's perspective.  These
   generated video images are then overlaid on the tourist's view of
   the real world.

   We now discuss this processing of scenes and generation of
   high-resolution images in greater detail.

3.1.  Processing of Scenes

   The AR application running on the mobile device first needs to
   track the pose (coordinates and orientation) of the user's head and
   eyes, and of the objects in view.  This requires tracking natural
   features and building an annotated point-cloud-based model that is
   then stored in a database.  To ensure that this database can scale
   up, techniques such as combining client-side simultaneous tracking
   and mapping with server-side localization are used [SLAM_1],
   [SLAM_2], [SLAM_3], [SLAM_4].  Once the natural features are
   tracked, virtual objects are geometrically aligned with those
   features.  This is followed by resolving the occlusion that can
   occur between the virtual and the real objects [OCCL_1], [OCCL_2].

   The next step for the AR application is to apply photometric
   registration [PHOTO_REG], which requires aligning the brightness
   and color of the virtual and real objects.  Additionally,
   algorithms that calculate the global illumination of both the
   virtual and real objects [GLB_ILLUM_1], [GLB_ILLUM_2] are executed.
   Various algorithms are also required to deal with artifacts
   produced by lens distortion [LENS_DIST], blur [BLUR], noise
   [NOISE], and so on.
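   The split between client-side tracking and server-side (Edge)
   localization described above can be illustrated with a short
   sketch.  The Python fragment below is purely illustrative and is
   not part of any specification: the function names, the feature
   representation, and the once-per-30-frames localization schedule
   are hypothetical choices introduced for exposition only.

   <CODE BEGINS>
   import random

   def extract_features(frame):
       # Stand-in for a natural-feature extractor; a real client
       # would compute descriptors (e.g., ORB) from the camera image.
       return [hash((frame, i)) & 0xFFFF for i in range(8)]

   def track_pose_locally(pose, frame):
       # Client-side simultaneous tracking and mapping: integrate
       # small frame-to-frame motion at low latency, accepting that
       # error accumulates (drift) between Edge fixes.
       x, y, theta = pose
       dx, dy, dtheta = (random.uniform(-0.01, 0.01) for _ in range(3))
       return (x + dx, y + dy, theta + dtheta)

   def localize_on_edge(features):
       # Server-side localization: match the features against the
       # annotated point-cloud database held at the Edge.  In a real
       # system this is a network round trip, so it runs only
       # periodically, correcting the client's accumulated drift.
       return (0.0, 0.0, 0.0)

   pose = (0.0, 0.0, 0.0)
   for frame_no in range(300):
       pose = track_pose_locally(pose, frame_no)    # every frame
       if frame_no % 30 == 0:                       # ~1 Hz at 30 fps
           pose = localize_on_edge(extract_features(frame_no))
   <CODE ENDS>

   The point of the split is that the latency-critical per-frame path
   stays on the device, while the expensive matching, and the
   point-cloud database itself, live at the Edge.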
3.2.  Generation of Images

   The AR application must generate high-quality video that has the
   properties described in the previous step and overlay the video on
   the AR device's display, a step called situated visualization.
   This entails dealing with any registration errors that may arise,
   ensuring that there is no visual interference [VIS_INTERFERE], and,
   finally, maintaining temporal coherence by adapting to the movement
   of the user's eyes and head.
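   As a rough illustration of the temporal-coherence step, the sketch
   below performs a full re-render of the overlay only when the
   tracked pose has changed by more than a threshold, and otherwise
   warps (reprojects) the previously rendered frame.  This is a
   simplification for exposition only; the threshold, function names,
   and data structures are assumptions, not part of any system cited
   above.

   <CODE BEGINS>
   import math

   REPROJECTION_THRESHOLD = 0.05  # assumed pose-change threshold

   def render_overlay(pose):
       # Full re-render of the situated visualization for this pose
       # (expensive; in this use case it runs on the Edge).
       return {"pose": pose, "frame": "high-quality overlay"}

   def reproject(last_frame, pose):
       # Cheap image-space warp of the last rendered frame to the
       # new pose, preserving temporal coherence between renders.
       warped = dict(last_frame)
       warped["pose"] = pose
       return warped

   def pose_distance(p, q):
       return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

   last_rendered_pose, last_frame = None, None
   for pose in [(0.0, 0.0), (0.01, 0.0), (0.2, 0.1)]:  # head poses
       if last_frame is None or \
          pose_distance(pose, last_rendered_pose) > \
          REPROJECTION_THRESHOLD:
           last_frame = render_overlay(pose)           # full render
           last_rendered_pose = pose
       else:
           last_frame = reproject(last_frame, pose)    # cheap warp
   <CODE ENDS>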
4.  Requirements

   The components of AR applications perform tasks, such as the
   real-time generation and processing of high-quality video content,
   that are computationally intensive.  As a result, on AR devices
   such as AR glasses, excessive heat is generated by the chip-sets
   involved in the computation [DEV_HEAT_1], [DEV_HEAT_2].
   Additionally, the battery on such devices discharges quickly when
   running such applications [BATT_DRAIN].

   A solution to the heat-dissipation and battery-drainage problem is
   to offload the processing and video-generation tasks to the remote
   cloud.  However, running such tasks in the cloud is not feasible,
   as the end-to-end delays must be within the order of a few
   milliseconds.  Additionally, such applications require high
   bandwidth and low jitter to provide a high QoE to the user.  In
   order to meet such hard timing constraints, computationally
   intensive tasks can instead be offloaded to Edge devices.

   Note that the Edge device providing the computation and storage is
   itself limited in such resources compared to the cloud.  So, for
   example, a sudden surge in demand from a large group of tourists
   can overwhelm that device.  This will result in a degraded user
   experience, as their AR devices experience delays in receiving the
   video frames.  In order to deal with this problem, the client AR
   applications will need to use Adaptive Bit Rate (ABR) algorithms
   that choose bit-rate policies tailored in a fine-grained manner to
   the resource demands and play back the video with appropriate QoE
   metrics as the user moves around with the group of tourists.

   However, the heavy-tailed nature of several operational parameters
   makes prediction-based adaptation by ABR algorithms sub-optimal
   [ABR_2].  This is because, with such distributions, the law of
   large numbers converges too slowly, the sample mean can differ
   substantially from the distribution mean, and, as a result, the
   standard deviation and variance are unsuitable as metrics for such
   operational parameters [HEAVY_TAIL_1], [HEAVY_TAIL_2].  Other
   subtle issues with these distributions include the "expectation
   paradox" [HEAVY_TAIL_1], whereby the longer we have already waited
   for an event the longer we should expect to wait, and the mismatch
   between the size and count of events [HEAVY_TAIL_1].  This makes
   designing an algorithm for adaptation error-prone and challenging.
   Such operational parameters include, but are not limited to, buffer
   occupancy, throughput, client-server latency, and variable
   transmission times.  In addition, Edge devices and communication
   links may fail, and the logical communication relationships between
   the various software components change frequently as the user moves
   around with their AR device [UBICOMP].

   Thus, once the offloaded computationally intensive processing is
   completed on the Edge Computing infrastructure, the video is
   streamed to the user with the help of an ABR algorithm, which needs
   to meet the following requirements [ABR_1] (a sketch of one
   possible decision step follows the list):

   o  Dynamically changing ABR parameters: The ABR algorithm must be
      able to dynamically change its parameters, given the
      heavy-tailed nature of network throughput.  This may, for
      example, be accomplished by AI/ML processing on the Edge
      Computing infrastructure on a per-client or global basis.

   o  Handling conflicting QoE requirements: QoE goals often call for
      high bit-rates and a low frequency of buffer refills.  In
      practice, however, these goals can conflict: for example,
      increasing the bit-rate might mean the buffer must be refilled
      more frequently, as buffer capacity might be limited on the AR
      device.  The ABR algorithm must be able to handle this
      situation.

   o  Handling the side effects of selecting a specific bit-rate: For
      example, the ABR algorithm might keep a previously selected
      bit-rate, even when a better one is available, in order to
      maintain a non-fluctuating bit-rate and the resulting smoothness
      of video quality.  The ABR algorithm must be able to handle this
      situation.
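   The sketch below illustrates one possible shape for such a decision
   step; it is a hypothetical example, not a normative algorithm.  It
   estimates throughput using a low percentile of recent samples
   (more robust under heavy-tailed throughput than the sample mean
   discussed above), declines to step the bit-rate up when the buffer
   is low, and limits each switch to one step on the bit-rate ladder
   to preserve smoothness.  All constants and names are illustrative
   assumptions.

   <CODE BEGINS>
   # Hypothetical ABR decision step; constants are illustrative only.

   BITRATES_KBPS = [1000, 2500, 5000, 8000]  # available ladder
   SAFETY_FACTOR = 0.8                       # headroom below estimate
   MIN_BUFFER_S = 2.0                        # no step-up below this

   def throughput_estimate(samples_kbps):
       # Use a low percentile rather than the mean: with heavy-tailed
       # throughput, the sample mean is dominated by rare large
       # values and over-estimates sustainable capacity.
       ordered = sorted(samples_kbps)
       return ordered[len(ordered) // 4]     # 25th percentile

   def select_bitrate(samples_kbps, buffer_s, current_kbps):
       budget = SAFETY_FACTOR * throughput_estimate(samples_kbps)
       candidates = [b for b in BITRATES_KBPS if b <= budget]
       target = max(candidates) if candidates else BITRATES_KBPS[0]
       if target > current_kbps and buffer_s < MIN_BUFFER_S:
           return current_kbps               # avoid refill churn
       # Limit switches to one ladder step for smoothness.
       cur = BITRATES_KBPS.index(current_kbps)
       tgt = BITRATES_KBPS.index(target)
       step = max(min(tgt - cur, 1), -1)
       return BITRATES_KBPS[cur + step]

   # Example: bursty, heavy-tailed-looking samples; low buffer.
   samples = [1200, 900, 15000, 1100, 950, 22000, 1000, 1300]
   print(select_bitrate(samples, buffer_s=1.5, current_kbps=2500))
   <CODE ENDS>

   In this example the sample mean (roughly 5,400 kb/s, inflated by
   two large outliers) would suggest holding or raising the bit-rate,
   while the percentile-based estimate steps down to 1,000 kb/s.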
Alizadeh, "Neural Adaptive 237 Video Streaming with Pensieve", In Proceedings of the 238 Conference of the ACM Special Interest Group on Data 239 Communication, pp. 197-210, 2017. 241 [ABR_2] Yan, F., Ayers, H., Zhu, C., Fouladi, S., Hong, J., Zhang, 242 K., Levis, P., and K. Winstein, "Learning in situ: a 243 randomized experiment in video streaming", In 17th 244 USENIX Symposium on Networked Systems Design and 245 Implementation (NSDI 20), pp. 495-511, 2020. 247 [BATT_DRAIN] 248 Seneviratne, S., Hu, Y., Nguyen, T., Lan, G., Khalifa, S., 249 Thilakarathna, K., Hassan, M., and A. Seneviratne, "A 250 survey of wearable devices and challenges.", In IEEE 251 Communication Surveys and Tutorials, 19(4), p.2573-2620., 252 2017. 254 [BLUR] Kan, P. and H. Kaufmann, "Physically-Based Depth of Field 255 in Augmented Reality.", In Eurographics (Short Papers), 256 pp. 89-92., 2012. 258 [DEV_HEAT_1] 259 LiKamWa, R., Wang, Z., Carroll, A., Lin, F., and L. Zhong, 260 "Draining our Glass: An Energy and Heat characterization 261 of Google Glass", In Proceedings of 5th Asia-Pacific 262 Workshop on Systems pp. 1-7, 2013. 264 [DEV_HEAT_2] 265 Matsuhashi, K., Kanamoto, T., and A. Kurokawa, "Thermal 266 model and countermeasures for future smart glasses.", 267 In Sensors, 20(5), p.1446., 2020. 269 [EDGE_1] Satyanarayanan, M., "The Emergence of Edge Computing", 270 In Computer 50(1) pp. 30-39, 2017. 272 [EDGE_2] Satyanarayanan, M., Klas, G., Silva, M., and S. Mangiante, 273 "The Seminal Role of Edge-Native Applications", In IEEE 274 International Conference on Edge Computing (EDGE) pp. 275 33-40, 2019. 277 [GLB_ILLUM_1] 278 Kan, P. and H. Kaufmann, "Differential irradiance caching 279 for fast high-quality light transport between virtual and 280 real worlds.", In IEEE International Symposium on Mixed 281 and Augmented Reality (ISMAR),pp. 133-141, 2013. 283 [GLB_ILLUM_2] 284 Franke, T., "Delta voxel cone tracing.", In IEEE 285 International Symposium on Mixed and Augmented Reality 286 (ISMAR), pp. 39-44, 2014. 288 [HEAVY_TAIL_1] 289 Crovella, M. and B. Krishnamurthy, "Internet measurement: 290 infrastructure, traffic and applications", John Wiley and 291 Sons Inc., 2006. 293 [HEAVY_TAIL_2] 294 Taleb, N., "The Statistical Consequences of Fat Tails", 295 STEM Academic Press, 2020. 297 [I-D.ietf-mops-streaming-opcons] 298 Holland, J., Begen, A., and S. Dawkins, "Operational 299 Considerations for Streaming Media", draft-ietf-mops- 300 streaming-opcons-03 (work in progress), November 2020. 302 [LENS_DIST] 303 Fuhrmann, A. and D. Schmalstieg, "Practical calibration 304 procedures for augmented reality.", In Virtual 305 Environments 2000, pp. 3-12. Springer, Vienna, 2000. 307 [NOISE] Fischer, J., Bartz, D., and W. Strasser, "Enhanced visual 308 realism by incorporating camera image effects.", 309 In IEEE/ACM International Symposium on Mixed and 310 Augmented Reality, pp. 205-208., 2006. 312 [OCCL_1] Breen, D., Whitaker, R., and M. Tuceryan, "Interactive 313 Occlusion and automatic object placementfor augmented 314 reality", In Computer Graphics Forum, vol. 15, no. 3 , 315 pp. 229-238,Edinburgh, UK: Blackwell Science Ltd, 1996. 317 [OCCL_2] Zheng, F., Schmalstieg, D., and G. Welch, "Pixel-wise 318 closed-loop registration in video-based augmented 319 reality", In IEEE International Symposium on Mixed and 320 Augmented Reality (ISMAR), pp. 135-143, 2014. 322 [PHOTO_REG] 323 Liu, Y. and X. 
Granier, "Online tracking of outdoor 324 lighting variations for augmented reality with moving 325 cameras", In IEEE Transactions on visualization and 326 computer graphics, 18(4), pp.573-580, 2012. 328 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 329 Requirement Levels", BCP 14, RFC 2119, 330 DOI 10.17487/RFC2119, March 1997, 331 . 333 [SLAM_1] Ventura, J., Arth, C., Reitmayr, G., and D. Schmalstieg, 334 "A minimal solution to the generalized pose-and-scale 335 problem", In Proceedings of the IEEE Conference on 336 Computer Vision and Pattern Recognition, pp. 422-429, 337 2014. 339 [SLAM_2] Sweeny, C., Fragoso, V., Hollerer, T., and M. Turk, "A 340 scalable solution to the generalized pose and scale 341 problem", In European Conference on Computer Vision, pp. 342 16-31, 2014. 344 [SLAM_3] Gauglitz, S., Sweeny, C., Ventura, J., Turk, M., and T. 345 Hollerer, "Model estimation and selection towards 346 unconstrained real-time tracking and mapping", In IEEE 347 transactions on visualization and computer graphics, 348 20(6), pp. 825-838, 2013. 350 [SLAM_4] Pirchheim, C., Schmalstieg, D., and G. Reitmayr, "Handling 351 pure camera rotation in keyframe-based SLAM", In 2013 352 IEEE international symposium on mixed and augmented 353 reality (ISMAR), pp. 229-238, 2013. 355 [UBICOMP] Bardram, J. and A. Friday, "Ubiquitous Computing Systems", 356 In Ubiquitous Computing Fundamentals pp. 37-94. CRC 357 Press, 2009. 359 [VIS_INTERFERE] 360 Kalkofen, D., Mendez, E., and D. Schmalstieg, "Interactive 361 focus and context visualization for augmented reality.", 362 In 6th IEEE and ACM International Symposium on Mixed and 363 Augmented Reality, pp. 191-201., 2007. 365 Authors' Addresses 367 Renan Krishna 368 InterDigital Europe Limited 369 64, Great Eastern Street 370 London EC2A 3QR 371 United Kingdom 373 Email: renan.krishna@interdigital.com 374 Akbar Rahman 375 InterDigital Communications, LLC 376 1000 Sherbrooke Street West 377 Montreal H3A 3G4 378 Canada 380 Email: Akbar.Rahman@InterDigital.com