COINRG                                                        M. McBride
Internet-Draft                                                 Futurewei
Intended status: Standards Track                             D. Kutscher
Expires: January 14, 2021                               Emden University
                                                             E. Schooler
                                                                   Intel
                                                           CJ. Bernardos
                                                                    UC3M
                                                                D. Lopez
                                                              Telefonica
                                                           July 13, 2020


                      Edge Data Discovery for COIN
             draft-mcbride-edge-data-discovery-overview-04

Abstract

   This document describes the problem of distributed data discovery in
   edge computing, and in particular for computing-in-the-network
   (COIN), which may require both the marshalling of data at the outset
   of a computation and the persistence of the resultant data after the
   computation.  Although the data might originate at the network edge,
   as more and more distributed data is created, processed, and stored,
   it becomes increasingly dispersed throughout the network.  There
   needs to be a standard way to find it.  New protocols will need to be
   developed, and existing ones extended, to support distributed data
   discovery at the network edge and beyond.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 14, 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Edge Data
     1.2.  Background
     1.3.  Requirements Language
     1.4.  Terminology
   2.  Edge Data Discovery Problem Scope
     2.1.  A Cloud-Edge Continuum
     2.2.  Types of Edge Data
   3.  Edge Scenarios Requiring Data Discovery
   4.  Edge Data Discovery
     4.1.  Types of Discovery
     4.2.  Naming the Data
   5.  Use Cases of Edge Data Discovery
     5.1.  Autonomous Vehicles
     5.2.  Video Surveillance
     5.3.  Elevator Networks
     5.4.  Service Function Chaining
     5.5.  Ubiquitous Witness
   6.  IANA Considerations
   7.  Security Considerations
   8.  Acknowledgement
   9.  Normative References
   Authors' Addresses

1.  Introduction

   Edge computing is an architectural shift that migrates Cloud
   functionality (compute, storage, networking, control, data
   management, etc.) out of the back-end data center to be more
   proximate to the IoT data being generated and analyzed at the edges
   of the network.  Edge computing provides local compute, storage and
   connectivity services, often required for latency- and
   bandwidth-sensitive applications.  Thus, Edge Computing plays a key
   role in verticals such as Energy, Manufacturing, Automotive, Video
   Surveillance, Retail, Gaming, Healthcare, Mining, Buildings and Smart
   Cities.

1.1.  Edge Data

   Edge computing is motivated at least in part by the sheer volume of
   data that is being created by endpoint devices (sensors, cameras,
   lights, vehicles, drones, wearables, etc.) at the very network edge
   and that flows upstream, in a direction for which the network was not
   originally designed.
   In fact, in dense IoT deployments (e.g., many video cameras streaming
   high definition video), where multiple data flows collect or converge
   at edge nodes, data is likely to need transformation (to be
   transcoded, subsampled, compressed, analyzed, annotated, combined,
   aggregated, etc.) to fit over the next hop link, or even to fit in
   memory or storage.  Note also that the act of performing computation
   on the data creates yet another new data stream!  Preservation of the
   original data streams is needed sometimes but not always.

   In addition, data may be cached, copied and/or stored at multiple
   locations in the network en route to its final destination.  With an
   increasing percentage of devices connecting to the Internet being
   mobile, support for in-the-network caching and replication is
   critical for continuous data availability, not to mention efficient
   network and battery usage for endpoint devices.

   Additionally, as mobile devices' memory/storage fill up, in an edge
   context they may have the ability to offload their data to other
   proximate devices or resources, leaving a bread crumb trail of data
   in their wakes.  Therefore, although data might originate at edge
   devices, as more and more data is continuously created, processed and
   stored, it becomes increasingly dispersed throughout the physical
   world (outside of or scattered across managed local data centers),
   increasingly isolated in separate local edge clouds or data silos.
   Thus, there needs to be a standard way to find it.  New and existing
   protocols will need to be identified/developed/enhanced for these
   purposes.  Being able to discover distributed data at the edge or in
   the middle of the network will be an important component of Edge
   computing.

1.2.  Background

   Several IRTF T2T RG Edge Computing discussions have been held over
   the last couple of years.
   A comparative study on the definition of Edge computing was presented
   in multiple sessions in T2T RG in 2018 and an Edge Computing I-D was
   submitted in early 2019.  An IETF BEC (beyond edge computing) effort
   has been evaluating potential gaps in existing edge computing
   architectures.  Edge Data Discovery is one potential gap that was
   identified and that needs evaluation and a solution.  The newly
   proposed COIN RG highlights the need for computations in the network
   to be able to marshal potentially distributed input data and to
   handle resultant output data, i.e., its placement, storage and/or
   possible migration strategy.

1.3.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

1.4.  Terminology

   o  Edge: The edge encompasses all entities not in the back-end cloud.
      The device edge represents the very leaves of the network and
      encompasses the entities found in the last mile network.  Sensors,
      gateways, and compute nodes are included.  Because the things that
      populate the IoT can be physical and/or cyber, in some solutions,
      particularly in software-defined or digital-twin contexts, the
      device edge can include logical (vs physical) entities.  The
      infrastructure edge includes equipment on the network operator
      side of the last mile network including cell towers, edge data
      centers, cable headends, POPs, etc.  See Figure 1 for other
      possible tiers of edge clouds between the device edge and the
      back-end cloud data center.

   o  Edge Computing: Distributed computation that is performed near the
      network edge, where nearness is determined by the system
      requirements.  This includes high performance compute, storage and
      network equipment on either the device or infrastructure edge.

   o  Edge Data Discovery: The process of finding required data from
      edge entities, i.e., from databases, file systems, and device
      memory that might be physically distributed in the network, and
      providing access to it logically as if it were a single unified
      source, perhaps through its namespace, that can be evaluated or
      searched.

   o  ICN: Information Centric Networking.  An ICN-enabled network
      routes data by name (vs address), caches content natively in the
      network, and employs data-centric security.  Data discovery may
      require that data be associated with a name or names, a series of
      descriptive attributes, and/or a unique identifier.

2.  Edge Data Discovery Problem Scope

   Our focus is on how to define and scope the edge data discovery
   problem.  This requires some discussion of the evolving definition of
   the edge as part of a cloud-to-edge continuum and in turn what is
   meant by edge data, as well as the meta-data about the edge data.

2.1.  A Cloud-Edge Continuum

   Although Edge Computing data typically originates at edge devices,
   there is nothing that precludes edge data from being created anywhere
   in the cloud-to-edge computing continuum (Figure 1).  New edge data
   may result as a byproduct of computation being performed on the data
   stream anywhere along its path in the network.  For example,
   infrastructure edges may create new edge data when multiple data
   streams converge upon this aggregation point and require
   transformation (e.g., to fit within the available resources, to
   smooth raw measurements to eliminate high-frequency noise, or to
   obfuscate data for privacy).

   Initially our focus is on discovery of edge data that resides at the
   Device Edge and the Infrastructure Edge.

             +-------------------------------+
             |  Back-end Cloud Data Center   |
             +-------------------------------+
                            ***          Cloud
                           *   *         Interconnect
                            ***
             +-------------------------------+
             |       Core Data Center        |
             +-------------------------------+
                            ***          Backbone
                           *   *         Network
                            ***
             +-------------------------------+
             |     Regional Data Center      |
             +-------------------------------+
                            ***          Metropolitan
                           *   *         Network
                            ***
             +-------------------------------+
             |      Infrastructure Edge      |
             +-------------------------------+
                            ***          Access
                           *   *         Network
                            ***
             +-------------------------------+
             |          Device Edge          |
             +-------------------------------+

               Figure 1: Cloud-to-edge computing continuum

2.2.  Types of Edge Data

   Besides classically constrained IoT device sensor and measurement
   data accumulating throughout the edge computing infrastructure, edge
   data may also take the form of higher frequency and higher volume
   streaming data (from a continuous sensor or from a camera), meta data
   (about the data), control data (regarding an event that was
   triggered), and/or an executable that embodies a function, service,
   or any other piece of code or algorithm.  Edge data also could be
   created after multiple streams converge at an edge node and are
   processed, transformed, or aggregated together in some manner.

   Regardless of edge data type, a key problem in the Cloud-Edge
   continuum is that data is often kept in silos.  That is, data is
   often sequestered within the edge where it was created.  A goal of
   this discussion is to consider the prospect that different types of
   edge data will be made accessible across disparate edges, for example
   to enable richer multi-modal analytics.  But this will happen only if
   data can be described, searched and discovered across heterogeneous
   edges in a standard way.
   A mechanism that enables granular edge data discovery is the problem
   that needs solving, either with existing or new protocols.  The
   mechanism should be agnostic to which flavor of cloud or edge
   receives the request for data discovery.

3.  Edge Scenarios Requiring Data Discovery

   1.  A set of data resources appears (e.g., a mobile node hosting data
       joins a network) and wants to be discoverable by an existing but
       possibly virtualized and/or ephemeral data directory
       infrastructure.

   2.  A device wants to discover data resources available at or near
       its current location.  As some of these resources may be mobile,
       the available set of edge data may vary over time.

   3.  A device wants to discover where in the edge infrastructure it
       can best opportunistically upload its data, for example if a
       mobile device wants to offload its data to the infrastructure
       (for greater data availability, battery savings, etc.).

4.  Edge Data Discovery

   How can we discover data on the edge and make use of it?  There are
   proprietary implementations that collect data from various databases
   and consolidate it for evaluation.  We need a standard protocol set
   for doing this data discovery, on the device or infrastructure edge,
   in order to meet the requirements of many use cases.  We will have
   terabytes of data on the edge and need a way to identify its
   existence and find the desired data.  A user needs to search for
   specific data in a data set and evaluate it using their own tools.
   The tools are outside the scope of this document, but the discovery
   of that data is in scope.

4.1.  Types of Discovery

   There are many aspects of discovery and many different protocols that
   address each aspect.

   Discovery of new devices added to an environment.  Discovery of their
   capabilities/services in client/server environments.  Discovery of
   these new devices automatically.
   Discovering a device and then synchronizing the device inventory and
   configuration for edge services.  There are many existing protocols
   to help in this discovery: UPnP, mDNS, DNS-SD, SSDP, NFC, XMPP, W3C
   network service discovery, etc.

   Edge devices discover each other in a standard way.  We can use DHCP,
   SNMP, SMS, CoAP, LLDP, and routing protocols such as OSPF for devices
   to discover one another.

   Discovery of link state and traffic engineering data/services by
   external devices.  BGP-LS is one such solution.

   The question is whether one or more of these protocols might be a
   suitable candidate to extend to support edge data discovery.

4.2.  Naming the Data

   Information-Centric Networking (ICN) [RFC7927] is a class of
   architectures and protocols that provide "access to named data" as a
   first-order network service.  Instead of host-to-host communication
   as in IP networks, ICNs often use location-independent names to
   identify data objects, and the network provides the services of
   processing (answering) requests for named data with the objective to
   finally deliver the requested data objects to a requesting consumer.

   Such an approach has profound effects on various aspects of a
   networking system, including security (by enabling object-based
   security on a message/packet level), forwarding behavior (name-based
   forwarding, caching), but also on more operational aspects such as
   bootstrapping, discovery etc.

   The CCNx and NDN (https://named-data.net) variants of ICN are based
   on a request/response abstraction where consumers (hosts,
   applications requesting named data) send INTEREST messages into the
   network that are forwarded by network elements to a destination that
   can provide the requested named data object.  Corresponding responses
   are sent as so-called DATA messages that follow the reverse INTEREST
   path.
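
   The request/response exchange described above can be illustrated
   with a small sketch.  It is not taken from CCNx or NDN code; the
   Node class, its fields, and the single-upstream topology are
   assumptions made only for illustration.  Each node records which
   neighbor an INTEREST arrived from, so that the DATA message can
   retrace the path in reverse.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    """Hypothetical ICN node; not a real CCNx/NDN API."""
    name: str
    next_hop: Optional["Node"] = None             # single upstream neighbor (assumption)
    store: dict = field(default_factory=dict)     # named data objects held locally
    pending: dict = field(default_factory=dict)   # data name -> downstream requesters
    wanted: set = field(default_factory=set)      # names this node itself asked for
    received: dict = field(default_factory=dict)  # data delivered to this node

    def request(self, data_name, trace):
        """Act as a consumer: send an INTEREST upstream."""
        self.wanted.add(data_name)
        self.next_hop.on_interest(data_name, self, trace)

    def on_interest(self, data_name, downstream, trace):
        trace.append(("INTEREST", self.name))
        if data_name in self.store:
            # This node can provide the object: answer with DATA.
            downstream.on_data(data_name, self.store[data_name], trace)
        elif self.next_hop is not None:
            # Remember who asked, then forward the INTEREST upstream.
            self.pending.setdefault(data_name, []).append(downstream)
            self.next_hop.on_interest(data_name, self, trace)

    def on_data(self, data_name, payload, trace):
        trace.append(("DATA", self.name))
        if data_name in self.wanted:
            self.wanted.discard(data_name)
            self.received[data_name] = payload
        # DATA follows the reverse INTEREST path to every recorded requester.
        for downstream in self.pending.pop(data_name, []):
            downstream.on_data(data_name, payload, trace)


name = "/my-home/living-room/light/switch/main"
producer = Node("producer", store={name: b"on"})
forwarder = Node("forwarder", next_hop=producer)
consumer = Node("consumer", next_hop=forwarder)

trace = []
consumer.request(name, trace)
print(trace)
print(consumer.received[name])
```

   In this sketch the consumer never learns where the producer is; it
   only names the data it wants, and the per-node record of pending
   requests is what steers the DATA message back.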

   Each unique data object is named unambiguously in a hierarchical
   naming scheme and is authenticated through Public-Key cryptography
   (data objects can also optionally be encrypted in different ways).
   The naming concept and the object-based security approach lay the
   foundation for location-independent operation.  The network can
   generally operate without any notion of location, and nodes
   (consumers, forwarders) can forward requests for named data objects
   directly, i.e., without any additional address resolution.  Location
   independence also enables additional features, for example the
   possibility to replicate and cache named data objects.  On-path
   caching is a standard feature in many ICN systems -- typically for
   enhancing reliability and performance.

   In CCNx and NDN, forwarders are stateful, i.e., they keep track of
   forwarded INTEREST messages to later match the received DATA
   messages.

   Stateful forwarding (in conjunction with the general name-based and
   location-independent operation) also empowers forwarders to execute
   individual forwarding strategies and perform optimizations such as
   in-network retransmissions, multicasting requests (in case there are
   several opportunities for accessing a particular named data object)
   etc.

   Naming data and application-specific naming conventions are naturally
   important aspects in ICN.  It is common that applications define
   their own naming convention (i.e., semantics of elements in the name
   hierarchy).  Such names can often be directly derived from
   application requirements, for example a name like
   /my-home/living-room/light/switch/main could be relevant in a smart
   home setting, and corresponding devices and applications could use a
   corresponding convention to facilitate controllers finding sensors
   and actors in such a system with minimal user configuration.

   The aforementioned features make ICN amenable to data discovery.
   Because there is no name/address chasm as in IP-based systems, data
   can be discovered by sending an INTEREST to named data objects
   directly (assuming a naming convention as described above).
   Moreover, ICN can authenticate received data objects directly, for
   example using local trust anchors in the network (for example in a
   home network).

   Advanced ICN features for data discovery include the concept of
   manifests in CCNx, i.e., ICN objects that describe data collections,
   and data set synchronization protocols in NDN
   (https://named-data.net/publications/li2018sync-intro/) that can
   inform consumers about the availability of new data in a tree-based
   data structure (with automatic retrieval and authentication).  Also,
   ICN is not limited to accessing static data.  Frameworks such as
   Named Function Networking (http://www.named-function.net) and RICE
   [I-D.kutscher-icnrg-rice] can extend the general ICN discovery
   feature from data to named functions (for in-network computing) and
   their results.

5.  Use Cases of Edge Data Discovery

5.1.  Autonomous Vehicles

   Autonomous vehicles rely on the processing of huge amounts of complex
   data in real-time for fast and accurate decisions.  These vehicles
   will rely on high performance compute, storage and network resources
   to process the volumes of data they produce in a low latency way.
   Various systems will need a standard way to discover the pertinent
   data for decision making.

5.2.  Video Surveillance

   The majority of the video surveillance footage will remain at the
   edge infrastructure (not sent to the cloud data center).  This
   footage is coming from vehicles, factories, hotels, universities,
   farms, etc.  Much of the video footage will not be interesting to
   those evaluating the data.  A mechanism, perhaps a set of protocols,
   is needed to identify the interesting data at the edge.
   What constitutes interesting will be context specific, e.g., a video
   frame might be considered interesting if and only if it includes a
   car, a person, a bicyclist, or a backyard nocturnal creature.
   Interesting video data may be stored longer in storage systems at the
   very edge of the network and/or in networking equipment further away
   from the device edge that has access to data in flight as it transits
   the network.

5.3.  Elevator Networks

   Elevators are one of many industrial applications of edge computing.
   Edge equipment receives data from hundreds of elevator sensors.  The
   data coming into the edge equipment includes vibration, temperature,
   speed, level, video, etc.  We need the ability to identify where the
   data we need to evaluate is located.

5.4.  Service Function Chaining

   Service function chaining (SFC) allows the instantiation of an
   ordered set of service functions (SFs) and the subsequent "steering"
   of traffic through them.  Service functions are expected to be
   deployed at the edge of the network, as a feasible deployment of
   "Compute In the Network", with multiple types of potential use cases
   (e.g., fog robotics, Industry 4.0 automation, etc).  Service
   functions provide a specific treatment of received packets, therefore
   they need to be discoverable so they can be used in a given service
   composition via SFC.  In addition, these functions can be producers
   and/or consumers of data.  So far, how the functions are discovered
   and composed has been out of the scope of discussions in the IETF.
   While there are some mechanisms that can be used and/or extended to
   provide this functionality, more work needs to be done.  An example
   of this can be found in [I-D.bernardos-sfc-discovery].
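
   The dependency between discovery and composition described above can
   be sketched as follows.  This is a toy illustration only and does
   not reflect the mechanism in [I-D.bernardos-sfc-discovery]; the
   ServiceFunction and Registry classes, the "firewall" and
   "transcoder" types, and the documentation-range addresses are all
   hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceFunction:
    """Hypothetical advertisement for one service function instance."""
    sf_type: str   # category of function, e.g. "firewall" (illustrative)
    locator: str   # where the instance can be reached (illustrative)


class Registry:
    """Hypothetical in-memory directory of advertised service functions."""

    def __init__(self):
        self._by_type = {}

    def advertise(self, sf):
        self._by_type.setdefault(sf.sf_type, []).append(sf)

    def discover(self, sf_type):
        return list(self._by_type.get(sf_type, []))


def compose_chain(registry, wanted_types):
    """Build an ordered chain with one discovered instance per wanted type."""
    chain = []
    for sf_type in wanted_types:
        candidates = registry.discover(sf_type)
        if not candidates:
            # A function that cannot be discovered cannot be composed.
            raise LookupError(f"no discoverable service function of type {sf_type!r}")
        chain.append(candidates[0])  # naive choice: first advertised instance
    return chain


registry = Registry()
registry.advertise(ServiceFunction("firewall", "192.0.2.10"))
registry.advertise(ServiceFunction("transcoder", "192.0.2.20"))

chain = compose_chain(registry, ["firewall", "transcoder"])
print([sf.locator for sf in chain])
```

   A real selection step would weigh additional per-function
   information rather than take the first candidate; the sketch only
   shows that composition fails as soon as one required function is not
   discoverable.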

   In an SFC environment deployed at the edge, the discovery protocol
   may also need the following kind of meta-data information per
   (service) function:

   o  Service Function Type: identifying the category of function
      provided.

   o  SFC-aware: Yes/No.  Indicates if the function is SFC-aware.

   o  Route Distinguisher (RD): IP address indicating the location of
      the function.

   o  Pricing/costs details.

   o  Migration capabilities of the function: whether a given function
      can be moved to another provider (potentially including
      information about compatible providers topologically close).

   o  Mobility of the device hosting the function, with e.g. the
      following sub-options:

         Level: no, low, high; or a corresponding scale (e.g., 1 to 10).

         Current geographical area (e.g., GPS coordinates, post code).

         Target moving area (e.g., GPS coordinates, post code).

   o  Power source of the device hosting the function, with e.g. the
      following sub-options:

         Battery: Yes/No.  If Yes, the following sub-options could be
         defined:

            Capacity of the battery (e.g., mWh).

            Charge status (e.g., %).

            Lifetime (e.g., minutes).

   Discovery of resources in an NFV environment: virtualized resources
   do not need to be limited to those available in traditional data
   centers, where the infrastructure is stable, static, typically
   homogeneous and managed by a single admin entity.  Computational
   capabilities are becoming more and more ubiquitous, with terminal
   devices getting extremely powerful, as well as other types of devices
   that are close to the end users at the edge (e.g., vehicular onboard
   devices for infotainment, micro data centers deployed at the edge,
   etc.).  It is envisioned that these devices would be able to offer
   storage, computing and networking resources to nearby network
   infrastructure, devices and things (the fog paradigm).
   These resources can be used to host functions, for example to
   offload/complement other resources available at traditional data
   centers, but also to reduce the end-to-end latency or to provide
   access to specialized information (e.g., context available at the
   edge) or hardware.  Similar to the discovery of functions, while
   there are mechanisms that can be reused/extended, there is no
   complete solution yet defined.  An example of work in this area is
   [I-D.bernardos-intarea-vim-discovery].  This meta-data about the
   capabilities of nearby physical as well as virtualized resources can
   be made discoverable through edge data discovery mechanisms.

5.5.  Ubiquitous Witness

   Ubiquitous Witness (UW) is the name of a use case that has been
   presented in past COINRG and ICNRG meetings at the IETF.  It
   describes what might happen in dense IoT deployments when an anomaly
   occurs.  There are many "witnesses" to report on what happened within
   a limited region of interest and around an approximate point in time.
   The use case highlights the need for upstream data discovery and
   management.  It is agnostic to where the dense IoT deployment
   resides, whether in a factory, home, commercial building, city,
   entertainment venue, et cetera.  For example, as cameras and other
   sensors have become ubiquitous in Smart Cities, it would be helpful
   to be able to discover and examine data from all devices and sensors
   that witnessed an accident in a city intersection; this could be data
   from cameras mounted at the intersection itself, on nearby buildings,
   in cars, and on the cell phones of individuals on location.

   If an anomaly were to automatically trigger independent upstream
   flows of video data from all of the witnesses (within a proximal
   vicinity and time window), the data flows would naturally converge at
   shared collection or aggregation points in the network.
   These edge nodes might opt to vault any data deemed part of a
   safety-related anomaly, which would enable interested parties (the
   car owner, the car manufacturer, an insurance company, a city traffic
   planner) to investigate the root cause of the anomaly after the fact.
   The implication however is that enough meta data has been generated
   alongside the data itself (e.g., a data name, an identifier, or a geo
   location and timestamp), to allow the retrieval of this distributed
   data, provided those asking have proper authorization to access it.

   The UW streams are contextually related and as such it can be
   advantageous also to be able to process them simultaneously, at the
   time they are first generated.  For example, if collection nodes
   could derive that groups of data streams are contextually related,
   they could stitch streams together to create a 360-degree view of the
   anomalous event (e.g., to walk around in the data), or winnow the set
   of vaulted data to only the "best" video (e.g., highest resolution,
   unoccluded views), or perform compute-in-the-network to enable the
   streams to fit within the available resources (e.g., at the receiving
   node due to the convergence or implosion of upstream data, or over
   the next hop link).  Ubiquitous Witness data doesn't have to be video
   data, but video illustrates why one might want to jointly process
   upstream flows in real-time.

6.  IANA Considerations

   N/A

7.  Security Considerations

   Security considerations will be a critical component of edge data
   discovery, particularly as intelligence is moved to the extreme edge
   where data is to be extracted.

   An assumption is that all data will have associated policies
   (default, inherited or configured) that describe access control
   permissions.  Consequently, the discoverability of data will be a
   function of who or what has requested access.
   In other words, the discoverable view into the available data will be
   limited to those who are authorized.  Discovering edge data that is
   exclusively private is out of scope of this document, the assumption
   being that there will be some edge clouds that do not expose or
   publish the availability of their data.  Although edge data may be
   sent to the back-end cloud as needed, there is nothing that precludes
   it from being discoverable if the cloud offers it as public.

8.  Acknowledgement

   The authors thank Dave Oran for his detailed feedback on an early
   version of this draft, as well as inputs from Greg Skinner and Lixia
   Zhang.

9.  Normative References

   [I-D.bernardos-intarea-vim-discovery]
              Bernardos, C. and A. Mourad, "IPv6-based discovery and
              association of Virtualization Infrastructure Manager (VIM)
              and Network Function Virtualization Orchestrator (NFVO)",
              draft-bernardos-intarea-vim-discovery-03 (work in
              progress), February 2020.

   [I-D.bernardos-sfc-discovery]
              Bernardos, C. and A. Mourad, "Service Function discovery
              in fog environments", draft-bernardos-sfc-discovery-04
              (work in progress), March 2020.

   [I-D.irtf-icnrg-ccnxmessages]
              Mosko, M., Solis, I., and C. Wood, "CCNx Messages in TLV
              Format", draft-irtf-icnrg-ccnxmessages-09 (work in
              progress), January 2019.

   [I-D.irtf-icnrg-ccnxsemantics]
              Mosko, M., Solis, I., and C. Wood, "CCNx Semantics",
              draft-irtf-icnrg-ccnxsemantics-10 (work in progress),
              January 2019.

   [I-D.kutscher-icnrg-rice]
              Krol, M., Habak, K., Oran, D., Kutscher, D., and I.
              Psaras, "Remote Method Invocation in ICN",
              draft-kutscher-icnrg-rice-00 (work in progress),
              October 2018.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC7927]  Kutscher, D., Ed., Eum, S., Pentikousis, K., Psaras, I.,
              Corujo, D., Saucez, D., Schmidt, T., and M. Waehlisch,
              "Information-Centric Networking (ICN) Research
              Challenges", RFC 7927, DOI 10.17487/RFC7927, July 2016.

Authors' Addresses

   Mike McBride
   Futurewei

   Email: michael.mcbride@futurewei.com


   Dirk Kutscher
   Emden University

   Email: ietf@dkutscher.net


   Eve Schooler
   Intel

   Email: eve.m.schooler@intel.com
   URI:   http://www.eveschooler.com


   Carlos J. Bernardos
   Universidad Carlos III de Madrid
   Av. Universidad, 30
   Leganes, Madrid  28911
   Spain

   Phone: +34 91624 6236
   Email: cjbc@it.uc3m.es
   URI:   http://www.it.uc3m.es/cjbc/


   Diego R. Lopez
   Telefonica

   Email: diego.r.lopez@telefonica.com
   URI:   https://www.linkedin.com/in/dr2lopez/