idnits 2.17.1 draft-mcbride-edge-data-discovery-overview-05.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document doesn't use any RFC 2119 keywords, yet seems to have RFC 2119 boilerplate text. -- The document date (Nov 1, 2020) is 1270 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'I-D.irtf-icnrg-ccnxmessages' is defined on line 632, but no explicit reference was found in the text == Unused Reference: 'I-D.irtf-icnrg-ccnxsemantics' is defined on line 637, but no explicit reference was found in the text == Unused Reference: 'I-D.kutscher-icnrg-rice' is defined on line 642, but no explicit reference was found in the text == Outdated reference: A later version (-06) exists of draft-bernardos-intarea-vim-discovery-04 ** Downref: Normative reference to an Experimental draft: draft-bernardos-intarea-vim-discovery (ref. 
'I-D.bernardos-intarea-vim-discovery') == Outdated reference: A later version (-07) exists of draft-bernardos-sfc-discovery-05 ** Downref: Normative reference to an Experimental draft: draft-bernardos-sfc-discovery (ref. 'I-D.bernardos-sfc-discovery') ** Downref: Normative reference to an Experimental draft: draft-irtf-icnrg-ccnxmessages (ref. 'I-D.irtf-icnrg-ccnxmessages') ** Downref: Normative reference to an Experimental draft: draft-irtf-icnrg-ccnxsemantics (ref. 'I-D.irtf-icnrg-ccnxsemantics') ** Downref: Normative reference to an Experimental draft: draft-kutscher-icnrg-rice (ref. 'I-D.kutscher-icnrg-rice') ** Downref: Normative reference to an Informational RFC: RFC 7927 Summary: 6 errors (**), 0 flaws (~~), 7 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 COINRG M. McBride 3 Internet-Draft Futurewei 4 Intended status: Standards Track D. Kutscher 5 Expires: May 5, 2021 Emden University 6 E. Schooler 7 Intel 8 CJ. Bernardos 9 UC3M 10 D. Lopez 11 Telefonica 12 X. de Foy 13 InterDigital Communications 14 Nov 1, 2020 16 Edge Data Discovery for COIN 17 draft-mcbride-edge-data-discovery-overview-05 19 Abstract 21 This document describes the problem of distributed data discovery in 22 edge computing, and in particular for computing-in-the-network 23 (COIN), which may require both the marshalling of data at the outset 24 of a computation and the persistence of the resultant data after the 25 computation. Although the data might originate at the network edge, 26 as more and more distributed data is created, processed, and stored, 27 it becomes increasingly dispersed throughout the network. There 28 needs to be a standard way to find it. New and existing protocols 29 will need to be developed to support distributed data discovery at 30 the network edge and beyond. 
32 Status of This Memo 34 This Internet-Draft is submitted in full conformance with the 35 provisions of BCP 78 and BCP 79. 37 Internet-Drafts are working documents of the Internet Engineering 38 Task Force (IETF). Note that other groups may also distribute 39 working documents as Internet-Drafts. The list of current 40 Internet-Drafts is at https://datatracker.ietf.org/drafts/current/. 42 Internet-Drafts are draft documents valid for a maximum of six months 43 and may be updated, replaced, or obsoleted by other documents at any 44 time. It is inappropriate to use Internet-Drafts as reference 45 material or to cite them other than as "work in progress." 47 This Internet-Draft will expire on May 5, 2021. 49 Copyright Notice 51 Copyright (c) 2020 IETF Trust and the persons identified as the 52 document authors. All rights reserved. 54 This document is subject to BCP 78 and the IETF Trust's Legal 55 Provisions Relating to IETF Documents 56 (https://trustee.ietf.org/license-info) in effect on the date of 57 publication of this document. Please review these documents 58 carefully, as they describe your rights and restrictions with respect 59 to this document. Code Components extracted from this document must 60 include Simplified BSD License text as described in Section 4.e of 61 the Trust Legal Provisions and are provided without warranty as 62 described in the Simplified BSD License. 64 Table of Contents 66 1. Introduction 67 1.1. Edge Data 68 1.2. Background 69 1.3. Requirements Language 70 1.4. Terminology 71 2. Edge Data Discovery Problem Scope 72 2.1. A Cloud-Edge Continuum 73 2.2. Types of Edge Data 74 3. 
Edge Scenarios Requiring Data Discovery 75 4. Edge Data Discovery 76 4.1. Types of Discovery 77 4.2. Early Stage of Discovery 78 4.3. Naming the Data 79 5. Use Cases of Edge Data Discovery 80 5.1. Autonomous Vehicles 81 5.2. Video Surveillance 82 5.3. Elevator Networks 83 5.4. Service Function Chaining 84 5.5. Ubiquitous Witness 85 6. IANA Considerations 86 7. Security Considerations 87 8. Acknowledgement 88 9. Normative References 89 Authors' Addresses 91 1. Introduction 93 Edge computing is an architectural shift that migrates Cloud 94 functionality (compute, storage, networking, control, data 95 management, etc.) out of the back-end data center to be more 96 proximate to the IoT data being generated and analyzed at the edges 97 of the network. Edge computing provides local compute, storage and 98 connectivity services, often required for latency- and 99 bandwidth-sensitive applications. Thus, Edge Computing plays a key role in 100 verticals such as Energy, Manufacturing, Automotive, Video 101 Surveillance, Retail, Gaming, Healthcare, Mining, Buildings and Smart 102 Cities. 104 1.1. Edge Data 106 Edge computing is motivated at least in part by the sheer volume of 107 data that is being created by endpoint devices (sensors, cameras, 108 lights, vehicles, drones, wearables, etc.) 
at the very network edge 109 and that flows upstream, in a direction for which the network was not 110 originally designed. In fact, in dense IoT deployments (e.g., many 111 video cameras streaming high-definition video), where multiple 112 data flows collect or converge at edge nodes, data is likely to need 113 transformation (to be transcoded, subsampled, compressed, analyzed, 114 annotated, combined, aggregated, etc.) to fit over the next hop link, 115 or even to fit in memory or storage. Note also that the act of 116 performing computation on the data creates yet another new data 117 stream! Preservation of the original data streams is needed 118 sometimes but not always. 120 In addition, data may be cached, copied and/or stored at multiple 121 locations in the network en route to its final destination. With an 122 increasing percentage of devices connecting to the Internet being 123 mobile, support for in-the-network caching and replication is 124 critical for continuous data availability, not to mention efficient 125 network and battery usage for endpoint devices. 127 Additionally, as mobile devices' memory/storage fills up, in an edge 128 context they may have the ability to offload their data to other 129 proximate devices or resources, leaving a breadcrumb trail of data 130 in their wake. Therefore, although data might originate at edge 131 devices, as more and more data is continuously created, processed and 132 stored, it becomes increasingly dispersed throughout the physical 133 world (outside of or scattered across managed local data centers) and 134 increasingly isolated in separate local edge clouds or data silos. 135 Thus, there needs to be a standard way to find it. New and existing 136 protocols will need to be identified/developed/enhanced for these 137 purposes. Being able to discover distributed data at the edge or in 138 the middle of the network will be an important component of Edge 139 computing. 141 1.2. 
Background 143 Several IETF T2T RG Edge Computing discussions have been held over 144 the last couple years. A comparative study on the definition of Edge 145 computing was presented in multiple sessions in T2T RG in 2018 and an 146 Edge Computing I-D was submitted early 2019. An IETF BEC (beyond 147 edge computing) effort has been evaluating potential gaps in existing 148 edge computing architectures. Edge Data Discovery is one potential 149 gap that was identified and that needs evaluation and a solution. 150 The newly proposed COIN RG highlights the need for computations in 151 the network to be able to marshal potentially distributed input data 152 and to handle resultant output data, i.e., its placement, storage 153 and/or possible migration strategy. 155 1.3. Requirements Language 157 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 158 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 159 document are to be interpreted as described in RFC 2119 [RFC2119]. 161 1.4. Terminology 163 o Edge: The edge encompasses all entities not in the back-end cloud. 164 The device edge represents the very leaves of the network and 165 encompasses the entities found in the last mile network. Sensors, 166 gateways, compute nodes are included. Because the things that 167 populate the IoT can be both physical and/or cyber, in some 168 solutions, particularly in software-defined or digital-twin 169 contexts, the device edge can include logical (vs physical) 170 entities. The infrastructure edge includes equipment on the 171 network operator side of the last mile network including cell 172 towers, edge data centers, cable headends, POPs, etc. See 173 Figure 1 for other possible tiers of edge clouds between the 174 device edge and the back-end cloud data center. 176 o Edge Computing: Distributed computation that is performed near the 177 network edge, where nearness is determined by the system 178 requirements. 
This includes high performance compute, storage and 179 network equipment on either the device or infrastructure edge. 181 o Edge Data Discovery: The process of finding required data from 182 edge entities, i.e., from databases, file systems, and device 183 memory that might be physically distributed in the network, and 184 providing access to it logically as if it were a single unified 185 source, perhaps through its namespace, that can be evaluated or 186 searched. 188 o ICN: Information Centric Networking. An ICN-enabled network 189 routes data by name (vs address), caches content natively in the 190 network, and employs data-centric security. Data discovery may 191 require that data be associated with a name or names, a series of 192 descriptive attributes, and/or a unique identifier. 194 2. Edge Data Discovery Problem Scope 196 Our focus is on how to define and scope the edge data discovery 197 problem. This requires some discussion of the evolving definition of 198 the edge as part of a cloud-to-edge continuum and in turn what is 199 meant by edge data, as well as the meta-data about the edge data. 201 2.1. A Cloud-Edge Continuum 203 Although Edge Computing data typically originates at edge devices, 204 there is nothing that precludes edge data from being created anywhere 205 in the cloud-to-edge computing continuum (Figure 1). New edge data 206 may result as a byproduct of computation being performed on the data 207 stream anywhere along its path in the network. For example, 208 infrastructure edges may create new edge data when multiple data 209 streams converge upon this aggregation point and require 210 transformation (e.g., to fit within the available resources, to 211 smooth raw measurements to eliminate high-frequency noise, or to 212 obfuscate data for privacy). 214 Initially our focus is on discovery of edge data that resides at the 215 Device Edge and the Infrastructure Edge. 
217 +-------------------------------+ 218 | Back-end Cloud Data Center | 219 +-------------------------------+ 220 *** Cloud 221 * * Interconnect 222 *** 223 +-------------------------------+ 224 | Core Data Center | 225 +-------------------------------+ 226 *** Backbone 227 * * Network 228 *** 229 +-------------------------------+ 230 | Regional Data Center | 231 +-------------------------------+ 232 *** Metropolitan 233 * * Network 234 *** 235 +-------------------------------+ 236 | Infrastructure Edge | 237 +-------------------------------+ 238 *** Access 239 * * Network 240 *** 241 +-------------------------------+ 242 | Device Edge | 243 +-------------------------------+ 245 Figure 1: Cloud-to-edge computing continuum 247 2.2. Types of Edge Data 249 Besides classically constrained IoT device sensor and measurement 250 data accumulating throughout the edge computing infrastructure, edge 251 data may also take the form of higher frequency and higher volume 252 streaming data (from a continuous sensor or from a camera), meta data 253 (about the data), control data (regarding an event that was 254 triggered), and/or an executable that embodies a function, service, 255 or any other piece of code or algorithm. Edge data also could be 256 created after multiple streams converge at an edge node and are 257 processed, transformed, or aggregated together in some manner. 259 Regardless of edge data type, a key problem in the Cloud-Edge 260 continuum is that data is often kept in silos. Meaning, data is 261 often sequestered within the Edge where it was created. A goal of 262 this discussion is to consider the prospect that different types of 263 edge data will be made accessible across disparate edges, for example 264 to enable richer multi-modal analytics. But this will happen only if 265 data can be described, searched and discovered across heterogeneous 266 edges in a standard way. 
Having a mechanism to enable granular edge 267 data discovery is the problem that needs solving, either with existing 268 or new protocols. The mechanisms should not depend on which flavor of 269 cloud or edge receives the request for data discovery. 271 3. Edge Scenarios Requiring Data Discovery 273 1. A set of data resources appears (e.g., a mobile node hosting data 274 joins a network) and needs to be discoverable by an existing 275 but possibly virtualized and/or ephemeral data directory 276 infrastructure. 278 2. A device wants to discover data resources available at or near 279 its current location. As some of these resources may be mobile, 280 the available set of edge data may vary over time. 282 3. A device wants to discover where in the edge 283 infrastructure it can best opportunistically upload its data, for example 284 if a mobile device wants to offload its data to the 285 infrastructure (for greater data availability, battery savings, 286 etc.). 288 4. Edge Data Discovery 290 How can we discover data on the edge and make use of it? There are 291 proprietary implementations that collect data from various databases 292 and consolidate it for evaluation. We need a standard protocol set 293 for doing this data discovery, on the device or infrastructure edge, 294 in order to meet the requirements of many use cases. We will have 295 terabytes of data on the edge and need a way to identify its 296 existence and find the desired data. A user needs to 297 search for specific data in a data set and evaluate it using their 298 own tools. The tools are outside the scope of this document, but the 299 discovery of that data is in scope. 301 4.1. Types of Discovery 303 There are many aspects of discovery and many different protocols that 304 address each aspect. 306 Discovery of new devices added to an environment. Discovery of their 307 capabilities/services in client/server environments. Discovery of 308 these new devices automatically. 
Discovering a device and then 309 synchronizing the device inventory and configuration for edge 310 services. There are many existing protocols to help in this 311 discovery: UPnP, mDNS, DNS-SD, SSDP, NFC, XMPP, W3C network service 312 discovery, etc. 314 Edge devices discover each other in a standard way. We can use DHCP, 315 SNMP, SMS, CoAP, LLDP, and routing protocols such as OSPF for devices 316 to discover one another. 318 Discovery of link state and traffic engineering data/services by 319 external devices. BGP-LS is one such solution. 321 The question is whether one or more of these protocols might be a suitable 322 candidate to extend to support edge data discovery. 324 4.2. Early Stage of Discovery 326 The different types of discovery may involve mobile devices, which 327 can be the source, or target, of discovery operations. Mobile 328 devices may have an influence on discovery in COIN, and early stage 329 discovery may be necessary in some scenarios. 331 In many cases (e.g., crowds, drones or vehicular scenarios), multiple 332 networks, or attachment points, are available to a mobile device. 333 This type of device needs to efficiently select among multiple 334 interfaces, or multiple attachment points, which one(s) to use for 335 discovery. An early discovery stage should provide enough 336 information to perform such a selection and therefore reduce power 337 consumption, service latency, and impact on network usage. 339 To select among (already attached) multiple interfaces, we can 340 leverage provisioning domains, router advertisements, DHCP, etc. to 341 convey information about service or data. To select among multiple 342 attachment points, pre-attachment discovery (e.g., 802.11aq, or 343 obtaining provisioning domains through a control plane) or a 344 discovery protocol over a control plane (e.g., as described in 3GPP 345 edge computing) can be used. 347 What are suitable protocols to extend to support this early stage of 348 discovery? 
There is also a tradeoff between the amount of exposed 349 information and the limited resources available at this early stage. 350 Trust and privacy are also important early stage discovery factors. 352 4.3. Naming the Data 354 Information-Centric Networking (ICN) RFC 7927 [RFC7927] is a class of 355 architectures and protocols that provide "access to named data" as a 356 first-order network service. Instead of host-to-host communication 357 as in IP networks, ICNs often use location-independent names to 358 identify data objects, and the network provides the services of 359 processing (answering) requests for named data with the objective to 360 finally deliver the requested data objects to a requesting consumer. 362 Such an approach has profound effects on various aspects of a 363 networking system, including security (by enabling object-based 364 security on a message/packet level), forwarding behavior (name-based 365 forwarding, caching), but also on more operational aspects such as 366 bootstrapping, discovery etc. 368 The CCNx and NDN (https://named-data.net) variants of ICN are based 369 on a request/response abstraction where consumers (hosts, application 370 requesting named data) send INTEREST messages into the network that 371 are forwarded by network elements to a destination that can provide 372 the requested named data object. Corresponding responses are sent as 373 so-called DATA messages that follow the reverse INTEREST path. 375 Each unique data object is named unambiguously in a hierarchical 376 naming scheme and is authenticated through Public-Key cryptography 377 (data objects can also optionally be encrypted in different ways). 378 The naming concept and the object-based security approach lay the 379 foundation for location-independent operation. 
The network can 380 generally operate without any notion of location, and nodes 381 (consumers, forwarders) can forward requests for named data objects 382 directly, i.e., without any additional address resolution. Location 383 independence also enables additional features, for example the 384 possibility to replicate and cache named data objects. On-path 385 caching is a standard feature in many ICN systems -- typically for 386 enhancing reliability and performance. 388 In CCNx and NDN, forwarders are stateful, i.e., they keep track of 389 forwarded INTEREST messages to later match the received DATA messages. 390 Stateful forwarding (in conjunction with the general name-based and 391 location-independent operation) also empowers forwarders to execute 392 individual forwarding strategies and perform optimizations such as 393 in-network retransmissions, multicasting requests (in case there are 394 several opportunities for accessing a particular named data object), 395 etc. 397 Naming data and application-specific naming conventions are naturally 398 important aspects in ICN. It is common that applications define 399 their own naming convention (i.e., semantics of elements in the name 400 hierarchy). Such names can often be directly derived from 401 application requirements, for example a name like 402 /my-home/living-room/light/switch/main could be relevant in a smart home setting, and 403 corresponding devices and applications could use a corresponding 404 convention to facilitate controllers finding sensors and actuators in 405 such a system with minimal user configuration. 407 The aforementioned features make ICN amenable to data discovery. 408 Because there is no name/address chasm as in IP-based systems, data 409 can be discovered by sending an INTEREST to named data objects 410 directly (assuming a naming convention as described above). 
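As a non-normative illustration of discovery by name, the following Python sketch matches a hierarchical name prefix against a set of published names, component by component. It is a toy in-memory index, not a CCNx or NDN forwarder; the class NameIndex, its methods, and all names other than /my-home/living-room/light/switch/main are hypothetical and chosen only for this example.

```python
from typing import Dict, List, Tuple


def split_name(name: str) -> Tuple[str, ...]:
    """Split a hierarchical name such as /a/b/c into its components."""
    return tuple(part for part in name.split("/") if part)


class NameIndex:
    """Toy in-memory index of named data objects (hypothetical helper)."""

    def __init__(self) -> None:
        self._objects: Dict[Tuple[str, ...], bytes] = {}

    def publish(self, name: str, payload: bytes) -> None:
        """Register a data object under its hierarchical name."""
        self._objects[split_name(name)] = payload

    def discover(self, prefix: str) -> List[str]:
        """Return all published names that fall under the given prefix.

        Matching is done per name component, so /my-home/living does
        not match /my-home/living-room.
        """
        want = split_name(prefix)
        hits = [n for n in self._objects if n[: len(want)] == want]
        return sorted("/" + "/".join(n) for n in hits)


index = NameIndex()
index.publish("/my-home/living-room/light/switch/main", b"on")
index.publish("/my-home/living-room/temperature", b"21.5")
index.publish("/my-home/kitchen/light/switch/main", b"off")

# A controller asks for everything named under the living room.
living_room = index.discover("/my-home/living-room")
print(living_room)
```

Matching on whole name components rather than on raw string prefixes is a design choice of this sketch; it keeps the result aligned with the naming hierarchy that the application convention defines.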
411 Moreover, ICN can authenticate received data objects directly, for 412 example using local trust anchors in the network (for example in a 413 home network). 415 Advanced ICN features for data discovery include the concept of 416 manifests in CCNx, i.e., ICN objects that describe data collections, 417 and data set synchronization protocols in NDN 418 (https://named-data.net/publications/li2018sync-intro/) that can inform consumers 419 about the availability of new data in a tree-based data structure 420 (with automatic retrieval and authentication). Also, ICN is not 421 limited to accessing static data. Frameworks such as Named Function 422 Networking (http://www.named-function.net) and RICE can provide the 423 general ICN feature for discovery not only for data but also for named 424 functions (for in-network computing) and for their results. 426 5. Use Cases of Edge Data Discovery 428 5.1. Autonomous Vehicles 430 Autonomous vehicles rely on the processing of huge amounts of complex 431 data in real-time for fast and accurate decisions. These vehicles 432 will rely on high performance compute, storage and network resources 433 to process the volumes of data they produce in a low latency way. 434 Various systems will need a standard way to discover the pertinent 435 data for decision making. 437 5.2. Video Surveillance 439 The majority of the video surveillance footage will remain at the 440 edge infrastructure (not sent to the cloud data center). This 441 footage is coming from vehicles, factories, hotels, universities, 442 farms, etc. Much of the video footage will not be interesting to 443 those evaluating the data. A mechanism, perhaps a set of protocols, 444 is needed to identify the interesting data at the edge. What 445 constitutes interesting will be context specific, e.g., a video frame 446 might be considered interesting if and only if it includes a car, or 447 person, or bicyclist, or a backyard nocturnal creature, etc. 
448 Interesting video data may be stored longer in storage systems at the 449 very edge of the network and/or in networking equipment further away 450 from the device edge that has access to data in flight as it transits 451 the network. 453 5.3. Elevator Networks 455 Elevators are one of many industrial applications of edge computing. 456 Edge equipment receives data from hundreds of elevator sensors. The 457 data coming into the edge equipment includes vibration, temperature, speed, 458 level, video, etc. We need the ability to identify where the data we 459 need to evaluate is located. 461 5.4. Service Function Chaining 463 Service function chaining (SFC) allows the instantiation of an 464 ordered set of service functions (SFs) and the subsequent "steering" 465 of traffic through them. Service functions are expected to be 466 deployed at the edge of the network, as a feasible deployment of 467 "Compute In the Network", with multiple types of potential use cases 468 (e.g., fog robotics, Industry 4.0 automation, etc.). Service 469 functions provide a specific treatment of received packets; therefore, 470 they need to be discoverable so they can be used in a given service 471 composition via SFC. In addition, these functions can be producers 472 and/or consumers of data. So far, how the functions are discovered 473 and composed has been out of the scope of discussions in the IETF. 474 While there are some mechanisms that can be used and/or extended to 475 provide this functionality, more work needs to be done. An example 476 of this can be found in [I-D.bernardos-sfc-discovery]. 478 In an SFC environment deployed at the edge, the discovery protocol 479 may also need the following kind of meta-data information per 480 (service) function: 482 o Service Function Type: identifying the category of function 483 provided. 485 o SFC-aware: Yes/No. Indicates if the function is SFC-aware. 487 o Route Distinguisher (RD): IP address indicating the location of 488 the function. 
490 o Pricing/costs details. 492 o Migration capabilities of the function: whether a given function 493 can be moved to another provider (potentially including 494 information about compatible providers topologically close). 496 o Mobility of the device hosting the function, with, e.g., the 497 following sub-options: 499 Level: no, low, high; or a corresponding scale (e.g., 1 to 10). 501 Current geographical area (e.g., GPS coordinates, post code). 503 Target moving area (e.g., GPS coordinates, post code). 505 o Power source of the device hosting the function, with, e.g., the 506 following sub-options: 508 Battery: Yes/No. If Yes, the following sub-options could be 509 defined: 511 Capacity of the battery (e.g., mWh). 513 Charge status (e.g., %). 515 Lifetime (e.g., minutes). 517 Discovery of resources in an NFV environment: virtualized resources 518 do not need to be limited to those available in traditional data 519 centers, where the infrastructure is stable, static, typically 520 homogeneous and managed by a single admin entity. Computational 521 capabilities are becoming more and more ubiquitous, with terminal 522 devices getting extremely powerful, as well as other types of devices 523 that are close to the end users at the edge (e.g., vehicular onboard 524 devices for infotainment, micro data centers deployed at the edge, 525 etc.). It is envisioned that these devices would be able to offer 526 storage, computing and networking resources to nearby network 527 infrastructure, devices and things (the fog paradigm). These 528 resources can be used to host functions, for example to 529 offload/complement other resources available at traditional data centers, but 530 also to reduce the end-to-end latency or to provide access to 531 specialized information (e.g., context available at the edge) or 532 hardware. Similar to the discovery of functions, while there are 533 mechanisms that can be reused/extended, there is no complete solution 534 yet defined. 
An example of work in this area is 535 [I-D.bernardos-intarea-vim-discovery]. The availability of this 536 meta-data about the capabilities of nearby physical as well as 537 virtualized resources can be made discoverable through edge data 538 discovery mechanisms. 540 5.5. Ubiquitous Witness 542 Ubiquitous Witness (UW) is the name of a use case that has been 543 presented in past COINRG and ICNRG meetings at the IETF. It 544 describes what might occur in dense IoT deployments when an anomaly 545 occurs. There are many "witnesses" to report on what happened within 546 a limited region of interest and around an approximate point in time. 547 The use case highlights the need for upstream data discovery and 548 management. It is agnostic to where the dense IoT deployment 549 resides, whether in a factory, home, commercial building, city, 550 entertainment venue, et cetera. For example, as cameras and other 551 sensors have become ubiquitous in Smart Cities, it would be helpful 552 to be able to discover and examine data from all devices and sensors 553 that witnessed an accident in a city intersection; this could be data 554 from cameras mounted at the intersection itself, on nearby buildings, 555 in cars, and cell phones of individuals on location. 557 If an anomaly were to automatically trigger independent upstream 558 flows of video data from all of the witnesses (within a proximal 559 vicinity and time window), the data flows would naturally converge at 560 shared collection or aggregation points in the network. These edge 561 nodes might opt to vault any data deemed part of a safety-related 562 anomaly, which would enable interested parties (the car owner, the 563 car manufacturer, an insurance company, a city traffic planner) to 564 investigate the root cause of the anomaly after the fact. 
The 565 implication, however, is that enough meta-data has been generated 566 alongside the data itself (e.g., a data name, an identifier, or a geo 567 location and timestamp) to allow the retrieval of this distributed 568 data, provided those asking have proper authorization to access it. 570 The UW streams are contextually-related and as such it can be 571 advantageous also to be able to process them simultaneously, at the 572 time they are first generated. For example, if collection nodes could 573 derive that groups of data streams are contextually-related, they 574 could stitch streams together to create a 360-degree view of the 575 anomalous event (e.g., to walk around in the data), winnow the 576 set of vaulted data to only the "best" video (e.g., highest 577 resolution, unoccluded views), or perform compute-in-the-network to 578 enable the streams to fit within the available resources (e.g., at the 579 receiving node due to the convergence or implosion of upstream data, 580 or over the next hop link). Ubiquitous Witness data doesn't have to 581 be video data, but video illustrates why one might want to jointly 582 process upstream flows in real-time. 584 6. IANA Considerations 586 N/A 588 7. Security Considerations 590 Security considerations will be a critical component of edge data 591 discovery, particularly as intelligence is moved to the extreme edge 592 where data is to be extracted. 594 An assumption is that all data will have associated policies 595 (default, inherited or configured) that describe access control 596 permissions. Consequently, the discoverability of data will be a 597 function of who or what has requested access. In other words, the 598 discoverable view into the available data will be limited to those 599 who are authorized. Discovering edge data that is exclusively 600 private is out of scope of this document, the assumption being that 601 there will be some edge clouds that do not expose or publish the 602 availability of their data. 
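As a non-normative illustration of a discoverable view that depends on the requester, the following Python sketch filters a set of data records by region, time window, and an access policy attached to each record. It is a minimal sketch under stated assumptions: the DataRecord fields, the discoverable_view function, the principal names, and the example data names are all hypothetical and are not defined by this document or by any existing protocol.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class DataRecord:
    """Meta-data kept alongside a piece of edge data (hypothetical fields)."""
    name: str                               # data name or identifier
    region: str                             # coarse location label
    timestamp: float                        # seconds on a shared clock
    readers: FrozenSet[str] = frozenset()   # principals allowed to discover it
    public: bool = False                    # discoverable by anyone


def discoverable_view(records: Iterable[DataRecord], requester: str,
                      region: str, t_start: float, t_end: float) -> List[str]:
    """Return the names the requester may discover for a region and time window."""
    return [
        r.name
        for r in records
        if r.region == region
        and t_start <= r.timestamp <= t_end
        and (r.public or requester in r.readers)
    ]


records = [
    DataRecord("/city/5th-and-main/cam1/clip42", "5th-and-main", 1000.0,
               frozenset({"insurer"})),
    DataRecord("/city/5th-and-main/cam2/clip7", "5th-and-main", 1010.0,
               public=True),
    DataRecord("/car/dashcam/clip3", "5th-and-main", 1005.0,
               frozenset({"car-owner"})),
    DataRecord("/city/elm-street/cam9/clip1", "elm-street", 1002.0,
               public=True),
]

# The same query yields different views for different requesters.
insurer_view = discoverable_view(records, "insurer", "5th-and-main", 990.0, 1020.0)
anonymous_view = discoverable_view(records, "anonymous", "5th-and-main", 990.0, 1020.0)
print(insurer_view)
print(anonymous_view)
```

In this sketch, records that the requester is not authorized to see are simply absent from the result, so the response does not reveal that they exist.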
Although edge data may be sent to the 603 back-end cloud as needed, there is nothing that precludes it from 604 being discoverable if the cloud offers it as public. 606 A trust relationship may be needed between the source and target of a 607 discovery operation to avoid denial of service attacks from a 608 malicious source or target of the operation. In addition, discovery 609 information that is exposed by a node or network may need to be 610 protected for privacy purposes, e.g., so as not to leak information about the 611 presence of a certain type of data in a network. 613 8. Acknowledgement 615 The authors thank Dave Oran, Greg Skinner and Lixia Zhang for 616 contributing to this document. 618 9. Normative References 620 [I-D.bernardos-intarea-vim-discovery] 621 Bernardos, C. and A. Mourad, "IPv6-based discovery and 622 association of Virtualization Infrastructure Manager (VIM) 623 and Network Function Virtualization Orchestrator (NFVO)", 624 draft-bernardos-intarea-vim-discovery-04 (work in 625 progress), September 2020. 627 [I-D.bernardos-sfc-discovery] 628 Bernardos, C. and A. Mourad, "Service Function discovery 629 in fog environments", draft-bernardos-sfc-discovery-05 630 (work in progress), September 2020. 632 [I-D.irtf-icnrg-ccnxmessages] 633 Mosko, M., Solis, I., and C. Wood, "CCNx Messages in TLV 634 Format", draft-irtf-icnrg-ccnxmessages-09 (work in 635 progress), January 2019. 637 [I-D.irtf-icnrg-ccnxsemantics] 638 Mosko, M., Solis, I., and C. Wood, "CCNx Semantics", 639 draft-irtf-icnrg-ccnxsemantics-10 (work in progress), 640 January 2019. 642 [I-D.kutscher-icnrg-rice] 643 Krol, M., Habak, K., Oran, D., Kutscher, D., and I. 644 Psaras, "Remote Method Invocation in ICN", 645 draft-kutscher-icnrg-rice-00 (work in progress), October 2018. 647 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 648 Requirement Levels", BCP 14, RFC 2119, 649 DOI 10.17487/RFC2119, March 1997, 650 . 
652 [RFC7927] Kutscher, D., Ed., Eum, S., Pentikousis, K., Psaras, I., 653 Corujo, D., Saucez, D., Schmidt, T., and M. Waehlisch, 654 "Information-Centric Networking (ICN) Research 655 Challenges", RFC 7927, DOI 10.17487/RFC7927, July 2016, 656 . 658 Authors' Addresses 660 Mike McBride 661 Futurewei 663 Email: michael.mcbride@futurewei.com 665 Dirk Kutscher 666 Emden University 668 Email: ietf@dkutscher.net 670 Eve Schooler 671 Intel 673 Email: eve.m.schooler@intel.com 674 URI: http://www.eveschooler.com 676 Carlos J. Bernardos 677 Universidad Carlos III de Madrid 678 Av. Universidad, 30 679 Leganes, Madrid 28911 680 Spain 682 Phone: +34 91624 6236 683 Email: cjbc@it.uc3m.es 684 URI: http://www.it.uc3m.es/cjbc/ 686 Diego R. Lopez 687 Telefonica 689 Email: diego.r.lopez@telefonica.com 690 URI: https://www.linkedin.com/in/dr2lopez/ 691 Xavier de Foy 692 InterDigital Communications, LLC 693 1000 Sherbrooke West 694 Montreal 695 Canada 697 Email: Xavier.Defoy@InterDigital.com