Network Working Group                                         M. McBride
Internet-Draft                                               J. Guichard
Intended status: Informational                                     Y. Qu
Expires: August 23, 2021                                       Futurewei
                                                             T. Hardjono
                                                                     MIT
                                                           CJ. Bernardos
                                                                    UC3M
                                                       February 19, 2021


                        Data Discovery Use Cases
               draft-mcbride-data-discovery-use-cases-00

Abstract

   There needs to be a solution for locating and capturing data in a
   standardized way.
   Data may be cached, copied and/or stored at multiple locations in the
   network en route to its final destination.  With an increasingly high
   volume of devices connecting to the Internet, support for network
   caching and replication is critical for continuous data availability.
   There are data repositories throughout a modern network, and there
   needs to be a standardized way of locating the repositories and
   discovering the desired data within them.

   There are several use cases that illustrate the need for a data
   discovery solution.  An application might need to query the network
   to discover resources (program, service, resource) that can help the
   local application perform a particular task.  Additionally, there
   could be volumes of data that need to be searched and discovered in
   order to provide a result to be acted upon by the application.  These
   are two of the use cases addressed in this document.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 23, 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Requirements Language
   3.  Terminology
   4.  Problem Statement
     4.1.  Types of Data
   5.  Use Cases
     5.1.  Application-Aware Service Function Chaining
     5.2.  Available CPU and Memory Resources
     5.3.  Data Dependency
     5.4.  Distributed Ledgers
     5.5.  Edge Computing
   6.  IANA Considerations
   7.  Security Considerations
   8.  Acknowledgements
   9.  Normative References
   Authors' Addresses

1.  Introduction

   An application might need to query the network to discover resources
   that can help the local application perform a particular task.  There
   could be volumes of data that need to be searched and discovered in
   order to provide a result to be acted upon.

   Data discovery might involve an application requesting data.  It
   might involve a device looking to store data, or to request
   processing from a data store and then gather the result.  Or it could
   be the execution of a set of instructions at an appropriate device in
   the network.  Another possible area is service chaining, where an
   application needs to run its data through a firewall but the selected
   firewall must have a particular rule set applicable to that
   application.  Perhaps the service function has to be located within a
   particular environment (security level).  Or a particular device must
   be found that is capable of executing a set of instructions provided
   in the data packet.  This document focuses on various data discovery
   use cases.

2.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

3.  Terminology

   o  SFC: Service Function Chaining

   o  APN: Application-Aware Networking

   o  DLT: Distributed Ledger Technologies

4.  Problem Statement

   As discussed in [I-D.mcbride-data-discovery-problem-statement], there
   are many proprietary and standardized ways of discovering networking
   devices and hosts.  There are also many solutions for discovering
   data within a database.  However, the only ways of discovering the
   data that may be stored throughout an environment of networking
   devices are proprietary and non-standardized.  We can discover
   information about the devices but can't locate and capture stored
   data (resource, program, service, etc.) in a standard way.  With more
   networking devices storing collected data, there needs to be a
   standard way of discovering the specific data needed amongst a
   potentially huge lake of databases.

   This data discovery problem is particularly relevant for use cases
   where it will be important to have the capability to express a data
   request within the data packets and have the network route the
   traffic accordingly.  This might be an application requesting data.
   It might be a device looking to store data, or to request processing
   from a data store and obtain the result.  It could be the execution
   of a set of instructions at an appropriate device in the network.  An
   application may need to run its data through a firewall but the
   selected firewall must have a particular rule set applicable to that
   application.  Perhaps a service function needs to be located within a
   particular environment (security level).  Or a particular device must
   be found that is capable of executing a set of instructions provided
   in the data packet.  This document focuses on data discovery use
   cases.

4.1.  Types of Data

   Discoverable data can be a resource, program, service, etc.  A
   practically unlimited amount and variety of data can be discoverable,
   including statistics, measurements, temperature, location, metadata,
   health, transactions and so on.

   Program:  applets, graphics, games, spreadsheets, database systems,
      browsers, etc.

   Service:  firewalls, load balancers, spam filters, header
      manipulators, etc.

   Resource:  CPU, memory, etc.

5.  Use Cases

   Here are some use cases to illustrate the need for data discovery:

5.1.  Application-Aware Service Function Chaining

   Application-Aware Networking (APN), as described in
   [I-D.li-apn-problem-statement-usecases], allows applications to
   specify finer granularity requirements to the network operator by
   providing application knowledge to the network layer.  This
   granularity includes the ability to convey the characteristics of an
   application's traffic flow and program the network infrastructure
   accordingly to provide service assurance.
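
   As an informal illustration only (this document defines no protocol
   or data model), the firewall example from Section 4 can be pictured
   as matching the requirements an application conveys against the
   capabilities that service functions advertise.  In the Python sketch
   below, every class, field and value name is hypothetical, and the
   in-memory inventory list is an assumption standing in for whatever
   discovery mechanism the network would actually provide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceFunction:
    """Hypothetical advertisement for one service function instance."""
    name: str
    kind: str             # e.g. "firewall" or "load-balancer"
    rule_sets: frozenset  # names of the rule sets this instance has loaded
    security_level: int   # assumed scale: higher means a stricter environment


@dataclass(frozen=True)
class DiscoveryRequest:
    """Hypothetical requirements conveyed by an application."""
    kind: str
    required_rule_set: str
    min_security_level: int = 0


def discover(request, inventory):
    """Return the service functions that satisfy every stated requirement."""
    return [
        sf for sf in inventory
        if sf.kind == request.kind
        and request.required_rule_set in sf.rule_sets
        and sf.security_level >= request.min_security_level
    ]


# Assumed inventory; in practice this would be learned from the network.
inventory = [
    ServiceFunction("fw-a", "firewall", frozenset({"web-default"}), 1),
    ServiceFunction("fw-b", "firewall",
                    frozenset({"web-default", "video-app"}), 2),
    ServiceFunction("fw-c", "firewall", frozenset({"video-app"}), 3),
    ServiceFunction("lb-a", "load-balancer", frozenset(), 3),
]

request = DiscoveryRequest("firewall", "video-app", min_security_level=3)
matches = discover(request, inventory)
print([sf.name for sf in matches])  # prints ['fw-c']
```

   Only the firewall that both holds the requested rule set and sits in
   a sufficiently strict environment is returned; packets could then be
   steered towards it.  How such advertisements are collected and
   carried is outside the scope of this sketch.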

   An application might need to query the network to discover resources
   that can help the local application perform a particular task.
   Additionally, there could be volumes of data that need to be searched
   and discovered in order to provide a result to be acted upon by the
   application.

   End-to-end service delivery often needs to go through various service
   functions, both physical and virtual, including traditional network
   service functions such as firewalls and deep packet inspection (DPI)
   as well as new application-specific functions.  APN not only allows a
   given traffic flow to be assigned to a specific service function
   chain (SFC) but also allows subsequent steering according to the
   application information carried in the APN packets.

   When an application needs to run its data through a firewall, but the
   selected firewall must have a particular rule set applicable to that
   application, the application can leverage data discovery
   functionality.  The service function may be required to be located
   within a particular environment, such as one with a certain security
   level.  Data discovery is needed to find that particular rule set
   (amongst the various firewalls) and then steer the packet
   accordingly.  Or a particular device along the SFC may need to be
   found that is capable of executing a set of instructions provided in
   the data packet.  The data capabilities of devices need to be
   discoverable in order to steer the application packets towards them
   along an SFC.

5.2.  Available CPU and Memory Resources

   An application, or service, may need to discover the available server
   memory and compute resources from the network.  A certain amount of
   CPU resources may be required to support a particular application
   workload.  The application may also need to know the maximum CPU
   utilization threshold available on a compute device.
   Gathering information on available clock speeds and the number of
   cores can help determine how quickly servers load and interact with a
   set of applications.  The network can provide the discoverability of
   the necessary data (CPU, memory) in order for applications to
   properly execute.  A network planning application can also use this
   information to help predict future resource demands in order to meet
   application performance requirements.

5.3.  Data Dependency

   There may be scenarios where it's critical to find a specific type of
   data that can help a local application, or service, successfully
   perform a particular task.  Perhaps an industrial application needs
   real-time measurement data, such as temperature, in order to execute
   a process.  This required data may be cached, copied and/or stored at
   multiple locations in the network en route to its final destination.
   With an increasing percentage of devices connecting to the Internet
   being mobile, support for in-the-network caching and replication is
   critical for continuous data availability, not to mention efficient
   network and battery usage for endpoint devices.  In order for some
   applications to properly execute, we need to find a way for the
   network to provide support for data discovery.

5.4.  Distributed Ledgers

   DLT gateways (GWs), as discussed in
   [I-D.sardon-blockchain-gateways-usecases], will be given a
   permissioned view of the assets/transactions that they are requested
   to transfer within their attached DLT domain.  GWs may also need to
   discover assets/transactions, not explicitly provided, within the DLT
   domain.  It may become necessary for the GW (or another network
   element, if permitted) to discover the data (asset, resource,
   service, etc.) in order to transfer the required asset.  Discovery of
   the data parts is also needed to validate the transfer after the
   asset movement.
   The ledger in the DLT will not hold all the relevant information
   pertaining to a previous asset transfer, so there need to be ways to
   search for and discover it.  The data parts to be discovered include:

      Relevant DLT transaction public-keys of the involved entities
      (i.e., public-keys (addresses) used on both DLTs).

      Relevant entity public-keys and X.509 certs (Originator, owner of
      gateway G1, owner of gateway G2, Beneficiary).  This is similar to
      the X.509 certs and cert-profiles used in the SWIFT banking
      network.

      Relevant asset-related JSON documents (e.g., asset profiles).

5.5.  Edge Computing

   As described in [I-D.mcbride-edge-data-discovery-overview], the
   required data may be distributed across thousands of edge computing
   devices.  Edge computing is motivated by the sheer volume of data
   that is being created by endpoint devices (sensors, cameras, lights,
   vehicles, drones, wearables, etc.) at the very network edge.  In
   dense IoT deployments (e.g., many video cameras streaming
   high-definition video), where multiple data flows collect or converge
   at edge nodes, data is likely to need transformation (transcoding,
   subsampling, compression, analysis, annotation, combination,
   aggregation, etc.) to fit over the next hop link, or even to fit in
   memory or storage.  This data, distributed across the edge, will need
   to be discovered in order to perform any number of functions, such as
   an IoT application needing elevator vibration data in order to
   execute a process.

6.  IANA Considerations

7.  Security Considerations

8.  Acknowledgements

9.  Normative References

   [I-D.li-apn-problem-statement-usecases]
              Li, Z., Peng, S., Voyer, D., Xie, C., Liu, P., Qin, Z.,
              Ebisawa, K., Previdi, S., and J. Guichard, "Problem
              Statement and Use Cases of Application-aware Networking
              (APN)", draft-li-apn-problem-statement-usecases-01 (work
              in progress), September 2020.

   [I-D.mcbride-data-discovery-problem-statement]
              McBride, M., Kutscher, D., Schooler, E., Bernardos, C.,
              and D. Lopez, "Data Discovery Problem Statement",
              draft-mcbride-data-discovery-problem-statement-00 (work in
              progress), July 2020.

   [I-D.mcbride-edge-data-discovery-overview]
              McBride, M., Kutscher, D., Schooler, E., Bernardos, C.,
              Lopez, D., and X. Foy, "Edge Data Discovery for COIN",
              draft-mcbride-edge-data-discovery-overview-05 (work in
              progress), November 2020.

   [I-D.sardon-blockchain-gateways-usecases]
              Sardon, A. and T. Hardjono, "Blockchain Gateways:
              Use-Cases", draft-sardon-blockchain-gateways-usecases-00
              (work in progress), October 2020.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

Authors' Addresses

   Mike McBride
   Futurewei

   Email: michael.mcbride@futurewei.com


   Jim Guichard
   Futurewei

   Email: james.n.guichard@futurewei.com


   Yingzhen Qu
   Futurewei

   Email: yingzhen.qu@futurewei.com


   Thomas Hardjono
   MIT

   Email: hardjono@mit.edu


   Carlos J. Bernardos
   Universidad Carlos III de Madrid
   Av. Universidad, 30
   Leganes, Madrid  28911
   Spain

   Phone: +34 91624 6236
   Email: cjbc@it.uc3m.es
   URI:   http://www.it.uc3m.es/cjbc/