rtgwg                                                             P. Liu
Internet-Draft                                              China Mobile
Intended status: Informational                                P. Eardley
Expires: 8 September 2022                                British Telecom
                                                              D. Trossen
                                                     Huawei Technologies
                                                            M. Boucadair
                                                                  Orange
                                                           LM. Contreras
                                                              Telefonica
                                                                   C. Li
                                                     Huawei Technologies
                                                            7 March 2022

        Dynamic-Anycast (Dyncast) Use Cases and Problem Statement
                     draft-liu-dyncast-ps-usecases-03

Abstract

   Many service providers have been exploring distributed computing
   techniques to achieve better service response times and optimized
   energy consumption.  Such techniques rely upon the distribution of
   computing services and capabilities over many locations in the
   network, such as its edge, the metro region, virtualized central
   offices, and other locations.  In such a distributed computing
   environment, providing services by utilizing computing resources
   hosted in various computing facilities (e.g., edges) is being
   considered, e.g., for computationally intensive and delay-sensitive
   services.  Ideally, services should be computationally balanced
   using service-specific metrics instead of simply dispatching the
   service requests in a static way or optimizing connectivity metrics
   alone.  For example, systematically directing end-user-originated
   service requests to the geographically closest edge or some small
   computing units may lead to an unbalanced usage of computing
   resources, which may then degrade both the user experience and the
   overall service performance.

   This document provides an overview of such scenarios and of the
   problems associated with realizing them, identifying key areas of
   engineering investigation in which adequate architectures and
   protocols are required to achieve a balanced utilization of
   computing and networking resources among the facilities providing
   the services.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.
   The list of current Internet-Drafts is at
   https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 8 September 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Revised BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Definition of Terms
   3.  Sample Use Cases
     3.1.  Cloud Virtual Reality (VR) or Augmented Reality (AR)
     3.2.  Intelligent Transportation
     3.3.  Digital Twin
   4.  Problems in Existing Solutions
     4.1.  Dynamicity of Relations
     4.2.  Efficiency
     4.3.  Complexity and Accuracy
     4.4.  Metric Exposure and Use
     4.5.  Security
     4.6.  Changes to Infrastructure
   5.  Conclusion
   6.  Security Considerations
   7.  IANA Considerations
   8.  Contributors
   9.  Informative References
   Acknowledgements
   Authors' Addresses

1.  Introduction

   Edge computing aims to provide better response times and transfer
   rates than Cloud Computing by moving the computing towards the edge
   of a network.  Edge computing can be built on embedded systems,
   gateways, and other platforms, all located close to end users'
   premises.  There is an emerging requirement that multiple edge
   sites (called "edges", for short) be deployed at different
   locations to provide a service.  There are millions of home
   gateways, thousands of base stations, and hundreds of central
   offices in a city that can serve as candidate edges hosting service
   nodes.  Depending on the location of an edge and its capacity,
   different computing resources can be contributed by each edge to
   deliver a service.  At peak hours, the computing resources attached
   to a client's closest edge may not be sufficient to handle all the
   incoming service requests, and users may then experience longer
   response times or even dropped requests.
   Increasing the computing resources hosted on each edge to the
   potential maximum capacity is neither feasible nor economically
   viable in many cases.

   Some user devices are battery-dependent.  Offloading computation-
   intensive processing to the edge can save battery power.  Moreover,
   the edge may use a data set (for the computation) that may not
   exist on the user device, because of the size of the data pool or
   for data governance reasons.

   At the same time, with new technologies such as serverless
   computing and container-based virtual functions, a service node at
   an edge can easily be created and terminated on a sub-second scale.
   This in turn dramatically changes the availability of computing
   resources for a service over time and therefore impacts the
   possibly "best" decision on where to send a service request from a
   client.

   Traditional techniques to manage the overall load balancing process
   of clients issuing requests include choose-the-closest and round-
   robin.  Those solutions are relatively static, which may cause an
   unbalanced distribution of both network load and computational load
   among the available resources.  For example, DNS-based load
   balancing usually configures a domain in the Domain Name System
   (DNS) such that client requests to that domain name are distributed
   across a group of servers; it usually provides several IP addresses
   for a domain name.

   There are some dynamic solutions that distribute the requests to
   the server that best fits a service-specific metric, such as the
   best available resources or minimal load.  They usually require
   Layer 4 to Layer 7 packet processing, such as through DNS-based or
   indirection servers.  Such an approach is inefficient for a large
   number of short connections.  At the same time, such approaches
   often cannot retrieve the desired metric, such as the network
   status, in real time.  Therefore, the choice of the service node is
   almost entirely determined by the computing status rather than by a
   comprehensive consideration of both computing and network metrics,
   or the decision is made on a rather long-term basis because of the
   (upper-layer) overhead of the decision making itself.

   Distributing service requests to a specific service having multiple
   instances attached to multiple edges, while taking computing as
   well as service-specific metrics into account in the distribution
   decision, is seen as a dynamic anycast (or "dyncast", for short)
   problem of sending service requests, without prescribing the use of
   a routing solution.

   As a problem statement, this document describes sample usage
   scenarios as well as key areas in which current solutions lead to
   problems that ultimately affect the deployment (including the
   performance) of edge services.  Those key areas target the
   identification of candidate solution components.

2.  Definition of Terms

   This document makes use of the following terms:

   Service:  A monolithic functionality that is provided by an
      endpoint according to the specification for said service.  A
      composite service can be built by orchestrating monolithic
      services.

   Service instance:  Running environment (e.g., a node) that makes
      the functionality of a service available.  One service can have
      several instances running at different network locations.
   Service identifier:  Used to uniquely identify a service and, at
      the same time, to identify the whole set of service instances
      that each represent the same service behavior, no matter where
      those service instances are running.

   Anycast:  An addressing and packet forwarding approach in which an
      "anycast" identifier is assigned to one or more service
      instances, and requests to that identifier may be routed/
      forwarded to any one of those instances, following the
      definition in [RFC4786] of anycast as "the practice of making a
      particular Service Address available in multiple, discrete,
      autonomous locations, such that datagrams sent are routed to one
      of several available locations".

   Dyncast:  Dynamic Anycast, taking the dynamic nature of computing
      resource metrics into account to steer an anycast-like decision
      in sending an incoming service request.

3.  Sample Use Cases

   This section presents a non-exhaustive list of scenarios that
   require multiple edge sites to interconnect and to coordinate at
   the network layer in order to meet the service requirements and
   ensure a better user experience.

   Before outlining the use cases, however, let us describe the basic
   model through which we assume those use cases are being realized.
   This model justifies the choice of the terminology introduced in
   Section 2.

   We assume that clients access one or more services with the
   objective of meeting a desired user experience.  Each participating
   service may be realized at one or more places in the network
   (called service instances).  Such service instances are
   instantiated and deployed as part of the overall service deployment
   process, e.g., using existing orchestration frameworks, within so-
   called edge sites, which in turn are reachable through a network
   infrastructure via an egress router.

   When a client issues a request to a required service, the request
   is steered by its ingress router to one of the available service
   instances that realize the requested service.  Each service
   instance may in turn act as a client towards another service,
   thereby seeing its own outbound traffic steered to a suitable
   service instance of the requested service, and so on, achieving
   service composition and chaining as a result.

   The aforementioned selection of one of the candidate service
   instances is done using traffic steering methods, where the
   steering decision may take into account pre-planned policies
   (assignment of certain clients to certain service instances),
   realize a shortest path to the "closest" service instance, or
   utilize more complex and possibly dynamic metric information, such
   as the load of service instances or the latencies experienced, for
   a more dynamic selection of a suitable service instance.

   It is important to note that clients may move throughout the
   execution of a service, which may, as a result, position other
   service instances "better" in terms of latency, load, or other
   metrics.  This creates a (physical) dynamicity that will need to be
   catered for.

   Apart from the input into the traffic steering decision, under the
   aforementioned constraint of possible client mobility, its
   realization may differ in terms of the layer of the protocol stack
   at which the needed operations for the decision are implemented.
   Possible layers are the application, transport, or network layers.
   Section 4 discusses issues with some of these realization choices.
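   To make the kind of steering decision described above concrete, the
   following sketch illustrates a metric-based selection of a service
   instance.  It is purely illustrative: the metric names, weights,
   and example values are assumptions made for this document and are
   not part of any protocol or system described here.

   <CODE BEGINS>
   from dataclasses import dataclass

   @dataclass
   class ServiceInstance:
       address: str            # network locator of the instance
       path_latency_ms: float  # estimated network path latency
       cpu_load: float         # computing load in [0.0, 1.0]

   def select_instance(instances, w_net=0.5, w_cpu=0.5):
       """Pick the instance with the lowest combined cost.

       A static shortest-path policy corresponds to w_cpu=0.0; a
       purely compute-driven selection corresponds to w_net=0.0.
       Dyncast aims at keeping both inputs non-zero and fresh.
       """
       def cost(i):
           return w_net * i.path_latency_ms + w_cpu * (i.cpu_load * 100)
       return min(instances, key=cost)

   instances = [
       ServiceInstance("192.0.2.10", path_latency_ms=5.0, cpu_load=0.9),
       ServiceInstance("192.0.2.20", path_latency_ms=12.0, cpu_load=0.2),
   ]
   # The closest instance (192.0.2.10) loses to the farther but far
   # less loaded one (192.0.2.20).
   print(select_instance(instances).address)
   <CODE ENDS>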
   As a summary, Figure 1 outlines the main aspects of the assumed
   system model for realizing the use cases that follow next.

       +------------+      +------------+      +------------+
      +------------+ |    +------------+ |    +------------+ |
      |    edge    | |    |    edge    | |    |    edge    | |
      |   site 1   |-+    |   site 2   |-+    |   site 3   |-+
      +-----+------+      +------+-----+      +------+-----+
            |                    |                   |
       +----+-----+        +-----+----+        +-----+----+
       | Router 1 |        | Router 2 |        | Router 3 |
       +----+-----+        +-----+----+        +-----+----+
            |                    |                   |
            |           +--------+--------+          |
            |           |                 |          |
            +-----------|  Infrastructure |----------+
                        |                 |
                        +--------+--------+
                                 |
                            +----+----+
                            | Ingress |
            +---------------| Router  |--------------+
            |               +----+----+              |
            |                    |                   |
         +--+---+             +--+---+            +--+---+
        +------+|            +------+ |          +------+ |
        |client|+            |client|-+          |client|-+
        +------+             +------+            +------+

                  Figure 1: Dyncast Use Case Model

3.1.  Cloud Virtual Reality (VR) or Augmented Reality (AR)

   Cloud VR/AR services are used in some exhibitions, scenic spots,
   and celebration ceremonies.  In the future, they might be used in
   more applications, such as the industrial internet, the medical
   industry, and the metaverse.

   Cloud VR/AR introduces the concept of cloud computing to the
   rendering of audiovisual assets in such applications.  Here, the
   edge cloud helps encode/decode and render content.  The end device
   usually only uploads posture or control information to the edge,
   and the VR/AR content is then rendered in the edge cloud.  The
   video and audio outputs generated by the edge cloud are encoded,
   compressed, and transmitted back to the end device, or further
   transmitted to a central data center, via high-bandwidth networks.

   Edge sites may use a CPU or GPU for encoding/decoding.  A GPU
   usually has better performance, but a CPU is simpler and more
   straightforward to use, as well as possibly more widespread in
   deployment.  The available remaining resources determine whether a
   service instance can be started.  The instance's CPU, GPU, and
   memory utilization has a high impact on the processing delay of
   encoding, decoding, and rendering.  At the same time, the quality
   of the network path to the edge site is key to the user-experienced
   audio/video quality and input command response times.

   A Cloud VR service, such as a mobile gaming service, brings
   challenging requirements to both network and computing, so the edge
   node that serves a service request has to be carefully selected to
   make sure it has sufficient computing resources and a good network
   path.  For example, for an entry-level Cloud VR (panoramic 8K 2D
   video) with 110-degree Field of View (FOV) transmission, the
   typical network requirements are a bandwidth of 40 Mbps, a motion-
   to-photon latency of 20 ms, and a packet loss rate of 2.4E-5; the
   typical computing requirements are 8K H.265 real-time decoding and
   2K H.264 real-time encoding.  We can further divide the 20 ms
   latency budget into (i) sensor sampling delay, (ii) image/frame
   rendering delay, (iii) display refresh delay, and (iv) network
   delay.  With upcoming high display refresh rates (e.g., 144 Hz) and
   GPU resources being used for frame rendering, we can expect an
   upper bound of roughly 5 ms for the round-trip network latency in
   these scenarios, which is close to the frame rendering computing
   delay.
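   The following back-of-the-envelope calculation illustrates how the
   roughly 5 ms network budget follows from the 20 ms motion-to-photon
   requirement.  The sensor sampling and rendering values below are
   assumptions made for illustration; only the 20 ms total and the
   144 Hz refresh rate are taken from the requirements above.

   <CODE BEGINS>
   # Motion-to-photon budget split (illustrative values).
   TOTAL_BUDGET_MS = 20.0        # motion-to-photon requirement
   refresh_ms = 1000.0 / 144     # display refresh at 144 Hz (~6.9 ms)
   sampling_ms = 2.0             # assumed sensor sampling delay
   rendering_ms = 5.5            # assumed GPU frame rendering delay

   network_rtt_ms = (TOTAL_BUDGET_MS - sampling_ms
                     - rendering_ms - refresh_ms)
   print(f"network round-trip budget: {network_rtt_ms:.1f} ms")  # ~5.6
   <CODE ENDS>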
   Furthermore, specific techniques may be employed to divide the
   overall rendering into base assets that are common across a number
   of clients participating in the service, while the client-specific
   input data is utilized to render additional assets.  When delivered
   to the client, those two kinds of assets are combined into the
   overall content consumed by the client.  The requirements for
   sending the client input data and for sending the requests for the
   base assets may differ in terms of which service instances may
   serve the request.  Base assets may be served from any nearby
   service instance, since they can be served without maintaining
   cross-request state.  The client-specific input data, in contrast,
   is processed by a stateful service instance that changes, if at
   all, only slowly over time due to the stickiness of the service
   state created by the client-specific data.  Other splits of
   rendering and input tasks can be found in [TR22.874] for further
   reading.

   When it comes to the service instances themselves, those may be
   instantiated on demand, e.g., driven by network or client demand
   metrics, while resources may also be released, e.g., after an idle
   timeout, to free up resources for other services.  Depending on the
   node technologies utilized, the lifetime of such a "function as a
   service" may range from many minutes down to millisecond scale.
   Therefore, computing resources across the participating edges
   exhibit a distributed (in terms of locations) as well as dynamic
   (in terms of resource availability) nature.  In order to achieve a
   satisfying service quality for end users, a service request will
   need to be sent to, and served by, an edge with sufficient
   computing resources and a good network path.

3.2.  Intelligent Transportation

   To improve transportation, more video capture devices need to be
   deployed as urban infrastructure, and better video quality is also
   required to facilitate content analysis.  The transmission capacity
   of the network will therefore need to be further increased, and the
   collected video data needs to be further processed, e.g., for
   pedestrian face recognition as well as vehicle trajectory
   recognition and prediction.  This, in turn, also impacts the
   requirements for the video processing capacity of computing nodes.

   In auxiliary driving scenarios, to help overcome the non-line-of-
   sight problem due to blind spots or obstacles, the edge node can
   collect comprehensive road and traffic information around the
   vehicle location and perform data processing, so that vehicles at
   high security risk can be warned accordingly, improving driving
   safety in complicated road conditions, such as at intersections.
   This scenario is also called "Electronic Horizon", as explained in
   [HORITA].  For instance, video image information captured by, e.g.,
   an in-car camera, is transmitted to the nearest edge node for
   processing.  The notion of sending the request to the "nearest"
   edge node is important for being able to collate the video
   information of "nearby" cars, using, for instance, relative
   location information.  Furthermore, data privacy may lead to the
   requirement to process the data as close to the source as possible,
   to limit its spread across too many network components.
   Nevertheless, the load at specific "closest" nodes may greatly
   vary, so the closest edge node may become overloaded, causing a
   higher response time and therefore a delay in responding to the
   auxiliary driving request, with the possibility of traffic delays
   or even traffic accidents as a result.  Hence, in such cases,
   delay-insensitive services, such as in-vehicle entertainment,
   should be dispatched to other, lightly loaded nodes instead of the
   local edge nodes, so that the delay-sensitive service is
   preferentially processed locally to ensure service availability and
   user experience.

   In video recognition scenarios, when the number of waiting people
   and vehicles increases, more computing resources are needed to
   process the video content.  Rush hour traffic congestion and the
   weekend flow of people from the edge of a city to the city center
   also require efficient network and computing capacity scheduling.
   Without additional measures, such events would overload the nearest
   edge sites, and some of the service request flows might have to be
   steered to edge sites other than the nearest one.

3.3.  Digital Twin

   A number of industry associations, such as the Industrial Digital
   Twin Association or the Digital Twin Consortium
   (https://www.digitaltwinconsortium.org/), have been founded to
   promote the concept of the Digital Twin (DT) for a number of use
   case areas, such as smart cities, transportation, and industrial
   control, among others.  The core concept of the DT is the
   "administrative shell" [Industry4.0], which serves as a digital
   representation of the information and technical functionality
   pertaining to the "assets" (such as industrial machinery, a
   transportation vehicle, an object in a smart city, or others) that
   are intended to be managed, controlled, and actuated.

   As an example of industrial control, the programmable logic
   controller (PLC) may be virtualized and the functionality
   aggregated across a number of physical assets into a single
   administrative shell for the purpose of managing those assets.
   PLCs may be virtualized in order to move the PLC capabilities from
   the physical assets to the edge cloud.  Several PLC instances may
   exist to enable load balancing and fail-over capabilities, while
   also enabling physical mobility of the asset and the connection to
   a suitable "nearby" PLC instance.  With this, traffic dynamicity
   may be similar to that observed in the connected car scenario in
   the previous subsection.  Crucial here are high availability and
   bounded latency, since a failure of the (overall) PLC functionality
   may lead to a production line stop, while violations of the latency
   bounds may lead to losing synchronization with other processes and,
   ultimately, to production faults, tool failures, or similar.

   Particular attention in Digital Twin scenarios is given to the
   problem of data storage.  Here, decentralization plays an important
   role, driven not only by the scenario (such as outlined in the
   connected car scenario for cases of localized reasoning over data
   originating from driving vehicles) but also by proposed platform
   solutions, such as those in [GAIA-X].  With decentralization,
   endpoint relations between clients and (storage) service instances
   may frequently change as a result.
   A digital twin for networks
   [I-D.zhou-nmrg-digitaltwin-network-concepts] has also been proposed
   recently.  The idea is to introduce digital twin technology into
   the network in order to build a network system in which physical
   network entities and their virtual twins are mapped onto each other
   in real time.  The digital twin network concept is intended to
   apply not only to the industrial Internet but also to operator
   networks.  When the network is large, it requires real-time
   scheduling capabilities as well as more efficient and accurate data
   collection and modeling, and it promotes the automation,
   intelligent operation and maintenance, and upgrading of the
   network.

4.  Problems in Existing Solutions

   There are a number of problems that may occur when realizing the
   use cases listed in the previous section.  This section suggests a
   classification of those problems to aid the identification of
   possible solution components for addressing them.

4.1.  Dynamicity of Relations

   The mapping from a service identifier to a specific service
   instance that may execute the service for a client usually happens
   by resolving the service identifier into a specific IP address at
   which the service instance is reachable.

   Application layer solutions can be foreseen, using an application
   server to handle binding updates.  While the viability of these
   solutions will generally depend on the additional latency
   introduced by the resolution via said application server, changing
   the relation every few (or indeed every) service requests is seen
   as difficult to support.

   Message brokers, however, could be used to dispatch incoming
   service requests from clients to a suitable service instance, where
   such dispatching could be controlled by service-specific metrics,
   such as computing load.  The introduction of such brokers, however,
   may adversely affect efficiency, specifically when it comes to the
   additional latencies caused by the necessary communication with the
   broker; we discuss this problem separately in the next subsection.

   The DNS [RFC1035] realizes an "early binding" that explicitly binds
   the service identifier to a network address before any user data is
   sent; the client thereby creates an "instance affinity" for the
   service identifier that binds the client to the resolved service
   instance address, which can also realize load balancing.

   However, we can foresee scenarios in which such instance affinity
   may change very frequently, possibly even at the level of each
   service request.  One such driver may be frequently changing
   metrics for the decision making, such as the latency and load of
   the involved service instance.  Client mobility, too, creates a
   natural/physical dynamicity, with the result that "better" service
   instances may become available and, vice versa, previous
   assignments of the client to a service instance may become less
   optimal, leading to reduced performance, such as through increased
   latency.

   The DNS is not designed for this level of dynamicity.  Updates to
   the mapping between a service identifier and service instance
   addresses cannot be pushed quickly enough into the DNS, where
   updates take several minutes to propagate, and clients would need
   to frequently re-resolve the original binding.  If the DNS were
   used to meet this level of dynamicity, the frequent resolution of
   the same service name would likely overload it.  These issues are
   also discussed in Section 5.4 of
   [I-D.sarathchandra-coin-appcentres].
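   As an illustration of this "early binding" behavior, consider the
   following sketch.  It resolves a name once and then directs all
   subsequent requests to the resolved address; the request loop is
   merely a stand-in for application traffic.

   <CODE BEGINS>
   import socket

   def resolve_once(name, port=443):
       """Early binding: pick one address out of the DNS answer set."""
       # getaddrinfo returns all A/AAAA records for the name;
       # round-robin DNS load balancing merely varies their order
       # between resolutions.
       addrinfo = socket.getaddrinfo(name, port,
                                     proto=socket.IPPROTO_TCP)
       return addrinfo[0][4][0]

   # Instance affinity: every request below goes to the address
   # resolved once up front, regardless of how the instance load or
   # the network path quality changes in the meantime.
   instance_addr = resolve_once("example.com")
   for i in range(100):
       print(f"request {i} -> {instance_addr}")
   <CODE ENDS>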
   A solution that leaves the dispatching of service requests entirely
   to the client may achieve the needed dynamicity, but with the
   drawback that the individual destinations, i.e., the network
   identifiers of each service instance, must be known to the client.
   While this may be viable for certain applications, it generally
   cannot scale to a large number of clients.  Furthermore, it may be
   undesirable for every client to know all available service instance
   identifiers, e.g., because the service provider does not want to
   expose this information to clients, but also, again, for
   scalability reasons if the number of service instances is very
   high.

   Existing solutions exhibit limitations in providing dynamic
   instance affinity, those limitations being inherently linked to the
   design used for the mapping between the service identifier and the
   address of the service instance, particularly when relying on an
   indirection point in the form of a resolution or load balancing
   server.  These limitations may cause an instance affinity to last
   for many requests or even for the entire session between the client
   and the service, which may be undesirable from the service provider
   perspective in terms of best balancing requests across many service
   instances.

4.2.  Efficiency

   The use of external resolvers, such as application layer
   repositories in general, also affects the efficiency of the overall
   service request.  Additional signaling is required between the
   client and the resolver, e.g., through an application layer
   solution, which not only leads to more messaging but also to
   increased latency due to the additional resolution step.
   Accommodating shorter instance affinity increases both this
   additional signaling and the latencies experienced, impacting the
   efficiency of the overall service transaction.

   As mentioned in the previous subsection, broker systems could be
   used to allow for dispatching service requests to different service
   instances with high dynamicity.  However, the use of such a broker
   inevitably introduces "path stretch" compared to the possible
   direct path between client and service instance, increasing the
   overall flow completion time.

   Existing solutions may introduce additional latencies and
   inefficiencies in packet transmission due to the need for
   additional resolution steps or indirection points, and may lead to
   accuracy problems in selecting the appropriate edge.

4.3.  Complexity and Accuracy

   As can be seen from the discussion on efficiency in the previous
   subsection, by the time external resolvers have collected and
   processed the information necessary to select an edge node, the
   network and computing resource status may already have changed.
   Any additional control decision on which service instance to choose
   for which incoming service request therefore requires careful
   planning to keep the potential inefficiencies, caused by additional
   latencies and path stretch, at a minimum.  Additional control plane
   elements, such as brokers, are usually neither well nor optimally
   placed in relation to the data path that the service request will
   ultimately traverse.
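   The following small calculation illustrates the path stretch
   introduced by such an off-path indirection point.  All latency
   values are assumptions chosen for illustration only.

   <CODE BEGINS>
   # Round-trip cost of broker indirection vs. the direct path.
   direct_ms          = 4.0  # client <-> best instance, direct path
   client_broker_ms   = 6.0  # client <-> broker
   broker_instance_ms = 5.0  # broker <-> selected instance

   via_broker_ms = client_broker_ms + broker_instance_ms
   stretch = via_broker_ms / direct_ms
   print(f"path stretch: {stretch:.2f}x "
         f"({via_broker_ms} ms vs {direct_ms} ms)")
   # -> path stretch: 2.75x (11.0 ms vs 4.0 ms)
   <CODE ENDS>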
   Existing solutions require careful planning for the placement of
   the necessary control plane functions in relation to the resulting
   data plane traffic in order to improve accuracy; a problem that is
   often intractable in scenarios of varying service demand.

4.4.  Metric Exposure and Use

   Some systems may use the geographical location, as deduced from an
   IP prefix, to pick the closest edge.  One issue here is that edges
   may not be far apart in edge computing deployments, while it may
   also be hard to deduce the geo-location from IP addresses.
   Furthermore, the geo-location may not be the key distinguishing
   metric to be considered, particularly since geographic co-location
   does not necessarily mean network topology co-location.  Also,
   "geographically closer" does not consider the computing load of
   possibly closer yet more loaded nodes, consequently leading to
   possibly worse performance for the end user.

   Solutions may also perform "health checks" on an infrequent basis
   (>1 s) to reflect the service node status and switch in fail-over
   situations.  Health checks, however, inadequately reflect the
   overall computing status of a service instance.  They may therefore
   not provide a suitable basis for deciding on a service instance,
   e.g., based on the number of ongoing sessions as an indicator of
   load.  Infrequent checks may also be too coarse in granularity,
   e.g., for supporting mobility-induced dynamics such as the
   connected car scenario of Section 3.2.

   Service brokers may use richer computing metrics (such as load) but
   may lack the necessary network metrics.

   Existing solutions lack the necessary information to make the right
   decision on the selection of the suitable service instance, due to
   limited semantics or due to information not being exposed across
   boundaries, e.g., between service and network provider.

4.5.  Security

   Resolution systems open up two vectors of attack: attacking the
   mapping system itself, and directly attacking the service instance
   once it has been resolved.  The latter is particularly an issue for
   a service provider who may deploy significant service
   infrastructure, since the resolved IP addresses will enable the
   client not only to directly attack the service instance but also to
   infer (over time) information about the available service instances
   in the service infrastructure, with the possibility of even wider
   and coordinated Denial-of-Service (DoS) attacks.

   Broker systems may prevent this ability by relying solely on a
   service identifier for the client-to-broker communication, thereby
   hiding the direct communication to the service instance, albeit at
   the expense of the additional latency and inefficiencies discussed
   in Sections 4.1 and 4.2.  DoS attacks here would be entirely
   limited to the broker system, since the service instance is hidden
   by the broker.

   Existing solutions may expose both the control and the data plane
   to the possibility of a distributed Denial-of-Service attack on the
   resolution system as well as on the service instance.  Localizing
   the attack to the data plane ingress point would be desirable from
   the perspective of securing service request routing, which is not
   achieved by existing solutions.
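   The following sketch illustrates the hiding property of a broker as
   described above.  The service identifier, addresses, and helper
   functions are hypothetical stand-ins, not a real broker
   implementation.

   <CODE BEGINS>
   import random

   # Pool of instance addresses known only to the broker, never to
   # clients; clients address the broker with a service identifier.
   SERVICE_INSTANCES = {
       "app.example#render": ["203.0.113.5", "203.0.113.6"],
   }

   def current_load(addr):
       return random.random()  # stand-in for a real load metric

   def forward(addr, payload):
       # Stand-in for the broker<->instance relay leg; this extra leg
       # is the path stretch discussed in Section 4.2.
       return b"reply via broker from hidden instance"

   def broker_dispatch(service_id, payload):
       """Relay a request to the least-loaded instance; replies also
       travel via the broker, so instance addresses stay hidden."""
       pool = SERVICE_INSTANCES[service_id]
       instance = min(pool, key=current_load)
       return forward(instance, payload)

   print(broker_dispatch("app.example#render", b"ping"))
   <CODE ENDS>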
4.6.  Changes to Infrastructure

   Dedicated resolution systems, such as the DNS or broker-based
   systems, require appropriate investments into their deployment.
   While the DNS is an inherent part of the Internet infrastructure,
   its inability to deal with the dynamicity of service instance
   relations, as discussed in Section 4.1, may either require
   significant changes to the DNS or the establishment of a separate
   infrastructure to support the needed dynamicity.  In a manner, the
   efforts on Multi-Access Edge Computing [MEC] propose such an
   additional infrastructure, albeit not solely for solving the
   problem of suitably dispatching service requests to service
   instances (or application servers, as they are called in [MEC]).

   Existing solutions may thus require significant changes to, or
   additions alongside, the deployed Internet infrastructure in order
   to support the needed dynamicity in dispatching service requests to
   service instances.

5.  Conclusion

   This document presents use cases in which we observe a demand for
   considering the dynamic nature of service requests in terms of the
   requirements on the resources fulfilling them in the form of
   service instances.  In addition, those very service instances may
   themselves be dynamic in availability and status, e.g., in terms of
   load or experienced latency.

   As a consequence, satisfying service-specific metrics so as to
   select the most suitable service instance among the pool of
   instances available to the service throughout the network is a
   challenge, with a number of problems observed in existing
   solutions.  The use cases as well as the categorization of the
   observed problems may start the process of determining how they are
   best addressed within the IETF protocol suite or through suitable
   extensions to that protocol suite.

6.  Security Considerations

   Section 4.5 discusses some security considerations.

7.  IANA Considerations

   This document does not make any IANA request.

8.  Contributors

   The following people have substantially contributed to this
   document:

   Peter Willis
   BT

9.  Informative References

   [RFC4786]  Abley, J. and K. Lindqvist, "Operation of Anycast
              Services", BCP 126, RFC 4786, DOI 10.17487/RFC4786,
              December 2006, <https://www.rfc-editor.org/info/rfc4786>.

   [RFC1035]  Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
              November 1987, <https://www.rfc-editor.org/info/rfc1035>.

   [I-D.zhou-nmrg-digitaltwin-network-concepts]
              Zhou, C., Yang, H., Duan, X., Lopez, D., Pastor, A., Wu,
              Q., Boucadair, M., and C. Jacquenet, "Digital Twin
              Network: Concepts and Reference Architecture", Work in
              Progress, Internet-Draft, draft-zhou-nmrg-digitaltwin-
              network-concepts-07, 5 March 2022.

   [I-D.sarathchandra-coin-appcentres]
              Trossen, D., Sarathchandra, C., and M. Boniface, "In-
              Network Computing for App-Centric Micro-Services", Work
              in Progress, Internet-Draft, draft-sarathchandra-coin-
              appcentres-04, 26 January 2021.

   [TR22.874] 3GPP, "Study on traffic characteristics and performance
              requirements for AI/ML model transfer in 5GS (Release
              18)", 2021.

   [TR-466]   BBF, "TR-466 Metro Compute Networking: Use Cases and
              High Level Requirements", 2021.

   [HORITA]   Horita, Y., "Extended electronic horizon for automated
              driving", Proceedings of the 14th International
              Conference on ITS Telecommunications (ITST), 2015.
   [Industry4.0]
              Industry4.0, "Details of the Asset Administration Shell,
              Part 1 & Part 2", 2020.

   [GAIA-X]   Gaia-X, "GAIA-X: A Federated Data Infrastructure for
              Europe", 2021.

   [MEC]      ETSI, "Multi-Access Edge Computing (MEC)", 2021.

Acknowledgements

   The authors would like to thank Yizhou Li, Luigi Iannone, Christian
   Jacquenet, Kehan Yao, and Yuexia Fu for their valuable suggestions
   on this document.

Authors' Addresses

   Peng Liu
   China Mobile
   Email: liupengyjy@chinamobile.com

   Philip Eardley
   British Telecom
   Email: philip.eardley@bt.com

   Dirk Trossen
   Huawei Technologies
   Email: dirk.trossen@huawei.com

   Mohamed Boucadair
   Orange
   Email: mohamed.boucadair@orange.com

   Luis M. Contreras
   Telefonica
   Email: luismiguel.contrerasmurillo@telefonica.com

   Cheng Li
   Huawei Technologies
   Email: c.l@huawei.com