rtgwg                                                             P. Liu
Internet-Draft                                              China Mobile
Intended status: Informational                                P. Eardley
Expires: 8 September 2022                                British Telecom
                                                              D. Trossen
                                                     Huawei Technologies
                                                            M. Boucadair
                                                                  Orange
                                                           LM. Contreras
                                                              Telefonica
                                                                   C. Li
                                                     Huawei Technologies
                                                            7 March 2022

        Dynamic-Anycast (Dyncast) Use Cases and Problem Statement
                     draft-liu-dyncast-ps-usecases-03

Abstract

   Many service providers have been exploring distributed computing
   techniques to achieve better service response times and optimized
   energy consumption.  Such techniques rely upon the distribution of
   computing services and capabilities over many locations in the
   network, such as its edge, the metro region, virtualized central
   offices, and other locations.  In such a distributed computing
   environment, providing services by utilizing computing resources
   hosted in various computing facilities (e.g., edges) is being
   considered, e.g., for computationally intensive and delay-sensitive
   services.  Ideally, services should be computationally balanced
   using service-specific metrics instead of simply dispatching the
   service requests in a static way or optimizing connectivity metrics
   alone.  For example, systematically directing end-user-originated
   service requests to the geographically closest edge or some small
   computing units may lead to an unbalanced usage of computing
   resources, which may then degrade both the user experience and the
   overall service performance.

   This document provides an overview of such scenarios and of the
   problems associated with realizing them, identifying key areas of
   engineering investigation in which adequate architectures and
   protocols are required to achieve a balanced utilization of
   computing and networking resources among the facilities providing
   the services.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.
   The list of current Internet-Drafts is at
   https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 8 September 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Revised BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Definition of Terms
   3.  Sample Use Cases
     3.1.  Cloud Virtual Reality (VR) or Augmented Reality (AR)
     3.2.  Intelligent Transportation
     3.3.  Digital Twin
   4.  Problems in Existing Solutions
     4.1.  Dynamicity of Relations
     4.2.  Efficiency
     4.3.  Complexity and Accuracy
     4.4.  Metric Exposure and Use
     4.5.  Security
     4.6.  Changes to Infrastructure
   5.  Conclusion
   6.  Security Considerations
   7.  IANA Considerations
   8.  Contributors
   9.  Informative References
   Acknowledgements
   Authors' Addresses

1.  Introduction

   Edge computing aims to provide better response times and transfer
   rates than Cloud Computing by moving the computing towards the edge
   of a network.  Edge computing can be built on embedded systems,
   gateways, and other platforms, all located close to end users'
   premises.  There is an emerging requirement that multiple edge
   sites (called "edges", for short) be deployed at different
   locations to provide a service.  There are millions of home
   gateways, thousands of base stations, and hundreds of central
   offices in a city that can serve as candidate edges hosting service
   nodes.  Depending on the location of an edge and its capacity,
   different computing resources can be contributed by each edge to
   deliver a service.  At peak hours, the computing resources attached
   to a client's closest edge may not be sufficient to handle all the
   incoming service requests, and users may then experience longer
   response times or even dropped requests.
   Increasing the computing resources hosted on each edge to the
   potential maximum capacity is neither feasible nor economically
   viable in many cases.

   Some user devices are battery-dependent.  Offloading computation-
   intensive processing to the edge can save battery power.  Moreover,
   the edge may use a data set (for the computation) that may not
   exist on the user device, because of the size of the data pool or
   for data governance reasons.

   At the same time, with new technologies such as serverless
   computing and container-based virtual functions, a service node at
   an edge can easily be created and terminated on a sub-second scale.
   This in turn dramatically changes the availability of computing
   resources for a service over time and therefore impacts the
   possibly "best" decision on where to send a service request from a
   client.

   Traditional techniques to manage the overall load balancing process
   of clients issuing requests include choose-the-closest and round-
   robin.  Those solutions are relatively static, which may cause an
   unbalanced distribution of both network load and computational load
   among the available resources.  For example, DNS-based load
   balancing usually configures a domain in the Domain Name System
   (DNS) such that client requests to that domain name are distributed
   across a group of servers; it usually provides several IP addresses
   for a domain name.

   There are some dynamic solutions that distribute the requests to
   the server that best fits a service-specific metric, such as the
   best available resources or minimal load.  They usually require
   Layer 4 to Layer 7 packet processing, such as through DNS-based or
   indirection servers.  Such an approach is inefficient for a large
   number of short connections.  At the same time, such approaches
   often cannot retrieve the desired metric, such as the network
   status, in real time.  Therefore, the choice of the service node is
   almost entirely determined by the computing status rather than by a
   comprehensive consideration of both computing and network metrics,
   or the decision is made on a rather long-term basis because of the
   (upper-layer) overhead of the decision making itself.

   Distributing service requests to a specific service having multiple
   instances attached to multiple edges, while taking computing as
   well as service-specific metrics into account in the distribution
   decision, is seen as a dynamic anycast (or "dyncast", for short)
   problem of sending service requests, without prescribing the use of
   a routing solution.

   As a problem statement, this document describes sample usage
   scenarios as well as key areas in which current solutions lead to
   problems that ultimately affect the deployment (including the
   performance) of edge services.  Those key areas target the
   identification of candidate solution components.

2.  Definition of Terms

   This document makes use of the following terms:

   Service:  A monolithic functionality that is provided by an
      endpoint according to the specification for said service.  A
      composite service can be built by orchestrating monolithic
      services.

   Service instance:  Running environment (e.g., a node) that makes
      the functionality of a service available.  One service can have
      several instances running at different network locations.
   Service identifier:  Used to uniquely identify a service and, at
      the same time, to identify the whole set of service instances
      that each represent the same service behavior, no matter where
      those service instances are running.

   Anycast:  An addressing and packet forwarding approach in which an
      "anycast" identifier is assigned to one or more service
      instances, and requests to that identifier may be routed/
      forwarded to any one of those instances, following the
      definition in [RFC4786] of anycast as "the practice of making a
      particular Service Address available in multiple, discrete,
      autonomous locations, such that datagrams sent are routed to one
      of several available locations".

   Dyncast:  Dynamic Anycast, taking the dynamic nature of computing
      resource metrics into account to steer an anycast-like decision
      in sending an incoming service request.

3.  Sample Use Cases

   This section presents a non-exhaustive list of scenarios that
   require multiple edge sites to interconnect and to coordinate at
   the network layer in order to meet the service requirements and
   ensure a better user experience.

   Before outlining the use cases, however, let us describe the basic
   model through which we assume those use cases are being realized.
   This model justifies the choice of the terminology introduced in
   Section 2.

   We assume that clients access one or more services with the
   objective of meeting a desired user experience.  Each participating
   service may be realized at one or more places in the network
   (called service instances).  Such service instances are
   instantiated and deployed as part of the overall service deployment
   process, e.g., using existing orchestration frameworks, within so-
   called edge sites, which in turn are reachable through a network
   infrastructure via an egress router.

   When a client issues a request to a required service, the request
   is steered by its ingress router to one of the available service
   instances that realize the requested service.  Each service
   instance may in turn act as a client towards another service,
   thereby seeing its own outbound traffic steered to a suitable
   service instance of the requested service, and so on, achieving
   service composition and chaining as a result.

   The aforementioned selection of one of the candidate service
   instances is done using traffic steering methods, where the
   steering decision may take into account pre-planned policies
   (assignment of certain clients to certain service instances),
   realize a shortest path to the "closest" service instance, or
   utilize more complex and possibly dynamic metric information, such
   as the load of service instances or the latencies experienced, for
   a more dynamic selection of a suitable service instance.

   It is important to note that clients may move throughout the
   execution of a service, which may, as a result, position other
   service instances "better" in terms of latency, load, or other
   metrics.  This creates a (physical) dynamicity that will need to be
   catered for.

   Apart from the input into the traffic steering decision, under the
   aforementioned constraint of possible client mobility, its
   realization may differ in terms of the layer of the protocol stack
   at which the needed operations for the decision are implemented.
   Possible layers are the application, transport, or network layers.
   Section 4 discusses issues with some of these realization choices.
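   To make the kind of steering decision described above concrete, the
   following sketch illustrates a metric-based selection of a service
   instance.  It is purely illustrative: the metric names, weights,
   and example values are assumptions made for this document and are
   not part of any protocol or system described here.

   <CODE BEGINS>
   from dataclasses import dataclass

   @dataclass
   class ServiceInstance:
       address: str            # network locator of the instance
       path_latency_ms: float  # estimated network path latency
       cpu_load: float         # computing load in [0.0, 1.0]

   def select_instance(instances, w_net=0.5, w_cpu=0.5):
       """Pick the instance with the lowest combined cost.

       A static shortest-path policy corresponds to w_cpu=0.0; a
       purely compute-driven selection corresponds to w_net=0.0.
       Dyncast aims at keeping both inputs non-zero and fresh.
       """
       def cost(i):
           return w_net * i.path_latency_ms + w_cpu * (i.cpu_load * 100)
       return min(instances, key=cost)

   instances = [
       ServiceInstance("192.0.2.10", path_latency_ms=5.0, cpu_load=0.9),
       ServiceInstance("192.0.2.20", path_latency_ms=12.0, cpu_load=0.2),
   ]
   # The closest instance (192.0.2.10) loses to the farther but far
   # less loaded one (192.0.2.20).
   print(select_instance(instances).address)
   <CODE ENDS>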
   As a summary, Figure 1 outlines the main aspects of the assumed
   system model for realizing the use cases that follow next.

       +------------+      +------------+      +------------+
      +------------+ |    +------------+ |    +------------+ |
      |    edge    | |    |    edge    | |    |    edge    | |
      |   site 1   |-+    |   site 2   |-+    |   site 3   |-+
      +-----+------+      +------+-----+      +------+-----+
            |                    |                   |
       +----+-----+        +-----+----+        +-----+----+
       | Router 1 |        | Router 2 |        | Router 3 |
       +----+-----+        +-----+----+        +-----+----+
            |                    |                   |
            |           +--------+--------+          |
            |           |                 |          |
            +-----------|  Infrastructure |----------+
                        |                 |
                        +--------+--------+
                                 |
                            +----+----+
                            | Ingress |
            +---------------| Router  |--------------+
            |               +----+----+              |
            |                    |                   |
         +--+---+             +--+---+            +--+---+
        +------+|            +------+ |          +------+ |
        |client|+            |client|-+          |client|-+
        +------+             +------+            +------+

                  Figure 1: Dyncast Use Case Model

3.1.  Cloud Virtual Reality (VR) or Augmented Reality (AR)

   Cloud VR/AR services are used in some exhibitions, scenic spots,
   and celebration ceremonies.  In the future, they might be used in
   more applications, such as the industrial internet, the medical
   industry, and the metaverse.

   Cloud VR/AR introduces the concept of cloud computing to the
   rendering of audiovisual assets in such applications.  Here, the
   edge cloud helps encode/decode and render content.  The end device
   usually only uploads posture or control information to the edge,
   and the VR/AR content is then rendered in the edge cloud.  The
   video and audio outputs generated by the edge cloud are encoded,
   compressed, and transmitted back to the end device, or further
   transmitted to a central data center, via high-bandwidth networks.

   Edge sites may use a CPU or GPU for encoding/decoding.  A GPU
   usually has better performance, but a CPU is simpler and more
   straightforward to use, as well as possibly more widespread in
   deployment.  The available remaining resources determine whether a
   service instance can be started.  The instance's CPU, GPU, and
   memory utilization has a high impact on the processing delay of
   encoding, decoding, and rendering.  At the same time, the quality
   of the network path to the edge site is key to the user-experienced
   audio/video quality and input command response times.

   A Cloud VR service, such as a mobile gaming service, brings
   challenging requirements to both network and computing, so the edge
   node that serves a service request has to be carefully selected to
   make sure it has sufficient computing resources and a good network
   path.  For example, for an entry-level Cloud VR (panoramic 8K 2D
   video) with 110-degree Field of View (FOV) transmission, the
   typical network requirements are a bandwidth of 40 Mbps, a motion-
   to-photon latency of 20 ms, and a packet loss rate of 2.4E-5; the
   typical computing requirements are 8K H.265 real-time decoding and
   2K H.264 real-time encoding.  We can further divide the 20 ms
   latency budget into (i) sensor sampling delay, (ii) image/frame
   rendering delay, (iii) display refresh delay, and (iv) network
   delay.  With upcoming high display refresh rates (e.g., 144 Hz) and
   GPU resources being used for frame rendering, we can expect an
   upper bound of roughly 5 ms for the round-trip network latency in
   these scenarios, which is close to the frame rendering computing
   delay.
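   The following back-of-the-envelope calculation illustrates how the
   roughly 5 ms network budget follows from the 20 ms motion-to-photon
   requirement.  The sensor sampling and rendering values below are
   assumptions made for illustration; only the 20 ms total and the
   144 Hz refresh rate are taken from the requirements above.

   <CODE BEGINS>
   # Motion-to-photon budget split (illustrative values).
   TOTAL_BUDGET_MS = 20.0        # motion-to-photon requirement
   refresh_ms = 1000.0 / 144     # display refresh at 144 Hz (~6.9 ms)
   sampling_ms = 2.0             # assumed sensor sampling delay
   rendering_ms = 5.5            # assumed GPU frame rendering delay

   network_rtt_ms = (TOTAL_BUDGET_MS - sampling_ms
                     - rendering_ms - refresh_ms)
   print(f"network round-trip budget: {network_rtt_ms:.1f} ms")  # ~5.6
   <CODE ENDS>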
   Furthermore, specific techniques may be employed to divide the
   overall rendering into base assets that are common across a number
   of clients participating in the service, while the client-specific
   input data is utilized to render additional assets.  When delivered
   to the client, those two kinds of assets are combined into the
   overall content consumed by the client.  The requirements for
   sending the client input data and for sending the requests for the
   base assets may differ in terms of which service instances may
   serve the request.  Base assets may be served from any nearby
   service instance, since they can be served without maintaining
   cross-request state.  The client-specific input data, in contrast,
   is processed by a stateful service instance that changes, if at
   all, only slowly over time due to the stickiness of the service
   state created by the client-specific data.  Other splits of
   rendering and input tasks can be found in [TR22.874] for further
   reading.

   When it comes to the service instances themselves, those may be
   instantiated on demand, e.g., driven by network or client demand
   metrics, while resources may also be released, e.g., after an idle
   timeout, to free up resources for other services.  Depending on the
   node technologies utilized, the lifetime of such a "function as a
   service" may range from many minutes down to millisecond scale.
   Therefore, computing resources across the participating edges
   exhibit a distributed (in terms of locations) as well as dynamic
   (in terms of resource availability) nature.  In order to achieve a
   satisfying service quality for end users, a service request will
   need to be sent to, and served by, an edge with sufficient
   computing resources and a good network path.

3.2.  Intelligent Transportation

   To improve transportation, more video capture devices need to be
   deployed as urban infrastructure, and better video quality is also
   required to facilitate content analysis.  The transmission capacity
   of the network will therefore need to be further increased, and the
   collected video data needs to be further processed, e.g., for
   pedestrian face recognition as well as vehicle trajectory
   recognition and prediction.  This, in turn, also impacts the
   requirements for the video processing capacity of computing nodes.

   In auxiliary driving scenarios, to help overcome the non-line-of-
   sight problem due to blind spots or obstacles, the edge node can
   collect comprehensive road and traffic information around the
   vehicle location and perform data processing, so that vehicles at
   high security risk can be warned accordingly, improving driving
   safety in complicated road conditions, such as at intersections.
   This scenario is also called "Electronic Horizon", as explained in
   [HORITA].  For instance, video image information captured by, e.g.,
   an in-car camera, is transmitted to the nearest edge node for
   processing.  The notion of sending the request to the "nearest"
   edge node is important for being able to collate the video
   information of "nearby" cars, using, for instance, relative
   location information.  Furthermore, data privacy may lead to the
   requirement to process the data as close to the source as possible,
   to limit its spread across too many network components.
   Nevertheless, the load at specific "closest" nodes may greatly
   vary, so the closest edge node may become overloaded, causing a
   higher response time and therefore a delay in responding to the
   auxiliary driving request, with the possibility of traffic delays
   or even traffic accidents as a result.  Hence, in such cases,
   delay-insensitive services, such as in-vehicle entertainment,
   should be dispatched to other, lightly loaded nodes instead of the
   local edge nodes, so that the delay-sensitive service is
   preferentially processed locally to ensure service availability and
   user experience.

   In video recognition scenarios, when the number of waiting people
   and vehicles increases, more computing resources are needed to
   process the video content.  Rush hour traffic congestion and the
   weekend flow of people from the edge of a city to the city center
   also require efficient network and computing capacity scheduling.
   Without additional measures, such events would overload the nearest
   edge sites, and some of the service request flows might have to be
   steered to edge sites other than the nearest one.

3.3.  Digital Twin

   A number of industry associations, such as the Industrial Digital
   Twin Association or the Digital Twin Consortium
   (https://www.digitaltwinconsortium.org/), have been founded to
   promote the concept of the Digital Twin (DT) for a number of use
   case areas, such as smart cities, transportation, and industrial
   control, among others.  The core concept of the DT is the
   "administrative shell" [Industry4.0], which serves as a digital
   representation of the information and technical functionality
   pertaining to the "assets" (such as industrial machinery, a
   transportation vehicle, an object in a smart city, or others) that
   are intended to be managed, controlled, and actuated.

   As an example of industrial control, the programmable logic
   controller (PLC) may be virtualized and the functionality
   aggregated across a number of physical assets into a single
   administrative shell for the purpose of managing those assets.
   PLCs may be virtualized in order to move the PLC capabilities from
   the physical assets to the edge cloud.  Several PLC instances may
   exist to enable load balancing and fail-over capabilities, while
   also enabling physical mobility of the asset and the connection to
   a suitable "nearby" PLC instance.  With this, traffic dynamicity
   may be similar to that observed in the connected car scenario in
   the previous subsection.  Crucial here are high availability and
   bounded latency, since a failure of the (overall) PLC functionality
   may lead to a production line stop, while violations of the latency
   bounds may lead to losing synchronization with other processes and,
   ultimately, to production faults, tool failures, or similar.

   Particular attention in Digital Twin scenarios is given to the
   problem of data storage.  Here, decentralization plays an important
   role, driven not only by the scenario (such as outlined in the
   connected car scenario for cases of localized reasoning over data
   originating from driving vehicles) but also by proposed platform
   solutions, such as those in [GAIA-X].  With decentralization,
   endpoint relations between clients and (storage) service instances
   may frequently change as a result.
   A digital twin for networks
   [I-D.zhou-nmrg-digitaltwin-network-concepts] has also been proposed
   recently.  The idea is to introduce digital twin technology into
   the network in order to build a network system in which physical
   network entities and their virtual twins are mapped onto each other
   in real time.  The digital twin network concept is intended to
   apply not only to the industrial Internet but also to operator
   networks.  When the network is large, it requires real-time
   scheduling capabilities as well as more efficient and accurate data
   collection and modeling, and it promotes the automation,
   intelligent operation and maintenance, and upgrading of the
   network.

4.  Problems in Existing Solutions

   There are a number of problems that may occur when realizing the
   use cases listed in the previous section.  This section suggests a
   classification of those problems to aid the identification of
   possible solution components for addressing them.

4.1.  Dynamicity of Relations

   The mapping from a service identifier to a specific service
   instance that may execute the service for a client usually happens
   by resolving the service identifier into a specific IP address at
   which the service instance is reachable.

   Application layer solutions can be foreseen, using an application
   server to handle binding updates.  While the viability of these
   solutions will generally depend on the additional latency
   introduced by the resolution via said application server, changing
   the relation every few (or indeed every) service requests is seen
   as difficult to support.

   Message brokers, however, could be used to dispatch incoming
   service requests from clients to a suitable service instance, where
   such dispatching could be controlled by service-specific metrics,
   such as computing load.  The introduction of such brokers, however,
   may adversely affect efficiency, specifically when it comes to the
   additional latencies caused by the necessary communication with the
   broker; we discuss this problem separately in the next subsection.

   The DNS [RFC1035] realizes an "early binding" that explicitly binds
   the service identifier to a network address before any user data is
   sent; the client thereby creates an "instance affinity" for the
   service identifier that binds the client to the resolved service
   instance address, which can also realize load balancing.

   However, we can foresee scenarios in which such instance affinity
   may change very frequently, possibly even at the level of each
   service request.  One such driver may be frequently changing
   metrics for the decision making, such as the latency and load of
   the involved service instance.  Client mobility, too, creates a
   natural/physical dynamicity, with the result that "better" service
   instances may become available and, vice versa, previous
   assignments of the client to a service instance may become less
   optimal, leading to reduced performance, such as through increased
   latency.

   The DNS is not designed for this level of dynamicity.  Updates to
   the mapping between a service identifier and service instance
   addresses cannot be pushed quickly enough into the DNS, where
   updates take several minutes to propagate, and clients would need
   to frequently re-resolve the original binding.  If the DNS were
   used to meet this level of dynamicity, the frequent resolution of
   the same service name would likely overload it.  These issues are
   also discussed in Section 5.4 of
   [I-D.sarathchandra-coin-appcentres].
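   As an illustration of this "early binding" behavior, consider the
   following sketch.  It resolves a name once and then directs all
   subsequent requests to the resolved address; the request loop is
   merely a stand-in for application traffic.

   <CODE BEGINS>
   import socket

   def resolve_once(name, port=443):
       """Early binding: pick one address out of the DNS answer set."""
       # getaddrinfo returns all A/AAAA records for the name;
       # round-robin DNS load balancing merely varies their order
       # between resolutions.
       addrinfo = socket.getaddrinfo(name, port,
                                     proto=socket.IPPROTO_TCP)
       return addrinfo[0][4][0]

   # Instance affinity: every request below goes to the address
   # resolved once up front, regardless of how the instance load or
   # the network path quality changes in the meantime.
   instance_addr = resolve_once("example.com")
   for i in range(100):
       print(f"request {i} -> {instance_addr}")
   <CODE ENDS>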
   A solution that leaves the dispatching of service requests entirely
   to the client may achieve the needed dynamicity, but with the
   drawback that the individual destinations, i.e., the network
   identifiers of each service instance, must be known to the client.
   While this may be viable for certain applications, it generally
   cannot scale to a large number of clients.  Furthermore, it may be
   undesirable for every client to know all available service instance
   identifiers, e.g., because the service provider does not want to
   expose this information to clients, but also, again, for
   scalability reasons if the number of service instances is very
   high.

   Existing solutions exhibit limitations in providing dynamic
   instance affinity, those limitations being inherently linked to the
   design used for the mapping between the service identifier and the
   address of the service instance, particularly when relying on an
   indirection point in the form of a resolution or load balancing
   server.  These limitations may cause an instance affinity to last
   for many requests or even for the entire session between the client
   and the service, which may be undesirable from the service provider
   perspective in terms of best balancing requests across many service
   instances.

4.2.  Efficiency

   The use of external resolvers, such as application layer
   repositories in general, also affects the efficiency of the overall
   service request.  Additional signaling is required between the
   client and the resolver, e.g., through an application layer
   solution, which not only leads to more messaging but also to
   increased latency due to the additional resolution step.
   Accommodating shorter instance affinity increases both this
   additional signaling and the latencies experienced, impacting the
   efficiency of the overall service transaction.

   As mentioned in the previous subsection, broker systems could be
   used to allow for dispatching service requests to different service
   instances with high dynamicity.  However, the use of such a broker
   inevitably introduces "path stretch" compared to the possible
   direct path between client and service instance, increasing the
   overall flow completion time.

   Existing solutions may introduce additional latencies and
   inefficiencies in packet transmission due to the need for
   additional resolution steps or indirection points, and may lead to
   accuracy problems in selecting the appropriate edge.

4.3.  Complexity and Accuracy

   As can be seen from the discussion on efficiency in the previous
   subsection, by the time external resolvers have collected and
   processed the information necessary to select an edge node, the
   network and computing resource status may already have changed.
   Any additional control decision on which service instance to choose
   for which incoming service request therefore requires careful
   planning to keep the potential inefficiencies, caused by additional
   latencies and path stretch, at a minimum.  Additional control plane
   elements, such as brokers, are usually neither well nor optimally
   placed in relation to the data path that the service request will
   ultimately traverse.
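   The following small calculation illustrates the path stretch
   introduced by such an off-path indirection point.  All latency
   values are assumptions chosen for illustration only.

   <CODE BEGINS>
   # Round-trip cost of broker indirection vs. the direct path.
   direct_ms          = 4.0  # client <-> best instance, direct path
   client_broker_ms   = 6.0  # client <-> broker
   broker_instance_ms = 5.0  # broker <-> selected instance

   via_broker_ms = client_broker_ms + broker_instance_ms
   stretch = via_broker_ms / direct_ms
   print(f"path stretch: {stretch:.2f}x "
         f"({via_broker_ms} ms vs {direct_ms} ms)")
   # -> path stretch: 2.75x (11.0 ms vs 4.0 ms)
   <CODE ENDS>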
   Existing solutions require careful planning for the placement of
   the necessary control plane functions in relation to the resulting
   data plane traffic in order to improve accuracy; a problem that is
   often intractable in scenarios of varying service demand.

4.4.  Metric Exposure and Use

   Some systems may use the geographical location, as deduced from an
   IP prefix, to pick the closest edge.  One issue here is that edges
   may not be far apart in edge computing deployments, while it may
   also be hard to deduce the geo-location from IP addresses.
   Furthermore, the geo-location may not be the key distinguishing
   metric to be considered, particularly since geographic co-location
   does not necessarily mean network topology co-location.  Also,
   "geographically closer" does not consider the computing load of
   possibly closer yet more loaded nodes, consequently leading to
   possibly worse performance for the end user.

   Solutions may also perform "health checks" on an infrequent basis
   (>1 s) to reflect the service node status and switch in fail-over
   situations.  Health checks, however, inadequately reflect the
   overall computing status of a service instance.  They may therefore
   not provide a suitable basis for deciding on a service instance,
   e.g., based on the number of ongoing sessions as an indicator of
   load.  Infrequent checks may also be too coarse in granularity,
   e.g., for supporting mobility-induced dynamics such as the
   connected car scenario of Section 3.2.

   Service brokers may use richer computing metrics (such as load) but
   may lack the necessary network metrics.

   Existing solutions lack the necessary information to make the right
   decision on the selection of the suitable service instance, due to
   limited semantics or due to information not being exposed across
   boundaries, e.g., between service and network provider.

4.5.  Security

   Resolution systems open up two vectors of attack: attacking the
   mapping system itself, and directly attacking the service instance
   once it has been resolved.  The latter is particularly an issue for
   a service provider who may deploy significant service
   infrastructure, since the resolved IP addresses will enable the
   client not only to directly attack the service instance but also to
   infer (over time) information about the available service instances
   in the service infrastructure, with the possibility of even wider
   and coordinated Denial-of-Service (DoS) attacks.

   Broker systems may prevent this ability by relying solely on a
   service identifier for the client-to-broker communication, thereby
   hiding the direct communication to the service instance, albeit at
   the expense of the additional latency and inefficiencies discussed
   in Sections 4.1 and 4.2.  DoS attacks here would be entirely
   limited to the broker system, since the service instance is hidden
   by the broker.

   Existing solutions may expose both the control and the data plane
   to the possibility of a distributed Denial-of-Service attack on the
   resolution system as well as on the service instance.  Localizing
   the attack to the data plane ingress point would be desirable from
   the perspective of securing service request routing, which is not
   achieved by existing solutions.
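   The following sketch illustrates the hiding property of a broker as
   described above.  The service identifier, addresses, and helper
   functions are hypothetical stand-ins, not a real broker
   implementation.

   <CODE BEGINS>
   import random

   # Pool of instance addresses known only to the broker, never to
   # clients; clients address the broker with a service identifier.
   SERVICE_INSTANCES = {
       "app.example#render": ["203.0.113.5", "203.0.113.6"],
   }

   def current_load(addr):
       return random.random()  # stand-in for a real load metric

   def forward(addr, payload):
       # Stand-in for the broker<->instance relay leg; this extra leg
       # is the path stretch discussed in Section 4.2.
       return b"reply via broker from hidden instance"

   def broker_dispatch(service_id, payload):
       """Relay a request to the least-loaded instance; replies also
       travel via the broker, so instance addresses stay hidden."""
       pool = SERVICE_INSTANCES[service_id]
       instance = min(pool, key=current_load)
       return forward(instance, payload)

   print(broker_dispatch("app.example#render", b"ping"))
   <CODE ENDS>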
4.6.  Changes to Infrastructure

   Dedicated resolution systems, such as the DNS or broker-based
   systems, require appropriate investments into their deployment.
   While the DNS is an inherent part of the Internet infrastructure,
   its inability to deal with the dynamicity of service instance
   relations, as discussed in Section 4.1, may either require
   significant changes to the DNS or the establishment of a separate
   infrastructure to support the needed dynamicity.  In a manner, the
   efforts on Multi-Access Edge Computing [MEC] propose such an
   additional infrastructure, albeit not solely for solving the
   problem of suitably dispatching service requests to service
   instances (or application servers, as they are called in [MEC]).

   Existing solutions may thus require significant changes to, or
   additions alongside, the deployed Internet infrastructure in order
   to support the needed dynamicity in dispatching service requests to
   service instances.

5.  Conclusion

   This document presents use cases in which we observe a demand for
   considering the dynamic nature of service requests in terms of the
   requirements on the resources fulfilling them in the form of
   service instances.  In addition, those very service instances may
   themselves be dynamic in availability and status, e.g., in terms of
   load or experienced latency.

   As a consequence, satisfying service-specific metrics so as to
   select the most suitable service instance among the pool of
   instances available to the service throughout the network is a
   challenge, with a number of problems observed in existing
   solutions.  The use cases as well as the categorization of the
   observed problems may start the process of determining how they are
   best addressed within the IETF protocol suite or through suitable
   extensions to that protocol suite.

6.  Security Considerations

   Section 4.5 discusses some security considerations.

7.  IANA Considerations

   This document does not make any IANA request.

8.  Contributors

   The following people have substantially contributed to this
   document:

   Peter Willis
   BT

9.  Informative References

   [RFC4786]  Abley, J. and K. Lindqvist, "Operation of Anycast
              Services", BCP 126, RFC 4786, DOI 10.17487/RFC4786,
              December 2006, <https://www.rfc-editor.org/info/rfc4786>.

   [RFC1035]  Mockapetris, P., "Domain names - implementation and
              specification", STD 13, RFC 1035, DOI 10.17487/RFC1035,
              November 1987, <https://www.rfc-editor.org/info/rfc1035>.

   [I-D.zhou-nmrg-digitaltwin-network-concepts]
              Zhou, C., Yang, H., Duan, X., Lopez, D., Pastor, A., Wu,
              Q., Boucadair, M., and C. Jacquenet, "Digital Twin
              Network: Concepts and Reference Architecture", Work in
              Progress, Internet-Draft, draft-zhou-nmrg-digitaltwin-
              network-concepts-07, 5 March 2022.

   [I-D.sarathchandra-coin-appcentres]
              Trossen, D., Sarathchandra, C., and M. Boniface, "In-
              Network Computing for App-Centric Micro-Services", Work
              in Progress, Internet-Draft, draft-sarathchandra-coin-
              appcentres-04, 26 January 2021.

   [TR22.874] 3GPP, "Study on traffic characteristics and performance
              requirements for AI/ML model transfer in 5GS (Release
              18)", 2021.

   [TR-466]   BBF, "TR-466 Metro Compute Networking: Use Cases and
              High Level Requirements", 2021.

   [HORITA]   Horita, Y., "Extended electronic horizon for automated
              driving", Proceedings of the 14th International
              Conference on ITS Telecommunications (ITST), 2015.
   [Industry4.0]
              Industry4.0, "Details of the Asset Administration Shell,
              Part 1 & Part 2", 2020.

   [GAIA-X]   Gaia-X, "GAIA-X: A Federated Data Infrastructure for
              Europe", 2021.

   [MEC]      ETSI, "Multi-Access Edge Computing (MEC)", 2021.

Acknowledgements

   The authors would like to thank Yizhou Li, Luigi Iannone, Christian
   Jacquenet, Kehan Yao, and Yuexia Fu for their valuable suggestions
   on this document.

Authors' Addresses

   Peng Liu
   China Mobile
   Email: liupengyjy@chinamobile.com

   Philip Eardley
   British Telecom
   Email: philip.eardley@bt.com

   Dirk Trossen
   Huawei Technologies
   Email: dirk.trossen@huawei.com

   Mohamed Boucadair
   Orange
   Email: mohamed.boucadair@orange.com

   Luis M. Contreras
   Telefonica
   Email: luismiguel.contrerasmurillo@telefonica.com

   Cheng Li
   Huawei Technologies
   Email: c.l@huawei.com