rtgwg                                                            L. Geng
Internet-Draft                                                    P. Liu
Intended status: Informational                              China Mobile
Expires: May 3, 2021                                           P. Willis
                                                                      BT
                                                        October 30, 2020

Dynamic-Anycast in Compute First Networking (CFN-Dyncast) Use Cases and
                           Problem Statement
               draft-geng-rtgwg-cfn-dyncast-ps-usecase-00

Abstract

   Service providers are exploring edge computing to achieve better
   response times, control over data, and energy savings by moving
   computing services towards the edge of the network, in scenarios
   such as 5G MEC (Multi-access Edge Computing), the virtualized
   central office, and others.
   Providing services by sharing computing resources from multiple
   edges is emerging and is becoming more and more useful for
   computationally intensive tasks.  The service nodes attached to
   multiple edges normally have two key features, service equivalency
   and service dynamism.  Ideally, they should serve service demands
   in a computationally balanced way.  However, many approaches
   dispatch service demands statically, e.g., to the geographically
   closest edge, which may cause unbalanced usage of computing
   resources at the edges and further degrade user experience and
   system utilization.  This draft provides an overview of the
   scenarios and the associated problems.

   Networking that takes computing resource metrics into account as
   one of its top parameters is called Compute First Networking (CFN)
   in this document.  The document identifies several key areas that
   require more investigation in architecture and protocol to achieve
   balanced computing and networking resource utilization among edges
   in CFN.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   This Internet-Draft will expire on May 3, 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.
   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Definition of Terms
   3.  Main Use-Cases
     3.1.  Cloud Based Recognition in Augmented Reality (AR)
     3.2.  Connected Car
     3.3.  Cloud Virtual Reality (VR)
   4.  Requirements
   5.  Problem Statement
     5.1.  Anycast based service addressing methodology
     5.2.  Flow affinity
     5.3.  Computing Aware Routing
   6.  Summary
   7.  Security Considerations
   8.  IANA Considerations
   9.  Informative References
   Acknowledgements
   Authors' Addresses

1.  Introduction

   Edge computing aims to provide better response times and transfer
   rates than cloud computing by moving the computation towards the
   edge of the network.
   Edge computing can be built on industrial PCs, embedded systems,
   gateways, and other devices deployed close to end users.  There is
   an emerging requirement to deploy multiple edge sites (also called
   edges in this document) at different locations to provide a
   service.  A city may have millions of home gateways, thousands of
   base stations, and hundreds of central offices that can serve as
   candidate edges for hosting service nodes.  Depending on its
   location and capacity, each edge has different computing resources
   available for a service.  At peak hours, the computing resources
   attached to the edge site closest to a client may not be sufficient
   to handle all the incoming service demands, so users may experience
   longer response times or even dropped demands.  Increasing the
   computing resources hosted on each edge site to the potential
   maximum capacity is neither feasible nor economical.

   Some user devices are purely battery-driven.  Offloading
   computation-intensive processing to the edge saves battery power.
   Moreover, the edge may use data sets for the computation that do
   not exist on the user device, because of the size of the data pool
   or for data governance reasons.

   At the same time, with new technologies such as serverless
   computing and container-based virtual functions, a service node on
   an edge can easily be created and terminated on a sub-second scale.
   This makes the computing resources available for a service at an
   edge change dramatically over time.

   DNS-based load balancing usually configures a domain in the Domain
   Name System (DNS) such that client requests to the domain are
   distributed across a group of servers; it usually provides several
   IP addresses for a domain name.  The traditional techniques to
   manage the overall load balancing of client requests include
   choose-the-closest and round-robin.
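   As a rough illustration only (the addresses and names below are
   hypothetical, not taken from this draft), round-robin selection
   over a static list of resolved addresses might look like:

```python
from itertools import cycle

# Hypothetical addresses returned by DNS for one domain
# (RFC 5737 documentation range); purely illustrative.
SERVERS = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]

_rotation = cycle(SERVERS)

def pick_server():
    """Round-robin selection: rotate through the address list,
    ignoring both computing load and network path cost."""
    return next(_rotation)

# Six consecutive demands are spread evenly over the three servers,
# regardless of how loaded each server actually is.
picks = [pick_server() for _ in range(6)]
```

   The rotation is oblivious to server state, which is exactly the
   limitation discussed here.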
   These techniques are relatively static, which may cause unbalanced
   workload distribution in terms of both network load and
   computational load.

   There are dynamic approaches that try to distribute each request to
   the server with the best available resources and minimal load.
   They usually require L4-L7 packet processing, which is not
   efficient for a huge number of short connections.  At the same
   time, such approaches can hardly obtain network status in real
   time.  Therefore, the choice of the service node is almost entirely
   determined by computing status rather than by a comprehensive
   consideration of both computing and network.

   Networking that takes computing resource metrics into account as
   one of its top parameters is called Compute First Networking (CFN)
   in this document.  In CFN, edge sites can interact with each other
   to provide network-based edge computing service dispatching and
   achieve better load balancing.  Both computing load and network
   status are resources visible to the network.

   A single service with multiple instances attached to multiple edge
   computing sites is conceptually similar to anycast in networking
   terms.  Because of the dynamic and anycast aspects of the problem,
   jointly with the CFN deployment, we generally refer to it in this
   document as CFN-Dyncast, short for Compute First Networking Dynamic
   Anycast.  This draft describes the usage scenarios, problem space,
   and key areas of CFN-Dyncast.

2.  Definition of Terms

   CFN:  Compute First Networking.  A networking architecture that
   takes computing resource metrics into account as one of its top
   parameters to achieve flexible load management and performance
   optimization in terms of both network and computing resources.

   CFN-Dyncast:  Compute First Networking Dynamic Anycast.  The
   dynamic and anycast aspects of the architecture in a CFN
   deployment.

3.
Main Use-Cases

   This section presents several typical scenarios that require
   multiple edge sites to interconnect and coordinate at the network
   layer to meet service requirements and ensure user experience.

3.1.  Cloud Based Recognition in Augmented Reality (AR)

   In an AR environment, the end device captures images via cameras
   and sends out computing-intensive service demands.  Normally,
   service nodes at the edge are responsible for tasks with medium
   computational complexity or low-latency requirements, like object
   detection, feature extraction, and template matching, while service
   nodes in the cloud are responsible for the most computationally
   intensive tasks, like object recognition, or latency-insensitive
   tasks, like AI-based model training.  The end device hence only
   handles tasks like target tracking and image display, thereby
   reducing the computing load on the client.

   The computing resource for a specific service at the edge can be
   instantiated on demand.  Once the task is completed, this resource
   can be released.  The lifetime of such a "function as a service"
   can be on a millisecond scale.  Computing resources on the edges
   are therefore distributed and dynamic in nature.  A service demand
   has to be sent to and served by an edge with sufficient computing
   resources and a good network path.

3.2.  Connected Car

   In auxiliary driving scenarios, to help overcome the
   non-line-of-sight problem caused by blind spots or obstacles, the
   edge node can collect comprehensive road and traffic information
   around the vehicle's location and perform data processing, and
   vehicles at high safety risk can then be alerted.  This improves
   driving safety in complicated road conditions, such as at
   intersections.  The video information captured by a surveillance
   camera is transmitted to the nearest edge node for processing.
   Warnings can then be sent to cars that are driving too fast or
   facing other unseen dangers.

   When the local edge node is overloaded, service demands sent to it
   will be queued and the responses for auxiliary driving will be
   delayed, which may lead to traffic accidents.  Hence, in such
   cases, delay-insensitive services such as in-vehicle entertainment
   should be dispatched to other, lightly loaded nodes instead of the
   local edge node, so that the delay-sensitive service is
   preferentially processed locally to ensure service availability and
   user experience.

3.3.  Cloud Virtual Reality (VR)

   Cloud VR introduces the concepts and technology of cloud computing
   and cloud rendering into VR applications.  The edge cloud handles
   encoding/decoding and rendering in this scenario.  The end device
   usually only uploads posture or control information to the edge,
   and the VR content is then rendered in the edge cloud.  The video
   and audio outputs generated by the edge cloud are encoded,
   compressed, and transmitted back to the end device, or further
   transmitted to a central data center via a high-bandwidth network.

   Cloud VR services have high requirements on both network and
   computing.  For example, for an entry-level Cloud VR (panoramic 8K
   2D video) with 110-degree Field of View (FOV) transmission, the
   typical network requirements are 40 Mbps bandwidth, 20 ms RTT, and
   a packet loss rate of 2.4E-5; the typical computing requirements
   are 8K H.265 real-time decoding and 2K H.264 real-time encoding.

   An edge site may use CPUs or GPUs for encoding/decoding.  A GPU
   usually has better performance, but a CPU is simpler and more
   straightforward to use.  Edges differ in the CPU and GPU computing
   resources physically deployed, and the available remaining
   resources determine whether a gaming instance can be started.  The
   CPU, GPU, and memory utilization of an instance has a high impact
   on the processing delay of encoding, decoding, and rendering.
   At the same time, the quality of the network path to the edge site
   is key to the user experience in terms of audio/video quality and
   game command response time.

   Cloud VR services bring challenging requirements on both network
   and computing, so the edge node that serves a service demand has to
   be carefully selected to make sure it has sufficient computing
   resources and a good network path.

4.  Requirements

   This document mainly targets typical edge computing scenarios with
   two key features, service equivalency and service dynamism.

   o  Service equivalency: A service is provided by one or more
      service instances, each providing equivalent service
      functionality to clients.  The existence of several instances
      (possibly across multiple edges) ensures better scalability and
      availability.

   o  Service dynamism: A single instance has very dynamic resources
      over time with which to serve a service demand.  Its dynamism is
      affected by computing resource capacity and load, network path
      quality, etc.  The balancing mechanisms should adapt to the
      service dynamism quickly and seamlessly; failover-style
      switching is not desired.

5.  Problem Statement

   A service demand should be routed in real time to the most suitable
   edge, and further to the service instance, among the multiple edges
   offering service equivalency and dynamism.  Existing mechanisms use
   one or more of the following approaches, and each of them has
   associated issues.

   o  Use the least network cost as the metric to select the edge.
      Issue: computing information is a key consideration in edge
      computing, and it is not included here.

   o  Use the geographical location deduced from the IP prefix and
      pick the closest edge.  Issue: edges are not far apart in edge
      computing scenarios; the location is either hard to deduce from
      the IP address or not the key distinguisher.
   o  Health checks on an infrequent basis (>1s) to reflect the
      service node status, with switching on failover.  Issue: a
      health check is very different from the computing status
      information of a service instance; its granularity is too
      coarse.

   o  The application layer picks a service node randomly or in a
      round-robin way.  Issue: this may share the load across multiple
      service instances in terms of computing capacity, but the
      variance in network cost is barely considered.  Edges can be
      deployed in different cities that are not at equal path cost
      from a client, so network status is also a major concern.

   o  Global resolver and early binding (DNS-based load balancing):
      the client first queries a global resolver or load balancer to
      get the exact server's address, and then steers traffic using
      that address as the binding address.  It is called early binding
      because an explicit binding-address query has to be performed
      before sending user data.  Issue: firstly, it clashes with
      service dynamism; current resolvers do not have the capability
      to redirect to a new instance at the high frequency with which
      each service instance's status changes.  Secondly, edge
      computing flows can be short, completing in one or two round
      trips, so an out-of-band query for a specific server address has
      high overhead, as it takes one more round trip.  As discussed in
      Section 5.4 of [I-D.sarathchandra-coin-appcentres], flexible
      re-routing to appropriate service instances out of a pool of
      available ones faces significant challenges when DNS is used for
      this purpose.

   o  Traditional anycast.  Issue: it only works for single
      request/reply communication; no flow affinity is guaranteed.

   A network-based dynamic anycast (Dyncast) architecture aims to
   address the following points in CFN (CFN-Dyncast).

5.1.
Anycast based service addressing methodology

   A unique service identifier is used by all the service instances of
   a specific service, no matter which edge an instance attaches to.
   An anycast-like addressing and routing methodology among multiple
   edges ensures that a data packet can potentially reach any of the
   edges with a service instance attached.  At the same time, each
   service instance has its own unicast address, used by the attaching
   edge to access the service.  A discovery and mapping methodology
   from the service identifier (an anycast address) to a specific
   unicast address is required, to allow in-band service instance and
   edge selection in real time in the network.

5.2.  Flow affinity

   Traditional anycast is normally used for single-request,
   single-response style communication, as each packet is forwarded
   individually based on the forwarding table at the time.  Packets
   may be sent to different places when the network status changes.
   CFN in edge computing requires multiple-request, multiple-response
   style communication between the client and the service node.
   Therefore, the data plane must maintain flow affinity: all the
   packets of the same flow should go to the same service node.

5.3.  Computing Aware Routing

   Given that the current state of the art in routing is based on
   network cost, computing resource and/or load information is not
   available or distributed at the network layer.  At the same time,
   computing resource metrics are not well defined or understood by
   the network.  They could include CPU/GPU capacity and load, the
   number of sessions currently being served, the expected service
   processing latency, and the weight of each metric.  Hence it is
   hard to make the best choice of edge based on both computing and
   network metrics at the same time.
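   As a sketch only, under assumptions not made by this draft (the
   metric names "net_cost" and "cpu_load", the normalization, and the
   equal weights are all illustrative), a joint computing/network
   selection could combine the two costs per edge:

```python
# Hypothetical per-edge metrics, each normalized to [0, 1].
EDGES = {
    "edge-a": {"net_cost": 0.2, "cpu_load": 0.9},  # close but busy
    "edge-b": {"net_cost": 0.5, "cpu_load": 0.3},  # farther but idle
}

def edge_score(metrics, w_net=0.5, w_comp=0.5):
    """Lower is better: a weighted sum of network path cost and
    computing load, one possible joint metric."""
    return w_net * metrics["net_cost"] + w_comp * metrics["cpu_load"]

def select_edge(edges):
    """Pick the edge with the lowest combined score."""
    return min(edges, key=lambda name: edge_score(edges[name]))

# edge-a has the least network cost, yet edge-b is selected once
# computing load is taken into account.
best = select_edge(EDGES)
```

   How such metrics are defined, weighted, and distributed is exactly
   the open question this section raises.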
   The representation of computing information metrics has to be
   agreed on by the participating edges, and the metrics have to be
   exchanged among them.

   Network cost in the current routing system does not change very
   frequently.  Computing load, however, is highly dynamic: it changes
   rapidly with the number of sessions, CPU/GPU utilization, and
   memory space.  It has to be determined at what interval, or on what
   event, such information needs to be distributed among edges.  More
   frequent distribution gives more accurate synchronization, but also
   more overhead.

   Choosing the least path cost is the most common rule in routing.
   However, this logic does not work well in computing-aware routing.
   Choosing the least computing load may result in oscillation: the
   least-loaded edge can quickly be flooded by a huge number of new
   computing demands and soon become overloaded, and a tidal effect
   may follow.

   Depending on the usage scenario, computing information can be
   carried in BGP, in an IGP, or in an SDN-like centralized way.
   Investigation of those solution spaces is to be elaborated in other
   documents and is out of the scope of this draft.

6.  Summary

   This document presents the CFN-Dyncast problem statement.
   CFN-Dyncast aims at leveraging the resources mobile providers have
   available at the edge of their networks, while also taking into
   account the dynamic nature of service demands and the availability
   of network resources, so as to satisfy service requirements and
   balance the load among service instances.

   This document also illustrates some use cases and problems and
   lists the requirements for CFN-Dyncast.  A CFN-Dyncast architecture
   should address how to distribute computing resource information at
   the network layer and how to assure flow affinity in an
   anycast-based service addressing environment.

7.  Security Considerations

   TBD

8.
IANA Considerations

   No IANA action is required so far.

9.  Informative References

   [I-D.sarathchandra-coin-appcentres]
              Trossen, D., Sarathchandra, C., and M. Boniface,
              "In-Network Computing for App-Centric Micro-Services",
              draft-sarathchandra-coin-appcentres-03 (work in
              progress), October 2020.

Acknowledgements

   The authors would like to thank Yizhou Li, Luigi IANNONE, and Dirk
   Trossen for their valuable suggestions on this document.

Authors' Addresses

   Liang Geng
   China Mobile

   Email: gengliang@chinamobile.com

   Peng Liu
   China Mobile

   Email: liupengyjy@chinamobile.com

   Peter Willis
   BT

   Email: peter.j.willis@bt.com