Internet-Draft Computing Resource Modeling for CAN July 2022
Liu, et al. Expires 12 January 2023
Workgroup:
rtgwg
Internet-Draft:
draft-liu-can-computing-resource-modeling-00
Published:
July 2022
Intended Status:
Informational
Expires:
12 January 2023
Authors:
P. Liu
China Mobile
Z. Du
China Mobile
L. Rui
Beijing University of Posts and Telecommunications
W. Li
Beijing University of Posts and Telecommunications
C. Li
Huawei Technologies
G. Huang
ZTE

Computing Resource Modeling for CAN

Abstract

This document describes the considerations and potential architecture for modeling computing resources in Computing-Aware Networking (CAN).

It also presents network-based and application-based modeling to meet the potential requirements of integrated and hierarchical modeling.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 12 January 2023.

1. Introduction

Computing-Aware Networking (CAN) is proposed to steer traffic among different edge sites according to both real-time network status and computing resource status. This requires the network to be aware of computing resource information and to select a service instance based on a joint metric of computing and networking. [I-D.liu-dyncast-ps-usecases], [I-D.liu-dyncast-gap-reqs], and [I-D.li-dyncast-architecture] proposed Dyncast to meet the traffic steering requirements in CAN.

Generating steering strategies requires a model of computing capacity. Unlike network capacity, computing capacity is complex to measure. For instance, it is hard to predict how long a specific computing task will take on different computing resources, because the processing time is hard to calculate and is influenced by the whole internal environment of the computing node. However, some indicators are already used to describe the computing capacity of hardware and computing services, and related work has been proposed to measure and evaluate computing capacity. Both can serve as the basis of computing capacity modeling.

[cloud-network-edge] proposed to allocate and adjust corresponding resources to users according to the demands of computing, storage and network resources.

[heterogeneous-multicore-architectures] proposed designing heterogeneous multi-core architectures customized for different needs, such as CPU microprocessors with ultra-low power consumption and high code density, low-power microprocessors with an FPU, and high-performance application processors with FPU and MMU support based on a fully out-of-order, multi-issue architecture.

[ARM-based] proposed a cluster scheduling model combined with GPU virtualization and designed a hierarchical cluster resource management framework that allows a heterogeneous CPU-GPU cluster to be used effectively.

Hardware vendors and cloud service providers have also disclosed the parameters and indicators they use for computing services:

[One-api] provides a collection of programming languages and libraries that work across architectures, so as to be compatible with heterogeneous computing resources including CPU, GPU, FPGA, and others. [Amazon] uses computing resource parameters when evaluating performance, including the average CPU utilization, the average number of bytes received and sent out, and the average Application Load Balancer request count per target. Alibaba Cloud [Aliyun] gives indicators including vCPU, memory, local storage, basic and burst network bandwidth, and network packet receiving and sending capacity when providing cloud server services. [Tencent-cloud] uses vCPU, memory (GB), network packet receiving and sending rate (PPS), number of queues, intranet bandwidth (Gbps), clock frequency, etc.

Based on the above and on the demands of CAN traffic steering, this document analyzes the types of computing resources and tasks, and provides the factors to be considered when modeling and evaluating computing resource capacity. This document does not specify how the model is to be used, including who models the computing resource, which factors must be considered, and the form in which the modeling results are represented. The proposed vector of modeling results could be further weighted into a group of indicators or a single indicator according to the specific demands of applications.

2. Definition of Terms

This document makes use of the following terms:

Computing-Aware Networking (CAN):
Aims at optimizing computing and network resources by steering traffic to appropriate computing resources, considering not only routing metrics but also computing resource metrics and service affiliation.
Service:
A monolithic functionality that is provided by an endpoint according to the specification for said service. A composite service can be built by orchestrating monolithic services.
Service instance:
Running environment (e.g., a node) that makes the functionality of a service available. One service can have several instances running at different network locations.
Service identifier:
Used to uniquely identify a service, at the same time identifying the whole set of service instances that each represent the same service behavior, no matter where those service instances are running.
Service transaction:
Consists of one or more service requests, which may span several flows that require affinity because of transaction-related state.
Computing Capacity:
The ability of nodes with computing resources to produce a specific output through data processing, including but not limited to computing, communication, memory, and storage capacity.

3. Requirements of Computing Resource Modeling

3.1. Support Classification of Chips and Computing Types

Heterogeneous computing resources have different characteristics. For example, CPUs usually handle general-purpose computing and are the most widely used; GPUs usually handle parallel computing, such as rendering for display tasks, and are widely used in artificial intelligence and neural network computing. FPGAs and ASICs are usually used to handle customized computing. At the same time, different computing tasks need different calculation types, such as integer calculation, floating-point calculation, and hash calculation. Therefore:

MUST support the classification of various heterogeneous chips for different kinds of computing tasks.

MUST support the classification of the computing types required by the task.

3.2. Support Multi-level Modeling

Network and computing resources are multi-dimensional and hierarchical, covering cache, storage, communication, etc. These dimensions affect each other and in turn affect the overall level of computing capacity, so factors other than computing itself need to be considered in modeling. The form of computing resources is also hierarchical, spanning computing type, chip type, hardware type, and the converged network. Different computing forms, such as gateways, all-in-one machines, edge clouds, and central clouds, also provide different computing capacities and types. It is therefore necessary to consider multi-dimensional and multi-modal resources comprehensively and to provide multi-level modeling according to application demands. Therefore:

MUST support modeling of computing nodes, including computing, storage, communication, etc.

SHOULD support the integrated modeling of the converged network.

3.3. Support to be used for Further Representation

Modeling itself provides a general method to evaluate the capacity of computing resources. For CAN, modeling-based computing resource representation is the basis for subsequent traffic steering. In addition, different applications may optimize the general modeling methods to establish a set of models that conform to their own characteristics, and thereby generate corresponding representation methods. Moreover, in order to use computing resource status more efficiently and to protect privacy, modeling for the further representation of resource information needs to support the necessary simplification and obfuscation. Therefore:

MUST support different modeling methods according to specific representation demands.

MUST support application-oriented modeling methods.

MUST support obscuring the computing information on demand of the application.

4. Usage of Computing Resource Modeling of CAN

4.1. Modeling Based on CAN-defined Format

Figure 1 shows the case of modeling based on a CAN-defined format. CAN provides the modeling format to the computing domain, which uses it to evaluate its computing resource capacity and returns the result through a unified interface. The interface defines the properties that should be notified to CAN. CAN can then select a specific service instance based on the computing resource and network resource status.

In this case, the CAN domain and the computing domain have a relatively loose boundary, because the CAN service and the computing resource belong to the same provider. How much CAN is aware of the computing resource depends on the privacy-preserving demands of the computing domain. The exposed computing capacity includes the static information of the category or level of the computing node and the dynamic capability information of the computing node.

Based on the static information, visualization functions can be implemented on the management plane to provide a global view of computing resources, which could also help the deployment of applications by taking into account the overall distribution of computing and network resources. Based on the dynamic information, CAN could steer the traffic of category-based applications using the unified modeling format and interface.

                                   |

         CAN Domain                |                     Computing Domain

+--------+    ---------------------->------------------->  +-------------+
|visuali-|                   Modeling Format               |  Computing  |
|zation  |                         |                       |             |
+--------+    <--------------------<---------------------  |  Resource   |
|Traffic |      Static level/category of computing node    |             |
|Steering|                         |                       |  Modeling   |
+--------+    <--------------------<---------------------  +-------------+
                  Dynamic capability of computing node

                                   |

                                   |
Figure 1: Modeling Based on CAN-defined Format
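
As an illustration only, the two kinds of information exposed in Figure 1 could be carried in records like the ones below. This is a minimal Python sketch; the class names, field names, value ranges, and the selection rule are assumptions made for this example and are not defined by this document.

```python
from dataclasses import dataclass

# Hypothetical record types; all field names are illustrative assumptions.

@dataclass(frozen=True)
class StaticInfo:
    node_id: str
    category: str  # e.g. "CPU", "GPU", "FPGA", "ASIC"
    level: int     # coarse capacity level assigned by the computing domain

@dataclass
class DynamicInfo:
    node_id: str
    capability: float  # current capability, assumed normalized to [0, 1]

def candidates(static, dynamic, category):
    """Return node ids of the given category, best current capability first."""
    current = {d.node_id: d.capability for d in dynamic}
    nodes = [s.node_id for s in static
             if s.category == category and s.node_id in current]
    return sorted(nodes, key=lambda n: current[n], reverse=True)

static = [StaticInfo("edge-1", "GPU", 2),
          StaticInfo("edge-2", "CPU", 1),
          StaticInfo("edge-3", "GPU", 3)]
dynamic = [DynamicInfo("edge-1", 0.4),
           DynamicInfo("edge-2", 0.9),
           DynamicInfo("edge-3", 0.7)]
```

In this sketch the static records would feed a visualization function, while the dynamic records drive the per-category selection.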

4.2. Modeling Based on Application-defined Method

Figure 2 shows the case of modeling based on an application-defined method. The computing resource of a specific application evaluates its computing capacity by itself, and then notifies CAN of the result, which might be an index of the real-time computing level. CAN then selects a specific service instance based on the computing index.

In this case, the CAN domain and the computing domain have a strict boundary, because the CAN service and the computing resource belong to different providers. CAN is only aware of the index of the computing resource, which is defined by the application, and does not know the real status of the computing domain; traffic steering is potentially controlled by the application itself. If CAN is authorized by the application, it could also steer traffic based on network status at the same time.

                         |                     |
                         |                     |
         CAN Domain      |                     |       Computing Domain
                         |                     |
                         |                     |           +-------------+
+--------+               |                     |           |  Computing  |
|Traffic |               |                     |           |             |
|        |    <---------------------<---------- ---------- |  Resource   |
|Steering|      dynamic index of computing capacity level  |             |
+--------+               |                     |           |  Modeling   |
                         |                     |           +-------------+
                         |                     |
                         |                     |
                         |                     |
                         |                     |
Figure 2: Modeling Based on Application-defined Method
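
The selection in Figure 2 might look like the following Python sketch, in which CAN sees only an opaque index per service instance. The function name, the assumption that a higher index means more available capacity, and the tie-breaking rule for the authorized case are all illustrative and are not specified by this document.

```python
def select_instance(indices, network_latency_ms=None):
    """Pick a service instance from application-defined indices.

    indices: instance name -> opaque index (assumed: higher is better).
    network_latency_ms: instance name -> latency; passed only when CAN is
    authorized by the application to take network status into account.
    """
    if network_latency_ms is None:
        # CAN only knows the index and nothing about the computing domain.
        return max(indices, key=lambda inst: indices[inst])
    # Authorized case: highest index first, then lowest network latency.
    return max(indices,
               key=lambda inst: (indices[inst], -network_latency_ms[inst]))

indices = {"edge-1": 3, "edge-2": 5, "edge-3": 5}
latency = {"edge-1": 1.0, "edge-2": 20.0, "edge-3": 10.0}
```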

5. Architecture of Computing Modeling

This section describes a potential architecture for computing resource modeling. It is independent of how the model is further used for traffic steering in CAN, including both of the usages described in Section 4.

According to the computing indicators and related work described in Section 1, computing capacity includes the types of computing resources and tasks, and multi-dimensional capabilities such as communication, memory, and storage also need to be considered, because these factors affect each other. For instance, with the rapid growth of modern CPU performance, the communication bottleneck between CPU and cache has become increasingly prominent. Moreover, the storage capacity greatly affects the processing speed of a computer. The resulting architecture of computing capacity modeling is shown in Figure 3.

                                                           +-------+      +-------+
                                                        +--|  CPU  |  +---|  GPU  |
                                       +-------------+  |  +-------+  |   +-------+
                                       |    Chips    |--+-------------+
                                    +--|  Category   |  |  +-------+  |   +-------+
                                    |  +-------------+  +--| FPGA  |  +---|  ASIC |
                   +-------------+  |                      +-------+      +-------+
                   |  Computing  |--+
                +--|  Capacity   |--+                      +----------------------+
                |  +-------------+  |                   +--|  intCalculationRate  |
                |  +-------------+  |  +-------------+  |  +----------------------+
                +--|Communication|  +--|  Computing  |  |  +----------------------+
+-------------+ |  |  Capacity   |     |    Types    |--+--| floatCalculationRate |
|  Computing  | |  +-------------+     +-------------+  |  +----------------------+
|  Resource   |-+  +-------------+                      |  +----------------------+
|  Modeling   | |  |   Cache     |                      +--|  hashCalculationRate |
+-------------+ +--|  Capacity   |                         +----------------------+
                |  +-------------+
                |  +-------------+
                +--|  Storage    |
                   |  Capacity   |
                   +-------------+
Figure 3: Reference Architecture of Computing Modeling Format

5.1. Computing Capacity

The computing capacity includes the chip category and the computing types. Common chip types include CPU, GPU, FPGA, and ASIC. CPUs and GPUs follow the von Neumann architecture, with instruction decoding and execution and shared memory. According to the characteristics and requirements of computing programs, computing performance can be divided into integer, floating-point, and hash computing performance.

5.1.1. Types of Chips

A CPU (Central Processing Unit) is a general-purpose processor that needs to handle comprehensive and complex tasks, as well as the synchronization and coordination between tasks. Therefore, a lot of chip area is required to perform branch prediction and optimization and to save various states in order to reduce the delay of task switching. This also makes it more suitable for logic control, serial operations, and general-purpose data operations.

A GPU (Graphics Processing Unit) has a large-scale parallel computing framework composed of thousands of smaller and more efficient ALU cores. Most of its transistors are devoted to arithmetic units rather than to control circuits and caches, and the control circuits are relatively simple.

An FPGA (Field Programmable Gate Array) is essentially an architecture without instructions and shared memory, which can make it more efficient than a GPU or CPU for suitable workloads. The main advantages of an FPGA in data processing tasks are its stability and extremely low latency, which make it suitable for streaming computing-intensive tasks and communication-intensive tasks.

An ASIC (Application Specific Integrated Circuit) is an integrated circuit customized for a particular use, and its performance is generally better than that of an FPGA. However, for customers with custom requirements, its cost is much higher than that of an FPGA.

On this basis, chip manufacturers have also developed various "XPUs" for different computing task requirements, including the APU (Accelerated Processing Unit), DPU (Deep-learning Processing Unit), TPU (Tensor Processing Unit), NPU (Neural-network Processing Unit), and BPU (Brain Processing Unit), which are built on CPU, GPU, FPGA, and ASIC technology.

5.1.2. Type of Computing

At present, the computing types in computers mainly include integer calculation, floating-point calculation, and hash calculation.

The integer calculation rate is expressed as the calculation rate of the integer data operation benchmark program running on the CPU. Integer computing capability has its specific application scenarios, such as discrete-time processing, data compression, search, sorting algorithm, encryption algorithm, decryption algorithm, etc.

The floating-point calculation rate is expressed as the calculation rate of the floating-point data operation benchmark program running on the CPU. There are many kinds of benchmark programs, each of which reflects the floating-point computing performance of nodes from a different aspect.

The hash calculation rate refers to the output speed of the hash function when the computer performs intensive mathematical and encryption-related operations. For example, in the process of obtaining bitcoin through "mining", it is the number of hash computations a mining machine can perform per second, measured in hash/s.

5.1.3. Relation of Computing Types and Chips

The differences in computing capacity among the above chip types are summarized in Figure 4. CPU is good at intCalculation; GPU and FPGA are good at floatCalculation; and ASIC is good at floatCalculation and hashCalculation.

+-----+------------------+------------------+------------------+
|     |  intCalculation  | floatCalculation |  hashCalculation |
+-----+------------------+------------------+------------------+
| CPU |       Good       |     Ordinary     |     Ordinary     |
+-----+------------------+------------------+------------------+
| GPU |     Ordinary     |       Good       |     Ordinary     |
+-----+------------------+------------------+------------------+
| FPGA|     Ordinary     |       Good       |     Ordinary     |
+-----+------------------+------------------+------------------+
| ASIC|     Ordinary     |       Good       |       Good       |
+-----+------------------+------------------+------------------+
Figure 4: Relation of Computing Types and Chips
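
For illustration, the ratings in Figure 4 could be held as a lookup table that answers which chip categories suit a given computing type. This Python sketch only transcribes the table; the dictionary layout, the short keys "int", "float", and "hash", and the function name are assumptions of this example.

```python
# Figure 4 transcribed as a lookup table (layout and key names are
# illustrative assumptions, the ratings are those of the table).
RATING = {
    "CPU":  {"int": "good",     "float": "ordinary", "hash": "ordinary"},
    "GPU":  {"int": "ordinary", "float": "good",     "hash": "ordinary"},
    "FPGA": {"int": "ordinary", "float": "good",     "hash": "ordinary"},
    "ASIC": {"int": "ordinary", "float": "good",     "hash": "good"},
}

def chips_good_at(computing_type):
    """Return the chip categories rated 'good' for a computing type."""
    return [chip for chip, ratings in RATING.items()
            if ratings[computing_type] == "good"]
```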

5.1.4. Consideration of Using in CAN

With CAN-defined modeling, CAN can obtain a greater or lesser amount of computing information about edge sites and service instances. Assuming that the CAN system can also obtain the characteristics, demands, or identifier of a service transaction, it can then select a service instance among different edge sites. For example, for a service transaction whose task is image processing, the CAN system could use the identifier of the service category or service demand to find a suitable edge site or service instance that has floating-point calculation or GPU computing resources.

When used in the network, codes such as 00, 01, and 10 could represent the different computing chips or computing tasks. These codes could be recorded in the control plane to support mapping to, and further selection of, the computing resource. In some cases there will be more computing resource factors, so obscuring and weighting are needed, and the representation or signaling of the computing status might not be so direct.

With application-defined modeling, CAN might not know any explicit information about computing types or chip categories, and might not even know what kind of index it receives.
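
The two-bit codes mentioned above could be handled as in the following Python sketch. The assignment of 00, 01, and 10 to integer, floating-point, and hash calculation, and the control-plane table mapping codes to service instances, are hypothetical and are chosen only to make the example concrete.

```python
# Hypothetical code points; this document does not assign them.
COMPUTING_TYPE_CODE = {"int": 0b00, "float": 0b01, "hash": 0b10}
CODE_TO_TYPE = {code: name for name, code in COMPUTING_TYPE_CODE.items()}

def encode(computing_type):
    """Return the two-bit string for a computing type, e.g. '01'."""
    return format(COMPUTING_TYPE_CODE[computing_type], "02b")

def decode(bits):
    """Return the computing type for a two-bit string."""
    return CODE_TO_TYPE[int(bits, 2)]

# Illustrative control-plane record: code -> service instances offering it.
instances_by_code = {"00": ["edge-2"], "01": ["edge-1", "edge-3"]}

def lookup(computing_type):
    """Map a demanded computing type to candidate service instances."""
    return instances_by_code.get(encode(computing_type), [])
```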

5.2. Communication, Cache and Storage Capacity

Besides the computing capacity, the communication, cache, and storage capacity should also be considered because each of them can potentially influence the comprehensive capacity of computing resource nodes.

The communication capacity is the external communication rate of a computing node. From the point of view of a single node, the communication capability indicator mainly consists of the network bandwidth. Moreover, a cluster of service instances often serves one task (as in a Hadoop architecture). Therefore, the network capacity among those instances is also an important factor in assessing the capability of the cluster of service nodes for one task.

The cache (memory) capacity describes the amount of cache on a node. The cache (memory) indicators mainly include the cache (memory) capacity and the cache (memory) bandwidth.

The storage capacity is that of the external storage (for example, hard disks) of the computing node. The storage indicators of a node mainly include the storage capacity, storage bandwidth, I/O operations per second (IOPS), and response time.
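
As a sketch of how the indicators listed in this section might be collected for one node, the Python record below groups them and reduces the storage indicators to a single score. The field names, the units, and the scoring rule (average ratio against a reference node, capped at 1, with response time inverted because lower is better) are assumptions of this example, not part of the model.

```python
from dataclasses import dataclass

@dataclass
class NodeIndicators:
    # Units are illustrative assumptions.
    network_bandwidth_gbps: float
    cache_capacity_gb: float
    cache_bandwidth_gbps: float
    storage_capacity_gb: float
    storage_bandwidth_mbps: float
    storage_iops: float
    storage_response_ms: float

def storage_score(node, ref):
    """Average of the storage indicators relative to a reference node."""
    ratios = [
        node.storage_capacity_gb / ref.storage_capacity_gb,
        node.storage_bandwidth_mbps / ref.storage_bandwidth_mbps,
        node.storage_iops / ref.storage_iops,
        ref.storage_response_ms / node.storage_response_ms,  # lower is better
    ]
    return sum(min(r, 1.0) for r in ratios) / len(ratios)

ref = NodeIndicators(10, 64, 100, 1000, 500, 10000, 1.0)
node = NodeIndicators(5, 32, 50, 500, 250, 5000, 2.0)
```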

5.3. Comprehensive Computing Capability Evaluation

Based on the architecture of computing resource modeling, this section proposes a comprehensive performance evaluation method that uses vectors to represent the computing, communication, cache, and storage capabilities.

Figures 5 to 8 show the vectors of computing node i, one for each aspect.

     +-                         -+
A(i)=|   Computing Capacity(i)   |
     +-                         -+
Figure 5: Computing Performance Vector
     +-                         -+
B(i)=| Communication Capacity(i) |
     +-                         -+
Figure 6: Communication Performance Vector
     +-                        -+
C(i)=|     Cache Capacity(i)    |
     +-                        -+
Figure 7: Cache Performance Vector
     +-                         -+
D(i)=|    Storage Capacity(i)    |
     +-                         -+
Figure 8: Storage Performance Vector

The vectors of computing capacity, communication capacity, cache capacity, and storage capacity could be further weighted into a comprehensive vector.

V(i) = a*A(i) + b*B(i) + c*C(i) + d*D(i)
Figure 9: Comprehensive Performance Vector

where a, b, c, and d are the weight coefficients corresponding to the evaluation indicators of computing capacity, communication capacity, cache capacity, and storage capacity respectively, and a+b+c+d=1.
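
The weighted combination above can be sketched in Python as follows. The sketch assumes that each of the four capacities has already been reduced to a score normalized to [0, 1], so that the terms are comparable; that normalization step, the function names, and the default weights are assumptions of this example.

```python
import math

def comprehensive_score(computing, communication, cache, storage,
                        weights=(0.4, 0.2, 0.2, 0.2)):
    """Weighted sum a*A + b*B + c*C + d*D for one computing node.

    Each input is assumed to be a score already normalized to [0, 1].
    """
    a, b, c, d = weights
    if not math.isclose(a + b + c + d, 1.0):
        raise ValueError("weights must sum to 1")
    return a * computing + b * communication + c * cache + d * storage

def rank_nodes(scores, weights):
    """Return node names from highest to lowest comprehensive score."""
    return sorted(
        scores,
        key=lambda n: comprehensive_score(*scores[n], weights=weights),
        reverse=True,
    )

weights = (0.5, 0.25, 0.125, 0.125)
scores = {"edge-1": (1.0, 0.5, 0.0, 1.0), "edge-2": (0.0, 1.0, 1.0, 1.0)}
```

Different applications would choose different weights; a compute-bound task might raise a, while a streaming task might raise b.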

5.4. Consideration of Using in CAN

The vector gives an overall view of the evaluation result of the computing resource, but no specific expression is specified. That is, the model only covers the computing, communication, cache, and storage capabilities of the computing resource, and the result could be weighted into either of the following forms for use under different demands:

o a group of vectors representing the weighted levels of computing, communication, cache, and storage capacity.

o a single vector representing the comprehensive, weighted level of overall capability.

The CAN system could then select the service instance based on the processed vector. To expose the computing status, existing protocols could be extended, which is out of the scope of this document.

6. Network Resource Modeling

The modeling of the network resource is optional and depends on how the service instance and network path are selected. For applications that care about both network and computing resources, the CAN service provider also needs to consider the modeling of network and computing together.

The network structure can be represented as a graph, where the nodes represent the network devices and the edges represent the network links between them. The model should evaluate the individual nodes, the network links, and the end-to-end (E2E) performance.
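
As one possible reading of the graph representation, the Python sketch below stores per-link latency on the edges and derives an E2E latency with Dijkstra's algorithm. The choice of latency as the link metric, the topology, and the function name are assumptions of this example; other metrics would need other combination rules.

```python
import heapq

def e2e_latency(graph, src, dst):
    """Lowest total link latency from src to dst, or None if unreachable.

    graph: node -> {neighbor: link latency in ms} (directed edges).
    """
    best = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        dist, node = heapq.heappop(heap)
        if node == dst:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, latency in graph.get(node, {}).items():
            candidate = dist + latency
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return None

# Illustrative topology: an ingress router, two routers, two edge sites.
graph = {
    "ingress": {"r1": 2.0, "r2": 5.0},
    "r1": {"edge-1": 4.0, "r2": 1.0},
    "r2": {"edge-2": 1.0},
}
```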

6.1. Consideration of Using in CAN

When both the computing and network status need to be considered at the same time, comprehensive modeling of computing and network might be used, for example by measuring all resources in a unified dimension such as latency or reliability.

If there is no strict demand to consider them at the same time, for instance when the computing status is considered first and the network status afterwards, CAN could select the service instance first and then mark an identifier for path selection by the network itself. In this situation, network modeling is not really needed.
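
The two options in this section can be contrasted with a small Python sketch. It assumes latency as the unified dimension and assumes that an estimated processing latency per service instance is available; both function names and all values are illustrative.

```python
def select_joint(instances):
    """Joint selection: minimize network plus computing latency."""
    return min(instances, key=lambda name: sum(instances[name]))

def select_computing_first(instances):
    """Computing-first selection: minimize computing latency only and
    leave path selection to the network itself."""
    return min(instances, key=lambda name: instances[name][1])

# instance -> (network latency in ms, estimated computing latency in ms)
instances = {
    "edge-1": (2.0, 10.0),
    "edge-2": (8.0, 3.0),
    "edge-3": (1.0, 6.0),
}
```

With these values the joint rule prefers edge-3, while the computing-first rule prefers edge-2, which shows why the two approaches can steer traffic differently.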

7. Application Demands Modeling

An application is usually composed of several sub-services that perform different functions, and a service is usually composed of several sub-transactions, which would be the smallest schedulable units.

An application usually has its own demands for network and computing resources. For instance, HD video typically requires high bandwidth, and a PC game typically requires a better GPU and more memory.

7.1. Consideration of Using in CAN

The modeling of the application demand is optional and depends on whether the application can tell its demands to the network, and on what it can tell. Once CAN knows the application's demands, there should be a mapping between the application demands and the modeling of the computing and/or network resource.
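
A mapping between application demands and modeled resources might be sketched in Python as below, reusing the HD video and PC game examples from Section 7. The profile names, resource keys, and threshold values are hypothetical and serve only to show the shape of such a mapping.

```python
# Hypothetical demand profiles; keys and thresholds are assumptions.
DEMAND_PROFILES = {
    "hd-video": {"bandwidth_mbps": 25},
    "pc-game": {"gpu": True, "memory_gb": 16},
}

def satisfies(resources, demand):
    """Check whether a site's modeled resources meet every demand item."""
    for key, required in demand.items():
        have = resources.get(key)
        if required is True:
            # Boolean demand: the resource must be present.
            if have is not True:
                return False
        elif have is None or have < required:
            # Numeric demand: the resource must reach the threshold.
            return False
    return True

edge_a = {"bandwidth_mbps": 100, "gpu": False, "memory_gb": 8}
edge_b = {"bandwidth_mbps": 20, "gpu": True, "memory_gb": 32}
```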

8. Conclusion

This document presents potential modeling methods that allow CAN to steer traffic accurately to the appropriate edge sites. The modeling algorithm and the modeling process might belong to the computing domain, while the further representation and signaling of the weighted computing information based on the modeling could be the basis of traffic steering. Moreover, the visualization of computing resources and further functions could be realized to support the joint optimization of computing and network.

9. Security Considerations

TBD.

10. IANA Considerations

TBD.

11. Acknowledgements

The authors would like to thank Thomas Fossati, Dirk Trossen, and Linda Dunbar for their valuable suggestions on this document.

12. Contributors

The following people have substantially contributed to this document:

        Jing Wang
        China Mobile
        wangjingjc.chinamobile.com

13. Informative References

[I-D.liu-dyncast-ps-usecases]
Liu, P., Eardley, P., Trossen, D., Boucadair, M., Contreras, L. M., and C. Li, "Dynamic-Anycast (Dyncast) Use Cases and Problem Statement", Work in Progress, Internet-Draft, draft-liu-dyncast-ps-usecases-03, <https://www.ietf.org/archive/id/draft-liu-dyncast-ps-usecases-03.txt>.
[I-D.liu-dyncast-gap-reqs]
Liu, P., Jiang, T., Eardley, P., Trossen, D., and C. Li, "Dynamic-Anycast (Dyncast) Gap analysis and Requirements", Work in Progress, Internet-Draft, draft-liu-dyncast-gap-reqs-00, <https://www.ietf.org/archive/id/draft-liu-dyncast-gap-reqs-00.txt>.
[I-D.li-dyncast-architecture]
Li, Y., Iannone, L., Trossen, D., Liu, P., and C. Li, "Dynamic-Anycast Architecture", Work in Progress, Internet-Draft, draft-li-dyncast-architecture-04, <https://www.ietf.org/archive/id/draft-li-dyncast-architecture-04.txt>.
[One-api]
One-api, <http://www.oneapi.net.cn/>.
[Amazon]
Amazon, <https://docs.aws.amazon.com/autoscaling/ec2/userguide/as-scaling-target-tracking.html#available-metrics>.
[Aliyun]
Aliyun, <https://help.aliyun.com/?spm=a2c4g.11186623.6.538.34063af89EIb5v>.
[Tencent-cloud]
Tencent-cloud, <https://buy.cloud.tencent.com/pricing>.
[cloud-network-edge]
cloud-network-edge, "A new edge computing scheme based on cloud, network and edge fusion".
[heterogeneous-multicore-architectures]
access, I., "Towards energy-efficient heterogeneous multicore architectures for edge computing".
[ARM-based]
Guide, S., "A heterogeneous CPU-GPU cluster scheduling model based on ARM".

Authors' Addresses

Peng Liu
China Mobile
No.32 XuanWuMen West Street
Beijing
100053
China
Zongpeng Du
China Mobile
No.32 XuanWuMen West Street
Beijing
100053
China
Lanlan Rui
Beijing University of Posts and Telecommunications
No.10 XiTuCheng Road, Haidian District
Beijing
100876
China
Wenjing Li
Beijing University of Posts and Telecommunications
No.10 XiTuCheng Road, Haidian District
Beijing
100876
China
Cheng Li
Huawei Technologies
Guangping Huang
ZTE