RTGWG D.H. Huang
Internet-Draft ZTE Corporation
Intended status: Standards Track G.C. Chen
Expires: 25 April 2024 J.L. Liang
China Telecom
Y.Z. Zhang
China Unicom
D.Y. Dong
Beijing Jiaotong University
Y.DY. Yuan
F.HK. Fu
C. Huang
Y. Guo
ZTE Corporation
23 October 2023
Service ID for Addressing and Networking
draft-huang-rtgwg-sid-for-networking-00
Abstract
A growing number of emerging applications demand network
connectivity anywhere and at any time, alongside the availability of
highly distributed any-cloud services. This demand motivates the
need to efficiently interconnect heterogeneous entities, e.g.,
network domains and clouds owned by different providers, with the
goal of reducing cost, e.g., overhead and end-to-end latency, while
ensuring that overall performance satisfies application
requirements. Because different network domains and cloud providers
may adopt different technologies, the key to interconnection and
efficient coordination is a unified interface that heterogeneous
parties can all understand, from which they can derive consistent
requirements for the same service, and according to which they can
treat the service traffic appropriately using their own proprietary
policies and technologies. A service ID is a promising candidate
for this unified interface, since it can be designed to be
lightweight and secure and to enable fast and efficient packet
treatment. Leveraging a service ID, addressing and networking among
heterogeneous network domains and cloud providers can be accomplished
by establishing a mapping between the unified service ID and the
specific technologies used by each network domain or cloud provider.
This document provides typical use cases of unified service ID for
addressing and routing (SIAN), validating that interconnecting
different network domains or cloud providers can be achieved at lower
Huang, et al. Expires 25 April 2024 [Page 1]
Internet-Draft Abbreviated Title October 2023
cost without sacrificing application performance compared with
existing methods, whose problems and gaps are also analyzed.
Requirements for SIAN are derived from these scenarios. Finally, a
framework solution is presented.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at https://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on 25 April 2024.
Copyright Notice
Copyright (c) 2023 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents (https://trustee.ietf.org/
license-info) in effect on the date of publication of this document.
Please review these documents carefully, as they describe your rights
and restrictions with respect to this document. Code Components
extracted from this document must include Revised BSD License text as
described in Section 4.e of the Trust Legal Provisions and are
provided without warranty as described in the Revised BSD License.
Table of Contents
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1. Requirements Language . . . . . . . . . . . . . . . . . . 5
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5
3. Use Case Scenarios . . . . . . . . . . . . . . . . . . . . . 5
3.1. Service Mesh Management in Multi-Cloud . . . . . . . . . 6
3.2. Multi-Domain and Multi-Cloud interconnection . . . . . . 7
4. Problems and gap analysis of the existing service
identification mechanism . . . . . . . . . . . . . . . . 8
4.1. State burden of service identification . . . . . . . . . . 9
4.2. Granularity and traffic engineering of service
identification . . . . . . . . . . . . . . . . . . . . . 9
4.2.1. Granularity of service identification . . . . . . . . 9
4.2.2. Traffic Engineering of service identification . . . . 9
4.3. Service operation and fulfillment . . . . . . . . . . . . 10
4.4. Convergence of network and cloud . . . . . . . . . . . . 10
4.5. L4/L7 gateway in the way of end to end service traffic . 11
5. Requirements of service identification for addressing and
networking . . . . . . . . . . . . . . . . . . . . . . . 12
6. Framework consideration of service identification for
addressing and networking . . . . . . . . . . . . . . . . 12
6.1. Service ID over existing networking IDs and labels . . . 12
6.2. Service ID Management and maintenance . . . . . . . . . . 13
6.3. Lifecycle and governance of service ID . . . . . . . . . 13
6.4. Key Processes of service . . . . . . . . . . . . . . . . 14
6.5. Service ID based Routing and Forwarding reference framework
and work flow . . . . . . . . . . . . . . . . . . . . . . 15
6.5.1. Components and working mechanism . . . . . . . . . . 15
6.5.2. Control Plane Consideration . . . . . . . . . . . . . 17
6.5.3. Data Plane Consideration . . . . . . . . . . . . . . 18
6.5.3.1. Service ID Encapsulated in The IP Address
Field . . . . . . . . . . . . . . . . . . . . . . . 18
6.5.3.2. Service ID Standalone Encapsulation . . . . . . . 18
6.5.3.3. Service ID-based forwarding . . . . . . . . . . . 20
6.6. OAM Consideration . . . . . . . . . . . . . . . . . . . . 21
6.7. End to end service flow upon service ID . . . . . . . . . 23
6.7.1. Service ID in destination address . . . . . . . . . . 23
6.7.2. Service ID in flow label and extension headers . . . 24
7. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 24
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 24
9. Security Considerations . . . . . . . . . . . . . . . . . . . 24
10. Informative References . . . . . . . . . . . . . . . . . . . 24
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 25
1. Introduction
Emerging applications impose stringent requirements, such as data
rates above 1 Gbps and delays below 10 ms. To satisfy these
requirements, three major trends have emerged. First, the
cloud-native paradigm allows an application to be decomposed into
multiple microservices, each performing an independent piece of
functionality. Second, virtualization technologies decouple logical
functions from the physical infrastructure, enabling microservices
to be deployed in multiple network locations while their aggregate
performance matches that of the monolithic application. Third,
cloud computing tasks are offloaded to the edge, such as base
stations, vehicles, or even handheld devices, which brings
microservices closer to clients. These three trends lead to the deployment
of highly distributed any-cloud services and to the demand for
Internet connectivity anywhere and at any time. However, given the
heterogeneous technologies adopted by different entities, i.e.,
network domains and cloud providers, and the dynamics involved in
selecting appropriate service nodes when clients move or available
resources change, it remains a challenge to interconnect different
entities efficiently with a consistent SLA guarantee. Currently,
when a packet is delivered from one network domain to another, it is
generally sent through a tunnel whose two endpoints are located in
the two network domains. The tunnel is agnostic to the underlying
technologies used by the two network domains, but encapsulation and
decapsulation at the endpoints increase the end-to-end delay.
Moreover, establishing and tearing down a tunnel takes time, which
prevents the tunneling approach from dynamically selecting
appropriate network domains.
To achieve efficient inter-domain or inter-cloud communication, it
is critical to design a unified interface that any network domain or
cloud provider can understand. Among the available technologies,
the service ID is a promising candidate for this unified interface,
and this document selects typical use cases to demonstrate its
advantages. Leveraging a service ID, addressing and networking
among heterogeneous network domains and cloud providers with a
consistent service SLA guarantee can be accomplished by establishing
a mapping between the unified service ID and the specific
technologies used by each network domain or cloud provider.
[I-D.trossen-rtgwg-rosa-arch] uses a service address as an anycast
address in the overlay network for service-oriented addressing,
which decouples the service from specific networking and computing
resources. [I-D.ldbc-cats-framework] employs a computing service ID
as an index for computing-aware traffic steering, and
[I-D.li-apn-framework] designs an APP ID as an interface between an
application, together with its networking requirements, and the
underlying network. Rather than relying on the conventional 5-tuple
for traffic steering or as an interface between different parties,
employing a lightweight, standalone service ID in the routing
network could close the gaps analyzed in Section 4 with significant
benefits. This document describes the main gaps and the overall
requirements for a service ID, focuses on use cases that the above
drafts have not yet covered, and presents end-to-end solution
considerations from the perspective of interconnecting networks,
clouds, and terminals.
1.1. Requirements Language
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
2. Terminology
* Service ID: Service Identifier
* SIAN: Service ID for Addressing and Networking
* SCMS: Service Control and Management System
* SSMC: SIAN Service Metric Collector
* SNMC: SIAN Network Metric Collector
* SUTF: SIAN User Traffic Forwarder
* SPIS: SIAN Path and Instance Selector
* OAM: Operations, Administration, and Maintenance
* FM: Failure Management
* PM: Performance Management
* CSP: Cloud Service Provider
* Multi-cloud [ITU-T Y.3537]: Use of cloud services in the public
cloud from two or more independent cloud service providers (CSPs)
at the same time for business.
3. Use Case Scenarios
This section presents two typical use cases that require the
interconnection of different entities such as network domains and
CSPs. For each case, it describes the complexity and the possible
hindrances to service performance under existing methods and
illustrates how a service ID facilitates interconnection between
different parties.
3.1. Service Mesh Management in Multi-Cloud
In the cloud-native paradigm, an application is generally decomposed
into multiple microservice components, each performing an
independent piece of functionality, and a service mesh is used to
manage inter-service communication. Because of constraints imposed
by computing resources (e.g., processor types or storage capacity)
and by the capabilities of different CSPs, especially the cost and
energy limitations of edge computing providers, deploying all
application functionality at one site is often not economically
feasible. Instantiating and executing microservice components in
multi-cloud environments, and then performing inter-service
networking over the Internet, therefore becomes attractive.
Moreover, to ensure that the aggregate performance of microservices
deployed in multi-cloud matches that of the monolithic cloud
service, service mesh management in multi-cloud is an essential use
case.
The existing service mesh management approach has two limitations.
First, it runs in each CSP’s internal SDN domain, which is separated
from those of other CSPs or external third parties. The service
access point (at either layer 4 or layer 7 from the client’s point
of view) currently resides in the API gateway that each CSP operates
separately. Before a client’s request is processed, one CSP’s API
gateway must first be appointed as the service access point to
intercept all incoming requests. The API gateway then processes the
client’s request according to the capabilities and container
orchestration of its own CSP. Consequently, if a client’s front-end
application tries to access ubiquitous computing resources and use
an available back-end microservice instance deployed in a different
CSP, this approach limits the client’s ability to switch CSPs, and
complicates doing so, even when better choices are available in
other CSPs.
Second, inter-service communication is generally conducted using
sidecar proxies that are collocated with service pods. As shown in
Fig. 1, a sidecar proxy, e.g., sidecar proxy A, intercepts all
incoming traffic of the collocated service pod, e.g., micro-service
A, decapsulates packets of the traffic, conducts appropriate
processing, such as service discovery, routing, or rate limiting, and
sends the packets to appropriate service instances. After receiving
packets from a service instance, sidecar proxy A encapsulates the
packets and sends the packets to another sidecar proxy, e.g., sidecar
proxy B, using layer-7 protocols such as gRPC, or REST API.
+-----------------+ +-----------------+
| cloud GW A | | cloud GW B |
+-----------------+ +-----------------+
| intercept |
v incoming traffic v
+-----------------+ +-----------------+
| sidecar proxy A|--->| sidecar proxy B |
+-----------------+ +-----------------+
| ^ | ^
v | v |
+-----------------+ +-----------------+
| microservice A| | microservice B |
+-----------------+ +-----------------+
Figure 1
The above procedure shows that each hop of inter-service
communication incurs additional delay, including the processing time
of two sidecar proxies. If a composite cloud-native service
requires meshed or multi-hop inter-service communication in
multi-cloud, managing the composite service becomes highly complex
and its end-to-end delay can easily become intolerable.
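As a rough illustration of how the per-hop cost accumulates, the
following sketch adds up the delay of a chain of inter-service hops.
The function name and all numeric values are hypothetical and are not
measurements from this document; the model simply assumes that every
hop crosses two sidecar proxies, as in Figure 1.

```python
def mesh_delay_ms(hops: int, proxy_ms: float, link_ms: float) -> float:
    """Delay of a chain of inter-service hops.

    Assumes, as in Figure 1, that every hop traverses the sending and
    the receiving sidecar proxy. All inputs are illustrative values.
    """
    per_hop = 2 * proxy_ms + link_ms
    return hops * per_hop


# Hypothetical numbers: 1.5 ms per proxy, 2 ms of network delay per hop.
single_hop = mesh_delay_ms(1, proxy_ms=1.5, link_ms=2.0)
five_hops = mesh_delay_ms(5, proxy_ms=1.5, link_ms=2.0)
```

Under these assumed values, a five-hop chain spends 15 ms in sidecar
proxies alone, which is already more than the 10 ms delay target cited
in the Introduction.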
3.2. Multi-Domain and Multi-Cloud interconnection
In industry, there is a growing interest in connecting factories that
are located in different areas to achieve smart manufacturing and
fast logistics. Since the distance between factories may range from
several kilometers to thousands of kilometers, the communications
among factories generally involve multiple network domains and CSPs
that adopt heterogeneous technologies. Moreover, the requirements of
inter-factory communications are diverse. For discrete automation
applications, the end-to-end delay is required to be less than 10ms
and the data rate is less than 10Mbps, while for process automation
systems, the end-to-end delay is about 60ms and the data rate can be
as high as 100Mbps.
To accommodate such diverse applications, tunnels are generally
established to connect heterogeneous domains in existing approaches.
As shown in Fig. 2, a tunnel is established between network domains A
and B to deliver packets for factories A and B. If the two network
domains use different protocols, two gateways also need to be
established at the two endpoints of the tunnel, respectively. When
factory A sends a remote control message to factory B, the packets
are encapsulated at GW A and then sent to the ingress of the tunnel.
The packets are delivered through the tunnel and when they reach the
egress, they are sent to GW B and further decapsulated.
+------------------------+              +------------------------+
|    network domain A    |    Tunnel    |    network domain B    |
|                        |--------------|                        |
| +---------+   +------+ |              | +------+   +---------+ |
| |Factory A|-->| GW A |-+------------->+-| GW B |-->|Factory B| |
| +---------+   +------+ |              | +------+   +---------+ |
|                        |--------------|                        |
+------------------------+              +------------------------+
Figure 2
The above procedure shows that each hop of cross-domain
communication incurs additional delay, including the packet
encapsulation and decapsulation time. If a remote control
application requires multi-hop cross-domain communication, for
example because it involves sequential execution across multiple
factories or because the device that triggers it moves from one
network domain to another, managing the application becomes highly
complex and its end-to-end delay can easily exceed the maximum
tolerable latency.
4. Problems and gap analysis of the existing service identification
mechanism
This section illustrates the problems and gap analysis of the
conventional 5-tuple service identification mechanism in terms of the
emerging use cases.
4.1. State burden of service identification
No explicit service identification scheme has been designed for the
L3 routing network. Its identifiers, such as IP addresses and
labels, are designed specifically for devices, traffic flows, and
network sections, while L4 ports are always associated with specific
protocols (TCP/UDP) rather than with service identification.
Therefore, to identify a service in the routing network, mapping
state has to be maintained over a combination of tuples selected
from these L3/L4 identifiers, and such a combination actually
identifies a traffic flow rather than a service. As cloud resources
and their associated services and applications migrate from
centralized sites to edge sites, the mapping state needed for
tuple-based service identification increases dramatically and places
an overwhelming state burden on the routing network, limiting its
scalability.
4.2. Granularity and traffic engineering of service identification
4.2.1. Granularity of service identification
Conventional methods for distinguishing traffic flows rely mainly on
the 5-tuple of incoming packets. For instance, ACL (Access Control
List) and PBR (Policy-Based Routing) apply 5-tuple matching
strategies. However, a 5-tuple, which consists of source IP
address, destination IP address, source port, destination port, and
protocol, is not enough to carry explicit information about
application-layer services. Its elements belong to the network and
transport layers and allow only an estimate or inference of the
application-layer service. Thus, the current 5-tuple scheme is not
sufficient to provide fine-grained service provisioning.
In particular, it is critically important to identify a key sub-flow
that is more sensitive to the networking SLA guarantee than other
sub-flows sharing the same 5-tuple. Because the 5-tuple is
identical, network nodes are unable to single out that sub-flow.
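The sub-flow problem can be sketched as follows. This is a minimal
illustration under stated assumptions, not a proposed encoding: the
`service_id` field, the ID values 0x0101 and 0x0102, the addresses,
and the class names are all hypothetical. Two sub-flows share one
5-tuple, so a 5-tuple classifier must treat them alike, while a
classifier keyed on a standalone service ID can separate them.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class FiveTuple:
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str


@dataclass(frozen=True)
class Packet:
    flow: FiveTuple
    service_id: Optional[int] = None  # hypothetical standalone field


def classify_by_five_tuple(packet: Packet, acl: Dict[FiveTuple, str]) -> str:
    """ACL/PBR-style match: the 5-tuple is the only key available."""
    return acl.get(packet.flow, "best-effort")


def classify_by_service_id(packet: Packet, policy: Dict[int, str]) -> str:
    """Match on the service ID carried in the packet."""
    return policy.get(packet.service_id, "best-effort")


# Two sub-flows of one application session share the same 5-tuple.
flow = FiveTuple("192.0.2.10", "198.51.100.20", 40000, 443, "TCP")
control = Packet(flow, service_id=0x0101)  # SLA-sensitive sub-flow
bulk = Packet(flow, service_id=0x0102)     # background transfer

acl = {flow: "best-effort"}
policy = {0x0101: "low-latency", 0x0102: "best-effort"}

same_class = classify_by_five_tuple(control, acl) == classify_by_five_tuple(bulk, acl)
control_class = classify_by_service_id(control, policy)
bulk_class = classify_by_service_id(bulk, policy)
```

Whatever class the ACL assigns to the 5-tuple applies to both
sub-flows, whereas the service ID lookup can give the SLA-sensitive
sub-flow its own treatment.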
4.2.2. Traffic Engineering of service identification
The existing SRv6 technology provides the SRv6 TE policy capability
to implement differentiated network services. In general, the
following traffic engineering methods are used with the SRv6 TE
policy:
* Binding SID-based traffic engineering: Generally used for
network-side tunnel concatenation, cross-domain path
concatenation, and SD-WAN scenarios. It involves security
authorization and accounting management, and is thus rarely
feasible for user traffic steering.
* Color-based traffic engineering: The device looks up an SRv6 TE
policy with a matching color and endpoint address. If a matching
SRv6 TE policy exists, the device steers the service traffic into
the policy and forwards it through the TE policy. As a result,
service- and application-specific SLA requirements have to be
anchored to the existing TE policy capabilities rather than the
other way around.
* DSCP-based traffic engineering: DSCP bits in the service packets
are used to further distinguish services. However, the DSCP
value ranges only from 0 to 63, so the differentiation and
diversity it can express are quite limited.
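The color-based lookup and the DSCP limit described in the list above
can be sketched as follows. This is a simplified model, not the
behavior of any particular router implementation; the policy table,
the color value 100, and the IPv6 addresses (from the documentation
prefix) are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

PolicyKey = Tuple[int, str]  # (color, endpoint address)


def lookup_te_policy(
    policies: Dict[PolicyKey, List[str]], color: int, endpoint: str
) -> Optional[List[str]]:
    """Return the segment list of the policy matching color and endpoint."""
    return policies.get((color, endpoint))


# The DSCP field is 6 bits wide, hence the 0..63 range noted above.
DSCP_BITS = 6
DSCP_CLASSES = 2 ** DSCP_BITS

# Hypothetical policy table: one low-latency policy toward one endpoint.
policies: Dict[PolicyKey, List[str]] = {
    (100, "2001:db8::1"): ["2001:db8:a::1", "2001:db8:b::1", "2001:db8::1"],
}

matched = lookup_te_policy(policies, 100, "2001:db8::1")
unmatched = lookup_te_policy(policies, 200, "2001:db8::1")
```

A service whose requirements do not correspond to a pre-provisioned
color finds no policy in this model, which is the sense in which
service requirements are anchored to existing TE policy capabilities.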
4.3. Service operation and fulfillment
From the perspective of service operation and fulfillment, a simple
interface is imperative, both for the underlying network capabilities
and for the key services and applications that need tailored and
guaranteed networking and cloud resources. However, a combination
of selected tuples indicates either particular paths or traffic
flows and thus cannot be exposed directly to third parties for the
fulfillment and operation of those services and networking
capabilities.
In the case of Segment Routing over IPv6 (SRv6), Binding Segment
Identifiers (BSIDs) have been designed and are, on some occasions,
exposed to third parties as a network service and capability.
Nevertheless, an SRv6 BSID stays strictly within the network
scheduling and orchestration domain, so exposure through the SRv6
BSID is in practice restricted to the network operator itself rather
than serving as a straightforward fulfillment interface for third
parties.
4.4. Convergence of network and cloud
Currently, the original 5-tuple of a packet is terminated when the
packet enters the cloud. Kubernetes, for instance, provides a
NodePort mechanism in which the destination IP address used for
access may be the IP address of any host in the cluster. The
packets are then steered to a specific Pod according to iptables
rules configured in the cluster itself. In addition, SNAT
operations are indispensable when a source node decides to achieve
load balancing by distributing the packets to Pods deployed on other
nodes. Another typical example is the interconnection of different
clusters, for instance with Istio. The remote service is registered
in the local cluster with the IP address of the remote gateway.
Thus, the 5-tuple in the packets sent only identifies the remote
gateway and ends at the edge of the remote cluster. Therefore, the
semantics that the 5-tuple carries in the network domain are not
preserved or inherited in the cloud under the current scheme.
+----------------------+ +--------------+
| Cluster 1 | Network | Cluster 2 |
| | | |
| +-+ +---+----+ +---+----+ |
| ( )---->|GateWay +-->|GateWay | |
| +-+ +---+----+ +---+----+ |
| sleep.sample | |\ +-+ |
| Round Robin between: | | \-->( ) |
| local Pod(Pod IP) | | +-+ |
| Remote Cluster(GW IP)| | Pod |
| | | |
+----------------------+ +--------------+
Figure 3
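The loss of 5-tuple semantics at the cloud edge can be sketched with a
deliberately simplified model of destination and source translation.
It does not reproduce how Kubernetes or iptables actually implement
NodePort; the function names, addresses, and ports are hypothetical.

```python
from typing import List, NamedTuple


class FiveTuple(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


def dnat(pkt: FiveTuple, pod_ip: str, pod_port: int) -> FiveTuple:
    """Rewrite the destination, as a NodePort-style rule would."""
    return pkt._replace(dst_ip=pod_ip, dst_port=pod_port)


def snat(pkt: FiveTuple, node_ip: str, node_port: int) -> FiveTuple:
    """Rewrite the source when the selected Pod lives on another node."""
    return pkt._replace(src_ip=node_ip, src_port=node_port)


def preserved_fields(before: FiveTuple, after: FiveTuple) -> List[str]:
    """Names of the fields that survived translation unchanged."""
    return [f for f in FiveTuple._fields if getattr(before, f) == getattr(after, f)]


# Hypothetical packet as seen in the network domain, then at the Pod.
at_network_edge = FiveTuple("192.0.2.10", 40000, "198.51.100.1", 30080, "TCP")
at_pod = snat(dnat(at_network_edge, "10.244.1.7", 8080), "10.244.0.1", 51000)
survivors = preserved_fields(at_network_edge, at_pod)
```

In this model only the protocol field reaches the Pod unchanged, so
any service meaning that the network attached to the original 5-tuple
cannot be recovered inside the cluster.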
4.5. L4/L7 gateway in the way of end to end service traffic
In terms of traffic routing and forwarding, end-to-end service
traffic has always been terminated at L4/L7 gateways in the cloud
sites, because under the current service and application governance
mechanism the cloud and its applications are operated in a separate
domain. A significant price is paid in end-to-end forwarding
performance: L3-based hardware forwarding could deliver higher
performance than L4/L7-based software forwarding.
In addition to routing termination at the L4/L7 gateway, there are
two different sets of IP addresses for the same service and
application, so Network Address Translation (NAT) is always involved
in forwarding service traffic into and out of the cloud.
In inter-cloud and client-cloud service traffic forwarding scenarios,
the L4/L7 gateway and NAT impose a forwarding performance burden
that can hinder some sensitive services and applications.
5. Requirements of service identification for addressing and networking
This section identifies requirements of service identification for
the routing network, based on the use cases and on the problems and
gap analysis of the existing service identification mechanisms.
REQ1 Service identification SHOULD have standalone semantics,
independent of the 5-tuple.
REQ2 Service identification SHOULD have global and unified semantics
across terminal, network and cloud.
REQ3 Service identification SHOULD be able to index the specified
service profile in terms of its SLA requirements.
REQ4 Service identification might indicate specified networking
capabilities and specified applications as well as application
components such as micro-services.
REQ5 Service identification might cover only the selected services
and applications which have been designated to be networking and
computing sensitive.
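REQ3 can be pictured as a lookup from service ID to an SLA profile.
The sketch below is one possible reading, not a format defined by
this document: the class, the field names, and the service ID values
are hypothetical, while the delay and rate figures are the ones given
for discrete and process automation in Section 3.2.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ServiceProfile:
    name: str
    max_delay_ms: float  # end-to-end delay bound
    rate_mbps: float     # data rate the path must be able to carry


# Hypothetical service IDs; SLA figures taken from Section 3.2.
PROFILES: Dict[int, ServiceProfile] = {
    0x0001: ServiceProfile("discrete-automation", 10.0, 10.0),
    0x0002: ServiceProfile("process-automation", 60.0, 100.0),
}


def path_satisfies(service_id: int, path_delay_ms: float, path_rate_mbps: float) -> bool:
    """Check a candidate path against the profile indexed by service ID."""
    profile = PROFILES[service_id]
    return (
        path_delay_ms <= profile.max_delay_ms
        and path_rate_mbps >= profile.rate_mbps
    )
```

For example, a path with 20 ms delay and 100 Mbps capacity would be
acceptable for the process automation profile but not for the
discrete automation profile, and the node needs only the service ID
to tell the two apart.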
6. Framework consideration of service identification for addressing and
networking
6.1. Service ID over existing networking IDs and labels
It is important to make the routing network aware of the service
identification in a straightforward way, such that a networking node
does not have to be heavily stateful for service-specific routing
and forwarding and, more importantly, the existing decoupling between
application and network is preserved.
A standalone service identification entity is employed in the
network control and data planes, with the following features:
* Location and device independent.
* Semantics covering only the service type and its networking and
computing SLA requirements.
* Globally unique within a controlled network and possibly across
multiple domains and across terminal and cloud.
6.2. Service ID Management and maintenance
Edge computing services are expanding from a single edge site to
networking and collaboration among multiple edge sites, in order to
address problems such as high cost, poor service experience, and low
resource utilization. Large-scale edge sites require
interconnection and coordination, and dynamic services require
optimal service access and load balancing. Based on the computing
capability, the network conditions, and the actual processing delay,
services can be dynamically scheduled to appropriate service
instances to improve resource utilization and user experience.
Service identification based addressing and networking is employed
to facilitate this interconnection and coordination.
A service ID designates a common type of fundamental service and has
global semantics across terminal, network, and cloud. In addition,
a local service ID may be assigned by the operation and management
system in the service domain. The service ID provides an effective
link between networks and services. Based on the attributes
associated with a service ID, the network can perceive the resources
provided by the service, the quality of the service, and the service
requirements. From the perspective of the service platforms, an
overall view of the computing and network resources associated with
the service ID can be established.
6.3. Lifecycle and governance of service ID
Registration: The service ID is assigned by the SCMS when a service
provider registers a cloud service or a network operator registers a
networking connection service.
Publish: The service ID can be published after the service has been
identified, authenticated, and authorized. Network operators can
configure specific network policies for a service according to the
requirements associated with its service ID, and service providers
can orchestrate specific service instances for a service according
to the resource status associated with its service ID.
Subscription: The terminal application system subscribes to the
service ID from the SCMS, integrates the service ID into the client
of the application system, and encapsulates the service ID in the
protocol header of the data flow.
Update: As the service is used, the attributes associated with the
service ID are updated, and the network policy is updated in real
time based on the network status.
Revocation: The service ID is not revoked merely because a particular
service terminates. It is terminated and revoked only by the SCMS,
in accordance with the operating agreement and business contract of
the service.
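The lifecycle above can be read as a small state machine. This
document does not define explicit states, so the three states, the
event names, and the transition table below are assumptions made for
illustration only, including the choice to allow revocation of an ID
that was registered but never published.

```python
from enum import Enum
from typing import Dict, Iterable, Tuple


class IdState(Enum):
    REGISTERED = "registered"
    PUBLISHED = "published"
    REVOKED = "revoked"


# (current state, event) -> next state. Hypothetical reading of the text:
# subscription and update act on a published ID without changing its state.
TRANSITIONS: Dict[Tuple[IdState, str], IdState] = {
    (IdState.REGISTERED, "publish"): IdState.PUBLISHED,
    (IdState.PUBLISHED, "subscribe"): IdState.PUBLISHED,
    (IdState.PUBLISHED, "update"): IdState.PUBLISHED,
    (IdState.REGISTERED, "revoke"): IdState.REVOKED,
    (IdState.PUBLISHED, "revoke"): IdState.REVOKED,
}


def apply_event(state: IdState, event: str) -> IdState:
    """Return the next state, or raise ValueError for a disallowed event."""
    key = (state, event)
    if key not in TRANSITIONS:
        raise ValueError(f"event {event!r} not allowed in state {state.value}")
    return TRANSITIONS[key]


def run(events: Iterable[str]) -> IdState:
    """Replay events starting from the state right after registration."""
    state = IdState.REGISTERED
    for event in events:
        state = apply_event(state, event)
    return state
```

In this reading, `run(["publish", "subscribe", "update"])` leaves the
ID published, while subscribing to an ID that has not been published,
or using a revoked ID, is rejected.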
6.4. Key Processes of service
* Service initiation: A SIAN service is initiated by the service data
traffic itself, so no separate signaling exchange is needed to
initiate a service. The terminal application carries the
subscribed service identifier in the service packet and sends the
service data traffic to the SIAN network.
* Service awareness: The network uses the service ID to sense the
resource metrics of the corresponding service instances and
distributes these metrics in the control plane, so that a service
routing table indexed by service identifier can be computed from
both network and service metrics. In addition, the network senses
the network SLA requirements at service-type granularity and
enforces the service SLA policy for the service flow.
* Service routing: L3 forwarding entries are indexed by the
lightweight service identifier in the forwarding plane, and the
SIAN architecture logically introduces a service routing
sub-layer, in which a routing protocol uses the service identifier
as the routing identifier. Logically, the service routing
sub-layer implements only service routing, that is, service
identifiers are used as the index for computing, scheduling, and
routing. Specifically, the service routing sub-layer jointly
selects service instances and network paths for service data
traffic, and implements efficient service-centric computing,
network scheduling, and routing.
* Service delivery: After a service flow is forwarded to the service
server through the SIAN network, the service server completes
service routing and scheduling in the cloud based on the service
ID.
* Service OAM: SIAN provides complete OAM to measure and monitor the
health of network links and service instances. OAM measurements
are reported to the control plane to update the network metrics
and the resource metrics of service instances in real time, and to
adjust the service SLA status and service routing tables of the
service flow in a timely manner. Network-level OAM is also
supported to detect service quality and to trigger service route
re-convergence and self-healing.
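The joint selection of service instance and network path described
under service routing can be sketched as follows. The selection rule
(minimize the sum of path delay and processing delay, subject to a
delay bound) is an assumption chosen for illustration; this document
does not prescribe a particular algorithm. The class, function, and
metric names are hypothetical, and the instance labels merely echo
those used in Figure 4.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Candidate:
    instance_id: str
    path_delay_ms: float        # network metric, e.g. gathered by the SNMC
    processing_delay_ms: float  # service metric, e.g. gathered by the SSMC

    @property
    def total_delay_ms(self) -> float:
        return self.path_delay_ms + self.processing_delay_ms


def select_instance(
    routing_table: Dict[int, List[Candidate]],
    service_id: int,
    max_delay_ms: float,
) -> Optional[Candidate]:
    """Pick the feasible candidate with the lowest combined delay."""
    feasible = [
        c for c in routing_table.get(service_id, [])
        if c.total_delay_ms <= max_delay_ms
    ]
    return min(feasible, key=lambda c: c.total_delay_ms, default=None)


# Hypothetical service routing table indexed by service ID.
routing_table: Dict[int, List[Candidate]] = {
    0x0001: [
        Candidate("SI-ID-1", path_delay_ms=4.0, processing_delay_ms=5.0),
        Candidate("SI-ID-2", path_delay_ms=6.0, processing_delay_ms=1.0),
    ],
}

best = select_instance(routing_table, 0x0001, max_delay_ms=10.0)
```

The instance with the longer network path is preferred here because
its lower processing delay gives the smaller total, which is the kind
of trade-off that requires network and service metrics to be
considered together.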
6.5. Service ID based Routing and Forwarding reference framework and
work flow
This section proposes a reference framework and workflow to
demonstrate the end-to-end service ID based routing and forwarding
process, as illustrated in Figure 4.
| Service S-ID 1,instance SI-ID-1 1[metrics] |
|<------------------------------------------>|
| Service S-ID 2,instance SI-ID-2 1[metrics] |
+-+------+ +--------+ +------+-+ +---------+
| SIAN | | | | SIAN | | Edge |
| Ingress| | | | Egress | | Site |
| +----+ | | | | | | |
| |SPIS| | | | | | ++-------++
+------+ | +----+ | Network |underlay| Network | +----+ |Service||S-ID 1 ||
|client+-+ +----+ |<---------+ domain |<---------+ |SSMC| |<------+|SI-ID-1||
+------+ | |SNMC| | metrics | | metrics | +----+ |metric |+-------+|
| +----+ | | | | +----+ | |+-------+|
| +----+ | | | | |SUTF| | ||S-ID 2 ||
| |SUTF| | | | | +----+ | ||SI-ID-2||
| +----+ | | | | | |+-------+|
+-+----+-+ +--------+ +--------+ +---------+
Figure 4
6.5.1. Components and working mechanism
* Service client: A host requests the service identification
information of a specific application from a management and control
system, and generates data packets that carry this information. The
SIAN ingress gateway node uses the information carried in the data
packet to determine the address of the service instance and the path
between the SIAN ingress and egress gateway nodes, so as to forward
the data packet to the service instance. That is, after the service
instance is selected, the data packet is directed to the
corresponding SR path that meets the application requirements. In
the SIAN architecture, clients must be aware of service identifiers,
and there are various schemes for carrying them.
* SIAN Ingress: Receives the service compute network SLA parameters
delivered by the service control and management system, and, on the
control plane, generates the service routing table indexed by service
ID in accordance with the compute network resource status. Receives
and parses the service identifier carried in the
user service packet, searches for a service routing entry
according to the service identifier, and forwards the service
packet.
* SIAN Egress: Terminates the specific service path at the tail node
and forwards packets to the Service Server. The SIAN Egress connects
to a plurality of computing resources and senses their status
information.
* Edge site and service instances: An Edge site is usually deployed
near the user and hosts services (such as AR/VR) that are extremely
sensitive to delay and bandwidth, so that users have a better
experience when accessing them. A service instance is an instance
resource that provides the service and can accept, process, and
respond to service requests. Generally, the same Edge site may deploy
service instances (SI-ID-1 1 and SI-ID-1 2 in Figure 2) that provide
the same service type, or service instances (SI-ID-1 3 and SI-ID-2 1
in Figure 2) that provide different service types.
* SSMC (SIAN Service Metric Collector): Deployed on the SIAN egress
to collect service metric information, including resource usage, slow
request ratio, and average service completion time. This information
changes frequently. To avoid placing too much pressure on the network
through frequent updates, it is recommended that updates be
suppressed until a metric changes by more than a threshold, or be
sent only at long intervals (on the order of minutes).
* SNMC (SIAN Network Metric Collector): Deployed on the SIAN ingress
to collect the network metric information advertised by the transport
network devices and SIAN gateways. The information includes link
bandwidth, physical link delay, and link occupation. It is usually
flooded within the domain through an IGP, and a TE-DB is formed on
each network node.
* SPIS (SIAN Path and Instance Selector): Deployed on the SIAN
ingress or on a centralized server. In some cases, for example
across domains, the SPIS must be deployed on the server. Based on
the metric information recorded by the SSMC and SNMC, the SPIS
performs the control plane calculation using the service identifier
mapping algorithm and delivers the result to the forwarding plane
SUTF.
* SUTF (SIAN User Traffic Forwarder): Usually deployed on the SIAN
ingress and egress gateways to identify client service request
traffic and to select a path and a service instance in accordance
with the service forwarding table. The underlay network does not
distinguish service traffic, but forwards packets in accordance with
the path carried in the packets, for example, in an SRH.
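The interaction of the components above can be sketched as follows.
This is only an illustration under stated assumptions, not the
service identifier mapping algorithm of this document: the metric
fields (load, delay_ms), the SLA bound max_delay_ms, and the cost
function are hypothetical and chosen to keep the example small. The
SPIS role is played by select(), which combines SSMC and SNMC records
and produces the entry that would be delivered to the SUTF.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceMetric:
    """Service metric as an SSMC might report it (fields are assumed)."""
    instance_id: str
    egress: str
    load: float  # resource usage in the range 0.0 .. 1.0


@dataclass(frozen=True)
class PathMetric:
    """Network metric as an SNMC might record it (fields are assumed)."""
    path_id: str
    egress: str
    delay_ms: float


def select(instances, paths, max_delay_ms):
    """Return (path_id, instance_id) with the lowest joint cost, or None."""
    best = None
    for path in paths:
        if path.delay_ms > max_delay_ms:
            continue  # path violates the (assumed) delay bound of the SLA
        for inst in instances:
            if inst.egress != path.egress:
                continue  # instance is not reachable over this path
            cost = path.delay_ms * (1.0 + inst.load)  # assumed cost function
            if best is None or cost < best[0]:
                best = (cost, path.path_id, inst.instance_id)
    return None if best is None else (best[1], best[2])


instances = [InstanceMetric("SI-ID-1", "egress-A", 0.9),
             InstanceMetric("SI-ID-2", "egress-B", 0.2)]
paths = [PathMetric("path-A", "egress-A", 10.0),
         PathMetric("path-B", "egress-B", 12.0)]

# Entry that the SPIS would deliver to the SUTF, indexed by service ID.
forwarding_table = {"S-ID-1": select(instances, paths, max_delay_ms=20.0)}
```

In this example the slightly longer path wins because the instance
behind it is lightly loaded, which is the kind of joint decision the
SPIS is meant to make.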
6.5.2. Control Plane Consideration
Service ID based metric notification and the forwarding policy can
be achieved by extending the existing routing protocols and
mechanisms as follows:
* Service metric distribution: The SSMC of the SIAN egress perceives
service metric changes and spreads the information in the network
domain by using IGP/BGP. In the overlay model, to prevent the
spreading of service metrics from affecting the underlay network, it
is recommended that the service metrics bypass the underlay. To
reduce the resource consumption of the network control plane and
forwarding plane, it is recommended that the SIAN egress aggregate
service instances and their information before spreading.
* Distributed service route calculation and delivery: The SIAN
ingress and egress are deployed by using the overlay model. However,
as network devices, the SIAN ingress and egress can still
interconnect with the underlay network through IGPs to obtain network
metric information. The SPIS obtains the metric information recorded
by the SSMC and SNMC, calculates service routes by using a
constraint-based algorithm, and delivers the result to the SUTF for
service access. This overlay model can still achieve joint service
and network calculation, and thus joint traffic engineering of
streamline computing and networks.
* Deployment of centralized service route computation: The SPIS is
deployed on the compute network controller. The metric
information collected by the SNMC and the compute network
controller is reported through BGP-LS. The metric information
collected by the SSMC can also be reported through the extended
BGP-LS protocol. In a cross-domain scenario, this is the only
option for implementing service routing.
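The recommendations on service metric distribution, together with the
SSMC guidance in Section 6.5.1 to limit updates by threshold or long
period, can be sketched as follows. This is a minimal illustration
under assumptions: the threshold of 0.1, the 60 second period, and
the use of a mean as the per-site aggregate are hypothetical values,
not values defined by this document.

```python
class MetricAdvertiser:
    """Decides when an SSMC advertises an aggregated service metric."""

    def __init__(self, threshold=0.1, period_s=60.0):
        self.threshold = threshold  # assumed: smallest change worth sending
        self.period_s = period_s    # assumed: longest silence between updates
        self.last_value = None
        self.last_time = None

    @staticmethod
    def aggregate(instance_loads):
        """Collapse per-instance loads into one per-site value (assumed: mean)."""
        return sum(instance_loads) / len(instance_loads)

    def should_advertise(self, value, now):
        """Return True if the value must be spread into the network domain."""
        if self.last_value is None:
            due = True  # nothing has been advertised yet
        else:
            changed = abs(value - self.last_value) >= self.threshold
            expired = now - self.last_time >= self.period_s
            due = changed or expired
        if due:
            self.last_value, self.last_time = value, now
        return due


adv = MetricAdvertiser()
# (time in seconds, loads of the instances behind one SIAN egress)
samples = [(0.0, [0.50, 0.50]), (10.0, [0.52, 0.54]),
           (20.0, [0.70, 0.74]), (90.0, [0.71, 0.73])]
decisions = [adv.should_advertise(adv.aggregate(loads), t)
             for t, loads in samples]
```

Only the first sample, the large change at 20 seconds, and the
periodic refresh at 90 seconds would be advertised; the small
fluctuation at 10 seconds is suppressed.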
6.5.3. Data Plane Consideration
The Service ID defined in this document has its own semantics in the
forwarding procedure. The forwarding plane treats the Service ID as
a simple indicator for steering traffic in the overlay service
routing layer. The Service ID can also bring incremental value, for
instance scalability and security assurance throughout the whole
service process.
6.5.3.1. Service ID Encapsulated in The IP Address Field
The Service ID can be encapsulated in the IP address field of an
IPv6 header. A typical encapsulation is shown below. When the
Service ID is encapsulated in the IP address field of an IPv6 header,
its semantics are preserved and maintained from the client to the
network domain.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| | | | |
| Prefix | Node | Service ID |Padding|
| | | | |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 5
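The layout in Figure 5 can be illustrated with a short sketch. The
figure does not specify field widths, so the widths used here (48-bit
Prefix, 16-bit Node, 32-bit Service ID, 32-bit Padding) are
assumptions made only for the example, as are the function names.

```python
import ipaddress

# Assumed field widths; Figure 5 does not define them. They sum to 128.
PREFIX_BITS, NODE_BITS, SID_BITS, PAD_BITS = 48, 16, 32, 32


def encode(prefix, node, service_id):
    """Build an IPv6 address laid out as Prefix | Node | Service ID | Padding."""
    for name, value, bits in (("prefix", prefix, PREFIX_BITS),
                              ("node", node, NODE_BITS),
                              ("service_id", service_id, SID_BITS)):
        if not 0 <= value < (1 << bits):
            raise ValueError(f"{name} does not fit in {bits} bits")
    value = prefix
    value = (value << NODE_BITS) | node
    value = (value << SID_BITS) | service_id
    value <<= PAD_BITS  # padding bits are left as zero
    return ipaddress.IPv6Address(value)


def decode_service_id(addr):
    """Extract the Service ID field from an address built by encode()."""
    return (int(addr) >> PAD_BITS) & ((1 << SID_BITS) - 1)


addr = encode(0x20010DB80000, node=0x0001, service_id=0x2A)
```

A SIAN ingress following this layout could recover the Service ID
with a shift and a mask, without any lookup on the remaining address
bits.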
6.5.3.2. Service ID Standalone Encapsulation
The Service ID can be encapsulated in a standalone position that is
decoupled from IP addresses. Figure 6 shows the Service ID
encapsulated in the Flow Label field.
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version| Traffic Class | Flow Label(Service ID) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Payload Length | Next Header | Hop Limit |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
| Source |
| Address |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| |
| Destination |
| Address |
| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 6
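As a sketch of the encapsulation in Figure 6, the first 32-bit word
of the IPv6 header can be packed and parsed as shown below. The Flow
Label is 20 bits wide, so a Service ID carried this way is limited to
20 bits. The function names are hypothetical.

```python
import struct

FLOW_LABEL_BITS = 20
FLOW_LABEL_MASK = (1 << FLOW_LABEL_BITS) - 1


def pack_first_word(traffic_class, service_id, version=6):
    """Pack Version | Traffic Class | Flow Label (Service ID) into 4 octets."""
    if not 0 <= service_id <= FLOW_LABEL_MASK:
        raise ValueError("service ID does not fit in the 20-bit Flow Label")
    word = (version << 28) | ((traffic_class & 0xFF) << 20) | service_id
    return struct.pack("!I", word)  # network byte order


def service_id_from_header(header):
    """Read the Service ID back from the start of an IPv6 header."""
    (word,) = struct.unpack("!I", header[:4])
    return word & FLOW_LABEL_MASK


first_word = pack_first_word(traffic_class=0, service_id=0x12345)
```

Because the field is short, this option suits deployments with a
compact Service ID space; larger identifiers need one of the other
encapsulations.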
The Service ID can also be encapsulated and carried in IPv6 extension
headers as follows:
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Next Header | Hdr Ext Len | Options(variable) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
| (Service ID) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Figure 7
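The extension header layout in Figure 7 can be sketched as an options
header that carries the Service ID in a single TLV option. This
document does not assign an option type, so the value 0x1E below is
an arbitrary placeholder, and the function names are hypothetical.
The Hdr Ext Len and Pad1/PadN handling follow the generic IPv6
options format.

```python
SERVICE_ID_OPT = 0x1E  # placeholder option type, not assigned by this document


def build_options_header(next_header, service_id, opt_type=SERVICE_ID_OPT):
    """Build an options header whose only option carries the Service ID."""
    option = bytes([opt_type, len(service_id)]) + service_id
    body = bytes([next_header, 0]) + option
    pad = -len(body) % 8  # header length must be a multiple of 8 octets
    if pad == 1:
        body += b"\x00"                                # Pad1
    elif pad > 1:
        body += bytes([1, pad - 2]) + bytes(pad - 2)   # PadN
    # Hdr Ext Len counts 8-octet units, not including the first 8 octets.
    return bytes([next_header, len(body) // 8 - 1]) + body[2:]


def find_service_id(header, opt_type=SERVICE_ID_OPT):
    """Walk the options and return the Service ID octets, or None."""
    end = (header[1] + 1) * 8
    i = 2
    while i < end:
        if header[i] == 0:  # Pad1 has no length octet
            i += 1
            continue
        length = header[i + 1]
        if header[i] == opt_type:
            return header[i + 2:i + 2 + length]
        i += 2 + length
    return None


hdr = build_options_header(next_header=6, service_id=bytes.fromhex("0000002a"))
```

A variable-length option leaves room for Service IDs longer than the
20 bits available in the Flow Label.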
When the Service ID is encapsulated in a standalone position in an
IPv6 header, its unique service semantics are preserved and
maintained from the client to the network, and can also be delivered
into the cloud. As illustrated in Section 4, a standalone Service ID
enables global service provisioning in the whole service process
across network and cloud sites.
6.5.3.3. Service ID-based forwarding
Service forwarding table: In a traditional network, services are
identified through 5-tuples. If the network needs to distinguish
services, a QoS policy re-marks the DSCP, and services are handed
over to the SR-TE ingress gateway of the underlay in IP+DSCP mode.
Traffic-based automatic steering implements fine-grained mapping, and
SR-TE ensures the scalability of the solution. Although this
solution has clear management boundaries, its network device resource
consumption and configuration complexity cannot be ignored. Because
the service ID is directly mapped to the SLA for service access, a
routing table indexed by the service ID is designed instead. Each
entry of the routing table is directly associated with an SR-TE
policy that meets the SLA. The user packet carries the service ID,
and the SIAN ingress queries the table to provide the service. In
this way, the above-mentioned limitations are overcome. Depending on
whether service information is aggregated, two models are supported.
Model 1 (non-aggregated): The service forwarding table carries the
policy path and the selected service instance. Model 2 (aggregated):
The service forwarding table carries the policy path or service site
identification.
Service ID encapsulation: Because of its powerful programmability,
SR-MPLS/SRv6 is currently selected for the SIAN gateways and the
transport network. Terminals and cloud services currently use IPv4
for access and interconnection, and will gradually transition to IPv6
in the future. Therefore, SR-MPLS/SRv6 needs to support both IPv4
and IPv6 scenarios. There are multiple encapsulation modes.
Service packet forwarding: After receiving a service request packet,
the SIAN ingress obtains the service identifier and searches the
service forwarding table. With service forwarding table model 1, the
SIAN ingress modifies the destination of the service request packet
to that of the service instance carried in the forwarding table,
encapsulates the tunnel header in accordance with the policy
information, and forwards the packet. After decapsulating the tunnel
header, the SIAN egress forwards the packet in accordance with the
standard IP route. With service forwarding table model 2, the SIAN
ingress does not modify the packet; it encapsulates the tunnel header
in accordance with the policy information and forwards the packet.
After decapsulating the tunnel header, the SIAN egress searches its
local service forwarding table in accordance with the service
identifier, modifies the destination IP of the service packet to the
IP in the service forwarding table, and forwards the packet based on
the standard IP route. Regardless of the forwarding model, the
underlay node does not perceive the information inside the tunnel and
simply forwards the packet.
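The difference between the two models can be sketched as follows.
This is an illustration under assumptions rather than a specified
procedure: the table contents, the Packet fields, and the reading
that "modifying the destination" means rewriting the destination
address to that of the selected instance are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Packet:
    service_id: str
    dst: str
    tunnel: Optional[str] = None  # SR policy applied as tunnel header, if any


# Ingress tables (contents are illustrative assumptions).
MODEL1 = {"S-ID-1": ("policy-A", "2001:db8::10")}  # policy path + instance
MODEL2 = {"S-ID-1": "policy-A"}                    # policy path (site) only
# Egress-local table, consulted only in model 2.
EGRESS_LOCAL = {"S-ID-1": "2001:db8::10"}


def ingress(pkt, model):
    """SIAN ingress: look up the service ID and encapsulate."""
    if model == 1:
        policy, instance = MODEL1[pkt.service_id]
        pkt.dst = instance  # instance is already selected at the ingress
    else:
        policy = MODEL2[pkt.service_id]  # destination is left untouched
    pkt.tunnel = policy
    return pkt


def egress(pkt, model):
    """SIAN egress: decapsulate and, in model 2, select the instance."""
    pkt.tunnel = None
    if model == 2:
        pkt.dst = EGRESS_LOCAL[pkt.service_id]
    return pkt


results = {m: egress(ingress(Packet("S-ID-1", "2001:db8::ffff"), m), m)
           for m in (1, 2)}
```

Both models deliver the packet to the same instance; they differ only
in where the instance is chosen and therefore in how much per-instance
state the ingress must hold.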
Service flow affinity in the service forwarding table: Flow affinity
means that packets from the same flow are always sent to the same
egress and processed by the same service instance.
For service forwarding table model 1, when a new flow arrives at the
ingress and the best service instance and egress have been
determined, the ingress records the flow identifier (5-tuple), the
preferred egress, and the affinity timeout in the flow binding table.
The destination egress is already the egress of the real service
instance, so the egress does not need to search a flow affinity
table.
For service forwarding table model 2, when a new flow arrives at the
ingress, only the best egress is determined. The ingress records the
flow identifier (for example, a 5-tuple), the preferred egress, and
the affinity timeout in the flow binding table. Because the
destination service instance is not yet determined, the egress still
needs to search its flow affinity table to obtain the instance,
update the flow affinity table, and then perform lookup and
forwarding using the egress forwarding table.
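A flow binding table with an affinity timeout can be sketched as
follows. The class name, the 30 second timeout, and the choice to
refresh the timeout on every packet are assumptions made for the
example. The same structure could serve as the ingress flow binding
table (binding a flow to an egress) or as the egress flow affinity
table in model 2 (binding a flow to an instance).

```python
class FlowBindingTable:
    """Keeps packets of one flow on the same target until the binding expires."""

    def __init__(self, timeout_s=30.0):  # assumed affinity timeout
        self.timeout_s = timeout_s
        self._bindings = {}  # 5-tuple -> (target, expiry time)

    def lookup_or_bind(self, five_tuple, choose, now):
        """Return the bound target, or bind the result of choose() if none."""
        entry = self._bindings.get(five_tuple)
        if entry is not None and now < entry[1]:
            target = entry[0]   # existing binding is still valid
        else:
            target = choose()   # new or expired flow: select afresh
        # Assumed policy: every packet refreshes the affinity timeout.
        self._bindings[five_tuple] = (target, now + self.timeout_s)
        return target


table = FlowBindingTable(timeout_s=30.0)
flow = ("2001:db8::1", 40000, "2001:db8::2a", 443, "tcp")

first = table.lookup_or_bind(flow, lambda: "egress-A", now=0.0)
# The best egress has changed, but the flow keeps its binding.
second = table.lookup_or_bind(flow, lambda: "egress-B", now=10.0)
# After the timeout the flow is treated as new.
third = table.lookup_or_bind(flow, lambda: "egress-B", now=100.0)
```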
6.6. OAM Consideration
The main function of OAM is to detect network defects before they
develop into abnormal events. It isolates correctable errors or time
errors within a certain range so that they do not interfere with
network operation, thus ensuring that the operator fulfills its QoS
commitment and achieves the pre-signed SLA.
OAM generally includes a fault management (FM) function and a
performance management (PM) function. FM features such as
Continuity Check (CC), Connectivity Verification (CV), and Remote
Defect Indication (RDI) automatically detect and locate defects in
the network. PM features such as Loss Measurement (LM), Delay
Measurement (DM), and throughput measurement can diagnose service
degradation. The OAM function is also key to network survivability
and to triggering network protection.
| | | |
| Access network | Transport network | Data center network|
|<-------------->|<------------------->|<------------------->|
| | | |
+--+---+ +---+---+ +--------+ +--+---+ +----+---+
|client+--------+ SIAN +--+underlay+--+ SIAN +------------+Services|
+--+---+ |Ingress| | node | |Egress| +----+---+
| +---+---+ +---+----+ +--+---+ |
| | Link OAM | Link OAM | Service OAM |
| |<-------->|<-------->|<------------------->|
| | | | |
| | Network E2E OAM | |
| |<------------------->| |
| | | |
| | Network to Service E2E OAM |
| |<----------------------------------------->|
| | |
| Client to Service E2E OAM |
|<---------------------------------------------------------->|
| |
Figure 8
In addition to conventional network domain OAM technologies, the SIAN
OAM also introduces computing power-related OAM. As shown in the
SIAN OAM architecture in Figure 8, the SIAN OAM includes the
following layers:
* The base-layer OAM: includes the network Link OAM (such as BFD,
EFM, and MPLS-LM-DM) and the Service OAM (such as ping and
keepalive) from the SIAN egress gateway to the service instance. The
OAM detection results are used as input for joint service and
network traffic engineering calculation, and fault detection is also
used to trigger fast convergence.
* The network-layer OAM: includes Network E2E OAM (such as BFD, INT,
TWAMP, SR-PM, and MPLS-LM-DM) and Network To Service E2E OAM (such
as ping, INT, and RTT measurement). It implements network fault and
quality deterioration perception, and the two are used to monitor
the network-segment SLA and the network-to-service SLA,
respectively. It is used to trigger the recalculation of network
paths, service paths, and instances to achieve the service SLA. At
the same time, it is self-proofed.
* The application-layer OAM: includes the Client To Service E2E OAM
(such as ping, INT, and HTTP ping), which is used to implement
application-level end-to-end detection and to evaluate the
achievement of the application-level SLA. In most cases,
instrumentation points embedded in the application software can be
used to implement end-to-end QoS detection for applications.
6.7. End to end service flow upon service ID
As illustrated above, the service ID can be carried in different
fields of the data packet. The end-to-end service flow differs
considerably depending on whether the service ID is placed in a fixed
field, such as the destination address field, or in an extension
header as a standalone encapsulation.
6.7.1. Service ID in destination address
The client in the terminal obtains the service ID through either a
DNS query or another subscription process, and encapsulates the
service ID in the destination address field. When the service
request arrives at the ingress gateway, which is aware of service
IDs, the ingress retrieves the service ID and treats the request, as
well as the subsequent flow, according to the service ID specific
policy maintained at the ingress. In particular, the policy here is
service ID-based addressing, which involves both the service ID and
the corresponding service requirements that the network can satisfy.
Service ID awareness can be limited to the ingress and egress, while
the traditional underlay network nodes transmit the service flow
according to the scheduled networking policy without being aware of
the service ID. When the service request arrives at the egress
gateway, the egress can continue forwarding the request according to
the constraints associated with the service ID beyond the network;
otherwise, the policy is terminated. The key point of carrying the
service ID in the destination address is that the traditional service
discovery process, such as DNS, can stay as it is, and therefore the
client in the terminal is not impacted.
6.7.2. Service ID in flow label and extension headers
A service ID that the client of the terminal encapsulates in an
extension header in a standalone way can remain intact through the
entire network as well as the cloud site. It can thus be processed
at service ID-aware nodes, which retrieve it and steer the service
traffic according to the service ID specific policy. The service
workflow is the same as that of the service ID in the destination
address, except for the following steps:
* Adaptation has to occur at the client of the terminal, because an
additional extension header carrying the service ID has to be added
to the original data packet header.
* At the egress, when the service traffic continues to be forwarded
to other service ID-unaware network domains or cloud sites, the
service ID can remain in the user packet header even when network
address translation is executed.
7. Acknowledgements
To be added upon contributions, comments and suggestions.
8. IANA Considerations
This memo includes no request to IANA.
9. Security Considerations
A standalone service ID in the routing network adds a new threat
exposure in terms of networking security. However, the service ID of
this proposal should be governed and managed by the network and cloud
platform, so the service ID should be strictly handled within a
closed system. The security-related behaviors of networking nodes
will be proposed in other documents.
10. Informative References
[I-D.ldbc-cats-framework]
Li, C.L., "A Framework for Computing-Aware Traffic
Steering (CATS)", August 2023.
[I-D.li-apn-framework]
Li, ZB.L., "Application-aware Networking (APN) Framework",
October 2023.
[I-D.trossen-rtgwg-rosa-arch]
Trossen, D.T., "Architecture for Routing on Service
Addresses", July 2023.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119,
DOI 10.17487/RFC2119, March 1997.
Authors' Addresses
Daniel Huang
ZTE Corporation
Nanjing
China
Phone: +86 13770311052
Email: huang.guangping@zte.com.cn
Ge Chen
China Telecom
Guangzhou
China
Email: chengg55@chinatelecom.cn
Jie Liang
China Telecom
Guangzhou
China
Email: liangjie6@chinatelecom.cn
Yan Zhang
China Unicom
Beijing
China
Email: zhangy1156@chinaunicom.cn
Dong Yang
Beijing Jiaotong University
Beijing
Email: dyang@bjtu.edu.cn
Dongyu Yuan
ZTE Corporation
Nanjing
China
Email: yuan.dongyu@zte.com.cn
Fu Huakai
ZTE Corporation
Wuhan
China
Email: fu.huaka@zte.com.cn
Cheng Huang
ZTE Corporation
Shanghai
Phone: +86 13167198926
Email: huang.cheng13@zte.com.cn
Yong Guo
ZTE Corporation
Shanghai
Phone: +86 15618880912
Email: guo.yong3@zte.com.cn