Internet-Draft Operations, Administration and Maintenan March 2024
Fu, et al. Expires 4 September 2024
Intended Status:
Standards Track
H. Fu
ZTE Corporation
B. Liu
China Mobile
Z. Li
China Mobile
D.H. Huang
ZTE Corporation
C. Huang
ZTE Corporation
L. Ma
ZTE Corporation
W. Duan
ZTE Corporation

Operations, Administration and Maintenance (OAM) for Computing-Aware Traffic Steering


This document describes an OAM framework for Computing-Aware Traffic Steering (CATS). The proposed framework enables fault and performance management of end-to-end connections from clients, through the network, to computing instances. The following sections describe the major components of the framework, their functions, and the deployment considerations.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 4 September 2024.

1. Introduction

As described in [I-D.ietf-cats-usecases-requirements], edge computing provides lower response times and higher transmission rates than cloud computing by moving computing instances to the network edge. To meet the requirements of highly distributed users, service providers deploy the same type of service instance at multiple edge sites, which requires steering traffic from clients to the most appropriate computing instance.

Computing-Aware Traffic Steering (CATS) [I-D.ldbc-cats-framework] is a traffic engineering approach [I-D.ietf-teas-rfc3272bis] developed to address the aforementioned traffic steering problem. This approach takes into account the dynamic nature of both the computing resources and the network state to optimize how traffic is forwarded towards a given service instance. Various metrics can be taken into account to devise and enforce such service-specific and computing-aware traffic steering policies.

To achieve better service assurance, it is necessary not only to rapidly detect whether the QoS provided by the computing network meets the SLA requirements of clients, but also to dynamically trigger the calculation and adjustment of both the computing and the networking services. OAM technologies have been developed for carrier networks, but they are deployed only in the network domain to support the operations and maintenance of network operators, and cannot measure an end-to-end connection from a client to a computing instance.

To this end, this document proposes an OAM architecture based on the CATS framework that extends the coverage of existing OAM technologies from the network alone to the end-to-end connection from a client, through the network, to the computing instances. In addition to the architecture, the major components and the associated deployment considerations are described.

2. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

3. Terminology

This document makes use of the terms defined in [I-D.ldbc-cats-framework].

4. Requirements and Motivation

The main objectives of OAM are to detect anomalies before they intensify, reduce the number of traffic flows impacted by these anomalies, and ensure that network operators fulfill their QoS commitments under the Service Level Agreements (SLAs) of clients.

As a traffic engineering method, computing-aware traffic steering (CATS) takes into account the dynamic nature of both the computing resources and the network states to optimize the way that traffic is forwarded toward a given service instance. However, existing OAM technologies developed for the carrier network cannot be used to collect metrics associated with the computing resources. Therefore, it is necessary to extend the existing OAM technologies to build an end-to-end OAM for CATS. Key objectives include:

5. Framework and Components

The CATS OAM architecture is shown in Figure 1. In this architecture, both the CATS-Forwarders and the underlay node are deployed with the existing OAM technologies developed for carrier networks. These OAM technologies are used to detect anomalies and monitor service performance in the network domain, and can be divided into three categories: link OAM, tunnel OAM, and service OAM.

               /--->| Carrier OAM Domain |<--\
              /     +--------------------+    \
             /                                 \
            |           Service OAM             |
            |                                   |
            |           Tunnel OAM              |
            |                                   |
            |    Link OAM     |     Link OAM    |
            |                 |                 |
+------+ +--+--------+    +---+----+   +--------+--+ +--------+
|client+-+  CATS-    +----+underlay+---+  CATS-    +-+service |
|      | |Forwarder 1|    |  node  |   |Forwarder 2| |instance|
+------+ +-----------+    +--------+   +-----------+ +----+---+
    ^       ^                                   ^         |
    |       |                                   |         |
    |       |                               +---+----+    |
    |       |                               | SI_OAM |<-->|
    |    +--+-----+                         +--------+    |
    |    | TC_OAM |<------------------------------------->|
    |    +--+-----+                                       |
    |       |                                             |
    |    +--+-----+                                       |
    +----+ AF_OAM |<------------------------------------->|
         +--+-----+                                      /
             \                                          /
              \         +-----------------+            /
               \------->| CATS OAM Domain |<----------/

Figure 1: CATS OAM Functional Components

5.1. Component

To achieve the four objectives mentioned in Section 4, we designed a CATS OAM architecture based on the CATS architecture and the existing OAM technologies developed for carrier networks. This CATS OAM architecture can flexibly support existing OAM detection tools, e.g., the ones mentioned in the previous section, and consists of the following three components:

5.2. Deployment Consideration

To demonstrate the complete CATS OAM procedure, a proper OAM detection tool needs to be selected and deployed on the network and service instance hosts of the CATS OAM architecture. The selection of OAM detection tools is out of the scope of this document.

                  +--------------+ Intelligent controller  +-------------+
                  |              +-------------------+-----+             |
                  |                                   |                  |
                  v                                   v                  v
            +-----------+                        +-----------+       +--------+
            |  CATS-    |                        |  CATS-    |       |  Edge  |
            |Forwarder 1|                        |Forwarder 2|       |  Site  |
            |           |                        |           |Service|        |
+--------+  |+---------+|                        |+---------+|Metrics|S-ID 1  |
| client |  ||  C-PS   ||       +--------+       ||  C-SMA  |<-------|SI-ID 1 |
|        |  |+---------+|Network|        |Network|+---------+|       |        |
|+------+|  |  ^    ^   |Metrics|Underlay|Metrics|       ^   |       |S-ID 1  |
||AF-OAM|+--+  |    |   |<------+ domain |<------|       |   |-------|SI-ID 2 |
|+--+---+|  |  |    |   |       +--------+       |   +---+--+| OWAMP |        |
|   |    |  |  |    |   |                        |   |SI-OAM|<------>|S-ID 2  |
+---+----+  |  |+---+--+|           OWAMP        |   +------+|       |SI-ID 1 |
    |       |  ||TC-OAM|+------------------------+-----------+------>|        |
    |       |  |+------+|                        |           |       |S-ID 2  |
    |       | ++-------+|           IOAM         |           |       |SI-ID 2 |
    |       | | AF-OAM |+------------------------+-----------+------>|        |
    |       | +--------+|           IOAM         |           |       |        |
    +-------+-----------+------------------------+-----------+------>|        |
            +-----------+                        +-----------+       +--------+

Figure 2: An Example Of CATS OAM Deployment

As illustrated in Figure 2, the OWAMP and IOAM tools are selected as examples to describe how the CATS OAM components work with these detection tools to fulfill the four objectives:

6. Operation

The OAM architecture proposed in this document enables CATS to provide robust operational capabilities for forwarding and routing. Test packets and data packets should be delivered via the same path, i.e., performance monitoring must be conducted in-band, and the test traffic must not affect the data traffic. As a result, the test traffic shares the treatment applied to the data flow being monitored but does not introduce congestion when the network functions normally.
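
The two constraints above can be illustrated with a minimal sketch. It assumes inputs that this document does not define: an ordered hop list recorded for a probe and for the monitored flow (for example, taken from IOAM trace data), and a probe-rate budget expressed as a fraction of link capacity. The function names and the 1% default budget are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def is_in_band(probe_hops: Sequence[str], data_hops: Sequence[str]) -> bool:
    """Return True when the probe visits exactly the same hops, in the same
    order, as the data flow it is meant to monitor."""
    return list(probe_hops) == list(data_hops)


def probe_rate_within_budget(probe_bps: float,
                             link_capacity_bps: float,
                             max_fraction: float = 0.01) -> bool:
    """Return True when the probe rate stays within the assumed budget.

    max_fraction is a hypothetical operator-chosen cap (1% by default here),
    not a value taken from this document.
    """
    if link_capacity_bps <= 0:
        raise ValueError("link capacity must be positive")
    return probe_bps <= max_fraction * link_capacity_bps


# Hop identifiers below are illustrative labels only.
data_path = ["CATS-Forwarder 1", "underlay node", "CATS-Forwarder 2"]
probe_path = ["CATS-Forwarder 1", "underlay node", "CATS-Forwarder 2"]
probe_ok = (is_in_band(probe_path, data_path)
            and probe_rate_within_budget(1e6, 1e9))
```

A probe that takes a different path, or that exceeds the budget, would be rejected before it is scheduled; how such a check is enforced in practice depends on the OAM tool selected.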

To be added.

7. Management

A set of metrics needs to be exposed to support the operator's decisions. The following performance metrics are useful:

To be added.

7.1. Indicator Collection

The number of metrics and the frequency at which they are collected need to be considered when designing the OAM mechanism. The OAM mechanism may be distributed, centralized, or both. It may be executed periodically or triggered by an event.
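
As a rough illustration of combining periodic and event-triggered collection, the sketch below reports a metric either when a fixed interval has elapsed or when the value has changed by more than a relative threshold since the last report. The class names, the interval, and the threshold are assumptions made for this example and are not defined by the CATS OAM framework.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReportPolicy:
    interval_s: float        # assumed periodic reporting interval, in seconds
    change_threshold: float  # assumed relative change that triggers a report


class MetricReporter:
    """Decides when a single metric should be reported (hypothetical helper)."""

    def __init__(self, policy: ReportPolicy) -> None:
        self.policy = policy
        self._last_time: Optional[float] = None
        self._last_value: Optional[float] = None

    def _relative_change(self, value: float) -> float:
        base = abs(self._last_value)
        if base == 0:
            return float("inf") if value != 0 else 0.0
        return abs(value - self._last_value) / base

    def should_report(self, now: float, value: float) -> Optional[str]:
        """Return 'initial', 'event', or 'periodic' when a report is due,
        otherwise None. State is updated only when a report is due."""
        if self._last_time is None:
            reason = "initial"
        elif self._relative_change(value) >= self.policy.change_threshold:
            reason = "event"
        elif now - self._last_time >= self.policy.interval_s:
            reason = "periodic"
        else:
            return None
        self._last_time, self._last_value = now, value
        return reason


reporter = MetricReporter(ReportPolicy(interval_s=10.0, change_threshold=0.2))
reasons = [
    reporter.should_report(0.0, 50.0),   # first sample
    reporter.should_report(1.0, 52.0),   # small change, interval not elapsed
    reporter.should_report(2.0, 70.0),   # 40% change since last report
    reporter.should_report(12.0, 71.0),  # interval elapsed since last report
]
```

A larger interval or threshold reduces the reporting load at the cost of staler metrics; the appropriate trade-off depends on the service and is left to the operator.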

To be added.

8. Maintenance

Service protection is designed to mitigate simple network failures faster than the response time expected from the CATS control plane. When events affect network operations, e.g., link context changes, network or computing devices crash or restart, or traffic flows start or end, the CATS control plane needs to perform remediation and re-optimization so that the SLAs of all active flows remain satisfied. The control plane should continuously obtain the network status and evaluate whether the current configuration is still suitable.
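
The evaluation step can be sketched as follows. The example assumes that delay is the only metric of interest and that the end-to-end delay is the sum of a network part and a computing part; both are simplifications, since the metrics and their combination are not specified in this document. All names (PathMetrics, violating_flows, pick_instance) are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PathMetrics:
    network_delay_ms: float  # measured in the network domain (assumed metric)
    compute_delay_ms: float  # measured towards the service instance (assumed)


def end_to_end_delay(m: PathMetrics) -> float:
    """Assumed composition: network delay plus computing delay."""
    return m.network_delay_ms + m.compute_delay_ms


def violating_flows(current: Dict[str, PathMetrics],
                    sla_ms: Dict[str, float]) -> List[str]:
    """Return the flows whose measured end-to-end delay exceeds their SLA."""
    return sorted(f for f, m in current.items()
                  if end_to_end_delay(m) > sla_ms[f])


def pick_instance(candidates: Dict[str, PathMetrics],
                  sla_ms: float) -> Optional[str]:
    """Return the lowest-delay candidate instance that meets the SLA, if any."""
    feasible = {name: end_to_end_delay(m) for name, m in candidates.items()
                if end_to_end_delay(m) <= sla_ms}
    if not feasible:
        return None
    return min(feasible, key=lambda name: (feasible[name], name))


# Illustrative values only; flow names and delays are not from this document.
current = {"flow-a": PathMetrics(10.0, 5.0), "flow-b": PathMetrics(30.0, 25.0)}
sla = {"flow-a": 20.0, "flow-b": 50.0}
to_fix = violating_flows(current, sla)

candidates = {"SI-ID 1": PathMetrics(30.0, 25.0),
              "SI-ID 2": PathMetrics(20.0, 15.0)}
new_target = pick_instance(candidates, sla["flow-b"])
```

When no candidate satisfies the SLA, pick_instance returns None and the control plane would need another remediation, which is outside the scope of this sketch.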

To be added.

9. Security Considerations


10. Acknowledgements

To be added upon contributions, comments and suggestions.

11. IANA Considerations


12. References

12.1. Normative References

[I-D.ldbc-cats-framework] Li, C., Du, Z., Boucadair, M., Contreras, L. M., and J. Drake, "A Framework for Computing-Aware Traffic Steering (CATS)", Work in Progress, Internet-Draft, draft-ldbc-cats-framework-06.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119.
[RFC4656] Shalunov, S., Teitelbaum, B., Karp, A., Boote, J., and M. Zekauskas, "A One-way Active Measurement Protocol (OWAMP)", RFC 4656, DOI 10.17487/RFC4656.
[RFC7276] Mizrahi, T., Sprecher, N., Bellagamba, E., and Y. Weingarten, "An Overview of Operations, Administration, and Maintenance (OAM) Tools", RFC 7276, DOI 10.17487/RFC7276.
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174.
[RFC8402] Filsfils, C., Ed., Previdi, S., Ed., Ginsberg, L., Decraene, B., Litkowski, S., and R. Shakir, "Segment Routing Architecture", RFC 8402, DOI 10.17487/RFC8402.
[RFC8754] Filsfils, C., Ed., Dukes, D., Ed., Previdi, S., Leddy, J., Matsushima, S., and D. Voyer, "IPv6 Segment Routing Header (SRH)", RFC 8754, DOI 10.17487/RFC8754.
[RFC9378] Brockners, F., Ed., Bhandari, S., Ed., Bernier, D., and T. Mizrahi, Ed., "In Situ Operations, Administration, and Maintenance (IOAM) Deployment", RFC 9378, DOI 10.17487/RFC9378.

12.2. Informative References

[I-D.ietf-cats-usecases-requirements] Yao, K., Trossen, D., Boucadair, M., Contreras, L. M., Shi, H., Li, Y., Zhang, S., and Q. An, "Computing-Aware Traffic Steering (CATS) Problem Statement, Use Cases, and Requirements", Work in Progress, Internet-Draft, draft-ietf-cats-usecases-requirements-02.
[I-D.ietf-teas-rfc3272bis] Farrel, A., "Overview and Principles of Internet Traffic Engineering", Work in Progress, Internet-Draft, draft-ietf-teas-rfc3272bis-27.
[I-D.li-dyncast-architecture] Li, Y., Iannone, L., Trossen, D., Liu, P., and C. Li, "Dynamic-Anycast Architecture", Work in Progress, Internet-Draft, draft-li-dyncast-architecture-08.

Authors' Addresses

Huakai Fu
ZTE Corporation
Bo Liu
China Mobile
Zhenqiang Li
China Mobile
Daniel Huang
ZTE Corporation
Cheng Huang
ZTE Corporation
Liwei Ma
ZTE Corporation
Wei Duan
ZTE Corporation