Network Working Group Y. Jiang W. Xu Internet Draft Huawei Z. Cao Intended status: Standards Track Leibniz Uni-Hannover Y. Zhu China Telecom Expires: April 2016 October 16, 2015 Fault Management in Service Function Chaining draft-jxc-sfc-fm-03.txt Status of this Memo This Internet-Draft is submitted to IETF in full conformance with the provisions of BCP 78 and BCP 79. Internet-Drafts are working documents of the Internet Engineering Task Force (IETF), its areas, and its working groups. Note that other groups may also distribute working documents as Internet-Drafts. Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress." The list of current Internet-Drafts can be accessed at http://www.ietf.org/ietf/1id-abstracts.txt The list of Internet-Draft Shadow Directories can be accessed at http://www.ietf.org/shadow.html This Internet-Draft will expire on April 16, 2013. Copyright Notice Copyright (c) 2015 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Jiang and et al Expires April 16, 2016 [Page 1] Internet-Draft SFC Fault Management October 2015 Abstract This document specifies some Fault Management tools for SFC. It also describes the procedures of using these tools to perform connectivity verification and SFP tracing functions upon SFC services. Table of Contents 1. Introduction .............................................. 2 1.1. Conventions used in this document ...................... 3 1.2. Terminology ............................................ 4 1.3. SFC OAM Requirements ................................... 4 2. Packet Format ............................................. 5 3. Theory of Operation ....................................... 8 3.1. Continuity Check and Connectivity Verification of SFC .. 8 3.1.1. MEP sending an SFC CC-CV packet ..................... 8 3.1.2. SFF processing an SFC CC-CV ......................... 9 3.1.3. SF processing an SFC CC-CV .......................... 9 3.1.4. MEP terminating an SFC CC-CV packet ................. 9 3.2. SFC Route Tracing ...................................... 9 3.2.1. MEP sending an SFC Trace Request ................... 11 3.2.2. SFF processing an SFC Trace Route Request .......... 11 3.2.3. SF processing an SFC Trace Request ................. 11 3.2.4. MEP receiving an SFC Trace Reply ................... 12 4. Security Considerations .................................. 12 5. IANA Considerations ...................................... 12 6. References ............................................... 12 6.1. Normative References .................................. 12 6.2. Informative References ................................ 12 7. Acknowledgments .......................................... 13 1. Introduction This document discusses Operations, Administration and Maintenance (OAM), specifically, fault management of Service Function Chaining (SFC), and further provides a solution that can be used in SFC OAM. An SFC layer OAM framework was described in [I-D.ietf-sfc-oam- framework], which introduced multiple layers of OAM model including the link layer, the network layer, the service layer. The link layer and the network layer can leverage existing OAM tool sets available to perform OAM function. Jiang and et al Expires April 16, 2016 [Page 2] Internet-Draft SFC Fault Management October 2015 This document is focused on the SFC OAM mechanism in the service layer. Currently the service layer does not have OAM tools yet, and new OAM tools should be developed according to the characteristic of service layer. In the design of SFC OAM, one important principle is to reuse existing OAM tools as much as possible. As a result, methods in this document follow some existing OAM mechanisms, e.g. Ping, BFD, etc. A requisite of SFC OAM is that SFC OAM messages must follow the same data path as normal SFC packets would traverse. This is usually called "In-band" signaling. SFC OAM tools are used primarily to validate the SFC data plane, and may further be used to verify the SFC data plane against the SFC control plane. SFC OAM tools discussed in this document include Connectivity Function, Continuity Function, and Trace Function. SFC Connectivity Function is used to verify the connectivity existing between classifier or service function and another service function. Connectivity verification packets will travel along the tested service chain path to an anticipant endpoint. SFC Continuity Function is used to validate or verify the constant reachability for a specific SFP. It can detect failure between a source device and destination device in a periodic of time, which usually be used in fast fault detection. SFC Trace Function is used to locate the fault position. The trace request packets will travel along the SFP, it will trigger every transit device to generate response. The trace reply message will carry information for fault location. Details of these SFC functions are described in section 3. 1.1. Conventions used in this document The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119]. Jiang and et al Expires April 16, 2016 [Page 3] Internet-Draft SFC Fault Management October 2015 1.2. Terminology Maintenance Entity Group (MEG): The set of one or more maintenance entities that maintain and monitor a section or a transport path in an OAM domain. MEP: MEG End Point, an OAM end point capable of initiating (source MEP) and terminating (sink MEP) OAM packets for fault management and performance monitoring. MIP: MEG Intermediate Point, an OAM intermediate point terminates and processes OAM packets that are sent to this particular MIP and may generate OAM packets in reaction to received OAM packets. Service Chaining Header: a header in front of packet, added by an SFF. SFF uses service chaining header information to forward service chaining packet. Service Chaining Packet: an original packet added with a service chaining header. Terminologies introduced in [I-D.ietf-sfc-architecture] will not be repeated here. 1.3. SFC OAM Requirements The following SFC OAM requirements MUST be supported: (R1) SFC OAM MUST allow for continuity check between SFFs. (R2) SFC OAM MUST allow for connectivity verification between SFFs. (R3) SFC OAM MUST support trace routing in a service function path. (R4) SFC OAM MUST support connectivity verification between SFs in an SFC chain. (R5) SFC OAM MUST support performance measurements in SFs and SFFs. (R6) SFC OAM MUST support monitoring of unidirectional and bi- directional SFC path. (R7) SFC OAM MUST support fate sharing of SFC OAM packets and SFC service packets on the same SFC path (congruent path). Jiang and et al Expires April 16, 2016 [Page 4] Internet-Draft SFC Fault Management October 2015 Since control plane is not a prerequisite for SFC, we cannot resort to control plane hello session. Furthermore, OAM packets need to be transported on the same data path as the SFC packets, so that any data plane failure can be identified. Therefore, there is a need to provide an OAM tool that would enable users to detect failures in the SFC data plane, and a mechanism to isolate and identify faults. This document discusses the fault management problem in SFC. The basic idea is to verify that packets in a particular Service Function Chain actually passing through the SFFs and SFs along the respective SFC path. It is proposed that this test be carried out by sending an OAM message (called an "SFC trace request message") across an SFC path. The SFC trace request message carries the SFC identifier whose SFC path is being verified. This SFC OAM request message is forwarded on the SFC path just like any SFC data packet belonging to that Service Function Chain. The OAM message is processed by each SFF along the SFC, and the SFF will respond with an SFC trace Reply message, carrying information such as the previous SF identifier and its position in the SFC. 2. Packet Format The Network Service Header (NSH) [I-D.ietf-sfc-nsh] as described in Figure 1 indicates an SFC OAM message (where O bit is set to 1): 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |Ver|1|C|R|R|R|R|R| Length | MD Type | OAM Type | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Service Path ID | Service Index | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 1 NSH header for OAM o OAM Type indicate the type of SFC OAM. The OAM has the following types: Jiang and et al Expires April 16, 2016 [Page 5] Internet-Draft SFC Fault Management October 2015 Value Meaning ----- ------- 1 continuity check 2 connectivity verification 3 trace routing 4 performance measurements SFC OAM payload is depicted in Figure 2. 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |Version|R|R|R|R| Message Type | Reserved | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Originator Handle | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Remote Handle | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Sequence Number | | Sending Timestamp (seconds) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Sending Timestamp (microseconds) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Receiving Timestamp (seconds) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Receiving Timestamp (microseconds) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | TLVs | . . . . . . | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 2 SFC OAM payload o Version: version of SFC OAM message. This field is 8 bits long, and current version is set to 0x01. o Message Type indicate the type of SFC OAM message. The SFC OAM message has the following types: Jiang and et al Expires April 16, 2016 [Page 6] Internet-Draft SFC Fault Management October 2015 Value Meaning ----- ------- 1 continuity check message 2 trace request message 3 trace reply message o Originator Handle: The Originator Handle is filled in by the packet original sender. o Remote Handle: The Remote Handle indicates the sink point. It usually used in verify part of SFP. It should be filled with 0xffff when verify the whole SFP. o Sequence Number: The Sequence Number is assigned by the sender of the SFC request message and can be used to track the correct reply message. o The Sending Timestamp is the time-of-day (in seconds and microseconds, according to the sender's clock) when the SFC OAM request is sent. o The Receiving Timestamp in an SFC OAM reply message is the time- of-day (according to the receiver's clock) that the corresponding request was received. o TLVs (Type-Length-Value) have the following format: 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Value | . . . . . . | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ Figure 3 SFC OAM TLVs Types of SFC OAM TLV will be defined in the later revision; Length is the length of the Value field in octets; and Value field is variable depending on its Type (it is zero padded to align to a 4-octet boundary). Jiang and et al Expires April 16, 2016 [Page 7] Internet-Draft SFC Fault Management October 2015 3. Theory of Operation In order to describe SFC OAM in an abstract way, we reuse some nomenclatures in MPLS WG. SFC OAM operates in the context of Maintenance Entities (MEs) that define a relationship between two points of a service function path to which maintenance and monitoring operations apply. The two points that define a maintenance entity are called Maintenance Entity Group End Points (MEPs). An abstract reference model for an ME is illustrated in Figure 4 below: +-+ +-+ +-+ +-+ |A|----|B|----|C|----|D| +-+ +-+ +-+ +-+ Figure 4 SFC OAM Reference Model In Figure 4, node A can be a classifier or an entry SFF, node D can be an exit SFF, and node B and C can be any SFF or SF (with the restriction that any two SFs cannot be directly connected in SFC forwarding layer) on the SFC path. In general, MEG End Points (MEPs) are the source and sink points of a MEG for SFC OAM. 3.1. Continuity Check and Connectivity Verification of SFC Proactive Continuity Check (CC) can be used to detect a loss of continuity defect between two MEPs in a MEG. Proactive Connectivity Verification (CV) can be used to detect an unexpected connectivity defect between two MEGs or unexpected connectivity within the MEG with an unexpected MEP. BFD can also be used as a tool of proactive CC & CV in SFC, where BFD Control packets must be sent along the same path as the monitored SFC path. 3.1.1.MEP sending an SFC CC-CV packet A source MEP can proactively sends CC-CV packets periodically to its sink peer MEP. An SFC CC-CV packet is an SFC CC-CV message encapsulated with an SFC Header. The SFC header is set as described in [I-D. draft-ietf-sfc-nsh] and its flag O MUST be set to 1. The SFC OAM message is set as follows: Jiang and et al Expires April 16, 2016 [Page 8] Internet-Draft SFC Fault Management October 2015 - The message type MUST be set to 1. - The Sender's Handle is set by the original sender, and MUST be set with the sender's identifier. - The Sequence Number is set with a random value. - The Remote Handle is set with 0xffff to verify the whole SFP. 3.1.2.SFF processing an SFC CC-CV When an SFF receives a CC-CV packet, it performs service forwarding function, and sends the packet to the next SF or next SFF. 3.1.3.SF processing an SFC CC-CV An SF can only be configured as an MIP in an MEG of SFC. When an SF (being an MIP) receives a CC-CV packet, it only sends it back to the SFF transparently. 3.1.4.MEP terminating an SFC CC-CV packet A sink MEP detects a loss of continuity defect when it fails to receive proactive CC-V OAM packets from the source MEP for a consecutive time. When CC-V packets are received by a sink MEP, it is parsed. If any mis-connectivity defect is detected, a warning should be raised and fault management system should be notified of the detected defects. 3.2. SFC Route Tracing According to the SFC architecture [I-D. draft-ietf-sfc-architecture- 09], SFC can be categorized into two abstraction layers, that is, service function layer and SFC forwarding layer. In the service function layer, a service function chain actually is a service function graph, where a service function is connected to another service function one by one in sequence. In the SFC forwarding layer, service functions are further attached to SFF nodes thus form a more detailed forwarding graph. As defects can be located on either service functions or SFF nodes, it is critical to trace route both service functions and SFF nodes to detect and isolate any defects for SFC. Jiang and et al Expires April 16, 2016 [Page 9] Internet-Draft SFC Fault Management October 2015 In order to trace route of a service function chain, different layers of service function chain can be monitored: o Service-function-layer, that is, only SF identifiers can be set as the destination MEP in the trace route request and response messages. The trace routing operation collects all the SFs' identifiers along an SFC path. By comparing this SF list with the pre-configured service function graph, an operator could determine whether there is any fault in the SF connectivity and locate the defect on an SF when there are any of them. o SFC-forwarding-layer, that is, both SF identifiers and SFF can be set as the destination MEP in the trace route request and response messages. The trace routing operation collects all the SFs' identifiers and SFF identifiers along an SFC path. By comparing this SF and SFF list with the pre-configured SFC forwarding graph, an operator could determine whether there is any fault in the forwarding layer and locate the defect on an SFF or an SF. Furthermore, two different mechanisms may be used to trace route a service function chain: o TTL mechanism Similar to the IP trace route, the detection node launches a number of trace request messages in sequence to detect the fault in a specific path, the TTL of request message is set successively to 1, 2, ..., and so on. The trace route request will pass the SFs along the service function graph, and each SF will decrease the TTL value by 1. A trace route reply message will be generated and send back to the launcher when the resulted TTL is equal to zero. In this way, the launcher of trace routing can get the list of SFs that the trace route request message passes by parsing all the trace route reply messages, and isolate the fault location if there is any. o record route mechanism The detection node launches a single trace route request message, and this message is transported over the specific SFC path. Jiang and et al Expires April 16, 2016 [Page 10] Internet-Draft SFC Fault Management October 2015 When the trace route request message is received by an SF in the SFC path, the SF adds its SF identifier to the end of an SF list carried in the message. Moreover, a trace route reply message should be generated and sent back to the launcher, and the new record route SF list MUST be copied to the trace route reply message. In this way, the launcher of trace routing can get the list of SFs that the trace route request message passes by parsing all the trace route reply messages, and isolate the fault location if there is any. 3.2.1.MEP sending an SFC Trace Request In general, MEG End Points (MEPs) are the source and sink points of a MEG for SFC OAM. An MEP initiates a trace route request packet to detect and track any fault in a Service Function Chain. An SFC Trace route request packet is an SFC trace route request message encapsulated with an SFC Header. The SFC header is set as described in [I-D.draft-ietf-sfc-nsh-00] and flag O MUST be set to 1. The SFC OAM message is further set as follows: - The message type MUST be set to 2. - The Sender's Handle MUST be set to the sender's identifier. - The Receiver's Handle can be set to the exit SFF's identifier. 3.2.2.SFF processing an SFC Trace Route Request When an SFF receives a trace route request packet with O flag being set in SFC header, it firstly adds its identifier to the end of the record route list in the trace request. It then performs service forwarding function, and sends the new trace route request packet to the next SF or next SFF. Furthermore, the SFF sends a trace reply packet back to the source MEP with a copy of the new record route SF list. 3.2.3.SF processing an SFC Trace Request An SF can only be configured as an MIP in an MEG of SFC. When an SF (being an MIP) receives a trace request packet with OAM flag being set in SFC header from an SFF, it only sends it back to the SFF transparently. Jiang and et al Expires April 16, 2016 [Page 11] Internet-Draft SFC Fault Management October 2015 3.2.4.MEP receiving an SFC Trace Reply An MEP should only process an SFC trace reply packet in response to an SFC trace request that it has sent. Thus, upon receipt of an SFC trace reply packet, an MEP should try to match the trace reply packet with a trace request that it has previously sent, by checking the corresponding path identifier and Sequence Number in the SFC OAM packets. If no match is found, then the MEP MUST drop the trace reply packet silently. Since each SFF in the SFC path will send a trace reply packet when the trace request packet passes it, a source MEP will receive a sequence of trace reply packets from SFFs (other than the MEP itself) along the SFC path. Thus, the source MEP can get the full service topology and SFC path if there is no defect in the SFC data plane, and could detect and locate the data plane defects if there are any of them. 4. Security Considerations It will be considered in a future revision. 5. IANA Considerations It will be considered in a future revision. 6. References 6.1. Normative References 6.2. Informative References [I-D.ietf-sfc-problem-statement] P. Quinn, and T. Nadeau; Service Function Chaining Problem Statement; August 2014; Work in Progress [I-D.ietf-sfc-architecture] J. Halpern, C. Pignataro; Service Function Chaining (SFC) Architecture; September 2014; Work in Progress Jiang and et al Expires April 16, 2016 [Page 12] Internet-Draft SFC Fault Management October 2015 [I-D.ietf-sfc-oam-framework] S. Aldrin; C. Pignataro; N. Akiya Service Function Chaining Operations, Administration and Maintenance Framework; July 2014; Work in Progress [I-D.ietf-sfc-nsh] P. Quinn, U. Elzur; Network Service Header; July 2015; Work in Progress 7. Acknowledgments TBD Authors' Addresses Yuanlong Jiang Huawei Technologies Co., Ltd. Bantian, Longgang district Shenzhen 518129, China Email: jiangyuanlong@huawei.com Weiping Xu Huawei Technologies Co., Ltd. Bantian, Longgang district Shenzhen 518129, China Email: xuweiping@huawei.com Zhen Cao Leibniz Uni-Hannover Email: zhencao.ietf@gmail.com Yongqing Zhu China Telecom 109, West Zhongshan Road, Tianhe District, Guangzhou,China Email: zhuyq@gsta.com Jiang and et al Expires April 16, 2016 [Page 13]