Internet Draft                                              David Allan
Document: draft-allan-y1711-and-lsp-ping-00.txt         Nortel Networks
Category: Informational                                   February 2003


                          Y.1711 and LSP-PING


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright(C) The Internet Society (2003). All Rights Reserved.

Abstract

   This Internet-Draft shows that the OAM tools defined by ITU-T
   SG13/Q3 and the IETF are complementary.

Sub-IP ID Summary

   [to be removed when published]

   WHERE DOES IT FIT IN THE PICTURE OF THE SUB-IP WORK

   Fits in the MPLS, PWE and PPVPN boxes.

   WHY IS IT TARGETED AT THESE WGs

   This draft shows that LSP-PING and Y.1711 can be considered to be
   complementary tools in the suite of options to measure and
   instrument MPLS.

Allan et.al               Expires August 2003                    Page 1

1. Introduction

   [Y1711] and [LSP-PING] are products of the ITU-T SG13/Q3 and the
   IETF MPLS WG respectively. Each reflects the design philosophy of
   the community in which it originated.

   The purpose of this draft is to compare and contrast design elements
   of the two approaches. The conclusion drawn is that the approaches
   are complementary, and that comprehensive instrumentation of MPLS is
   ultimately possible using both.

2.
Y.1711

   Y.1711, together with its proposed extensions, is primarily focused
   on fault management and availability measurement for MPLS.

   The major design objective of Y.1711 as it currently stands is fast,
   simple and automatic defect detection and handling. A secondary goal
   is to be able to measure availability simply. It trades precision in
   fault isolation for this fast/simple/automatic defect
   detection/handling capability (frequently referred to as "bounded
   detection time"). This manifests itself in a number of design
   decisions:

   - The basic CV (Connectivity Verification) probe has been
     ruthlessly simplified to minimize processing, which manifests
     itself in a small number of fixed fields. Frequent injection of
     CV probes into the network therefore does not degrade the
     network. Frequent injection of CV probes is a prerequisite for
     consistent/deterministic defect detection/handling and
     availability measurement. Injection of CV probes into LSPs from
     multiple sources (MP2P, possibly with ECMP) is assumed to result
     in arrival rates at the LSP egress bursting at line rate.

   - The CV probe is augmented with defect notification PDUs: FDI
     (Forward Defect Indication) for the forward direction and BDI
     (Backward Defect Indication) for the reverse direction. These are
     used for alarm suppression and control of performance measurement
     functions. BDI has limited applicability given that most LSPs are
     uni-directional; however, it is very useful for interworking OAM
     with bi-directional PW clients (e.g. ATM).

   - A slightly more sophisticated probe type, the FEC-CV (under
     study), can carry aggregated FEC information (in the form of a
     Bloom filter) such that a significant amount of configuration
     information that is bound to the LSP can be verified in a single
     transaction. Simple boolean operations on the Bloom filter at the
     LSP egress can be used to detect misbranching while being
     tolerant of inbound filtering and other artifacts of network
     operations.
   - Probe processing is primarily performed at the egress such that,
     for uni-directional LSPs, there is minimal ambiguity in detecting
     failure. This is also required to take the appropriate consequent
     actions, e.g. to inform higher layer clients of lower layer
     failures and thus avoid generating alarm storms in inappropriate
     places, or perhaps to suppress traffic if a security compromise
     is indicated (i.e. traffic arriving from the wrong source).

   - Probe processing provides a simple "pass/fail" indication and
     sufficient information to permit a craftsperson to initiate
     diagnosis. It is dependent on other tools to perform specific
     diagnosis and isolation of problems. For example, either the
     basic CV (on P2P LSPs) or the FEC-CV (on MP2P LSPs) will identify
     that misconfiguration and/or misbranching problems exist and
     permit the network to take some form of automated response, but
     will not identify the precise problem (in the case of the basic
     CV it will identify the source of the offending LSP).

   - Y.1711 is not designed to extract information from the network
     about the configuration and layout of network components. It does
     not currently define any path tracing functionality and only
     operates on LSP endpoints.

   - A corollary of the above is that only LSP endpoints have any role
     in CV processing, and the CV probe passes transparently through
     intermediate nodes.

   - Y.1711 depends on some degree of ubiquitous deployment at the
     edge to maximize coverage of fault detection.

   - Y.1711 is primarily focused on tunnel endpoints. However, core
     LSRs may add significant value by implementing a specific subset
     of Y.1711: FDI generation for P2P LSPs, to provide alarm
     suppression and fault notification to the edge devices when
     failures in the core occur.

   Specific assumptions are also made with respect to the evolution of
   the MPLS architecture:

   - OAM friendly LSP terminations will most likely deprecate PHP.
     This will:

     - permit simple state association between OAM probes and paths,
     - simplify verification of LSP function,
     - permit verification to be at the granularity of LSPs instead of
       individual FEC elements,
     - permit PW and VPN labels to be explicitly tested, and
       facilitate interworking with PW client OAM,
     - simplify instrumentation of performance measurement,
     - allow consequent actions on defect detection to be enabled at
       the appropriate node,
     - allow a deterministic measurement of availability and QoS
       performance metrics at the appropriate node (the latter only
       have relevance when the LSP is "up", so it is important to be
       able to correctly specify this behavior).

   - Load balancing implementations will become friendly to reserved
     labels and preserve fate sharing between probes and traffic
     [GUIDELINES].

3. LSP-PING

   LSP-PING is designed to be retrofitted to existing deployed
   networks and to exercise all functionality currently deployed. The
   design trade-off made to achieve this is that detection or
   diagnosis of a problem may take an arbitrary number of
   transactions. Protocol complexity is tolerated as initial
   implementations will be in software.

   Protocol complexity manifests itself in the form of TLV encoding of
   key information (FEC stack elements and the downstream LSR label
   map; extracting ECMP specifics is still a topic of discussion).
   Future functionality may be added to the protocol via the
   definition of additional TLVs. Protocol complexity also arises from
   requiring all nodes to be able to process the probes.

   Aspects of the protocol design would permit a sparse subset to be
   handled in hardware (exact pattern match on the PDU). For example,
   in a VPN application, pinging a PE is facilitated by limiting the
   number of FECs at any level in the stack to one. Presumably an
   implementation of probe handling that matched on a ping of the PE
   loopback address could be optimized for that specific case.
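
   The TLV encoding discussed in this section can be illustrated with
   a minimal sketch. It assumes a generic layout of a 2-octet type and
   a 2-octet length followed by the value, padded to a 4-octet
   boundary. The type codes and function names are hypothetical and
   are not taken from [LSP-PING]; the intent is only to show why
   additional TLVs can be defined later without disturbing existing
   parsers.

```python
import struct
from typing import List, Tuple

# Hypothetical type codes for illustration only; these are NOT the
# values assigned by [LSP-PING].
TLV_FEC_STACK = 1
TLV_DOWNSTREAM_MAP = 2


def encode_tlv(tlv_type: int, value: bytes) -> bytes:
    """Encode one TLV: 2-octet type, 2-octet length, padded value."""
    padding = (-len(value)) % 4  # octets needed to reach a 4-octet boundary
    header = struct.pack("!HH", tlv_type, len(value))
    return header + value + b"\x00" * padding


def decode_tlvs(data: bytes) -> List[Tuple[int, bytes]]:
    """Walk a buffer of concatenated TLVs and return (type, value) pairs."""
    tlvs = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 4:
            raise ValueError("truncated TLV header")
        tlv_type, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        if len(data) - offset < length:
            raise ValueError("truncated TLV value")
        tlvs.append((tlv_type, data[offset:offset + length]))
        # Skip the value plus any padding; an unrecognized type can be
        # stepped over the same way because the length is self-describing.
        offset += length + (-length) % 4
    return tlvs
```

   Because each element carries its own length, a receiver that does
   not recognize a type can still step over it, which is what allows
   functionality to be added through new TLVs. The cost is that the
   parser must walk the PDU element by element, in contrast to the
   small number of fixed fields in the basic CV probe.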
   LSP-PING permits a uni-directional path to be tested from a single
   point, but depends on a reliable return path in order to propagate
   the test results back to the originating LSR. Therefore the
   protocol is designed to tolerate degrees of ambiguity in individual
   test results. Failure of an individual ping response may be due to
   any of several causes:

   - Forwarding path failure (including partial failure of ECMP or
     other load balancing constructs)
   - Return path failure
   - Port rate limiting at the egress
   - Port rate limiting at the ping origin
   - Congestive loss in the network

   To deal with this, LSP-PING supports several features that allow
   the ambiguity to be eliminated by having the ingress perform
   variations of the original transaction:

   - Probe sequencing to permit both ingress and egress to detect gaps
     in probe sequences.
   - Return path may be specified, permitting data plane and control
     plane problems to be distinguished.
   - Destination address may be manipulated to exercise payload
     sensitive ECMP implementations.

   LSP-PING generally assumes PHP and that any specific LSP binding at
   the egress point of probe processing may not exist. From the
   perspective of reliable fault detection this is a minor issue, as
   the use of a non-routable destination address limits any untested
   modes of failure. However, this does alter the granularity of
   useful verification, as probe contents must be checked against the
   set of FECs associated with the LSR, rather than simply the set
   specifically associated with the LSP of interest. When testing a
   label stack for a VPN PE, the number of individual transactions
   required may be quite large, as the number of FEC elements
   supported by the PE can be considerable.

   LSP-PING permits a label stack.
   For PW and VPN applications, PHP may be employed by the PE such
   that PW and VPN labels may not be directly tested (hence the FEC
   stack, which permits transport or PSN probes to proxy verification
   for the transported application).

   LSP-PING has a traceroute mode that can extract a significant
   amount of information about network configuration, specifically all
   details of path construction for a given FEC (note that LSP-PING
   will most likely need to be augmented with authentication and
   authorization capability in the long term).

   No assumptions are made with respect to the evolution of the MPLS
   architecture.

4. Summary

   LSP-PING and Y.1711 should be considered to be complementary.

   LSP-PING uses repeated transactions over time and the ability to
   encode specific FEC information to gain authoritative precision in
   testing. The PING/TRACEROUTE paradigm is suitable for employment by
   craftspersons (who frequently have the luxury of time to isolate
   and correct problems).

   Y.1711 uses simple periodic probes (basic CV) or information
   digests (FEC-CV), together with egress probe processing, to allow
   automatic fault detection/handling and thus minimize the time
   required to make an authoritative determination of the existence of
   a problem. This makes Y.1711 suitable for proactive fault detection
   and for harmonizing MPLS operations and maintenance with many types
   of client layers.

5. References

   [GUIDELINES] Allan, D., "Guidelines for MPLS Load Balancing", IETF
                Internet Draft, November 2002

   [LSP-PING]   Pan et al., "Detecting Data Plane Liveliness in MPLS",
                draft-ietf-mpls-lsp-ping-01, IETF work in progress,
                October 2002

   [Y1711]      ITU-T Recommendation Y.1711, "OAM mechanism for MPLS
                networks"

6. Author's Address

   David Allan
   Nortel Networks              Phone: 1-613-763-6362
   3500 Carling Ave.            Email: dallan@nortelnetworks.com
   Ottawa, Ontario, CANADA

7. Full Copyright Statement

   "Copyright (C) The Internet Society (2003). Except as set forth
   below, authors retain all their rights.
   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain
   it or assist in its implementation may be prepared, copied,
   published and distributed, in whole or in part, without restriction
   of any kind, provided that the above copyright notice and this
   paragraph are included on all such copies and derivative works.
   However, this document itself may not be modified in any way, such
   as by removing the copyright notice or references to the Internet
   Society or other Internet organizations, except as needed for the
   purpose of developing Internet standards in which case the
   procedures for rights in submissions defined in the IETF Standards
   Process must be followed, or as required to translate it into
   languages other than English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.