Network Working Group                                        S. Poretsky
Internet Draft                                        NextPoint Networks
Expires: August 2008
Intended Status: Informational                                R. Papneja
                                                                 Isocore

                                                              J. Karthik
                                                           Cisco Systems

                                                             S. Vapiwala
                                                           Cisco Systems

                                                       February 25, 2008

           Benchmarking Terminology for Protection Performance

Intellectual Property Rights (IPR) statement:

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

Status of this Memo

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The IETF Trust (2008).

Abstract

   This document provides common terminology and metrics for
   benchmarking the performance of sub-IP layer protection mechanisms.
   The performance benchmarks are measured at the IP layer to avoid
   dependence on specific sub-IP protection mechanisms.  The benchmarks
   and terminology can be applied in methodology documents for
   different sub-IP layer protection mechanisms such as Automatic
   Protection Switching (APS), Virtual Router Redundancy Protocol
   (VRRP), Stateful High Availability (HA), and Multi-Protocol Label
   Switching Fast Reroute (MPLS-FRR).
Poretsky, Papneja, Karthik, Vapiwala    Expires August 2008     [Page 1]

Internet-Draft      Benchmarking Terminology for           February 2008
                       Protection Performance

Table of Contents

   1. Introduction
   2. Existing definitions
   3. Test Considerations
      3.1. Paths
         3.1.1. Path
         3.1.2. Working Path
         3.1.3. Primary Path
         3.1.4. Protected Primary Path
         3.1.5. Backup Path
         3.1.6. Standby Backup Path
         3.1.7. Dynamic Backup Path
         3.1.8. Disjoint Paths
         3.1.9. Point of Local Repair (PLR)
         3.1.10. Shared Risk Link Group (SRLG)
      3.2. Protection Mechanisms
         3.2.1. Link Protection
         3.2.2. Node Protection
         3.2.3. Path Protection
         3.2.4. Backup Span
         3.2.5. Local Link Protection
         3.2.6. Redundant Node Protection
         3.2.7 State Control Interface
         3.2.8. Protected Interface
      3.3. Protection Switching
         3.3.1. Protection Switching System
         3.3.2. Failover Event
         3.3.3. Failure Detection
         3.3.4. Failover
         3.3.5. Restoration
         3.3.6. Reversion
      3.4. Nodes
         3.4.1. Protection-Switching Node
         3.4.2. Non-Protection Switching Node
         3.4.3. Headend Node
         3.4.4. Backup Node
         3.4.5. Merge Node
         3.4.6. Primary Node
         3.4.7. Standby Node
      3.5. Benchmarks
         3.5.1. Failover Packet Loss
         3.5.2. Reversion Packet Loss
         3.5.3. Failover Time
         3.5.4. Reversion Time
         3.5.5. Additive Backup Latency
      3.6 Failover Time Calculation Methods
         3.6.1 Time-Based Loss Method
         3.6.2 Packet-Loss Based Method
         3.6.3 Timestamp-Based Method
   4. Acknowledgments
   5. IANA Considerations
   6. Security Considerations
   7. References
   8. Author's Address

1. Introduction

   The IP network layer provides route convergence to protect data
   traffic against planned and unplanned failures in the Internet.
   Fast convergence times are critical to maintaining reliable network
   connectivity and performance.  Technologies that function at sub-IP
   layers can be enabled to provide further protection of IP traffic
   by performing failure recovery at the sub-IP layers, so that the
   outage is not observed at the IP layer.
   Such Sub-IP Protection technologies include High Availability (HA)
   stateful failover, Virtual Router Redundancy Protocol (VRRP),
   Automatic Protection Switching (APS) for SONET/SDH, Resilient Packet
   Ring (RPR) for Ethernet, and Fast Reroute for Multi-Protocol Label
   Switching (MPLS-FRR) [8].

   Benchmarking terminology has been defined for IP-layer route
   convergence [7].  New terminology and methodologies specific to
   benchmarking sub-IP layer protection mechanisms are required.  This
   will enable different implementations of the same protection
   mechanism to be benchmarked and evaluated.  In addition, different
   protection mechanisms can be benchmarked and evaluated against one
   another.  The metrics for benchmarking the performance of sub-IP
   protection mechanisms are measured at the IP layer, so that the
   results are always measured in reference to IP and are independent
   of the specific protection mechanism being used.  The purpose of
   this document is to provide a single terminology for benchmarking
   sub-IP protection mechanisms.  The intent is that a separate
   methodology document can exist for each sub-IP protection
   mechanism.

   The sequence of events is as follows:

      1. Failover Event - Primary Path fails
      2. Failure Detection - Failover Event is detected
      3. Failover - Backup Path becomes the Working Path due to
         Failover Event
      4. Restoration - Primary Path recovers from a Failover Event
      5. Reversion (optional) - Primary Path becomes the Working Path

   These terms are further defined in this document.

   Figures 1 through 5 show fundamental models that MAY be used in
   benchmarking Sub-IP Protection mechanisms.  Sub-IP Protection
   mechanisms MUST use a Protection Switching System that consists of
   a minimum of two Protection-Switching Nodes: an Ingress Node known
   as the Headend Node and an Egress Node known as the Merge Node.
   The protection MAY be provided with either a Primary Path and
   Backup Path, as shown in Figures 1 through 4, or a Primary Node and
   Standby Node, as shown in Figure 5.
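
   The five-step sequence of events listed above can be sketched as a
   small state tracker.  This is only an illustration under stated
   assumptions: the names Step and working_path_after are hypothetical
   and do not appear in this document, and the strict ordering check
   is a simplification chosen for the example.

```python
from enum import IntEnum


class Step(IntEnum):
    """The sequence of events from the Introduction (names are illustrative)."""
    FAILOVER_EVENT = 1     # Primary Path fails
    FAILURE_DETECTION = 2  # Failover Event is detected
    FAILOVER = 3           # Backup Path becomes the Working Path
    RESTORATION = 4        # Primary Path recovers
    REVERSION = 5          # optional: Primary Path becomes the Working Path


def working_path_after(steps):
    """Return which path is the Working Path after the given steps.

    Assumption for this sketch: steps must start at FAILOVER_EVENT and
    follow the listed order with no step skipped.
    """
    path = "primary"  # the Working Path is the Primary Path before any failure
    expected = Step.FAILOVER_EVENT.value
    for step in steps:
        if step != expected:
            raise ValueError(f"step {step.name} is out of order")
        if step is Step.FAILOVER:
            path = "backup"
        elif step is Step.REVERSION:
            path = "primary"
        expected += 1
    return path
```

   Under these assumptions, a sequence that stops at Restoration
   leaves the Backup Path as the Working Path; only the optional
   Reversion step returns traffic to the Primary Path.
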
   A Protection Switching System may provide link protection, node
   protection, path protection, local link protection, and high
   availability, as shown in Figures 1 through 5, respectively.  A
   Failover Event occurs along the Primary Path or at the Primary
   Node.  The Working Path is the Primary Path prior to the Failover
   Event and the Backup Path after the Failover Event.  A Tester is
   set outside the two paths or nodes, as it sends and receives IP
   traffic along the Working Path.  The Tester MUST record the IP
   packet sequence numbers, departure time, and arrival time so that
   the metrics of Failover Time, Additive Backup Latency, Packet
   Reordering, Duplicate Packets, and Reversion Time can be measured.
   The Tester may be a single device or a test system.  If Reversion
   is supported, then the Working Path is the Primary Path after
   Restoration (Failure Recovery) of the Primary Path.

   Link Protection, as shown in Figure 1, provides protection when a
   Failover Event occurs on the link between two nodes along the
   Primary Path.  Node Protection, as shown in Figure 2, provides
   protection when a Failover Event occurs at a Node along the Primary
   Path.  Path Protection, as shown in Figure 3, provides protection
   for link or node failures for multiple hops along the Primary Path.

   Local Link Protection, as shown in Figure 4, provides Sub-IP
   Protection of a link between two nodes, without a Backup Node.  An
   example of such a Sub-IP Protection mechanism is SONET APS.

   High Availability Protection, as shown in Figure 5, provides
   protection of a Primary Node with a redundant Standby Node.  State
   Control is provided between the Primary and Standby Nodes.  Failure
   of the Primary Node is detected at the Sub-IP layer to force
   traffic to switch to the Standby Node, which has state maintained
   for zero or minimal packet loss.
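
   The record keeping required of the Tester (sequence numbers,
   departure times, and arrival times) can be illustrated with a short
   sketch.  Everything here is an assumption made for illustration:
   the function names, the input shapes, and the simplified counting
   rules are hypothetical, and the loss-based estimate (lost packets
   divided by the offered rate) is one plausible reading rather than
   the calculation defined by this document, whose own methods are
   given in Section 3.6.

```python
def summarize(sent, received):
    """Summarize one test run from the Tester's records.

    sent:     dict mapping sequence number -> departure time in seconds
    received: list of (sequence number, arrival time) in arrival order
    Both shapes are assumptions made for this sketch.
    """
    seen = set()
    highest = -1
    duplicates = 0
    reordered = 0
    max_gap = 0.0
    prev_arrival = None
    for seq, arrival in received:
        if seq in seen:
            duplicates += 1          # same sequence number arrived again
        else:
            seen.add(seq)
            if seq < highest:
                reordered += 1       # arrived after a later-numbered packet
            highest = max(highest, seq)
        if prev_arrival is not None:
            # longest silence between consecutive arrivals
            max_gap = max(max_gap, arrival - prev_arrival)
        prev_arrival = arrival
    lost = len(set(sent) - seen)     # sent but never received
    return {
        "lost": lost,
        "duplicates": duplicates,
        "reordered": reordered,
        "max_arrival_gap": max_gap,
    }


def loss_based_time(lost, offered_rate_pps):
    """Estimate outage duration in seconds from loss at a constant offered rate."""
    if offered_rate_pps <= 0:
        raise ValueError("offered rate must be positive")
    return lost / offered_rate_pps
```

   For example, a run offered at 1000 packets per second that loses
   two packets would give a loss-based estimate of 2 milliseconds
   under these assumptions.
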
                  +-----------+
   +--------------|  Tester   |<-----------------------+
   |              +-----------+                        |
   | IP Traffic         |  Failover         IP Traffic |
   |                    |  Event                       |
   |                    |                              |
   |     ------------   |                ----------    |
   +--->|  Ingress/  |  V               | Egress/  |---+
        |Headend Node|------------------|Merge Node|  Primary
         ------------                    ----------   Path
              |                                ^
              |         ---------              |    Backup
              +--------| Backup  |-------------+    Path
                       |  Node   |
                        ---------

      Figure 1. System Under Test (SUT) for Sub-IP Link Protection

                        +-----------+
   +--------------------|  Tester   |<-----------------+
   |                    +-----------+                  |
   | IP Traffic               |  Failover   IP Traffic |
   |                          |  Event                 |
   |                          V                        |
   |     ------------      --------      ----------    |
   +--->|  Ingress/  |    |MidPoint|    | Egress/  |---+
        |Headend Node|----|  Node  |----|Merge Node|  Primary
         ------------      --------      ----------   Path
              |                                ^
              |         ---------              |    Backup
              +--------| Backup  |-------------+    Path
                       |  Node   |
                        ---------

      Figure 2. System Under Test (SUT) for Sub-IP Node Protection

                               +-----------+
   +---------------------------|  Tester   |<----------------------+
   |                           +-----------+                       |
   | IP Traffic                      |  Failover        IP Traffic |
   |                                 |  Event                      |
   |             Primary Path        |                             |
   |     ------------      --------  |  --------     ----------    |
   +--->|  Ingress/  |    |MidPoint| V |Midpoint|   | Egress/  |---+
        |Headend Node|----|  Node  |---|  Node  |---|Merge Node|
         ------------      --------     --------     ----------
              |                                         ^
              |         ---------      --------         |    Backup
              +--------| Backup  |----| Backup |--------+    Path
                       |  Node   |    |  Node  |
                        ---------      --------

      Figure 3. System Under Test (SUT) for Sub-IP Path Protection

                        +-----------+
   +--------------------|  Tester   |<-------------------+
   |                    +-----------+                    |
   | IP Traffic               |  Failover     IP Traffic |
   |                          |  Event                   |
   |               Primary    |                          |
   |    +--------+ Path       v            +--------+    |
   |    |        |------------------------>|        |    |
   +--->| Ingress|                         | Egress |----+
        |  Node  |- - - - - - - - - - - - >|  Node  |
        +--------+       Backup Path       +--------+
        ^                                           ^
        |            IP-Layer Forwarding            |
        +-------------------------------------------+

    Figure 4. System Under Test (SUT) for Sub-IP Local Link Protection

                     +-----------+
   +-----------------|  Tester   |<--------------------+
   |                 +-----------+                     |
   | IP Traffic            |  Failover      IP Traffic |
   |                       |  Event                    |
   |                       V                           |
   |     ---------      --------      ----------       |
   +--->| Ingress |    |Primary |    | Egress/  |------+
        |  Node   |----|  Node  |----|Merge Node|  Primary
         ---------      --------      ----------   Path
             |        State |Control       ^
             |    Interface |(Optional)    |
             |          ---------          |
             +---------| Standby |---------+
                       |  Node   |
                        ---------

  Figure 5. System Under Test (SUT) for Sub-IP Redundant Node Protection

2. Existing definitions

   This document uses existing terminology defined in other BMWG work.
   Examples include, but are not limited to:

      Latency                  [Ref.[2], section 3.8]
      Frame Loss Rate          [Ref.[2], section 3.6]
      Throughput               [Ref.[2], section 3.17]
      Device Under Test (DUT)  [Ref.[3], section 3.1.1]
      System Under Test (SUT)  [Ref.[3], section 3.1.2]
      Out-of-order Packet      [Ref.[4], section 3.3.2]
      Duplicate Packet         [Ref.[4], section 3.3.3]
      Forwarding Delay         [Ref.[4], section 3.2.4]
      Jitter                   [Ref.[4], section 3.2.5]
      Packet Loss              [Ref.[7], section 3.5]
      Packet Reordering        [Ref.[10], section 3.3]

   This document has the following frequently used acronyms:

      DUT   Device Under Test
      SUT   System Under Test

   This document adopts the definition format in Section 2 of RFC 1242
   [2].

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in BCP 14, RFC 2119 [5].
   RFC 2119 defines the use of these key words to help make the intent
   of standards track documents as clear as possible.  While this
   document uses these keywords, this document is not a standards track
   document.

3. Test Considerations

3.1. Paths

3.1.1 Path

   Definition:
      A unidirectional sequence of nodes, , and links with the
      following properties:

      a. R1 is the ingress node and forwards IP packets, which are
         input into the DUT/SUT, to R2 as sub-IP frames over link L12.

      b. Ri is a node which forwards data frames to R[i+1] over Link
         Li[i+1] for all i, 1