Network Working Group                                    Rajiv Papneja
Internet Draft                                                 Isocore
Intended Status: Informational                             S. Vapiwala
Expires: April 3, 2009                                      J. Karthik
                                                         Cisco Systems
                                                           S. Poretsky
                                                                 Allot
                                                                S. Rao
                                                  Qwest Communications
                                                    Jean-Louis Le Roux
                                                        France Telecom
                                                      November 3, 2008


         Methodology for Benchmarking MPLS Protection Mechanisms
                 draft-ietf-bmwg-protection-meth-04.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on April 3, 2009.

Abstract

   This draft describes the methodology for benchmarking MPLS
   protection mechanisms for link and node protection as defined in
   [MPLS-FRR-EXT].  This document provides test methodologies and test
   bed setup for measuring failover times while considering all
   dependencies that may impact fast recovery of real-time services
   bound to MPLS-based traffic engineered tunnels.  The terms used in
   the procedures included in this document are defined in [TERM-ID].

Table of Contents

   1. Introduction
   2. Document Scope
   3. General reference sample topology
   4. Existing definitions
   5. Test Considerations
      5.1. Failover Events
      5.2. Failure Detection [TERM-ID]
      5.3. Use of Data Traffic for MPLS Protection benchmarking
      5.4. LSP and Route Scaling
      5.5. Selection of IGP
      5.6. Reversion [TERM-ID]
      5.7. Traffic Generation
      5.8. Motivation for Topologies
   6. Reference Test Setup
      6.1. Link Protection with 1 hop primary (from PLR) and 1 hop
           backup TE tunnels
      6.2. Link Protection with 1 hop primary (from PLR) and 2 hop
           backup TE tunnels
      6.3. Link Protection with 2+ hop (from PLR) primary and 1 hop
           backup TE tunnels
      6.4. Link Protection with 2+ hop (from PLR) primary and 2 hop
           backup TE tunnels
      6.5. Node Protection with 2 hop primary (from PLR) and 1 hop
           backup TE tunnels
      6.6. Node Protection with 2 hop primary (from PLR) and 2 hop
           backup TE tunnels
      6.7. Node Protection with 3+ hop primary (from PLR) and 1 hop
           backup TE tunnels
      6.8. Node Protection with 3+ hop primary (from PLR) and 2 hop
           backup TE tunnels
   7. Test Methodology
      7.1. Headend as PLR with link failure
      7.2. Mid-Point as PLR with link failure
      7.3. Headend as PLR with Node Failure
      7.4. Mid-Point as PLR with Node failure
      7.5. MPLS FRR Forwarding Performance Test cases
           7.5.1. PLR as Headend
           7.5.2. PLR as Mid-point
   8. Reporting Format
      Benchmarks
   9. Security Considerations
   10. IANA Considerations
   11. References
      11.1. Normative References
      11.2. Informative References
   Author's Addresses
   Intellectual Property Statement
   Disclaimer of Validity
   Copyright Statement
   12. Acknowledgments
   Appendix A: Fast Reroute Scalability Table
   Appendix B: Abbreviations

1. Introduction

   This draft describes the methodology for benchmarking MPLS-based
   protection mechanisms.  The new terminology that this document
   introduces is defined in [TERM-ID].

   MPLS-based protection mechanisms provide fast recovery of real-time
   services from planned or unplanned link or node failures.  MPLS
   protection mechanisms are generally deployed in a network
   infrastructure where MPLS is used for provisioning point-to-point
   traffic engineered tunnels (tunnels).  MPLS-based protection
   mechanisms promise to reduce the service disruption period by
   minimizing the recovery time from the most common failures.

   Generally, there are two factors impacting service availability:
   the frequency of failures and the duration for which the failures
   persist.  Failures can be classified further into two types:
   correlated and uncorrelated failures.  A correlated failure is the
   simultaneous occurrence of two or more failures.  A typical example
   is the failure of a logical resource (e.g., layer-2 links) caused
   by the failure of a common physical resource (e.g., a common
   interface) on which it relies.  Within the context of MPLS
   protection mechanisms, failures that arise due to Shared Risk Link
   Groups (SRLG) [MPLS-FRR-EXT] can be considered correlated failures.
   Not all correlated failures are predictable in advance, especially
   those caused by natural disasters.  Planned failures, on the other
   hand, are predictable.  Implementations should handle both types of
   failures and recover gracefully within a time frame acceptable for
   service assurance.  Hence, failover recovery time is one of the
   most important benchmarks that a service provider considers in
   choosing the building blocks for its network infrastructure.

   It is well known that network elements from different manufacturers
   behave differently to network failures, which impacts their ability
   to recover from the failures.  It is therefore imperative for
   network service providers to have a common benchmark that can be
   followed to understand the performance behaviors of network
   elements.

   Considering failover recovery an important parameter, the test
   methodology presented in this document considers the factors that
   may impact the failover times.  To benchmark the failover times,
   data plane traffic is used as defined in [IGP-METH].

   All benchmarking test cases defined in this document apply to both
   facility backup and local protection enabled in detour mode.  The
   test cases cover all possible failure scenarios, and the associated
   procedures benchmark the ability of the DUT to recover from
   failures within the target failover time.

2. Document Scope

   This document provides detailed test cases, along with different
   topologies and scenarios, that should be considered to effectively
   benchmark MPLS protection mechanisms and failover times.  Different
   failure scenarios and scaling considerations are also provided in
   this document, in addition to reporting formats for the observed
   results.

   Benchmarking of unexpected correlated failures is currently out of
   scope of this document.

3. General reference sample topology

   Figure 1 illustrates the basic reference testbed and is applicable
   to all the test cases defined in this document.  TG and TA
   represent a Traffic Generator and a Traffic Analyzer, respectively.
   A tester is connected to the DUT; it sends and receives IP traffic
   along the working path and runs protocol emulations simulating
   real-world peering scenarios.  The reference testbed is shown
   below:

               ---------------------------
              |     ------------|---------------
              |    |            |               |
           --------   --------   --------   --------   --------
       TG-| R1     |-| R2     |-| R3     | | R4     | | R5     |-TA
          |        |-|        |-|        |-|        |-|        |
           --------   --------   --------   --------   --------
              |    |           |               |
              |    |        --------           |
              |     -------| R6     |----------
               ------------|        |
                            --------

                    Fig. 1: Fast Reroute Topology

   The tester MUST record the number of lost, duplicate, and reordered
   packets.  It should further record arrival and departure times so
   that Failover Time, Additive Latency, and Reversion Time can be
   measured.  The tester may be a single device or a test system
   emulating all the different roles along a primary or backup path.

4. Existing definitions

   For the sake of clarity and continuity, this document adopts the
   template for definitions set out in Section 2 of RFC 1242.
   Definitions are indexed and grouped together in sections for ease
   of reference.  The terms used in this document are defined in
   detail in [TERM-ID].
   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC 2119.

   The reader is assumed to be familiar with the commonly used MPLS
   terminology, some of which is defined in [MPLS-FRR-EXT].

5. Test Considerations

   This section discusses the fundamentals of MPLS protection testing:

      - The types of network events that cause failover
      - Indications for failover
      - The use of data traffic
      - Traffic generation
      - LSP scaling
      - Reversion of LSP
      - IGP selection

5.1. Failover Events [TERM-ID]

   The failover to the backup tunnel is primarily triggered by either
   link or node failures observed downstream of the Point of Local
   Repair (PLR).  Some of these failure events are listed below.

   Link failure events:

      - Interface shutdown on PLR side with POS alarm
      - Interface shutdown on remote side with POS alarm
      - Interface shutdown on PLR side with RSVP hello enabled
      - Interface shutdown on remote side with RSVP hello enabled
      - Interface shutdown on PLR side with BFD
      - Interface shutdown on remote side with BFD
      - Fiber pull on the PLR side (both TX & RX, or just the TX)
      - Fiber pull on the remote side (both TX & RX, or just the RX)
      - Online insertion and removal (OIR) on PLR side
      - OIR on remote side
      - Sub-interface failure (e.g., shutting down a VLAN)
      - Parent interface shutdown (an interface bearing multiple
        sub-interfaces)

   Node failure events:

   A system reload is initiated either by a graceful shutdown or by a
   power failure.  A system crash refers to a software failure or an
   assert.

      - Reload protected node, with RSVP hello enabled
      - Crash protected node, with RSVP hello enabled
      - Reload protected node, with BFD enabled
      - Crash protected node, with BFD enabled

5.2. Failure Detection [TERM-ID]

   Link failure detection time depends on the link type and the
   failure detection protocols running.  For SONET/SDH, the alarm type
   (such as LOS, AIS, or RDI) can be used.  Other link types have
   layer-two alarms, but they may not provide a short enough failure
   detection time.  Ethernet-based links do not have layer-two failure
   indicators and therefore rely on layer-three signaling for failure
   detection.

   MPLS has several failure detection techniques, such as BFD or the
   use of RSVP hellos.  These methods can provide the layer-three
   failure indicators required by Ethernet-based links, or help
   improve failure detection time on other, non-Ethernet-based links.

   The test procedures in this document can be used for local or
   remote failure scenarios for comprehensive benchmarking and to
   evaluate failover performance independent of the failure detection
   techniques.

5.3. Use of Data Traffic for MPLS Protection benchmarking

   Currently, end customers use packet loss as a key metric for
   failover time.  Packet loss is an externally observable event and
   has a direct impact on customers' applications.  An MPLS protection
   mechanism is expected to minimize packet loss in the event of a
   failure.  For this reason, it is important to develop a standard
   router benchmarking methodology for measuring MPLS protection that
   uses packet loss as a metric.  At a known rate of forwarding,
   packet loss can be measured and the failover time can be
   determined, as sketched below.
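   As an illustration, the following minimal sketch (Python) derives
   failover time from measured packet loss at a known offered rate,
   matching the Packet-Based Loss Method of section 8.  The numbers in
   the example are illustrative, not normative.

      # Sketch: failover time from packet loss at a known rate
      # (Packet-Based Loss Method, also called Rate Derived method).

      def failover_time_ms(packets_lost: int,
                           offered_rate_pps: float) -> float:
          """Failover time in ms = lost packets / offered rate * 1000."""
          return packets_lost / offered_rate_pps * 1000.0

      # Example: 45,000 packets lost at an offered rate of 1,000,000
      # packets per second corresponds to a 45 ms failover time.
      print(failover_time_ms(45000, 1_000_000))  # -> 45.0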
   Measurement of control plane signaling to establish backup paths is
   not sufficient to verify failover.  Failover is best determined
   when packets are actually traversing the backup path.

   An additional benefit of using packet loss for the calculation of
   failover time is that it allows the use of a black-box test
   environment.  Data traffic is offered at line rate to the device
   under test (DUT), an emulated network failure event is induced, and
   packet loss is externally measured to calculate the failover time.
   This setup is independent of the DUT architecture.

   In addition, this methodology considers the packets in error and
   duplicate packets that could be generated during the failover
   process.  In scenarios where a separate measurement of packets in
   error and duplicate packets is difficult to obtain, these packets
   should be attributed to lost packets.

5.4. LSP and Route Scaling

   Failover time performance may vary with the number of established
   primary and backup tunnel label switched paths (LSPs) and installed
   routes.  However, the procedure outlined here should be used for
   any number of LSPs (L) and any number of routes protected by the
   PLR (R).  The values of L and R must be recorded.

5.5. Selection of IGP

   The underlying IGP could be ISIS-TE or OSPF-TE for the methodology
   proposed here.

5.6. Reversion [TERM-ID]

   Fast Reroute provides a method to restore traffic from a backup
   path to the original primary LSP upon recovery from the failure.
   This is referred to as Reversion, which can be implemented as
   Global Reversion or Local Reversion.  In all test cases listed
   here, Reversion should not produce any packet loss or any
   out-of-order or duplicate packets.  Each of the test cases in this
   methodology document provides a check to confirm that there is no
   packet loss.

5.7. Traffic Generation

   It is suggested that there be one or more traffic streams, as long
   as there is a steady and constant rate of flow for all the streams.
   In order to monitor the DUT performance for recovery times, a set
   of route prefixes should be advertised before traffic is sent.  The
   traffic should be configured towards these routes.

   A typical example would be configuring the traffic generator to
   send traffic to the first, middle, and last of the advertised
   routes, where first, middle, and last are the numerically smallest,
   median, and largest of the advertised prefixes, respectively; a
   sketch of this selection appears below.  Generating traffic to all
   of the prefixes reachable by the protected tunnel in a round-robin
   fashion (where the traffic is destined to all the prefixes, but to
   one prefix at a time in a cyclic manner) is not recommended.  If
   there are many prefixes reachable through the LSP, the time
   interval between two packets destined to one prefix may be
   significantly high and may be comparable with the failover time
   being measured, which does not aid in getting an accurate failover
   measurement.
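   A minimal sketch of this selection follows (Python; the advertised
   prefix set below is illustrative):

      # Sketch: pick the numerically smallest, median, and largest
      # advertised prefixes as traffic destinations (section 5.7).
      import ipaddress

      def pick_destinations(prefixes):
          """Return the (first, middle, last) advertised prefixes."""
          nets = sorted(ipaddress.ip_network(p) for p in prefixes)
          return nets[0], nets[len(nets) // 2], nets[-1]

      # Illustrative route set advertised by the tail end:
      advertised = ["10.0.%d.0/24" % i for i in range(100)]
      first, middle, last = pick_destinations(advertised)
      print(first, middle, last)  # 10.0.0.0/24 10.0.50.0/24 10.0.99.0/24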
5.8. Motivation for Topologies

   Given that the label stack depends on the following three entities,
   it is recommended that the benchmarking of failover time be
   performed on all eight topologies provided in section 6:

      - Type of protection (link vs. node)
      - Number of remaining hops of the primary tunnel from the PLR
      - Number of remaining hops of the backup tunnel from the PLR

6. Reference Test Setup

   In addition to the general reference topology shown in Figure 1,
   this section provides detailed insight into the various test setups
   that should be considered for comprehensively benchmarking the
   failover time, with the DUT in different roles along the primary
   tunnel.

   This section proposes a set of topologies that covers all the
   scenarios for local protection.  All eight topologies shown
   (Figures 2 through 9) can be mapped to the reference topology shown
   in Figure 1.  The topologies provided in sections 6.1 to 6.8 refer
   to the test bed required to benchmark failover time when the DUT is
   configured as a PLR in either a headend or a midpoint role.  The
   label stack provided with each topology is the stack at the PLR.
   The label stacks shown below each figure in sections 6.1 to 6.8
   assume that Penultimate Hop Popping (PHP) is enabled; an informal
   sketch of the pattern these label counts follow appears at the end
   of this section.

   Figures 2-9 use the following convention:

      a) HE is Headend
      b) TE is Tail-End
      c) MID is Mid point
      d) MP is Merge Point
      e) PLR is Point of Local Repair
      f) PRI is Primary
      g) BKP denotes Backup Node

6.1. Link Protection with 1 hop primary (from PLR) and 1 hop backup
     TE tunnels

        -------     --------  PRI   --------
       | R1    |   | R2     |      | R3     |
    TG-| HE    |---| MID    |------| TE     |-TA
       |       |   | PLR    |------|        |
        -------     --------  BKP   --------

                 Figure 2: Setup for section 6.1

   Traffic              No. of labels     No. of labels
                        before failure    after failure

   IP TRAFFIC (P-P)           0                 0
   Layer3 VPN (PE-PE)         1                 1
   Layer3 VPN (PE-P)          2                 2
   Layer2 VC (PE-PE)          1                 1
   Layer2 VC (PE-P)           2                 2
   Mid-point LSPs             0                 0

6.2. Link Protection with 1 hop primary (from PLR) and 2 hop backup
     TE tunnels

        -------     --------  PRI   --------
       | R1    |   | R2     |      | R3     |
    TG-| HE    |---| MID    |------| TE     |-TA
       |       |   | PLR    |      |        |
        -------     --------       --------
                       |BKP            |
                       |   --------    |
                       |  | R6     |   |
                        --| BKP    |---
                          | MID    |
                           --------

                 Figure 3: Setup for section 6.2

   Traffic              No. of labels     No. of labels
                        before failure    after failure

   IP TRAFFIC (P-P)           0                 1
   Layer3 VPN (PE-PE)         1                 2
   Layer3 VPN (PE-P)          2                 3
   Layer2 VC (PE-PE)          1                 2
   Layer2 VC (PE-P)           2                 3
   Mid-point LSPs             0                 1

6.3. Link Protection with 2+ hop (from PLR) primary and 1 hop backup
     TE tunnels

        --------    --------  PRI  --------  PRI  --------
       | R1     |  | R2     |     | R3     |     | R4     |
    TG-| HE     |--| MID    |-----| MID    |-----| TE     |-TA
       |        |  | PLR    |-----|        |     |        |
        --------    --------  BKP  --------       --------

                 Figure 4: Setup for section 6.3

   Traffic              No. of labels     No. of labels
                        before failure    after failure

   IP TRAFFIC (P-P)           1                 1
   Layer3 VPN (PE-PE)         2                 2
   Layer3 VPN (PE-P)          3                 3
   Layer2 VC (PE-PE)          2                 2
   Layer2 VC (PE-P)           3                 3
   Mid-point LSPs             1                 1
6.4. Link Protection with 2+ hop (from PLR) primary and 2 hop backup
     TE tunnels

        --------    --------  PRI  --------  PRI  --------
       | R1     |  | R2     |     | R3     |     | R4     |
    TG-| HE     |--| MID    |-----| MID    |-----| TE     |-TA
       |        |  | PLR    |     |        |     |        |
        --------    --------       --------       --------
                     BKP|              |
                        |  --------    |
                        | | R6     |   |
                         -| BKP    |---
                          | MID    |
                           --------

                 Figure 5: Setup for section 6.4

   Traffic              No. of labels     No. of labels
                        before failure    after failure

   IP TRAFFIC (P-P)           1                 2
   Layer3 VPN (PE-PE)         2                 3
   Layer3 VPN (PE-P)          3                 4
   Layer2 VC (PE-PE)          2                 3
   Layer2 VC (PE-P)           3                 4
   Mid-point LSPs             1                 2

6.5. Node Protection with 2 hop primary (from PLR) and 1 hop backup
     TE tunnels

        --------    --------  PRI  --------  PRI  --------
       | R1     |  | R2     |     | R3     |     | R4     |
    TG-| HE     |--| MID    |-----| MID    |-----| TE     |-TA
       |        |  | PLR    |     |        |     |        |
        --------    --------       --------       --------
                       |BKP                          |
                        -----------------------------

                 Figure 6: Setup for section 6.5

   Traffic              No. of labels     No. of labels
                        before failure    after failure

   IP TRAFFIC (P-P)           1                 0
   Layer3 VPN (PE-PE)         2                 1
   Layer3 VPN (PE-P)          3                 2
   Layer2 VC (PE-PE)          2                 1
   Layer2 VC (PE-P)           3                 2
   Mid-point LSPs             1                 0

6.6. Node Protection with 2 hop primary (from PLR) and 2 hop backup
     TE tunnels

        --------    --------  PRI  --------  PRI  --------
       | R1     |  | R2     |     | R3     |     | R4     |
    TG-| HE     |--| MID    |-----| MID    |-----| TE     |-TA
       |        |  | PLR    |     |        |     |        |
        --------    --------       --------       --------
                       |BKP                          |
                       |         --------            |
                       |        | R6     |           |
                        --------| BKP    |-----------
                                | MID    |
                                 --------

                 Figure 7: Setup for section 6.6

   Traffic              No. of labels     No. of labels
                        before failure    after failure

   IP TRAFFIC (P-P)           1                 1
   Layer3 VPN (PE-PE)         2                 2
   Layer3 VPN (PE-P)          3                 3
   Layer2 VC (PE-PE)          2                 2
   Layer2 VC (PE-P)           3                 3
   Mid-point LSPs             1                 1

6.7. Node Protection with 3+ hop primary (from PLR) and 1 hop backup
     TE tunnels

        ------   ------  PRI  ------  PRI  ------  PRI  ------
       | R1   | | R2   |     | R3   |     | R4   |     | R5   |
    TG-| HE   |-| MID  |-----| MID  |-----| MP   |-----| TE   |-TA
       |      | | PLR  |     |      |     |      |     |      |
        ------   ------       ------       ------       ------
                   |BKP                       |
                    --------------------------

                 Figure 8: Setup for section 6.7

   Traffic              No. of labels     No. of labels
                        before failure    after failure

   IP TRAFFIC (P-P)           1                 1
   Layer3 VPN (PE-PE)         2                 2
   Layer3 VPN (PE-P)          3                 3
   Layer2 VC (PE-PE)          2                 2
   Layer2 VC (PE-P)           3                 3
   Mid-point LSPs             1                 1

6.8. Node Protection with 3+ hop primary (from PLR) and 2 hop backup
     TE tunnels

        ------   ------  PRI  ------  PRI  ------  PRI  ------
       | R1   | | R2   |     | R3   |     | R4   |     | R5   |
    TG-| HE   |-| MID  |-----| MID  |-----| MP   |-----| TE   |-TA
       |      | | PLR  |     |      |     |      |     |      |
        ------   ------       ------       ------       ------
                   |BKP                       |
                   |        ------            |
                   |       | R6   |           |
                    -------| BKP  |-----------
                           | MID  |
                            ------

                 Figure 9: Setup for section 6.8

   Traffic              No. of labels     No. of labels
                        before failure    after failure

   IP TRAFFIC (P-P)           1                 2
   Layer3 VPN (PE-PE)         2                 3
   Layer3 VPN (PE-P)          3                 4
   Layer2 VC (PE-PE)          2                 3
   Layer2 VC (PE-P)           3                 4
   Mid-point LSPs             1                 2
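   The label counts tabulated in sections 6.1 through 6.8 follow a
   simple pattern.  The sketch below (Python) reproduces all eight
   tables; it assumes PHP, as stated at the beginning of this section,
   and is an informal aid rather than a normative rule:

      # Sketch: label-stack depth at the PLR for the topologies of
      # sections 6.1-6.8, assuming PHP.  Hop counts are from the PLR.

      def labels_at_plr(payload_labels, protection, primary_hops,
                        backup_hops):
          """payload_labels: 0 for IP (P-P) and mid-point LSPs, 1 for
          L3VPN/L2VC PE-PE, 2 for L3VPN/L2VC PE-P.  protection is
          "link" or "node".  Returns (before failure, after failure)."""
          # The primary tunnel label is PHP-popped if the PLR is the
          # penultimate hop of the primary LSP.
          before = payload_labels + (0 if primary_hops == 1 else 1)
          # After failover: the primary label is absent if the merge
          # point is the tail end; the backup label is PHP-popped if
          # the backup tunnel is a single hop.
          mp_is_tail = primary_hops == (1 if protection == "link" else 2)
          after = (payload_labels
                   + (0 if mp_is_tail else 1)
                   + (0 if backup_hops == 1 else 1))
          return before, after

      # Example, section 6.8 (node protection, 3+ hop primary, 2 hop
      # backup), Layer3 VPN (PE-PE):
      print(labels_at_plr(1, "node", 3, 2))  # -> (2, 3)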
7. Test Methodology

   The procedure described in this section can be applied to all eight
   base test cases and the associated topologies.  The backup as well
   as the primary tunnels are configured to be alike in terms of
   bandwidth usage.  In order to benchmark failover with all possible
   label stack depths applicable as seen with current deployments, it
   is suggested that the methodology include all the scenarios listed
   here.

7.1. Headend as PLR with link failure

   Objective

   To benchmark the MPLS failover time due to link failure events
   described in section 5.1, experienced by the DUT, which is the
   point of local repair (PLR).

   Test Setup

      - Select any one topology out of the eight in section 6.
      - Select an overlay technology for the FRR test, e.g., IGP, VPN,
        or VC.
      - The DUT will also have two interfaces connected to the traffic
        generator/analyzer.  (If the node downstream of the PLR is not
        a simulated node, then the ingress of the tunnel should have
        one link connected to the traffic generator, and the node
        downstream of the PLR or the egress of the tunnel should have
        a link connected to the traffic analyzer.)

   Test Configuration

   1. Configure the number of primaries on R2 and the backups on R2,
      as required by the topology selected.
   2. Advertise prefixes (as per the FRR scalability table described
      in Appendix A) from the tail end.

   Procedure (a sketch of automating steps 5-11 follows this list):

   1. Establish the primary LSP on R2, as required by the topology
      selected.
   2. Establish the backup LSP on R2, as required by the selected
      topology.
   3. Verify that the primary and backup LSPs are up and that the
      primary is protected.
   4. Verify that Fast Reroute protection is enabled and ready.
   5. Set up traffic streams as described in section 5.7.
   6. Send IP traffic at the maximum forwarding rate to the DUT.
   7. Verify that traffic is switched over the primary LSP.
   8. Trigger any choice of link failure as described in section 5.1.
   9. Verify that the primary tunnel and prefixes get mapped to the
      backup tunnels.
   10. Stop the traffic stream and measure the traffic loss.
   11. Failover time is calculated as defined in section 8, Reporting
       Format.
   12. Start the traffic stream again to verify reversion when the
       protected interface comes up.  Traffic loss should be 0 due to
       make-before-break or reversion.
   13. Re-enable the protected interface that was down (the node, in
       the case of NNHOP).
   14. Verify that the headend signals a new LSP and that protection
       is in place again.
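   The procedure above lends itself to automation.  The following is a
   minimal sketch (Python); the "tester" and "dut" objects and their
   methods are hypothetical placeholders for a traffic
   generator/analyzer API and a DUT control interface, not a real
   product interface:

      # Sketch: automating steps 5-11 of section 7.1.  Every method
      # called on "tester" and "dut" is a hypothetical hook; substitute
      # the interfaces actually available in the test bed.

      def measure_link_failover(tester, dut, rate_pps, duration_s=30):
          tester.start_traffic(rate_pps=rate_pps)        # steps 5-6
          assert dut.traffic_on_primary_lsp()            # step 7
          dut.trigger_link_failure()                     # step 8
          assert dut.prefixes_mapped_to_backup()         # step 9
          tester.wait(duration_s)
          stats = tester.stop_traffic()                  # step 10
          # Step 11: Packet-Based Loss Method of section 8.
          return stats.packets_lost / rate_pps * 1000.0  # milliseconds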
7.2. Mid-Point as PLR with link failure

   Objective

   To benchmark the MPLS failover time due to link failure events
   described in section 5.1, experienced by the device under test,
   which is the point of local repair (PLR).

   Test Setup

      - Select any one topology out of the eight in section 6.
      - Select the overlay technology for the FRR test as mid-point
        LSPs.
      - The DUT will also have two interfaces connected to the traffic
        generator.

   Test Configuration

   1. Configure the number of primaries on R1 and the backups on R2,
      as required by the topology selected.
   2. Advertise prefixes (as per the FRR scalability table described
      in Appendix A) from the tail end.

   Procedure

   1. Establish the primary LSP on R1, as required by the topology
      selected.
   2. Establish the backup LSP on R2, as required by the selected
      topology.
   3. Verify that the primary and backup LSPs are up and that the
      primary is protected.
   4. Verify Fast Reroute protection.
   5. Set up traffic streams as described in section 5.7.
   6. Send IP traffic at the maximum forwarding rate to the DUT.
   7. Verify that traffic is switched over the primary LSP.
   8. Trigger any choice of link failure as described in section 5.1.
   9. Verify that the primary tunnel and prefixes get mapped to the
      backup tunnels.
   10. Stop the traffic stream and measure the traffic loss.
   11. Failover time is calculated as defined in section 8, Reporting
       Format.
   12. Start the traffic stream again to verify reversion when the
       protected interface comes up.  Traffic loss should be 0 due to
       make-before-break or reversion.
   13. Re-enable the protected interface that was down (the node, in
       the case of NNHOP).
   14. Verify that the headend signals a new LSP and that protection
       is in place again.

7.3. Headend as PLR with Node Failure

   Objective

   To benchmark the MPLS failover time due to node failure events
   described in section 5.1, experienced by the device under test,
   which is the point of local repair (PLR).

   Test Setup

      - Select any one topology from sections 6.5 to 6.8.
      - Select an overlay technology for the FRR test, e.g., IGP, VPN,
        or VC.
      - The DUT will also have two interfaces connected to the traffic
        generator.

   Test Configuration

   1. Configure the number of primaries on R2 and the backups on R2,
      as required by the topology selected.
   2. Advertise prefixes (as per the FRR scalability table described
      in Appendix A) from the tail end.

   Procedure

   1. Establish the primary LSP on R2, as required by the topology
      selected.
   2. Establish the backup LSP on R2, as required by the selected
      topology.
   3. Verify that the primary and backup LSPs are up and that the
      primary is protected.
   4. Verify Fast Reroute protection.
   5. Set up traffic streams as described in section 5.7.
   6. Send IP traffic at the maximum forwarding rate to the DUT.
   7. Verify that traffic is switched over the primary LSP.
   8. Trigger any choice of node failure as described in section 5.1.
   9. Verify that the primary tunnel and prefixes get mapped to the
      backup tunnels.
   10. Stop the traffic stream and measure the traffic loss.
   11. Failover time is calculated as defined in section 8, Reporting
       Format.
   12. Start the traffic stream again to verify reversion when the
       protected node comes up.  Traffic loss should be 0 due to
       make-before-break or reversion.
   13. Boot the protected node that was down.
   14. Verify that the headend signals a new LSP and that protection
       is in place again.

7.4. Mid-Point as PLR with Node failure

   Objective

   To benchmark the MPLS failover time due to node failure events
   described in section 5.1, experienced by the device under test,
   which is the point of local repair (PLR).

   Test Setup

      - Select any one topology from sections 6.5 to 6.8.
      - Select the overlay technology for the FRR test as mid-point
        LSPs.
      - The DUT will also have two interfaces connected to the traffic
        generator.

   Test Configuration

   1. Configure the number of primaries on R1 and the backups on R2,
      as required by the topology selected.
   2. Advertise prefixes (as per the FRR scalability table described
      in Appendix A) from the tail end.

   Procedure

   1. Establish the primary LSP on R1, as required by the topology
      selected.
   2. Establish the backup LSP on R2, as required by the selected
      topology.
   3. Verify that the primary and backup LSPs are up and that the
      primary is protected.
   4. Verify Fast Reroute protection.
   5. Set up traffic streams as described in section 5.7.
   6. Send IP traffic at the maximum forwarding rate to the DUT.
   7. Verify that traffic is switched over the primary LSP.
   8. Trigger any choice of node failure as described in section 5.1.
   9. Verify that the primary tunnel and prefixes get mapped to the
      backup tunnels.
   10. Stop the traffic stream and measure the traffic loss.
   11. Failover time is calculated as defined in section 8, Reporting
       Format.
   12. Start the traffic stream again to verify reversion when the
       protected node comes up.  Traffic loss should be 0 due to
       make-before-break or reversion.
   13. Boot the protected node that was down.
   14. Verify that the headend signals a new LSP and that protection
       is in place again.

7.5. MPLS FRR Forwarding Performance Test cases

   For the following MPLS FRR forwarding performance benchmarking
   cases, test the maximum PPS rate allowed by the given hardware; a
   sketch of one way to search for that rate follows.  One may follow
   the procedure for determining MPLS forwarding performance defined
   in [MPLS-FORWARD].
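   The maximum loss-free PPS rate referenced in the procedures below
   can be found with a standard binary search over offered rates.  A
   minimal sketch follows (Python); "offer_traffic" is a hypothetical
   hook that offers traffic at a given rate and returns the number of
   packets lost:

      # Sketch: binary search for the maximum PPS forwarded without
      # loss.  offer_traffic(rate_pps) -> packets lost at that rate;
      # it is a hypothetical hook onto the traffic generator/analyzer.

      def max_lossless_pps(offer_traffic, line_rate_pps,
                           tolerance_pps=1000):
          low, high = 0, line_rate_pps
          while high - low > tolerance_pps:
              mid = (low + high) // 2
              if offer_traffic(mid) == 0:  # no loss: rate sustainable
                  low = mid
              else:                        # loss observed: back off
                  high = mid
          return low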
7.5.1. PLR as Headend

   Objective

   To benchmark the maximum rate (pps) on the PLR (as headend) over
   the primary FRR LSP and the backup LSP.

   Test Setup

      - Select any one topology out of the eight in section 6.
      - Select an overlay technology for the FRR test, e.g., IGP, VPN,
        or VC.
      - The DUT will also have two interfaces connected to the traffic
        generator/analyzer.  (If the node downstream of the PLR is not
        a simulated node, then the ingress of the tunnel should have
        one link connected to the traffic generator, and the node
        downstream of the PLR or the egress of the tunnel should have
        a link connected to the traffic analyzer.)

   Procedure

   1. Establish the primary LSP on R2, as required by the topology
      selected.
   2. Establish the backup LSP on R2, as required by the selected
      topology.
   3. Verify that the primary and backup LSPs are up and that the
      primary is protected.
   4. Verify that Fast Reroute protection is enabled and ready.
   5. Set up traffic streams as described in section 5.7.
   6. Send IP traffic at the maximum forwarding rate (pps) that the
      device under test supports over the primary LSP.
   7. Record the maximum PPS rate forwarded over the primary LSP.
   8. Stop the traffic stream.
   9. Trigger any choice of link failure as described in section 5.1.
   10. Verify that the primary tunnel and prefixes get mapped to the
       backup tunnels.
   11. Send IP traffic at the maximum forwarding rate (pps) that the
       device under test supports over the backup LSP.
   12. Record the maximum PPS rate forwarded over the backup LSP.

7.5.2. PLR as Mid-point

   Objective

   To benchmark the maximum rate (pps) on the PLR (as mid-point of the
   primary path and ingress of the backup path) over the primary FRR
   LSP and the backup LSP.

   Test Setup

      - Select any one topology out of the eight in section 6.
      - Select the overlay technology for the FRR test as mid-point
        LSPs.
      - The DUT will also have two interfaces connected to the traffic
        generator.

   Procedure

   1. Establish the primary LSP on R1, as required by the topology
      selected.
   2. Establish the backup LSP on R2, as required by the selected
      topology.
   3. Verify that the primary and backup LSPs are up and that the
      primary is protected.
   4. Verify that Fast Reroute protection is enabled and ready.
   5. Set up traffic streams as described in section 5.7.
   6. Send IP traffic at the maximum forwarding rate (pps) that the
      device under test supports over the primary LSP.
   7. Record the maximum PPS rate forwarded over the primary LSP.
   8. Stop the traffic stream.
   9. Trigger any choice of link failure as described in section 5.1.
   10. Verify that the primary tunnel and prefixes get mapped to the
       backup tunnels.
   11. Send IP traffic at the maximum forwarding rate (pps) that the
       device under test supports over the backup LSP.
   12. Record the maximum PPS rate forwarded over the backup LSP.

8. Reporting Format

   For each test, it is recommended that the results be reported in
   the following format.

   Parameter                           Units

   IGP used for the test               ISIS-TE / OSPF-TE
   Interface types                     GigE, POS, ATM, VLAN, etc.
   Packet sizes offered to the DUT     Bytes
   Forwarding rate                     Number of packets per second
   IGP routes advertised               Number of IGP routes
   RSVP hello timers configured        Milliseconds (if any)
   Number of FRR tunnels               Number of tunnels configured
   Number of VPN routes installed      Number of VPN routes
     on the headend
   Number of VC tunnels                Number of VC tunnels
   Number of BGP routes                BGP routes installed
   Number of mid-point tunnels         Number of tunnels
   Number of prefixes protected        Number of LSPs
     by primary
   Topology being used                 Section number and figure
                                       reference
   Failure event                       Event type

   Benchmarks

   Parameter                           Unit

   Minimum failover time               Milliseconds
   Mean failover time                  Milliseconds
   Maximum failover time               Milliseconds
   Minimum reversion time              Milliseconds
   Mean reversion time                 Milliseconds
   Maximum reversion time              Milliseconds

   The failover time suggested above is calculated using one of the
   following three methods:

   1. Packet-Based Loss Method (PBLM): (number of packets dropped /
      packets per second) * 1000 milliseconds.  This method could also
      be referred to as the Rate Derived method.

   2. Time-Based Loss Method (TBLM): This method relies on the ability
      of the traffic generators to provide statistics which reveal the
      duration of the failure in milliseconds based on when the packet
      loss occurred (the interval between non-zero packet loss and
      zero loss).

   3. Timestamp Based Method (TBM): This method of failover
      calculation is based on the timestamp that gets transmitted as
      payload in the packets originated by the generator.  The traffic
      analyzer records the timestamp of the last packet received
      before the failover event and the first packet after the
      failover, and derives the failover time from the difference
      between these two timestamps (a sketch of this method appears at
      the end of this section).  Note: The payload could also contain
      sequence numbers for out-of-order packet calculation and for
      detecting duplicate packets.

   Note: If the primary is configured to be dynamic, and if the
   primary is to reroute, make-before-break should occur from the
   backup that is in use to a new alternate primary.  If any packet
   loss is seen, it should be added to the failover time.
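   As an illustration of the Timestamp Based Method, the following is
   a minimal sketch (Python); it assumes, per the notes above, that
   the generator places a transmit timestamp and a sequence number in
   each packet's payload:

      # Sketch: Timestamp Based Method (TBM).  "received" is the
      # arrival-ordered list of (seq, tx_timestamp_ms) pairs read from
      # the payload of the packets that arrived at the analyzer.

      def tbm_failover_ms(received):
          """Failover time is the transmit-timestamp difference across
          the first sequence gap (last packet before the failover to
          the first packet after it)."""
          for (prev_seq, prev_ts), (seq, ts) in zip(received,
                                                    received[1:]):
              if seq != prev_seq + 1:  # gap => packets lost in failover
                  return ts - prev_ts
          return 0.0                   # no loss observed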
9. Security Considerations

   During the course of the test, the test topology must be
   disconnected from devices that may forward the test traffic into a
   production environment.  There are no specific security
   considerations within the scope of this document.

10. IANA Considerations

   There are no considerations for IANA at this time.

11. References

11.1. Normative References

   [MPLS-FRR-EXT]  Pan, P., Swallow, G., and Atlas, A., "Fast Reroute
                   Extensions to RSVP-TE for LSP Tunnels", RFC 4090.

11.2. Informative References

   [TERM-ID]       Poretsky, S., Papneja, R., Karthik, J., and
                   Vapiwala, S., "Benchmarking Terminology for
                   Protection Performance",
                   draft-ietf-bmwg-protection-term-05.txt,
                   work in progress.

   [IGP-METH]      Poretsky, S. and Imhoff, B., "Benchmarking
                   Methodology for IGP Data Plane Route Convergence",
                   draft-ietf-bmwg-igp-dataplane-conv-meth-16.txt,
                   work in progress.

   [MPLS-FORWARD]  Akhter, A. and Asati, R., "MPLS Forwarding
                   Benchmarking Methodology",
                   draft-ietf-bmwg-mpls-forwarding-meth-00.txt,
                   work in progress.

Author's Addresses

   Rajiv Papneja
   Isocore
   12359 Sunrise Valley Drive, STE 100
   Reston, VA 20190
   USA
   Phone: +1 703 860 9273
   Email: rpapneja@isocore.com

   Samir Vapiwala
   Cisco Systems
   300 Beaver Brook Road
   Boxborough, MA 01719
   USA
   Phone: +1 978 936 1484
   Email: svapiwal@cisco.com

   Jay Karthik
   Cisco Systems
   300 Beaver Brook Road
   Boxborough, MA 01719
   USA
   Phone: +1 978 936 0533
   Email: jkarthik@cisco.com

   Scott Poretsky
   Allot Communications
   67 South Bedford Street, Suite 400
   Burlington, MA 01803
   USA
   Phone: +1 508 309 2179
   Email: sporetsky@allot.com

   Shankar Rao
   Qwest Communications
   950 17th Street, Suite 1900
   Denver, CO 80210
   USA
   Phone: +1 303 437 6643
   Email: shankar.rao@qwest.com

   Jean-Louis Le Roux
   France Telecom
   2 av Pierre Marzin
   22300 Lannion
   France
   Phone: 00 33 2 96 05 30 20
   Email: jeanlouis.leroux@orange-ft.com

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.
12. Acknowledgments

   We would like to thank Jean Philip Vasseur for his invaluable input
   to the document, and Curtis Villamizar for his contribution in
   suggesting text on the definition of, and the need for
   benchmarking, correlated failures.  Additionally, we would like to
   thank Arun Gandhi, Amrit Hanspal, and Karu Ratnam for their input
   to the document.

Appendix A: Fast Reroute Scalability Table

   This section provides the recommended numbers for evaluating the
   scalability of fast reroute implementations.  It also recommends
   the typical numbers for IGP/VPNv4 prefixes, LSP tunnels, and VC
   entries.  Based on the features supported by the device under test,
   appropriate scaling limits can be used for the test bed.

   A 1. FRR IGP Table

   No. of Headend TE Tunnels    IGP Prefixes

   1                            100
   1                            500
   1                            1000
   1                            2000
   1                            5000
   2 (Load Balance)             100
   2 (Load Balance)             500
   2 (Load Balance)             1000
   2 (Load Balance)             2000
   2 (Load Balance)             5000
   100                          100
   500                          500
   1000                         1000
   2000                         2000

   A 2. FRR VPN Table

   No. of Headend TE Tunnels    VPNv4 Prefixes

   1                            100
   1                            500
   1                            1000
   1                            2000
   1                            5000
   1                            10000
   1                            20000
   1                            Max
   2 (Load Balance)             100
   2 (Load Balance)             500
   2 (Load Balance)             1000
   2 (Load Balance)             2000
   2 (Load Balance)             5000
   2 (Load Balance)             10000
   2 (Load Balance)             20000
   2 (Load Balance)             Max

   A 3. FRR Mid-Point LSP Table

   The number of mid-point TE LSPs could be configured at the
   recommended levels: 100, 500, 1000, 2000, or the maximum supported
   number.

   A 4. FRR VC Table

   No. of Headend TE Tunnels    VC entries

   1                            100
   1                            500
   1                            1000
   1                            2000
   1                            Max
   100                          100
   500                          500
   1000                         1000
   2000                         2000

Appendix B: Abbreviations

   BFD   - Bidirectional Forwarding Detection
   BGP   - Border Gateway Protocol
   CE    - Customer Edge
   DUT   - Device Under Test
   FRR   - Fast Reroute
   IGP   - Interior Gateway Protocol
   IP    - Internet Protocol
   LSP   - Label Switched Path
   MP    - Merge Point
   MPLS  - Multi Protocol Label Switching
   NNHOP - Next-Next Hop
   NHOP  - Next Hop
   OIR   - Online Insertion and Removal
   P     - Provider
   PE    - Provider Edge
   PHP   - Penultimate Hop Popping
   PLR   - Point of Local Repair
   RSVP  - Resource reSerVation Protocol
   SRLG  - Shared Risk Link Group
   TA    - Traffic Analyzer
   TE    - Traffic Engineering
   TG    - Traffic Generator
   VC    - Virtual Circuit
   VPN   - Virtual Private Network