Network Working Group                                         R. Papneja
Internet Draft                                                    Isocore
Intended status: Informational                                S. Vapiwala
Expires: August 2008                                           J. Karthik
                                                            Cisco Systems
                                                              S. Poretsky
                                                               Reef Point
                                                                   S. Rao
                                                     Qwest Communications
                                                      Jean-Louis Le Roux
                                                           France Telecom
                                                        February 19, 2008

         Methodology for Benchmarking MPLS Protection Mechanisms

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on August 19, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2008).

Abstract

   This draft describes a methodology for benchmarking MPLS protection
   mechanisms for link and node protection, as defined in RFC 4090,
   "Fast Reroute Extensions to RSVP-TE for LSP Tunnels".  The
   benchmarks and the terms used in the procedures in this document
   are defined in the companion terminology document,
   draft-ietf-bmwg-protection-term.  This document provides test
   methodologies and test-bed setups for measuring failover times,
   taking into account the dependencies that might impact recovery of
   real-time services carried over an MPLS-based primary tunnel.

Table of Contents

   1. Introduction
   2. Existing Definitions
   3. Test Considerations
      3.1. Failover Events
      3.2. Failure Detection [TERM-ID]
      3.3. Use of Data Traffic for MPLS Protection Benchmarking
      3.4. LSP and Route Scaling
      3.5. Selection of IGP
      3.6. Reversion [TERM-ID]
      3.7. Traffic Generation
      3.8. Motivation for Topologies
   4. Test Setup
      4.1. Link Protection with 1 hop primary (from PLR) and 1 hop
           backup TE tunnels
      4.2. Link Protection with 1 hop primary (from PLR) and 2 hop
           backup TE tunnels
      4.3. Link Protection with 2+ hop (from PLR) primary and 1 hop
           backup TE tunnels
      4.4. Link Protection with 2+ hop (from PLR) primary and 2 hop
           backup TE tunnels
      4.5. Node Protection with 2 hop primary (from PLR) and 1 hop
           backup TE tunnels
      4.6. Node Protection with 2 hop primary (from PLR) and 2 hop
           backup TE tunnels
      4.7. Node Protection with 3+ hop primary (from PLR) and 1 hop
           backup TE tunnels
      4.8. Node Protection with 3+ hop primary (from PLR) and 2 hop
           backup TE tunnels
   5. Test Methodology
      5.1. Headend as PLR with Link Failure
      5.2. Mid-Point as PLR with Link Failure
      5.3. Headend as PLR with Node Failure
      5.4. Mid-Point as PLR with Node Failure
      5.5. MPLS FRR Forwarding Performance Test Cases
           5.5.1. PLR as Headend
           5.5.2. PLR as Mid-point
   6. Reporting Format
   7. IANA Considerations
   8. Security Considerations
   9. Acknowledgements
   10. References
       10.1. Normative References
       10.2. Informative References
   11. Authors' Addresses
   Intellectual Property Statement
   Appendix A: Fast Reroute Scalability Table
   Appendix B: Abbreviations

1. Introduction

   This draft describes the methodology for benchmarking MPLS-based
   protection mechanisms.  The new terminology that it introduces is
   defined in [TERM-ID].

   MPLS-based protection mechanisms provide fast recovery of real-time
   services after an unplanned link or node failure in the network
   core, where MPLS is used as a signaling protocol to set up
   point-to-point traffic-engineered tunnels.  MPLS-based protection
   mechanisms improve service availability by minimizing the duration
   of the most common failures.  Two factors generally determine
   service availability: the frequency of failures and their duration.

   Unexpected correlated failures are less common.  A correlated
   failure is the simultaneous occurrence of two or more failures.
   Such failures are often observed when two or more logical resources
   (e.g., layer-2 links) that rely on a common physical resource
   (e.g., common transport) fail together.  Common transport may
   include TDM and WDM links providing multiplexing at layer 2 and
   layer 1.  Within the context of MPLS protection mechanisms, Shared
   Risk Link Groups [MPLS-FRR-EXT] encompass correlated failures.

   Not all correlated failures can be anticipated in advance of their
   occurrence; failures due to natural disasters are the most notable
   example.
   Because such failures do occur, implementations must handle these
   faults gracefully and recover the affected services quickly.  Some
   routers recover faster than others, so benchmarking this type of
   failure is useful.  Benchmarking of unexpected correlated failures
   should include measurement of restoration with and without the
   availability of IP fallback.  This document provides detailed test
   cases focused on benchmarking MPLS protection mechanisms;
   benchmarking of unexpected correlated failures is out of scope of
   this document.

   A link or node failure may occur either at the head-end or at a
   mid-point node of a primary tunnel.  The backup tunnel may offer
   either link or node protection following a failure along the path
   of the primary tunnel.  The time taken to transition primary tunnel
   traffic onto the backup tunnel is a key measurement for validating
   service level agreements.  Failover time depends upon many factors,
   such as the number of prefixes bound to a tunnel, the services
   (such as IGP, BGP, Layer 3/Layer 2 VPNs) bound to the tunnel, the
   number of primary tunnels affected by the failure event, the number
   of primary tunnels protected by the backup, the type of failure,
   and the physical media on which the failover occurs.  This document
   describes the topologies and scenarios that should be considered to
   effectively benchmark MPLS protection mechanisms and failover
   times, together with failure scenarios, scaling considerations, and
   a reporting format for the observed results.

   To benchmark the failover time, data plane traffic is used as
   defined in [IGP-METH].  Traffic loss is the key component in a
   black-box type test and is used to measure convergence.

   All benchmarking test cases defined in this document apply to both
   facility backup and local protection enabled in detour mode.  The
   test cases cover all possible failure scenarios, and the associated
   procedures benchmark the ability of the DUT to recover from
   failures within a target failover time.

   Figure 1 represents the basic reference test bed and is applicable
   to all the test cases defined in this document.  TG and TA
   represent the Traffic Generator and Traffic Analyzer, respectively.
   The tester is connected to the DUT; it sends and receives IP
   traffic along the working path and runs protocol emulations to
   simulate real-world peering scenarios.

         ---------------------------
        |        ------------|---------------
        |       |            |               |
     --------  --------   --------   --------   --------
  TG-|  R1  |--|  R2  |---|  R3  |  |  R4  |   |  R5  |-TA
     |      |--|      |---|      |--|      |---|      |
     --------  --------   --------   --------   --------
        |         |                     |          |
        |         |                     |          |
        |        --------               |          |
         -------|  R6   |---------------           |
                |       |--------------------------
                 --------

                  Figure 1: Fast Reroute Topology

   The tester MUST record the number of lost, duplicate, and reordered
   packets.  It should further record arrival and departure times so
   that Failover Time, Additive Latency, and Reversion Time can be
   measured.  The tester may be a single device or a test system
   emulating all the different roles along a primary or backup path.
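   The packet accounting above can be post-processed offline.  The
   following informative sketch assumes the tester exports per-packet
   sequence numbers and receive timestamps; the record format and
   function name are illustrative, not part of this methodology.

      def account(received):
          """Classify received packets by sequence number.

          'received' is an iterable of (seq, rx_time) tuples in
          arrival order; sequence numbers start at 0 and increase
          by 1 at the sender.  Returns (lost, duplicates, reordered).
          """
          seen = set()
          duplicates = 0
          reordered = 0
          highest = -1
          for seq, _ in received:
              if seq in seen:
                  duplicates += 1
                  continue
              seen.add(seq)
              if seq < highest:
                  reordered += 1      # arrived after a later packet
              else:
                  highest = seq
          # every sequence number up to the highest observed that
          # never arrived is counted as lost
          lost = highest + 1 - len(seen)
          return lost, duplicates, reordered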
2. Existing Definitions

   For the sake of clarity and continuity, this document adopts the
   template for definitions set out in Section 2 of RFC 1242.
   Definitions are indexed and grouped together in sections for ease
   of reference.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC 2119
   [RFC-WORDS].

   The reader is assumed to be familiar with the commonly used MPLS
   terminology, some of which is defined in [MPLS-FRR-EXT].

3. Test Considerations

   This section discusses the fundamentals of MPLS protection testing:

   - The types of network events that cause failover
   - Indications of failover
   - The use of data traffic
   - Traffic generation
   - LSP scaling
   - Reversion of the LSP
   - IGP selection

3.1. Failover Events

   Failover to the backup tunnel is primarily triggered by link or
   node failures observed downstream of the Point of Local Repair
   (PLR).  Some of these failure events are listed below.

   Link failure events:

   - Interface shutdown on PLR side with POS alarm
   - Interface shutdown on remote side with POS alarm
   - Interface shutdown on PLR side with RSVP hello
   - Interface shutdown on remote side with RSVP hello
   - Interface shutdown on PLR side with BFD
   - Interface shutdown on remote side with BFD
   - Fiber pull on the PLR side (both TX & RX, or just the TX)
   - Fiber pull on the remote side (both TX & RX, or just the RX)
   - Online insertion and removal (OIR) on PLR side
   - OIR on remote side
   - Sub-interface failure (e.g., shutting down a VLAN)
   - Parent interface shutdown (an interface bearing multiple
     sub-interfaces)

   Node failure events:

   A system reload is initiated either by a graceful shutdown or by a
   power failure.  A system crash refers to a software failure or an
   assert.

   - Reload protected node, with RSVP Hello enabled
   - Crash protected node, with RSVP Hello enabled
   - Reload protected node, with BFD enabled
   - Crash protected node, with BFD enabled

3.2. Failure Detection [TERM-ID]

   Local failures can be detected via SONET/SDH failure on a directly
   connected LSR.  The failure indication may vary with the type of
   alarm: LOS, AIS, or RDI.  Failures on Ethernet links, such as
   Gigabit Ethernet, rely upon Layer 3 signaling for failure
   indication, since Ethernet has no Layer 2 failure indication
   mechanism.

   Different MPLS protection mechanisms and different implementations
   use different failure detection techniques, such as RSVP hellos or
   BFD.  The failure detection time is not always negligible and can
   contribute to the overall failover time.

   The test procedures in this document can be used for local or
   remote failure scenarios, for comprehensive benchmarking and to
   evaluate failover performance independent of the failure detection
   technique.
MPLS protection mechanism is 304 expected to minimize the packet loss in the event of a failure. For 305 this reason it is important to develop a standard router benchmarking 306 methodology for measuring MPLS protection that uses packet loss as a 307 metric. At a known rate of forwarding, packet loss can be measured 308 and the Failover time can be determined. Measurement of control plane 309 signaling to establish backup paths is not enough to verify failover. 310 Failover is best determined when packets are actually traversing the 311 backup path. 313 An additional benefit of using packet loss for calculation of 314 Failover time is that it allows use of a black-box tests environment. 315 Data traffic is offered at line-rate to the device under test (DUT), 316 and an emulated network failure event is forced to occur, and packet 317 loss is externally measured to calculate the convergence time. This 318 setup is independent of the DUT architecture. 320 In addition, this methodology considers the packets in error and 321 duplicate packets that could have been generated during the failover 322 process. In scenarios, where separate measurement of packets in error 323 and duplicate packets is difficult to obtain, these packets should be 324 attributed to lost packets. 326 3.4. LSP and Route Scaling 328 Failover time performance may vary with the number of established 329 primary and backup tunnels (LSP) and installed routes. However the 330 procedure outlined here should be used for any number of LSPs (L) and 331 number of routes protected by PLR(R). Number of L and R must be 332 recorded. 334 3.5. Selection of IGP 336 The underlying IGP could be ISIS-TE or OSPF-TE for the methodology 337 proposed here. 339 Poretsky, Rao, Le Roux 340 Protection Mechanisms 342 3.6. Reversion [TERM-ID] 344 Fast Reroute provides a method to return or restore a backup path to 345 original primary LSP upon recovery from the failure. This is referred 346 to as Reversion, which can be implemented as Global Reversion or 347 Local Reversion. In all test cases listed here Reversion should not 348 produce any packet loss, out of order or duplicate packets. Each of 349 the test cases in this methodology document provides a check to 350 confirm that there is no packet loss. 352 3.7. Traffic generation 354 It is suggested that there be one or more traffic streams as long as 355 there is a steady and constant rate of flow for all the streams. In 356 order to monitor the DUT performance for recovery times a set of 357 route prefixes should be advertised before traffic is sent. The 358 traffic should be configured towards these routes. 360 A typical example would be configuring the traffic generator to send 361 the traffic to the first, middle and last of the advertised routes. 362 (First, middle and last could be decided by the numerically smallest, 363 median and the largest respectively of the advertised prefix). 364 Generating traffic to all of the prefixes reachable by the protected 365 tunnel (probably in a Round-Robin fashion, where the traffic is 366 destined to all the prefixes but one prefix at a time in a cyclic 367 manner) is not recommended. 
3.8. Motivation for Topologies

   Given that the label stack depends on the following three entities,
   it is recommended that failover time be benchmarked on all eight
   topologies provided in Section 4:

   - Type of protection (link vs. node)
   - Number of remaining hops of the primary tunnel from the PLR
   - Number of remaining hops of the backup tunnel from the PLR

4. Test Setup

   This section proposes a set of topologies for benchmarking failover
   time that covers all the scenarios for local protection.  All eight
   topologies (Figures 2 through 9) can be mapped to the reference
   topology shown in Figure 1.  The topologies in Sections 4.1 to 4.8
   describe the test bed required to benchmark failover time when the
   DUT is configured as a PLR in either a head-end or mid-point role.
   The label stack provided with each topology is the stack as seen at
   the PLR.

   The label stacks shown below each figure in Sections 4.1 to 4.8
   assume that Penultimate Hop Popping (PHP) is enabled.

   Figures 2 through 9 use the following conventions:

   a) HE is Head-End
   b) TE is Tail-End
   c) MID is Mid-point
   d) MP is Merge Point
   e) PLR is Point of Local Repair
   f) PRI is Primary
   g) BKP denotes Backup Node

4.1. Link Protection with 1 hop primary (from PLR) and 1 hop backup
     TE tunnels

       --------    --------  PRI   --------
       |  R1  |    |  R2  |--------|  R3  |
    TG-|  HE  |----|  MID |        |  TE  |-TA
       |      |    |  PLR |--------|      |
       --------    --------  BKP   --------

              Figure 2: Setup for Section 4.1

   Traffic              Number of labels   Number of labels
                        before failure     after failure
   IP TRAFFIC (P-P)            0                  0
   Layer3 VPN (PE-PE)          1                  1
   Layer3 VPN (PE-P)           2                  2
   Layer2 VC (PE-PE)           1                  1
   Layer2 VC (PE-P)            2                  2
   Mid-point LSPs              0                  0

4.2. Link Protection with 1 hop primary (from PLR) and 2 hop backup
     TE tunnels

       --------    --------  PRI   --------
       |  R1  |    |  R2  |--------|  R3  |
    TG-|  HE  |----|  MID |        |  TE  |-TA
       |      |    |  PLR |        |      |
       --------    --------        --------
                    BKP |              |
                        |  --------    |
                        | |  R6   |    |
                         -|  BKP  |----
                          |  MID  |
                           --------

              Figure 3: Setup for Section 4.2

   Traffic              Number of labels   Number of labels
                        before failure     after failure
   IP TRAFFIC (P-P)            0                  1
   Layer3 VPN (PE-PE)          1                  2
   Layer3 VPN (PE-P)           2                  3
   Layer2 VC (PE-PE)           1                  2
   Layer2 VC (PE-P)            2                  3
   Mid-point LSPs              0                  1

4.3. Link Protection with 2+ hop (from PLR) primary and 1 hop backup
     TE tunnels

       --------    --------  PRI   --------  PRI   --------
       |  R1  |    |  R2  |--------|  R3  |--------|  R4  |
    TG-|  HE  |----|  MID |        |  MID |        |  TE  |-TA
       |      |    |  PLR |--------|      |        |      |
       --------    --------  BKP   --------        --------

              Figure 4: Setup for Section 4.3

   Traffic              Number of labels   Number of labels
                        before failure     after failure
   IP TRAFFIC (P-P)            1                  1
   Layer3 VPN (PE-PE)          2                  2
   Layer3 VPN (PE-P)           3                  3
   Layer2 VC (PE-PE)           2                  2
   Layer2 VC (PE-P)            3                  3
   Mid-point LSPs              1                  1
4.4. Link Protection with 2+ hop (from PLR) primary and 2 hop backup
     TE tunnels

       --------    --------  PRI   --------  PRI   --------
       |  R1  |    |  R2  |--------|  R3  |--------|  R4  |
    TG-|  HE  |----|  MID |        |  MID |        |  TE  |-TA
       |      |    |  PLR |        |      |        |      |
       --------    --------        --------        --------
                    BKP |              |
                        |  --------    |
                        | |  R6   |    |
                         -|  BKP  |----
                          |  MID  |
                           --------

              Figure 5: Setup for Section 4.4

   Traffic              Number of labels   Number of labels
                        before failure     after failure
   IP TRAFFIC (P-P)            1                  2
   Layer3 VPN (PE-PE)          2                  3
   Layer3 VPN (PE-P)           3                  4
   Layer2 VC (PE-PE)           2                  3
   Layer2 VC (PE-P)            3                  4
   Mid-point LSPs              1                  2

4.5. Node Protection with 2 hop primary (from PLR) and 1 hop backup
     TE tunnels

       --------    --------  PRI   --------  PRI   --------
       |  R1  |    |  R2  |--------|  R3  |--------|  R4  |
    TG-|  HE  |----|  MID |        |  MID |        |  TE  |-TA
       |      |    |  PLR |        |      |        |      |
       --------    --------        --------        --------
                    BKP |                              |
                         ------------------------------

              Figure 6: Setup for Section 4.5

   Traffic              Number of labels   Number of labels
                        before failure     after failure
   IP TRAFFIC (P-P)            1                  0
   Layer3 VPN (PE-PE)          2                  1
   Layer3 VPN (PE-P)           3                  2
   Layer2 VC (PE-PE)           2                  1
   Layer2 VC (PE-P)            3                  2
   Mid-point LSPs              1                  0

4.6. Node Protection with 2 hop primary (from PLR) and 2 hop backup
     TE tunnels

       --------    --------  PRI   --------  PRI   --------
       |  R1  |    |  R2  |--------|  R3  |--------|  R4  |
    TG-|  HE  |----|  MID |        |  MID |        |  TE  |-TA
       |      |    |  PLR |        |      |        |      |
       --------    --------        --------        --------
                    BKP |                              |
                        |          --------            |
                        |         |  R6   |            |
                         ---------|  BKP  |------------
                                  |  MID  |
                                   --------

              Figure 7: Setup for Section 4.6

   Traffic              Number of labels   Number of labels
                        before failure     after failure
   IP TRAFFIC (P-P)            1                  1
   Layer3 VPN (PE-PE)          2                  2
   Layer3 VPN (PE-P)           3                  3
   Layer2 VC (PE-PE)           2                  2
   Layer2 VC (PE-P)            3                  3
   Mid-point LSPs              1                  1

4.7. Node Protection with 3+ hop primary (from PLR) and 1 hop backup
     TE tunnels

      -------   -------- PRI  -------- PRI  -------- PRI  --------
      |  R1 |   |  R2  |      |  R3  |      |  R4  |      |  R5  |
   TG-|  HE |---|  MID |------|  MID |------|  MP  |------|  TE  |-TA
      |     |   |  PLR |      |      |      |      |      |      |
      -------   --------      --------      --------      --------
                 BKP |                          |
                      --------------------------

              Figure 8: Setup for Section 4.7

   Traffic              Number of labels   Number of labels
                        before failure     after failure
   IP TRAFFIC (P-P)            1                  1
   Layer3 VPN (PE-PE)          2                  2
   Layer3 VPN (PE-P)           3                  3
   Layer2 VC (PE-PE)           2                  2
   Layer2 VC (PE-P)            3                  3
   Mid-point LSPs              1                  1

4.8. Node Protection with 3+ hop primary (from PLR) and 2 hop backup
     TE tunnels

      -------   -------- PRI  -------- PRI  -------- PRI  --------
      |  R1 |   |  R2  |      |  R3  |      |  R4  |      |  R5  |
   TG-|  HE |---|  MID |------|  MID |------|  MP  |------|  TE  |-TA
      |     |   |  PLR |      |      |      |      |      |      |
      -------   --------      --------      --------      --------
                 BKP |                          |
                     |        --------          |
                     |       |  R6   |          |
                      -------|  BKP  |----------
                             |  MID  |
                              --------

              Figure 9: Setup for Section 4.8

   Traffic              Number of labels   Number of labels
                        before failure     after failure
   IP TRAFFIC (P-P)            1                  2
   Layer3 VPN (PE-PE)          2                  3
   Layer3 VPN (PE-P)           3                  4
   Layer2 VC (PE-PE)           2                  3
   Layer2 VC (PE-P)            3                  4
   Mid-point LSPs              1                  2
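   The eight label-stack tables above share a regular structure: each
   service overlay adds a fixed label depth on top of the IP
   traffic/mid-point LSP baseline, identically before and after
   failover.  The following informative sketch encodes the tables for
   sanity-checking a test plan; the topology keys and overlay names
   are illustrative shorthand for the sections and rows above.

      # Baseline (before, after) label counts at the PLR for the
      # IP traffic / mid-point LSP rows of Sections 4.1-4.8,
      # assuming PHP is enabled.
      BASELINE = {
          "4.1": (0, 0), "4.2": (0, 1), "4.3": (1, 1), "4.4": (1, 2),
          "4.5": (1, 0), "4.6": (1, 1), "4.7": (1, 1), "4.8": (1, 2),
      }

      # Labels each service overlay adds on top of the baseline.
      OVERLAY_DEPTH = {
          "IP (P-P)": 0, "L3VPN (PE-PE)": 1, "L3VPN (PE-P)": 2,
          "L2VC (PE-PE)": 1, "L2VC (PE-P)": 2, "Mid-point LSPs": 0,
      }

      def label_stack(topology, overlay):
          """Return (labels_before, labels_after) at the PLR."""
          before, after = BASELINE[topology]
          extra = OVERLAY_DEPTH[overlay]
          return before + extra, after + extra

      # Example: Layer 3 VPN (PE-P) over topology 4.4 -> (3, 4)
      print(label_stack("4.4", "L3VPN (PE-P)"))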
5. Test Methodology

   The procedure described in this section can be applied to all eight
   base test cases and the associated topologies.  The backup and
   primary tunnels are configured to be alike in terms of bandwidth
   usage.  In order to benchmark failover with every label stack depth
   seen in current deployments, it is suggested that the methodology
   include all the scenarios listed here.

5.1. Headend as PLR with Link Failure

   Objective

   To benchmark the MPLS failover time due to the link failure events
   described in Section 3.1, as experienced by the DUT, which is the
   Point of Local Repair (PLR).

   Test Setup

   - Select any one topology out of the eight in Section 4.
   - Select an overlay technology for the FRR test, e.g., IGP, VPN,
     or VC.
   - The DUT will also have two interfaces connected to the traffic
     generator/analyzer.  (If the node downstream of the PLR is not a
     simulated node, then the ingress of the tunnel should have one
     link connected to the traffic generator, and the node downstream
     of the PLR or the egress of the tunnel should have a link
     connected to the traffic analyzer.)

   Test Configuration

   1. Configure the number of primaries on R2 and the backups on R2
      as required by the topology selected.
   2. Advertise prefixes (as per the FRR scalability table described
      in Appendix A) from the tail end.

   Procedure

   1. Establish the primary LSP on R2 as required by the topology
      selected.
   2. Establish the backup LSP on R2 as required by the selected
      topology.
   3. Verify that the primary and backup LSPs are up and that the
      primary is protected.
   4. Verify that Fast Reroute protection is enabled and ready.
   5. Set up traffic streams as described in Section 3.7.
   6. Send IP traffic at maximum forwarding rate to the DUT.
   7. Verify that traffic is switched over the primary LSP.
   8. Trigger any choice of link failure as described in Section 3.1.
   9. Verify that the primary tunnel and prefixes get mapped to the
      backup tunnel.
   10. Stop the traffic stream and measure the traffic loss.
   11. Failover time is calculated as defined in Section 6, Reporting
       Format.
   12. Start the traffic stream again to verify reversion when the
       protected interface comes up.  Traffic loss should be 0 due to
       make-before-break or reversion.
   13. Enable the protected interface that was down (the node, in the
       case of NNHOP).
   14. Verify that the head-end signals a new LSP and that protection
       is in place again.
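   Steps 6 through 11 lend themselves to automation.  The sketch below
   outlines the measurement loop against a hypothetical tester/DUT
   API; the 'tester' and 'dut' objects and their methods are
   illustrative placeholders, not a real product interface.  The same
   loop applies unchanged to the procedures in Sections 5.2 to 5.4.

      import time

      def run_failover_trial(tester, dut, offered_pps, hold_s=30):
          """Steps 6-11 of Section 5.1: offer traffic, trigger the
          failure, and derive failover time from packet loss
          (Packet-Based Loss Method, Section 6)."""
          tester.start_traffic(rate_pps=offered_pps)     # step 6
          assert tester.traffic_on_primary()             # step 7
          time.sleep(hold_s)                             # steady state
          dut.trigger_link_failure()                     # step 8
          assert dut.primary_mapped_to_backup()          # step 9
          tester.stop_traffic()                          # step 10
          lost = tester.packets_lost()
          # step 11: failover time in milliseconds (PBLM)
          return lost / offered_pps * 1000.0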
5.2. Mid-Point as PLR with Link Failure

   Objective

   To benchmark the MPLS failover time due to the link failure events
   described in Section 3.1, as experienced by the device under test,
   which is the Point of Local Repair (PLR).

   Test Setup

   - Select any one topology out of the eight in Section 4.
   - Select mid-point LSPs as the overlay technology for the FRR test.
   - The DUT will also have two interfaces connected to the traffic
     generator.

   Test Configuration

   1. Configure the number of primaries on R1 and the backups on R2
      as required by the topology selected.
   2. Advertise prefixes (as per the FRR scalability table described
      in Appendix A) from the tail end.

   Procedure

   1. Establish the primary LSP on R1 as required by the topology
      selected.
   2. Establish the backup LSP on R2 as required by the selected
      topology.
   3. Verify that the primary and backup LSPs are up and that the
      primary is protected.
   4. Verify Fast Reroute protection.
   5. Set up traffic streams as described in Section 3.7.
   6. Send IP traffic at maximum forwarding rate to the DUT.
   7. Verify that traffic is switched over the primary LSP.
   8. Trigger any choice of link failure as described in Section 3.1.
   9. Verify that the primary tunnel and prefixes get mapped to the
      backup tunnel.
   10. Stop the traffic stream and measure the traffic loss.
   11. Failover time is calculated as defined in Section 6, Reporting
       Format.
   12. Start the traffic stream again to verify reversion when the
       protected interface comes up.  Traffic loss should be 0 due to
       make-before-break or reversion.
   13. Enable the protected interface that was down (the node, in the
       case of NNHOP).
   14. Verify that the head-end signals a new LSP and that protection
       is in place again.

5.3. Headend as PLR with Node Failure

   Objective

   To benchmark the MPLS failover time due to the node failure events
   described in Section 3.1, as experienced by the device under test,
   which is the Point of Local Repair (PLR).

   Test Setup

   - Select any one topology from Sections 4.5 to 4.8.
   - Select an overlay technology for the FRR test, e.g., IGP, VPN,
     or VC.
   - The DUT will also have two interfaces connected to the traffic
     generator.

   Test Configuration

   1. Configure the number of primaries on R2 and the backups on R2
      as required by the topology selected.
   2. Advertise prefixes (as per the FRR scalability table described
      in Appendix A) from the tail end.

   Procedure

   1. Establish the primary LSP on R2 as required by the topology
      selected.
   2. Establish the backup LSP on R2 as required by the selected
      topology.
   3. Verify that the primary and backup LSPs are up and that the
      primary is protected.
   4. Verify Fast Reroute protection.
   5. Set up traffic streams as described in Section 3.7.
   6. Send IP traffic at maximum forwarding rate to the DUT.
   7. Verify that traffic is switched over the primary LSP.
   8. Trigger any choice of node failure as described in Section 3.1.
   9. Verify that the primary tunnel and prefixes get mapped to the
      backup tunnel.
   10. Stop the traffic stream and measure the traffic loss.
   11. Failover time is calculated as defined in Section 6, Reporting
       Format.
   12. Start the traffic stream again to verify reversion when the
       protected interface comes up.  Traffic loss should be 0 due to
       make-before-break or reversion.
   13. Boot the protected node that was down.
   14. Verify that the head-end signals a new LSP and that protection
       is in place again.
5.4. Mid-Point as PLR with Node Failure

   Objective

   To benchmark the MPLS failover time due to the node failure events
   described in Section 3.1, as experienced by the device under test,
   which is the Point of Local Repair (PLR).

   Test Setup

   - Select any one topology from Sections 4.5 to 4.8.
   - Select mid-point LSPs as the overlay technology for the FRR test.
   - The DUT will also have two interfaces connected to the traffic
     generator.

   Test Configuration

   1. Configure the number of primaries on R1 and the backups on R2
      as required by the topology selected.
   2. Advertise prefixes (as per the FRR scalability table described
      in Appendix A) from the tail end.

   Procedure

   1. Establish the primary LSP on R1 as required by the topology
      selected.
   2. Establish the backup LSP on R2 as required by the selected
      topology.
   3. Verify that the primary and backup LSPs are up and that the
      primary is protected.
   4. Verify Fast Reroute protection.
   5. Set up traffic streams as described in Section 3.7.
   6. Send IP traffic at maximum forwarding rate to the DUT.
   7. Verify that traffic is switched over the primary LSP.
   8. Trigger any choice of node failure as described in Section 3.1.
   9. Verify that the primary tunnel and prefixes get mapped to the
      backup tunnel.
   10. Stop the traffic stream and measure the traffic loss.
   11. Failover time is calculated as defined in Section 6, Reporting
       Format.
   12. Start the traffic stream again to verify reversion when the
       protected interface comes up.  Traffic loss should be 0 due to
       make-before-break or reversion.
   13. Boot the protected node that was down.
   14. Verify that the head-end signals a new LSP and that protection
       is in place again.

5.5. MPLS FRR Forwarding Performance Test Cases

   The following MPLS FRR forwarding performance benchmarking cases
   test the maximum PPS rate allowed by the given hardware, over both
   the primary and the backup LSP (an informative rate-search sketch
   is given below).
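   Finding the maximum PPS rate is typically done with an iterative
   rate search.  The following is a minimal binary-search sketch; it
   assumes a 'loss_free(rate_pps)' callback that offers traffic at the
   given rate over the LSP under test and reports whether it was
   forwarded without loss.  The callback and the search resolution are
   illustrative.

      def max_pps(loss_free, line_rate_pps, resolution_pps=1000):
          """Binary search for the highest loss-free forwarding rate,
          to within 'resolution_pps'."""
          low, high = 0, line_rate_pps
          while high - low > resolution_pps:
              mid = (low + high) // 2
              if loss_free(mid):
                  low = mid       # mid pps forwarded cleanly; go higher
              else:
                  high = mid      # loss seen; back off
          return low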
5.5.1. PLR as Headend

   Objective

   To benchmark the maximum rate (pps) on the PLR (as head-end) over
   the primary FRR LSP and the backup LSP.

   Test Setup

   - Select any one topology out of the eight in Section 4.
   - Select an overlay technology for the FRR test, e.g., IGP, VPN,
     or VC.
   - The DUT will also have two interfaces connected to the traffic
     generator/analyzer.  (If the node downstream of the PLR is not a
     simulated node, then the ingress of the tunnel should have one
     link connected to the traffic generator, and the node downstream
     of the PLR or the egress of the tunnel should have a link
     connected to the traffic analyzer.)

   Procedure

   1. Establish the primary LSP on R2 as required by the topology
      selected.
   2. Establish the backup LSP on R2 as required by the selected
      topology.
   3. Verify that the primary and backup LSPs are up and that the
      primary is protected.
   4. Verify that Fast Reroute protection is enabled and ready.
   5. Set up traffic streams as described in Section 3.7.
   6. Send IP traffic at the maximum forwarding rate (pps) that the
      device under test supports over the primary LSP.
   7. Record the maximum PPS rate forwarded over the primary LSP.
   8. Stop the traffic stream.
   9. Trigger any choice of link failure as described in Section 3.1.
   10. Verify that the primary tunnel and prefixes get mapped to the
       backup tunnel.
   11. Send IP traffic at the maximum forwarding rate (pps) that the
       device under test supports over the backup LSP.
   12. Record the maximum PPS rate forwarded over the backup LSP.

5.5.2. PLR as Mid-point

   Objective

   To benchmark the maximum rate (pps) on the PLR (as mid-point) over
   the primary FRR LSP and the backup LSP.

   Test Setup

   - Select any one topology out of the eight in Section 4.
   - Select mid-point LSPs as the overlay technology for the FRR test.
   - The DUT will also have two interfaces connected to the traffic
     generator.

   Procedure

   1. Establish the primary LSP on R1 as required by the topology
      selected.
   2. Establish the backup LSP on R2 as required by the selected
      topology.
   3. Verify that the primary and backup LSPs are up and that the
      primary is protected.
   4. Verify that Fast Reroute protection is enabled and ready.
   5. Set up traffic streams as described in Section 3.7.
   6. Send IP traffic at the maximum forwarding rate (pps) that the
      device under test supports over the primary LSP.
   7. Record the maximum PPS rate forwarded over the primary LSP.
   8. Stop the traffic stream.
   9. Trigger any choice of link failure as described in Section 3.1.
   10. Verify that the primary tunnel and prefixes get mapped to the
       backup tunnel.
   11. Send IP traffic at the maximum forwarding rate (pps) that the
       device under test supports over the backup LSP.
   12. Record the maximum PPS rate forwarded over the backup LSP.

6. Reporting Format

   For each test, it is recommended that the results be reported in
   the following format.

   Parameter                                  Units

   IGP used for the test                      ISIS-TE / OSPF-TE
   Interface types                            GigE, POS, ATM, VLAN, etc.
   Packet sizes offered to the DUT            bytes
   Forwarding rate                            packets per second
   IGP routes advertised                      number of IGP routes
   RSVP hello timers configured (if any)      milliseconds
   Number of FRR tunnels configured           number of tunnels
   Number of VPN routes in head-end           number of VPN routes
   Number of VC tunnels                       number of VC tunnels
   Number of BGP routes                       number of BGP routes
   Number of mid-point tunnels                number of tunnels
   Number of prefixes protected by primary    number of prefixes
   Number of LSPs being protected             number of LSPs
   Topology being used                        section number
   Failure event                              event type

   Benchmarks

   Minimum failover time                      milliseconds
   Mean failover time                         milliseconds
   Maximum failover time                      milliseconds
   Minimum reversion time                     milliseconds
   Mean reversion time                        milliseconds
   Maximum reversion time                     milliseconds

   The failover time reported above is calculated using one of the
   following three methods:

   1. Packet-Based Loss Method (PBLM): (number of packets dropped /
      offered rate in packets per second) * 1000 milliseconds.  This
      method is also referred to as the Rate-Derived Method.

   2. Time-Based Loss Method (TBLM): This method relies on the ability
      of the traffic generator to provide statistics that reveal the
      duration of the failure in milliseconds, based on when the
      packet loss occurred (the interval between non-zero packet loss
      and zero loss).

   3. Timestamp-Based Method (TBM): This method of failover
      calculation is based on a timestamp transmitted as payload in
      the packets originated by the generator.  The traffic analyzer
      records the timestamps of the last packet received before the
      failover event and of the first packet received after the
      failover, and derives the failover time from the difference
      between these two timestamps.  Note: the payload could also
      carry sequence numbers for detecting out-of-order and duplicate
      packets.

   Note: If the primary is configured to be dynamic, and if the
   primary is to reroute, make-before-break should occur from the
   backup in use to a new alternate primary.  Any packet loss observed
   during this transition should be added to the failover time.
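   The first and third methods reduce to simple arithmetic.  An
   informative sketch of both follows; the input formats are
   illustrative.

      def failover_pblm(packets_dropped, offered_rate_pps):
          """Packet-Based Loss Method (method 1), in milliseconds."""
          return packets_dropped / offered_rate_pps * 1000.0

      def failover_tbm(payload_ts_ms):
          """Timestamp-Based Method (method 3): 'payload_ts_ms' holds
          the generator timestamps carried in the received packets,
          in arrival order.  The last packet before the failover and
          the first packet after it bracket the largest gap between
          consecutive timestamps."""
          return max(b - a
                     for a, b in zip(payload_ts_ms, payload_ts_ms[1:]))

      # Example: 2500 packets lost at an offered load of 100,000 pps
      # corresponds to a 25 ms failover.
      assert failover_pblm(2500, 100_000) == 25.0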
7. IANA Considerations

   This document requires no IANA considerations.

8. Security Considerations

   Benchmarking activities as described in this memo are limited to
   technology characterization using controlled stimuli in a
   laboratory environment, with dedicated address space and the
   constraints specified in the sections above.

   The benchmarking network topology will be an independent test setup
   and MUST NOT be connected to devices that may forward the test
   traffic into a production network or misroute traffic to the test
   management network.

   Further, benchmarking is performed on a "black-box" basis, relying
   solely on measurements observable external to the DUT/SUT.

   Special capabilities SHOULD NOT exist in the DUT/SUT specifically
   for benchmarking purposes.  Any implications for network security
   arising from the DUT/SUT SHOULD be identical in the lab and in
   production networks.

   The isolated nature of the benchmarking environments, and the fact
   that no special features or capabilities other than those used in
   operational networks are enabled on the DUT/SUT, means that no
   security considerations specific to the benchmarking process are
   required.

9. Acknowledgements

   We would like to thank Jean-Philippe Vasseur for his invaluable
   input to the document, and Curtis Villamizar for his contribution
   in suggesting text on the definition of, and the need for
   benchmarking, correlated failures.  Additionally, we would like to
   thank Arun Gandhi, Amrit Hanspal, and Karu Ratnam for their input
   to the document.

10. References

10.1. Normative References

   [MPLS-FRR-EXT] Pan, P., Swallow, G., and A. Atlas, "Fast Reroute
                  Extensions to RSVP-TE for LSP Tunnels", RFC 4090,
                  May 2005.

   [RFC-WORDS]    Bradner, S., "Key words for use in RFCs to Indicate
                  Requirement Levels", BCP 14, RFC 2119, March 1997.

10.2. Informative References

   [TERM-ID]      Poretsky, S., Papneja, R., Karthik, J., and
                  S. Vapiwala, "Benchmarking Terminology for
                  Protection Performance",
                  draft-ietf-bmwg-protection-term-02 (work in
                  progress).

   [IGP-METH]     Poretsky, S. and B. Imhoff, "Benchmarking
                  Methodology for IGP Data Plane Route Convergence",
                  draft-ietf-bmwg-igp-dataplane-conv-meth-12 (work in
                  progress).

11. Authors' Addresses

   Rajiv Papneja
   Isocore
   12359 Sunrise Valley Drive, STE 100
   Reston, VA 20190
   USA
   Phone: +1 703 860 9273
   Email: rpapneja@isocore.com

   Samir Vapiwala
   Cisco Systems
   300 Beaver Brook Road
   Boxborough, MA 01719
   USA
   Phone: +1 978 936 1484
   Email: svapiwal@cisco.com

   Jay Karthik
   Cisco Systems
   300 Beaver Brook Road
   Boxborough, MA 01719
   USA
   Phone: +1 978 936 0533
   Email: jkarthik@cisco.com

   Scott Poretsky
   Reef Point Systems
   8 New England Executive Park
   Burlington, MA 01803
   USA
   Phone: +1 781 395 5090
   Email: sporetsky@reefpoint.com

   Shankar Rao
   Qwest Communications
   950 17th Street, Suite 1900
   Denver, CO 80210
   USA
   Phone: +1 303 437 6643
   Email: shankar.rao@qwest.com

   Jean-Louis Le Roux
   France Telecom
   2 av Pierre Marzin
   22300 Lannion
   France
   Phone: +33 2 96 05 30 20
   Email: jeanlouis.leroux@orange-ft.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.
   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgment

   Funding for the RFC Editor function is provided by the IETF
   Administrative Support Activity (IASA).

Appendix A: Fast Reroute Scalability Table

   This section provides the recommended numbers for evaluating the
   scalability of Fast Reroute implementations.  It also recommends
   typical numbers for IGP/VPNv4 prefixes, LSP tunnels, and VC
   entries.  Based on the features supported by the device under test,
   the appropriate scaling limits can be used for the test bed.

A.1. FRR IGP Table

   No. of Headend       IGP Prefixes
   TE LSPs

   1                    100
   1                    500
   1                    1000
   1                    2000
   1                    5000
   2 (load balance)     100
   2 (load balance)     500
   2 (load balance)     1000
   2 (load balance)     2000
   2 (load balance)     5000
   100                  100
   500                  500
   1000                 1000
   2000                 2000

A.2. FRR VPN Table

   No. of Headend       VPNv4 Prefixes
   TE LSPs

   1                    100
   1                    500
   1                    1000
   1                    2000
   1                    5000
   1                    10000
   1                    20000
   1                    Max
   2 (load balance)     100
   2 (load balance)     500
   2 (load balance)     1000
   2 (load balance)     2000
   2 (load balance)     5000
   2 (load balance)     10000
   2 (load balance)     20000
   2 (load balance)     Max

A.3. FRR Mid-Point LSP Table

   The number of mid-point TE LSPs could be configured at the
   following recommended levels: 100, 500, 1000, 2000, or the maximum
   supported number.
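   For automation, the scalability tables can be represented as simple
   lists of (LSP count, prefix count) pairs.  An informative sketch of
   the VPN table (A.2) follows; "max" is left symbolic because its
   value is platform-dependent.

      # (headend TE LSPs, VPNv4 prefixes) pairs from Appendix A.2;
      # "max" stands for the platform-dependent maximum.
      LSP_COUNTS = [1, 2]            # 2 => load-balanced pair
      PREFIX_COUNTS = [100, 500, 1000, 2000, 5000,
                       10000, 20000, "max"]

      FRR_VPN_TABLE = [(l, p) for l in LSP_COUNTS
                              for p in PREFIX_COUNTS]

      for lsps, prefixes in FRR_VPN_TABLE:
          print(f"provision {lsps} LSP(s), "
                f"advertise {prefixes} prefixes")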
A.4. FRR VC Table

   No. of Headend       VC Entries
   TE LSPs

   1                    100
   1                    500
   1                    1000
   1                    2000
   1                    Max
   100                  100
   500                  500
   1000                 1000
   2000                 2000

Appendix B: Abbreviations

   BFD    - Bidirectional Forwarding Detection
   BGP    - Border Gateway Protocol
   CE     - Customer Edge
   DUT    - Device Under Test
   FRR    - Fast Reroute
   IGP    - Interior Gateway Protocol
   IP     - Internet Protocol
   LSP    - Label Switched Path
   MP     - Merge Point
   MPLS   - Multi-Protocol Label Switching
   N-Nhop - Next-Next Hop
   Nhop   - Next Hop
   OIR    - Online Insertion and Removal
   P      - Provider
   PE     - Provider Edge
   PHP    - Penultimate Hop Popping
   PLR    - Point of Local Repair
   RSVP   - Resource reSerVation Protocol
   SRLG   - Shared Risk Link Group
   TA     - Traffic Analyzer
   TE     - Traffic Engineering
   TG     - Traffic Generator
   VC     - Virtual Circuit
   VPN    - Virtual Private Network