Network Working Group                                     Rajiv Papneja
Internet Draft                                                  Isocore
Intended Status: Informational                              S. Vapiwala
Expires: April 3, 2009                                       J. Karthik
                                                          Cisco Systems
                                                            S. Poretsky
                                                                  Allot
                                                                 S. Rao
                                                   Qwest Communications
                                                    Jean-Louis Le Roux
                                                         France Telecom
                                                       November 3, 2008

        Methodology for Benchmarking MPLS Protection Mechanisms
                draft-ietf-bmwg-protection-meth-04.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that
   any applicable patent or other IPR claims of which he or she is
   aware have been or will be disclosed, and any of which he or she
   becomes aware will be disclosed, in accordance with Section 6 of
   BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on April 3, 2009.

Abstract

   This document describes the methodology for benchmarking MPLS
   protection mechanisms for link and node protection, as defined in
   RFC 4090, "Fast Reroute Extensions to RSVP-TE for LSP Tunnels".
   It provides test methodologies and test bed setups for measuring
   failover times while considering all dependencies that might impact
   faster recovery of real-time services bound to MPLS based traffic
   engineered tunnels.

   The terms used in the procedures included in this document are
   defined in the companion terminology document, "Benchmarking
   Terminology for Protection Performance".
Table of Contents

   1. Introduction.................................................3
   2. Document Scope...............................................4
   3. General reference sample topology............................5
   4. Existing definitions.........................................5
   5. Test Considerations..........................................6
      5.1. Failover Events.........................................6
      5.2. Failure Detection [TERM-ID].............................7
      5.3. Use of Data Traffic for MPLS Protection benchmarking....7
      5.4. LSP and Route Scaling...................................8
      5.5. Selection of IGP........................................8
      5.6. Reversion [TERM-ID].....................................8
      5.7. Traffic Generation......................................8
      5.8. Motivation for Topologies...............................9
   6. Reference Test Setup.........................................9
      6.1. Link Protection with 1 hop primary (from PLR) and
           1 hop backup TE tunnels................................10
      6.2. Link Protection with 1 hop primary (from PLR) and
           2 hop backup TE tunnels................................11
      6.3. Link Protection with 2+ hop (from PLR) primary and
           1 hop backup TE tunnels................................11
      6.4. Link Protection with 2+ hop (from PLR) primary and
           2 hop backup TE tunnels................................12
      6.5. Node Protection with 2 hop primary (from PLR) and
           1 hop backup TE tunnels................................12
      6.6. Node Protection with 2 hop primary (from PLR) and
           2 hop backup TE tunnels................................13
      6.7. Node Protection with 3+ hop primary (from PLR) and
           1 hop backup TE tunnels................................14
      6.8. Node Protection with 3+ hop primary (from PLR) and
           2 hop backup TE tunnels................................15
   7. Test Methodology............................................15
      7.1. Headend as PLR with link failure.......................15
      7.2. Mid-Point as PLR with link failure.....................17
      7.3. Headend as PLR with Node Failure.......................18
      7.4. Mid-Point as PLR with Node failure.....................19
      7.5. MPLS FRR Forwarding Performance Test cases.............21
           7.5.1. PLR as Headend..................................21
           7.5.2. PLR as Mid-point................................22
   8. Reporting Format............................................23
      Benchmarks..................................................24
   9. Security Considerations.....................................25
   10. IANA Considerations........................................25
   11. References.................................................25
      11.1. Normative References..................................25
      11.2. Informative References................................25
   Authors' Addresses.............................................26
   Intellectual Property Statement................................27
   Disclaimer of Validity.........................................28
   Copyright Statement............................................28
   12. Acknowledgments............................................28
   Appendix A: Fast Reroute Scalability Table.....................28
   Appendix B: Abbreviations......................................31

1. Introduction

This draft describes the methodology for benchmarking MPLS based
protection mechanisms. The new terminology that this document
introduces is defined in [TERM-ID].

MPLS based protection mechanisms provide fast recovery of real-time
services from planned or unplanned link or node failures. MPLS
protection mechanisms are generally deployed in network
infrastructures where MPLS is used to provision point-to-point
traffic engineered tunnels. MPLS based protection mechanisms promise
to reduce the service disruption period by minimizing the recovery
time from the most common failures.

Generally, two factors impact service availability: the frequency of
failures, and the duration for which the failures last. Failures can
be further classified into two types: correlated and uncorrelated
failures. A correlated failure is the co-occurrence of two or more
failures simultaneously. A typical example is the failure of a
logical resource (e.g. layer-2 links) caused by the failure of a
common physical resource (e.g. a common interface) on which it
relies. Within the context of MPLS protection mechanisms, failures
that arise due to Shared Risk Link Groups (SRLG) [MPLS-FRR-EXT] can
be considered correlated failures. Not all correlated failures are
predictable in advance, especially the ones caused by natural
disasters.

Planned failures, on the other hand, are predictable.
Implementations should handle both types of failures and recover
gracefully within a time frame acceptable for service assurance.
Hence, failover recovery time is one of the most important
benchmarks that a service provider considers in choosing the
building blocks for its network infrastructure.

It is a known fact that network elements from different
manufacturers behave differently under network failures, which
impacts their ability to recover from the failures. It therefore
becomes imperative for network service providers to have a common
benchmark that can be followed to understand the performance
behavior of network elements.

Considering failover recovery an important parameter, the test
methodology presented in this document considers the factors that
may impact the failover times. To benchmark the failover times, data
plane traffic is used as defined in [IGP-METH].

All benchmarking test cases defined in this document apply to both
facility backup and local protection enabled in detour mode. The
test cases cover all possible failure scenarios, and the associated
procedures benchmark the ability of the DUT to perform recovery from
failures within a target failover time.

2. Document Scope

This document provides detailed test cases along with different
topologies and scenarios that should be considered to effectively
benchmark MPLS protection mechanisms and failover times. Different
failure scenarios and scaling considerations are also provided, in
addition to reporting formats for the observed results.

Benchmarking of unexpected correlated failures is currently out of
the scope of this document.
3. General reference sample topology

Figure 1 illustrates the basic reference testbed and is applicable
to all the test cases defined in this document. TG and TA represent
the Traffic Generator and Traffic Analyzer, respectively. The tester
is connected to the DUT; it sends and receives IP traffic along the
working path and runs protocol emulation to simulate real-world
peering scenarios. The reference testbed is shown in Figure 1:

                   ---------------------------
                  |     ------------|-------------
                  |    |            |             |
                  |    |            |             |
      --------     --------     --------     --------     --------
     |   R1   |   |   R2   |   |   R3   |   |   R4   |   |   R5   |
  TG-|        |---|        |---|        |   |        |   |        |-TA
     |        |---|        |---|        |---|        |---|        |
      --------     --------     --------     --------     --------
                  |    |                      |   |
                  |    |                      |   |
                  |    |      --------        |   |
                  |     -----|   R6   |-------    |
                   ----------|        |-----------
                              --------

                   Figure 1: Fast Reroute Topology

The tester MUST record the number of lost, duplicate, and reordered
packets. It should further record arrival and departure times so
that Failover Time, Additive Latency, and Reversion Time can be
measured. The tester may be a single device or a test system
emulating all the different roles along a primary or backup path.
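The per-packet accounting above can be automated when each test
packet carries a monotonically increasing sequence number in its
payload. The following non-normative sketch (in Python) illustrates
one such accounting; the packet format and the starting sequence
number of 0 are assumptions of the example, not requirements of this
document.

   # Classify a received stream of sequence numbers into lost,
   # duplicate, and reordered packets. Assumes sequence numbers
   # start at 0 and increase by 1 per transmitted packet.
   def analyze(received):
       seen = set()
       duplicates = 0
       reordered = 0
       highest = -1
       for seq in received:
           if seq in seen:
               duplicates += 1
               continue
           seen.add(seq)
           if seq < highest:
               # Arrived after a higher sequence number was seen.
               reordered += 1
           else:
               highest = seq
       lost = (highest + 1 - len(seen)) if highest >= 0 else 0
       return lost, duplicates, reordered

   # Example: packet 2 was lost, packet 4 arrived twice, and
   # packet 3 arrived late.
   print(analyze([0, 1, 4, 4, 3, 5]))   # -> (1, 1, 1)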
4. Existing definitions

For the sake of clarity and continuity, this document adopts the
template for definitions set out in Section 2 of RFC 1242.
Definitions are indexed and grouped together in sections for ease of
reference. The terms used in this document are defined in detail in
[TERM-ID].

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].

The reader is assumed to be familiar with the commonly used MPLS
terminology, some of which is defined in [MPLS-FRR-EXT].

5. Test Considerations

This section discusses the fundamentals of MPLS Protection testing:

   - The types of network events that cause failover
   - Indications for failover
   - The use of data traffic
   - Traffic generation
   - LSP scaling
   - Reversion of LSP
   - IGP selection

5.1. Failover Events [TERM-ID]

The failover to the backup tunnel is primarily triggered by either
link or node failures observed downstream of the Point of Local
Repair (PLR). Some of these failure events are listed below.

Link failure events:

   - Interface shutdown on PLR side with POS alarm
   - Interface shutdown on remote side with POS alarm
   - Interface shutdown on PLR side with RSVP hello enabled
   - Interface shutdown on remote side with RSVP hello enabled
   - Interface shutdown on PLR side with BFD
   - Interface shutdown on remote side with BFD
   - Fiber pull on the PLR side (both TX & RX, or just the TX)
   - Fiber pull on the remote side (both TX & RX, or just the RX)
   - Online insertion and removal (OIR) on PLR side
   - OIR on remote side
   - Sub-interface failure (e.g. shutting down a VLAN)
   - Parent interface shutdown (an interface bearing multiple
     sub-interfaces)

Node failure events:

A system reload is initiated either by a graceful shutdown or by a
power failure. A system crash refers to a software failure or an
assert.

   - Reload protected node, with RSVP hello enabled
   - Crash protected node, with RSVP hello enabled
   - Reload protected node, with BFD enabled
   - Crash protected node, with BFD enabled

5.2. Failure Detection [TERM-ID]

Link failure detection time depends on the link type and the failure
detection protocols running. For SONET/SDH, the alarm type (such as
LOS, AIS, or RDI) can be used. Other link types have layer-two
alarms, but they may not provide a short enough failure detection
time. Ethernet based links do not have layer-two failure indicators
and therefore rely on layer-three signaling for failure detection.

MPLS has different failure detection techniques, such as BFD or the
use of RSVP hellos. These methods can be used for the layer-three
failure indicators required by Ethernet based links, or for some
other non-Ethernet based links, to help improve failure detection
time.

The test procedures in this document can be used for local failure
or remote failure scenarios for comprehensive benchmarking, and to
evaluate failover performance independent of the failure detection
techniques.
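Because the measured failover time includes the failure detection
interval, the detection protocol's configuration bounds the best
result obtainable. As a non-normative illustration, the nominal BFD
detection time is the negotiated transmit interval multiplied by the
detect multiplier; the values below are example assumptions, not
recommended settings.

   # Nominal BFD detection time: transmit interval times detect
   # multiplier (illustrative arithmetic only).
   def bfd_detection_time_ms(tx_interval_ms, detect_mult):
       return tx_interval_ms * detect_mult

   # Example: 50 ms intervals with a multiplier of 3 yield a nominal
   # 150 ms detection time, which then forms part of any failover
   # time measured with this configuration.
   print(bfd_detection_time_ms(50, 3))   # -> 150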
5.3. Use of Data Traffic for MPLS Protection benchmarking

End customers currently use packet loss as the key metric for
failover time. Packet loss is an externally observable event and has
a direct impact on customers' applications. MPLS protection
mechanisms are expected to minimize packet loss in the event of a
failure. For this reason it is important to develop a standard
router benchmarking methodology for measuring MPLS protection that
uses packet loss as a metric. At a known rate of forwarding, packet
loss can be measured and the failover time can be determined.
Measurement of control plane signaling to establish backup paths is
not enough to verify failover. Failover is best determined when
packets are actually traversing the backup path.

An additional benefit of using packet loss for the calculation of
failover time is that it allows use of a black-box test environment.
Data traffic is offered at line rate to the device under test (DUT),
an emulated network failure event is forced to occur, and packet
loss is externally measured to calculate the convergence time. This
setup is independent of the DUT architecture.

In addition, this methodology considers the packets in error and
duplicate packets that could have been generated during the failover
process. In scenarios where separate measurement of packets in error
and duplicate packets is difficult to obtain, these packets should
be attributed to lost packets.

5.4. LSP and Route Scaling

Failover time performance may vary with the number of established
primary and backup tunnel label switched paths (LSPs) and the number
of installed routes. However, the procedures outlined here should be
used for any number of LSPs (L) and any number of routes protected
by the PLR (R). The values of L and R must be recorded.

5.5. Selection of IGP

The underlying IGP could be either ISIS-TE or OSPF-TE for the
methodology proposed here.

5.6. Reversion [TERM-ID]

Fast Reroute provides a method to return or restore traffic from the
backup path to the original primary LSP upon recovery from the
failure. This is referred to as Reversion, which can be implemented
as Global Reversion or Local Reversion. In all test cases listed
here, Reversion should not produce any packet loss, out-of-order
packets, or duplicate packets. Each of the test cases in this
methodology document provides a check to confirm that there is no
packet loss.

5.7. Traffic Generation

It is suggested that there be one or more traffic streams, as long
as there is a steady and constant rate of flow for all the streams.
In order to monitor the DUT performance for recovery times, a set of
route prefixes should be advertised before traffic is sent. The
traffic should be configured towards these routes.

A typical example would be configuring the traffic generator to send
traffic to the first, middle, and last of the advertised routes
(where first, middle, and last are the numerically smallest, median,
and largest of the advertised prefixes, respectively). Generating
traffic to all of the prefixes reachable by the protected tunnel in
a round-robin fashion (where the traffic is destined to all the
prefixes, but to one prefix at a time in a cyclic manner) is not
recommended. The reason is that if there are many prefixes reachable
through the LSP, the time interval between two packets destined to
the same prefix may be significantly high, and may be comparable to
the failover time being measured, which does not aid in getting an
accurate failover measurement. A sketch of the suggested stream
selection follows.
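The following non-normative sketch shows the suggested selection of
stream destinations, assuming the advertised prefixes can be ordered
numerically; the helper name is illustrative only.

   # Pick the numerically smallest, median, and largest advertised
   # prefixes as the destinations for the steady traffic streams.
   import ipaddress

   def pick_stream_destinations(prefixes):
       nets = sorted(ipaddress.ip_network(p) for p in prefixes)
       if not nets:
           raise ValueError("no prefixes advertised")
       return nets[0], nets[len(nets) // 2], nets[-1]

   # Example: one steady stream is then configured towards a host
   # in each selected prefix.
   print(pick_stream_destinations(
       ["10.3.0.0/16", "10.1.0.0/16", "10.2.0.0/16"]))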
5.8. Motivation for Topologies

Given that the label stack depends on the following three entities,
it is recommended that the benchmarking of failover time be
performed on all eight topologies provided in section 6:

   - The type of protection (link vs. node)

   - The number of remaining hops of the primary tunnel from the PLR

   - The number of remaining hops of the backup tunnel from the PLR

6. Reference Test Setup

In addition to the general reference topology shown in Figure 1,
this section provides detailed insight into the various proposed
test setups that should be considered for comprehensively
benchmarking the failover time in different roles along the primary
tunnel.

This section proposes a set of topologies that covers all the
scenarios for local protection. All eight of the topologies shown
(Figure 2 through Figure 9) can be mapped to the reference topology
shown in Figure 1. The topologies provided in sections 6.1 to 6.8
refer to the test bed required to benchmark failover time when the
DUT is configured as a PLR in either a headend or midpoint role. The
label stack provided with each topology is the stack at the PLR.

The label stacks shown below each figure in sections 6.1 to 6.8
assume that Penultimate Hop Popping (PHP) is enabled.

Figures 2-9 use the following convention:

   a) HE is Headend

   b) TE is Tail-End

   c) MID is Mid point

   d) MP is Merge Point

   e) PLR is Point of Local Repair

   f) PRI is Primary

   g) BKP denotes Backup Node

6.1. Link Protection with 1 hop primary (from PLR) and 1 hop backup
     TE tunnels

        --------    --------  PRI  --------
       |   R1   |  |   R2   |     |   R3   |
    TG-|   HE   |--|   MID  |-----|   TE   |-TA
       |        |  |   PLR  |-----|        |
        --------    --------  BKP  --------

              Figure 2: Setup for section 6.1

   Traffic              No. of labels    No. of labels
                        before failure   after failure

   IP TRAFFIC (P-P)            0                0
   Layer3 VPN (PE-PE)          1                1
   Layer3 VPN (PE-P)           2                2
   Layer2 VC (PE-PE)           1                1
   Layer2 VC (PE-P)            2                2
   Mid-point LSPs              0                0

6.2. Link Protection with 1 hop primary (from PLR) and 2 hop backup
     TE tunnels

        --------    --------       --------
       |   R1   |  |   R2   |     |   R3   |
    TG-|   HE   |  |   MID  | PRI |   TE   |-TA
       |        |--|   PLR  |-----|        |
        --------    --------       --------
                      |BKP            |
                      |   --------    |
                      |  |   R6   |   |
                       --|   BKP  |---
                         |   MID  |
                          --------

              Figure 3: Setup for section 6.2

   Traffic              No. of labels    No. of labels
                        before failure   after failure

   IP TRAFFIC (P-P)            0                1
   Layer3 VPN (PE-PE)          1                2
   Layer3 VPN (PE-P)           2                3
   Layer2 VC (PE-PE)           1                2
   Layer2 VC (PE-P)            2                3
   Mid-point LSPs              0                1

6.3. Link Protection with 2+ hop (from PLR) primary and 1 hop backup
     TE tunnels

        --------    --------      --------      --------
       |   R1   |  |   R2   | PRI|   R3   | PRI|   R4   |
    TG-|   HE   |--|   MID  |----|   MID  |----|   TE   |-TA
       |        |  |   PLR  |----|        |    |        |
        --------    --------  BKP --------      --------

              Figure 4: Setup for section 6.3

   Traffic              No. of labels    No. of labels
                        before failure   after failure

   IP TRAFFIC (P-P)            1                1
   Layer3 VPN (PE-PE)          2                2
   Layer3 VPN (PE-P)           3                3
   Layer2 VC (PE-PE)           2                2
   Layer2 VC (PE-P)            3                3
   Mid-point LSPs              1                1
6.4. Link Protection with 2+ hop (from PLR) primary and 2 hop backup
     TE tunnels

        --------    --------  PRI  --------  PRI  --------
       |   R1   |  |   R2   |     |   R3   |     |   R4   |
    TG-|   HE   |--|   MID  |-----|   MID  |-----|   TE   |-TA
       |        |  |   PLR  |     |        |     |        |
        --------    --------       --------       --------
                     BKP|              |
                        |  --------    |
                        | |   R6   |   |
                         -|   BKP  |---
                          |   MID  |
                           --------

              Figure 5: Setup for section 6.4

   Traffic              No. of labels    No. of labels
                        before failure   after failure

   IP TRAFFIC (P-P)            1                2
   Layer3 VPN (PE-PE)          2                3
   Layer3 VPN (PE-P)           3                4
   Layer2 VC (PE-PE)           2                3
   Layer2 VC (PE-P)            3                4
   Mid-point LSPs              1                2

6.5. Node Protection with 2 hop primary (from PLR) and 1 hop backup
     TE tunnels

        --------    --------  PRI  --------  PRI  --------
       |   R1   |  |   R2   |     |   R3   |     |   R4   |
    TG-|   HE   |--|   MID  |-----|   MID  |-----|   TE   |-TA
       |        |  |   PLR  |     |        |     |        |
        --------    --------       --------       --------
                        |BKP                         |
                         ----------------------------

              Figure 6: Setup for section 6.5

   Traffic              No. of labels    No. of labels
                        before failure   after failure

   IP TRAFFIC (P-P)            1                0
   Layer3 VPN (PE-PE)          2                1
   Layer3 VPN (PE-P)           3                2
   Layer2 VC (PE-PE)           2                1
   Layer2 VC (PE-P)            3                2
   Mid-point LSPs              1                0

6.6. Node Protection with 2 hop primary (from PLR) and 2 hop backup
     TE tunnels

        --------    --------       --------       --------
       |   R1   |  |   R2   |     |   R3   |     |   R4   |
    TG-|   HE   |  |   MID  | PRI |   MID  | PRI |   TE   |-TA
       |        |--|   PLR  |-----|        |-----|        |
        --------    --------       --------       --------
                        |                            |
                     BKP|       --------             |
                        |      |   R6   |            |
                         ------|   BKP  |------------
                               |   MID  |
                                --------

              Figure 7: Setup for section 6.6

   Traffic              No. of labels    No. of labels
                        before failure   after failure

   IP TRAFFIC (P-P)            1                1
   Layer3 VPN (PE-PE)          2                2
   Layer3 VPN (PE-P)           3                3
   Layer2 VC (PE-PE)           2                2
   Layer2 VC (PE-P)            3                3
   Mid-point LSPs              1                1

6.7. Node Protection with 3+ hop primary (from PLR) and 1 hop backup
     TE tunnels

        --------   --------  PRI --------  PRI --------  PRI --------
       |   R1   | |   R2   |    |   R3   |    |   R4   |    |   R5   |
    TG-|   HE   |-|   MID  |----|   MID  |----|   MP   |----|   TE   |-TA
       |        | |   PLR  |    |        |    |        |    |        |
        --------   --------      --------      --------      --------
                    BKP|                           |
                        ---------------------------

              Figure 8: Setup for section 6.7

   Traffic              No. of labels    No. of labels
                        before failure   after failure

   IP TRAFFIC (P-P)            1                1
   Layer3 VPN (PE-PE)          2                2
   Layer3 VPN (PE-P)           3                3
   Layer2 VC (PE-PE)           2                2
   Layer2 VC (PE-P)            3                3
   Mid-point LSPs              1                1

6.8. Node Protection with 3+ hop primary (from PLR) and 2 hop backup
     TE tunnels

        --------   --------     --------     --------     --------
       |   R1   | |   R2   |   |   R3   |   |   R4   |   |   R5   |
    TG-|   HE   | |   MID  |PRI|   MID  |PRI|   MP   |PRI|   TE   |-TA
       |        |-|   PLR  |---|        |---|        |---|        |
        --------   --------     --------     --------     --------
                    BKP|            --------     |
                       |           |   R6   |    |
                        -----------|   BKP  |----
                                   |   MID  |
                                    --------

              Figure 9: Setup for section 6.8

   Traffic              No. of labels    No. of labels
                        before failure   after failure

   IP TRAFFIC (P-P)            1                2
   Layer3 VPN (PE-PE)          2                3
   Layer3 VPN (PE-P)           3                4
   Layer2 VC (PE-PE)           2                3
   Layer2 VC (PE-P)            3                4
   Mid-point LSPs              1                2

7. Test Methodology

The procedure described in this section can be applied to all eight
base test cases and the associated topologies. The backup and
primary tunnels are configured to be alike in terms of bandwidth
usage. In order to benchmark failover with all label stack depths
applicable to current deployments, it is suggested that the
methodology include all the scenarios listed here. The test cases in
sections 7.1 through 7.4 share a common procedural skeleton, which
is sketched after this paragraph.
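The following non-normative sketch captures the flow shared by
sections 7.1 through 7.4; the cases differ mainly in where the
primary and backup LSPs are signaled and in which failure event is
triggered. Every method on the 'tester' and 'dut' objects is an
abstract placeholder for equipment-specific operations, not a real
API.

   # Abstract skeleton of the common procedure in sections 7.1-7.4.
   def run_failover_case(tester, dut, topology, failure_event):
       dut.establish_primary_lsp(topology)          # step 1
       dut.establish_backup_lsp(topology)           # step 2
       assert dut.primary_protected()               # steps 3-4
       tester.setup_streams()                       # step 5 (sec. 5.7)
       tester.start_traffic()                       # step 6, max rate
       assert dut.traffic_on_primary()              # step 7
       failure_event.trigger()                      # step 8 (sec. 5.1)
       assert dut.mapped_to_backup()                # step 9
       loss = tester.stop_and_measure_loss()        # step 10
       failover_ms = tester.failover_time(loss)     # step 11 (sec. 8)
       tester.start_traffic()                       # step 12, reversion
       failure_event.restore()                      # step 13
       assert tester.loss_since_restart() == 0      # lossless reversion
       assert dut.protection_reestablished()        # step 14
       return failover_ms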
7.1. Headend as PLR with link failure

Objective

To benchmark the MPLS failover time due to link failure events,
described in section 5.1, experienced by the DUT, which is the point
of local repair (PLR).

Test Setup

   - Select any one topology out of the eight from section 6.
   - Select an overlay technology for the FRR test, e.g. IGP, VPN,
     or VC.
   - The DUT will also have two interfaces connected to the traffic
     generator/analyzer. (If the node downstream of the PLR is not a
     simulated node, then the ingress of the tunnel should have one
     link connected to the traffic generator, and the node
     downstream of the PLR or the egress of the tunnel should have a
     link connected to the traffic analyzer.)

Test Configuration

   1. Configure the number of primaries on R2 and the backups on R2
      as required by the topology selected.
   2. Advertise prefixes (as per the FRR scalability table described
      in Appendix A) from the tail end.

Procedure

   1. Establish the primary LSP on R2 as required by the topology
      selected.
   2. Establish the backup LSP on R2 as required by the selected
      topology.
   3. Verify that the primary and backup LSPs are up and that the
      primary is protected.
   4. Verify that Fast Reroute protection is enabled and ready.
   5. Set up traffic streams as described in section 5.7.
   6. Send IP traffic at maximum forwarding rate to the DUT.
   7. Verify that traffic is switched over the primary LSP.
   8. Trigger any choice of link failure as described in section
      5.1.
   9. Verify that the primary tunnel and prefixes get mapped to the
      backup tunnels.
   10. Stop the traffic stream and measure the traffic loss.
   11. Failover time is calculated as defined in section 8,
       Reporting Format.
   12. Start the traffic stream again to verify reversion when the
       protected interface comes up. Traffic loss should be 0 due to
       make-before-break or reversion.
   13. Enable the protected interface that was down (the node in the
       case of NNHOP).
   14. Verify that the headend signals a new LSP and that protection
       is in place again.

7.2. Mid-Point as PLR with link failure

Objective

To benchmark the MPLS failover time due to link failure events,
described in section 5.1, experienced by the device under test,
which is the point of local repair (PLR).

Test Setup

   - Select any one topology out of the eight from section 6.
   - Select the overlay technology for the FRR test as mid-point
     LSPs.
   - The DUT will also have two interfaces connected to the traffic
     generator.

Test Configuration

   1. Configure the number of primaries on R1 and the backups on R2
      as required by the topology selected.
   2. Advertise prefixes (as per the FRR scalability table described
      in Appendix A) from the tail end.

Procedure

   1. Establish the primary LSP on R1 as required by the topology
      selected.
   2. Establish the backup LSP on R2 as required by the selected
      topology.
   3. Verify that the primary and backup LSPs are up and that the
      primary is protected.
   4. Verify Fast Reroute protection.
   5. Set up traffic streams as described in section 5.7.
   6. Send IP traffic at maximum forwarding rate to the DUT.
   7. Verify that traffic is switched over the primary LSP.
   8. Trigger any choice of link failure as described in section
      5.1.
   9. Verify that the primary tunnel and prefixes get mapped to the
      backup tunnels.
   10. Stop the traffic stream and measure the traffic loss.
   11. Failover time is calculated as defined in section 8,
       Reporting Format.
   12. Start the traffic stream again to verify reversion when the
       protected interface comes up. Traffic loss should be 0 due to
       make-before-break or reversion.
   13. Enable the protected interface that was down (the node in the
       case of NNHOP).
   14. Verify that the headend signals a new LSP and that protection
       is in place again.
7.3. Headend as PLR with Node Failure

Objective

To benchmark the MPLS failover time due to node failure events,
described in section 5.1, experienced by the device under test,
which is the point of local repair (PLR).

Test Setup

   - Select any one topology from sections 6.5 to 6.8.
   - Select an overlay technology for the FRR test, e.g. IGP, VPN,
     or VC.
   - The DUT will also have two interfaces connected to the traffic
     generator.

Test Configuration

   1. Configure the number of primaries on R2 and the backups on R2
      as required by the topology selected.
   2. Advertise prefixes (as per the FRR scalability table described
      in Appendix A) from the tail end.

Procedure

   1. Establish the primary LSP on R2 as required by the topology
      selected.
   2. Establish the backup LSP on R2 as required by the selected
      topology.
   3. Verify that the primary and backup LSPs are up and that the
      primary is protected.
   4. Verify Fast Reroute protection.
   5. Set up traffic streams as described in section 5.7.
   6. Send IP traffic at maximum forwarding rate to the DUT.
   7. Verify that traffic is switched over the primary LSP.
   8. Trigger any choice of node failure as described in section
      5.1.
   9. Verify that the primary tunnel and prefixes get mapped to the
      backup tunnels.
   10. Stop the traffic stream and measure the traffic loss.
   11. Failover time is calculated as defined in section 8,
       Reporting Format.
   12. Start the traffic stream again to verify reversion when the
       protected node comes up. Traffic loss should be 0 due to
       make-before-break or reversion.
   13. Boot the protected node that was down.
   14. Verify that the headend signals a new LSP and that protection
       is in place again.

7.4. Mid-Point as PLR with Node failure

Objective

To benchmark the MPLS failover time due to node failure events,
described in section 5.1, experienced by the device under test,
which is the point of local repair (PLR).

Test Setup

   - Select any one topology from sections 6.5 to 6.8.
   - Select the overlay technology for the FRR test as mid-point
     LSPs.
   - The DUT will also have two interfaces connected to the traffic
     generator.

Test Configuration

   1. Configure the number of primaries on R1 and the backups on R2
      as required by the topology selected.
   2. Advertise prefixes (as per the FRR scalability table described
      in Appendix A) from the tail end.

Procedure

   1. Establish the primary LSP on R1 as required by the topology
      selected.
   2. Establish the backup LSP on R2 as required by the selected
      topology.
   3. Verify that the primary and backup LSPs are up and that the
      primary is protected.
   4. Verify Fast Reroute protection.
   5. Set up traffic streams as described in section 5.7.
   6. Send IP traffic at maximum forwarding rate to the DUT.
   7. Verify that traffic is switched over the primary LSP.
   8. Trigger any choice of node failure as described in section
      5.1.
   9. Verify that the primary tunnel and prefixes get mapped to the
      backup tunnels.
   10. Stop the traffic stream and measure the traffic loss.
   11. Failover time is calculated as defined in section 8,
       Reporting Format.
   12. Start the traffic stream again to verify reversion when the
       protected node comes up. Traffic loss should be 0 due to
       make-before-break or reversion.
   13. Boot the protected node that was down.
   14. Verify that the headend signals a new LSP and that protection
       is in place again.
7.5. MPLS FRR Forwarding Performance Test cases

For the following MPLS FRR forwarding performance benchmarking
cases, test the maximum packets-per-second (pps) rate allowed by the
given hardware. One may follow the procedure for determining MPLS
forwarding performance defined in [MPLS-FORWARD].

7.5.1. PLR as Headend

Objective

To benchmark the maximum rate (pps) on the PLR (as headend) over the
primary FRR LSP and the backup LSP.

Test Setup

   - Select any one topology out of the eight from section 6.
   - Select an overlay technology for the FRR test, e.g. IGP, VPN,
     or VC.
   - The DUT will also have two interfaces connected to the traffic
     generator/analyzer. (If the node downstream of the PLR is not a
     simulated node, then the ingress of the tunnel should have one
     link connected to the traffic generator, and the node
     downstream of the PLR or the egress of the tunnel should have a
     link connected to the traffic analyzer.)

Procedure

   1. Establish the primary LSP on R2 as required by the topology
      selected.
   2. Establish the backup LSP on R2 as required by the selected
      topology.
   3. Verify that the primary and backup LSPs are up and that the
      primary is protected.
   4. Verify that Fast Reroute protection is enabled and ready.
   5. Set up traffic streams as described in section 5.7.
   6. Send IP traffic at the maximum forwarding rate (pps) that the
      device under test supports over the primary LSP.
   7. Record the maximum pps rate forwarded over the primary LSP.
   8. Stop the traffic stream.
   9. Trigger any choice of link failure as described in section
      5.1.
   10. Verify that the primary tunnel and prefixes get mapped to the
       backup tunnels.
   11. Send IP traffic at the maximum forwarding rate (pps) that the
       device under test supports over the backup LSP.
   12. Record the maximum pps rate forwarded over the backup LSP.

7.5.2. PLR as Mid-point

Objective

To benchmark the maximum rate (pps) on the PLR (as mid-point of the
primary path and ingress of the backup path) over the primary FRR
LSP and the backup LSP.

Test Setup

   - Select any one topology out of the eight from section 6.
   - Select the overlay technology for the FRR test as mid-point
     LSPs.
   - The DUT will also have two interfaces connected to the traffic
     generator.

Procedure

   1. Establish the primary LSP on R1 as required by the topology
      selected.
   2. Establish the backup LSP on R2 as required by the selected
      topology.
   3. Verify that the primary and backup LSPs are up and that the
      primary is protected.
   4. Verify that Fast Reroute protection is enabled and ready.
   5. Set up traffic streams as described in section 5.7.
   6. Send IP traffic at the maximum forwarding rate (pps) that the
      device under test supports over the primary LSP.
   7. Record the maximum pps rate forwarded over the primary LSP.
   8. Stop the traffic stream.
   9. Trigger any choice of link failure as described in section
      5.1.
   10. Verify that the primary tunnel and prefixes get mapped to the
       backup tunnels.
   11. Send IP traffic at the maximum forwarding rate (pps) that the
       device under test supports over the backup LSP.
   12. Record the maximum pps rate forwarded over the backup LSP.
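Where the tester cannot report the maximum lossless rate directly, a
binary search over the offered load can approximate it. The sketch
below is non-normative; offer() is an abstract stand-in for the
tester operation that offers traffic at a given pps rate for a fixed
trial and returns the number of packets received back.

   # Approximate the maximum pps rate forwarded without loss.
   # hi_pps should be an upper bound, e.g. the interface line rate.
   def max_lossless_rate(offer, trial_seconds, hi_pps,
                         tolerance_pps=1000):
       lo = 0
       while hi_pps - lo > tolerance_pps:
           mid = (lo + hi_pps) // 2
           sent = mid * trial_seconds
           received = offer(mid)
           if received >= sent:    # no loss at this rate
               lo = mid
           else:
               hi_pps = mid
       return lo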
8. Reporting Format

For each test, it is recommended that the results be reported in the
following format.

   Parameter                           Units

   IGP used for the test               ISIS-TE / OSPF-TE

   Interface types                     GigE, POS, ATM, VLAN, etc.

   Packet sizes offered to the DUT     Bytes

   Forwarding rate                     Packets per second

   IGP routes advertised               Number of IGP routes

   RSVP hello timers configured        Milliseconds
   (if any)

   Number of FRR tunnels configured    Number of tunnels

   Number of VPN routes installed      Number of VPN routes
   on the headend

   Number of VC tunnels                Number of VC tunnels

   Number of BGP routes                BGP routes installed

   Number of mid-point tunnels         Number of tunnels

   Number of prefixes protected by     Number of LSPs
   primary

   Topology being used                 Section number and figure
                                       reference

   Failure event                       Event type

Benchmarks

   Parameter                           Unit

   Minimum failover time               Milliseconds

   Mean failover time                  Milliseconds

   Maximum failover time               Milliseconds

   Minimum reversion time              Milliseconds

   Mean reversion time                 Milliseconds

   Maximum reversion time              Milliseconds

The failover time suggested above is calculated using one of the
following three methods:

   1. Packet-Based Loss Method (PBLM): (number of packets dropped /
      packets per second) * 1000 milliseconds. This method could
      also be referred to as the Rate Derived Method.

   2. Time-Based Loss Method (TBLM): This method relies on the
      ability of the traffic generators to provide statistics which
      reveal the duration of the failure in milliseconds based on
      when the packet loss occurred (the interval between non-zero
      packet loss and zero loss).

   3. Timestamp-Based Method (TBM): This method of failover
      calculation is based on a timestamp that is transmitted as
      payload in the packets originated by the generator. The
      traffic analyzer records the timestamp of the last packet
      received before the failover event and of the first packet
      received after the failover, and derives the failover time
      from the difference between these two timestamps. Note: the
      payload could also contain sequence numbers for out-of-order
      packet calculation and for detecting duplicate packets.

Note: If the primary is configured to be dynamic, and if the primary
is to reroute, make-before-break should occur from the backup that
is in use to a new alternate primary. If any packet loss is seen, it
should be added to the failover time.
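A non-normative sketch of the first and third methods, assuming the
tester exposes the drop count, the offered rate, and per-packet
timestamps:

   # Packet-Based Loss Method: convert loss at a known offered rate
   # into time.
   def failover_time_pblm(packets_dropped, offered_rate_pps):
       return (packets_dropped / offered_rate_pps) * 1000.0  # ms

   # Timestamp-Based Method: gap between the last packet received
   # before the failure and the first packet received after it.
   def failover_time_tbm(last_before_ms, first_after_ms):
       return first_after_ms - last_before_ms

   # Example: 4500 packets lost at an offered rate of 100,000 pps
   # corresponds to a 45 ms failover time under PBLM.
   print(failover_time_pblm(4500, 100000))   # -> 45.0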
9. Security Considerations

During the course of the test, the test topology must be
disconnected from devices that may forward the test traffic into a
production environment.

There are no specific security considerations within the scope of
this document.

10. IANA Considerations

There are no considerations for IANA at this time.

11. References

11.1. Normative References

   [MPLS-FRR-EXT] Pan, P., Swallow, G., and A. Atlas, "Fast Reroute
                  Extensions to RSVP-TE for LSP Tunnels", RFC 4090,
                  May 2005.

   [RFC2119]      Bradner, S., "Key words for use in RFCs to
                  Indicate Requirement Levels", BCP 14, RFC 2119,
                  March 1997.

11.2. Informative References

   [TERM-ID]      Poretsky, S., Papneja, R., Karthik, J., and
                  S. Vapiwala, "Benchmarking Terminology for
                  Protection Performance",
                  draft-ietf-bmwg-protection-term-05.txt, work in
                  progress.

   [IGP-METH]     Poretsky, S. and B. Imhoff, "Benchmarking
                  Methodology for IGP Data Plane Route Convergence",
                  draft-ietf-bmwg-igp-dataplane-conv-meth-16.txt,
                  work in progress.

   [MPLS-FORWARD] Akhter, A. and R. Asati, "MPLS Forwarding
                  Benchmarking Methodology",
                  draft-ietf-bmwg-mpls-forwarding-meth-00.txt, work
                  in progress.

Authors' Addresses

   Rajiv Papneja
   Isocore
   12359 Sunrise Valley Drive, STE 100
   Reston, VA 20190
   USA
   Phone: +1 703 860 9273
   Email: rpapneja@isocore.com

   Samir Vapiwala
   Cisco Systems
   300 Beaver Brook Road
   Boxborough, MA 01719
   USA
   Phone: +1 978 936 1484
   Email: svapiwal@cisco.com

   Jay Karthik
   Cisco Systems
   300 Beaver Brook Road
   Boxborough, MA 01719
   USA
   Phone: +1 978 936 0533
   Email: jkarthik@cisco.com

   Scott Poretsky
   Allot Communications
   67 South Bedford Street, Suite 400
   Burlington, MA 01803
   USA
   Phone: +1 508 309 2179
   Email: sporetsky@allot.com

   Shankar Rao
   Qwest Communications
   950 17th Street, Suite 1900
   Denver, CO 80210
   USA
   Phone: +1 303 437 6643
   Email: shankar.rao@qwest.com

   Jean-Louis Le Roux
   France Telecom
   2 av Pierre Marzin
   22300 Lannion
   France
   Phone: 00 33 2 96 05 30 20
   Email: jeanlouis.leroux@orange-ft.com

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology
   described in this document or the extent to which any license
   under such rights might or might not be available; nor does it
   represent that it has made any independent effort to identify any
   such rights. Information on the procedures with respect to rights
   in RFC documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention
   any copyrights, patents or patent applications, or other
   proprietary rights that may cover technology that may be required
   to implement this standard. Please address the information to the
   IETF at ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

Full Copyright Statement

   Copyright (C) The IETF Trust (2008).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.
12. Acknowledgments

We would like to thank Jean-Philippe Vasseur for his invaluable
input to the document, and Curtis Villamizar for his contribution in
suggesting text on the definition of, and the need for benchmarking,
correlated failures. Additionally, we would like to thank Arun
Gandhi, Amrit Hanspal, and Karu Ratnam for their input to the
document.

Appendix A: Fast Reroute Scalability Table

This section provides the recommended numbers for evaluating the
scalability of Fast Reroute implementations. It also recommends
typical numbers for IGP/VPNv4 prefixes, LSP tunnels, and VC entries.
Based on the features supported by the device under test,
appropriate scaling limits can be used for the test bed.

A 1. FRR IGP Table

   No. of Headend TE Tunnels    IGP Prefixes

   1                            100
   1                            500
   1                            1000
   1                            2000
   1                            5000
   2 (Load Balance)             100
   2 (Load Balance)             500
   2 (Load Balance)             1000
   2 (Load Balance)             2000
   2 (Load Balance)             5000
   100                          100
   500                          500
   1000                         1000
   2000                         2000

A 2. FRR VPN Table

   No. of Headend TE Tunnels    VPNv4 Prefixes

   1                            100
   1                            500
   1                            1000
   1                            2000
   1                            5000
   1                            10000
   1                            20000
   1                            Max
   2 (Load Balance)             100
   2 (Load Balance)             500
   2 (Load Balance)             1000
   2 (Load Balance)             2000
   2 (Load Balance)             5000
   2 (Load Balance)             10000
   2 (Load Balance)             20000
   2 (Load Balance)             Max

A 3. FRR Mid-Point LSP Table

   The number of mid-point TE LSPs could be configured at the
   recommended levels: 100, 500, 1000, 2000, or the maximum
   supported number.

A 4. FRR VC Table

   No. of Headend TE Tunnels    VC entries

   1                            100
   1                            500
   1                            1000
   1                            2000
   1                            Max
   100                          100
   500                          500
   1000                         1000
   2000                         2000

Appendix B: Abbreviations

   BFD    - Bidirectional Forwarding Detection
   BGP    - Border Gateway Protocol
   CE     - Customer Edge
   DUT    - Device Under Test
   FRR    - Fast Reroute
   IGP    - Interior Gateway Protocol
   IP     - Internet Protocol
   LSP    - Label Switched Path
   MP     - Merge Point
   MPLS   - Multi Protocol Label Switching
   N-Nhop - Next-Next Hop
   Nhop   - Next Hop
   OIR    - Online Insertion and Removal
   P      - Provider
   PE     - Provider Edge
   PHP    - Penultimate Hop Popping
   PLR    - Point of Local Repair
   RSVP   - Resource reSerVation Protocol
   SRLG   - Shared Risk Link Group
   TA     - Traffic Analyzer
   TE     - Traffic Engineering
   TG     - Traffic Generator
   VC     - Virtual Circuit
   VPN    - Virtual Private Network