Network Working Group                                         R. Papneja
Internet Draft                                                   Isocore
Intended status: Informational                               S. Vapiwala
Expires: January 2008                                         J. Karthik
                                                           Cisco Systems
                                                             S. Poretsky
                                                              Reef Point
                                                                  S. Rao
                                                    Qwest Communications
                                                     Jean-Louis Le Roux
                                                          France Telecom
                                                            July 6, 2007

        Methodology for Benchmarking MPLS Protection Mechanisms

Status of this Memo

   By submitting this Internet-Draft, each author represents that
   any applicable patent or other IPR claims of which he or she is
   aware have been or will be disclosed, and any of which he or she
   becomes aware will be disclosed, in accordance with Section 6 of
   BCP 79.

   This document may only be posted in an Internet-Draft.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as Internet-
   Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other
   documents at any time.  It is inappropriate to use Internet-Drafts
   as reference material or to cite them other than as "work in
   progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire on January 6, 2008.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   This document describes a methodology for benchmarking MPLS
   protection mechanisms for link and node protection as defined in
   [MPLS-FRR-EXT].  The benchmarks and terminology defined in
   [TERM-ID] are to be used for benchmarking MPLS-based protection
   mechanisms.  This document provides test methodologies and test-bed
   setups for measuring failover times while considering all
   dependencies that might impact faster recovery of real-time
   services riding on an MPLS-based primary tunnel.  The terms used in
   the procedures included in this document are defined in [TERM-ID].

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC 2119
   [RFC-WORDS].

Table of Contents

   1. Introduction...................................................3
   2. Existing definitions...........................................6
   3. Test Considerations............................................6
      3.1. Failover Events...........................................6
      3.2. Failure Detection [TERM-ID]...............................7
      3.3. Use of Data Traffic for MPLS Protection Benchmarking......8
      3.4. LSP and Route Scaling.....................................8
      3.5. Selection of IGP..........................................9
      3.6. Reversion [TERM-ID].......................................9
      3.7. Traffic generation........................................9
      3.8. Motivation for topologies................................10
   4. Test Setup....................................................10
      4.1. Link Protection with 1 hop primary (from PLR) and 1 hop
           backup TE tunnels........................................11
      4.2. Link Protection with 1 hop primary (from PLR) and 2 hop
           backup TE tunnels........................................11
      4.3. Link Protection with 2+ hop (from PLR) primary and 1 hop
           backup TE tunnels........................................12
      4.4. Link Protection with 2+ hop (from PLR) primary and 2 hop
           backup TE tunnels........................................13
      4.5. Node Protection with 2 hop primary (from PLR) and 1 hop
           backup TE tunnels........................................14
      4.6. Node Protection with 2 hop primary (from PLR) and 2 hop
           backup TE tunnels........................................15
      4.7. Node Protection with 3+ hop primary (from PLR) and 1 hop
           backup TE tunnels........................................16
      4.8. Node Protection with 3+ hop primary (from PLR) and 2 hop
           backup TE tunnels........................................16
   5. Test Methodology..............................................17
      5.1. Headend as PLR with link failure.........................18
      5.2. Mid-Point as PLR with link failure.......................19
      5.3. Headend as PLR with Node failure.........................20
      5.4. Mid-Point as PLR with Node failure.......................21
      5.5. MPLS FRR Forwarding Performance Test Cases...............23
           5.5.1. PLR as Headend....................................23
           5.5.2. PLR as Mid-point..................................24
   6. Reporting Format..............................................25
   7. IANA Considerations...........................................26
   8. Security Considerations.......................................27
   9. Acknowledgements..............................................27
   10. References...................................................28
      10.1. Normative References....................................28
      10.2. Informative References..................................28
   11. Authors' Addresses...........................................28
   Intellectual Property Statement..................................30
   Appendix A: Fast Reroute Scalability Table.......................31
1. Introduction

   This document describes the methodology for benchmarking MPLS-based
   protection mechanisms.  The new terminology that it introduces is
   defined in [TERM-ID].

   MPLS-based protection mechanisms provide faster recovery of real-
   time services in case of an unplanned link or node failure in the
   network core, where MPLS is used as a signaling protocol to set up
   point-to-point traffic-engineered tunnels.  MPLS-based protection
   mechanisms improve service availability by minimizing the duration
   of the most common failures.  Two factors generally impact service
   availability: the frequency of failures and their duration.
   Unexpected correlated failures are less common.  A correlated
   failure is the simultaneous occurrence of two or more failures.
   Correlated failures are often observed when two or more logical
   resources (e.g., Layer 2 links) that rely on a common physical
   resource (e.g., a common transport facility) fail.  Common
   transport may include TDM and WDM links providing multiplexing at
   Layer 2 and Layer 1.  Within the context of MPLS protection
   mechanisms, Shared Risk Link Groups [MPLS-FRR-EXT] encompass such
   correlated failures.

   Not all correlated failures can be anticipated in advance of their
   occurrence; failures due to natural disasters and planned
   maintenance are the most notable causes.  Because such failures do
   occur, implementations must handle these faults gracefully and
   recover the affected services quickly.  Some routers recover faster
   than others, so benchmarking this type of failure is useful.
   Benchmarking of unexpected correlated failures should include
   measurement of restoration with and without the availability of IP
   fallback.  This document provides detailed test cases focusing on
   benchmarking MPLS protection mechanisms; benchmarking of unexpected
   correlated failures is currently out of its scope.

   A link or node failure could occur either at the head-end or at a
   mid-point node of a primary tunnel.  The backup tunnel could offer
   either link or node protection following a failure along the path
   of the primary tunnel.  The time taken to transition primary-tunnel
   traffic onto the backup tunnel is a key measurement for verifying
   service level agreements.  Failover time depends upon many factors,
   such as the number of prefixes bound to a tunnel, the services
   (such as IGP, BGP, and Layer 3 / Layer 2 VPNs) bound to the tunnel,
   the number of primary tunnels affected by the failure event, the
   number of primary tunnels protected by the backup, the type of
   failure, and the physical media on which the failover occurs.  This
   document describes the different topologies and scenarios that
   should be considered to effectively benchmark MPLS protection
   mechanisms and failover times.  Different failure scenarios and
   scaling considerations are also provided, and a reporting format
   for the observed results is defined.

   To benchmark the failover time, data-plane traffic is used as
   defined in [IGP-METH].  Traffic loss is the key component of a
   black-box type test and is used to measure convergence.

   All benchmarking test cases defined in this document apply to both
   facility backup and local protection enabled in detour mode.  The
   test cases cover all possible failure scenarios, and the associated
   procedures benchmark the ability of the DUT to perform recovery
   from failures within the target failover time.

   Figure 1 represents the basic reference test bed and is applicable
   to all the test cases defined in this document.  TG and TA
   represent a Traffic Generator and a Traffic Analyzer,
   respectively.  A tester is connected to the DUT; it sends and
   receives IP traffic along the working path and runs protocol
   emulations simulating real-world peering scenarios.
            -------------------------------------
           |          -------------------        |
           |         |                   |       |
        --------   --------   --------   --------   --------
    TG-|  R1    |-|  R2    |-|  R3    | |  R4    | |  R5    |-TA
       |        |-|        |-|        |-|        |-|        |
        --------   --------   --------   --------   --------
           |          |                      |       |
           |          |       --------       |       |
           |           ------|  R6    |------        |
            -----------------|        |--------------
                              --------

                   Fig.1: Fast Reroute Topology.

   The tester MUST record the number of lost, duplicate, and reordered
   packets.  It should further record arrival and departure times so
   that Failover Time, Additive Latency, and Reversion Time can be
   measured.  The tester may be a single device or a test system
   emulating all the different roles along a primary or backup path.

2. Existing definitions

   For the sake of clarity and continuity, this document adopts the
   template for definitions set out in Section 2 of RFC 1242.
   Definitions are indexed and grouped together in sections for ease
   of reference.

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC 2119.

   The reader is assumed to be familiar with the commonly used MPLS
   terminology, some of which is defined in [MPLS-FRR-EXT].

3. Test Considerations

   This section discusses the fundamentals of MPLS protection testing:

   - The types of network events that cause failover
   - Indications of failover
   - The use of data traffic
   - Traffic generation
   - LSP scaling
   - Reversion of LSP
   - IGP selection

3.1. Failover Events

   The failover to the backup tunnel is primarily triggered by a link
   or node failure observed downstream of the Point of Local Repair
   (PLR).  Some of these failure events are listed below.

   Link failure events

   - Interface shutdown on PLR side with POS alarm
   - Interface shutdown on remote side with POS alarm
   - Interface shutdown on PLR side with RSVP hello
   - Interface shutdown on remote side with RSVP hello
   - Interface shutdown on PLR side with BFD
   - Interface shutdown on remote side with BFD
   - Fiber pull on the PLR side (both TX & RX, or just the TX)
   - Fiber pull on the remote side (both TX & RX, or just the RX)
   - Online insertion and removal (OIR) on PLR side
   - OIR on remote side
   - Sub-interface failure (e.g., shutting down of a VLAN)
   - Parent interface shutdown (an interface bearing multiple
     sub-interfaces)

   Node failure events

   A system reload is initiated either by a graceful shutdown or by a
   power failure.  A system crash is referred to as a software failure
   or an assert.

   - Reload protected node, when RSVP Hello is enabled
   - Crash protected node, when RSVP Hello is enabled
   - Reload protected node, when BFD is enabled
   - Crash protected node, when BFD is enabled

3.2. Failure Detection [TERM-ID]

   Local failures can be detected via SONET/SDH failure on a directly
   connected LSR.  The failure indication may vary with the type of
   alarm - LOS, AIS, or RDI.  Ethernet technologies such as Gigabit
   Ethernet have no Layer 2 failure indication mechanism and therefore
   rely upon Layer 3 signaling for failure indication.

   Different MPLS protection mechanisms and different implementations
   use different failure detection techniques, such as RSVP hellos or
   BFD.  The failure detection time may not always be negligible, and
   it can impact the overall failover time.

   The test procedures in this document can be used for local or
   remote failure scenarios for comprehensive benchmarking and to
   evaluate failover performance independent of the failure detection
   techniques.
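   The failure events and detection mechanisms above form a natural
   test matrix.  As an illustration only (the labels are hypothetical
   shorthand for the events of Section 3.1, and the authoritative
   event/detection pairings are those enumerated there), a Python test
   harness might parameterize its runs as follows:

      from itertools import product

      # Link failure events of Section 3.1 (hypothetical shorthand).
      LINK_EVENTS = ["intf_shut_plr", "intf_shut_remote",
                     "fiber_pull_plr", "fiber_pull_remote",
                     "oir_plr", "oir_remote",
                     "subintf_shut", "parent_intf_shut"]
      LINK_DETECTION = ["pos_alarm", "rsvp_hello", "bfd"]

      # Node failure events pair only with RSVP Hello or BFD.
      NODE_EVENTS = ["reload_protected_node", "crash_protected_node"]
      NODE_DETECTION = ["rsvp_hello", "bfd"]

      # Each (event, detection) combination is benchmarked with the
      # procedures of Section 5.
      runs = list(product(LINK_EVENTS, LINK_DETECTION)) \
           + list(product(NODE_EVENTS, NODE_DETECTION))
      for event, detection in runs:
          print(f"benchmark failover: event={event}, det={detection}")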
3.3. Use of Data Traffic for MPLS Protection Benchmarking

   Currently, end customers use packet loss as the key metric for
   failover time.  Packet loss is an externally observable event and
   has a direct impact on customers' applications.  An MPLS protection
   mechanism is expected to minimize packet loss in the event of a
   failure.  For this reason, it is important to develop a standard
   router benchmarking methodology for measuring MPLS protection that
   uses packet loss as a metric.  At a known forwarding rate, packet
   loss can be measured and the failover time can be determined.
   Measurement of the control-plane signaling used to establish backup
   paths is not enough to verify failover; failover is best determined
   when packets are actually traversing the backup path.

   An additional benefit of using packet loss for calculation of
   failover time is that it allows a black-box test environment.  Data
   traffic is offered at line rate to the device under test (DUT), an
   emulated network failure event is forced to occur, and packet loss
   is externally measured to calculate the convergence time.  This
   setup is independent of the DUT architecture.

   In addition, this methodology considers packets in error and
   duplicate packets that could be generated during the failover
   process.  In scenarios where a separate measurement of errored and
   duplicate packets is difficult to obtain, these packets should be
   attributed to lost packets.

3.4. LSP and Route Scaling

   Failover time performance may vary with the number of established
   primary and backup tunnels (LSPs) and installed routes.  However,
   the procedure outlined here should be used for any number of LSPs
   (L) and any number of routes protected by the PLR (R).  The values
   of L and R must be recorded.

3.5. Selection of IGP

   The underlying IGP could be either ISIS-TE or OSPF-TE for the
   methodology proposed here.

3.6. Reversion [TERM-ID]

   Fast Reroute provides a method to restore traffic from a backup
   path to the original primary LSP upon recovery from the failure.
   This is referred to as Reversion, which can be implemented as
   Global Reversion or Local Reversion.  In all test cases listed
   here, Reversion should not produce any packet loss or any
   out-of-order or duplicate packets.  Each of the test cases in this
   methodology document provides a check to confirm that there is no
   packet loss.

3.7. Traffic generation

   It is suggested that there be one or more traffic streams, as long
   as there is a steady and constant rate of flow for all the streams.
   In order to monitor the DUT performance for recovery times, a set
   of route prefixes should be advertised before traffic is sent.  The
   traffic should be configured towards these routes.

   A typical example would be configuring the traffic generator to
   send traffic to the first, middle, and last of the advertised
   routes (where first, middle, and last are the numerically smallest,
   median, and largest of the advertised prefixes, respectively), as
   shown in the sketch below.  Generating traffic to all of the
   prefixes reachable by the protected tunnel (for example, in a
   round-robin fashion, where the traffic is destined to all the
   prefixes but to one prefix at a time in a cyclic manner) is not
   recommended: if there are many prefixes reachable through the LSP,
   the interval between two packets destined to the same prefix may
   become significant and comparable to the failover time being
   measured, which prevents an accurate failover measurement.
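   As an illustration only, the following Python sketch (assuming the
   advertised prefixes are available as a simple list; the example
   addresses are arbitrary) selects the numerically smallest, median,
   and largest prefixes described above:

      import ipaddress

      # Prefixes advertised by the tail-end (example values).
      advertised = ["10.0.0.0/24", "10.0.5.0/24", "10.0.1.0/24",
                    "10.0.9.0/24", "10.0.3.0/24"]

      # Sort numerically by network address, then pick the smallest,
      # median, and largest prefixes as traffic destinations.
      nets = sorted(ipaddress.ip_network(p) for p in advertised)
      first, middle, last = nets[0], nets[len(nets) // 2], nets[-1]

      # These three prefixes become the destinations of the traffic
      # streams configured on the traffic generator.
      print(first, middle, last)  # 10.0.0.0/24 10.0.3.0/24 10.0.9.0/24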
3.8. Motivation for topologies

   Given that the label stack depends on the following three entities,
   it is recommended that the benchmarking of failover time be
   performed on all eight topologies provided in Section 4:

   - Type of protection (link vs. node)

   - Number of remaining hops of the primary tunnel from the PLR

   - Number of remaining hops of the backup tunnel from the PLR

4. Test Setup

   This section proposes a set of topologies to be used for
   benchmarking the failover time; together they cover all the
   scenarios for local protection.  All eight topologies shown
   (Figures 2 to 9) can be mapped to the reference topology shown in
   Figure 1.  The topologies provided in Sections 4.1 to 4.8 refer to
   the test bed required to benchmark failover time when the DUT is
   configured as a PLR in either a head-end or mid-point role.  The
   label stack provided with each topology is at the PLR.

   The label stacks shown below each figure in Sections 4.1 to 4.8
   assume that Penultimate Hop Popping (PHP) is enabled.

   Figures 2-9 use the following convention:

   a) HE is Head-End

   b) TE is Tail-End

   c) MID is Mid point

   d) MP is Merge Point

   e) PLR is Point of Local Repair

   f) PRI is Primary

   g) BKP denotes Backup Node

4.1. Link Protection with 1 hop primary (from PLR) and 1 hop backup
     TE tunnels

        --------     --------   PRI    --------
       |   R1   |   |   R2   |        |   R3   |
    TG-|   HE   |---|   MID  |--------|   TE   |-TA
       |        |   |   PLR  |--------|        |
        --------     --------   BKP    --------

              Figure 2: Setup for Section 4.1

   Traffic               No of labels      No of labels
                         before failure    after failure

   IP TRAFFIC (P-P)            0                 0
   Layer3 VPN (PE-PE)          1                 1
   Layer3 VPN (PE-P)           2                 2
   Layer2 VC (PE-PE)           1                 1
   Layer2 VC (PE-P)            2                 2
   Mid-point LSPs              0                 0

4.2. Link Protection with 1 hop primary (from PLR) and 2 hop backup
     TE tunnels

        --------     --------          --------
       |   R1   |   |   R2   |  PRI   |   R3   |
    TG-|   HE   |   |   MID  |--------|   TE   |-TA
       |        |---|   PLR  |        |        |
        --------     --------          --------
                      BKP |                |
                          |   --------     |
                          |  |   R6   |    |
                           --|   BKP  |----
                             |   MID  |
                              --------

              Figure 3: Setup for Section 4.2

   Traffic               No of labels      No of labels
                         before failure    after failure

   IP TRAFFIC (P-P)            0                 1
   Layer3 VPN (PE-PE)          1                 2
   Layer3 VPN (PE-P)           2                 3
   Layer2 VC (PE-PE)           1                 2
   Layer2 VC (PE-P)            2                 3
   Mid-point LSPs              0                 1
4.3. Link Protection with 2+ hop (from PLR) primary and 1 hop backup
     TE tunnels

        --------   --------       --------       --------
       |  R1    | |  R2    | PRI |  R3    | PRI |  R4    |
    TG-|  HE    |-|  MID   |-----|  MID   |-----|  TE    |-TA
       |        | |  PLR   |-----|        |     |        |
        --------   --------  BKP  --------       --------

              Figure 4: Setup for Section 4.3

   Traffic               No of labels      No of labels
                         before failure    after failure

   IP TRAFFIC (P-P)            1                 1
   Layer3 VPN (PE-PE)          2                 2
   Layer3 VPN (PE-P)           3                 3
   Layer2 VC (PE-PE)           2                 2
   Layer2 VC (PE-P)            3                 3
   Mid-point LSPs              1                 1

4.4. Link Protection with 2+ hop (from PLR) primary and 2 hop backup
     TE tunnels

        --------   --------       --------       --------
       |  R1    | |  R2    | PRI |  R3    | PRI |  R4    |
    TG-|  HE    |-|  MID   |-----|  MID   |-----|  TE    |-TA
       |        | |  PLR   |     |        |     |        |
        --------   --------       --------       --------
                   BKP |              |
                       |   --------   |
                       |  |  R6    |  |
                        --|  BKP   |--
                          |  MID   |
                           --------

              Figure 5: Setup for Section 4.4

   Traffic               No of labels      No of labels
                         before failure    after failure

   IP TRAFFIC (P-P)            1                 2
   Layer3 VPN (PE-PE)          2                 3
   Layer3 VPN (PE-P)           3                 4
   Layer2 VC (PE-PE)           2                 3
   Layer2 VC (PE-P)            3                 4
   Mid-point LSPs              1                 2

4.5. Node Protection with 2 hop primary (from PLR) and 1 hop backup
     TE tunnels

        --------   --------       --------       --------
       |  R1    | |  R2    | PRI |  R3    | PRI |  R4    |
    TG-|  HE    |-|  MID   |-----|  MID   |-----|  TE    |-TA
       |        | |  PLR   |     |        |     |        |
        --------   --------       --------       --------
                   BKP |                            |
                        ----------------------------

              Figure 6: Setup for Section 4.5

   Traffic               No of labels      No of labels
                         before failure    after failure

   IP TRAFFIC (P-P)            1                 0
   Layer3 VPN (PE-PE)          2                 1
   Layer3 VPN (PE-P)           3                 2
   Layer2 VC (PE-PE)           2                 1
   Layer2 VC (PE-P)            3                 2
   Mid-point LSPs              1                 0

4.6. Node Protection with 2 hop primary (from PLR) and 2 hop backup
     TE tunnels

        --------   --------       --------       --------
       |  R1    | |  R2    |     |  R3    |     |  R4    |
    TG-|  HE    | |  MID   | PRI |  MID   | PRI |  TE    |-TA
       |        |-|  PLR   |-----|        |-----|        |
        --------   --------       --------       --------
                   BKP |                            |
                       |        --------            |
                       |       |  R6    |           |
                        -------|  BKP   |-----------
                               |  MID   |
                                --------

              Figure 7: Setup for Section 4.6

   Traffic               No of labels      No of labels
                         before failure    after failure

   IP TRAFFIC (P-P)            1                 1
   Layer3 VPN (PE-PE)          2                 2
   Layer3 VPN (PE-P)           3                 3
   Layer2 VC (PE-PE)           2                 2
   Layer2 VC (PE-P)            3                 3
   Mid-point LSPs              1                 1

4.7. Node Protection with 3+ hop primary (from PLR) and 1 hop backup
     TE tunnels

        --------   --------      --------      --------      --------
       |  R1    | |  R2    | PRI|  R3    | PRI|  R4    | PRI|  R5    |
    TG-|  HE    |-|  MID   |----|  MID   |----|  MP    |----|  TE    |-TA
       |        | |  PLR   |    |        |    |        |    |        |
        --------   --------      --------      --------      --------
                   BKP |                          |
                        --------------------------

              Figure 8: Setup for Section 4.7

   Traffic               No of labels      No of labels
                         before failure    after failure

   IP TRAFFIC (P-P)            1                 1
   Layer3 VPN (PE-PE)          2                 2
   Layer3 VPN (PE-P)           3                 3
   Layer2 VC (PE-PE)           2                 2
   Layer2 VC (PE-P)            3                 3
   Mid-point LSPs              1                 1

4.8. Node Protection with 3+ hop primary (from PLR) and 2 hop backup
     TE tunnels

        --------   --------      --------      --------      --------
       |  R1    | |  R2    |    |  R3    |    |  R4    |    |  R5    |
    TG-|  HE    | |  MID   | PRI|  MID   | PRI|  MP    | PRI|  TE    |-TA
       |        |-|  PLR   |----|        |----|        |----|        |
        --------   --------      --------      --------      --------
                   BKP |                          |
                       |       --------           |
                       |      |  R6    |          |
                        ------|  BKP   |----------
                              |  MID   |
                               --------

              Figure 9: Setup for Section 4.8

   Traffic               No of labels      No of labels
                         before failure    after failure

   IP TRAFFIC (P-P)            1                 2
   Layer3 VPN (PE-PE)          2                 3
   Layer3 VPN (PE-P)           3                 4
   Layer2 VC (PE-PE)           2                 3
   Layer2 VC (PE-P)            3                 4
   Mid-point LSPs              1                 2
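   The label counts in the tables of Sections 4.1 to 4.8 can serve as
   pass/fail criteria when inspecting captured traffic at the PLR.  As
   a purely illustrative sketch (the dictionary below transcribes only
   the first two tables, and the function name is hypothetical), a
   harness might validate the observed label-stack depth as follows:

      # Expected label-stack depth at the PLR as
      # (before_failure, after_failure), keyed by
      # (topology section, traffic type); values transcribed from
      # Sections 4.1 and 4.2 only, for brevity.
      EXPECTED_LABELS = {
          ("4.1", "IP TRAFFIC (P-P)"):   (0, 0),
          ("4.1", "Layer3 VPN (PE-PE)"): (1, 1),
          ("4.2", "IP TRAFFIC (P-P)"):   (0, 1),
          ("4.2", "Layer3 VPN (PE-PE)"): (1, 2),
      }

      def label_depth_ok(topology, traffic, observed, after_failure):
          """Check an observed MPLS label-stack depth against the
          tables; after_failure selects the post-failover column."""
          pair = EXPECTED_LABELS[(topology, traffic)]
          return observed == pair[1 if after_failure else 0]

      # Example: after failover in the Section 4.2 topology, plain IP
      # traffic should carry one label (the backup tunnel label).
      assert label_depth_ok("4.2", "IP TRAFFIC (P-P)", 1, True)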
5. Test Methodology

   The procedure described in this section can be applied to all eight
   base test cases and the associated topologies.  The backup and the
   primary tunnels are configured to be alike in terms of bandwidth
   usage.  In order to benchmark failover with all label-stack depths
   seen in current deployments, it is suggested that the methodology
   include all the scenarios listed here.

5.1. Headend as PLR with link failure

   Objective

   To benchmark the MPLS failover time due to link failure events,
   described in Section 3.1, experienced by the DUT, which is the
   Point of Local Repair (PLR).

   Test Setup

   - Select any one topology out of the eight in Section 4.
   - Select an overlay technology for the FRR test, e.g., IGP, VPN,
     or VC.
   - The DUT will also have two interfaces connected to the traffic
     generator/analyzer.  (If the node downstream of the PLR is not a
     simulated node, then the ingress of the tunnel should have one
     link connected to the traffic generator, and the node downstream
     of the PLR or the egress of the tunnel should have a link
     connected to the traffic analyzer.)

   Test Configuration

   1. Configure the number of primaries on R2 and the backups on R2
      as required by the topology selected.
   2. Advertise prefixes (as per the FRR scalability table described
      in Appendix A) from the tail-end.

   Procedure

   (A sketch of how this procedure might be automated follows the
   step list.)

   1.  Establish the primary LSP on R2 as required by the topology
       selected.
   2.  Establish the backup LSP on R2 as required by the selected
       topology.
   3.  Verify that the primary and backup LSPs are up and that the
       primary is protected.
   4.  Verify that Fast Reroute protection is enabled and ready.
   5.  Set up traffic streams as described in Section 3.7.
   6.  Send IP traffic at maximum forwarding rate to the DUT.
   7.  Verify that traffic is switched over the primary LSP.
   8.  Trigger any choice of link failure as described in Section
       3.1.
   9.  Verify that the primary tunnel and prefixes get mapped to the
       backup tunnel.
   10. Stop the traffic stream and measure the traffic loss.
   11. Failover time is calculated as defined in Section 6, Reporting
       Format.
   12. Start the traffic stream again to verify reversion when the
       protected interface comes up.  Traffic loss should be 0 due to
       make-before-break or reversion.
   13. Enable the protected interface that was down (the node in the
       case of NNHOP).
   14. Verify that the head-end signals a new LSP and that protection
       is in place again.
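   The following Python sketch shows how the procedure above might be
   automated.  It is purely illustrative: tester and dut are
   hypothetical wrappers around a traffic generator/analyzer and the
   DUT (no particular product's API is implied), and the comments
   refer to the step numbers above.

      def benchmark_headend_link_failure(tester, dut, topology, event):
          """Hypothetical automation of the Section 5.1 procedure."""
          dut.establish_primary_lsp(topology)             # step 1
          dut.establish_backup_lsp(topology)              # step 2
          assert dut.lsps_up_and_protected()              # steps 3-4
          tester.setup_streams()                          # step 5
          tester.start_traffic(rate="max_forwarding")     # step 6
          assert tester.traffic_on_primary()              # step 7
          dut.trigger_failure(event)                      # step 8
          assert dut.prefixes_mapped_to_backup()          # step 9
          loss = tester.stop_and_get_loss()               # step 10
          failover_ms = loss / tester.offered_pps * 1000  # step 11 (PBLM)
          tester.start_traffic(rate="max_forwarding")     # step 12
          dut.restore_failed_element(event)               # step 13
          assert tester.stop_and_get_loss() == 0          # reversion: 0 loss
          assert dut.protection_ready()                   # step 14
          return failover_ms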
5.2. Mid-Point as PLR with link failure

   Objective

   To benchmark the MPLS failover time due to link failure events,
   described in Section 3.1, experienced by the device under test,
   which is the Point of Local Repair (PLR).

   Test Setup

   - Select any one topology out of the eight in Section 4.
   - Select the overlay technology for the FRR test as mid-point
     LSPs.
   - The DUT will also have two interfaces connected to the traffic
     generator.

   Test Configuration

   1. Configure the number of primaries on R1 and the backups on R2
      as required by the topology selected.
   2. Advertise prefixes (as per the FRR scalability table described
      in Appendix A) from the tail-end.

   Procedure

   1.  Establish the primary LSP on R1 as required by the topology
       selected.
   2.  Establish the backup LSP on R2 as required by the selected
       topology.
   3.  Verify that the primary and backup LSPs are up and that the
       primary is protected.
   4.  Verify Fast Reroute protection.
   5.  Set up traffic streams as described in Section 3.7.
   6.  Send IP traffic at maximum forwarding rate to the DUT.
   7.  Verify that traffic is switched over the primary LSP.
   8.  Trigger any choice of link failure as described in Section
       3.1.
   9.  Verify that the primary tunnel and prefixes get mapped to the
       backup tunnel.
   10. Stop the traffic stream and measure the traffic loss.
   11. Failover time is calculated as defined in Section 6, Reporting
       Format.
   12. Start the traffic stream again to verify reversion when the
       protected interface comes up.  Traffic loss should be 0 due to
       make-before-break or reversion.
   13. Enable the protected interface that was down (the node in the
       case of NNHOP).
   14. Verify that the head-end signals a new LSP and that protection
       is in place again.

5.3. Headend as PLR with Node failure

   Objective

   To benchmark the MPLS failover time due to node failure events,
   described in Section 3.1, experienced by the device under test,
   which is the Point of Local Repair (PLR).

   Test Setup

   - Select any one topology from Sections 4.5 to 4.8.
   - Select an overlay technology for the FRR test, e.g., IGP, VPN,
     or VC.
   - The DUT will also have two interfaces connected to the traffic
     generator.

   Test Configuration

   1. Configure the number of primaries on R2 and the backups on R2
      as required by the topology selected.
   2. Advertise prefixes (as per the FRR scalability table described
      in Appendix A) from the tail-end.

   Procedure

   1.  Establish the primary LSP on R2 as required by the topology
       selected.
   2.  Establish the backup LSP on R2 as required by the selected
       topology.
   3.  Verify that the primary and backup LSPs are up and that the
       primary is protected.
   4.  Verify Fast Reroute protection.
   5.  Set up traffic streams as described in Section 3.7.
   6.  Send IP traffic at maximum forwarding rate to the DUT.
   7.  Verify that traffic is switched over the primary LSP.
   8.  Trigger any choice of node failure as described in Section
       3.1.
   9.  Verify that the primary tunnel and prefixes get mapped to the
       backup tunnel.
   10. Stop the traffic stream and measure the traffic loss.
   11. Failover time is calculated as defined in Section 6, Reporting
       Format.
   12. Start the traffic stream again to verify reversion when the
       protected node comes up.  Traffic loss should be 0 due to
       make-before-break or reversion.
   13. Boot the protected node that was down.
   14. Verify that the head-end signals a new LSP and that protection
       is in place again.
5.4. Mid-Point as PLR with Node failure

   Objective

   To benchmark the MPLS failover time due to node failure events,
   described in Section 3.1, experienced by the device under test,
   which is the Point of Local Repair (PLR).

   Test Setup

   - Select any one topology from Sections 4.5 to 4.8.
   - Select the overlay technology for the FRR test as mid-point
     LSPs.
   - The DUT will also have two interfaces connected to the traffic
     generator.

   Test Configuration

   1. Configure the number of primaries on R1 and the backups on R2
      as required by the topology selected.
   2. Advertise prefixes (as per the FRR scalability table described
      in Appendix A) from the tail-end.

   Procedure

   1.  Establish the primary LSP on R1 as required by the topology
       selected.
   2.  Establish the backup LSP on R2 as required by the selected
       topology.
   3.  Verify that the primary and backup LSPs are up and that the
       primary is protected.
   4.  Verify Fast Reroute protection.
   5.  Set up traffic streams as described in Section 3.7.
   6.  Send IP traffic at maximum forwarding rate to the DUT.
   7.  Verify that traffic is switched over the primary LSP.
   8.  Trigger any choice of node failure as described in Section
       3.1.
   9.  Verify that the primary tunnel and prefixes get mapped to the
       backup tunnel.
   10. Stop the traffic stream and measure the traffic loss.
   11. Failover time is calculated as defined in Section 6, Reporting
       Format.
   12. Start the traffic stream again to verify reversion when the
       protected node comes up.  Traffic loss should be 0 due to
       make-before-break or reversion.
   13. Boot the protected node that was down.
   14. Verify that the head-end signals a new LSP and that protection
       is in place again.

5.5. MPLS FRR Forwarding Performance Test Cases

   The following MPLS FRR forwarding performance benchmarking cases
   measure the maximum packet-per-second (PPS) rate permitted by the
   given hardware.

5.5.1. PLR as Headend

   Objective

   To benchmark the maximum rate (pps) on the PLR (as head-end) over
   the primary FRR LSP and the backup LSP.

   Test Setup

   - Select any one topology out of the eight in Section 4.
   - Select an overlay technology for the FRR test, e.g., IGP, VPN,
     or VC.
   - The DUT will also have two interfaces connected to the traffic
     generator/analyzer.  (If the node downstream of the PLR is not a
     simulated node, then the ingress of the tunnel should have one
     link connected to the traffic generator, and the node downstream
     of the PLR or the egress of the tunnel should have a link
     connected to the traffic analyzer.)

   Procedure

   1.  Establish the primary LSP on R2 as required by the topology
       selected.
   2.  Establish the backup LSP on R2 as required by the selected
       topology.
   3.  Verify that the primary and backup LSPs are up and that the
       primary is protected.
   4.  Verify that Fast Reroute protection is enabled and ready.
   5.  Set up traffic streams as described in Section 3.7.
   6.  Send IP traffic at the maximum forwarding rate (pps) that the
       device under test supports over the primary LSP.
   7.  Record the maximum PPS rate forwarded over the primary LSP.
   8.  Stop the traffic stream.
   9.  Trigger any choice of link failure as described in Section
       3.1.
   10. Verify that the primary tunnel and prefixes get mapped to the
       backup tunnel.
   11. Send IP traffic at the maximum forwarding rate (pps) that the
       device under test supports over the backup LSP.
   12. Record the maximum PPS rate forwarded over the backup LSP.
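   As an illustrative aside (the numbers are arbitrary examples), the
   two rates recorded in steps 7 and 12 can be compared to quantify
   any forwarding-performance penalty on the backup path:

      # Maximum forwarding rates recorded in steps 7 and 12.
      primary_pps = 1_200_000   # max PPS over the primary LSP
      backup_pps = 1_150_000    # max PPS over the backup LSP

      # Degradation of the backup path relative to the primary.
      degradation_pct = (primary_pps - backup_pps) / primary_pps * 100
      print(f"backup-path degradation: {degradation_pct:.2f}%")  # 4.17%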
5.5.2. PLR as Mid-point

   Objective

   To benchmark the maximum rate (pps) on the PLR (as mid-point) over
   the primary FRR LSP and the backup LSP.

   Test Setup

   - Select any one topology out of the eight in Section 4.
   - Select the overlay technology for the FRR test as mid-point
     LSPs.
   - The DUT will also have two interfaces connected to the traffic
     generator.

   Procedure

   1.  Establish the primary LSP on R1 as required by the topology
       selected.
   2.  Establish the backup LSP on R2 as required by the selected
       topology.
   3.  Verify that the primary and backup LSPs are up and that the
       primary is protected.
   4.  Verify that Fast Reroute protection is enabled and ready.
   5.  Set up traffic streams as described in Section 3.7.
   6.  Send IP traffic at the maximum forwarding rate (pps) that the
       device under test supports over the primary LSP.
   7.  Record the maximum PPS rate forwarded over the primary LSP.
   8.  Stop the traffic stream.
   9.  Trigger any choice of link failure as described in Section
       3.1.
   10. Verify that the primary tunnel and prefixes get mapped to the
       backup tunnel.
   11. Send IP traffic at the maximum forwarding rate (pps) that the
       device under test supports over the backup LSP.
   12. Record the maximum PPS rate forwarded over the backup LSP.

6. Reporting Format

   For each test, it is recommended that the results be reported in
   the following format.

   Parameter                                 Units

   IGP used for the test                     ISIS-TE / OSPF-TE
   Interface types                           GigE, POS, ATM, VLAN, etc.
   Packet sizes offered to the DUT           Bytes
   Forwarding rate                           Packets per second
   IGP routes advertised                     Number of IGP routes
   RSVP hello timers configured (if any)     Milliseconds
   Number of FRR tunnels configured          Number of tunnels
   Number of VPN routes in head-end          Number of VPN routes
   Number of VC tunnels                      Number of VC tunnels
   Number of BGP routes                      Number of BGP routes
   Number of mid-point tunnels               Number of tunnels
   Number of prefixes protected by primary   Number of prefixes
   Number of LSPs being protected            Number of LSPs
   Topology being used                       Section number
   Failure event                             Event type

   Benchmarks

   Minimum failover time                     Milliseconds
   Mean failover time                        Milliseconds
   Maximum failover time                     Milliseconds
   Minimum reversion time                    Milliseconds
   Mean reversion time                       Milliseconds
   Maximum reversion time                    Milliseconds

   The failover time suggested above is calculated using one of the
   following three methods:

   1. Packet-Based Loss Method (PBLM): (number of packets dropped /
      offered packets per second) * 1000 milliseconds.  This method
      could also be referred to as the Rate Derived Method.

   2. Time-Based Loss Method (TBLM): This method relies on the ability
      of the traffic generators to provide statistics which reveal the
      duration of the failure in milliseconds based on when the packet
      loss occurred (i.e., the interval between non-zero packet loss
      and zero loss).

   3. Timestamp-Based Method (TBM): This method of failover
      calculation is based on the timestamp that is transmitted as
      payload in the packets originated by the generator.  The traffic
      analyzer records the timestamp of the last packet received
      before the failover event and the first packet received after
      the failover, and derives the failover time from the difference
      between these two timestamps.  Note: the payload could also
      contain sequence numbers for out-of-order and duplicate packet
      calculation.
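   As an illustration only (a minimal sketch; no particular traffic
   generator's API is implied), the three methods can be expressed as
   follows:

      def failover_pblm(packets_dropped, offered_pps):
          """Packet-Based Loss Method: dropped packets divided by the
          offered rate, expressed in milliseconds."""
          return packets_dropped / offered_pps * 1000.0

      def failover_tblm(loss_start_ms, loss_end_ms):
          """Time-Based Loss Method: duration of the non-zero-loss
          interval reported by the traffic generator, in ms."""
          return loss_end_ms - loss_start_ms

      def failover_tbm(last_ts_before_ms, first_ts_after_ms):
          """Timestamp-Based Method: difference between the payload
          timestamps of the last packet received before the failure
          and the first packet received after the failover, in ms."""
          return first_ts_after_ms - last_ts_before_ms

      # Example: 25,000 packets lost at an offered load of 100,000
      # pps corresponds to a 250 ms failover time under PBLM.
      assert failover_pblm(25_000, 100_000) == 250.0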
   Note: If the primary is configured to be dynamic, and if the
   primary is to reroute, make-before-break should occur from the
   backup that is in use to a new alternate primary.  If any packet
   loss is seen, it should be added to the failover time.

7. IANA Considerations

   This document requires no IANA considerations.

8. Security Considerations

   Benchmarking activities as described in this memo are limited to
   technology characterization using controlled stimuli in a
   laboratory environment, with dedicated address space and the
   constraints specified in the sections above.

   The benchmarking network topology will be an independent test setup
   and MUST NOT be connected to devices that may forward the test
   traffic into a production network or misroute traffic to the test
   management network.

   Further, benchmarking is performed on a "black-box" basis, relying
   solely on measurements observable external to the DUT/SUT.

   Special capabilities SHOULD NOT exist in the DUT/SUT specifically
   for benchmarking purposes.  Any implications for network security
   arising from the DUT/SUT SHOULD be identical in the lab and in
   production networks.

   The isolated nature of the benchmarking environments, and the fact
   that no special features or capabilities other than those used in
   operational networks are enabled on the DUT/SUT, means that no
   security considerations specific to the benchmarking process are
   required.

9. Acknowledgements

   We would like to thank Jean-Philippe Vasseur for his invaluable
   input to the document and Curtis Villamizar for his contribution in
   suggesting text on the definition of and need for benchmarking
   correlated failures.  Additionally, we would like to thank Arun
   Gandhi, Amrit Hanspal, and Karu Ratnam for their input to the
   document.

10. References

10.1. Normative References

   [MPLS-FRR-EXT]  Pan, P., Swallow, G., and A. Atlas, "Fast Reroute
                   Extensions to RSVP-TE for LSP Tunnels", RFC 4090,
                   May 2005.

10.2. Informative References

   [RFC-WORDS]     Bradner, S., "Key words for use in RFCs to Indicate
                   Requirement Levels", BCP 14, RFC 2119, March 1997.

   [TERM-ID]       Poretsky, S., Papneja, R., Karthik, J., and
                   S. Vapiwala, "Benchmarking Terminology for
                   Protection Performance",
                   draft-ietf-bmwg-protection-term-02 (work in
                   progress).

   [IGP-METH]      Poretsky, S. and B. Imhoff, "Benchmarking
                   Methodology for IGP Data Plane Route Convergence",
                   draft-ietf-bmwg-igp-dataplane-conv-meth-12 (work in
                   progress).
11. Authors' Addresses

   Rajiv Papneja
   Isocore
   12359 Sunrise Valley Drive, STE 100
   Reston, VA 20190
   USA
   Phone: +1 703 860 9273
   Email: rpapneja@isocore.com

   Samir Vapiwala
   Cisco Systems
   300 Beaver Brook Road
   Boxborough, MA 01719
   USA
   Phone: +1 978 936 1484
   Email: svapiwal@cisco.com

   Jay Karthik
   Cisco Systems
   300 Beaver Brook Road
   Boxborough, MA 01719
   USA
   Phone: +1 978 936 0533
   Email: jkarthik@cisco.com

   Scott Poretsky
   Reef Point Systems
   8 New England Executive Park
   Burlington, MA 01803
   USA
   Phone: +1 781 395 5090
   Email: sporetsky@reefpoint.com

   Shankar Rao
   Qwest Communications
   950 17th Street, Suite 1900
   Denver, CO 80210
   USA
   Phone: +1 303 437 6643
   Email: shankar.rao@qwest.com

   Jean-Louis Le Roux
   France Telecom
   2 av Pierre Marzin
   22300 Lannion
   France
   Phone: 00 33 2 96 05 30 20
   Email: jeanlouis.leroux@orange-ft.com

Full Copyright Statement

   Copyright (C) The IETF Trust (2007).

   This document is subject to the rights, licenses and restrictions
   contained in BCP 78, and except as set forth therein, the authors
   retain all their rights.

Disclaimer

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Acknowledgement

   Funding for the RFC Editor function is currently provided by the
   Internet Society.

Appendix A: Fast Reroute Scalability Table

   This section provides the recommended numbers for evaluating the
   scalability of Fast Reroute implementations.  It also recommends
   the typical numbers for IGP/VPNv4 prefixes, LSP tunnels, and VC
   entries.  Based on the features supported by the device under test,
   appropriate scaling limits can be used for the test bed.
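   As an aside, the scaling levels in the tables that follow can be
   combined into a single test campaign.  The short Python sketch
   below is illustrative only; the levels are transcribed from Table
   A 1, and "Max" (platform-dependent) is omitted:

      # Scaling levels from Table A 1 (FRR IGP Table).
      PREFIX_LEVELS = [100, 500, 1000, 2000, 5000]

      # 1 or 2 (load-balanced) headend TE LSPs are paired with every
      # prefix level; larger levels pair equal LSP/prefix counts.
      matrix = [(lsps, p) for lsps in (1, 2) for p in PREFIX_LEVELS]
      matrix += [(n, n) for n in (100, 500, 1000, 2000)]

      for lsps, prefixes in matrix:
          print(f"run with {lsps} TE LSP(s), {prefixes} IGP prefixes")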
It also recommends the 1183 typical numbers for IGP/VPNv4 Prefixes, LSP Tunnels and VC entries. 1184 Based on the features supported by the device under test, appropriate 1185 scaling limits can be used for the test bed. 1187 A 1. FRR IGP Table 1189 No of Headend IGP Prefixes 1190 TE LSPs 1191 1 100 1192 1 500 1193 1 1000 1194 1 2000 1195 1 5000 1196 2(Load Balance) 100 1197 2(Load Balance) 500 1198 2(Load Balance) 1000 1199 2(Load Balance) 2000 1200 2(Load Balance) 5000 1201 100 100 1203 Poretsky, Rao, Le Roux 1204 Protection Mechanisms 1205 500 500 1206 1000 1000 1207 2000 2000 1209 A 2. FRR VPN Table 1211 No of Headend VPNv4 Prefixes 1212 TE LSPs 1214 1 100 1215 1 500 1216 1 1000 1217 1 2000 1218 1 5000 1219 1 10000 1220 1 20000 1221 1 Max 1222 2(Load Balance) 100 1223 2(Load Balance) 500 1224 2(Load Balance) 1000 1225 2(Load Balance) 2000 1226 2(Load Balance) 5000 1227 2(Load Balance) 10000 1228 2(Load Balance) 20000 1229 2(Load Balance) Max 1231 A 3. FRR Mid-Point LSP Table 1233 No of Mid-point TE LSPs could be configured at the following 1234 recommended levels 1235 100 1236 500 1237 1000 1238 2000 1239 Max supported number 1241 A 4. FRR VC Table 1243 Poretsky, Rao, Le Roux 1244 Protection Mechanisms 1246 No of Headend VC entries 1247 TE LSPs 1249 1 100 1250 1 500 1251 1 1000 1252 1 2000 1253 1 Max 1254 100 100 1255 500 500 1256 1000 1000 1257 2000 2000 1259 Appendix B: Abbreviations 1261 BFD - Bidirectional Fault Detection 1262 BGP - Border Gateway protocol 1263 CE - Customer Edge 1264 DUT - Device Under Test 1265 FRR - Fast Reroute 1266 IGP - Interior Gateway Protocol 1267 IP - Internet Protocol 1268 LSP - Label Switched Path 1269 MP - Merge Point 1270 MPLS - Multi Protocol Label Switching 1271 N-Nhop - Next - Next Hop 1272 Nhop - Next Hop 1273 OIR - Online Insertion and Removal 1274 P - Provider 1275 PE - Provider Edge 1276 PHP - Penultimate Hop Popping 1277 PLR - Point of Local Repair 1278 RSVP - Resource reSerVation Protocol 1279 SRLG - Shared Risk Link Group 1280 TA - Traffic Analyzer 1281 TE - Traffic Engineering 1282 TG - Traffic Generator 1283 VC - Virtual Circuit 1284 VPN - Virtual Private Network 1286 Poretsky, Rao, Le Roux