Network Working Group                                    Vishwas Manral
Internet Draft                                         Netplane Systems
                                                             Russ White
                                                          Cisco Systems
                                                            Aman Shaikh
Expiration Date: November 2004                 University of California
File Name: draft-ietf-bmwg-ospfconv-applicability-05.txt       May 2004

          Benchmarking Applicability for Basic OSPF Convergence
             draft-ietf-bmwg-ospfconv-applicability-05.txt

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026.

   Internet Drafts are working documents of the Internet Engineering
   Task Force (IETF), its Areas, and its Working Groups.  Note that
   other groups may also distribute working documents as Internet
   Drafts.

   Internet Drafts are draft documents valid for a maximum of six
   months.  Internet Drafts may be updated, replaced, or obsoleted by
   other documents at any time.  It is not appropriate to use Internet
   Drafts as reference material or to cite them other than as a
   "working draft" or "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The Internet Society (2004).  All Rights Reserved.

Abstract

   This document discusses the applicability of various tests for
   measuring single router control plane convergence, specifically in
   regard to the Open Shortest Path First (OSPF) protocol.  There are
   two general sections in this document, the first discussing the
   advantages and limitations of specific OSPF convergence tests, and
   the second discussing more general pitfalls to be considered when
   performing routing protocol convergence testing.

1. Introduction

   There is a growing interest in testing single router control plane
   convergence for routing protocols, with many people looking at
   testing methodologies which can provide information on how long it
   takes for a network to converge after various network events
   occur.  It is important to consider the framework within which any
   given convergence test is executed when attempting to apply the
   results of the testing, since the framework can have a major impact
   on the results.  For instance, the determination of when a network
   is converged, which parts of the router's operation are considered
   within the testing, and other such factors will have a major impact
   on the apparent performance of a routing protocol.

   This document describes in detail the various benefits and pitfalls
   of the tests described in [BENCHMARK].  It also explains how such
   measurements can be useful for providers and the research
   community.

   NOTE: The word convergence within this document refers to single
   router control plane convergence [TERM].

2. Advantages of Such Measurement

   o  To be able to compare the iterations of a protocol
      implementation.  It is often useful to be able to compare the
      performance of two iterations of a given implementation of a
      protocol to determine where improvements have been made and
      where further improvements can be made.

   o  To understand, given a set of parameters (network conditions),
      how a particular implementation on a particular device is going
      to perform.  For instance, if you were trying to decide the
      processing power (size of device) required at a certain location
      within a network, you can emulate the conditions which are going
      to exist at that point in the network and use the tests
      described to measure the performance of several different
      routers.  The results of these tests can provide one possible
      data point for an intelligent decision.

      If the device being tested is to be deployed in a running
      network, using routes taken from the network where the equipment
      is to be deployed in these tests, rather than some generated
      topology, will give results which are closer to the real
      performance of the device.
      Care should be taken to emulate or take routes from the actual
      location in the network where the device will be (or would be)
      deployed.  For instance, one set of routes may be taken from an
      ABR, one set from an area 0 only router, various sets from stub
      areas, another set from various normal areas, etc.

   o  To measure the performance of an OSPF implementation in a wide
      variety of scenarios.

   o  To be used as parameters in OSPF simulations by researchers.  It
      may sometimes be required for certain kinds of research to
      measure the individual delays of each parameter within an OSPF
      implementation.  These delays can be measured using the methods
      defined in [BENCHMARK].

   o  To help optimize certain configurable parameters.  It may
      sometimes be helpful for operators to know the delay required
      for individual tasks so as to optimize the resource usage in the
      network; for example, if it is found that the processing time is
      x seconds on a router, it would be helpful to determine the rate
      at which to flood LSAs to that router so as not to overload the
      network.

3. Assumptions Made and Limitations of Such Measurements

   o  The interaction between convergence and forwarding is not
      considered; testing is restricted to events occurring within the
      control plane.  Forwarding performance is the primary focus in
      [INTERCONNECT], and it is expected to be dealt with in work that
      ensues from [FIB-TERM].

   o  Duplicate LSAs are acknowledged immediately.  A few tests rely
      on the property that acknowledgements of duplicate LSAs are not
      delayed but are sent immediately.  However, if an implementation
      does not acknowledge duplicate LSAs immediately on receipt, the
      testing methods presented in [BENCHMARK] could give inaccurate
      measurements.

   o  It is assumed that SPF is non-preemptive.  If SPF is implemented
      so that it can (and will) be preempted, the SPF measurements
      taken in [BENCHMARK] would include the times that the SPF
      process is not running ([BENCHMARK] measures the total time
      taken for SPF to run, not the amount of time that SPF actually
      spends on the device's processor), thus giving inaccurate
      measurements.

   o  Some implementations may be multithreaded or use a
      multiprocess/multirouter model of OSPF.  If any of the
      assumptions made in the measurements are violated by such a
      model, the measurements could be inaccurate.

   o  The measurements resulting from the tests in [BENCHMARK] may not
      provide the information required to deploy a device in a large
      scale network.  The tests described focus on individual
      components of an OSPF implementation's performance, and it may
      be difficult to combine the measurements in a way which
      accurately depicts a device's performance in a large scale
      network.  Further research is required in this area.

   o  The measurements described in [BENCHMARK] should be used with
      great care when comparing two different implementations of OSPF
      from two different vendors.  For instance, there are many
      factors other than convergence speed that need to be taken into
      consideration when comparing different vendors' products, and it
      is difficult to align the resources available on one device with
      the resources available on another.

4. Observations on the Tests Described in [BENCHMARK]

   Some observations taken while implementing the tests described in
   [BENCHMARK] are noted in this section.

4.1. Measuring the SPF Processing Time Externally

   The most difficult test to perform is the external measurement of
   the time required to perform an SPF calculation, since the amount
   of time between the first LSA which indicates a topology change and
   the duplicate LSA is critical.  If the duplicate LSA is sent too
   quickly, it may be received before the device under test actually
   begins running SPF on the network change information.  If the delay
   between the two LSAs is too long, the device under test may finish
   SPF processing before receiving the duplicate LSA.  It is important
   to closely investigate any delays between the receipt of an LSA and
   the beginning of an SPF calculation in the device under test;
   multiple tests with various delays might be required to determine
   what delay needs to be used to accurately measure the SPF
   calculation time.

   Some implementations may force two intervals, the SPF hold time and
   the SPF delay, between successive SPF calculations.  If an SPF hold
   time exists, it should be subtracted from the total SPF execution
   time.  If an SPF delay exists, it should be noted in the test
   results.

4.2. Noise in the Measurement Device

   The device on which measurements are taken (not the device under
   test) also adds noise to the test results, primarily in the form of
   delay in packet processing and measurement output.  The largest
   source of noise is generally the delay between the receipt of
   packets by the measuring device and the information about the
   packet reaching the device's output, where the event can be
   measured.  The following steps may be taken to reduce this sampling
   noise:

   o  Increasing the number of samples taken will generally improve
      the tester's ability to determine what is noise and to remove it
      from the results.

   o  Try to take the time-stamp for a packet as early as possible.
      Depending on the operating system being used on the box, one can
      instrument the kernel to take the time-stamp when the interrupt
      is processed.  This does not eliminate the noise completely, but
      it at least reduces it.

   o  Keep the measurement box as lightly loaded as possible.

   o  Having an estimate of the noise can also be useful.

   The DUT also adds noise to the measurement.  The first and third
   points above apply to the DUT as well.

4.3. Gaining an Understanding of the Implementation Improves
     Measurements

   While the tester will (generally) not have access to internal
   information about the OSPF implementation being tested using
   [BENCHMARK], the more thorough the tester's knowledge of the
   implementation is, the more accurate the results of the tests will
   be.  For instance, in some implementations, the installation of
   routes in the local routing table may occur while the SPF is being
   calculated, dramatically impacting the time required to calculate
   the SPF.

4.4. Gaining an Understanding of the Tests Improves Measurements

   One method which can be used to become familiar with the tests
   described in [BENCHMARK] is to perform the tests on an OSPF
   implementation for which all the internal details are available,
   such as [GATED].  While there is no assurance that any two
   implementations will be similar, this will provide a better
   understanding of the tests themselves.
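
   As a rough illustration of the external SPF measurement described
   in Section 4.1, the sketch below shows one way the timestamps
   collected on the measurement device might be combined into an SPF
   time estimate.  It is not taken from [BENCHMARK]; the function
   name, its parameters, and the idea of a separately measured
   baseline acknowledgement delay are assumptions made purely for
   illustration.

      # Sketch: estimate the externally observed SPF time on the DUT
      # from timestamps (in seconds) taken on the measurement device.
      def estimate_spf_time(t_dup_sent,    # duplicate LSA sent to DUT
                            t_dup_acked,   # ack for the duplicate seen
                            ack_baseline,  # ack delay measured when no
                                           # SPF run is in progress
                            spf_hold=0.0): # configured SPF hold time,
                                           # subtracted per Section 4.1
          raw = t_dup_acked - t_dup_sent
          # Remove the DUT's normal acknowledgement handling time and
          # any hold time, leaving an estimate of the SPF calculation.
          return max(0.0, raw - ack_baseline - spf_hold)

      # Example: ack seen 180 ms after the duplicate was sent, 5 ms
      # baseline ack delay, no hold time -> roughly 175 ms of SPF.
      print(estimate_spf_time(10.000, 10.180, 0.005))

   Repeating the test with several different delays between the
   topology-change LSA and the duplicate, as discussed in Section 4.1,
   and averaging over many samples (Section 4.2) helps separate the
   SPF time from measurement noise.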

5. LSA and Destination Mix

   In many OSPF benchmark tests, a generator injecting a number of
   LSAs is called for.  There are several areas in which the injected
   LSAs can be varied in testing:

   o  The number of destinations represented by the injected LSAs

      Each destination represents a single reachable IP network; these
      will be leaf nodes on the shortest path tree.  The primary
      impact on performance should be the time required to insert
      destinations into the local routing table and to handle the
      memory required to store the data.

   o  The types of LSAs injected

      There are several types of LSAs which would be acceptable under
      different situations; within an area, for instance, LSAs of
      types 1, 2, 3, 4, and 5 are likely to be received by a router.
      Within a not-so-stubby area, however, type-7 LSAs would replace
      the type-5 LSAs received.  These sorts of characterizations are
      important to note in any test results.

   o  The number of LSAs injected

      Within any injected set of information, the number of each type
      of LSA injected is also important.  This will impact the
      shortest path algorithm's ability to handle large numbers of
      nodes, large shortest path first trees, etc.

   o  The order of LSA injection

      The order in which LSAs are injected should not favor any given
      data structure used for storing the LSA database on the device
      under test.  For instance, AS-external LSAs have AS-wide
      flooding scope; any type-5 LSA originated is immediately flooded
      to all neighbors.  However, the type-4 LSA which announces the
      ASBR as a border router is originated into an area at SPF time
      (by the ABRs on the edge of the area in which the ASBR resides).
      If SPF is not scheduled immediately on the ABRs originating the
      type-4 LSA, the type-4 LSA is sent after the type-5 LSAs reach a
      router in the adjacent area, so routes to the external
      destinations are not immediately added on the routers in the
      other areas.  When the routers which already have the type-5
      LSAs receive the type-4 LSA, all the external routes are added
      to the tree at the same time.  This timing could produce
      different results than a router receiving a type-4 LSA
      indicating the presence of a border router, followed by the
      type-5 LSAs originated by that border router.

      The ordering can be changed in various tests to provide insight
      into the efficiency of storage within the DUT.  Any such changes
      in ordering should be noted in test results.

6. Tree Shape and the SPF Algorithm

   The complexity of Dijkstra's algorithm depends on the data
   structure used for storing vertices with their current minimum
   distances from the source, the simplest structure being a list of
   vertices currently reachable from the source.  With a simple list
   of vertices, finding the minimum-cost vertex takes O(size of the
   list) time.  There will be O(n) such operations if we assume that
   all the vertices are ultimately reachable from the source.
   Moreover, after the vertex with the minimum cost is found, the
   algorithm iterates through all the edges of that vertex and updates
   the costs of the other vertices.  With an adjacency list
   representation, this step, summed over all the vertices, takes O(E)
   time, where E is the number of edges in the graph.  Thus, the
   overall running time is:

      O(sum(i = 1 to n) size(list at level i) + E)

   So, everything boils down to size(list at level i).
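
   The following sketch (an illustration only, not code from any
   particular OSPF implementation) shows the simple list-based form of
   Dijkstra's algorithm described above; the scan for the minimum-cost
   vertex on each iteration is where the size(list at level i) term
   comes from, and the neighbor updates contribute the O(E) term.

      import math

      def dijkstra_list(graph, source):
          # graph: dict mapping each vertex to a list of
          # (neighbor, cost) pairs.
          dist = {v: math.inf for v in graph}
          dist[source] = 0
          candidates = {source}   # the "list" of reachable vertices
                                  # not yet finalized
          done = set()
          while candidates:
              # O(size of the list): scan for the minimum-cost vertex.
              u = min(candidates, key=lambda v: dist[v])
              candidates.remove(u)
              done.add(u)
              # Update the costs of u's neighbors; summed over the
              # whole run, this is the O(E) term.
              for v, cost in graph[u]:
                  if v not in done and dist[u] + cost < dist[v]:
                      dist[v] = dist[u] + cost
                      candidates.add(v)
          return dist

   How large the candidate list becomes at each step depends on the
   shape of the graph, as the examples below show.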

   If the graph is linear:

         root
          |
          1
          |
          2
          |
          3
          |
          4
          |
          5
          |
          6

   and the source is a vertex at one end, then size(list at level i)
   = 1 for all i.  Moreover, E = n - 1.  Therefore, the running time
   is O(n).

   If the graph is a balanced binary tree:

                root
               /    \
              1      2
             / \    / \
            3   4  5   6

   size(list at level i) is a little more complicated.  First it
   increases by 1 at each level up to a certain number, and then it
   goes down by 1.  If we assume that the tree is a complete tree
   (like the one in the draft) with k levels (1 to k), then size(list)
   goes like this: 1, 2, 3, ...

   The number of edges E is still n - 1.  It then turns out that the
   run-time is O(n^2) for such a tree.

   If the graph is a complete graph (fully-connected mesh), then
   size(list at level i) = n - i.  The number of edges E = O(n^2).
   Therefore, the run-time is O(n^2).

   So, the performance of the shortest path first algorithm used to
   compute the best paths through the network is dependent on the
   construction of the tree.  The best practice would be to try to
   make any emulated network look as much like a real network as
   possible, especially in the areas of tree depth, the meshiness of
   the network, the number of stub links versus transit links, and the
   number of connections and nodes to process at each level within the
   original tree.

7. Topology Generation

   As the size of networks grows, it becomes more and more difficult
   to actually create a large scale network on which to test the
   properties of routing protocols and their implementations.  In
   general, network emulators are used to provide emulated topologies
   which can be advertised to a device under varying conditions.
   Route generators tend to be either a specialized device, a piece of
   software which runs on a router, or a process that runs on another
   operating system, such as Linux or another variant of Unix.

   Some of the characteristics of this device should be:

   o  The ability to connect to several devices using both point-to-
      point and broadcast high speed media.  Point-to-point links can
      be emulated with high speed Ethernet as long as there is no hub
      or other device in between the DUT and the route generator, and
      the link is configured as a point-to-point link within OSPF
      [BROADCAST-P2P].

   o  The ability to create a set of LSAs which appear to be a
      logical, realistic topology.  For instance, the generator should
      be able to mix the number of point-to-point and broadcast links
      within the emulated topology, and should be able to inject
      varying numbers of externally reachable destinations.

   o  The ability to add routing information to, and withdraw it from,
      the emulated topology in order to emulate flapping links.

   o  The ability to randomly order the LSAs representing the emulated
      topology as they are advertised.

   o  The ability to log or otherwise measure the time between packets
      transmitted and received.

   o  The ability to change the rate at which OSPF LSAs are
      transmitted.

   o  The generator and the collector should be fast enough so that
      they are not bottlenecks.  The devices should also have a
      measurement granularity at least as fine as that desired in the
      test results.
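
   As a rough illustration of several of the characteristics listed
   above (randomized ordering, rate control, and logging of transmit
   times), the sketch below shows how a generator might pace a
   randomized LSA injection.  The send() callable and the pre-built
   LSA payloads are assumptions supplied by the actual generator, not
   anything defined in this document.

      import random
      import time

      def inject_lsas(lsas, send, rate_per_second, seed=None):
          # lsas: list of pre-built LSA payloads (encoded elsewhere).
          # send: callable that transmits one LSA toward the DUT.
          # rate_per_second: target LSA flooding rate.
          rng = random.Random(seed)    # a seed makes the ordering
                                       # repeatable across test runs
          order = list(lsas)
          rng.shuffle(order)           # avoid favoring any particular
                                       # data structure in the DUT
          interval = 1.0 / rate_per_second
          log = []
          for lsa in order:
              log.append((time.time(), lsa))   # record transmit time
              send(lsa)
              time.sleep(interval)             # pace the flood
          return log

   Recording the shuffled order and the transmit times along with the
   test results keeps a run repeatable and, as noted in Section 5, the
   ordering used should be reported.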

8. Security Considerations

   This document does not modify the underlying security
   considerations in [OSPF].

9. Acknowledgements

   Thanks to Howard Berkowitz (hcb@clark.net) and the rest of the BGP
   benchmarking team for their support, and to Kevin Dubray
   (kdubray@juniper.net), who realized the need for this draft.

10. Normative References

   [BENCHMARK]
        Manral, V., "Benchmarking Basic OSPF Single Router Control
        Plane Convergence", draft-ietf-bmwg-ospfconv-intraarea-08,
        May 2004.

   [TERM]
        Manral, V., "OSPF Convergence Testing Terminology and
        Concepts", draft-ietf-bmwg-ospfconv-term-08, May 2004.

   [OSPF]
        Moy, J., "OSPF Version 2", STD 54, RFC 2328, April 1998.

   [RFC2119]
        Bradner, S., "Key words for use in RFCs to Indicate
        Requirement Levels", BCP 14, RFC 2119, March 1997.

11. Informative References

   [INTERCONNECT]
        Bradner, S. and J. McQuaid, "Benchmarking Methodology for
        Network Interconnect Devices", RFC 2544, March 1999.

   [FIB-TERM]
        Trotter, G., "Terminology for Forwarding Information Base
        (FIB) based Router Performance", RFC 3222, October 2001.

   [BROADCAST-P2P]
        Shen, N., et al., "Point-to-point operation over LAN in
        link-state routing protocols",
        draft-ietf-isis-igp-p2p-over-lan-03.txt, August 2003.

   [GATED]
        http://www.gated.org

12. Authors' Addresses

   Vishwas Manral
   Netplane Systems
   189 Prashasan Nagar
   Road number 72
   Jubilee Hills
   Hyderabad, India

   vmanral@netplane.com

   Russ White
   Cisco Systems, Inc.
   7025 Kit Creek Rd.
   Research Triangle Park, NC 27709

   riw@cisco.com

   Aman Shaikh
   University of California
   School of Engineering
   1156 High Street
   Santa Cruz, CA 95064

   aman@soe.ucsc.edu