Network Working Group                                     Vishwas Manral
Internet Draft                                           Netplane Systems
                                                               Russ White
                                                            Cisco Systems
                                                              Aman Shaikh
Expiration Date: August 2004                    University of California
File Name: draft-ietf-bmwg-ospfconv-applicability-04.txt    February 2004

          Benchmarking Applicability for Basic OSPF Convergence
              draft-ietf-bmwg-ospfconv-applicability-04.txt

1. Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC 2026.

   Internet Drafts are working documents of the Internet Engineering
   Task Force (IETF), its Areas, and its Working Groups.  Note that
   other groups may also distribute working documents as Internet
   Drafts.

   Internet Drafts are draft documents valid for a maximum of six
   months.  Internet Drafts may be updated, replaced, or obsoleted by
   other documents at any time.  It is not appropriate to use Internet
   Drafts as reference material or to cite them other than as a
   "working draft" or "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

2. Abstract

   This document describes the applicability of [BENCHMARK] and of
   similar work that may be done in the future.  It describes the
   advantages and the limitations of the methods defined in
   [BENCHMARK], as well as the pitfalls to avoid during measurement.
   Refer to [TERM] for the terminology used in this document and in
   [BENCHMARK].

3. Motivation

   There is a growing interest in testing single router control plane
   convergence for routing protocols, with many people looking at
   testing methodologies which can provide information on how long it
   takes for a network to converge after various network events occur.
   It is important to consider the framework within which any given
   convergence test is executed when attempting to apply the results of
   the testing, since the framework can have a major impact on the
   results.  For instance, determining when a network is converged and
   deciding which parts of the router's operation are considered within
   the testing will have a major impact on the apparent performance
   that routing protocols provide.

   This document describes in detail the various benefits and pitfalls
   of the tests described in [BENCHMARK].  It also explains how such
   measurements can be useful for providers and the research community.

   NOTE: The word convergence within this document refers to single
   router control plane convergence [TERM].

4. Advantages of Such Measurement

   o  To be able to compare the iterations of a protocol
      implementation.  It is often useful to be able to compare the
      performance of two iterations of a given implementation of a
      protocol to determine where improvements have been made and where
      further improvements can be made.

   o  To understand, given a set of parameters (network conditions),
      how a particular implementation on a particular device is going
      to perform.  For instance, if you were trying to decide the
      processing power (size of device) required at a certain location
      within a network, you could emulate the conditions which are
      going to exist at that point in the network and use the tests
      described to measure the performance of several different
      routers.  The results of these tests can provide one possible
      data point for an intelligent decision.

      If the device being tested is to be deployed in a running
      network, using routes taken from the network where the equipment
      is to be deployed, rather than some generated topology, will give
      results which are closer to the real performance of the device.
      Care should be taken to emulate or take routes from the actual
      location in the network where the device will be (or would be)
      deployed.  For instance, one set of routes may be taken from an
      ABR, one set from an area 0 only router, various sets from stub
      areas, another set from various normal areas, etc.

   o  To measure the performance of an OSPF implementation in a wide
      variety of scenarios.

   o  To be used as parameters in OSPF simulations by researchers.  It
      may sometimes be necessary for certain kinds of research to
      measure the individual delays of each parameter within an OSPF
      implementation.  These delays can be measured using the methods
      defined in [BENCHMARK].

   o  To help optimize certain configurable parameters.  It may
      sometimes be helpful for operators to know the delay required for
      individual tasks so as to optimize resource usage in the network;
      for example, if it is found that the processing time is x seconds
      on a router, it would be helpful to determine the rate at which
      to flood LSAs to that router so as to not overload the network (a
      rough calculation of this sort is sketched below).
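
   As an illustration of the last point, the following minimal sketch
   turns an assumed per-LSA processing time into a conservative
   flooding rate.  The numbers and the safety margin are hypothetical
   and are not taken from [BENCHMARK] or from any real measurement.

      # Illustrative only: derive a conservative LSA flooding rate from
      # a measured per-LSA processing time.  Both values are assumed.
      per_lsa_processing_time = 0.002   # seconds per LSA on the DUT
      safety_margin = 0.5               # use half the theoretical capacity

      max_rate = 1.0 / per_lsa_processing_time   # LSAs/second the DUT could absorb
      safe_rate = max_rate * safety_margin       # rate to configure on the flooder
      print("flood at no more than %.0f LSAs/second" % safe_rate)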
5. Assumptions Made and Limitations of Such Measurements

   o  The interaction of convergence and forwarding is not considered;
      testing is restricted to events occurring within the control
      plane.  Forwarding performance is the primary focus of
      [INTERCONNECT], and it is expected to be dealt with in work that
      ensues from [FIB-TERM].

   o  Duplicate LSAs are acknowledged immediately.  A few tests rely on
      the property that duplicate LSA acknowledgements are not delayed
      but are sent immediately.  However, if an implementation does not
      acknowledge duplicate LSAs immediately on receipt, the testing
      methods presented in [BENCHMARK] could give inaccurate
      measurements.

   o  It is assumed that SPF is non-preemptive.  If SPF is implemented
      so that it can (and will be) preempted, the SPF measurements
      taken in [BENCHMARK] would include the time that the SPF process
      is not running ([BENCHMARK] measures the total time taken for SPF
      to run, not the amount of time that SPF actually spends on the
      device's processor), thus giving inaccurate measurements.

   o  Some implementations may be multithreaded or may use a
      multiprocess/multirouter model of OSPF.  If any of the
      assumptions made in the measurements are violated by such a
      model, the result could be inaccurate measurements.

   o  The measurements resulting from the tests in [BENCHMARK] may not
      provide the information required to deploy a device in a large
      scale network.  The tests described focus on individual
      components of an OSPF implementation's performance, and it may be
      difficult to combine the measurements in a way which accurately
      depicts a device's performance in a large scale network.  Further
      research is required in this area.

   o  The measurements described in [BENCHMARK] should be used with
      great care when comparing two different implementations of OSPF
      from two different vendors.  For instance, there are many factors
      other than convergence speed which must be taken into
      consideration when comparing different vendors' products, and it
      is difficult to align the resources available on one device with
      the resources available on another device.

6. Observations on the Tests Described in [BENCHMARK]

   Some observations taken while implementing the tests described in
   [BENCHMARK] are noted in this section.

6.1. Measuring the SPF Processing Time Externally

   The most difficult test to perform is the external measurement of
   the time required to perform an SPF calculation, since the amount of
   time between the first LSA which indicates a topology change and the
   duplicate LSA is critical.  If the duplicate LSA is sent too
   quickly, it may be received before the device under test actually
   begins running SPF on the network change information.  If the delay
   between the two LSAs is too long, the device under test may finish
   SPF processing before receiving the duplicate LSA.  It is important
   to closely investigate any delays between the receipt of an LSA and
   the beginning of an SPF calculation in the device under test;
   multiple tests with various delays might be required to determine
   what delay needs to be used to accurately measure the SPF
   calculation time.

   Some implementations may enforce two intervals, the SPF hold time
   and the SPF delay, between successive SPF calculations.  If an SPF
   hold time exists, it should be subtracted from the total SPF
   execution time.  If an SPF delay exists, it should be noted in the
   test results.
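
   The following minimal sketch shows one way of doing the bookkeeping
   described above; it is not the [BENCHMARK] procedure itself, and the
   helper name, the inter-LSA delays, and the raw times are all
   hypothetical.

      # Sketch only: combine raw externally measured times, taken with
      # several delays between the triggering LSA and the duplicate
      # LSA, into SPF time estimates.  Subtract any configured SPF hold
      # time; note (but do not subtract) any configured SPF delay.
      def summarize(raw_results, spf_hold_time, spf_delay):
          """raw_results maps inter-LSA delay (s) to raw measured time (s)."""
          for inter_lsa_delay, raw_time in sorted(raw_results.items()):
              spf_time = raw_time - spf_hold_time
              print("delay %.3f s -> SPF time %.3f s (SPF delay %.3f s noted)"
                    % (inter_lsa_delay, spf_time, spf_delay))

      # Hypothetical raw measurements from runs with different delays.
      summarize({0.010: 0.245, 0.050: 0.248, 0.200: 0.212},
                spf_hold_time=0.100, spf_delay=0.0)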
6.2. Noise in the Measurement Device

   The device on which measurements are taken (not the device under
   test) also adds noise to the test results, primarily in the form of
   delay in packet processing and measurement output.  The largest
   source of noise is generally the delay between the receipt of a
   packet by the measuring device and the moment the information about
   that packet reaches the device's output, where the event can be
   measured.  The following steps may be taken to reduce this sampling
   noise:

   o  Increase the number of samples taken; this will generally improve
      the tester's ability to determine what is noise and to remove it
      from the results.

   o  Take the time-stamp for a packet as early as possible.  Depending
      on the operating system being used on the measurement device, the
      kernel can be instrumented to take the time-stamp when the
      interrupt is processed.  This does not eliminate the noise
      completely, but it at least reduces it.

   o  Keep the measurement device as lightly loaded as possible.

   o  Obtain an estimate of the noise, so that it can be accounted for
      in the results (a simple way of doing so is sketched at the end
      of this section).

   The DUT also adds noise to the measurement.  The first and third
   points above apply to the DUT as well.
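
   The following minimal sketch estimates the measurement device's own
   noise from repeated calibration samples; the idea of measuring an
   event of (near) zero true duration and the sample values themselves
   are assumptions for the example.

      # Sketch only: estimate measurement noise from calibration runs in
      # which the true duration of the measured event is (near) zero.
      from statistics import mean, median, pstdev

      calibration_samples = [0.0021, 0.0019, 0.0025, 0.0020, 0.0031, 0.0022]

      noise = median(calibration_samples)
      print("noise: median %.4f s, mean %.4f s, stddev %.4f s"
            % (noise, mean(calibration_samples), pstdev(calibration_samples)))

      # A raw measurement can then be reported alongside the estimate,
      # or corrected by it (hypothetical value below).
      raw_measurement = 0.1500
      print("corrected measurement: %.4f s" % (raw_measurement - noise))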
6.3. Gaining an Understanding of the Implementation Improves
     Measurements

   While the tester will (generally) not have access to internal
   information about the OSPF implementation being tested using
   [BENCHMARK], the more thorough the tester's knowledge of the
   implementation is, the more accurate the results of the tests will
   be.  For instance, in some implementations, the installation of
   routes in the local routing table may occur while the SPF is being
   calculated, dramatically impacting the time required to calculate
   the SPF.

6.4. Gaining an Understanding of the Tests Improves Measurements

   One method which can be used to become familiar with the tests
   described in [BENCHMARK] is to perform the tests on an OSPF
   implementation for which all the internal details are available,
   such as [GATED].  While there is no assurance that any two
   implementations will be similar, this will provide a better
   understanding of the tests themselves.

7. LSA and Destination Mix

   Many OSPF benchmark tests call for a generator injecting a number of
   LSAs.  There are several areas in which the injected LSAs can be
   varied in testing:

   o  The number of destinations represented by the injected LSAs.

      Each destination represents a single reachable IP network; these
      will be leaf nodes on the shortest path tree.  The primary impact
      on performance should be the time required to insert destinations
      into the local routing table and to manage the memory required to
      store the data.

   o  The types of LSAs injected.

      There are several types of LSAs which would be acceptable under
      different situations; within an area, for instance, type 1, 2, 3,
      4, and 5 LSAs are likely to be received by a router.  Within a
      not-so-stubby area, however, type 7 LSAs would replace the type 5
      LSAs received.  These sorts of characterizations are important to
      note in any test results.

   o  The number of LSAs injected.

      Within any injected set of information, the number of each type
      of LSA injected is also important.  This will impact the shortest
      path algorithm's ability to handle large numbers of nodes, large
      shortest path first trees, etc.

   o  The order of LSA injection.

      The order in which LSAs are injected should not favor any given
      data structure used for storing the LSA database on the device
      under test.  For instance, AS-external LSAs have AS-wide flooding
      scope; any type 5 LSA originated is immediately flooded to all
      neighbors.  However, the type 4 LSA which announces the ASBR as a
      border router is originated in an area at SPF time (by the ABRs
      on the edge of the area in which the ASBR is).  If SPF is not
      scheduled immediately on the ABRs originating the type 4 LSA, the
      type 4 LSA is sent after the type 5 LSAs reach a router in the
      adjacent area, so routes to the external destinations are not
      immediately added on the routers in the other areas.  When the
      routers which already have the type 5 LSAs receive the type 4
      LSA, all the external routes are added to the tree at the same
      time.  This timing could produce different results than a router
      receiving a type 4 LSA indicating the presence of a border
      router, followed by the type 5 LSAs originated by that border
      router.

      The ordering can be changed in various tests to provide insight
      into the efficiency of storage within the DUT.  Any such changes
      in ordering should be noted in the test results (one simple way
      of producing and recording such an ordering is sketched below).
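
   The following minimal sketch randomizes the injection order while
   keeping it reproducible, so that the ordering can be noted with the
   results; the list of LSAs and its contents are placeholders.

      # Sketch only: shuffle the LSA injection order with a recorded
      # seed so the same ordering can be replayed and reported.
      import random

      lsas = [("router", 1), ("router", 2), ("network", 3),
              ("summary", 4), ("external", 5)]   # (type, id) placeholders

      seed = 42                       # record the seed with the results
      random.Random(seed).shuffle(lsas)
      print("injection order (seed %d): %s" % (seed, lsas))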
8. Tree Shape and the SPF Algorithm

   The complexity of Dijkstra's algorithm depends on the data structure
   used for storing vertices with their current minimum distances from
   the source, the simplest structure being a list of the vertices
   currently reachable from the source.  With a simple list of
   vertices, finding the minimum cost vertex takes O(size of the list).
   There will be O(n) such operations if we assume that all the
   vertices are ultimately reachable from the source.  Moreover, after
   the vertex with minimum cost is found, the algorithm iterates
   through all the edges of that vertex and updates the costs of the
   other vertices.  With an adjacency list representation, this step,
   iterated over all the vertices, takes O(E) time, with E being the
   number of edges in the graph.  Thus, the overall running time is:

      O( sum(i=1..n) size(list at level i)  +  E )

   So, everything boils down to size(list at level i).

   If the graph is linear:

      root
       |
       1
       |
       2
       |
       3
       |
       4
       |
       5
       |
       6

   and the source is a vertex on the end, then size(list at level i) =
   1 for all i.  Moreover, E = n - 1.  Therefore, the running time is
   O(n).

   If the graph is a balanced binary tree:

            root
           /    \
          1      2
         / \    / \
        3   4  5   6

   size(list at level i) is a little more complicated.  First it
   increases by 1 at each step up to a certain number, and then it
   decreases by 1 at each step.  If we assume that the tree is a
   complete tree (like the one above) with k levels (1 to k), then
   size(list) grows as 1, 2, 3, ... and then comes back down in the
   same way.  The number of edges E is still n - 1.  It then turns out
   that the run-time is O(n^2) for such a tree.

   If the graph is a complete graph (fully-connected mesh), then
   size(list at level i) = n - i.  The number of edges E = O(n^2).
   Therefore, the run-time is O(n^2).

   So, the performance of the shortest path first algorithm used to
   compute the best paths through the network is dependent on the
   construction of the tree.  The best practice would be to try to make
   any emulated network look as much like a real network as possible,
   especially in the area of the tree depth, the meshiness of the
   network, the number of stub links versus transit links, and the
   number of connections and nodes to process at each level within the
   original tree.
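
   To make the role of size(list at level i) concrete, the following
   minimal sketch implements the list-based variant of the algorithm
   discussed above on a hypothetical graph; it is an illustration only
   and is not part of [BENCHMARK].

      # Dijkstra's algorithm with a plain list as the candidate
      # structure: extracting the minimum costs O(size of the list),
      # which is why the running time depends on the shape of the tree.
      def spf(graph, source):
          dist = {source: 0}
          candidates = [source]          # the "list at level i"
          visited = set()
          while candidates:
              v = min(candidates, key=lambda x: dist[x])  # O(size of list)
              candidates.remove(v)
              visited.add(v)
              for neighbor, cost in graph.get(v, []):     # O(E) over the whole run
                  if neighbor in visited:
                      continue
                  new_cost = dist[v] + cost
                  if neighbor not in dist or new_cost < dist[neighbor]:
                      dist[neighbor] = new_cost
                      if neighbor not in candidates:
                          candidates.append(neighbor)
          return dist

      # The balanced binary tree from the figure above, with unit costs.
      tree = {
          "root": [("1", 1), ("2", 1)],
          "1": [("3", 1), ("4", 1)],
          "2": [("5", 1), ("6", 1)],
      }
      print(spf(tree, "root"))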
9. Topology Generation

   As the size of networks grows, it becomes more and more difficult to
   actually create a large scale network on which to test the
   properties of routing protocols and their implementations.  In
   general, network emulators are used to provide emulated topologies
   which can be advertised to a device under varying conditions.  Route
   generators tend to be either specialized devices, software which
   runs on a router, or processes that run on another operating system,
   such as Linux or another variant of Unix.

   Some of the characteristics of this device should be:

   o  The ability to connect to several devices using both point-to-
      point and broadcast high-speed media.  Point-to-point links can
      be emulated with high speed Ethernet as long as there is no hub
      or other device between the DUT and the route generator, and the
      link is configured as a point-to-point link within OSPF
      [BROADCAST-P2P].

   o  The ability to create a set of LSAs which appear to be a logical,
      realistic topology.  For instance, the generator should be able
      to mix the number of point-to-point and broadcast links within
      the emulated topology, and should be able to inject varying
      numbers of externally reachable destinations.

   o  The ability to add routing information to, and withdraw it from,
      the emulated topology, to emulate flapping links.

   o  The ability to randomly order the LSAs representing the emulated
      topology as they are advertised.

   o  The ability to log or otherwise measure the time between packets
      transmitted and received.

   o  The ability to change the rate at which OSPF LSAs are
      transmitted.

   o  The generator and the collector should be fast enough that they
      are not bottlenecks.  The devices should also provide a
      measurement granularity at least as fine as that desired in the
      test results.

10. Acknowledgements

   Thanks to Howard Berkowitz (hcb@clark.net) and the rest of the BGP
   benchmarking team for their support, and to Kevin Dubray
   (kdubray@juniper.net), who realized the need for this draft.

11. Normative References

   [BENCHMARK]
        Manral, V., "Benchmarking Basic OSPF Single Router Control
        Plane Convergence", draft-bmwg-ospfconv-intraarea-07, November
        2003.

   [TERM]
        Manral, V., "OSPF Convergence Testing Terminology and
        Concepts", draft-bmwg-ospfconv-term-07, January 2004.

12. Informative References

   [INTERCONNECT]
        Bradner, S. and McQuaid, J., "Benchmarking Methodology for
        Network Interconnect Devices", RFC 2544, March 1999.

   [FIB-TERM]
        Trotter, G., "Terminology for Forwarding Information Base (FIB)
        based Router Performance", RFC 3222, October 2001.

   [BROADCAST-P2P]
        Shen, N., et al., "Point-to-point operation over LAN in link-
        state routing protocols",
        draft-ietf-isis-igp-p2p-over-lan-03.txt, August 2003.

   [GATED]
        http://www.gated.org

13. Authors' Addresses

   Vishwas Manral
   Netplane Systems
   189 Prashasan Nagar
   Road number 72
   Jubilee Hills
   Hyderabad, India

   vmanral@netplane.com

   Russ White
   Cisco Systems, Inc.
   7025 Kit Creek Rd.
   Research Triangle Park, NC 27709

   riw@cisco.com

   Aman Shaikh
   University of California
   School of Engineering
   1156 High Street
   Santa Cruz, CA 95064

   aman@soe.ucsc.edu