Network Working Group                                          A. Morton
Internet-Draft                                                 AT&T Labs
Intended status: Informational                          February 2, 2015
Expires: August 6, 2015


   Considerations for Benchmarking Virtual Network Functions and Their
                              Infrastructure
                     draft-morton-bmwg-virtual-net-03

Abstract

   The Benchmarking Methodology Working Group (BMWG) has traditionally
   conducted laboratory characterization of dedicated physical
   implementations of internetworking functions.  This memo
   investigates additional considerations when network functions are
   virtualized and performed on commodity off-the-shelf hardware.

NOTES:

   3.4  Added interactions/dependencies within resource domains

   4.3  Added new metrics for characterization: PDV, reordering, mean
        delay, etc.

   4.4  Resolved the question of capacity and the 3x3 Matrix

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].
Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 6, 2015.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Scope
   3.  Considerations for Hardware and Testing
     3.1.  Hardware Components
     3.2.  Configuration Parameters
     3.3.  Testing Strategies
     3.4.  Attention to Shared Resources
   4.  Benchmarking Considerations
     4.1.  Comparison with Physical Network Functions
     4.2.  Continued Emphasis on Black-Box Benchmarks
     4.3.  New Benchmarks and Related Metrics
     4.4.  Assessment of Benchmark Coverage
   5.  Security Considerations
   6.  IANA Considerations
   7.  Acknowledgements
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Author's Address

1.  Introduction

   The Benchmarking Methodology Working Group (BMWG) has traditionally
   conducted laboratory characterization of dedicated physical
   implementations of internetworking functions.  The black-box
   benchmarks of Throughput, Latency, Forwarding Rates, and others have
   served our industry for many years.  [RFC1242] and [RFC2544] are the
   cornerstones of this work.

   An emerging set of service provider and vendor development goals is
   to reduce costs while increasing the flexibility of network devices,
   and to drastically accelerate their deployment.  Network Function
   Virtualization (NFV) promises to achieve these goals and has
   therefore garnered much attention.
   It now seems certain that some network functions will be
   virtualized, following the success of cloud computing and virtual
   desktops, which are supported by sufficient network path capacity,
   performance, and widespread deployment; many of the same techniques
   will help to achieve NFV.

   See http://www.etsi.org/technologies-clusters/technologies/nfv for
   more background; for example, the white papers there may be a useful
   starting place.  The Performance and Portability Best Practices
   [NFV.PER001] are particularly relevant to BMWG.  Work-in-progress
   documents are available in the Open Area
   http://docbox.etsi.org/ISG/NFV/Open/Latest_Drafts/, including drafts
   describing infrastructure aspects and service quality.

2.  Scope

   BMWG will consider the new topic of Virtual Network Functions and
   related Infrastructure to ensure that common issues are recognized
   from the start, using background materials from industry and SDOs
   (e.g., IETF, ETSI NFV).

   This memo investigates the additional methodological considerations
   necessary when benchmarking VNFs instantiated and hosted on
   commodity off-the-shelf (COTS) hardware.  An essential consideration
   is benchmarking both physical and virtual network functions, thereby
   allowing direct comparison.

   A clearly related goal is to investigate benchmarks for the capacity
   of COTS hardware to host a plurality of VNF instances.  Existing
   networking technology benchmarks will also be considered for
   adaptation to NFV and closely associated technologies.

   A non-goal is any overlap with traditional computer benchmark
   development and its specific metrics (SPECmark suites such as
   SPECCPU).

   A colossal non-goal is any form of architecture development related
   to NFV and associated technologies in BMWG, consistent with BMWG's
   practice since it began work in 1989.

3.  Considerations for Hardware and Testing

   This section lists the new considerations that must be addressed to
   benchmark VNF(s) and their supporting infrastructure.

3.1.  Hardware Components

   New hardware devices will become part of the test set-up:

   1.  High-volume server platforms (COTS, possibly with virtualization
       technology enhancements).

   2.  Storage systems with large capacity, high speed, and high
       reliability.

   3.  Network interface ports specially designed to serve many virtual
       NICs efficiently.

   4.  High-capacity Ethernet switches.

   Labs conducting comparisons of different VNFs may be able to use the
   same hardware platform over many studies, until the steady march of
   innovation overtakes its capabilities (as happens with the lab's
   traffic generation and testing devices today).

3.2.  Configuration Parameters

   It will be necessary to configure and document the settings for the
   entire COTS platform, including:

   o  number of server blades (shelf occupation)

   o  CPUs

   o  caches

   o  storage system

   o  I/O

   as well as the configuration of the virtual components that host the
   VNF itself:

   o  hypervisor

   o  virtual machine

   o  infrastructure virtual network

   and finally, the VNF itself, with items such as:

   o  the specific function being implemented in the VNF

   o  the number of VNF components in the service function chain

   o  the number of physical interfaces and links transited in the
      service function chain
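   As a non-normative illustration, the settings above could be
   captured in a structured record that accompanies each set of
   benchmark results.  The sketch below (in Python, with purely
   hypothetical field names, since this memo defines no schema) shows
   one possible way a lab might organize such a record.

      # Illustrative configuration record for a benchmark report.
      # All field names are hypothetical; this memo defines no schema.
      from dataclasses import dataclass
      from typing import List

      @dataclass
      class PlatformConfig:
          server_blades: int     # shelf occupation
          cpus: str              # CPU model and core count
          caches: List[str]      # cache configuration
          storage: str           # storage system
          io: str                # I/O configuration

      @dataclass
      class VirtualizationConfig:
          hypervisor: str        # hypervisor type and version
          virtual_machine: str   # vCPU/memory profile per VM
          virtual_network: str   # infrastructure virtual network

      @dataclass
      class VNFConfig:
          function: str          # specific function implemented in the VNF
          chain_components: int  # VNF components in the service chain
          physical_links: int    # physical interfaces/links transited

      @dataclass
      class TestSetup:
          platform: PlatformConfig
          virtualization: VirtualizationConfig
          vnf: VNFConfig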
3.3.  Testing Strategies

   The concept of characterizing performance at capacity limits may
   change.  For example:

   1.  It may be more representative of system capacity to characterize
       the case where the Virtual Machines (VMs) hosting the VNFs are
       operating at 50% utilization, and are therefore sharing the
       "real" processing power across many VMs.

   2.  Another important case stems from the need to partition
       functions.  A noisy neighbor (a VM hosting a VNF in an infinite
       loop) would ideally be isolated, and the performance of the
       other VMs would continue according to their specifications.

   3.  System errors will likely occur as transients, implying a
       distribution of performance characteristics with a long tail
       (as with latency) and leading to the need for longer-term tests
       of each set of configuration and test parameters.

   4.  The desire for elasticity and flexibility among network
       functions will motivate tests in which the set of VM instances
       is in constant flux.  Requests for new VMs and releases of VMs
       hosting VNFs that are no longer needed would be a normal
       operational condition.

   5.  All physical things can fail, and benchmarking efforts can also
       examine recovery aided by the virtual architecture, with
       different approaches to resiliency.

3.4.  Attention to Shared Resources

   Since many components of the new NFV Infrastructure are virtual,
   test set-up design requires prior knowledge of the interactions and
   dependencies among the various resource domains in the System Under
   Test (SUT).  For example, a virtual machine performing the role of a
   traditional tester function, such as generating and/or receiving
   traffic, should avoid sharing any SUT resources with the Device
   Under Test (DUT).  Otherwise, the results will have unexpected
   dependencies not encountered in physical device benchmarking.  The
   shared-resource aspect of test design remains one of the critical
   challenges to overcome in a reasonable way to produce useful
   results.

4.  Benchmarking Considerations

   This section discusses considerations related to benchmarks
   applicable to VNFs and their associated technologies.

4.1.  Comparison with Physical Network Functions

   In order to compare the performance of virtual designs and
   implementations with their physical counterparts, identical
   benchmarks must be used.  Since BMWG has already developed
   specifications for many network functions, existing benchmarks will
   be re-used through references, while allowing for the possibility of
   benchmark curation during the development of new methodologies.
   Consideration should be given to quantifying the number of parallel
   VNFs required to achieve performance comparable to a given physical
   device, or to determining whether some limit of scale was reached
   before the VNFs could achieve the comparable level.
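   As a non-normative sketch of that quantification (in Python, with
   hypothetical names and example numbers), a lab might scan the
   aggregate results measured at increasing VNF counts and report
   either the count at which parity with the physical device is
   reached, or the best level achieved before the scale limit:

      # Illustrative only: estimate how many parallel VNF instances are
      # needed to match a physical device's benchmark result, or report
      # the level reached at the scale limit if parity is never achieved.
      def vnf_instances_for_parity(physical_result, aggregate_results):
          """aggregate_results[i] is the result measured with i+1 VNFs."""
          for count, aggregate in enumerate(aggregate_results, start=1):
              if aggregate >= physical_result:
                  return count, aggregate
          return None, aggregate_results[-1]   # scale limit reached first

      # Example: a physical device forwards 10 Mpps; the VNF aggregate
      # levels off near 7.6 Mpps at 8 instances, so parity is not reached.
      print(vnf_instances_for_parity(
          10e6, [1.5e6, 3.0e6, 4.4e6, 5.6e6, 6.6e6, 7.2e6, 7.5e6, 7.6e6]))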
4.2.  Continued Emphasis on Black-Box Benchmarks

   When the network functions under test are based on Open Source code,
   there may be a tendency to rely on internal measurements to some
   extent, especially when the externally observable phenomena only
   support an inference of internal events (such as routing protocol
   convergence).  However, external observations remain essential as
   the basis for benchmarks.  Internal observations with a fixed
   specification and interpretation may be provided in parallel, for
   example to assist the development of operations procedures when the
   technology is deployed.  Internal metrics and measurements from Open
   Source implementations may be the only direct source of performance
   results in a desired dimension, but corroborating external
   observations are still required to assure that measurement
   discipline was maintained for all reported results.

   A related aspect of benchmark development arises when the scope
   includes multiple approaches to a common function under the same
   benchmark.  For example, there are many ways to arrange for
   activation of a network path between interface points, and the
   activation times can be compared if the start-to-stop activation
   interval has a generic and unambiguous definition.  Thus, generic
   benchmark definitions are preferred over technology- or protocol-
   specific definitions where possible.

4.3.  New Benchmarks and Related Metrics

   New classes of benchmarks will be needed for network design and to
   assist in developing operational practices (possibly automated
   management and orchestration at deployment scale).  Examples follow
   in the paragraphs below; many are prompted by the goals of increased
   elasticity and flexibility of the network functions, along with
   accelerated deployment times.

   Time to deploy VNFs:  In cases where the COTS hardware is already
   deployed and ready for service, it is valuable to know the response
   time when a management system is tasked with "standing up" hundreds
   of virtual machines and the VNFs they will host.

   Time to migrate VNFs:  In cases where a rack or shelf of hardware
   must be removed from active service, it is valuable to know the
   response time when a management system is tasked with "migrating"
   some number of virtual machines and the VNFs they currently host to
   alternate hardware that will remain in service.

   Time to create a virtual network in the COTS infrastructure:  This
   is a somewhat simplified version of existing benchmarks for
   convergence time, in that the process is initiated by a request from
   (centralized or distributed) control, rather than inferred from
   network events (such as link failure).  The successful response time
   would remain dependent on data-plane observations to confirm that
   the network is ready to perform.

   Also, it appears to be valuable to measure traditional packet
   transfer performance metrics during the assessment of traditional
   and new benchmarks, including metrics that may be used to support
   service engineering, such as the Spatial Composition metrics found
   in [RFC6049].  Examples include the mean one-way delay in
   Section 4.1 of [RFC6049], Packet Delay Variation (PDV) [RFC5481],
   and Packet Reordering [RFC4737] [RFC4689].
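   As a non-normative illustration of how two of the metrics cited
   above might be computed from a stream of one-way delay measurements,
   the Python sketch below (with hypothetical names, and lost packets
   marked as None) takes the mean over the finite delay samples, in the
   spirit of the mean one-way delay of [RFC6049], and derives PDV
   samples relative to the minimum delay of the stream, following the
   convention described in [RFC5481]:

      # Illustrative only: mean one-way delay and PDV from delay samples.
      def mean_one_way_delay(delays):
          finite = [d for d in delays if d is not None]   # drop lost packets
          return sum(finite) / len(finite)

      def pdv_samples(delays):
          finite = [d for d in delays if d is not None]
          d_min = min(finite)
          return [d - d_min for d in finite]   # variation vs. minimum delay

      def pdv_percentile(delays, p=99.9):
          samples = sorted(pdv_samples(delays))
          index = min(int(len(samples) * p / 100.0), len(samples) - 1)
          return samples[index]

      delays = [0.0102, 0.0101, None, 0.0115, 0.0103]   # seconds; None = lost
      print(mean_one_way_delay(delays), pdv_percentile(delays, 99.9))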
4.4.  Assessment of Benchmark Coverage

   It can be useful to organize benchmarks according to their
   applicable lifecycle stage and the performance criteria they intend
   to assess.  The table below provides a way to organize benchmarks
   such that there is a clear indication of coverage for each
   intersection of lifecycle stage and performance criterion.

   |---------------+-----------+------------+---------------|
   |               |   SPEED   |  ACCURACY  |  RELIABILITY  |
   |---------------+-----------+------------+---------------|
   | Activation    |           |            |               |
   |---------------+-----------+------------+---------------|
   | Operation     |           |            |               |
   |---------------+-----------+------------+---------------|
   | De-activation |           |            |               |
   |---------------+-----------+------------+---------------|

   For example, the "Time to deploy VNFs" benchmark described above
   would be placed at the intersection of Activation and Speed, making
   it clear that there are other potential performance criteria to
   benchmark, such as the "percentage of unsuccessful VM/VNF stand-ups"
   in a set of 100 attempts.  This example emphasizes that the
   Activation and De-activation lifecycle stages are key areas for NFV
   and related infrastructure, and it encourages expansion beyond
   traditional benchmarks for normal operation.  Thus, reviewing
   benchmark coverage using this table (sometimes called the 3x3
   matrix) can be a worthwhile exercise in BMWG.

   In one of the first applications of the 3x3 matrix in BMWG, it was
   discovered that metrics for measured size, capacity, or scale do not
   easily match one of the three columns above.  There are three
   possibilities to resolve this:

   o  Add a column, Scalability; but then metrics would be expected in
      most of the Activation, Operation, and De-activation stages
      (which may not be the case).

   o  Include scalability under Reliability: this fits the user
      perspective of the 3x3 matrix, because the size or capacity of a
      device contributes to the likelihood that a request will be
      blocked, or that operation will be unreliable in an overload
      state.

   o  Keep size, capacity, and scale metrics separate from the 3x3
      matrix.

   After some discussion, including with some of the original
   developers of the 3x3 matrix, it is suggested to keep capacity
   metrics separate from the 3x3 matrix and to list them separately.
   This approach encourages use of the 3x3 matrix to organize reports
   of results, where the capacity at which the various metrics were
   measured would be included in the title of the matrix (and results
   for multiple capacities would appear in separate 3x3 matrices, if
   there were sufficient measurements/results to organize in that way).
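   As a non-normative sketch of this reporting approach (in Python,
   with hypothetical benchmark names and values), results could be
   organized as cells of a matrix whose title carries the capacity at
   which they were measured, keeping size/capacity/scale metrics out of
   the cells themselves:

      # Illustrative only: a 3x3 matrix report keyed by lifecycle stage
      # and performance criterion; the measurement capacity appears in
      # the title, per the approach suggested above.
      STAGES = ("Activation", "Operation", "De-activation")
      CRITERIA = ("Speed", "Accuracy", "Reliability")

      def empty_matrix(title):
          return {"title": title,
                  "cells": {s: {c: [] for c in CRITERIA} for s in STAGES}}

      report = empty_matrix("Hypothetical VNF, measured at 100 VM instances")
      report["cells"]["Activation"]["Speed"].append(
          {"benchmark": "Time to deploy VNFs", "seconds": 42.0})
      report["cells"]["Activation"]["Reliability"].append(
          {"benchmark": "Unsuccessful VM/VNF stand-ups", "percent": 2.0})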
5.  Security Considerations

   Benchmarking activities as described in this memo are limited to
   technology characterization of a Device Under Test/System Under Test
   (DUT/SUT) using controlled stimuli in a laboratory environment, with
   dedicated address space and the constraints specified in the
   sections above.

   The benchmarking network topology will be an independent test setup
   and MUST NOT be connected to devices that may forward the test
   traffic into a production network, or misroute traffic to the test
   management network.

   Further, benchmarking is performed on a "black-box" basis, relying
   solely on measurements observable external to the DUT/SUT.

   Special capabilities SHOULD NOT exist in the DUT/SUT specifically
   for benchmarking purposes.  Any implications for network security
   arising from the DUT/SUT SHOULD be identical in the lab and in
   production networks.

6.  IANA Considerations

   No IANA action is requested at this time.

7.  Acknowledgements

   The author acknowledges an encouraging conversation on this topic
   with Mukhtiar Shaikh and Ramki Krishnan in November 2013.  Bhavani
   Parise and Ilya Varlashkin have provided useful suggestions to
   expand these considerations.  Bhuvaneswaran Vengainathan has already
   tried the 3x3 matrix with the SDN controller draft and has
   contributed to many discussions.  Scott Bradner quickly pointed out
   shared-resource dependencies in an early vSwitch measurement
   proposal, and the topic was included here as a key consideration.

8.  References

8.1.  Normative References

   [NFV.PER001]
              "Network Function Virtualization: Performance and
              Portability Best Practices", ETSI Group Specification GS
              NFV-PER 001 V1.1.1, June 2014.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2330]  Paxson, V., Almes, G., Mahdavi, J., and M. Mathis,
              "Framework for IP Performance Metrics", RFC 2330,
              May 1998.

   [RFC2544]  Bradner, S. and J. McQuaid, "Benchmarking Methodology for
              Network Interconnect Devices", RFC 2544, March 1999.

   [RFC2679]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Delay Metric for IPPM", RFC 2679, September 1999.

   [RFC2680]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Packet Loss Metric for IPPM", RFC 2680, September 1999.

   [RFC2681]  Almes, G., Kalidindi, S., and M. Zekauskas, "A Round-trip
              Delay Metric for IPPM", RFC 2681, September 1999.

   [RFC3393]  Demichelis, C. and P. Chimento, "IP Packet Delay
              Variation Metric for IP Performance Metrics (IPPM)",
              RFC 3393, November 2002.

   [RFC3432]  Raisanen, V., Grotefeld, G., and A. Morton, "Network
              performance measurement with periodic streams", RFC 3432,
              November 2002.

   [RFC4689]  Poretsky, S., Perser, J., Erramilli, S., and S. Khurana,
              "Terminology for Benchmarking Network-layer Traffic
              Control Mechanisms", RFC 4689, October 2006.

   [RFC4737]  Morton, A., Ciavattone, L., Ramachandran, G., Shalunov,
              S., and J. Perser, "Packet Reordering Metrics", RFC 4737,
              November 2006.

   [RFC5357]  Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and J.
              Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)",
              RFC 5357, October 2008.

   [RFC5905]  Mills, D., Martin, J., Burbank, J., and W. Kasch,
              "Network Time Protocol Version 4: Protocol and Algorithms
              Specification", RFC 5905, June 2010.

8.2.  Informative References

   [RFC1242]  Bradner, S., "Benchmarking terminology for network
              interconnection devices", RFC 1242, July 1991.

   [RFC5481]  Morton, A. and B. Claise, "Packet Delay Variation
              Applicability Statement", RFC 5481, March 2009.

   [RFC6049]  Morton, A. and E. Stephan, "Spatial Composition of
              Metrics", RFC 6049, January 2011.

   [RFC6248]  Morton, A., "RFC 4148 and the IP Performance Metrics
              (IPPM) Registry of Metrics Are Obsolete", RFC 6248,
              April 2011.

   [RFC6390]  Clark, A. and B. Claise, "Guidelines for Considering New
              Performance Metric Development", BCP 170, RFC 6390,
              October 2011.

Author's Address

   Al Morton
   AT&T Labs
   200 Laurel Avenue South
   Middletown, NJ 07748
   USA

   Phone: +1 732 420 1571
   Fax:   +1 732 368 1192
   Email: acmorton@att.com
   URI:   http://home.comcast.net/~acmacm/