Network Working Group                                          A. Morton
Internet-Draft                                                 AT&T Labs
Intended status: Informational                          October 26, 2014
Expires: April 29, 2015

 Considerations for Benchmarking Virtual Network Functions and Their
                             Infrastructure
                    draft-morton-bmwg-virtual-net-02

Abstract

   The Benchmarking Methodology Working Group (BMWG) has traditionally
   conducted laboratory characterization of dedicated physical
   implementations of internetworking functions.  This memo
   investigates the additional considerations that apply when network
   functions are virtualized and performed on commodity off-the-shelf
   hardware.

Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].
Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 29, 2015.

Copyright Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Scope
   3.  Considerations for Hardware and Testing
       3.1.  Hardware Components
       3.2.  Configuration Parameters
       3.3.  Testing Strategies
   4.  Benchmarking Considerations
       4.1.  Comparison with Physical Network Functions
       4.2.  Continued Emphasis on Black-Box Benchmarks
       4.3.  New Benchmarks
       4.4.  Assessment of Benchmark Coverage
   5.  Security Considerations
   6.  IANA Considerations
   7.  Acknowledgements
   8.  References
       8.1.  Normative References
       8.2.  Informative References
   Author's Address

1.  Introduction

   The Benchmarking Methodology Working Group (BMWG) has traditionally
   conducted laboratory characterization of dedicated physical
   implementations of internetworking functions.  The black-box
   benchmarks of Throughput, Latency, Forwarding Rates, and others have
   served our industry for many years.  [RFC1242] and [RFC2544] are the
   cornerstones of this work.

   An emerging set of service provider and vendor development goals is
   to reduce costs while increasing the flexibility of network devices,
   and to drastically accelerate their deployment.  Network Function
   Virtualization (NFV) promises to achieve these goals, and has
   therefore garnered much attention.
   It now seems certain that some network functions will be
   virtualized, following the success of cloud computing and virtual
   desktops, which has been supported by sufficient network path
   capacity, performance, and widespread deployment; many of the same
   techniques will help achieve NFV.

   See http://www.etsi.org/technologies-clusters/technologies/nfv for
   more background; for example, the white papers there may be a useful
   starting place.  The Performance and Portability Best Practices
   [NFV.PER001] are particularly relevant to BMWG.  Work-in-progress
   documents are available in the Open Area at
   http://docbox.etsi.org/ISG/NFV/Open/Latest_Drafts/, including drafts
   describing infrastructure aspects and service quality.

2.  Scope

   BMWG will consider the new topic of Virtual Network Functions (VNFs)
   and their related infrastructure, to ensure that common issues are
   recognized from the start, using background materials from industry
   and SDOs (e.g., IETF, ETSI NFV).

   This memo investigates the additional methodological considerations
   necessary when benchmarking VNFs instantiated and hosted on
   commodity off-the-shelf (COTS) hardware.  An essential consideration
   is benchmarking both physical and virtual network functions, thereby
   allowing direct comparison.

   A clearly related goal is to investigate benchmarks for the capacity
   of a COTS platform to host a plurality of VNF instances.  Existing
   networking technology benchmarks will also be considered for
   adaptation to NFV and closely associated technologies.

   A non-goal is any overlap with traditional computer benchmark
   development and its specific metrics (SPECmark suites such as
   SPECCPU).

   A colossal non-goal is any form of architecture development related
   to NFV and associated technologies in BMWG, consistent with BMWG's
   practice since it began work in 1989.

3.  Considerations for Hardware and Testing

   This section lists the new considerations that must be addressed to
   benchmark VNF(s) and their supporting infrastructure.

3.1.  Hardware Components

   New hardware devices will become part of the test set-up:

   1.  High-volume server platforms (COTS, possibly with virtualization
       technology enhancements).

   2.  Storage systems with large capacity, high speed, and high
       reliability.

   3.  Network interface ports specially designed for the efficient
       service of many virtual NICs.

   4.  High-capacity Ethernet switches.

   Labs conducting comparisons of different VNFs may be able to use the
   same hardware platform over many studies, until the steady march of
   innovation overtakes its capabilities (as happens with a lab's
   traffic generation and testing devices today).

3.2.  Configuration Parameters

   It will be necessary to configure and document the settings for the
   entire COTS platform, including:

   o  number of server blades (shelf occupation)

   o  CPUs

   o  caches

   o  storage system

   o  I/O

   as well as the configuration of the facilities that host the VNF
   itself:

   o  hypervisor

   o  virtual machine

   o  infrastructure virtual network

   and, finally, the VNF itself, with items such as:

   o  specific function being implemented in the VNF

   o  number of VNF components in the service function chain

   o  number of physical interfaces and links transited in the service
      function chain

   A sketch of one way to capture these settings in a machine-readable
   record appears below.
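   The following Python sketch illustrates one possible machine-
   readable record of the three configuration layers listed above.  All
   field names and values are hypothetical; a real test campaign would
   substitute its own schema.

      import json

      # Hypothetical record of the full test set-up; the three
      # top-level keys mirror the three layers listed in Section 3.2.
      test_setup = {
          "cots_platform": {
              "server_blades": 4,          # shelf occupation
              "cpus_per_blade": 2,
              "cache_per_cpu_mb": 20,
              "storage_system": "example-array-01",
              "io": "2 x 10GbE, SR-IOV enabled",
          },
          "vnf_host": {
              "hypervisor": "example-hypervisor 1.0",
              "virtual_machine": {"vcpus": 4, "ram_gb": 8},
              "infrastructure_virtual_network": "overlay, VLAN 100",
          },
          "vnf": {
              "function": "virtual firewall",  # function in the VNF
              "sfc_components": 3,     # VNF components in the chain
              "physical_links_transited": 2,
          },
      }

      # Store the record alongside the benchmark results, so that
      # every reported number can be traced to a fully documented
      # configuration.
      print(json.dumps(test_setup, indent=2))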
3.3.  Testing Strategies

   The concept of characterizing performance at capacity limits may
   change.  For example:

   1.  It may be more representative of system capacity to characterize
       the case where the Virtual Machines (VMs) hosting the VNFs are
       operating at 50% utilization, and are therefore sharing the
       "real" processing power across many VMs.

   2.  Another important case stems from the need to partition
       functions.  A noisy neighbor (a VM hosting a VNF in an infinite
       loop) would ideally be isolated, and the performance of the
       other VMs would continue according to their specifications.

   3.  System errors will likely occur as transients, implying a
       distribution of performance characteristics with a long tail
       (as with latency) and leading to the need for longer-term tests
       of each combination of configuration and test parameters.  (A
       sketch of tail-sensitive reporting appears after this list.)

   4.  The desire for elasticity and flexibility among network
       functions will include tests where there is constant flux in
       the set of VM instances.  Requests for new VMs, and releases of
       VMs hosting VNFs that are no longer needed, would be a normal
       operational condition.

   5.  All physical things can fail, so benchmarking efforts can also
       examine recovery aided by the virtual architecture, with
       different approaches to resiliency.
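   The sketch below illustrates the tail-sensitive reporting motivated
   by item 3: rather than a mean alone, a long benchmark run is
   summarized by high percentiles.  The latency samples here are
   synthetic stand-ins for externally measured values.

      import random

      def percentile(sorted_samples, p):
          # Approximate nearest-rank percentile of an ascending-sorted
          # list, for 0 < p <= 100.
          k = max(0, min(len(sorted_samples) - 1,
                         round(p / 100.0 * len(sorted_samples)) - 1))
          return sorted_samples[k]

      # Synthetic latencies (milliseconds): mostly fast, with
      # occasional transient spikes that produce a long tail.
      samples = sorted(
          random.expovariate(2.0) if random.random() < 0.99
          else 10.0 + random.expovariate(0.5)
          for _ in range(100_000)
      )

      for p in (50, 95, 99, 99.9):
          print("p%-5s %.2f ms" % (p, percentile(samples, p)))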
4.  Benchmarking Considerations

   This section discusses considerations related to benchmarks
   applicable to VNFs and their associated technologies.

4.1.  Comparison with Physical Network Functions

   In order to compare the performance of virtual designs and
   implementations with their physical counterparts, identical
   benchmarks must be used.  Since BMWG has already developed
   specifications for many network functions, existing benchmarks will
   be re-used through references, while allowing for the possibility of
   benchmark curation during the development of new methodologies.
   Consideration should be given to quantifying the number of parallel
   VNFs required to achieve performance comparable to a given physical
   device, or to determining whether some limit of scale is reached
   before the VNFs can achieve a comparable level.

4.2.  Continued Emphasis on Black-Box Benchmarks

   When the network functions under test are based on Open Source code,
   there may be a tendency to rely on internal measurements to some
   extent, especially when the externally observable phenomena only
   support an inference of internal events (such as routing protocol
   convergence).  However, external observations remain essential as
   the basis for benchmarks.  Internal observations with a fixed
   specification and interpretation may be provided in parallel, for
   example to assist the development of operations procedures when the
   technology is deployed.  Internal metrics and measurements from Open
   Source implementations may be the only direct source of performance
   results in a desired dimension, but corroborating external
   observations are still required to assure that the integrity of
   measurement discipline was maintained for all reported results.

   A related aspect of benchmark development arises when the scope
   includes multiple approaches to a common function under the same
   benchmark.  For example, there are many ways to arrange for the
   activation of a network path between interface points, and the
   activation times can be compared if the start-to-stop activation
   interval has a generic and unambiguous definition.  Thus, generic
   benchmark definitions are preferred over technology- or protocol-
   specific definitions where possible.

4.3.  New Benchmarks

   There will be new classes of benchmarks needed for network design
   and for assistance in developing operational practices (possibly
   including automated management and orchestration at deployment
   scale).  Examples follow in the paragraphs below, many of which are
   prompted by the goals of increased elasticity and flexibility of the
   network functions, along with accelerated deployment times.

   Time to deploy VNFs: In cases where the COTS hardware is already
   deployed and ready for service, it is valuable to know the response
   time when a management system is tasked with "standing up" hundreds
   of virtual machines and the VNFs they will host.

   Time to migrate VNFs: In cases where a rack or shelf of hardware
   must be removed from active service, it is valuable to know the
   response time when a management system is tasked with "migrating"
   some number of virtual machines, and the VNFs they currently host,
   to alternate hardware that will remain in service.

   Time to create a virtual network in the COTS infrastructure: This is
   a somewhat simplified version of existing benchmarks for convergence
   time, in that the process is initiated by a request from
   (centralized or distributed) control, rather than inferred from
   network events (such as link failure).  The successful response time
   would remain dependent on dataplane observations to confirm that the
   network is ready to perform; a sketch of such an externally observed
   activation measurement appears below.
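   The following sketch illustrates the generic, black-box measurement
   style discussed in Sections 4.2 and 4.3: the start-to-stop interval
   runs from the control request until the data plane demonstrably
   forwards.  The two callables are hypothetical stand-ins for a
   controller API and an external traffic probe; neither is defined by
   this memo.

      import time

      def measure_activation(request_activation,
                             dataplane_probe_succeeds,
                             timeout_s=60.0, poll_interval_s=0.1):
          # Returns the seconds elapsed from the activation request to
          # the first successful dataplane observation, or None if the
          # activation is not confirmed within the timeout.
          start = time.monotonic()
          request_activation()                # start of the interval
          while time.monotonic() - start < timeout_s:
              if dataplane_probe_succeeds():  # ready to perform
                  return time.monotonic() - start
              time.sleep(poll_interval_s)
          return None

   Because the harness relies only on the external request and an
   external probe, the same measurement applies equally to physical and
   virtual implementations, supporting the direct comparison called for
   in Section 4.1.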
4.4.  Assessment of Benchmark Coverage

   It can be useful to organize benchmarks according to their
   applicable lifecycle stage and the performance criteria they intend
   to assess.  The table below provides a way to organize benchmarks
   such that there is a clear indication of coverage for each
   intersection of lifecycle stage and performance criterion.

   |---------------+---------+----------+-------------|
   |               |  SPEED  | ACCURACY | RELIABILITY |
   |---------------+---------+----------+-------------|
   | Activation    |         |          |             |
   |---------------+---------+----------+-------------|
   | Operation     |         |          |             |
   |---------------+---------+----------+-------------|
   | De-activation |         |          |             |
   |---------------+---------+----------+-------------|

   For example, the "Time to deploy VNFs" benchmark described above
   would be placed at the intersection of Activation and Speed, making
   it clear that there are other potential performance criteria to
   benchmark, such as the percentage of unsuccessful VM/VNF stand-ups
   in a set of 100 attempts.  This example emphasizes that the
   Activation and De-activation lifecycle stages are key areas for NFV
   and its related infrastructure, and it encourages expansion beyond
   traditional benchmarks for normal operation.  Thus, reviewing
   benchmark coverage using this table (sometimes called the 3x3
   matrix) can be a worthwhile exercise in BMWG.

   Comment/Discussion:

   In one of the first applications of the 3x3 matrix in BMWG, we
   discovered that metrics on measured size, capacity, or scale do not
   easily match one of the three columns above.  There are three
   alternatives to resolve this:

   1.  Add a column, Scalability, but then metrics would be expected in
       most of the Activation, Operation, and De-activation rows (which
       may not be the case).

   2.  Include scalability under Reliability: this fits the user
       perspective of the 3x3 matrix, because the size or capacity of a
       device contributes to the likelihood that a request will be
       blocked, or that operation will be unreliable in an overload
       state.

   3.  Keep size, capacity, and scale metrics separate from the 3x3
       matrix, and present the results for key benchmarks in different
       versions of the matrix, where the title of each matrix provides
       the details of configuration and scale.

   Alternative 3 would address a discussion comment from IETF-90, so it
   seems to cover a range of desired features.  A small sketch of
   tracking coverage with the matrix appears below.
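   As a simple illustration of the coverage assessment, the sketch
   below records benchmarks in the nine cells of the matrix and lists
   the intersections that remain empty; the benchmark names are
   illustrative only.

      STAGES = ("Activation", "Operation", "De-activation")
      CRITERIA = ("Speed", "Accuracy", "Reliability")

      # One list of benchmark names per cell of the 3x3 matrix.
      coverage = {(s, c): [] for s in STAGES for c in CRITERIA}
      coverage[("Activation", "Speed")].append("Time to deploy VNFs")
      coverage[("Activation", "Reliability")].append(
          "Percentage of unsuccessful VM/VNF stand-ups")

      # Report the intersections that still lack benchmarks.
      for (stage, criterion), benchmarks in sorted(coverage.items()):
          if not benchmarks:
              print("No benchmark yet for", stage, "x", criterion)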
5.  Security Considerations

   Benchmarking activities as described in this memo are limited to
   technology characterization of a Device Under Test / System Under
   Test (DUT/SUT) using controlled stimuli in a laboratory environment,
   with dedicated address space and the constraints specified in the
   sections above.

   The benchmarking network topology will be an independent test setup
   and MUST NOT be connected to devices that may forward the test
   traffic into a production network or misroute traffic to the test
   management network.

   Further, benchmarking is performed on a "black-box" basis, relying
   solely on measurements observable external to the DUT/SUT.

   Special capabilities SHOULD NOT exist in the DUT/SUT specifically
   for benchmarking purposes.  Any implications for network security
   arising from the DUT/SUT SHOULD be identical in the lab and in
   production networks.

6.  IANA Considerations

   No IANA action is requested at this time.

7.  Acknowledgements

   The author acknowledges an encouraging conversation on this topic
   with Mukhtiar Shaikh and Ramki Krishnan in November 2013.
   Bhuvaneswaran Vengainathan, Bhavani Parise, and Ilya Varlashkin have
   provided useful suggestions to expand these considerations.

8.  References

8.1.  Normative References

   [NFV.PER001]
              "Network Function Virtualization: Performance and
              Portability Best Practices", ETSI Group Specification GS
              NFV-PER 001 V1.1.1, June 2014.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2330]  Paxson, V., Almes, G., Mahdavi, J., and M. Mathis,
              "Framework for IP Performance Metrics", RFC 2330, May
              1998.

   [RFC2544]  Bradner, S. and J. McQuaid, "Benchmarking Methodology for
              Network Interconnect Devices", RFC 2544, March 1999.

   [RFC2679]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Delay Metric for IPPM", RFC 2679, September 1999.

   [RFC2680]  Almes, G., Kalidindi, S., and M. Zekauskas, "A One-way
              Packet Loss Metric for IPPM", RFC 2680, September 1999.

   [RFC2681]  Almes, G., Kalidindi, S., and M. Zekauskas, "A Round-trip
              Delay Metric for IPPM", RFC 2681, September 1999.

   [RFC3393]  Demichelis, C. and P. Chimento, "IP Packet Delay
              Variation Metric for IP Performance Metrics (IPPM)",
              RFC 3393, November 2002.

   [RFC3432]  Raisanen, V., Grotefeld, G., and A. Morton, "Network
              performance measurement with periodic streams", RFC 3432,
              November 2002.

   [RFC4737]  Morton, A., Ciavattone, L., Ramachandran, G., Shalunov,
              S., and J. Perser, "Packet Reordering Metrics", RFC 4737,
              November 2006.

   [RFC5357]  Hedayat, K., Krzanowski, R., Morton, A., Yum, K., and J.
              Babiarz, "A Two-Way Active Measurement Protocol (TWAMP)",
              RFC 5357, October 2008.

   [RFC5905]  Mills, D., Martin, J., Burbank, J., and W. Kasch,
              "Network Time Protocol Version 4: Protocol and Algorithms
              Specification", RFC 5905, June 2010.

8.2.  Informative References

   [RFC1242]  Bradner, S., "Benchmarking terminology for network
              interconnection devices", RFC 1242, July 1991.

   [RFC5481]  Morton, A. and B. Claise, "Packet Delay Variation
              Applicability Statement", RFC 5481, March 2009.

   [RFC6248]  Morton, A., "RFC 4148 and the IP Performance Metrics
              (IPPM) Registry of Metrics Are Obsolete", RFC 6248, April
              2011.

   [RFC6390]  Clark, A. and B. Claise, "Guidelines for Considering New
              Performance Metric Development", BCP 170, RFC 6390,
              October 2011.

Author's Address

   Al Morton
   AT&T Labs
   200 Laurel Avenue South
   Middletown, NJ 07748
   USA

   Phone: +1 732 420 1571
   Fax:   +1 732 368 1192
   Email: acmorton@att.com
   URI:   http://home.comcast.net/~acmacm/