idnits 2.17.1 

draft-krishnan-ipfix-flow-aware-packet-sampling-06.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- Couldn't find a document date in the document -- date freshness check
     skipped.


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Unused Reference: 'RFC 5476' is defined on line 371, but no explicit
     reference was found in the text

  == Unused Reference: 'PDSN' is defined on line 387, but no explicit
     reference was found in the text

  == Unused Reference: 'ALDS' is defined on line 390, but no explicit
     reference was found in the text

  == Unused Reference: 'FDDOS' is defined on line 393, but no explicit
     reference was found in the text


     Summary: 0 errors (**), 0 flaws (~~), 5 warnings (==), 1 comment (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

1	IPFIX
2	Internet Draft                                              R. Krishnan
3	Intended status: Informational                   Brocade Communications
4	Expires: April 2014                                              Ning So
5	October 2013                                        Tata Communications
6	                                                           S. D'Antonio
7	                                      University of Napoli "Parthenope"

9	             Flow-state Dependent Packet Selection Techniques

11	          draft-krishnan-ipfix-flow-aware-packet-sampling-06.txt

13	Status of this Memo

15	   This Internet-Draft is submitted in full conformance with the
16	   provisions of BCP 78 and BCP 79. This document may not be modified,
17	   and derivative works of it may not be created, except to publish it
18	   as an RFC and to translate it into languages other than English.

20	   Internet-Drafts are working documents of the Internet Engineering
21	   Task Force (IETF), its areas, and its working groups.  Note that
22	   other groups may also distribute working documents as Internet-
23	   Drafts.

25	   Internet-Drafts are draft documents valid for a maximum of six months
26	   and may be updated, replaced, or obsoleted by other documents at any
27	   time.  It is inappropriate to use Internet-Drafts as reference
28	   material or to cite them other than as "work in progress."

30	   The list of current Internet-Drafts can be accessed at
31	   http://www.ietf.org/ietf/1id-abstracts.txt

33	   The list of Internet-Draft Shadow Directories can be accessed at
34	   http://www.ietf.org/shadow.html

36	   This Internet-Draft will expire in April 2014.

38	Copyright Notice

40	   Copyright (c) 2013 IETF Trust and the persons identified as the
41	   document authors. All rights reserved.

43	   This document is subject to BCP 78 and the IETF Trust's Legal
44	   Provisions Relating to IETF Documents
45	   (http://trustee.ietf.org/license-info) in effect on the date of
46	   publication of this document. Please review these documents
47	   carefully, as they describe your rights and restrictions with respect
48	   to this document.

50	Abstract

52	   The demands on the networking infrastructure, and thus on
53	   switch/router bandwidths, are growing exponentially; the drivers are
54	   bandwidth-hungry rich media applications, inter-data-center
55	   communications, etc. For a given sampling rate, the number of
56	   samples that need to be processed is therefore also increasing
57	   exponentially, especially for applications like security threat
58	   detection. This draft elaborates on flow-state dependent packet
59	   selection techniques and the relevant information models. It
60	   describes how these techniques can be effectively used to reduce the
61	   number of samples for applications like security threat detection.

63	Table of Contents

65	   1. Introduction
66	      1.1. Acronyms
67	      1.2. Terminology
68	   2. Flow-state dependent packet selection techniques
69	      2.1. Information Model for flow-state dependent packet selection
70	      technique configuration
71	      2.2. Handling Inactive/Misidentified Large Flows
72	      2.3. Flow-state dependent packet selection - sample and hold
73	      2.4. IANA Considerations
74	         2.4.1. Registration of Information Elements
75	            2.4.1.1. largeFlowObservationInterval
76	            2.4.1.2. largeFlowBandwidthThreshold
77	   3. Current sampling techniques for security threat detection
78	   4. Application of flow-state dependent packet selection techniques
79	   for security threat detection
80	      4.1. Analysis of various flow-state dependent packet selection
81	      techniques
82	      4.2. Simulation
83	   5. Security Considerations
84	   6. Operational Considerations
85	   7. Acknowledgements
86	   8. References
87	      8.1. Normative References
88	      8.2. Informative References

90	1. Introduction

92	   This draft expands on the flow-state dependent packet selection
93	   techniques described in [RFC 7014] for identifying long-lived large
94	   flows and the relevant information models. This draft also describes
95	   a practical use case for efficient behavioral security detection,
96	   e.g. of Denial of Service (DOS) attacks, using flow-state dependent
97	   packet selection techniques.

99	1.1. Acronyms

101	   DOS: Denial of Service

103	   GRE: Generic Routing Encapsulation

105	   MPLS: Multi Protocol Label Switching

107	   NVGRE: Network Virtualization using Generic Routing Encapsulation

109	   TCAM: Ternary Content Addressable Memory

111	   STT: Stateless Transport Tunneling

113	   VXLAN: Virtual Extensible LAN

115	1.2. Terminology

117	   Large flow(s): long-lived large flow(s)

119	   Small flow(s): long-lived small flow(s) and short-lived small/large
120	   flow(s)

122	2. Flow-state dependent packet selection techniques

124	   Expanding on the work in [RFC 7014] and [RFC 5475], this draft
125	   suggests additional techniques for flow-state dependent packet
126	   selection for identifying large flows. One of these techniques,
127	   Multistage Filters, is described in [ESVA]. This technique helps in
128	   automatically identifying large flows with a low false positive
129	   rate. It can be implemented as an inline solution in
130	   switches/routers and would be expected to operate at line rate.

133	   Besides the Multistage Filters technique, the following apply:

135	   1) The technique suggested in [VRM], which automatically identifies
136	     large flows using rotating conservative counting Bloom filters
137	     with periodic decay. This technique has a low false positive rate
138	     in large flow identification.

141	   2) The sample and hold technique suggested in [ESVA]. This technique
142	     also has a low false positive rate in large flow identification.

145	   The large flows which are automatically identified using the above
146	   techniques are populated in the IPFIX flow cache [RFC 6728]. If a
147	   large flow already exists in the IPFIX flow cache, the above
148	   techniques are not applied - this is the reason these are called
149	   flow-state dependent packet selection techniques.
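
   The flow-state dependency described above can be illustrated with a
   minimal Python sketch. It assumes a parallel multistage filter in the
   spirit of [ESVA]; the names MultistageFilter, process_packet and
   flow_cache are hypothetical, the dictionary merely stands in for the
   IPFIX flow cache, and the periodic reset of the stage counters is
   omitted for brevity.

```python
import hashlib


class MultistageFilter:
    """Illustrative parallel multistage filter (assumed structure, not normative)."""

    def __init__(self, stages=4, buckets=1024, threshold_bytes=1_000_000):
        self.stages = stages
        self.buckets = buckets
        self.threshold_bytes = threshold_bytes
        # One array of byte counters per stage.
        self.counters = [[0] * buckets for _ in range(stages)]

    def _bucket(self, stage, flow_key):
        # Derive a separate hash per stage by salting with the stage index.
        digest = hashlib.sha256(f"{stage}:{flow_key}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.buckets

    def add(self, flow_key, size):
        """Count a packet; return True if every stage counter reached the threshold."""
        passed = True
        for stage in range(self.stages):
            idx = self._bucket(stage, flow_key)
            self.counters[stage][idx] += size
            if self.counters[stage][idx] < self.threshold_bytes:
                passed = False
        return passed


def process_packet(flow_key, size, flow_cache, mfilter):
    """Apply the filter only to flows that are not yet in the flow cache."""
    if flow_key in flow_cache:
        flow_cache[flow_key] += size   # known large flow: just account for it
        return "cached"
    if mfilter.add(flow_key, size):
        flow_cache[flow_key] = size    # newly identified large flow
        return "promoted"
    return "small"
```

   In this sketch, a flow is promoted into the cache once all of its
   stage counters reach the threshold; subsequent packets of that flow
   bypass the filter entirely.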

151	   Please note that there is a nonzero probability of small flows being
152	   misidentified as large flows. These are handled as described in
153	   Section 2.2, "Handling Inactive/Misidentified Large Flows".

155	2.1. Information Model for flow-state dependent packet selection
156	   technique configuration

158	   From a bandwidth and time duration perspective, in order to identify
159	   large flows we define an observation interval and observe the
160	   bandwidth of the flow over that interval.  A flow that exceeds a
161	   certain minimum bandwidth threshold over that observation interval
162	   would be considered a large flow.

164	   The two configuration parameters -- the observation interval and the
165	   minimum bandwidth threshold over that interval -- are defined below.
166	   They should be programmable in a switch or a router to facilitate
167	   handling of different use cases and traffic characteristics.

169	   largeFlowObservationInterval: The minimum time interval to observe a
170	   flow before performing further processing of the flow. The unit is
171	   milliseconds.

173	   largeFlowBandwidthThreshold: The minimum bandwidth of the flow during
174	   the observation interval for declaring the flow a large flow. The
175	   unit is Mbps.

177	   For example, a flow which is at or above 10 Mbps for a time period of
178	   at least 30 seconds could be declared a large flow.
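
   As a minimal sketch of how the two parameters combine, the following
   Python fragment checks a byte count measured over one observation
   interval against the bandwidth threshold. The function names are
   hypothetical, and Mbps is assumed to mean 10^6 bits per second.

```python
def observed_mbps(byte_count, interval_ms):
    """Average bandwidth in Mbps (assumed 10**6 bit/s) over the interval."""
    bits = byte_count * 8
    seconds = interval_ms / 1000
    return bits / seconds / 1_000_000


def is_large_flow(byte_count,
                  large_flow_observation_interval_ms,
                  large_flow_bandwidth_threshold_mbps):
    """True if the flow was at or above the threshold over the interval."""
    rate = observed_mbps(byte_count, large_flow_observation_interval_ms)
    return rate >= large_flow_bandwidth_threshold_mbps


# The example from the text: 10 Mbps sustained for 30 seconds.
# 10 Mbps * 30 s = 300,000,000 bits = 37,500,000 bytes.
print(is_large_flow(37_500_000, 30_000, 10))
```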

180	   Below is the list of flow-state dependent packet selection technique
181	   Information Elements:

183	       +------+-------------------------------+
184	       | ID   | Name                          |
185	       +------+-------------------------------+
186	       | TBD1 | largeFlowObservationInterval  |
188	       +------+-------------------------------+
189	       | TBD2 | largeFlowBandwidthThreshold   |
191	       +------+-------------------------------+

193	2.2. Handling Inactive/Misidentified Large Flows

195	   Once a flow has been recognized as a large flow, it should continue
196	   to be recognized as a large flow as long as the traffic received
197	   during an observation interval exceeds some fraction of the bandwidth
198	   threshold, for example 80% of the bandwidth threshold. If the traffic
199	   received during an observation interval falls below that fraction of
200	   the bandwidth threshold, the large flow should be removed from the
201	   IPFIX flow cache.
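
   The retention rule above could be sketched as follows. This is an
   illustrative Python fragment, not a specified procedure: the names
   age_flow_cache and retain_fraction are hypothetical, the dictionary
   stands in for the IPFIX flow cache, and a flow exactly at the
   retention level is assumed to be kept.

```python
def age_flow_cache(flow_cache, interval_ms, threshold_mbps, retain_fraction=0.8):
    """Evict large flows whose rate fell below retain_fraction * threshold.

    flow_cache maps a flow key to the bytes seen in the observation
    interval that just ended. Retained flows have their byte count reset
    for the next interval. Returns the list of evicted flow keys.
    """
    evicted = []
    retain_mbps = retain_fraction * threshold_mbps
    for key in list(flow_cache):
        mbps = flow_cache[key] * 8 / (interval_ms / 1000) / 1_000_000
        if mbps >= retain_mbps:
            flow_cache[key] = 0        # still large: start a new interval
        else:
            del flow_cache[key]        # inactive or misidentified
            evicted.append(key)
    return evicted


# 10 Mbps threshold, 30 s interval, 80% retention level (8 Mbps).
cache = {"video-1": 33_750_000,   # 9 Mbps over 30 s: retained
         "clip-7": 18_750_000}    # 5 Mbps over 30 s: evicted
print(age_flow_cache(cache, 30_000, 10))
```

   Using a retention level below the detection threshold gives the
   cache some hysteresis, so a flow hovering around the threshold is not
   repeatedly added and removed.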

203	2.3. Flow-state dependent packet selection - sample and hold

205	   [RFC 7014] suggests some information model parameters for the sample
206	   and hold technique suggested in [ESVA]. The large flow information
207	   model parameters suggested in section 2.1 are complementary to these.
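
   For orientation, the core idea of sample and hold as commonly
   described for [ESVA] can be sketched in a few lines of Python. This
   is an assumption-laden illustration: the function name, the
   byte_sampling_probability parameter and the dictionary used as flow
   memory are hypothetical and are not the parameters of [RFC 7014].

```python
import random


def sample_and_hold(packets, byte_sampling_probability, rng=None):
    """Return per-flow byte counts for flows that were sampled and then held.

    packets is an iterable of (flow_key, size_in_bytes) pairs.
    """
    rng = rng or random.Random()
    flow_memory = {}
    for flow_key, size in packets:
        if flow_key in flow_memory:
            # Hold: once a flow has an entry, every later packet is counted.
            flow_memory[flow_key] += size
        else:
            # Sample: each byte is selected with probability p, so a packet
            # of `size` bytes creates an entry with probability 1-(1-p)^size.
            p_packet = 1.0 - (1.0 - byte_sampling_probability) ** size
            if rng.random() < p_packet:
                flow_memory[flow_key] = size
    return flow_memory
```

   Because larger flows send more bytes, they are more likely to obtain
   an entry early, after which their traffic is counted exactly.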

209	2.4. IANA Considerations

211	2.4.1. Registration of Information Elements

213	   IANA will register the following IEs in the IPFIX Information
214	   Elements registry at http://www.iana.org/assignments/ipfix/ipfix.xml

216	   IANA Note: please replace TBD1 and TBD2 with the assigned values
217	   throughout the document.

219	2.4.1.1. largeFlowObservationInterval

221	   Description:

223	   The minimum time interval to observe a flow before performing
224	   further processing of the flow.

226	   Abstract Data Type: unsigned64
227	   ElementId: TBD1

229	   Units: milliseconds

231	   Status: Current

233	2.4.1.2. largeFlowBandwidthThreshold

235	   Description:

237	   The minimum bandwidth of the flow during the observation interval
238	   (largeFlowObservationInterval) for declaring the flow a large flow.
239	   The unit is Mbps.

241	   Abstract Data Type: unsigned64

243	   ElementId: TBD2

245	   Units: Mbps

247	   Status: Current

249	3. Current sampling techniques for security threat detection

251	   Packet sampling techniques, e.g. PSAMP [RFC 5474], [RFC 5475],
252	   [RFC 5476], [RFC 5477], in switches and routers provide an effective
253	   mechanism for approximate detection of various types of flows --
254	   long-lived large flows and other flows (which include long-lived
255	   small flows and short-lived small/large flows) -- with minimal packet
256	   replication bandwidth overhead. The packet sampling techniques sample
257	   all flows equally.

259	   A large percentage of the packet samples belong to long-lived large
260	   (aka large) flows, and a small percentage belong to other (aka
261	   small) flows. The large flows, aka top-talkers, consume a large
262	   percentage of the bandwidth and a small percentage of the flow
263	   space.

265	   The small flows, which are the typical cause of security threats like
266	   Denial of Service (DOS) attacks, scanning attacks etc., consume a
267	   small percentage of the bandwidth and a large percentage of the flow
268	   space.

270	4. Application of flow-state dependent packet selection techniques for
271	   security threat detection

273	   Using the flow-state dependent packet selection techniques described
274	   in Section 2, the large flows or top-talkers can be detected in real-
275	   time with a high degree of accuracy. Only the small flows need to be
276	   sampled -- this makes security threat detection more effective with
277	   minimal sampling overhead.

279	   The steps in security threat detection are described below:

281	   1) Large Flow Identification:

283	     For identifying large flows, use the flow-state dependent packet
284	     selection techniques described in Section 2. This helps in
285	     identifying the large flows aka top-talkers in real-time with a
286	     high degree of accuracy.

288	   2) Large Flow Classification:

290	     The identified large flows can be broadly classified into 2
291	     categories as detailed below.

293	        a.  Well behaved (steady rate) large flows, e.g. video streams

295	        b.  Bursty (fluctuating rate) large flows e.g. Peer-to-Peer
296	          traffic

298	     The large flows can be sampled at a low rate for further analysis
299	     or need not be sampled at all. If desired, the large flows could be
300	     exported to a central entity, e.g. a NetFlow Collector, using the
301	     IPFIX protocol [RFC 7011] for further analysis.

303	   3) Small Flow Processing:

305	     The small flows (excluding the large flows) can be sampled at a
306	     normal rate. The small flows can be examined for determining
307	     security threats like DOS attacks (e.g. SYN floods), scanning
308	     attacks, etc. [FDDOS], [PDSN], [ALDS].

310	   Thus, security threat detection is possible with minimal sampling
311	   overhead.
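
   The three steps above amount to choosing a sampling rate per packet
   based on whether its flow is in the flow cache. A minimal Python
   sketch follows, assuming systematic count-based 1-in-N sampling; the
   class and function names are hypothetical, and the default rates of
   1:1000 and 1:10000 are taken from the simulation in Appendix A.

```python
class CountBasedSampler:
    """Selects every N-th packet offered to it (systematic count-based)."""

    def __init__(self, one_in_n):
        self.one_in_n = one_in_n
        self.seen = 0

    def select(self):
        self.seen += 1
        if self.seen == self.one_in_n:
            self.seen = 0
            return True
        return False


def make_selector(flow_cache, small_rate=1000, large_rate=10000):
    """Build a per-packet selector; large_rate=None means never sample large flows."""
    small = CountBasedSampler(small_rate)
    large = CountBasedSampler(large_rate) if large_rate else None

    def select(flow_key):
        if flow_key in flow_cache:          # identified large flow
            return large.select() if large else False
        return small.select()               # everything else: normal rate

    return select
```

   A device would call the returned selector once per packet; selected
   packets from small flows are then available for threat analysis.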

313	4.1. Analysis of various flow-state dependent packet selection
314	   techniques

316	   The multistage filter technique suggested in [ESVA] for automatic
317	   identification works well for standard applications generating large
318	   flows, e.g. video content like movies and catch-up episodes, backup
319	   transactions, etc., with a detection time of approximately 30-60
320	   seconds. These detection times ensure that short-lived large flows,
321	   e.g. HD video clips, are not unnecessarily recognized.

323	   If faster large flow identification times are desired (much shorter
324	   than 30s), the multistage filter technique suggested in [ESVA] may
325	   pose the problem that the effective filtered flow size is
326	   phase-dependent: that is, relatively smaller constant-rate flows,
327	   e.g. HD video clips, beginning early within a counting Bloom filter
328	   reset interval would be unnecessarily detected with the same
329	   probability as relatively larger flows beginning toward the end of
330	   the interval. [VRM] suggests addressing this problem using rotating
331	   conservative counting Bloom filters with periodic decay.

333	4.2. Simulation

335	   Simulation results for these flow-state dependent packet selection
336	   techniques are presented in Appendix A. The goal of the simulation is
337	   to demonstrate the effectiveness of these techniques for security
338	   threat detection in a multi-tenant video streaming data center.

340	5. Security Considerations

342	   This document does not directly impact the security of the Internet
343	   infrastructure or its applications. In fact, it proposes techniques
344	   which could help in identifying a DOS attack pattern.

346	6. Operational Considerations

348	   For effectively using the flow-state dependent packet selection
349	   techniques, the operator should adjust the programmable parameters
350	   largeFlowObservationInterval and largeFlowBandwidthThreshold in
351	   switches/routers based on the applications which are being deployed.

353	7. Acknowledgements

355	   The authors would like to thank Juergen Quittek, Brian Carpenter,
356	   Michael Fargano, Michael Bugenhagen, Jianrong Wong, Brian Trammell
357	   and Paul Aitken for all the support and valuable input.

359	8. References

361	8.1. Normative References

363	8.2. Informative References

365	   [RFC 5474] N. Duffield et al., "A Framework for Packet Selection and
366	   Reporting", March 2009.

368	   [RFC 5475] T. Zseby et al., "Sampling and Filtering Techniques for IP
369	   Packet Selection", March 2009.

371	   [RFC 5476] B. Claise, Ed. et al., "Packet Sampling (PSAMP) Protocol
372	   Specifications", March 2009.

374	   [RFC 5477] T. Dietz et al., "Information Model for Packet Sampling
375	   Exports", March 2009.

377	   [RFC 7011] B. Claise, "Specification of the IP Flow Information
378	   Export (IPFIX) Protocol for the Exchange of Flow
379	   Information", September 2013

381	   [RFC 6728] G. Muenz et al., "Configuration Data Model for the IP Flow
382	   Information Export (IPFIX) and Packet Sampling (PSAMP) Protocols"

384	   [VRM] G. Bianchi et al., "Measurement Data Reduction through
385	   Variation Rate Metering", INFOCOM 2010

387	   [PDSN] Ignasi Paredes-Oliva et al., "Portscan Detection with Sampled
388	   NetFlow", TMA 2009

390	   [ALDS] Z. Morley Mao et al., "Analyzing Large DDoS Attacks Using
391	   Multiple Data Sources", SIGCOMM 2006

393	   [FDDOS] David Holmes, "The DDoS Threat Spectrum", F5 White paper 2012

395	   [ESVA] C. Estan and G. Varghese, "New Directions in Traffic
396	   Measurement and Accounting", ACM SIGCOMM Internet Measurement
397	   Workshop 2001, San Francisco (CA) Nov. 2001.

399	   [RFC 7014] S. D'Antonio et al., "Flow Selection Techniques",
400	   September 2013

402	Appendix A: Simulation of Flow aware packet sampling

404	   Goal:

406	   Demonstrate the effectiveness of flow aware packet sampling in a
407	   practical use case, e.g. multi-tenant video streaming in a data
408	   center.

410	   Test Topology:

412	   Multiple virtual servers (server hosted on a virtual machine)
413	   connected to a virtual switch (vSwitch) which in turn connects to the
414	   data center network using a 10Gbps ethernet interface.

416	   2 virtual servers are active.

418	   First virtual server

420	     .  Traffic types

422	          o HD MPEG-4 video streams (bit rate 10Mbps) - 100 - 1Gbps

424	          o SD MPEG-2 video streams (bit rate 4Mbps) - 300 - 1.2Gbps

426	          o Other traffic - 500Mbps (video clips, DOS attacks (e.g.
427	             SYN floods), scanning attacks, etc.)

429	     .  Aggregate traffic - 2.7Gbps

431	   Second virtual server

433	     .  Traffic types

435	          o HD MPEG-4 video streams (bit rate 10Mbps) - 50 - .5Gbps

437	          o SD MPEG-2 video streams (bit rate 4Mbps) - 500 - 2.0Gbps

439	          o Backup transaction - 100Mbps

441	          o Other traffic - 500Mbps (video clips, DOS attacks (e.g.
442	             SYN floods), scanning attacks, etc.)

444	     .  Aggregate traffic - 3.1Gbps

446	   Total traffic on 2 servers - 5.8Gbps

448	   Existing techniques:

450	   Normal sampling rate - 1:1000

452	   Total sampled traffic = 5.8Gbps/1000 = 5.8Mbps

453	   Flow aware sampling technique:

455	   Large flow recognition parameters

457	     .  Observation interval for large flow - 60 seconds

459	     .  Minimum bandwidth threshold over the observation interval -
460	        2Mbps

462	   Aggregate bit rate of large flows = 4.8Gbps

464	   Aggregate bit rate of small flows = 1Gbps

466	   Low sampling rate of large flows - 1:10000

468	   Normal sampling rate of small flows - 1:1000

470	   Total sampled traffic = 4.8Gbps/10000 + 1Gbps/1000 = 1.48Mbps

472	   Percentage improvement in sampling (most of the samples are only
473	   small flows) = (5.8 - 1.48)/5.8 ~= 74%
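
   The arithmetic of this comparison can be re-derived from the traffic
   mix listed above with a short Python sketch. The variable and
   function names are illustrative only; all rates are in Mbps.

```python
# Large flows from the test topology: (number of streams, Mbps per stream).
large_flows_mbps = sum(count * rate for count, rate in [
    (100, 10),   # first server, HD MPEG-4
    (300, 4),    # first server, SD MPEG-2
    (50, 10),    # second server, HD MPEG-4
    (500, 4),    # second server, SD MPEG-2
    (1, 100),    # second server, backup transaction
])
# "Other traffic" on the two servers is treated as small flows.
small_flows_mbps = 500 + 500


def sampled_mbps(traffic_mbps, one_in_n):
    """Sampled traffic volume for a 1-in-N sampling rate."""
    return traffic_mbps / one_in_n


uniform = sampled_mbps(large_flows_mbps + small_flows_mbps, 1000)
flow_aware = (sampled_mbps(large_flows_mbps, 10000)
              + sampled_mbps(small_flows_mbps, 1000))
reduction_pct = 100 * (uniform - flow_aware) / uniform

print(f"uniform sampling:    {uniform:.2f} Mbps")
print(f"flow aware sampling: {flow_aware:.2f} Mbps")
print(f"reduction:           {reduction_pct:.1f}%")
```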

475	   The small flows can be examined in a central entity like a NetFlow
476	   Collector for determining security threats like DOS attacks, scanning
477	   attacks, etc. Thus, security threat detection is possible with
478	   minimal sampling overhead.

480	Authors' Addresses

482	   Ram Krishnan
483	   Brocade Communications
484	   San Jose, 95134, USA

486	   Phone: +001-408-406-7890
487	   Email: ramk@brocade.com

489	   Ning So
490	   Tata Communications
491	   Plano, TX 75082, USA

493	   Phone: +001-972-955-0914
494	   Email: ning.so@tatacommunications.com

496	   Salvatore D'Antonio
497	   University of Napoli "Parthenope"
498	   Centro Direzionale di Napoli Is. C4
499	   Naples  80143
500	   Italy

502	   Phone: +39 081 5476766
503	   EMail: salvatore.dantonio@uniparthenope.it