IPFIX
Internet Draft                                           R. Krishnan
Intended status: Informational                Brocade Communications
Expires: April 2014                                          Ning So
October 2013                                     Tata Communications
                                                                  S.
                                                           D'Antonio
                                   University of Napoli "Parthenope"

           Flow-state Dependent Packet Selection Techniques

        draft-krishnan-ipfix-flow-aware-packet-sampling-06.txt

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79. This document may not be modified,
   and derivative works of it may not be created, except to publish it
   as an RFC and to translate it into languages other than English.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

   This Internet-Draft will expire in April 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.

Abstract

   The demands on the networking infrastructure, and thus on
   switch/router bandwidths, are growing exponentially; the drivers are
   bandwidth-hungry rich media applications, inter-data-center
   communications, etc.
   With sampling techniques, for a given sampling rate, the number of
   samples that need to be processed is increasing exponentially,
   especially for applications like security threat detection. This
   draft elaborates on flow-state dependent packet selection techniques
   and the relevant information models. It describes how these
   techniques can be used to reduce the number of samples for
   applications like security threat detection.

Table of Contents

   1. Introduction
      1.1. Acronyms
      1.2. Terminology
   2. Flow-state dependent packet selection techniques
      2.1. Information Model for flow-state dependent packet selection
           technique configuration
      2.2. Handling Inactive/Misidentified Large Flows
      2.3. Flow-state dependent packet selection - sample and hold
      2.4. IANA Considerations
         2.4.1. Registration of Information Elements
            2.4.1.1. largeFlowObservationInterval
            2.4.1.2. largeFlowBandwidthThreshold
   3. Current sampling techniques for security threat detection
   4. Application of flow-state dependent packet selection techniques
      for security threat detection
      4.1. Analysis of various flow-state dependent packet selection
           techniques
      4.2. Simulation
   5. Security Considerations
   6. Operational Considerations
   7. Acknowledgements
   8. References
      8.1.
           Normative References
      8.2. Informative References

1. Introduction

   This draft expands on the flow-state dependent packet selection
   techniques described in [RFC 7014] for identifying long-lived large
   flows, and on the relevant information models. It also describes a
   practical use case: efficient behavioral detection of security
   threats, such as Denial of Service (DOS) attacks, using flow-state
   dependent packet selection techniques.

1.1. Acronyms

   DOS: Denial of Service

   GRE: Generic Routing Encapsulation

   MPLS: Multi Protocol Label Switching

   NVGRE: Network Virtualization using Generic Routing Encapsulation

   TCAM: Ternary Content Addressable Memory

   STT: Stateless Transport Tunneling

   VXLAN: Virtual Extensible LAN

1.2. Terminology

   Large flow(s): long-lived large flow(s)

   Small flow(s): long-lived small flow(s) and short-lived small/large
   flow(s)

2. Flow-state dependent packet selection techniques

   Expanding on the work in [RFC 7014] and [RFC 5475], this draft
   suggests additional techniques for flow-state dependent packet
   selection for identifying large flows. One of these techniques,
   Multistage Filters, is described in [ESVA]. It automatically
   identifies large flows with a low false positive rate. It can be
   implemented as an inline solution in switches/routers and would be
   expected to operate at line rate.

   Besides the Multistage Filters technique described in [ESVA], the
   following techniques are also applicable:

   1) The technique suggested in [VRM], which automatically identifies
      large flows using rotating conservative counting Bloom filters
      with periodic decay. This technique has a low false positive rate
      for large flow identification.

   2) The sample and hold technique suggested in [ESVA].
      This technique has a low false positive rate for large flow
      identification.

   The large flows automatically identified using the above techniques
   are populated in the IPFIX flow cache [RFC 6728]. If a large flow
   already exists in the IPFIX flow cache, the above techniques are not
   applied to it; this is why they are called flow-state dependent
   packet selection techniques.

   Note that there is a finite probability of small flows being
   misidentified as large flows. These are handled as described in
   Section 2.2, "Handling Inactive/Misidentified Large Flows".

2.1. Information Model for flow-state dependent packet selection
     technique configuration

   To identify large flows in terms of bandwidth and duration, we
   define an observation interval and observe the bandwidth of the flow
   over that interval. A flow that meets or exceeds a certain minimum
   bandwidth threshold over that observation interval is considered a
   large flow.

   The two configuration parameters, the observation interval and the
   minimum bandwidth threshold over that observation interval, should
   be programmable in a switch or a router to facilitate handling of
   different use cases and traffic characteristics. They are defined
   below.

   largeFlowObservationInterval: The minimum time interval to observe a
   flow before performing further processing of the flow. The unit is
   milliseconds.

   largeFlowBandwidthThreshold: The minimum bandwidth of the flow during
   the observation interval for declaring the flow a large flow. The
   unit is Mbps.

   For example, a flow which is at or above 10 Mbps for a time period of
   at least 30 seconds could be declared a large flow.
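
   The large flow test described above can be sketched in a few lines
   of Python. This is only an illustration under stated assumptions,
   not a description of how a switch or router implements it: the
   names LargeFlowConfig and is_large_flow are hypothetical, the byte
   count of the flow over one observation interval is assumed to be
   available, and 1 Mbps is taken to mean 10^6 bits per second.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LargeFlowConfig:
    """Hypothetical container for the two parameters, in their stated units."""
    observation_interval_ms: int   # largeFlowObservationInterval (milliseconds)
    bandwidth_threshold_mbps: int  # largeFlowBandwidthThreshold (Mbps)


def is_large_flow(byte_count: int, config: LargeFlowConfig) -> bool:
    """Return True if the flow's average rate over one observation
    interval is at or above the bandwidth threshold.

    byte_count is the number of bytes seen for the flow during one
    observation interval (assumed input). Assumes 1 Mbps = 10**6 bits/s.
    """
    bits_observed = byte_count * 8
    # rate >= threshold
    #   <=> bits_observed / (interval_ms / 1000) >= threshold_mbps * 10**6
    # Cross-multiplied so the comparison stays in exact integer arithmetic.
    return (bits_observed * 1000
            >= config.bandwidth_threshold_mbps * 1_000_000
            * config.observation_interval_ms)


# The example from the text: 10 Mbps over 30 seconds.
config = LargeFlowConfig(observation_interval_ms=30_000,
                         bandwidth_threshold_mbps=10)

print(is_large_flow(37_500_000, config))  # True: exactly 10 Mbps
print(is_large_flow(30_000_000, config))  # False: 8 Mbps
```

   With these example values, a flow must carry at least 37,500,000
   bytes within the 30 second interval to be declared a large flow.
   Integer arithmetic is used so that the comparison at the threshold
   is exact.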

   Below is the list of flow-state dependent packet selection technique
   Information Elements:

   +------+-------------------------------+
   | ID   | Name                          |
   +------+-------------------------------+
   | TBD1 | largeFlowObservationInterval  |
   +------+-------------------------------+
   | TBD2 | largeFlowBandwidthThreshold   |
   +------+-------------------------------+

2.2. Handling Inactive/Misidentified Large Flows

   Once a flow has been recognized as a large flow, it should continue
   to be recognized as a large flow as long as the traffic received
   during an observation interval exceeds some fraction of the bandwidth
   threshold, for example 80% of the bandwidth threshold. If the traffic
   received during an observation interval falls below that fraction of
   the bandwidth threshold, the large flow should be removed from the
   IPFIX flow cache.

2.3. Flow-state dependent packet selection - sample and hold

   [RFC 7014] suggests some information model parameters for the sample
   and hold technique suggested in [ESVA]. The large flow information
   model parameters suggested in Section 2.1 are complementary to these.

2.4. IANA Considerations

2.4.1. Registration of Information Elements

   IANA will register the following IEs in the IPFIX Information
   Elements registry at http://www.iana.org/assignments/ipfix/ipfix.xml

   IANA Note: please replace TBD1 and TBD2 with the assigned values
   throughout the document.

2.4.1.1. largeFlowObservationInterval

   Description:

   The minimum time interval to observe a flow before performing further
   processing of the flow.

   Abstract Data Type: unsigned64

   ElementId: TBD1

   Units: milliseconds

   Status: Current

2.4.1.2. largeFlowBandwidthThreshold

   Description:

   The minimum bandwidth of the flow during the observation interval
   (largeFlowObservationInterval) for declaring the flow a large flow.
   Unit is in Mbps.

   Abstract Data Type: unsigned64

   ElementId: TBD2

   Units: Mbps

   Status: Current

3. Current sampling techniques for security threat detection

   Packet sampling techniques in switches and routers, e.g. PSAMP
   [RFC 5474], [RFC 5475], [RFC 5476], [RFC 5477], provide an effective
   mechanism for approximate detection of various types of flows, both
   long-lived large flows and other flows (long-lived small flows and
   short-lived small/large flows), with minimal packet replication
   bandwidth overhead. These packet sampling techniques sample all
   flows equally.

   A large percentage of the packet samples consists of long-lived
   large (aka large) flows, and a small percentage consists of other
   (aka small) flows. The large flows, aka top-talkers, consume a large
   percentage of the bandwidth and a small percentage of the flow
   space.

   The small flows, which are the typical cause of security threats like
   Denial of Service (DOS) attacks and scanning attacks, consume a
   small percentage of the bandwidth and a large percentage of the flow
   space.

4. Application of flow-state dependent packet selection techniques for
   security threat detection

   Using the flow-state dependent packet selection techniques described
   in Section 2, the large flows, or top-talkers, can be detected in
   real time with a high degree of accuracy. Only the small flows need
   to be sampled; this makes security threat detection more effective
   with minimal sampling overhead.

   The steps in security threat detection are described below.

   1) Large Flow Identification:

      To identify large flows, use the flow-state dependent packet
      selection techniques described in Section 2. This identifies the
      large flows, aka top-talkers, in real time with a high degree of
      accuracy.

   2) Large Flow Classification:

      The identified large flows can be broadly classified into two
      categories, as detailed below.

      a. Well-behaved (steady rate) large flows, e.g. video streams

      b. Bursty (fluctuating rate) large flows, e.g. Peer-to-Peer
         traffic

      The large flows can be sampled at a low rate for further analysis,
      or need not be sampled at all. If desired, the large flows can be
      exported to a central entity, e.g. a NetFlow collector, using the
      IPFIX protocol [RFC 7011] for further analysis.

   3) Small Flow Processing:

      The small flows (that is, all flows other than the large flows)
      can be sampled at a normal rate. The small flows can be examined
      to detect security threats like DOS attacks (e.g. SYN floods),
      scanning attacks, etc. [FDDOS], [PDSN], [ALDS]

   Thus, security threat detection is possible with minimal sampling
   overhead.

4.1. Analysis of various flow-state dependent packet selection
     techniques

   The multistage filter technique suggested in [ESVA] works well for
   automatically identifying large flows generated by standard
   applications, e.g. video content like movies and catch-up episodes,
   backup transactions, etc., with a detection time of approximately
   30-60 seconds. These detection times ensure that short-lived large
   flows, e.g. HD video clips, are not unnecessarily recognized as
   large flows.

   If faster large flow identification times are desired (much shorter
   than 30 seconds), the multistage filter technique suggested in
   [ESVA] may pose the problem that the effective filtered flow size is
   phase-dependent: relatively smaller constant-rate flows, e.g. HD
   video clips, that begin early within a counting Bloom filter reset
   interval would be unnecessarily detected with the same probability
   as relatively larger flows that begin toward the end of the interval.
   [VRM] suggests techniques for addressing the above problem using
   rotating conservative counting Bloom filters with periodic decay.

4.2. Simulation

   Simulation results for these flow-state dependent packet selection
   techniques are presented in Appendix A. The goal of the simulation is
   to demonstrate the effectiveness of these techniques for security
   threat detection in a multi-tenant video streaming data center.

5. Security Considerations

   This document does not directly impact the security of the Internet
   infrastructure or its applications. In fact, it proposes techniques
   which could help in identifying a DOS attack pattern.

6. Operational Considerations

   For effectively using the flow-state dependent packet selection
   techniques, the operator should adjust the programmable parameters
   largeFlowObservationInterval and largeFlowBandwidthThreshold in
   switches/routers based on the applications which are being deployed.

7. Acknowledgements

   The authors would like to thank Juergen Quittek, Brian Carpenter,
   Michael Fargano, Michael Bugenhagen, Jianrong Wong, Brian Trammell
   and Paul Aitken for all the support and valuable input.

8. References

8.1. Normative References

8.2. Informative References

   [RFC 5474] N. Duffield et al., "A Framework for Packet Selection and
              Reporting", March 2009.

   [RFC 5475] T. Zseby et al., "Sampling and Filtering Techniques for IP
              Packet Selection", March 2009.

   [RFC 5476] B. Claise, Ed. et al., "Packet Sampling (PSAMP) Protocol
              Specifications", March 2009.

   [RFC 5477] T. Dietz et al., "Information Model for Packet Sampling
              Exports", March 2009.

   [RFC 7011] B. Claise, "Specification of the IP Flow Information
              Export (IPFIX) Protocol for the Exchange of Flow
              Information", September 2013.

   [RFC 6728] G.
              Muenz et al., "Configuration Data Model for the IP Flow
              Information Export (IPFIX) and Packet Sampling (PSAMP)
              Protocols".

   [VRM]      G. Bianchi et al., "Measurement Data Reduction through
              Variation Rate Metering", INFOCOM 2010.

   [PDSN]     Ignasi Paredes-Oliva et al., "Portscan Detection with
              Sampled NetFlow", TMA 2009.

   [ALDS]     Z. Morley Mao et al., "Analyzing Large DDoS Attacks Using
              Multiple Data Sources", SIGCOMM 2006.

   [FDDOS]    David Holmes, "The DDoS Threat Spectrum", F5 White paper
              2012.

   [ESVA]     C. Estan and G. Varghese, "New Directions in Traffic
              Measurement and Accounting", ACM SIGCOMM Internet
              Measurement Workshop 2001, San Francisco (CA), Nov. 2001.

   [RFC 7014] S. D'Antonio et al., "Flow Selection Techniques",
              September 2013.

Appendix A: Simulation of Flow aware packet sampling

   Goal:

   Demonstrate the effectiveness of flow aware packet sampling in a
   practical use case, e.g. multi-tenant video streaming in a data
   center.

   Test Topology:

   Multiple virtual servers (each server hosted on a virtual machine)
   connected to a virtual switch (vSwitch), which in turn connects to
   the data center network using a 10 Gbps Ethernet interface.

   Two virtual servers are active.

   First virtual server

   . Traffic types

      o HD MPEG-4 video streams (bit rate 10 Mbps) - 100 streams -
        1 Gbps

      o SD MPEG-2 video streams (bit rate 4 Mbps) - 300 streams -
        1.2 Gbps

      o Other traffic - 500 Mbps (video clips, DOS attacks (e.g. SYN
        floods), scanning attacks, etc.)

   . Aggregate traffic - 2.7 Gbps

   Second virtual server

   . Traffic types

      o HD MPEG-4 video streams (bit rate 10 Mbps) - 50 streams -
        0.5 Gbps

      o SD MPEG-2 video streams (bit rate 4 Mbps) - 500 streams -
        2.0 Gbps

      o Backup transaction - 100 Mbps

      o Other traffic - 500 Mbps (video clips, DOS attacks (e.g. SYN
        floods), scanning attacks, etc.)

   .
     Aggregate traffic - 3.1 Gbps

   Total traffic on 2 servers - 5.8 Gbps

   Existing techniques:

   Normal sampling rate - 1:1000

   Total sampled traffic = 5.8 Gbps/1000 = 5.8 Mbps

   Flow aware sampling technique:

   Large flow recognition parameters

   . Observation interval for large flow - 60 seconds

   . Minimum bandwidth threshold over the observation interval -
     2 Mbps

   Aggregate bit rate of large flows = 4.8 Gbps

   Aggregate bit rate of small flows = 1 Gbps

   Low sampling rate of large flows - 1:10000

   Normal sampling rate of small flows - 1:1000

   Total sampled traffic = 4.8 Gbps/10000 + 1 Gbps/1000 = 1.48 Mbps

   Percentage reduction in sampled traffic (most of the remaining
   samples are from small flows) = (5.8 - 1.48)/5.8 ~= 74%

   The small flows can be examined in a central entity like a NetFlow
   collector to detect security threats like DOS attacks, scanning
   attacks, etc. Thus, security threat detection is possible with
   minimal sampling overhead.

Authors' Addresses

   Ram Krishnan
   Brocade Communications
   San Jose, 95134, USA

   Phone: +001-408-406-7890
   Email: ramk@brocade.com

   Ning So
   Tata Communications
   Plano, TX 75082, USA

   Phone: +001-972-955-0914
   Email: ning.so@tatacommunications.com

   Salvatore D'Antonio
   University of Napoli "Parthenope"
   Centro Direzionale di Napoli Is. C4
   Naples 80143
   Italy

   Phone: +39 081 5476766
   Email: salvatore.dantonio@uniparthenope.it