INTERNET-DRAFT                                             Nick Duffield
draft-duffield-framework-papame-01                      Albert Greenberg
                                                   Matthias Grossglauser
Feb 27, 2002                                            Jennifer Rexford
                                                    AT&T Labs - Research

              A Framework for Passive Packet Measurement

   Copyright (C) The Internet Society (2001). All Rights Reserved.

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   A wide range of traffic engineering and troubleshooting tasks depend
   on reliable, timely, and detailed traffic measurements. We describe
   a passive packet measurement framework that (a) is general enough
   to serve as the basis for a wide range of operational tasks, and
   (b) relies on a small set of primitives that facilitate uniform
   deployment in router interfaces or dedicated measurement devices,
   even at very high speeds. This document describes the motivation
   for such a framework through several operational examples, defines
   the measurement primitives (filtering, sampling, and hashing), and
   illustrates their use.

1 Motivation

   Framework: This document describes a framework for a standard
   set of capabilities for network elements to sample packets and
   report on them. One motivation to standardize these capabilities
   comes from the requirement for measurement-based support for
   network management and control across multivendor domains. This
   requires domain-wide consistency in the types of sampling schemes
   available, the manner in which the resulting measurements are
   presented, and consequently, consistency of the interpretation that
   can be put on them.

   Relation to other work: The measurement capabilities are positioned
   as suppliers of packet samples to higher-level consumers, including
   both remote collectors and applications, and on-board
   measurement-based applications. Indeed, development of the
   standards within the framework described here should take into
   account the measurement requirements of standards in other IETF
   WGs, including IPPM and TEWG. Conversely, we expect that aspects of
   this framework not specifically concerned with the central issue of
   packet sampling may be able to leverage work in other WGs. The
   prime example is the format and export of measurement reports,
   which may leverage the work of IPFIX.

   Applications: We first describe several representative operational
   applications that require traffic measurements at various levels of
   temporal and spatial granularity.

   Example 1: Troubleshooting

   A network operator typically monitors aggregate statistics on a
   per-link basis. Such aggregate statistics may include the total
   number of packets and bytes and the number of dropped packets and
   bytes. These statistics are typically moving averages over
   relatively long time windows (e.g., 5 minutes), and serve as a
   coarse-grained indication of the operational health of the network.
   The most common method of obtaining such measurements is through
   the appropriate SNMP MIBs (MIB-II and vendor-specific MIBs).

   Suppose an operator detects a link that is persistently overloaded
   and experiences significant packet drop rates. There is a wide
   range of potential causes: routing parameters (e.g., OSPF link
   weights) that are poorly adapted to the traffic matrix, e.g.,
   because of a shift in that matrix; a denial of service attack or a
   flash crowd; a routing problem such as link flapping. In most
   cases, aggregate link statistics are not sufficient to distinguish
   between such causes and to decide on an appropriate corrective
   action. For example, if routing over two links is unstable, and the
   links flap between being overloaded and inactive, this might be
   averaged out in a 5-minute window, indicating moderate loads on
   both links.

   Hence, the operator must be able to drill down into the traffic on
   a link, and obtain measurements that are more fine-grained both in
   space and in time. The operator has to be able to determine how
   many bytes/packets are generated for each source/destination
   address, port number, and prefix, or other attributes, such as
   protocol number, MPLS forwarding equivalence class (FEC), type of
   service, etc. This allows the operator to pinpoint precisely the
   nature of the offending traffic. For example, in the case of a DDoS
   attack, the operator would see a significant fraction of traffic
   with an identical destination address.

   Example 2: Characterizing Demand

   Traffic engineering has two goals: optimizing the quality of
   service provided to customers, and optimizing the use of network
   resources. This is achieved through network-wide control of
   routing, traffic classification and differentiation, and resource
   allocation. Traffic measurements are necessarily part of such a
   closed control loop. Specifically, the operator has to be able to
   measure the total network-wide traffic demand at several levels of
   granularity and time scales.

   For example, in order to optimize intradomain routing by modifying
   OSPF link weights or by configuring MPLS tunnels, the volume per
   ingress-egress pair (the traffic matrix) has to be measured. At a
   longer time scale (weeks to months), measurements also drive
   topology and capacity planning and the management of peering
   agreements. Topology and capacity planning involves upgrading
   links and routers and modifying the network topology to be
   well-adapted to the prevailing traffic pattern. This includes
   deciding where new customers should be attached. A natural
   representation for traffic demand to drive topology and capacity
   planning is a previous/next-hop AS traffic matrix, which
   characterizes demand in terms of neighboring ASs. Managing peering
   agreements, i.e., making strategic decisions about setting up and
   retiring peering agreements, and modifying the terms of existing
   ones (e.g., where to interconnect with peers), benefits from a
   source/destination AS traffic matrix, because the set of
   neighboring ASs may change as a result of peering management.

   Therefore, in general, it is necessary to obtain averages over
   various time scales of the entire traffic carried by a network
   domain. The spatial resolution of these averages includes the
   source and destination IP address, AS, prefix, port number, and the
   previous and next hop AS with respect to the measurement domain.
   Furthermore, if a service provider uses multiple service types, it
   should also be possible to measure these matrices individually per
   service type.

   Example 3: Direct Observation of Network Behavior

   In certain circumstances, precise information about the spatial
   flow of traffic through the network domain is required to detect
   and diagnose problems and verify correct network behavior. For
   example, in the case of the overloaded link in Example 1, it would
   be very helpful to know the precise set of paths that packets
   traversing this link follow. This would readily reveal a routing
   problem such as a loop, or a link with a misconfigured weight. More
   generally, complex diagnosis scenarios can benefit from measurement
   of traffic intensities (and other attributes) over a set of paths
   that is constrained in some way. For example, if a multihomed
   customer complains about performance problems on one of the access
   links from a particular source address prefix, the operator should
   be able to examine in detail the traffic from that source prefix
   which also traverses the specified access link towards the
   customer.

   While it is in principle possible to obtain the spatial flow of
   traffic through auxiliary network state information, e.g., by
   downloading routing and forwarding tables from routers, this
   information is often unreliable, outdated, voluminous, and
   contingent on a network model. For operational purposes, a direct
   observation of traffic flow is more reliable, as it does not depend
   on any such auxiliary information. For example, if there was a bug
   in a router's software, direct observation would allow the operator
   to diagnose the effect of this bug, while an indirect method would
   not.

2 Goals

   The main goal of this proposal is to define a measurement framework
   that relies on three canonical primitives: packet sampling,
   filtering, and hashing. A wide spectrum of applications, including
   those described in the previous section, is enabled by
   measurements obtained through combinations of these three
   primitives. Furthermore, a sampling device based on these
   measurement primitives is relatively simple, as (a) it requires
   only minimal per-packet processing, and (b) it requires little
   (local) memory. Therefore, the proposed framework represents an
   effective tradeoff between implementation complexity and the range
   of traffic engineering applications and other operational tasks it
   enables.

   More generally, the following goals motivate the proposed framework:

   o Support a very wide range of applications that can be built on
     traffic measurement (Section 4), using a very small set of
     primitives implemented ubiquitously.

   o Aim for ubiquity, by including in the minimal set of primitives
     functions that can be implemented at maximal line rate with
     minimal additional state.

   o Aim for ubiquity, by not forcing tight integration with packet
     control actions (policing, marking, shaping, queueing).

   o Allow for extensibility, which can be applied where needed
     (depending on the application) for enhanced functionality.

   o Aim for flexibility in data export format and options.

   o A common data stream must support different applications, teams
     and organizations (e.g., traffic engineering, marketing, billing)
     concurrently.

   o Allow for flexibility in implementation. In particular, export
     of local router state information can be decoupled from export of
     usage information.

   o Ease of configuration of sampling and export parameters, e.g., for
     automated remote reconfiguration in response to measurements.

   o Allow transparent interpretation of measurements through
     inclusion of sampling configuration in the reporting stream.

   o Allow robust interpretation of measurements with respect to
     reports missing due to loss in transport, or omission at the
     measurement device.

3 Measurement Functionality

3.1 Measurement Information Flow

   The framework for passive measurement has three main parts: the
   selection of packets for measurement, the creation and export of
   measurement reports, and the content and format of the measurement
   records. Because of the increasing number of distinct measurement
   applications, we believe it is desirable to set up parallel
   measurement information flows from the stream of packets. Each
   information flow should consist of independently configurable
   pipelines for selecting packets and exporting measurement records.

   The processing of each measurement information flow should, as far
   as possible, be independent. However, resource constraints may
   prevent complete reporting on a packet selected for multiple
   information flows. In this case, reporting for the packet must be
   complete for at least one information flow; other information flows
   need only report that they selected the packet. The priority
   amongst information flows to report packets must be configurable.

3.2 Packet Selection

   The function of packet selection is to select a subset out of the
   stream of all packets. Selection may be used to select a subset of
   packets of interest based on their content, and/or to reduce the
   rate of packets into the measurement flow regardless of content.
   Packet selection is performed through a combination of the
   measurement primitives described below. In this document we do not
   set any restrictions on the form these combinations can take.

   o Hashing:

     A hashing function operates on a subset of packet bits and
     associates the resulting hash with the packet. Bit positions can
     be excluded from the input to the hashing function by masking.
     This ability would be used, for example, by applications that
     require the hash to be independent of packet header fields, such
     as TTL or header CRC, that change as the packet passes through
     the network.

   o Filtering:

     Filtering is accomplished by applying mask/match operations to any
     combination of bit positions from the packet and the configured
     hashes. The mask/match operation is configurable independently for
     each filter. Higher-level interfaces to the mask/match primitive
     may be used to specify masks and matches for particular fields,
     for example, for IP addresses and/or TCP/UDP port numbers.

   o Sampling:

     Each sampler will be individually configurable to sample packets
     with a certain probability p. Examples are probabilistic sampling,
     in which each packet is selected quasirandomly with probability p,
     and deterministic sampling, in which packets are sampled
     periodically with period 1/p. In some sampling schemes, the
     sampling probability may depend on the packet content. Sampling at
     full line rate with probability p=1 is not excluded in principle,
     although resource constraints may not support it in practice.

   In order to be able to function at line rates, each measurement
   primitive takes as its input only the packet itself, or quantities
   that have been calculated from the packet previously by other
   measurement primitives. Router state is not assumed to be available
   to the measurement primitives.

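   The three primitives above can be illustrated with a minimal
   sketch. It is not part of the framework: CRC-32 stands in for the
   unspecified hash function, the byte offsets assume a plain IPv4
   header without options, and all function and class names
   (invariant_hash, mask_match, ProbabilisticSampler, PeriodicSampler)
   are hypothetical.

```python
import random
import zlib

# Byte offsets of IPv4 header fields that change at every hop:
# TTL (byte 8) and header checksum (bytes 10-11).
MUTABLE_IPV4_BYTES = (8, 10, 11)


def invariant_hash(packet: bytes, masked=MUTABLE_IPV4_BYTES) -> int:
    """Hash the packet with the masked byte positions zeroed.

    CRC-32 is a stand-in; the framework does not name a hash function.
    """
    buf = bytearray(packet)
    for pos in masked:
        if pos < len(buf):
            buf[pos] = 0
    return zlib.crc32(bytes(buf))


def mask_match(data: bytes, offset: int, mask: bytes, match: bytes) -> bool:
    """True if the bytes at `offset`, ANDed with `mask`, equal `match`."""
    window = data[offset:offset + len(mask)]
    if len(window) != len(mask) or len(match) != len(mask):
        return False
    return all((b & m) == v for b, m, v in zip(window, mask, match))


class ProbabilisticSampler:
    """Select each packet independently with probability p."""

    def __init__(self, p: float, seed=None):
        self.p = p
        self.rng = random.Random(seed)

    def select(self) -> bool:
        return self.rng.random() < self.p


class PeriodicSampler:
    """Select every `period`-th packet (period = 1/p)."""

    def __init__(self, period: int):
        self.period = period
        self.count = 0

    def select(self) -> bool:
        self.count += 1
        if self.count >= self.period:
            self.count = 0
            return True
        return False


# A 20-byte IPv4 header seen at two consecutive hops: only the TTL and
# the header checksum differ (the checksum values are placeholders).
hop1 = bytes([0x45, 0, 0, 20, 0x12, 0x34, 0, 0, 64, 6, 0xAB, 0xCD,
              10, 0, 0, 1, 192, 0, 2, 7])
hop2 = hop1[:8] + bytes([63]) + hop1[9:10] + bytes([0xAC, 0xCD]) + hop1[12:]

same_hash = invariant_hash(hop1) == invariant_hash(hop2)

# Filter on packet bits: protocol == 6 (TCP), destination in 192.0.2.0/24.
is_tcp = mask_match(hop1, 9, b"\xff", b"\x06")
to_prefix = mask_match(hop1, 16, b"\xff\xff\xff\x00", bytes([192, 0, 2, 0]))

# Filter on the hash itself: a mask/match on its low 4 bits gives the
# same decision at every hop, because the hash ignores mutable fields.
hash_selected = (invariant_hash(hop1) & 0xF) == 0

# Deterministic sampling with period 4 over a run of 12 packets.
periodic = PeriodicSampler(period=4)
picked = sum(periodic.select() for _ in range(12))
```

   Each function reads only the packet bytes or a value already
   computed from them, in keeping with the requirement that the
   primitives do not depend on router state.
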
3.3 Report Generation and Export

   Although the primary goal of this draft is to set up a framework
   for the sampling operations themselves, utilization of the
   resulting measurements places requirements on the information
   available for export, and on the methods by which reports are
   exported. Any scheme that can accommodate the framework described
   in this section and Section 3.4 is a suitable candidate for the
   job.

   Report preparation involves selecting fields of interest from each
   sampled packet, then adjoining subsidiary information (e.g., hash
   values, byte and packet counts, timestamps, etc.) from the
   selection process and router state information. The router state
   values may depend on the packet content (e.g., the IP prefix or
   Autonomous System associated with the destination address in the IP
   header, the input and output interfaces that carried the packet,
   etc.). Reports may also include subsidiary quantities calculated
   as a function of the selected packet and the router state. To
   simplify the design, some of the subsidiary information and router
   state may be incorporated when the records are exported, rather
   than when the packets are selected. However, all such router state
   information must be included for reporting in a timely manner, in
   order that it reflects the actual state encountered by the packet.

   The device generating the measurement records is configured to
   transmit the data to one or more collection systems, identified by
   IP address and port number. Exporting these records to other
   systems introduces several practical issues that have important
   implications for the analysis of the data:

   o Transport: Two basic modes of transport are possible: unreliable
     and reliable. In the unreliable mode, a completed measurement
     packet from the export module is encapsulated into a UDP packet
     and sent to the configured address (the collection system). The
     sending device does not need to keep state about this packet
     (other than possibly a sequence number to detect lost measurement
     packets). In the reliable mode, the device exports records via a
     TCP connection to the collection system. The device must be
     capable of receiving packets (such as acknowledgments) from the
     collection system and retransmitting lost packets.

   o Export rate: The device should impose a (configurable) limit on
     the number of measurement records per unit time. Otherwise, the
     measurement device could overload the network and the collection
     system. This problem would be exacerbated in the reliable
     transport mode, where the device would retransmit any lost packets
     (thereby imposing an additional load on the network). At times,
     the device may generate new records faster than the allowed export
     rate. In this situation, the device should discard the excess
     records rather than transmitting them to the collection system.
     The device may record information (such as sequence numbers, or
     packet and byte counter values accumulated at the inputs and
     outputs of a packet selector) to aid the collection system in
     compensating for the missing data in any subsequent analysis.

   o Maximum delay in exporting records: The device may queue
     measurement records in order to export multiple records in a
     single packet. However, the device should bound the delay in
     exporting measurement records, even if the number of records is
     small. This is important for two reasons. First, having an upper
     bound on the export delay ensures that the collection system has
     up-to-date information about the sampled packets. Second, in some
     scenarios, the device may associate a timestamp with the record(s)
     at the export stage. Limiting the delay in exporting the records
     places a tight bound on the inaccuracy in the timestamp
     information.

     The device can impose a (configurable) Maximum Transmission Unit
     (MTU) size for reports.

   o Local Export: Packet reports may also be directly exported to
     on-board measurement-based applications, for example those that
     form composite statistics from more than one packet. Local export
     may be presented through an interface direct to the higher-level
     applications, i.e., without employing the transport used for
     off-board export.

3.4 Measurement Record Format

   Report export involves the bundling of one or more measurement
   records and sending a packet to the collection system. The report
   includes several types of information, such as:

   o Per-packet information: The measurement record for each sampled
     packet includes various header fields (e.g., IP addresses, port
     numbers, ToS bits, TCP flags, etc.), as well as subsidiary
     information (e.g., timestamp, input and output links, other router
     state, hash values, etc.).

   o Configuration information: The stream of reports should provide
     information about the configuration of the measurement flow (e.g.,
     the sampling frequency, the sampling technique and associated
     parameters, the mask/match filter, etc.). This ensures that the
     measurement data are self-describing and allows the collection
     system to analyze the measurement data without a separate feed of
     the configuration state. Changes in configuration must be
     immediately reflected in the report stream.

   o Aggregate information: The reports should include sufficient
     information for the collection system to account for discarded
     measurement records and lost exported packets. For example, the
     reports could include sequence numbers to enable the collection
     machine to detect lost reports. The reports could include a count
     of the number of bytes and packets that matched the filter, or
     that passed both the filtering and sampling stages.

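   As an illustration of how a collection system might use the
   configuration and aggregate information, the following sketch
   counts lost reports from sequence-number gaps and derives an
   effective sampling rate from a filter-match counter. The Report
   layout and its field names are hypothetical, not a proposed record
   format. The sketch assumes sequence numbers that increase by one
   per report without duplicates or wraparound, and a cumulative
   matched-packet counter; it cannot detect reports lost before the
   first or after the last one received.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Report:
    seq: int              # export sequence number of this report
    sampling_p: float     # configured sampling probability
    matched_packets: int  # cumulative count of packets that passed the filter
    records: List[bytes] = field(default_factory=list)


def lost_reports(reports: List[Report]) -> int:
    """Count sequence-number gaps among the reports that arrived."""
    seqs = sorted(r.seq for r in reports)
    return sum(b - a - 1 for a, b in zip(seqs, seqs[1:]))


def effective_rate(reports: List[Report]) -> float:
    """Records received per packet that matched the filter."""
    if not reports:
        return 0.0
    received = sum(len(r.records) for r in reports)
    matched = max(r.matched_packets for r in reports)
    return received / matched if matched else 0.0


# Reports 1, 2, 4 and 5 arrive; report 3 was lost in transport.
arrived = [
    Report(seq=s, sampling_p=0.1, matched_packets=100 * s,
           records=[b"rec"] * 10)
    for s in (1, 2, 4, 5)
]
missing = lost_reports(arrived)   # one gap in the sequence
rate = effective_rate(arrived)    # 40 records / 500 matched packets
scale = 1 / rate                  # packets represented by each record
```

   In this example the effective rate falls below the configured
   probability of 0.1 because of the lost report, so scaling by the
   effective rate rather than by the configured probability
   compensates for the missing data.
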
   To conserve storage space and network bandwidth, the device may
   compress the measurement records as they are stored or exported.
   Compression should be quite effective since the sampled packets may
   share many fields in common (especially if the filter focuses on
   packets with certain values in particular header fields).

3.5 Configuration and Management

   All configuration parameters associated with the sampling of
   packets and export of measurements are to be contained in a MIB. A
   secure protocol is to be used to access the MIB for
   reconfiguration and retrieval of the parameters.

4 Applications

   We describe a representative set of operational applications
   enabled by the passive measurement device described in the previous
   section, by referring back to the examples in Section 1.

   Example 1: Troubleshooting

   Packet sampling is ideally suited to determining the composition of
   the traffic (e.g., on a link) in terms of various attributes
   (source and destination address and port numbers, prefix, protocol
   number, type of service, etc.). Typically, unfiltered sampling would
   be used to obtain a coarse-grained view of the traffic, for example
   on a link. Once the characteristics of an interesting subset of
   traffic (e.g., a service type, or a source address prefix
   corresponding to some customer) have been identified, the
   resolution can be refined by filtering for this traffic, and by
   boosting the sampling rate correspondingly. In this way, the
   traffic can be examined and characterized ("sliced and diced")
   arbitrarily.

   Example 2: Characterizing Demand

   Characterizing demand for an entire network domain will likely be
   achieved by sampling packets on all the ingress links, or some
   other well-chosen cut set. The sampling rate would typically be
   chosen relatively low, given that we are interested in averages
   over longer time scales, e.g., to detect significant systemic
   shifts in demand not due to random fluctuations. Some of the
   subsidiary fields included in reports, such as source and
   destination AS, and input and output link, will be useful,
   depending on the spatial granularity of demand characterization.

   Example 3: Direct Observation of Network Behavior

   Direct observation of the spatial flow of traffic through the
   domain can be achieved through a method called trajectory sampling,
   which relies on the hash function to make sampling decisions
   [DG01]. Specifically, the hash function is computed over a
   predefined set of fields of the IP packet header and payload. If
   the hash value for a packet falls within a configurable interval
   [a,b], then the packet should be sampled; otherwise, it should not
   be sampled. This feature yields the full paths followed by sampled
   packets, by ensuring that a packet is sampled on every router it
   traverses, or on no router at all. This requires that the hash
   function and the set of packet fields over which it is computed are
   the same everywhere.

   A similar use of hash functions has also been considered for
   hash-based IP traceback of distributed denial-of-service (DDoS)
   attacks [SPSJTKS01].

5 References

   [B88] R. T. Braden, A pseudo-machine for packet monitoring and
   statistics, Proc. ACM SIGCOMM 1988.

   [DG01] N. G. Duffield and M. Grossglauser, Trajectory Sampling for
   Direct Traffic Observation, IEEE/ACM Trans. on Networking, 9(3),
   pp. 280-292, June 2001.

   [SPSJTKS01] A. C. Snoeren, C. Partridge, L. A. Sanchez, C. E. Jones,
   F. Tchakountio, S. T. Kent, W. T. Strayer, Hash-Based IP Traceback,
   Proc. ACM SIGCOMM 2001, San Diego, CA, September 2001.

6 Authors' Addresses

   Nicholas G. Duffield
   AT&T Labs - Research
   Room B-139
   180 Park Ave
   Florham Park NJ 07932, USA
   Phone: +1 973-360-8726
   Email: duffield@research.att.com

   Albert Greenberg
   AT&T Labs - Research
   Room A-161
   180 Park Ave
   Florham Park NJ 07932, USA
   Phone: +1 973-360-8730
   Email: albert@research.att.com

   Matthias Grossglauser
   AT&T Labs - Research
   Room A-167
   180 Park Ave
   Florham Park NJ 07932, USA
   Phone: +1 973-360-7172
   Email: mgross@research.att.com

   Jennifer Rexford
   AT&T Labs - Research
   Room A-169
   180 Park Ave
   Florham Park NJ 07932, USA
   Phone: +1 973-360-8728
   Email: jrex@research.att.com

7 Intellectual Property Statement

   AT&T Corp. may own intellectual property applicable to this
   contribution. AT&T is currently reviewing its licensing intent
   relative to the Intellectual Property and will notify the IETF when
   AT&T has made a determination of that intent.

8 Full Copyright Statement

   Copyright (C) The Internet Society (1999). All Rights Reserved.

   This document and translations of it may be copied and furnished to
   others, and derivative works that comment on or otherwise explain it
   or assist in its implementation may be prepared, copied, published
   and distributed, in whole or in part, without restriction of any
   kind, provided that the above copyright notice and this paragraph
   are included on all such copies and derivative works. However, this
   document itself may not be modified in any way, such as by removing
   the copyright notice or references to the Internet Society or other
   Internet organizations, except as needed for the purpose of
   developing Internet standards in which case the procedures for
   copyrights defined in the Internet Standards process must be
   followed, or as required to translate it into languages other than
   English.

   The limited permissions granted above are perpetual and will not be
   revoked by the Internet Society or its successors or assigns.

   This document and the information contained herein is provided on an
   "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
   TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
   BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
   HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
   MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.