Traffic Engineering Working Group                            Wai Sum Lai
Internet Draft                                                 AT&T Labs
Document: draft-wlai-tewg-measure-01.txt
Category: Informational                                 Blaine Christian
                                                                   UUNET

                                                        Richard W. Tibbs
                                                      OPNET Technologies

                                                   Steven Van den Berghe
                                                   Ghent University/IMEC

                                                                May 2001


        A Framework for Internet Traffic Engineering Measurement


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts. Internet-Drafts are draft documents valid for a
   maximum of six months and may be updated, replaced, or obsoleted by
   other documents at any time. It is inappropriate to use
   Internet-Drafts as reference material or to cite them other than as
   "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

1. Abstract

   In this document, a measurement framework for supporting the traffic
   engineering of IP-based networks is presented. It is intended for
   the TEM (Traffic Engineering Measurement) category as described in
   the TEWG charter. Consideration for including this document as a
   TEWG working-group item for further development is requested.

2. Conventions used in this document

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in RFC-2119.

3. Introduction

   This document describes a framework for Internet traffic engineering
   measurement, with the objective of providing principles for the
   development of a set of measurement systems to support the traffic
   engineering of IP-based networks [1]. A major goal is to provide
   guidance for establishing protocol-independent and platform-neutral
   traffic measurement standards to achieve multi-vendor
   interoperability. It is critical to minimize the possibilities of
   inconsistencies arising from, e.g., overlapping data collecting and
   processing at various protocol levels, due to the use of different
   measurement principles by different vendors or network operators.

   The initial scope is limited to those aspects of measurement
   pertaining to intra-domain, i.e., within a given autonomous system
   as well as on its boundary with other domains. The focus is
   primarily on traffic engineering in Internet service provider
   environments.

   In this document, the use of traffic measurement in traffic
   characterization, network monitoring, and traffic control is first
   described. Depending on the network operations to be performed in
   these tasks, three different time scales can be identified, ranging
   from months, through days or hours, to minutes or less. To support
   these operations, traffic measurement must be able to capture
   accurately, within a given confidence interval, the traffic
   variations and peaks without degrading network performance and
   without generating an immense amount of data. Therefore,
   specification of a suitable read-out period for each service class
   for traffic summarization is essential.

   Traffic measurement can be performed on the basis of flows,
   interfaces, links, nodes, node-pairs, or paths. Based on these
   objects, different measurement entities can be defined, such as
   traffic volume, average holding time, bandwidth availability,
   throughput, delay, delay variation, packet loss, and resource usage.
   Using these measured traffic data, in conjunction with other network
   data such as topological data and router configuration data, traffic
   matrix and other relevant statistics can be derived for traffic
   engineering purposes. Traffic measurement also plays a key role in
   network performance management.

   As a framework, this document is mainly concerned with a discussion
   of various technical issues surrounding traffic measurement.
   Requirements for traffic measurement are contained in the Annex. As
   far as possible and to avoid duplication of effort, relevant work
   done in this area by other standards organizations will be applied
   or adapted, and references to them will be made. These include, in
   particular,

   . IP Performance Metrics (IPPM) Working Group of the IETF: its
     framework document [2] and documents on individual metrics
     [references to be added]

   . ITU-T: Recommendation I.380/Y.1540 [3] and Draft Recommendation
     Y.1541 [4]

4. Terminology

   The intent of this section is not to provide definition or
   description of terms used in this document. Rather, it is to
   highlight the difference in usage of related terms.

   Path, route

   A path refers to an MPLS tunnel, i.e., a label-switched path. A
   route is any unidirectional sequence of nodes and links, for sending
   packets from a source node to a destination node. (Note: There are
   also methods for creating paths with other technologies such as
   frame-relay or ATM. Applicability of the measurement described in
   this document to these technologies is to be covered in the next
   version of this document.)

   Throughput, traffic volume

   Both quantities can be applied to a network, a network segment, or
   an individual network element. Throughput of a network, as a
   measure of delivered performance, refers to the maximum sustainable
   rate of transferring packets successfully across the network, under
   given network conditions (e.g., a given traffic mix) while meeting
   QoS objectives. (This is consistent with the definition of
   throughput for a network interconnect device as specified in [5].)
   For real-time network control, active measurement of throughput by
   probing may be used to determine the currently available capacity of
   a network to carry additional traffic. Traffic volume, as a measure
   of the traffic carried, characterizes the level of traffic that a
   network is designed to support. Passive measurement of the traffic
   volume is usually used to estimate the long-term offered traffic for
   the purposes of network dimensioning in the capacity-management and
   network-planning processes (see the Section on Time Scales for
   Network Operations). A network should be properly dimensioned so
   that its throughput is adequate to handle the expected traffic
   volume.

   Throughput is expressed in terms of number of data units per time
   unit. Traffic volume is expressed in data units with reference to a
   read-out period (see the Section on Read-Out Periods). For
   transmission systems, the data unit is usually a multiple of either
   bits or bytes. For processing systems, the data unit is usually a
   multiple of packets.

5. Uses of Traffic Measurement

   Traffic measurement is used to collect traffic data for the
   following purposes:

   Traffic characterization

   . identifying traffic patterns, particularly traffic peak patterns,
     and their variations in statistical analysis; this includes
     developing traffic profiles to capture daily, weekly, or seasonal
     variations
   . determining traffic distributions in the network on the basis of
     flows, interfaces, links, nodes, node-pairs, paths, or
     destinations
   . estimating the traffic load according to service classes in
     different routers and the network
   . observing trends for traffic growth and forecasting traffic
     demands

   Network monitoring

   . determining the operational state of the network, including fault
     detection
   . monitoring the continuity and quality of network services, to
     ensure that QoS/GoS objectives are met for various classes of
     traffic, to verify the performance of delivered services, or to
     serve as a means of sectionalizing performance issues seen by a
     customer (QoS reflects the performance perceivable by a user of a
     service, while GoS is used by a service provider for internal
     design and operation)
   . evaluating the effectiveness of traffic engineering policies, or
     triggering certain policy-based actions (such as alarm generation,
     or path preemption) upon threshold crossing; this may be based on
     the use of performance history data
   . verifying peering agreements between service providers by
     monitoring/measuring the traffic flows over interconnecting links
     at border routers; this includes the estimation of inter- and
     intra-network traffic, as well as originating, terminating, and
     transit traffic that is being exchanged between peers

   Traffic control

   . adaptively optimizing network performance in response to network
     events, e.g., rerouting to work around congestion or failures
   . providing a feedback mechanism in the reverse flow messaging of
     RSVP-TE or CR-LDP signaling to report on actual topology state
     information such as link bandwidth availability
   . supporting measurement-based admission control, i.e., by
     predicting the future demands of the aggregate of existing flows
     so that admission decisions can be made on new flows

6. Time Scales for Network Operations

   The information collected by traffic measurement can be provided to
   the end user or application either in real time or for record in
   non-real time, depending on the activities to be performed and the
   network actions to be taken. Traffic control will generally require
   real-time information. For network planning and capacity management
   as described below, information may be provided in non-real time
   after the processing of raw data.

   Broadly speaking, the following three time scales can be identified,
   according to the use of observed traffic information for network
   operations [6].

   Network planning

   Information that changes on the order of months is used to make
   traffic forecasts as a basis for network extensions and long-term
   network configuration. That is, it is used for planning the
   topology of the network, planning alternative routes to survive
   failures, or determining where capacity must be augmented in advance
   of projected traffic growth. Forecasting and planning may also lead
   to the introduction of new technology and architecture.

   Capacity management

   Information that changes on the order of days or hours is used to
   manage the deployed facilities, by taking appropriate maintenance or
   engineering actions to optimize utilization. For example, new MPLS
   tunnels may be set up or existing tunnels modified while meeting
   Service Level Agreements. Also, load balancing may be performed, or
   traffic may be rerouted for re-optimization after a failure.

   Real-time network control

   Information that changes on the order of minutes or less is used to
   adapt to the current network conditions in near real time. Thus, to
   combat localized congestion, traffic management actions may perform
   temporary rerouting to redistribute the load. Upon detecting a
   failure, traffic may be diverted to pre-established, secondary
   routes until more optimized routes can be arranged.

7. Read-Out Periods

   A measurement infrastructure must be able to scale with the size and
   the speed of a network as it evolves. Hence, it is important to
   minimize the amount of data to be collected, and to condense the
   collected data by periodic summarization. This is to prevent
   network performance from being adversely affected by the
   unnecessarily excessive loading of router control processors, router
   memories, transmission facilities, and the administrative support
   systems.

   A measurement interval is the time interval over which measurements
   are taken. Some traffic data must be collected continuously, while
   other data may be collected by sampling or on a scheduled basis.
   For example, peak loads and peak periods can be identified only by
   continuous measurement, as traffic typically fluctuates irregularly
   during the whole day. If traffic variations are regular and
   predictable, it may be possible to measure the expected normal load
   on pre-determined portions of the day. This requires the definition
   of a busy period. Special studies on selected segments of the
   network may be conducted on a scheduled basis. Active measurement,
   with the involvement of the network operator, may be activated
   manually. For instance, active throughput measurement may be used
   to identify alternate routes during periods of network congestion.

   A measurement interval consists of a sequence of consecutive
   read-out periods. Summarization is usually done by integrating the
   raw data over a pre-specified read-out period. The granularity of
   this period must be suitably chosen. It should be short enough to
   capture, with acceptable accuracy, the bursty nature of the traffic,
   i.e., the traffic variations and peaks. Since measurements
   represent a load for the router, the read-out period should not be
   so short that router performance is degraded while a voluminous
   quantity of data is produced. Also, read-out may be started when
   the measured data exceeds a preset threshold, or when the space
   allocated for temporarily holding the data in a router is exhausted.
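
   The summarization over read-out periods described above can be
   illustrated with a minimal sketch. It assumes raw data arrive as
   (timestamp in seconds, octet count) pairs; the function name
   summarize, the parameter readout_period, and the sample values are
   hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def summarize(samples, readout_period):
    """Sum raw (timestamp_seconds, octets) samples per read-out period.

    Returns a list of (period_start, total_octets) sorted by period_start.
    The input layout is an assumption made for this sketch.
    """
    totals = defaultdict(int)
    for timestamp, octets in samples:
        # Align each sample to the start of its read-out period.
        period_start = (timestamp // readout_period) * readout_period
        totals[period_start] += octets
    return sorted(totals.items())


# Hypothetical raw samples covering three 300-second read-out periods.
samples = [(0, 100), (30, 250), (60, 400), (299, 50), (300, 700), (610, 20)]
volumes = summarize(samples, readout_period=300)

# The period carrying the largest traffic volume (the peak period).
peak_period = max(volumes, key=lambda item: item[1])
```

   A shorter read-out period would expose the burst at the start of
   the first period more clearly, at the price of more records to
   store and transfer, which is the tradeoff discussed above.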

   For a multi-service IP-based network, each service typically has its
   own traffic characteristics and performance objectives. To ensure
   that service-specific features are reflected in the measurement
   process, different read-out periods may be needed for different
   classes of service.

8. Measurement Bases

   Measurements can be classified on the basis of where, and at which
   level, the traffic data are gathered and aggregated. This is similar
   to the concept of a population of interest as specified in ITU-T
   Recommendation I.380/Y.1540. As defined therein, this refers to a
   set of packets, possibly relative to a particular pair of source and
   destination hosts, for the purposes of defining performance
   parameters. However, measurement bases as used here may not have
   any association with a source-destination pair.

   In this document, customer-based measurements are not considered.
   Service providers will make decisions on how to perform the
   measurements needed, and there are various tradeoffs involved. One
   option is to obtain the measurements directly from the network
   elements themselves, e.g., via SNMP. Collecting the measurements on
   the operational network elements such as routers is sometimes a
   performance concern. Currently, there are a number of third-party
   measurement/monitoring products available. Hence, another option is
   to deploy such equipment, which might have performance advantages
   but also introduces additional cost.

   Regardless of the type of measurement source, either a network
   element or a third-party product, measurements should be collected,
   as far as possible, by a measurement source without requiring
   coordination with other measurement sources. Thus, it is desirable
   to perform those measurements that do not require the use of
   specialized monitoring equipment connected to the network at
   multiple locations. While each measurement source may act
   autonomously with regard to taking measurements, a network operator
   may specify some network-wide policy regarding measurement
   scheduling. Such policy may be, say, the use of the same time of
   day, the same measurement interval, or measurement intervals that
   are multiples of each other (e.g., nested intervals with
   synchronized boundaries). A schedule therefore should include such
   time information as the start, the duration, and the periodicity of
   a certain measurement.

   The following measurement bases are considered in this document.

   Flow-based

   This is conceptually similar to the call detail record (CDR) in
   telecommunication networks. It is primarily used on interfaces at
   access routers, edge routers, or aggregation routers where traffic
   originates or terminates, rather than on backbone routers in the
   core network. Like CDR measurements, flow-based records are used to
   collect detailed information about a flow. This includes such
   information as source and destination IP addresses/port numbers,
   protocol, type of service, timestamps for the start and end of a
   flow, packet count, octet count, etc.

   As a flow is a fine-grained object, measuring every flow that passes
   through all the edge devices may not be scalable or feasible.
   Hence, per-flow data are usually used in a special study conducted
   on a non-continuous schedule and on selected routers only. Sampling
   of flow-based measurements may also be needed to reduce both the
   amount of data collected and the associated overhead.

   Interface-based, link-based, node-based

   Passive, i.e., in-service non-intrusive, measurement can be taken at
   each network element. For example, SNMP MIBs use passive monitoring
   to collect raw data on an interface at an edge or backbone router.
   This includes data such as counts on packets and octets
   sent/received, packet discards, and errored packets. While not
   intended for the core network, RMON can possibly be used in the
   access link of an ISP to provide managed Internet service to
   corporate LANs. (Consideration for link bundling in next version of
   this document.)

   Node-pair-based

   Active measurements by probing, as specified in the IPPM framework,
   can be conducted between each pair of major routing hubs for
   determining edge-to-edge performance of a core network. This
   complements the passive measurements of the previous sub-section,
   which provide local views of the performance of individual network
   elements.

   In telecommunications networks, each established call has an
   associated node-pair. By maintaining a set of node-pair data
   registers (usage, peg count, overflow, etc.) in each switch,
   node-pair-based measurements for traffic statistics such as the load
   between a given node pair are taken directly. In contrast, in
   IP-based networks, such node-pair-based measurements currently
   cannot be taken directly. However, it is possible to infer them
   from flow-based passive measurements and other network information.
   A problem with this approach is that flow-based measurement data are
   voluminous. Another problem that must be accounted for is the
   routing changes among the multiple routes due to, e.g., a change in
   the configuration of intradomain routing, or a change in interdomain
   policies made by another autonomous system. This is further
   discussed in the Section on Traffic Matrix Statistics.

   Path-based

   In this document, the term path specifically refers to an MPLS
   tunnel, or label-switched path.

   The ability of MPLS to use fixed preferred paths for routing
   traffic, so-called route pinning, gives the means to develop
   path-based measurements. This may enable the development of
   methodologies for such functions as admission control and
   performance verification of delivered service.

   Like a flow, a path is associated with a pair of nodes. However, a
   path is a more coarse-grained object than a flow, as paths are
   usually used to carry aggregated traffic. In addition, when routing
   changes occur, the amount of traffic to be carried by a path will
   either not be affected or be merged with that of another path.
   Because of these properties, path-based measurements are more
   scalable and may be used to provide more readily an accurate,
   network-wide view of the traffic demands. (For example, the traffic
   between a given pair of nodes may be inferred from the aggregate of
   the traffic carried by all the paths either terminated by or passed
   through the same node-pair.)

9. Measurement Entities

   A measurement entity defines what is measured: it is a quantity for
   which data collection must be performed with a certain measurement.
   A measurement type can be specified by a (meaningful) combination of
   a measurement entity with the measurement basis described in the
   previous section.

   Entities related to traffic and performance

   Some of the measurement entities listed below, such as throughput,
   delay, delay variation, and packet loss, are related to the IPPM
   performance metrics or the I.380/Y.1540 performance parameters.

   . Traffic volume (mean and variance for normal/high load, in bits,
     bytes, or packets transferred, as averaged over a given time
     interval), on a per service class basis, at various aggregation
     levels (IP address prefix, node, network edge, customer, or
     autonomous system)
     Note: (1) This is a measurement for the traffic carried by a
     network, a network segment, or an individual network element.
     When measured during the busy period, this entity is normally used
     to estimate the traffic offered. However, the estimation
     procedure should take into account such factors as congestion,
     which may result in decreased carried traffic. In addition,
     congestion may lead to user behavior such as reattempt or
     abandonment, which may affect the actual traffic offered. (To
     include a discussion of the relevance and applicability of
     second-order statistics.) (2) Measurement of traffic volumes over
     interconnecting links at border routers can be used to estimate
     the traffic exchange between peers for contract verification.

   . Average holding time (e.g., flow duration or lifetime, duration of
     an MPLS path), on a per service class basis
     Note: Similar to call holding time in telecommunications networks,
     holding time statistics are useful in network planning for sizing
     network elements. Also, the holding time statistics of
     long-living static paths reflect the effect of network equipment
     failures, link outages, or scheduled maintenance, and hence may be
     used to derive information about up-time or service availability.

   . Available bandwidth of a link or path - useful for load balancing,
     measurement-based admission control to determine the feasibility
     of creating a new MPLS tunnel (real-time information can be used
     for dynamic establishment)

   . Throughput (in both bits (or bytes) per second and packets per
     second)
     Note: (1) This is a measure of the "goodput," that is, the rate
     at which a given amount of traffic, excluding lost, misdelivered,
     or errored packets, passes between a set of end points, where end
     points can be logically or physically defined. The condition of
     the network, e.g., normal or high load, under which the
     measurement is taken should be noted. (2) The protocol level at
     which a throughput measurement is taken must be specified, as the
     packet payload and packet overheads are protocol dependent. (3)
     The average packet size may be inferred from the bit rate and
     packet rate measurements. This quantity is useful to gauge router
     performance, since router operations are typically packet-oriented
     and small packets are more processing-intensive.

   . Delay (e.g., cross-router delay from node-based measurement may be
     used to measure queueing delay within a router; end-to-end one-way
     or round-trip packet delay can be obtained by node-pair-based
     measurement)
     Note: The condition of the network, e.g., normal or high load,
     under which the measurement is taken should be noted. This is
     useful to determine if delay objectives are met.

   . Delay variation
     Note: There are several ways to measure this quantity as specified
     in IPPM and I.380/Y.1540 (a brief summary to be included).

   . Packet loss
     Note: (1) While packet losses due to transmission and/or protocol
     errors may not be traffic related, unexpected excessive loss may
     be used as a means of fault detection. (2) Packet losses due to
     policing should be distinguished from those due to network
     congestion. The former is a result of user violation of service
     contract and the network operator should not be penalized for it.
     The latter, whether intentional or unintentional, is caused by
     network conditions such as buffer overflow or a busy router
     forwarding process, and may not be the user's fault. When
     policing is done by a network, measurement of non-conforming
     packets at the edge provides an indication of the extent to which
     the network is carrying this type of packets (which can
     potentially be dropped if the network gets congested). Loss due
     to congestion of any packets, including loss of non-conforming
     packets, is a useful measure in traffic engineering to account for
     resource management. (3) Long-term averages can be measured by
     the I.380/Y.1540 IP packet loss ratio or by the IPPM Poisson
     sampling of one-way loss. However, during the convergence times
     associated with routing updating, the loss may be high enough to
     cause service unavailability. This effect needs to be captured
     and statistics such as loss patterns, burst loss, or severe loss
     ratio may be required (reference to be included).

   . Resource usage, such as link/router utilization, buffer occupancy
     (e.g., fraction of arriving packets finding the buffer above a
     given set of thresholds)
     Note: Trigger points may be set when resource usage consistently
     exceeds a certain threshold.

   Entities related to establishment of connection or path

   Where connection admission control is used, a measurement entity for
   monitoring network performance may be the proportion of connections
   denied admission. Also, it may be useful to score the requested
   bandwidth within the traffic parameters for the setup request. As
   in telecommunications networks, the connection request rate may be
   measured to characterize the offered traffic.

   To characterize paths, the following measurement entities may be
   defined: path setup delay, path setup error probability, path setup
   denial (blocking) probability, path release delay, path disconnect
   probability, path restoration time.

10. Measurement Types

   A measurement matrix can be defined wherein each column represents a
   measurement basis and each row represents a measurement entity. An
   entry in this measurement matrix, corresponding to a meaningful and
   measurable combination of an entity and a basis, defines a
   particular measurement type. For each measurement type, there
   should be a set of measurement points specified to bound the network
   segment for the purposes of taking measurement. A measurement point
   may be the physical boundary between a node and an adjacent link, or
   the logical interface between two protocol layers in a protocol
   stack.
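
   As an illustration only, a measurement type can be modeled as an
   (entity, basis) pair that is checked against a set of meaningful
   combinations and carries its measurement points. In the sketch
   below, the class name MeasurementType, the set MEANINGFUL, the
   three combinations it holds, and the point labels are all
   hypothetical; the set is a small sample and not a complete list.

```python
from dataclasses import dataclass

# Illustrative subset of meaningful (entity, basis) combinations.
# This is an assumption for the sketch, not an exhaustive matrix.
MEANINGFUL = {
    ("traffic volume", "flow"),
    ("throughput", "node-pair"),
    ("delay", "path"),
}


@dataclass(frozen=True)
class MeasurementType:
    entity: str
    basis: str
    points: tuple  # measurement points bounding the network segment

    def __post_init__(self):
        # Reject combinations that are not marked as meaningful.
        if (self.entity, self.basis) not in MEANINGFUL:
            raise ValueError(
                f"not a meaningful combination: {self.entity}/{self.basis}"
            )


# Hypothetical path delay measured between the two ends of a tunnel.
path_delay = MeasurementType("delay", "path", ("ingress-LSR", "egress-LSR"))
```

   Keeping the set of meaningful combinations as data, rather than
   encoding it in the class, would allow one such set per service
   class.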
530 The following measurement matrix illustrates some of the measurement 531 types related to traffic or performance. Potentially, there can be 532 one such matrix for each service class. 534 Bases: Flow Interface, Node Pair Path 535 Node 536 Entities: (passive) (passive) (both) (both) 538 Traffic Volume x(1) x x(3) x(3) 539 Avg. Hold. Time x x(3) 540 Avail. Bandwidth x x(3) 541 Throughput x(4) x(4) 542 Delay x(2) x(4) x(4) 543 Delay Variation x(2) x(4) x(4) 544 Packet Loss x x(5) x(5) 546 Notes: 547 (1) This measurement type can be used to derive flow size 548 statistics. 549 (2) These are 1-point measurements. 550 (3) As a starting point, statistics collected by passive measurement 551 through the MPLS traffic engineering MIBs [7, 8, 9] may be used. 552 (4) Active measurements based on IPPM metrics are currently in use 553 for node-pairs; they may be developed for paths. 554 (5) Besides active measurements based on IPPM, path loss may 555 possibly be inferred from the difference between ingress and egress 556 traffic statistics at the two endpoints of a path. However, such 557 inference for the cumulative losses between a given node pair over 558 multiple routes may be less useful, since different routes may have 559 different loss characteristics. 561 Another measurement matrix can be constructed for resource 562 consumption. This leads to a set of measurement types comprising 563 the different usage, one for each network resource object such as 564 router, link, and buffer, by different classes of traffic: 566 . control (e.g., routing control) traffic 567 . signaling traffic 568 . user traffic from different service classes 570 Bases: Node Link Buffer 571 Entities: 572 Control Util. x x x 573 Signaling Util. x x x 574 Service Class Util. x x x 576 The amount of control and signaling traffic carried by a network is 577 a function of many factors. 
To name a few, these include the size and topology of the network,
the control and signaling protocols used, the amount of user traffic
carried, and the number of failure events. The above utilization
measurements for control and signaling traffic are intended to help
develop guidelines for the proper dimensioning and apportionment of
network resources so that a given level of user traffic can be
adequately supported. As the primary focus here is on user traffic
measurements, the additional needs and properties of control and
signaling traffic measurements are beyond the scope of this document.

11. Traffic Matrix Statistics

An important set of data for traffic engineering is point-to-point
or point-to-multipoint demands. This data is needed for the
provisioning of intradomain routes and external peering in the
existing network, as well as for planning the placement and sizing
of new links, routers, or peers.

In current practice, estimates for traffic demands are usually
determined from a combination of traffic projections, customer
prescriptions, and SLAs. Under the existing mode of operation, it is
not easy to obtain network-wide traffic demands from the local
interface measurements taken by different IP routers. As explained
in [10, 11], information from diverse network measurements and
various configuration files is needed to infer the traffic volume.
Besides raw measurement data, additional information such as
topological data and router configuration data is required to
obtain a network view. Furthermore, destination-based
routing/forwarding in an IGP (such as OSPF or IS-IS) provides a
network operator with only primitive and limited control over the
routing of traffic flows. This necessitates the association of a
time sequence of forwarding tables from different routers to
reconstruct the different routes used by the network over time.
By using this auxiliary information, together with flow-based
measurements, the above-cited references describe how to determine
the traffic volume from an ingress link to a set of egress links by
validating and joining various data sets together.

The routing control offered by MPLS can be used to avoid the above
shortcomings of existing measurements. It is recommended that
path-based passive measurement for traffic volume, average holding
time, and available bandwidth be developed so that traffic matrix
statistics, on a per service class basis, can be derived.

Besides traffic engineering, a major application of MPLS is the
support of network-based virtual private networks (VPNs). A VPN can
be an enterprise network or a carrier's carrier network. Path-based
measurement by a network operator on behalf of the VPN customers
facilitates the estimation of the traffic offered by these VPNs.

12. Performance Monitoring

General aspects of measurements required to support the operation,
administration, and maintenance of a network are outside the scope
of this document (see [12] for a discussion of MPLS OAM). The focus
of the measurements here is only on operations related to traffic
engineering and network performance management.

A major component of performance management is performance
monitoring, i.e., continuous real-time monitoring of the quality or
health of the network and its various elements to ensure a
sustained, uninterrupted delivery of quality service. This requires
the use of measurement, either passive or active, to collect
information about the operational state of the network and to track
its performance. For a discussion of passive monitoring and the use
of synthetic traffic sources in active probing, see [13]. Alarms
may be generated when the state of a network element exceeds
prescribed thresholds.
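
As a minimal sketch of threshold-based alarm generation, the function
below scans a sequence of samples (for example, link utilization
readings) and reports where an alarm would be raised. The function
name threshold_alarms and the min_consecutive parameter are
assumptions introduced for illustration; requiring several
consecutive samples above the threshold is only one possible way to
avoid alarms on isolated spikes, and is not prescribed by this
document.

```python
def threshold_alarms(samples, threshold, min_consecutive=1):
    """Return the sample indices at which an alarm would be raised.

    An alarm is raised once per excursion, at the sample that
    completes a run of `min_consecutive` consecutive samples strictly
    above `threshold`. The run length resets whenever a sample falls
    to or below the threshold.
    """
    alarms = []
    run = 0
    for index, value in enumerate(samples):
        if value > threshold:
            run += 1
            if run == min_consecutive:
                alarms.append(index)
        else:
            run = 0
    return alarms


# Example: utilization samples checked against a threshold of 0.8.
utilization = [0.5, 0.9, 0.95, 0.4, 0.92]
single = threshold_alarms(utilization, 0.8)
sustained = threshold_alarms(utilization, 0.8, min_consecutive=2)
```

With min_consecutive set to 1, every excursion above the threshold
raises an alarm; a larger value reports only sustained excursions.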

Performance degradation can occur as a result of routing
instability, congestion, or failure of network components. Periods
of congestion may be detected when the resource usage of a network
segment consistently exceeds a certain threshold, or when the
cross-router delay is unexpectedly high. After the identification
of a hot spot, active throughput measurement may be used to seek out
alternate routes for congestion bypass. Unexpected excessive loss
of packets or throughput drops may be used as a means of fault
detection, and may result in restoration activities.

Internet utilities such as ping and traceroute have been useful in
diagnosing network problems and debugging performance. Utilities
with similar functions would be essential for path-oriented
operations such as those in MPLS. This would include the capability
to list, at any time, (1) for a given path, all the nodes traversed
by it, and (2) for a given node, all the paths originating from it,
transiting through it, and/or terminating on it. A proposal for
route tracing is described in [14].

13. Report Generation

Data storage, data processing, statistics generation, and reporting
are outside the scope of this document.

14. Annex: Traffic Measurement Requirements

(Note: This annex may be spun off as a separate document when it
matures.)

The contents of this annex are to be provided. For example, it
should specify some expected range of service-specific measurement
intervals, read-out periods, and busy periods. Also, it should
specify the different reference points in the traffic flow for both
node-pair-based and path-based measurements, and their associated
measurement types.

15. Security Considerations

Security considerations are not addressed in this version of the
draft.

16. References

1  D.O. Awduche, A. Chiu, A. Elwalid, I. Widjaja, and X.
   Xiao, "A Framework for Internet Traffic Engineering,"
   Internet-Draft, Work in Progress, April 2001.
2  V. Paxson, G. Almes, J. Mahdavi, and M. Mathis, "Framework for IP
   Performance Metrics," RFC 2330, May 1998.
3  ITU-T Recommendation I.380/Y.1540, "Internet Protocol Data
   Communication Service -- IP Packet Transfer and Availability
   Performance Parameters," February 1999.
4  ITU-T Draft Recommendation Y.1541, "Internet Protocol
   Communication Service -- IP Performance and Availability
   Objectives and Allocations," November 2000.
5  S. Bradner (Editor), "Benchmarking Terminology for Network
   Interconnection Devices," RFC 1242, July 1991.
6  G. Ash, "Traffic Engineering & QoS Methods for IP-, ATM-, &
   TDM-Based Networks," Internet-Draft, Work in Progress, December
   2000.
7  C. Srinivasan, A. Viswanathan, and T.D. Nadeau, "MPLS Label
   Switch Router Management Information Base Using SMIv2,"
   Internet-Draft, Work in Progress, January 2001.
8  C. Srinivasan, A. Viswanathan, and T.D. Nadeau, "MPLS Traffic
   Engineering Management Information Base Using SMIv2,"
   Internet-Draft, Work in Progress, March 2001.
9  K. Kompella, "A Traffic Engineering MIB," Internet-Draft, Work
   in Progress, September 2000.
10 A. Feldmann, A. Greenberg, C. Lund, N. Reingold, J. Rexford, and
   F. True, "Deriving Traffic Demands for Operational IP Networks:
   Methodology and Experience," Proc. ACM SIGCOMM 2000, Stockholm,
   Sweden.
11 A. Feldmann, A. Greenberg, C. Lund, N. Reingold, and J. Rexford,
   "NetScope: Traffic Engineering for IP Networks," IEEE Network,
   March/April 2000.
12 N. Harrison, P. Willis, S. Davari, B. Mack-Crane, and H. Ohta,
   "OAM Functionality for MPLS Networks," Internet-Draft, Work in
   Progress, February 2001.
13 R.G. Cole, R. Dietz, C. Kalbfleisch, and D.
   Romascanu, "A Framework for Synthetic Sources for Performance
   Monitoring," Internet-Draft, Work in Progress, February 2001.
14 R. Bonica, K. Kompella, and D. Meyer, "Tracing Requirements for
   Generic Tunnels," Internet-Draft, Work in Progress, February
   2001.

17. Acknowledgments

The support of Gerald Ash on this work and his comments are much
appreciated. Thanks also go to Robert Cole, Enrique Cuevas, Alfred
Morton, Moshe Segal, and the Tequila project for their input.

18. Authors' Addresses

Wai Sum Lai
AT&T Labs
Room D5-3D18
200 Laurel Avenue
Middletown, New Jersey 07748, USA
Phone: +1 732-420-3712
Email: wlai@att.com

Blaine Christian
UUNET
Email: Blaine@uu.net

Richard W. Tibbs
OPNET Technologies
Email: rtibbs@opnet.com

Steven Van den Berghe
Ghent University/IMEC
St. Pietersnieuwsstraat 41
B-9000 Ghent, Belgium
Phone: ++32 9 267 35 86
Email: steven.vandenberghe@intec.rug.ac.be

Full Copyright Statement

"Copyright (C) The Internet Society (date). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph
are included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.