1 Internet Engineering Task Force 2 INTERNET-DRAFT 3 TE Working Group 4 Daniel O. Awduche 5 July 2000 UUNET (Worldcom) 7 Angela Chiu 8 AT&T 10 Anwar Elwalid 11 Lucent Technologies 13 Indra Widjaja 14 Fujitsu Network Communications 16 Xipeng Xiao 17 Global Crossing 19 A Framework for Internet Traffic Engineering 21 draft-ietf-tewg-framework-02.txt 23 Status of this Memo 25 This document is an Internet-Draft and is in full conformance with 26 all provisions of Section 10 of RFC2026. 28 Internet-Drafts are working documents of the Internet Engineering 29 Task Force (IETF), its areas, and its working groups. Note that 30 other groups may also distribute working documents as Internet- 31 Drafts. 33 Internet-Drafts are draft documents valid for a maximum of six months 34 and may be updated, replaced, or obsoleted by other documents at any 35 time. It is inappropriate to use Internet-Drafts as reference 36 material or to cite them other than as "work in progress."
38 The list of current Internet-Drafts can be accessed at 39 http://www.ietf.org/ietf/1id-abstracts.txt 41 The list of Internet-Draft Shadow Directories can be accessed at 42 http://www.ietf.org/shadow.html. 44 Abstract 46 This memo describes a framework for Traffic Engineering (TE) in the 47 Internet. The framework is intended to promote better understanding 48 of the issues surrounding traffic engineering in IP networks, and to 49 provide a common basis for the development of traffic engineering 50 capabilities for the Internet. The principles, architectures, and 51 methodologies for performance evaluation and performance optimization 52 of operational IP networks are discussed throughout this document. 53 The optimization goals of traffic engineering are to enhance the 54 performance of IP traffic while utilizing network resources 55 economically and reliably. The framework includes a set of generic 56 requirements, recommendations, and options for Internet traffic 57 engineering. The framework can serve as a guide to implementors of 58 online and offline Internet traffic engineering mechanisms, tools, 59 and support systems. The framework can also help service providers 60 devise traffic engineering solutions for their networks. 62 Table of Contents 64 1.0 Introduction 65 1.1 What is Internet Traffic Engineering? 66 1.2 Scope 67 1.3 Terminology 68 2.0 Background 69 2.1 Context of Internet Traffic Engineering 70 2.2 Network Context 71 2.3 Problem Context 72 2.3.1 Congestion and its Ramifications 73 2.4 Solution Context 74 2.4.1 Combating the Congestion Problem 75 2.5 Implementation and Operational Context 76 3.0 Traffic Engineering Process Model 77 3.1 Components of the Traffic Engineering Process Model 78 3.2 Measurement 79 3.3 Modeling, Analysis, and Simulation 80 3.4 Optimization 81 4.0 Historical Review and Recent Developments 82 4.1 Traffic Engineering in Classical Telephone Networks 83 4.2 Evolution of Traffic Engineering in the Internet 84 4.2.1 Adaptive Routing in ARPANET 85 4.2.2 Dynamic Routing in the Internet 86 4.2.3 ToS Routing 87 4.2.4 Equal Cost MultiPath 88 4.2.5 Nimrod 89 4.3 Overlay Model 90 4.4 Constraint-Based Routing 91 4.5 Overview of Other IETF Projects Related to Traffic 92 Engineering 93 4.5.1 Integrated Services 94 4.5.2 RSVP 95 4.5.3 Differentiated Services 96 4.5.4 MPLS 97 4.5.5 IP Performance Metrics 98 4.5.6 Flow Measurement 99 4.5.7 Endpoint Congestion Management 100 4.6 Overview of ITU Activities Related to Traffic 101 Engineering 102 5.0 Taxonomy of Traffic Engineering Systems 103 5.1 Time-Dependent Versus State-Dependent 104 5.2 Offline Versus Online 105 5.3 Centralized Versus Distributed 106 5.4 Local Versus Global 107 5.5 Prescriptive Versus Descriptive 108 5.6 Open-Loop Versus Closed-Loop 109 5.7 Tactical vs Strategic 110 6.0 Requirements for Internet Traffic Engineering 111 6.1 Generic Requirements 112 6.2 Routing Requirements 113 6.3 Traffic Mapping Requirements 114 6.4 Measurement Requirements 115 6.5 Network Survivability 116 6.5.1 Survivability in MPLS Based Networks 117 6.5.2 Protection Option 118 6.5.3 Resilience Attributes 119 6.6 Content Distribution (Webserver) Requirements 120 6.7 Traffic Engineering in Diffserv Environments 121 6.8 Network Controllability 122 7.0 Inter-Domain Considerations 123 8.0 Overview of Contemporary TE Practices in Operational 124 IP Networks 125 9.0 Conclusion 126 10.0 Security Considerations 127 11.0 Acknowledgments 128 12.0 References 129 13.0 Authors' Addresses 131 1.0 Introduction 133 This memo describes a 
framework for Internet traffic engineering. 134 The objective of the document is to articulate the general issues, 135 principles and requirements for Internet traffic engineering; and 136 where appropriate to provide recommendations, guidelines, and options 137 for the development of online and offline Internet traffic 138 engineering capabilities and support systems. 140 The framework can aid service providers in devising and implementing 141 traffic engineering solutions for their networks. Networking hardware 142 and software vendors will also find the framework helpful in the 143 development of mechanisms and support systems for the Internet 144 environment that support the traffic engineering function. 146 The framework provides a terminology for describing and understanding 147 common Internet traffic engineering concepts. The framework also 148 provides a taxonomy of known traffic engineering styles. In this 149 context, a traffic engineering style abstracts important aspects from 150 a traffic engineering methodology. Traffic engineering styles can be 151 viewed in different ways depending upon the specific context in which 152 they are used and the specific purpose which they serve. The 153 combination of styles and views results in a natural taxonomy of 154 traffic engineering systems. 156 Even though Internet traffic engineering is most effective when 157 applied end-to-end, the initial focus of this framework document is 158 intra-domain traffic engineering (that is, traffic engineering within 159 a given autonomous system). However, because a preponderance of 160 Internet traffic tends to be inter-domain (originating in one 161 autonomous system and terminating in another), this document provides 162 an overview of aspects pertaining to inter-domain traffic 163 engineering. 165 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 166 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 167 document are to be interpreted as described in RFC 2119. 169 1.1. What is Internet Traffic Engineering? 171 Internet traffic engineering is defined as that aspect of Internet 172 network engineering dealing with the issue of performance evaluation 173 and performance optimization of operational IP networks. Traffic 174 Engineering encompasses the application of technology and scientific 175 principles to the measurement, characterization, modeling, and 176 control of Internet traffic [AWD1, AWD2]. 178 Enhancing the performance of an operational network, at both the 179 traffic and resource levels, are major objectives of Internet traffic 180 engineering. This is accomplished by addressing traffic oriented 181 performance requirements, while utilizing network resources 182 economically and reliably. Traffic oriented performance measures 183 include delay, delay variation, packet loss, and goodput. 185 An important objective of Internet traffic engineering is to 186 facilitate reliable network operations [AWD1]. Reliable network 187 operations can be facilitated by providing mechanisms that enhance 188 network integrity and by embracing policies emphasizing network 189 survivability. This results in a minimization of the vulnerability of 190 the network to service outages arising from errors, faults, and 191 failures occurring within the infrastructure. 193 An Internet exists in order to transfer information from source nodes 194 to destination nodes. 
Accordingly, one of the most significant 195 functions performed by an Internet is the routing of traffic from 196 ingress nodes to egress nodes. Therefore, one of the most distinctive 197 functions performed by Internet traffic engineering is the control 198 and optimization of the routing function, to steer traffic through 199 the network in the most effective way. 201 Ultimately, it is the performance of the network as seen by end users 202 of network services that is truly paramount. This crucial point 203 should be considered throughout the development of traffic 204 engineering mechanisms and policies. The characteristics visible to 205 end users are the emergent properties of the network, which are the 206 characteristics of the network when viewed as a whole. A central goal 207 of the service provider, therefore, is to enhance the emergent 208 properties of the network while taking economic considerations into 209 account. 211 The importance of the above observation regarding the emergent 212 properties of networks is that special care must be taken when 213 choosing network performance measures to optimize. Optimizing the 214 wrong measures may achieve certain local objectives, but may have 215 disastrous consequences on the emergent properties of the network and 216 thereby on the quality of service perceived by end-users of network 217 services. 219 A subtle, but practical advantage of the systematic application of 220 traffic engineering concepts to operational networks is that it helps 221 to identify and structure goals and priorities in terms of enhancing 222 the quality of service delivered to end-users of network services. 223 The application of traffic engineering concepts also aids in the 224 measurement and analysis of the achievement of these goals. 226 The optimization aspects of traffic engineering can be achieved 227 through capacity management and traffic management. As used in this 228 document, capacity management includes capacity planning, routing 229 control, and resource management. Network resources of particular 230 interest include link bandwidth, buffer space, and computational 231 resources. Likewise, as used in this document, traffic management 232 includes (1) nodal traffic control functions such as traffic 233 conditioning, queue management, scheduling, and (2) other functions 234 that regulate traffic flow through the network or that arbitrate 235 access to network resources between different packets or between 236 different traffic streams. 238 The optimization objectives of Internet traffic engineering should be 239 viewed as a continual and iterative process of network performance 240 improvement and not simply as a one time goal. Traffic engineering 241 also demands continual development of new technologies and new 242 methodologies for network performance enhancement. 244 The optimization objectives of Internet traffic engineering may 245 change over time as new requirements are imposed, as new technologies 246 emerge, or as new insights are brought to bear on the underlying 247 problems. Moreover, different networks may have different 248 optimization objectives, depending upon their business models, 249 capabilities, and operating constraints. The optimization aspects of 250 traffic engineering are ultimately concerned with network control 251 regardless of the specific optimization goals in any particular 252 environment. 254 Thus, the optimization aspects of traffic engineering can be viewed 255 from a control perspective. 
The aspect of control within the Internet 256 traffic engineering arena can be pro-active and/or reactive. In the 257 pro-active case, the traffic engineering control system takes 258 preventive action to obviate predicted unfavorable future network 259 states. It may also take perfective action to induce a more 260 desirable state in the future. In the reactive case, the control 261 system responds correctively and perhaps adaptively to events that 262 have already transpired in the network. 264 The control dimension of Internet traffic engineering responds at 265 multiple levels of temporal resolution to network events. Certain 266 aspects of capacity management, such as capacity planning, respond at 267 very coarse temporal levels, ranging from days to possibly years. The 268 introduction of automatically switched optical transport networks 269 (e.g. based on the Multiprotocol Lambda Switching concepts [AWD6]) 270 could significantly reduce the lifecycle for capacity planning by 271 expediting provisioning of optical bandwidth. Routing control 272 functions operate at intermediate levels of temporal resolution, 273 ranging from milliseconds to days. Finally, the packet level 274 processing functions (e.g. rate shaping, queue management, and 275 scheduling) operate at very fine levels of temporal resolution, 276 ranging from picoseconds to milliseconds while responding to the 277 real-time statistical behavior of traffic. The subsystems of Internet 278 traffic engineering control include: capacity augmentation, routing 279 control, traffic control, and resource control (including control of 280 service policies at network elements). When capacity is to be 281 augmented for tactical purposes, it may be desirable to devise a 282 deployment plan that expedites bandwidth provisioning while minimizing 283 installation costs. 285 Inputs into the traffic engineering control system include network 286 state variables, policy variables, and decision variables. 288 One major challenge of Internet traffic engineering is the 289 realization of automated control capabilities that adapt quickly and 290 cost effectively to significant changes in a network's state, while 291 still maintaining stability. 293 Another critical dimension of Internet traffic engineering is network 294 performance evaluation, which is important for assessing the 295 effectiveness of traffic engineering methods, and for monitoring and 296 verifying compliance with network performance goals. Results from 297 performance evaluation can be used to identify existing problems, 298 guide network re-optimization, and aid in the prediction of potential 299 future problems. 301 Performance evaluation can be achieved in many different ways. The 302 most notable techniques include analytical methods, simulation, and 303 empirical methods based on measurements. When analytical methods or 304 simulation are used, network nodes and links can be modeled to 305 capture relevant operational features such as topology, bandwidth, 306 buffer space, and nodal service policies (link scheduling, packet 307 prioritization, buffer management, etc.). Analytical traffic models 308 can be used to depict dynamic and behavioral traffic characteristics, 309 such as burstiness, statistical distributions, dependence, and 310 seasonality. 312 Performance evaluation can be quite complicated in practical network 313 contexts. A number of techniques can be used to simplify the 314 analysis, such as abstraction, decomposition, and approximation.
For 315 example, simplifying concepts such as effective bandwidth and 316 effective buffer [Elwalid] may be used to approximate nodal behaviors 317 at the packet level and simplify the analysis at the connection 318 level. Network analysis techniques using, for example, queuing models 319 and approximation schemes based on asymptotic and decomposition 320 techniques can render the analysis even more tractable. In 321 particular, an emerging set of concepts known as network calculus 322 [Cruz] based on deterministic bounds may simplify network analysis 323 relative to classical stochastic techniques. When using analytical 324 techniques, care should be taken to ensure that the models faithfully 325 reflect the relevant operational characteristics of the modeled 326 network entities. 328 Simulation can be used to evaluate network performance or to verify 329 and validate analytical approximations. Simulation can, however, be 330 computationally costly and may not always provide sufficient 331 insights. An appropriate approach to a given network performance 332 evaluation problem may involve a hybrid combination of analytical 333 techniques, simulation, and empirical methods. 335 As a general rule, traffic engineering concepts and mechanisms must 336 be sufficiently specific and well defined to address known 337 requirements, but simultaneously flexible and extensible to 338 accommodate unforeseen future demands. 340 1.2. Scope 342 The scope of this document is intra-domain traffic engineering; that 343 is, traffic engineering within a given autonomous system in the 344 Internet. The framework will discuss concepts pertaining to intra- 345 domain traffic control, including such issues as routing control, 346 micro and macro resource allocation, and the control coordination 347 problems that arise consequently. 349 This document will describe and characterize techniques already in 350 use or in advanced development for Internet traffic engineering. The 351 way these techniques fit together will be discussed and scenarios in 352 which they are useful will be identified. 354 Although the emphasis is on intra-domain traffic engineering, in 355 Section 7.0, however, an overview of the high level considerations 356 pertaining to inter-domain traffic engineering will be provided. 357 Inter-domain Internet traffic engineering is crucial to the 358 performance enhancement of the global Internet infrastructure. 360 Whenever possible, relevant requirements from existing IETF documents 361 and other sources will be incorporated by reference. 363 1.3 Terminology 365 This subsection provides terminology which is useful for Internet 366 traffic engineering. The definitions presented apply to this 367 framework document. These terms may have other meanings elsewhere. 369 - Baseline analysis: 370 A study conducted to serve as a baseline for comparison to the 371 actual behavior of the network. 373 - Busy hour: 374 A one hour period within a specified interval of time 375 (typically 24 hours) in which the traffic load in a 376 network or subnetwork is greatest. 378 - Bottleneck 379 A network element whose input traffic rate tends to be greater 380 than its output rate. 382 - Congestion: 383 A state of a network resource in which the traffic incident 384 on the resource exceeds its output capacity over an interval 385 of time. 387 - Congestion avoidance: 388 An approach to congestion management that attempts to obviate 389 the occurrence of congestion. 
391 - Congestion control: 392 An approach to congestion management that attempts to remedy 393 congestion problems that have already occurred. 395 - Constraint-based routing: 396 A class of routing protocols that take specified traffic 397 attributes, network constraints, and policy constraints into 398 account in making routing decisions. Constraint-based routing 399 is applicable to traffic aggregates as well as flows. It is a 400 generalization of QoS routing. 402 - Demand side congestion management: 403 A congestion management scheme that addresses congestion 404 problems by regulating or conditioning offered load. 406 - Effective bandwidth: 407 The minimum amount of bandwidth that can be assigned to a flow 408 or traffic aggregate in order to deliver 'acceptable service 409 quality' to the flow or traffic aggregate. 411 - Egress traffic: 412 Traffic exiting a network or network element. 414 - Hot-spot 415 A network element or subsystem which is in a state of 416 congestion. 418 - Ingress traffic: 419 Traffic entering a network or network element. 421 - Inter-domain traffic: 422 Traffic that originates in one Autonomous system and 423 terminates in another. 425 - Loss network: 426 A network that does not provide adequate buffering for 427 traffic, so that traffic entering a busy resource within 428 the network will be dropped rather than queued. 430 - Metric: 431 A parameter defined in terms of standard units of 432 measurement. 434 - Measurement Methodology: 435 A repeatable measurement technique used to derive one or 436 more metrics of interest. 438 - Network Survivability: 439 The capability to provide a prescribed level of QoS for 440 existing services after a given number of failures occur 441 within the network. 443 - Offline traffic engineering: 444 A traffic engineering system that exists outside of the 445 network. 447 - Online traffic engineering: 448 A traffic engineering system that exists within the network, 449 typically implemented on or as adjuncts to operational network 450 elements. 452 - Performance measures: 453 Metrics that provide quantitative or qualitative measures of 454 the performance of systems or subsystems of interest. 456 - Performance management: 457 A systematic approach to improving effectiveness in the 458 accomplishment of specific networking goals related to 459 performance improvement. 461 - Performance Metric: 462 A performance parameter defined in terms of standard units of 463 measurement. 465 - Provisioning: 466 The process of assigning or configuring network resources to 467 meet certain requests. 469 - QoS routing: 470 Class of routing systems that selects paths to be used by a 471 flow based on the QoS requirements of the flow. 473 - Service Level Agreement: 474 A contract between a provider and a customer that guarantees 475 specific levels of performance and reliability at a certain 476 cost. 478 - Stability: 479 An operational state in which a network does not oscillate 480 in a disruptive manner from one mode to another mode. 482 - Supply side congestion management: 483 A congestion management scheme that provisions additional 484 network resources to address existing and/or anticipated 485 congestion problems. 487 - Transit traffic: 488 Traffic whose origin and destination are both outside of 489 the network under consideration. 491 - Traffic characteristic: 492 A description of the temporal behavior or a description of the 493 attributes of a given traffic flow or traffic aggregate. 
495 - Traffic engineering system 496 A collection of objects, mechanisms, and protocols that are 497 used conjunctively to accomplish traffic engineering 498 objectives. 500 - Traffic flow: 501 A stream of packets between two end-points that can be 502 characterized in a certain way. A micro-flow has a more 503 specific definition: A micro-flow is a stream of packets with 504 a bounded inter-arrival time and with the same source and 505 destination addresses, source and destination ports, and 506 protocol ID. 508 - Traffic intensity: 509 A measure of traffic loading with respect to a resource 510 capacity over a specified period of time. In classical 511 telephony systems, traffic intensity is measured in units of 512 Erlang. 514 - Traffic matrix: 515 A representation of the traffic demand between a set of origin 516 and destination abstract nodes. An abstract node can consist 517 of one or more network elements. 519 - Traffic monitoring: 521 The process of observing traffic characteristics at a given 522 point in a network and collecting the traffic information for 523 analysis and further action. 525 - Traffic trunk: 526 An aggregation of traffic flows belonging to the same class 527 which are forwarded through a common path. A traffic trunk 528 may be characterized by an ingress and egress node, and a 529 set of attributes which determine its behavioral 530 characteristics and requirements from the network. 532 2.0 Background 534 The Internet has quickly evolved into a very critical communications 535 infrastructure, supporting significant economic, educational, and 536 social activities. Simultaneously, the delivery of Internet 537 communications services has become very competitive and end-users are 538 demanding very high quality service from their service providers. 539 Consequently, performance optimization of large scale IP networks, 540 especially public Internet backbones, has become an important 541 problem. Network performance requirements are multidimensional, 542 complex, and sometimes contradictory; making the traffic engineering 543 problem very challenging. 545 The network must convey IP packets from ingress nodes to egress nodes 546 efficiently, expeditiously, reliably, and economically. Furthermore, 547 in a multiclass service environment (e.g. Diffserv capable networks), 548 the resource sharing parameters of the network must be appropriately 549 determined and configured according to prevailing policies and 550 service models to resolve resource contention issues arising from 551 mutual interference between packets traversing through the network. 552 Thus, consideration must be given to resolving competition for 553 network resources between traffic streams belonging to the same 554 service class (intra-class contention resolution) and traffic streams 555 belonging to different classes (inter-class contention resolution). 557 2.1 Context of Internet Traffic Engineering 559 The context of Internet traffic engineering pertains to the scenarios 560 in which the problems that traffic engineering attempts to solve 561 manifest. A traffic engineering methodology establishes appropriate 562 rules to resolve traffic performance issues occurring in a specific 563 context. The context of Internet traffic engineering includes: 565 (1) A network context defining the universe of discourse, 566 and in particular the situations in which the traffic 567 engineering problems occur. 
The network context 568 encompasses network structure, network policies, network 569 characteristics, network constraints, network quality 570 attributes, network optimization criteria, etc. 572 (2) A problem context defining the general and concrete 573 issues that traffic engineering addresses. The problem 574 context encompasses identification, abstraction of relevant 575 features, representation, formulation, specification of 576 the requirements on the solution space, specification 577 of the desirable features of acceptable solutions, etc. 579 (3) A solution context suggesting how to solve the traffic 580 engineering problems. The solution context encompasses 581 analysis, evaluation of alternatives, prescription, and 582 resolution. 584 (4) An implementation and operational context in which the 585 solutions are methodologically instantiated. The 586 implementation and operational context encompasses 587 planning, organization, and execution. 589 The context of Internet traffic engineering and the different problem 590 scenarios are discussed in the following subsections. 592 2.2 Network Context 594 IP networks range in size from small clusters of routers situated 595 within a given location, to thousands of interconnected routers, 596 switches, and other components distributed all over the world. 598 Conceptually, at the most basic level of abstraction, an IP network 599 can be represented as a distributed dynamical system consisting of: 600 (1) a set of interconnected resources which provide transport 601 services for IP traffic subject to certain constraints, (2) a demand 602 system representing the offered load to be transported through the 603 network, and (3) a response system consisting of network processes, 604 protocols, and related mechanisms which facilitate the movement of 605 traffic through the network [see also AWD2]. 607 The network elements and resources may have specific characteristics 608 restricting the manner in which the demand is handled. Additionally, 609 network resources may be equipped with traffic control mechanisms 610 superintending the way in which the demand is serviced. Traffic 611 control mechanisms may, for example, be used to control various 612 packet processing activities within a given resource, arbitrate 613 contention for access to the resource by different packets, and 614 regulate traffic behavior through the resource. A configuration 615 management and provisioning system may allow the settings of the 616 traffic control mechanisms to be manipulated by external or internal 617 entities in order to exercise control over the way in which the 618 network elements respond to internal and external stimuli. 620 The details of how the network provides transport services for 621 packets are specified in the policies of the network administrators 622 and are installed through network configuration management and policy 623 based provisioning systems. Generally, the types of services 624 provided by the network also depend upon the technology and 625 characteristics of the network elements and protocols, the prevailing 626 service and utility models, and the ability of the network 627 administrators to translate policies into network configurations. 629 Contemporary Internet networks have three significant 630 characteristics: (1) they provide real-time services, (2) they have 631 become mission critical, and (3) their operating environments are 632 very dynamic.
The dynamic characteristics of IP networks can be 633 attributed in part to fluctuations in demand, to the interaction 634 between various network protocols and processes, to the rapid 635 evolution of the infrastructure which demands the constant inclusion 636 of new technologies and new network elements, and to transient and 637 persistent impairments which occur within the system. 639 Packets contend for the use of network resources as they are conveyed 640 through the network. A network resource is considered to be 641 congested if the arrival rate of packets exceeds the output capacity 642 of the resource over an interval of time. Congestion may result in 643 some of the arriving packets being delayed or even dropped. 644 Congestion increases transit delays, delay variation, and packet loss, 645 and reduces the predictability of network services. Clearly, 646 congestion is a highly undesirable phenomenon. 648 Combating congestion at reasonable cost is a major objective of 649 Internet traffic engineering. 651 Efficient sharing of network resources by multiple traffic streams is 652 a basic economic premise for packet switched networks in general and 653 the Internet in particular. A fundamental challenge in network 654 operation, especially in a large scale public IP network, is to 655 increase the efficiency of resource utilization while minimizing the 656 possibility of congestion. 658 Increasingly, the Internet will have to function in the presence of 659 different classes of traffic with different service requirements. The 660 advent of differentiated services makes this requirement particularly 661 acute. Thus, packets may be grouped into behavior aggregates such 662 that each behavior aggregate may have a common set of behavioral 663 characteristics or a common set of delivery requirements. In 664 practice, the delivery requirements of a specific set of packets may 665 be specified explicitly or implicitly. Two of the most important 666 traffic delivery requirements are capacity constraints and QoS 667 constraints. 669 Capacity constraints can be expressed statistically as peak rates, 670 mean rates, burst sizes, or as some deterministic notion of effective 671 bandwidth. QoS requirements can be expressed in terms of (1) 672 integrity constraints such as packet loss and (2) temporal 673 constraints such as timing restrictions for the delivery of 674 each packet (delay) and timing restrictions for the delivery of 675 consecutive packets belonging to the same traffic stream (delay 676 variation). 678 2.3 Problem Context 680 Fundamental problems exist in association with the operation of a 681 network described by the simple model of the previous subsection. 682 This subsection reviews the problem context in relation to the 683 traffic engineering function. 685 The identification, abstraction, representation, and measurement of 686 network features relevant to traffic engineering is a significant 687 issue. 689 One particularly important class of problems concerns how to 690 explicitly formulate the problems that traffic engineering attempts 691 to solve, how to identify the requirements on the solution space, how 692 to specify the desirable features of good solutions, how to actually 693 solve the problems, and how to measure and characterize the 694 effectiveness of the solutions. 696 Another class of problems concerns how to measure and estimate 697 relevant network state parameters.
Effective traffic engineering 698 relies on a good estimate of the offered traffic load as well as a 699 view of the underlying topology and associated resource constraints. 700 A network-wide view of the topology is also a must for offline 701 planning. 703 Still another class of problems concerns how to characterize the 704 state of the network and how to evaluate its performance under a 705 variety of scenarios. The performance evaluation problem is two-fold. 706 One aspect of this problem relates to the evaluation of the system 707 level performance of the network. The other aspect relates to the 708 evaluation of the resource level performance, which restricts 709 attention to the performance analysis of individual network 710 resources. In this memo, we shall refer to the system level 711 characteristics of the network as the "macro-states" and the resource 712 level characteristics as the "micro-states." The system level 713 characteristics are also known as the emergent properties of the 714 network as noted earlier. Correspondingly, we shall refer to the 715 traffic engineering schemes dealing with network performance 716 optimization at the systems level as "macro-TE" and the schemes that 717 optimize at the individual resource level as "micro-TE." Under 718 certain circumstances, the system level performance can be derived 719 from the resource level performance using appropriate rules of 720 composition, depending upon the particular performance measures of 721 interest. 723 Another fundamental class of problems concerns how to effectively 724 optimize network performance. Performance optimization may entail 725 translating solutions to specific traffic engineering problems into 726 network configurations. Optimization may also entail some degree of 727 resource management control, routing control, and/or capacity 728 augmentation. 730 As noted previously, congestion is an undesirable phenomenon in 731 operational networks. Therefore, the next subsection addresses the 732 issue of congestion and its ramifications within the problem context 733 of Internet traffic engineering. 735 2.3.1 Congestion and its Ramifications 737 Congestion is one of the most significant problems in an operational 738 IP context. A network element is said to be congested if it 739 experiences sustained overload over an interval of time. Congestion 740 almost always results in degradation of service quality to end users. 741 Congestion control schemes can include demand side policies and 742 supply side policies. Demand side policies may restrict access to 743 congested resources and/or dynamically regulate the demand to 744 alleviate the overload situation. Supply side policies may expand or 745 augment network capacity to better accommodate offered traffic. 746 Supply side policies may also re-allocate network resources by 747 redistributing traffic over the infrastructure. Traffic 748 redistribution and resource re-allocation serve to increase the 749 'effective capacity' seen by the demand. 751 The emphasis of this memo is primarily on congestion management 752 schemes falling within the scope of the network, rather than on 753 congestion management systems dependent upon sensitivity and 754 adaptivity from end-systems. That is, the aspects that are considered 755 in this memo with respect to congestion management are those 756 solutions that can be provided by control entities operating on the 757 network and by the actions of network administrators and network 758 operations systems.
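As a simple illustration of the notion of sustained overload used above, the following sketch (written in Python purely for illustration; the data structures, threshold, and persistence parameters are assumptions of this example and are not defined by this framework) flags a link as congested when its measured utilization remains above a threshold for several consecutive measurement intervals:

   # Illustrative sketch only (not part of this framework): flag links
   # whose measured utilization exceeds a threshold for a sustained run
   # of measurement intervals.  All values shown are hypothetical.

   from dataclasses import dataclass
   from typing import Dict, List

   @dataclass
   class LinkSample:
       offered_bits: float    # bits offered to the link during the interval
       capacity_bps: float    # link output capacity in bits per second
       interval_s: float      # length of the measurement interval in seconds

   def utilization(s: LinkSample) -> float:
       """Offered load expressed as a fraction of the link's output capacity."""
       return s.offered_bits / (s.capacity_bps * s.interval_s)

   def congested_links(samples: Dict[str, List[LinkSample]],
                       threshold: float = 0.95,
                       sustain: int = 3) -> List[str]:
       """Links whose utilization exceeds 'threshold' for at least
       'sustain' consecutive samples, i.e., sustained overload."""
       flagged = []
       for link, series in samples.items():
           run = 0
           for s in series:
               run = run + 1 if utilization(s) > threshold else 0
               if run >= sustain:
                   flagged.append(link)
                   break
       return flagged

In practice, the measurement interval, the threshold, and the number of intervals that constitute "sustained" overload are operator policy choices rather than intrinsic properties of the network.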
760 2.4 Solution Context 762 The solution context for Internet traffic engineering involves 763 analysis, evaluation of alternatives, and choice between alternative 764 courses of action. Generally the solution context is predicated on 765 making reasonable inferences about the current or future state of the 766 network, and subsequently making appropriate decisions that may 767 involve a preference between alternative sets of action. More 768 specifically, the solution context demands reasonable estimates of 769 traffic workload, characterization of network state, deriving 770 solutions to traffic engineering problems which may be implicitly or 771 explicitly formulated, and possibly instantiating a set of control 772 actions. Control actions may involve the manipulation of parameters 773 associated with routing, control over tactical capacity acquisition, 774 and control over the traffic management functions. 776 The following list of instruments may be applicable to the solution 777 context of Internet traffic engineering. 779 (1) A set of policies, objectives, and requirements (which may be 780 context dependent) for network performance evaluation and 781 performance optimization. 783 (2) A collection of online and possibly offline tools and mechanisms 784 for measurement, characterization, modeling, and control 785 of Internet traffic and control over the placement and allocation 786 of network resources, as well as control over the mapping or 787 distribution of traffic onto the infrastructure. 789 (3) A set of constraints on the operating environment, the network 790 protocols, and the traffic engineering system itself. 792 (4) A set of quantitative and qualitative techniques and 793 methodologies for abstracting, formulating, and 794 solving traffic engineering problems. 796 (5) A set of administrative control parameters which may be 797 manipulated through a Configuration Management (CM) system. 798 The CM system itself may include a configuration control 799 subsystem, a configuration repository, a configuration 800 accounting subsystem, and a configuration auditing subsystem. 802 (6) A set of guidelines for network performance evaluation, 803 performance optimization, and performance improvement. 805 Derivation of traffic characteristics through measurement and/or 806 estimation is very useful within the realm of the solution space for 807 traffic engineering. Traffic estimates can be derived from customer 808 subscription information, traffic projections, traffic models, and 809 from actual empirical measurements. The empirical measurements may be 810 performed at the traffic aggregate level or at the flow level in 811 order to derive traffic statistics at various levels of detail. 812 Measurements at the flow level or on small traffic aggregates may be 813 performed at edge nodes, where traffic enters and leaves the network. 814 Measurements at large traffic aggregate levels may be performed 815 within the core of the network where potentially numerous traffic 816 flows may be in transit concurrently. 818 To conduct performance studies and to support planning of existing 819 and future networks, a routing analysis may be performed to determine 820 the path(s) the routing protocols will choose for various traffic 821 demands, and to ascertain the utilization of network resources as 822 traffic is routed through the network. 
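As an informal illustration of such a routing analysis (a minimal sketch under stated assumptions; the topology, capacities, and demands shown are hypothetical, and equal-cost multipath splitting, traffic trunks, and policy constraints are deliberately ignored), the following Python fragment computes shortest paths over IGP metrics and projects a traffic matrix onto the links to estimate their utilization:

   # Illustrative routing-analysis sketch: project a traffic matrix onto
   # the shortest paths selected by IGP metrics and report per-link
   # utilization.  Assumes every egress is reachable from every ingress.

   import heapq

   def shortest_path(graph, src, dst):
       """Dijkstra over {node: {neighbor: metric}}; returns the node path."""
       dist, prev, seen = {src: 0}, {}, set()
       heap = [(0, src)]
       while heap:
           d, u = heapq.heappop(heap)
           if u in seen:
               continue
           seen.add(u)
           if u == dst:
               break
           for v, metric in graph[u].items():
               nd = d + metric
               if nd < dist.get(v, float("inf")):
                   dist[v], prev[v] = nd, u
                   heapq.heappush(heap, (nd, v))
       path = [dst]
       while path[-1] != src:
           path.append(prev[path[-1]])
       return list(reversed(path))

   def link_utilization(graph, capacity_bps, demands_bps):
       """demands_bps: {(ingress, egress): offered load in bit/s}."""
       load = {}
       for (src, dst), bps in demands_bps.items():
           path = shortest_path(graph, src, dst)
           for a, b in zip(path, path[1:]):
               load[(a, b)] = load.get((a, b), 0.0) + bps
       return {link: bps / capacity_bps[link] for link, bps in load.items()}

   # Example with a small, hypothetical three-node topology.
   graph = {"A": {"B": 10, "C": 20}, "B": {"A": 10, "C": 10},
            "C": {"A": 20, "B": 10}}
   capacity = {(a, b): 1e9 for a in graph for b in graph[a]}
   demands = {("A", "C"): 4e8, ("A", "B"): 2e8}
   print(link_utilization(graph, capacity, demands))

A production-grade analysis would extend this simple model along the lines described next.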
The routing analysis should 823 capture the selection of paths through the network, the assignment of 824 traffic across multiple feasible routes, and the multiplexing of IP 825 traffic over traffic trunks (if such constructs exist) and over the 826 underlying network infrastructure. A network topology model is a 827 necessity for routing analysis. A network topology model may be 828 extracted from network architecture documents, from network designs, 829 from information contained in router configuration files, from 830 routing databases, from routing tables, or from automated tools that 831 discover and depict network topology information. Topology 832 information may also be derived from servers that monitor network 833 state, and from servers that perform provisioning functions. 835 Routing in operational IP networks can be administratively controlled 836 at various levels of abstraction, including the manipulation of BGP 837 attributes and the manipulation of IGP metrics. For path oriented 838 technologies such as MPLS and its derivatives, routing can be further 839 controlled by the manipulation of relevant traffic engineering 840 parameters, resource parameters, and administrative policy 841 constraints. Within the context of MPLS, the path of an explicit 842 label switched path (LSP) can be computed and established in various 843 ways, including: (1) manually, (2) automatically online using 844 constraint-based routing processes implemented on label switching 845 routers, and (3) automatically offline using constraint-based routing 846 entities implemented on external traffic engineering support systems. 848 2.4.1 Combating the Congestion Problem 850 Minimizing congestion is a significant aspect of Internet traffic 851 engineering. This subsection gives an overview of the general 852 approaches that have been used or proposed to combat congestion 853 problems. 855 Congestion management policies can be categorized based upon the 856 following criteria (see e.g., [YaRe95] for a more detailed taxonomy 857 of congestion control schemes): (1) Response time scale, which can be 858 characterized as long, medium, or short; (2) reactive versus 859 preventive, which relates to congestion control and congestion 860 avoidance; and (3) supply side versus demand side congestion 861 management schemes. These aspects are discussed in the following 862 paragraphs. 864 (1) Congestion Management based on Response Time Scales 866 - Long (weeks to months): Capacity planning works over a relatively 867 long time scale to expand network capacity based on estimates or 868 forecasts of future traffic demand and traffic distribution. Since 869 router and link provisioning take time and are generally expensive, 870 these upgrades are typically carried out in the weeks-to-months or 871 even years time scale. 873 - Medium (minutes to days): Several control policies fall within the 874 medium time scale category.
Examples include: (1) Adjusting IGP 875 and/or BGP parameters to route traffic away from or towards certain 876 segments of the network; (2) Setting up and/or adjusting some 877 explicitly routed label switched paths (ER-LSPs) in MPLS networks to 878 route some traffic trunks away from possibly congested resources or 879 towards possibly more favorable routes; (3) re-configuring the 880 logical topology of the network to make it correlate more closely 881 with the spatial traffic distribution using, for example, some 882 underlying path-oriented technology such as MPLS LSPs, ATM PVCs, or 883 optical channel trails (see e.g. [AWD6]). Many of these adaptive 884 medium time scale response schemes rely on a measurement system that 885 monitors changes in traffic distribution, traffic shifts, and network 886 resource utilization and subsequently provides feedback to the online 887 and/or offline traffic engineering mechanisms and tools which employ 888 this feedback information to trigger certain control actions to occur 889 within the network. The traffic engineering mechanisms and tools can 890 be implemented in a distributed fashion or in a centralized fashion, 891 and may have a hierarchical structure or a flat structure. The 892 comparative merits of distributed and centralized control structures 893 for networks are well known. A centralized scheme may have global 894 visibility into the network state and may produce potentially more 895 optimal solutions. However, centralized schemes are prone to single 896 points of failure and may not scale as well as distributed schemes. 897 Moreover, the information utilized by a centralized scheme may be 898 stale and may not reflect the actual state of the network. It is not 899 an objective of this memo to make a recommendation between 900 distributed and centralized schemes. This is a choice that network 901 administrators must make based on their specific needs. 903 - Short (picoseconds to minutes): This category includes packet level 904 processing functions and events on the order of several round trip 905 times. It includes router mechanisms such as passive and active 906 buffer management. These mechanisms are used to control congestion 907 and/or signal congestion to end systems so that they can adaptively 908 regulate the rate at which traffic is injected into the network. One 909 of the most popular active queue management schemes, especially for 910 TCP traffic, is Random Early Detection (RED) [FlJa93], which supports 911 congestion avoidance by controlling the average queue size. During 912 congestion (but before the queue is filled), the RED scheme chooses 913 arriving packets to "mark" according to a probabilistic algorithm 914 which takes into account the average queue size. For a router that 915 does not utilize explicit congestion notification (ECN) (see e.g., 916 [Floy94]), the marked packets can simply be dropped to signal the 917 inception of congestion to end systems. On the other hand, if the 918 router supports ECN, then it can set the ECN field in the packet 919 header. Several variations of RED have been proposed to support 920 different drop precedence levels in multiclass environments 921 [RFC-2597], e.g., RED with In and Out (RIO) and Weighted RED. There is 922 general consensus that RED provides congestion avoidance performance 923 which is not worse than traditional Tail-Drop (TD) queue management 924 (drop arriving packets only when the queue is full).
Importantly, 925 however, RED reduces the possibility of global synchronization and 926 improves fairness among different TCP sessions. However, RED by 927 itself cannot prevent congestion and unfairness caused by 928 unresponsive sources, e.g., UDP traffic and some misbehaved greedy 929 connections. Other schemes have been proposed to improve the 930 performance and fairness in the presence of unresponsive traffic. 931 Some of these schemes were proposed as theoretical frameworks and are 932 typically not available in existing commercial products. Two such 933 schemes are Longest Queue Drop (LQD) and Dynamic Soft Partitioning 934 with Random Drop (RND) [SLDC98]. 936 (2) Congestion Management: Reactive versus Preventive Schemes 938 - Reactive: reactive (recovery) congestion management policies react 939 to existing congestion problems in order to alleviate them. All the policies 940 described in the long and medium time scales above can be categorized 941 as being reactive, especially if the policies are based on monitoring 942 and identifying existing congestion problems, and on the initiation 943 of relevant actions to ease the situation. 945 - Preventive: preventive (predictive/avoidance) policies take 946 proactive action to prevent congestion based on estimates and 947 predictions of future potential congestion problems. Some of the 948 policies described in the long and medium time scales fall into this 949 category. They do not necessarily respond immediately to existing 950 congestion problems. Instead, forecasts of traffic demand and workload 951 distribution are considered and action may be taken to prevent 952 potential congestion problems in the future. The schemes described in 953 the short time scale (e.g., RED and its variations, ECN, LQD, and 954 RND) are also used for congestion avoidance since dropping or marking 955 packets before queues actually overflow would trigger corresponding 956 TCP sources to slow down. 958 (3) Congestion Management: Supply Side versus Demand Side Schemes 960 - Supply side: supply side congestion management policies increase 961 the effective capacity available to traffic in order to control or 962 obviate congestion. This can be accomplished by augmenting capacity. 963 Another way to accomplish this is to minimize congestion by having a 964 relatively balanced distribution of traffic over the network. For 965 example, capacity planning should aim to provide a physical topology 966 and associated link bandwidths that match estimated traffic workload 967 and traffic distribution based on forecasting (subject to budgetary 968 and other constraints). However, if actual traffic distribution does 969 not match the topology derived from capacity planning (due to 970 forecasting errors or facility constraints, for example), then the 971 traffic can be mapped onto the existing topology using routing 972 control mechanisms, using path oriented technologies (e.g., MPLS LSPs 973 and optical channel trails) to modify the logical topology, or by 974 using some other load redistribution mechanisms. 976 - Demand side: demand side congestion management policies control or 977 regulate the offered traffic to alleviate congestion problems. For 978 example, some of the short time scale mechanisms described earlier 979 (such as RED and its variations, ECN, LQD, and RND) as well as 980 policing and rate shaping mechanisms attempt to regulate the offered 981 load in various ways. Tariffs may also be applied as a demand side 982 instrument.
To date, however, tariffs have not been used as a means 983 of demand side congestion management within the Internet. 985 In summary, a variety of mechanisms can be used to address congestion 986 problems in IP networks. These mechanisms may operate at multiple 987 time-scales. 989 2.5 Implementation and Operational Context 991 The operational context of Internet traffic engineering is 992 characterized by constant change which occurs at multiple levels of 993 abstraction. The implementation context demands effective planning, 994 organization, and execution. The planning aspects may involve 995 determining, in advance, the sets of actions required to achieve desired objectives. 996 Organizing involves arranging and assigning responsibility to the 997 various components of the traffic engineering system and coordinating 998 the activities to accomplish the desired TE objectives. Execution 999 involves measuring and applying corrective or perfective actions to 1000 attain and maintain desired TE goals. 1002 3.0 Traffic Engineering Process Model(s) 1004 This section describes a generic process model that captures the high 1005 level practical aspects of Internet traffic engineering in an 1006 operational context. The process model is described as a sequence of 1007 actions that a traffic engineer, or more generally a traffic 1008 engineering system, must perform to optimize the performance of an 1009 operational network (see also [AWD1, AWD2]). The process model 1010 described here represents the broad activities common to most traffic 1011 engineering methodologies although the details regarding how traffic 1012 engineering is executed may differ from network to network. This 1013 process model may be enacted explicitly or implicitly, by an 1014 automaton and/or by a human. 1016 The traffic engineering process model is iterative [AWD2]. The four 1017 phases of the process model described below are repeated continually. 1019 The first phase of the TE process model is to define the relevant 1020 control policies that govern the operation of the network. These 1021 policies may depend upon many factors including the prevailing 1022 business model, the network cost structure, the operating 1023 constraints, the utility model, and optimization criteria. 1025 The second phase of the process model is a feedback mechanism 1026 involving the acquisition of measurement data from the operational 1027 network. If empirical data is not readily available from the network, 1028 then synthetic workloads may be used instead which reflect either the 1029 prevailing or the expected workload of the network. Synthetic 1030 workloads may be derived by estimation or extrapolation using prior 1031 empirical data. They may also be derived using 1032 mathematical models of traffic characteristics or other means. 1034 The third phase of the process model is to analyze the network state 1035 and to characterize traffic workload. Performance analysis may be 1036 proactive and/or reactive. Proactive performance analysis identifies 1037 potential problems that do not yet exist, but could manifest in the 1038 future. Reactive performance analysis identifies existing problems, 1039 determines their cause through diagnosis, and evaluates alternative 1040 approaches to remedy the problem, if necessary. A number of 1041 quantitative and qualitative techniques may be used in the analysis 1042 process, including modeling based analysis and simulation.
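Referring back to the second phase described above, the following sketch (with purely illustrative names, not a prescribed method) shows one way a synthetic demand matrix could be extrapolated from prior empirical measurements: a linear growth trend is fitted to the historical totals and the most recent matrix is scaled accordingly.

   def extrapolate_demand_matrix(history, horizon):
       """history: list of (period, {(src, dst): Mbps}) in chronological order.
       horizon: number of future periods to project ahead of the last sample.
       Returns a synthetic demand matrix for the target period."""
       periods = [p for p, _ in history]
       totals = [sum(matrix.values()) for _, matrix in history]
       n = len(periods)
       # Least-squares linear fit of total demand versus period index.
       mean_x = sum(periods) / n
       mean_y = sum(totals) / n
       num = sum((x - mean_x) * (y - mean_y) for x, y in zip(periods, totals))
       den = sum((x - mean_x) ** 2 for x in periods)
       slope = num / den
       projected_total = mean_y + slope * (periods[-1] + horizon - mean_x)
       # Scale the most recent matrix so that its total matches the projection,
       # preserving the observed spatial distribution of traffic.
       latest = history[-1][1]
       scale = projected_total / sum(latest.values())
       return {pair: demand * scale for pair, demand in latest.items()}

   history = [(1, {("A", "B"): 100.0, ("A", "C"): 50.0}),
              (2, {("A", "B"): 110.0, ("A", "C"): 55.0})]
   print(extrapolate_demand_matrix(history, horizon=2))

A real deployment would use richer traffic models and per-pair trends; the point is only that a synthetic workload can be produced mechanically from prior empirical data.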
The 1043 analysis phase of the process model may involve investigating the 1044 concentration and distribution of traffic across the network or 1045 relevant subsets of the network, identifying the characteristics of 1046 the offered traffic workload, identifying existing or potential 1047 bottlenecks, and identifying network pathologies such as ineffective 1048 link placement, single points of failures, etc. Network pathologies 1049 may result from many factors including inferior network architecture, 1050 inferior network design, and configuration problems. A traffic 1051 matrix may be constructed as part of the analysis process. Network 1052 analysis may also be descriptive or prescriptive. 1054 The fourth phase of the TE process model is the performance 1055 optimization of the network. The performance optimization phase 1056 involves a decision process which selects and implements a set of 1057 actions from a set of alternatives. Optimization actions may include 1058 the use of appropriate techniques to either control the offered 1059 traffic or to control the distribution of traffic across the network. 1060 Optimization actions may also involve adding additional links or 1061 increasing link capacity, deploying additional hardware such as 1062 routers and switches, systematically adjusting parameters associated 1063 with routing such as IGP metrics and BGP attributes, and adjusting 1064 traffic management parameters. Network performance optimization may 1065 also involve starting a network planning process to improve the 1066 network architecture, network design, network capacity, network 1067 technology, and the configuration of network elements to accommodate 1068 current and future growth. 1070 3.1 Components of the Traffic Engineering Process Model 1072 The key components of the traffic engineering process model include a 1073 measurement subsystem, a modeling and analysis subsystem, and an 1074 optimization subsystem. The following subsections examine these 1075 components as they apply to the traffic engineering process model. 1077 3.2 Measurement 1079 Measurement is crucial to the traffic engineering function. The 1080 operational state of a network can be conclusively determined only 1081 through measurement. Measurement is also critical to the optimization 1082 function because it provides feedback data which is used by traffic 1083 engineering control subsystems. This data is used to adaptively 1084 optimize network performance in response to events and stimuli 1085 originating within and outside the network. Measurement is also 1086 needed to determine the quality of network services and to evaluate 1087 the effectiveness of traffic engineering policies. Experience 1088 suggests that measurement is most effective when acquired and applied 1089 systematically. 1091 When developing a measurement system to support the traffic 1092 engineering function in IP networks, the following questions should 1093 be carefully considered: Why is measurement needed in this particular 1094 context? What parameters are to be measured? How should the 1095 measurement be accomplished? Where should the measurement be 1096 performed? When should the measurement be performed? How frequently 1097 should the monitored variables be measured? What level of 1098 measurement accuracy and reliability is desirable? What level of 1099 measurement accuracy and reliability is realistically attainable? 
To 1100 what extent can the measurement system permissibly interfere with the 1101 monitored network components and variables? What is the acceptable 1102 cost of measurement? The answers to these questions will determine 1103 the measurement tools and methodologies appropriate in any given 1104 traffic engineering context. 1106 It should also be noted that there is a distinction between 1107 measurement and evaluation. Measurement provides raw data concerning 1108 state parameters and variables of monitored network elements. 1109 Evaluation utilizes the raw data to make inferences regarding the 1110 monitored system. 1112 Measurement in support of the TE function can occur at different 1113 levels of abstraction. For example, measurement can be used to derive 1114 packet level characteristics, flow level characteristics, user or 1115 customer level characteristics, traffic aggregate characteristics, 1116 component level characteristics, network wide characteristics, etc. 1118 3.3 Modeling, Analysis, and Simulation 1120 Modeling and analysis are important aspects of Internet traffic 1121 engineering. Modeling involves constructing an abstract or physical 1122 representation which depicts relevant traffic characteristics and 1123 network attributes. 1125 A network model is an abstract representation of the network which 1126 captures relevant network features, attributes, and characteristics, 1127 such as link and nodal attributes and constraints. A network model 1128 may facilitate analysis and/or simulation which can be used to 1129 predict network performance under various conditions as well as to 1130 guide network expansion plans. 1132 In general, Internet traffic engineering models can be classified as 1133 either structural or behavioral. Structural models focus on the 1134 organization of the network and its components. Behavioral models 1135 focus on the dynamics of the network and the traffic workload. 1136 Modeling for Internet traffic engineering may also be formal or 1137 informal. 1139 Accurate behavioral models for traffic sources are particularly 1140 useful for analysis. Development of behavioral traffic source models 1141 that are consistent with empirical data obtained from operational 1142 networks is a major research topic in Internet traffic engineering. 1143 These source models should also be tractable and amenable to 1144 analysis. The topic of source models for IP traffic is a research 1145 topic and is therefore outside the scope of this document. Its 1146 importance, however, must be emphasized. 1148 Network simulation tools are extremely useful for traffic 1149 engineering. Because of the complexity of realistic quantitative 1150 analysis of network behavior, certain aspects of network performance 1151 studies can only be conducted effectively using simulation. A good 1152 network simulator can be used to mimic and visualize network 1153 characteristics under various conditions in a safe and non-disruptive 1154 manner. For example, a network simulator may be used to depict 1155 congested resources and hot spots, and to provide hints regarding 1156 possible solutions to network performance problems. A good simulator 1157 may also be used to validate the effectiveness of planned solutions 1158 to network issues without the need to tamper with the operational 1159 network, or to commence an expensive network upgrade which may not 1160 achieve the desired objectives. 
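As a drastically simplified sketch of the kind of computation such a simulator performs, the following fragment (illustrative names only, not a description of any real tool) maps a demand matrix onto shortest paths over a link-state topology and reports per-link utilization, which is enough to expose prospective hot spots.

   import heapq

   def shortest_path(graph, src, dst):
       """Dijkstra over graph[node] = {neighbor: (metric, capacity)}.
       Assumes dst is reachable from src."""
       dist, prev, seen = {src: 0}, {}, set()
       heap = [(0, src)]
       while heap:
           d, u = heapq.heappop(heap)
           if u in seen:
               continue
           seen.add(u)
           if u == dst:
               break
           for v, (metric, _cap) in graph[u].items():
               nd = d + metric
               if nd < dist.get(v, float("inf")):
                   dist[v], prev[v] = nd, u
                   heapq.heappush(heap, (nd, v))
       path, node = [dst], dst
       while node != src:
           node = prev[node]
           path.append(node)
       return list(reversed(path))

   def link_utilization(graph, demands):
       """demands: {(src, dst): Mbps}. Returns {(u, v): utilization}."""
       load = {}
       for (src, dst), mbps in demands.items():
           for u, v in zip(shortest_path(graph, src, dst), shortest_path(graph, src, dst)[1:]):
               load[(u, v)] = load.get((u, v), 0.0) + mbps
       return {link: load[link] / graph[link[0]][link[1]][1] for link in load}

   # Example: a three-node triangle with one low-capacity link.
   g = {"A": {"B": (1, 100), "C": (1, 100)},
        "B": {"A": (1, 100), "C": (5, 50)},
        "C": {"A": (1, 100), "B": (5, 50)}}
   print(link_utilization(g, {("A", "B"): 80, ("A", "C"): 60}))

Changing a link metric or a link capacity in such a model and re-running the computation is the essence of the what-if analysis described above.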
Furthermore, during the process of 1161 network planning, a network simulator may reveal pathologies such as 1162 single points of failure which may require additional redundancy, and 1163 potential bottlenecks and hot spots which may require additional 1164 capacity. 1166 Routing simulators are especially useful in large networks. A routing 1167 simulator may identify planned links which may not actually be used 1168 to route traffic by the existing routing protocols. Simulators can 1169 also be used to conduct scenario based and perturbation based 1170 analysis, as well as sensitivity studies. Simulation results can be 1171 used to initiate appropriate actions in various ways. For example, an 1172 important application of network simulation tools is to investigate 1173 and identify how best to evolve and grow the network in order to 1174 accommodate projected future demands. 1176 3.4 Optimization 1178 Network performance optimization involves resolving network issues by 1179 transforming such issues into a form that enables the 1180 identification of a solution, followed by implementation of the solution. 1181 Network performance optimization can be corrective or perfective. In 1182 corrective optimization, the goal is to remedy a problem that has 1183 occurred or that is incipient. In perfective optimization, the goal 1184 is to improve network performance even when explicit problems do not 1185 exist and are not anticipated. 1187 Network performance optimization is a continual process, as noted 1188 previously. Performance optimization iterations may consist of 1189 real-time optimization sub-processes and non-real-time network 1190 planning sub-processes. The difference between real-time 1191 optimization and network planning is primarily in the relative time- 1192 scale in which they operate and in the granularity of actions. One of the 1193 objectives of a real-time optimization sub-process is to control the 1194 mapping and distribution of traffic over the existing network 1195 infrastructure to avoid and/or relieve congestion, to assure 1196 satisfactory service delivery, and to optimize resource utilization. 1197 Real-time optimization is needed because random incidents such as 1198 fiber cuts or shifts in traffic demand will occur irrespective of how 1199 well a network is designed. These incidents can cause congestion and 1200 other problems to manifest in an operational network. Real-time 1201 optimization must solve such problems in small to medium time-scales 1202 ranging from micro-seconds to minutes or hours. Examples of real-time 1203 optimization include queue management, IGP/BGP metric tuning, and 1204 using technologies such as MPLS explicit LSPs to change the paths of 1205 some traffic trunks [XIAO]. 1207 One of the functions of the network planning sub-process is to 1208 initiate actions to systematically evolve the architecture, 1209 technology, topology, and capacity of a network. When a problem 1210 exists in the network, real-time optimization should provide an 1211 immediate remedy. Because a prompt response is necessary, the real- 1212 time solution may not be the best possible solution. Network 1213 planning may subsequently be needed to refine the solution and 1214 improve the situation. Network planning is also required to expand 1215 the network to support traffic growth and changes in traffic 1216 distribution over time. As previously noted, a change in the topology 1217 and/or capacity of the network may be the outcome of network 1218 planning.
1220 Clearly, network planning and real-time performance optimization are 1221 mutually complementary activities. A well-planned and designed 1222 network makes real-time optimization easier, while a systematic 1223 approach to real-time network performance optimization allows network 1224 planning to focus on long term issues rather than tactical 1225 considerations. Systematic real-time network performance 1226 optimization also provides valuable inputs and insights toward 1227 network planning. 1229 Stability is an important consideration in real-time network 1230 performance optimization. This aspect will be repeatedly addressed 1231 throughout this memo. 1233 4.0 Historical Review and Recent Developments 1235 This section briefly reviews different traffic engineering approaches 1236 proposed and implemented in telecommunications and computer networks. 1237 The discussion is not intended to be comprehensive. It is primarily 1238 intended to illuminate pre-existing perspectives and prior art 1239 concerning traffic engineering in the Internet and in legacy 1240 telecommunications networks. 1242 4.1 Traffic Engineering in Classical Telephone Networks 1244 This subsection presents a brief overview of traffic engineering in 1245 telephone networks, which often relates to the way user traffic is 1246 steered from an originating node to the terminating node. 1247 A detailed 1248 description of the various routing strategies applied in telephone 1249 networks is included in the book by G. Ash [ASH2]. 1251 The early telephone network relied on static hierarchical routing, 1252 whereby routing patterns remained fixed independent of the state of 1253 the network or time of day. The hierarchy was intended to accommodate 1254 overflow traffic, improve network reliability via alternate routes, 1255 and prevent call looping by employing strict hierarchical rules. The 1256 network was typically over-provisioned since a given fixed route had 1257 to be dimensioned so that it could carry user traffic during a busy 1258 hour of any busy day. Hierarchical routing in the telephony network 1259 was found to be too rigid upon the advent of digital switches and 1260 stored program control which were able to manage more complicated 1261 traffic engineering rules. 1263 Dynamic routing was introduced to alleviate the routing inflexibility 1264 of static hierarchical routing so that the network would operate 1265 more efficiently. This resulted in significant economic gains 1266 [HuSS87]. Dynamic routing typically reduces the overall loss 1267 probability by 10 to 20 percent (compared to static hierarchical 1268 routing). Dynamic routing can also improve network resilience by 1269 recalculating routes on a per-call basis and periodically updating 1270 routes. 1272 There are three main types of dynamic routing in the telephone 1273 network. They are time-dependent routing, state-dependent routing 1274 (SDR), and event-dependent routing (EDR). 1276 In time-dependent routing, regular variations in traffic loads due to 1277 time of day and seasonality are exploited in pre-planned routing 1278 tables. In state-dependent routing, routing tables are updated 1279 online according to the current state of the network (e.g., traffic 1280 demand, utilization, etc.). In event-dependent routing, routing 1281 changes are triggered by events (such as call setups encountering 1282 congested or blocked links) whereupon new paths are searched out 1283 using learning models.
EDR methods are real-time adaptive, but they 1284 do not require global state information as does SDR. Examples of EDR 1285 schemes include the dynamic alternate routing (DAR) from BT, the 1286 state-and-time dependent routing (STR) from NTT, and the success-to- 1287 the-top (STT) routing from AT&T. 1289 Dynamic non-hierarchical routing (DNHR) is an example of dynamic 1290 routing that was introduced in the AT&T toll network in the 1980's to 1291 respond to time-dependent information such as regular load variations 1292 as a function of time. Time-dependent information in terms of load 1293 may be divided into three time scales: hourly, weekly, and yearly. 1294 Correspondingly, three algorithms are defined to pre-plan the routing 1295 tables. The network design algorithm operates over a year-long 1296 interval while the demand servicing algorithm operates on a weekly 1297 basis to fine tune link sizes and routing tables to correct errors in 1298 the yearly forecast. At the smallest time scale, the routing 1299 algorithm is used to make limited adjustments based on daily traffic 1300 variations. Network design and demand servicing are computed using 1301 offline calculations. Typically, the calculations require extensive 1302 searches over possible routes. On the other hand, routing may need 1303 online calculations to handle crankback. DNHR adopts a "two-link" 1304 approach whereby a path can consist of two links at most. The 1305 routing algorithm presents an ordered list of route choices between 1306 an originating switch and a terminating switch. If a call overflows, 1307 a via switch (a tandem exchange between the originating switch and 1308 the terminating switch) would send a crankback signal to the 1309 originating switch. This switch would then select the next route, 1310 and so on, until there are no alternative routes available, in which 1311 case the call is blocked. 1313 4.2 Evolution of Traffic Engineering in Packet Networks 1315 This subsection reviews related prior work that was intended to 1316 improve the performance of data networks. Indeed, optimization of 1317 the performance of data networks started in the early days of the 1318 ARPANET. Other early commercial networks such as SNA also recognized 1319 the importance of performance optimization and service 1320 differentiation. 1322 In terms of traffic management, the Internet has been a best effort 1323 service environment until recently. In particular, very limited 1324 traffic management capabilities existed in IP networks to provide 1325 differentiated queue management and scheduling services to packets 1326 belonging to different classes. 1328 In terms of routing control, the Internet has employed distributed 1329 protocols for intra-domain routing. These protocols are highly 1330 scalable and resilient. However, they are based on simple algorithms 1331 for path selection which have very limited functionality to allow 1332 flexible control of the path selection process. 1334 In the following subsections, the evolution of practical traffic 1335 engineering mechanisms in IP networks and their predecessors is 1336 reviewed. 1338 4.2.1 Adaptive Routing in the ARPANET 1340 The early ARPANET recognized the importance of adaptive routing where 1341 routing decisions were based on the current state of the network 1342 [McQ80]. Early minimum delay routing approaches forwarded each 1343 packet to its destination along a path for which the total estimated 1344 transit time was the smallest.
Each node maintained a table of 1345 network delays, representing the estimated delay that a packet would 1346 experience along a given path toward its destination. The minimum 1347 delay table was periodically transmitted by a node to its neighbors. 1348 The shortest path, in terms of hop count, was also propagated to give 1349 the connectivity information. 1351 One drawback to this approach is that dynamic link metrics tend to 1352 create "traffic magnets" causing congestion to be shifted from one 1353 location of a network to another location, resulting in oscillation 1354 and network instability. 1356 4.2.2 Dynamic Routing in the Internet 1358 The Internet evolved from the ARPANET and adopted dynamic routing 1359 algorithms with distributed control to determine the paths that 1360 packets should take en route to their destinations. The routing 1361 algorithms are adaptations of shortest path algorithms where costs 1362 are based on link metrics. The link metric can be based on static or 1363 dynamic quantities. The link metric based on static quantities may be 1364 assigned administratively according to local criteria. The link 1365 metric based on dynamic quantities may be a function of a network 1366 congestion measure such as delay or packet loss. 1368 It was apparent early that static link metric assignment was 1369 inadequate because it can easily lead to unfavorable scenarios in 1370 which some links become congested while others remain lightly loaded. 1371 One of the many reasons for the inadequacy of static link metrics is 1372 that link metric assignment was often done without considering the 1373 traffic matrix in the network. Also, the routing protocols did not 1374 take traffic attributes and capacity constraints into account when 1375 making routing decisions. This resulted in traffic concentration becoming 1376 localized in subsets of the network infrastructure, potentially 1377 causing congestion. Even if link metrics are assigned in accordance 1378 with the traffic matrix, unbalanced loads in the network can still 1379 occur due to a number of factors including: 1381 - Resources may not be deployed in the most optimal locations 1382 from a routing perspective. 1384 - Forecasting errors in traffic volume and/or traffic distribution. 1386 - Dynamics in traffic matrix due to the temporal nature of traffic 1387 patterns, BGP policy change from peers, etc. 1389 The inadequacy of the legacy Internet interior gateway routing system 1390 is one of the factors motivating the interest in path oriented 1391 technologies with explicit routing and constraint-based routing 1392 capability, such as MPLS. 1394 4.2.3 ToS Routing 1396 Type-of-Service (ToS) routing involves different routes to the 1397 same destination being selected depending upon the ToS field of an IP 1398 packet [RFC-1349]. The ToS classes may be classified as low delay 1399 and high throughput. Each link is associated with multiple link 1400 costs and each link cost is used to compute routes for a particular 1401 ToS. A separate shortest path tree is computed for each ToS. The 1402 shortest path algorithm must be run for each ToS resulting in very 1403 expensive computation. Classical ToS-based routing is now outdated 1404 as the ToS field of the IP header has been replaced by the Diffserv field.
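To make the computational cost concrete, the sketch below (illustrative only) computes one shortest path tree per ToS class by running Dijkstra's algorithm once per class over per-ToS link costs; with k classes, the route computation is simply repeated k times.

   import heapq

   def spf_tree(links, root, tos):
       """links: {(u, v): {tos_name: cost}}. Returns {node: predecessor} for one ToS."""
       adj = {}
       for (u, v), costs in links.items():
           adj.setdefault(u, []).append((v, costs[tos]))
       dist, prev = {root: 0}, {}
       heap = [(0, root)]
       while heap:
           d, u = heapq.heappop(heap)
           if d > dist.get(u, float("inf")):
               continue
           for v, cost in adj.get(u, []):
               if d + cost < dist.get(v, float("inf")):
                   dist[v], prev[v] = d + cost, u
                   heapq.heappush(heap, (d + cost, v))
       return prev

   # One SPF run per ToS class: route computation grows linearly with the
   # number of classes (the link costs below are arbitrary examples).
   links = {("A", "B"): {"low-delay": 1, "high-throughput": 10},
            ("B", "A"): {"low-delay": 1, "high-throughput": 10},
            ("A", "C"): {"low-delay": 5, "high-throughput": 1},
            ("C", "A"): {"low-delay": 5, "high-throughput": 1},
            ("B", "C"): {"low-delay": 1, "high-throughput": 1},
            ("C", "B"): {"low-delay": 1, "high-throughput": 1}}
   trees = {tos: spf_tree(links, "A", tos) for tos in ("low-delay", "high-throughput")}
   print(trees)

The example is deliberately tiny; its only purpose is to show why a router supporting k ToS classes must perform k times the SPF work referred to above.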
1405 Effective traffic engineering is difficult to perform in classical 1406 ToS-based routing because each class still relies exclusively on 1407 shortest path routing which results in localization of traffic 1408 concentration within the network. 1410 4.2.4 Equal Cost MultiPath 1412 Equal Cost MultiPath (ECMP) is another technique that attempts to 1413 address the deficiency in Shortest Path First (SPF) interior gateway 1414 routing systems [RFC-2178]. In the classical SPF algorithm, if two or 1415 more shortest paths exist to a given destination, the algorithm will 1416 choose one of them. The algorithm is modified slightly in ECMP so 1417 that if two or more equal cost shortest paths exist between two 1418 nodes, the traffic between the nodes is distributed among the 1419 multiple equal-cost paths. Traffic distribution across the equal- 1420 cost paths is usually performed in one of two ways: (1) packet-based 1421 in a round-robin fashion, or (2) flow-based using hashing on source 1422 and destination IP addresses and possibly other fields of the IP 1423 header. The first approach can easily cause out-of-order packets 1424 while the second approach is dependent upon the number and 1425 distribution of flows. Flow-based load sharing may be unpredictable 1426 in an enterprise network where the number of flows is relatively 1427 small and less heterogeneous (for example, hashing may not be 1428 uniform), but it is generally effective in core public networks where 1429 the number of flows is large and heterogeneous. 1431 In ECMP, link costs are static and bandwidth constraints are not 1432 considered, so ECMP attempts to distribute the traffic as equally as 1433 possible among the equal-cost paths independent of the congestion 1434 status of each path. As a result, given two equal-cost paths, it is 1435 possible that one of the paths will be more congested than the other. 1436 Another drawback of ECMP is that load sharing cannot be achieved on 1437 multiple paths which have non-identical costs. 1439 4.2.5 Nimrod 1441 Nimrod is a routing system developed to provide heterogeneous service 1442 specific routing in the Internet, while taking multiple constraints 1443 into account [RFC-1992]. Essentially, Nimrod is a link state routing 1444 protocol which supports path oriented packet forwarding. It uses the 1445 concept of maps to represent network connectivity and services at 1446 multiple levels of abstraction. Mechanisms are provided to allow 1447 restriction of the distribution of routing information. 1449 Even though Nimrod did not enjoy deployment in the public Internet, a 1450 number of key concepts incorporated into the Nimrod architecture, 1451 such as explicit routing which allows selection of paths at 1452 originating nodes, are beginning to find applications in some recent 1453 constraint-based routing initiatives. 1455 4.3 Overlay Model 1457 In the overlay model, a virtual-circuit network, such as ATM, frame 1458 relay, or WDM provides virtual-circuit connectivity between routers 1459 that are located at the edges of a virtual-circuit cloud. In this 1460 mode, two routers that are connected through a virtual circuit see a 1461 direct adjacency between themselves independent of the physical route 1462 taken by the virtual circuit through the ATM, frame relay, or WDM 1463 network. Thus, the overlay model essentially decouples the logical 1464 topology that routers see from the physical topology that the ATM, 1465 frame relay, or WDM network manages. 
The overlay model based on ATM 1466 or frame relay enables a network administrator or an automaton to 1467 employ traffic engineering concepts to perform path optimization by 1468 re-configuring or rearranging the virtual circuits so that a virtual 1469 circuit on a congested or suboptimal physical link can be re-routed 1470 to a less congested or more optimal one. In the overlay model, 1471 traffic engineering is also employed to establish relationships 1472 between the traffic management parameters (e.g. PCR, SCR, MBS for 1473 ATM; or CIR, Be, and Bc for frame relay) of the virtual-circuit 1474 technology and the actual traffic that traverses each circuit. These 1475 relationships can be established based upon known or projected 1476 traffic profiles and other factors. 1478 The overlay model using IP over ATM requires the management of two 1479 separate networks with different technologies (IP and ATM) resulting 1480 in increased operational complexity and cost. In the fully-meshed 1481 overlay model, each router would peer with every other router in the 1482 network, so that the total number of adjacencies is a quadratic 1483 function of the number of routers. Some of the issues with the 1484 overlay model are discussed in [AWD2]. 1486 4.4 Constraint-Based Routing 1488 Constraint-based routing refers to a class of routing systems that 1489 compute routes through a network subject to satisfaction of a set of 1490 constraints and requirements. In the most general setting, 1491 constraint-based routing may also seek to optimize overall network 1492 performance while minimizing costs. 1494 The constraints and requirements may be imposed by the network itself 1495 or by administrative policies. Constraints may include bandwidth, hop 1496 count, delay, and policy instruments such as resource class 1497 attributes. Constraints may also include domain specific attributes 1498 of certain network technologies and contexts which impose 1499 restrictions on the solution space of the routing function. Path 1500 oriented technologies such as MPLS have made constraint-based routing 1501 feasible and attractive in public IP networks. 1503 The concept of constraint-based routing within the context of MPLS 1504 traffic engineering requirements in IP networks was first defined in 1505 [AWD1]. 1507 Unlike QoS routing (see [RFC-2386] and the references therein) which 1508 generally addresses the issue of routing individual traffic flows to 1509 satisfy prescribed flow based QoS requirements subject to network 1510 resource availability, constraint-based routing is applicable to 1511 traffic aggregates as well as flows and may be subject to a wide 1512 variety of constraints which may include policy restrictions. 1514 4.5 Overview of Other IETF Projects Related to Traffic Engineering 1516 This subsection reviews a number of IETF activities pertinent to 1517 Internet traffic engineering. These activities are primarily intended 1518 to evolve the IP architecture to support new service definitions 1519 which allow preferential or differentiated treatment to be accorded 1520 to certain types of traffic. 1522 4.5.1 Integrated Services 1524 The IETF Integrated Services working group developed the integrated 1525 services (Intserv) model. This model requires resources, such as 1526 bandwidth and buffers, to be reserved a priori for a given traffic 1527 flow to ensure that the quality of service requested by the traffic 1528 flow is satisfied.
The integrated services model includes additional 1529 components beyond those used in the best-effort model such as packet 1530 classifiers, packet schedulers, and admission control. A packet 1531 classifier is used to identify flows that are to receive a certain 1532 level of service. A packet scheduler handles the scheduling of 1533 service to different packet flows to ensure that QoS commitments are 1534 met. Admission control is used to determine whether a router has the 1535 necessary resources to accept a new flow. 1537 Two services have been defined under the Integrated Services model: 1538 guaranteed service [RFC-2212] and controlled-load service [RFC-2211]. 1540 The guaranteed service can be used for applications requiring bounded 1541 packet delivery time. For this type of application, data that is 1542 delivered to the application after a pre-defined amount of time has 1543 elapsed is usually considered worthless. Therefore, guaranteed 1544 service was intended to provide a firm quantitative bound on the 1545 end-to-end packet delay for a flow. This is accomplished by 1546 controlling the queuing delay on network elements along the data flow 1547 path. The guaranteed service model does not, however, provide bounds 1548 on jitter (inter-arrival times between consecutive packets). 1550 The controlled-load service can be used for adaptive applications 1551 that can tolerate some delay but are sensitive to traffic overload 1552 conditions. This type of application typically functions 1553 satisfactorily when the network is lightly loaded but its performance 1554 degrades significantly when the network is heavily loaded. 1555 Controlled-load service therefore has been designed to provide 1556 approximately the same service as best-effort service in a lightly 1557 loaded network regardless of actual network conditions. Controlled- 1558 load service is described qualitatively in that no target values of 1559 delay or loss are specified. 1561 The main issue with the Integrated Services model has been 1562 scalability, especially in large public IP networks which may 1563 potentially have millions of active micro-flows in transit 1564 concurrently. 1566 A notable feature of the Integrated Services model is that it 1567 requires explicit signaling of QoS requirements from end systems to 1568 routers [RFC-2753]. The Resource Reservation Protocol (RSVP) performs 1569 this signaling function and is a critical component of the Integrated 1570 Services model. The RSVP protocol is described next. 1572 4.5.2 RSVP 1574 RSVP is a soft state signaling protocol [RFC-2205]. It supports 1575 receiver initiated establishment of resource reservations for both 1576 multicast and unicast flows. RSVP was originally developed as a 1577 signaling protocol within the integrated services framework for 1578 applications to communicate QoS requirements to the network and for 1579 the network to reserve relevant resources to satisfy the QoS 1580 requirements [RFC-2205]. 1582 Under RSVP, the sender or source node sends a PATH message to the 1583 receiver with the same source and destination addresses as the 1584 traffic which the sender will generate. The PATH message contains: 1585 (1) a sender Tspec specifying the characteristics of the traffic, (2) 1586 a sender Template specifying the format of the traffic, and (3) an 1587 optional Adspec which is used to support the concept of "one pass with 1588 advertising" (OPWA) [RFC-2205].
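The sender Tspec mentioned above is commonly expressed in terms of a token bucket; the following sketch (illustrative, and deliberately reduced to a rate r and bucket depth b, ignoring the peak rate and packet-size parameters also carried in an Intserv Tspec) shows how a network element could test traffic against such a specification.

   class TokenBucketTspec:
       """Conformance test against a token bucket (rate in bytes/s, depth in bytes)."""

       def __init__(self, rate, depth):
           self.rate = float(rate)      # token accumulation rate, bytes per second
           self.depth = float(depth)    # maximum bucket size, bytes
           self.tokens = float(depth)   # bucket starts full
           self.last_time = 0.0

       def conforms(self, packet_len, arrival_time):
           # Accumulate tokens for the elapsed time, capped at the bucket depth.
           elapsed = arrival_time - self.last_time
           self.tokens = min(self.depth, self.tokens + elapsed * self.rate)
           self.last_time = arrival_time
           if packet_len <= self.tokens:
               self.tokens -= packet_len
               return True      # within the Tspec
           return False         # excess traffic, outside the reservation

   # Example: 125000 bytes/s (1 Mbit/s) with an 8 kbyte bucket.
   tspec = TokenBucketTspec(rate=125000, depth=8000)
   print(tspec.conforms(1500, arrival_time=0.01))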
Every intermediate router along the 1589 path forwards the PATH Message to the next hop determined by the 1590 routing protocol. Upon receiving a PATH Message, the receiver 1591 responds with a RESV message which includes a flow descriptor used to 1592 request resource reservations. The RESV message travels to the sender 1593 or source node in the opposite direction along the path that the PATH 1594 message traversed. Every intermediate router along the path can 1595 reject or accept the reservation request of the RESV message. If the 1596 request is rejected, the rejecting router will send an error message 1597 to the receiver and the signaling process will terminate. If the 1598 request is accepted, link bandwidth and buffer space are allocated 1599 for the flow and the related flow state information is installed in 1600 the router. 1602 One of the issues with the original RSVP specification was 1603 scalability. This is because reservations were required for micro- 1604 flows, so that the amount of state maintained by network elements 1605 tends to increase linearly with the number of micro-flows. 1607 Recently, RSVP has been modified and extended in several ways to 1608 overcome the scaling problems. As a result, it is becoming a 1609 versatile signaling protocol for the Internet. For example, RSVP has 1610 been extended to reserve resources for aggregation of flows, to set 1611 up MPLS explicit label switched paths, and to perform other signaling 1612 functions within the Internet. There are also a number of proposals 1613 to reduce the amount of refresh messages required to maintain 1614 established RSVP sessions [Berger]. 1616 A number of IETF working groups have been engaged in activities 1617 related to the RSVP protocol. These include the original RSVP working 1618 group, the MPLS working group, the Resource Allocation Protocol 1619 working group, and the Policy Framework working group. 1621 4.5.3 Differentiated Services 1623 The goal of the Differentiated Services (Diffserv) effort within the 1624 IETF is to devise scalable mechanisms for categorization of traffic 1625 into behavior aggregates, which ultimately allows each behavior 1626 aggregate to be treated differently, especially when there is a 1627 shortage of resources such as link bandwidth and buffer space [RFC- 1628 2475]. One of the primary motivations for the Diffserv effort was to 1629 devise alternative mechanisms for service differentiation in the 1630 Internet that mitigate the scalability issues encountered with the 1631 Intserv model. 1633 The IETF Diffserv working group has defined a Differentiated Services 1634 field in the IP header (DS field). The DS field consists of six bits 1635 of the part of the IP header formerly known as TOS octet. The DS 1636 field is used to indicate the forwarding treatment that a packet 1637 should receive at a node [RFC-2474]. The Diffserv working group has 1638 also standardized a number of Per-Hop Behavior (PHB) groups. Using 1639 the PHBs, several classes of services can be defined using different 1640 classification, policing, shaping and scheduling rules. 1642 For an end-user of network services to receive Differentiated 1643 Services from its Internet Service Provider (ISP), it may be 1644 necessary for the user to have a Service Level Agreement (SLA) with 1645 the ISP. An SLA may explicitly or implicitly specify a Traffic 1646 Conditioning Agreement (TCA) which defines classifier rules as well 1647 as metering, marking, discarding, and shaping rules. 
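As a minimal sketch of the classifier and marking rules that such a TCA might express (the match fields, flows, and DSCP assignments below are purely illustrative; prefix matching is reduced to string comparison for brevity), an edge device could be modeled as follows.

   # EF (46) and AF11 (10) are standard DSCP code points, but their
   # assignment to these particular flows is an arbitrary example.
   DSCP_EF, DSCP_AF11, DSCP_BEST_EFFORT = 46, 10, 0

   # Classifier rules: match fields of the packet header to a behavior aggregate.
   CLASSIFIER_RULES = [
       ({"protocol": "udp", "dst_port": 5060}, DSCP_EF),        # e.g. voice signaling
       ({"src_prefix": "192.0.2.0/24"}, DSCP_AF11),             # a customer aggregate
   ]

   def classify_and_mark(packet):
       """packet: dict with keys such as 'protocol', 'dst_port', 'src_prefix'.
       Returns the DSCP with which the packet would leave the ingress node."""
       for match, dscp in CLASSIFIER_RULES:
           if all(packet.get(field) == value for field, value in match.items()):
               return dscp
       return DSCP_BEST_EFFORT

   print(classify_and_mark({"protocol": "udp", "dst_port": 5060}))   # 46

Metering, discarding, and shaping rules from the TCA would be layered on top of such a classifier, for instance with a token bucket of the kind sketched in the previous subsection.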
1649 Packets are classified, and possibly policed and shaped at the 1650 ingress to a Diffserv network. When a packet traverses the boundary 1651 between different Diffserv domains, the DS field of the packet may be 1652 re-marked according to existing agreements between the domains. 1654 Differentiated Services allows only a finite number of service 1655 classes to be indicated by the DS field. The main advantage of the 1656 Diffserv approach relative to the Intserv model is scalability. 1657 Resources are allocated on a per-class basis and the amount of state 1658 information is proportional to the number of classes rather than to 1659 the number of application flows. 1661 It should be obvious from the previous discussion that the Diffserv 1662 model essentially deals with traffic management issues on a per hop 1663 basis. The Diffserv control model consists of a collection of micro- 1664 TE control mechanisms. Other traffic engineering capabilities, such 1665 as capacity management (including routing control), are also required 1666 in order to deliver acceptable service quality in Diffserv networks. 1668 4.5.4 MPLS 1670 MPLS is an advanced forwarding scheme which also includes extensions 1671 to conventional IP control plane protocols. MPLS extends the Internet 1672 routing model and enhances packet forwarding and path control [RoVC]. 1674 At the ingress to an MPLS domain, label switching routers (LSRs) 1675 classify IP packets into forwarding equivalence classes (FECs) based 1676 on a variety of factors, including e.g. a combination of the 1677 information carried in the IP header of the packets and the local 1678 routing information maintained by the LSRs. An MPLS label is then 1679 prepended to each packet according to its forwarding equivalence 1680 class. In a non-ATM/FR environment, the label is 32 bits long and 1681 contains a 20-bit label field, a 3-bit experimental field (formerly 1682 known as Class-of-Service or CoS field), a 1-bit label stack 1683 indicator and an 8-bit TTL field. In an ATM (FR) environment, the 1684 label consists of information encoded in the VCI/VPI (DLCI) field. An 1685 MPLS capable router (an LSR) examines the label and possibly the 1686 experimental field and uses this information to make packet 1687 forwarding decisions. 1689 An LSR makes forwarding decisions by using the label prepended to 1690 packets as the index into a local next hop label forwarding entry 1691 (NHLFE). The packet is then processed as specified in the NHLFE. The 1692 incoming label may be replaced by an outgoing label, and the packet 1693 may be switched to the next LSR. This label-switching process is very 1694 similar to the label (VCI/VPI) swapping process in ATM networks. 1695 Before a packet leaves an MPLS domain, its MPLS label may be removed. 1696 A Label Switched Path (LSP) is the path between an ingress LSR and 1697 an egress LSR that a labeled packet traverses. The path of 1698 an explicit LSP is defined at the originating (ingress) node of the 1699 LSP. MPLS can use a signaling protocol such as RSVP or LDP to set up 1700 LSPs. 1702 MPLS is a very powerful technology for Internet traffic engineering 1703 because it supports explicit LSPs which allow constraint-based 1704 routing to be implemented efficiently in IP networks [AWD2]. The 1705 requirements for traffic engineering over MPLS are described in 1706 [AWD1]. Extensions to RSVP to support instantiation of explicit LSPs 1707 are discussed in [AWD3].
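A minimal sketch of the constraint-based path computation that such explicit LSPs make usable in practice is given below (illustrative only): links that cannot satisfy a bandwidth constraint are pruned, and a shortest path is then computed over the residual topology. Real implementations handle many more constraints (resource class affinities, hop limits, priorities) and tie-breaking rules.

   import heapq

   def constrained_path(links, src, dst, bandwidth):
       """links: {(u, v): {"metric": int, "avail_bw": float}}.
       Returns a list of nodes, or None if no feasible path exists."""
       # Step 1: prune links that cannot carry the requested bandwidth.
       adj = {}
       for (u, v), attrs in links.items():
           if attrs["avail_bw"] >= bandwidth:
               adj.setdefault(u, []).append((v, attrs["metric"]))
       # Step 2: shortest path (by IGP metric) over the pruned topology.
       dist, prev = {src: 0}, {}
       heap = [(0, src)]
       while heap:
           d, u = heapq.heappop(heap)
           if u == dst:
               path = [dst]
               while path[-1] != src:
                   path.append(prev[path[-1]])
               return list(reversed(path))
           if d > dist.get(u, float("inf")):
               continue
           for v, metric in adj.get(u, []):
               if d + metric < dist.get(v, float("inf")):
                   dist[v], prev[v] = d + metric, u
                   heapq.heappush(heap, (d + metric, v))
       return None   # the constraint cannot be satisfied

The path returned would then be signaled as an explicit LSP using, for example, the RSVP extensions noted above.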
Extensions to LDP, known as CR-LDP, to 1708 support explicit LSPs are presented in [JAM]. 1710 4.5.5 IP Performance Metrics 1712 The IETF IP Performance Metrics (IPPM) working group has been 1713 developing a set of standard metrics that can be used to monitor the 1714 quality, performance, and reliability of Internet services. These 1715 metrics can be applied by network operators, end-users, and 1716 independent testing groups to provide users and service providers 1717 with a common understanding of the performance and reliability of the 1718 Internet component 'clouds' they use/provide [RFC2330]. The criteria 1719 for performance metrics developed by the IPPM WG are described in 1720 [RFC2330]. Examples of performance metrics include one-way packet 1721 loss [RFC2680], one-way delay [RFC2679], and connectivity measures 1722 between two nodes [RFC2678]. Other metrics include second-order 1723 measures of packet loss and delay. 1725 Some of the performance metrics specified by the IPPM WG are useful 1726 for specifying Service Level Agreements (SLAs). SLAs are sets of 1727 service level objectives negotiated between users and service 1728 providers, wherein each objective is a combination of one or more 1729 performance metrics possibly subject to certain constraints. 1731 4.5.6 Flow Measurement 1733 The IETF Real Time Flow Measurement (RTFM) working group has produced 1734 an architecture document defining a method to specify traffic flows 1735 as well as a number of components for flow measurement (meters, meter 1736 readers, managers) [RFC-2722]. A flow measurement system enables 1737 network traffic flows to be measured and analyzed at the flow level 1738 for a variety of purposes. As noted in RFC-2722, a flow measurement 1739 system can be very useful in the following contexts: (1) 1740 understanding the behavior of existing networks, (2) planning for 1741 network development and expansion, (3) quantification of network 1742 performance, (4) verifying the quality of network service, and (5) 1743 attribution of network usage to users [RFC-2722]. 1745 A flow measurement system consists of meters, meter readers, and 1746 managers. A meter observes packets passing through a measurement 1747 point, classifies them into certain groups, accumulates certain usage 1748 data (such as the number of packets and bytes for each group), and 1749 stores the usage data in a flow table. A group may represent a user 1750 application, a host, a network, a group of networks, etc. A meter 1751 reader gathers usage data from various meters so it can be made 1752 available for analysis. A manager is responsible for configuring and 1753 controlling meters and meter readers. The instructions received by a 1754 meter from a manager include flow specifications, meter control 1755 parameters, and sampling techniques. The instructions received by a 1756 meter reader from a manager include the address of the meter whose 1757 data is to be collected, the frequency of data collection, and the 1758 types of flows to be collected. 1760 4.5.7 Endpoint Congestion Management 1762 The IETF Endpoint Congestion Management working group is intended to 1763 provide a set of congestion control mechanisms that transport 1764 protocols can use. It is also intended to develop mechanisms for 1765 unifying congestion control across a subset of an endpoint's active 1766 unicast connections (called a congestion group).
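Referring back to the flow measurement architecture in Section 4.5.6, the following sketch (with illustrative names and a deliberately simple grouping rule) shows the essential bookkeeping performed by a meter: each observed packet is classified into a group, and packet and byte counts are accumulated in a flow table.

   class SimpleFlowMeter:
       """Toy meter: groups packets by (source network, destination network)."""

       def __init__(self):
           self.flow_table = {}     # {(src_net, dst_net): {"packets": n, "bytes": b}}

       @staticmethod
       def group_of(packet):
           # Grouping rule: the /24 network of the source and destination addresses.
           src_net = ".".join(packet["src"].split(".")[:3]) + ".0/24"
           dst_net = ".".join(packet["dst"].split(".")[:3]) + ".0/24"
           return (src_net, dst_net)

       def observe(self, packet):
           key = self.group_of(packet)
           entry = self.flow_table.setdefault(key, {"packets": 0, "bytes": 0})
           entry["packets"] += 1
           entry["bytes"] += packet["length"]

   meter = SimpleFlowMeter()
   meter.observe({"src": "192.0.2.1", "dst": "198.51.100.7", "length": 1500})
   print(meter.flow_table)

A meter reader would periodically collect such tables under the direction of a manager, as described above.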
A congestion 1767 manager continuously monitors the state of the path for each 1768 congestion group under its control. The manager uses that 1769 information to instruct a scheduler on how to partition bandwidth 1770 among the connections of that congestion group. 1772 4.6 Overview of ITU Activities Related to Traffic Engineering 1774 This section provides an overview of prior work within the ITU-T 1775 pertaining to traffic engineering in traditional telecommunications 1776 networks. 1778 ITU-T Recommendations E.600 [itu-e600], E.701 [itu-e701], and E.801 1779 [itu-e801] address traffic engineering issues in traditional 1780 telecommunications networks. Recommendation E.600 provides a 1781 vocabulary for describing traffic engineering concepts, while E.701 1782 defines reference connections, Grade of Service (GOS), and traffic 1783 parameters for ISDN. Recommendation E.701 uses the concept of a 1784 reference connection to identify representative cases of different 1785 types of connections without describing the specifics of their actual 1786 realizations by different physical means. As defined in 1787 Recommendation E.600, "a connection is an association of resources 1788 providing means for communication between two or more devices in, or 1789 attached to, a telecommunication network." Also, E.600 defines "a 1790 resource as any set of physically or conceptually identifiable 1791 entities within a telecommunication network, the use of which can be 1792 unambiguously determined" [itu-e600]. There can be different types 1793 of connections as the number and types of resources in a connection 1794 may vary. 1796 Typically, different network segments are involved in the path of a 1797 connection. For example, a connection may be local, national, or 1798 international. The purposes of reference connections are to clarify 1799 and specify traffic performance issues at various interfaces between 1800 different network domains. Each domain may consist of one or more 1801 service provider networks. 1803 Reference connections provide a basis to define grade of service 1804 (GoS) parameters related to traffic engineering within the ITU-T 1805 framework. As defined in E.600, "GoS refers to a number of traffic 1806 engineering variables which are used to provide a measure of the 1807 adequacy of a group of resources under specified conditions." These 1808 GoS variables may be probability of loss, dial tone, delay, etc. 1809 They are essential for network internal design and operation as well 1810 as for component performance specification. 1812 GoS is different from quality of service (QoS) in the ITU framework. 1813 QoS is the performance perceivable by a telecommunication service 1814 user and expresses the user's degree of satisfaction of the service. 1815 QoS parameters focus on performance aspects observable at the service 1816 access points and network interfaces, rather than their causes within 1817 the network. GoS, on the other hand, is a set of network oriented 1818 measures which characterize the adequacy of a group of resources 1819 under specified conditions. For a network to be effective in serving 1820 its users, the values of both GoS and QoS parameters must be related, 1821 with GoS parameters typically making a major contribution to the QoS. 
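Although the Recommendations themselves do not prescribe any particular computation, a classical illustration of a GoS variable of the "probability of loss" type is the Erlang B blocking probability for a group of circuits offered a given load; a small sketch follows (the traffic values are arbitrary examples).

   def erlang_b(offered_load, circuits):
       """Blocking probability for 'offered_load' erlangs offered to 'circuits'
       circuits, computed with the standard recurrence
       B(0) = 1, B(n) = a*B(n-1) / (n + a*B(n-1))."""
       b = 1.0
       for n in range(1, circuits + 1):
           b = offered_load * b / (n + offered_load * b)
       return b

   # Example: 30 circuits offered 25 erlangs of traffic.
   print(round(erlang_b(25.0, 30), 4))

End-to-end GoS targets of this kind can then be apportioned to the individual resource components of a reference connection, which is precisely the dimensioning use described next.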
1823 Recommendation E.600 stipulates that a set of GoS parameters must be 1824 selected and defined on an end-to-end basis for each major service 1825 category provided by a network to assist the network provider improve 1826 efficiency and effectiveness of the network. Based on a selected set 1827 of reference connections, suitable target values are assigned to the 1828 selected GoS parameters under normal and high load conditions. These 1829 end-to-end GoS target values are then apportioned to individual 1830 resource components of the reference connections for dimensioning 1831 purposes. 1833 5.0 Taxonomy of Traffic Engineering Systems 1835 This section presents a short taxonomy of traffic engineering 1836 systems. A taxonomy of traffic engineering systems can be constructed 1837 based on traffic engineering styles and views as listed below: 1839 - Time-dependent vs State-dependent vs Event-dependent 1840 - Offline vs Online 1841 - Centralized vs Distributed 1842 - Local vs Global Information 1843 - Prescriptive vs Descriptive 1844 - Open Loop vs Closed Loop 1845 - Tactical vs Strategic 1847 These classification systems are described in greater detail in the 1848 following subsections of this document. 1850 5.1 Time-Dependent Versus State-Dependent Versus Event Dependent 1852 Traffic engineering methodologies can be classified as time-dependent 1853 or state-dependent. All TE schemes are considered to be dynamic in 1854 this framework. Static TE implies that no traffic engineering 1855 methodology or algorithm is being applied. 1857 In the time-dependent TE, historical information based on seasonal 1858 variations in traffic is used to pre-program routing plans and other 1859 TE control mechanisms. Additionally, customer subscription or 1860 traffic projection may be used. Pre-programmed routing plans 1861 typically change on a relatively long time scale (e.g., diurnal). 1863 Time-dependent algorithms do not attempt to adapt to random 1864 variations in traffic or changing network conditions. An example of a 1865 time-dependent algorithm is a global centralized optimizer where the 1866 input to the system is a traffic matrix and multiclass QoS 1867 requirements as described [MR99]. 1869 State-dependent TE adapts the routing plans for packets based on the 1870 current state of the network. The current state of the network 1871 provides additional information on variations in actual traffic 1872 (i.e., perturbations from regular variations) that could not be 1873 predicted using historical information. Constraint-based routing is 1874 an example of state-dependent TE operating in a relatively long time 1875 scale. An example operating in a relatively short time scale is a 1876 load-balancing algorithm described in [OMP] and [MATE]. 1878 The state of the network can be based on parameters such as 1879 utilization, packet delay, packet loss, etc. These parameters can be 1880 obtained in several ways. For example, each router may flood these 1881 parameters periodically or by means of some kind of trigger to other 1882 routers. Another approach is for a particular router performing 1883 adaptive TE to send probe packets along a path to gather the state of 1884 that path. Still another approach is for a management system to 1885 gather relevant information from network elements. 1887 Expeditious and accurate gathering and distribution of state 1888 information is critical for adaptive TE due to the dynamic nature of 1889 network conditions. 
State-dependent algorithms may be applied to 1890 increase network efficiency and resilience. Time-dependent algorithms 1891 are more suitable for predictable traffic variations. On the other 1892 hand, state-dependent algorithms are more suitable for adapting to 1893 the prevailing network state. 1895 Event-dependent TE methods can also be used for TE path selection. 1896 Event-dependent TE methods are distinct from time-dependent and 1897 state-dependent TE methods in the manner in which paths are selected. 1898 These algorithms are adaptive and distributed in nature and typically 1899 use learning models to find good paths for TE in a network. While 1900 state-dependent TE models typically use available-link-bandwidth 1901 (ALB) flooding for TE path selection, event-dependent TE methods do 1902 not require ALB flooding. Rather, event-dependent TE methods 1903 typically search out capacity by learning models, as in the success- 1904 to-the-top (STT) method. ALB flooding can be resource intensive, 1905 since it requires link bandwidth to carry LSAs, processor capacity to 1906 process LSAs, and the overhead can limit area/autonomous system (AS) 1907 size. Modeling results suggest that event-dependent TE methods can 1908 lead to a reduction in ALB flooding overhead without loss of network 1909 throughput performance [ASH3]. 1911 As an example of event-dependent methods, consider an MPLS network 1912 that uses a success-to-the-top (STT) event-dependent TE method. In 1913 this case, if the bandwidth between two label switching routers (say 1914 LSR-A to LSR-B) needs to be modified, say increased by delta-BW, the 1915 primary LSP-p is tried first. If delta-BW is not available on one or 1916 more links of LSP-p, then the currently successful LSP-s is tried 1917 next. If delta-BW is not available on one or more links of LSP-s, 1918 then a new LSP is searched by trying additional candidate paths until 1919 a new successful LSP-n is found or the candidate paths are exhausted. 1920 LSP-n is then marked as the currently successful path for the next 1921 time bandwidth needs to be modified. 1923 5.2 Offline Versus Online 1925 Traffic engineering requires the computation of routing plans. The 1926 computation may be performed offline or online. The computation can 1927 be done offline for scenarios where routing plans need not be 1928 executed in real-time. For example, routing plans computed from 1929 forecast information may be computed offline. Typically, offline 1930 computation is also used to perform extensive searches on multi- 1931 dimensional solution spaces. 1933 Online computation is required when the routing plans must adapt to 1934 changing network conditions as in state-dependent algorithms. Unlike 1935 offline computation (which can be computationally demanding), online 1936 computation is geared toward relative simple and fast calculations to 1937 select routes, fine-tune the allocations of resources, and perform 1938 load balancing. 1940 5.3 Centralized Versus Distributed 1942 Centralized control has a central authority which determines routing 1943 plans and perhaps other TE control parameters on behalf of each 1944 router. The central authority collects the network-state information 1945 from all routers periodically and returns the routing information to 1946 the routers. The routing update cycle is a critical parameter 1947 directly impacting the performance of the network being controlled. 
1948 Centralized control may need high processing power and high bandwidth 1949 control channels. 1951 Distributed control determines route selection by each router 1952 autonomously based on the router's view of the state of the network. 1953 The network state information may be obtained by the router using a 1954 probing method or distributed by other routers on a periodic basis 1955 using link state advertisements. Network state information may also 1956 be disseminated under exceptional conditions. 1958 5.4 Local Versus Global 1960 Traffic engineering algorithms may require local or global network- 1961 state information. Note that the scope of network-state information 1962 does not necessarily refer to the scope of the optimization. In other 1963 words, it is possible for a TE algorithm to perform global 1964 optimization based on local state information. Similarly, a TE 1965 algorithm may arrive at a locally optimum solution even if it relies 1966 on global state information. 1968 Local information pertains to the state of a portion of the domain. 1969 Examples include the bandwidth and packet loss rate of a particular 1970 path. Local state information may be sufficient for certain 1971 instances of TE with distributed control. 1973 Global information pertains to the state of the entire domain 1974 undergoing traffic engineering. Examples include a global traffic 1975 matrix and loading information on each link throughout the domain of 1976 interest. Global state information is typically required with 1977 centralized control. Distributed TE systems may also need global 1978 information in some cases. 1980 5.5 Prescriptive Versus Descriptive 1982 TE systems may also be classified as prescriptive or descriptive. 1984 Prescriptive traffic engineering evaluates alternatives and 1985 recommends a course of action. Prescriptive traffic engineering can 1986 be further categorized as either corrective or perfective. Corrective 1987 TE prescribes a course of action to address an existing or predicted 1988 anomaly. Perfective TE prescribes a course of action to evolve and 1989 improve network performance even when no anomalies are evident. 1991 Descriptive traffic engineering, on the other hand, characterizes the 1992 state of the network and assesses the impact of various policies 1993 without recommending any particular course of action. 1995 5.6 Open-Loop Versus Closed-Loop 1997 Open-loop traffic engineering control is control in which the control action does 1998 not use feedback information from the current network state. The 1999 control action may use its own local information for accounting 2000 purposes, however. 2002 Closed-loop traffic engineering control is control in which the control action 2003 utilizes feedback information from the network state. The feedback 2004 information may be in the form of historical information or current 2005 measurement. 2007 5.7 Tactical Versus Strategic 2009 Tactical traffic engineering aims to address specific performance 2010 problems (such as hot-spots) that occur in the network from a 2011 tactical perspective, without consideration of overall strategic 2012 imperatives. Without proper planning and insights, tactical TE tends 2013 to be ad hoc in nature. 2015 Strategic traffic engineering approaches the TE problem from a more 2016 organized and systematic perspective, taking into consideration the 2017 immediate and longer term consequences of specific policies and 2018 actions.
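Before leaving this taxonomy, the event-dependent (STT) path selection outlined in Section 5.1 can be made concrete with the following sketch (illustrative names; the bandwidth test is a stand-in for whatever admission check the network actually performs).

   def stt_select_lsp(primary, current_success, candidates, delta_bw, has_bw):
       """Return the LSP to use for an increase of delta_bw, or None if blocked.
       has_bw(lsp, delta_bw) answers whether every link of 'lsp' can take delta_bw."""
       # 1. Try the primary LSP first.
       if has_bw(primary, delta_bw):
           return primary
       # 2. Then try the currently successful LSP, if one is remembered.
       if current_success is not None and has_bw(current_success, delta_bw):
           return current_success
       # 3. Otherwise search the remaining candidate paths, in order, until one
       #    succeeds; the caller records it as the new "currently successful"
       #    LSP for the next bandwidth modification.
       for lsp in candidates:
           if has_bw(lsp, delta_bw):
               return lsp
       return None   # candidate paths exhausted; the request is blocked

   # Illustrative check: map each LSP name to the bandwidth free on its
   # most constrained link.
   links_free = {"LSP-p": 2.0, "LSP-s": 10.0, "LSP-c1": 1.0, "LSP-c2": 40.0}
   def check(lsp, bw):
       return links_free[lsp] >= bw
   print(stt_select_lsp("LSP-p", "LSP-s", ["LSP-c1", "LSP-c2"], 5.0, check))

No available-link-bandwidth flooding is consulted anywhere in this procedure, which is exactly the property that distinguishes event-dependent methods from the state-dependent methods described earlier.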
2020 6.0 Requirements for Internet Traffic Engineering 2022 This section describes high level requirements and recommendations 2023 for traffic engineering in the Internet. These requirements are 2024 presented in very general terms because this is a framework document. 2025 Additional documents to follow may elaborate on specific aspects of 2026 these requirements. 2028 A traffic engineering requirement is a capability needed to solve a 2029 traffic engineering problem or to achieve a traffic engineering 2030 objective. Broadly speaking, these requirements can be categorized as 2031 either non-functional or functional requirements. 2033 Non-functional requirements for Internet traffic engineering relate 2034 to the quality attributes or state characteristics of a traffic 2035 engineering system. Non-functional traffic engineering requirements 2036 may contain conflicting assertions and may sometimes be difficult to 2037 quantify precisely. 2039 Functional requirements for Internet traffic engineering stipulate 2040 the functions that a traffic engineering system should perform. These 2041 functions are needed to realize traffic engineering objectives by 2042 addressing traffic engineering problems. 2044 6.1 Generic Non-functional Requirements 2046 The generic non-functional requirements for Internet traffic 2047 engineering include: usability, automation, scalability, stability, 2048 visibility, simplicity, efficiency, reliability, survivability, 2049 correctness, maintainability, extensibility, interoperability, and 2050 security. In a given context, some of these non-functional 2051 requirements may be critical while others may be optional. Therefore, 2052 prioritization may be required during the development phase of a 2053 traffic engineering system (or components thereof) to tailor it to a 2054 specific operational context. 2056 In the following paragraphs, some of the aspects of the non- 2057 functional requirements for Internet traffic engineering are 2058 summarized. 2060 Usability: Usability is a human factors aspect of traffic engineering 2061 systems. Usability refers to the ease with which a traffic 2062 engineering system can be deployed and operated. In general, it is 2063 desirable to have a TE system that can be readily deployed in an 2064 existing network. It is also desirable to have a TE system that is 2065 easy to operate and maintain. 2067 Automation: Whenever feasible, a traffic engineering system should 2068 automate as much of the traffic engineering functions as possible to 2069 minimize the amount of human effort needed to control and analyze 2070 operational networks. Automation is particularly imperative in large 2071 scale public networks because of the high cost of the human aspects 2072 of network operations and the high risk of network problems caused by 2073 human errors. Automation may entail the incorporation of automatic 2074 feedback and intelligence into some components of the traffic 2075 engineering system. 2077 Scalability: Contemporary public networks are growing very fast with 2078 respect to network size and traffic volume. Therefore, a TE system 2079 should be scalable to remain applicable as the network evolves. In 2080 particular, a TE system should remain functional as the network 2081 expands with regard to the number of routers and links, and with 2082 respect to the traffic volume. 
A TE system should have a scalable 2083 architecture, should not adversely impair other functions and 2084 processes in a network element, and should not consume too much 2085 network resources when collecting and distributing state information 2086 or when exerting control. 2088 Stability: Stability is a very important consideration in traffic 2089 engineering systems that respond to changes in the state of the 2090 network. State-dependent traffic engineering methodologies typically 2091 mandate a tradeoff between responsiveness and stability. It is 2092 strongly recommended that when tradeoffs are warranted between 2093 responsiveness and stability, that the tradeoff should be made in 2094 favor of stability (especially in public IP backbone networks). 2096 Flexibility: A TE system should be flexible to allow for changes in 2097 optimization policy. In particular, a TE system should provide 2098 sufficient configuration options so that a network administrator can 2099 tailor the TE system to a particular environment. It may also be 2100 desirable to have both online and offline TE subsystems which can be 2101 independently enabled and disabled. TE systems that are used in 2102 multi-class networks should also have options to support class based 2103 performance evaluation and optimization. 2105 Visibility: As part of the TE system, mechanisms should exist to 2106 collect statistics from the network and to analyze these statistics 2107 to determine how well the network is functioning. Derived statistics 2108 such as traffic matrices, link utilization, latency, packet loss, and 2109 other performance measures of interest which are determined from 2110 network measurements can be used as indicators of prevailing network 2111 conditions. Other examples of status information which should be 2112 observed include existing functional routing information 2113 (additionally, in the context of MPLS existing LSP routes), etc. 2115 Simplicity: Generally, a TE system should be as simple as possible 2116 consistent with the intended applications. More importantly, the TE 2117 system should be relatively easy to use (i.e., clean, convenient, and 2118 intuitive user interfaces). Simplicity in user interface does not 2119 necessarily imply that the TE system will use naive algorithms. Even 2120 when complex algorithms and internal structures are used, such 2121 complexities should be hidden as much as possible from the network 2122 administrator through the user interface. 2124 Survivability: It is critical for an operational network to recover 2125 promptly from network failures and to maintain the required QoS for 2126 existing services. Survivability generally mandates introducing 2127 redundancy into the architecture, design, and operation of networks. 2128 There is a tradeoff between the level of survivability that can be 2129 attained and the cost required to attain it. The time required to 2130 restore a network service from a failure depends on several factors, 2131 including the particular context in which the failure occurred, the 2132 architecture and design of network, the characteristics of the 2133 network elements and network protocols, the applications and services 2134 that were impacted by the failure, etc. 
The extent and impact of service disruptions due to a network failure or outage can vary depending on the length of the outage, the part of the network where the failure occurred, the type and criticality of the network resources that were impaired by the failure, and the types of services that were impacted by the failure (e.g., voice quality degradation following network impairments may be tolerable for an inexpensive VoIP service, but may not be tolerable for a toll-quality VoIP service). Survivability can be addressed at the device level by developing network elements that are more reliable, and at the network level by incorporating redundancy into the architecture, design, and operation of networks. It is recommended that a philosophy of robustness and survivability should be adopted in the architecture, design, and operation of traffic engineering systems that control IP networks (especially public IP networks). Because different contexts may demand different levels of survivability, the mechanisms developed to support network survivability should be flexible so that they can be tailored to different needs.

Interoperability: Whenever feasible, traffic engineering systems and their components should be developed with open, standards-based interfaces to allow interoperation with other systems and components.

Security: Security is a critical consideration in traffic engineering systems that optimize network performance. Such traffic engineering systems typically exert control over certain functional aspects of the network to achieve the desired performance objectives. Therefore, adequate measures must be taken to safeguard the integrity of the traffic engineering system. Adequate measures must also be taken to protect the network from vulnerabilities that originate from security breaches and other impairments within the traffic engineering system.

The remainder of this section will focus on some of the high level functional requirements for traffic engineering.

6.2 Routing Requirements

Routing control is a significant aspect of Internet traffic engineering. Routing impacts many of the key performance measures associated with networks, such as throughput, delay, and utilization. Generally, it is very difficult to provide good service quality in a wide area network without effective routing control. A desirable routing system is one that takes traffic characteristics and network constraints into account during route selection while maintaining stability.

Traditional shortest path first (SPF) interior gateway protocols are based on shortest path algorithms and have limited control capabilities for traffic engineering [AWD1, AWD2]. These limitations include:

1. The well known issues with pure SPF protocols, which do not take network constraints and traffic characteristics into account during route selection. For example, since IGPs always use the shortest paths (based on administratively assigned link metrics) to forward traffic, load sharing cannot be accomplished among paths of different costs.
Using shortest paths to forward traffic conserves network resources, but may cause the following problems: 1) If traffic from a source to a destination exceeds the capacity of a link along the shortest path, the link (hence the shortest path) becomes congested while a longer path between these two nodes may be under-utilized; 2) the shortest paths from different sources can overlap at some links. If the total traffic from the sources exceeds the capacity of any of these links, congestion will occur. Problems can also occur because traffic demand changes over time but network topology and routing configuration cannot be changed as rapidly. This causes the network topology and routing configuration to become suboptimal over time, which may result in persistent congestion problems.

2. The Equal-Cost Multi-Path (ECMP) capability of SPF IGPs supports sharing of traffic among equal cost paths between two nodes. However, ECMP attempts to divide the traffic as equally as possible among the equal cost shortest paths. Generally, ECMP does not support configurable load sharing ratios among equal cost paths. The result is that one of the paths may carry significantly more traffic than other paths because it may also carry traffic from other sources. This situation can result in congestion along the path that carries more traffic.

3. Modifying IGP metrics to control traffic routing tends to have network-wide effects. Consequently, undesirable and unanticipated traffic shifts can be triggered as a result.

Because of these limitations, new capabilities are needed to enhance the routing function in IP networks. Some of these capabilities have been described elsewhere and are summarized below.

Constraint-based routing is desirable to evolve the routing architecture of IP networks, especially public IP backbones with complex topologies [AWD1]. Constraint-based routing computes routes to fulfill requirements subject to constraints. Constraints may include bandwidth, hop count, delay, and administrative policy instruments such as resource class attributes [AWD1, RFC-2386]. This makes it possible to select routes that satisfy a given set of requirements subject to network and administrative policy constraints. Routes computed through constraint-based routing are not necessarily the shortest paths. Constraint-based routing works best with path oriented technologies that support explicit routing, such as MPLS.
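The following Python sketch illustrates one common realization of this idea, often referred to as constrained shortest path first (CSPF): links that cannot satisfy the constraints (here a bandwidth requirement and an excluded resource class) are pruned, and a shortest path by metric is computed over what remains. The graph representation and attribute names are assumptions of the sketch, not structures defined by this framework.

   # Illustrative constrained path computation (CSPF-style), not a
   # normative algorithm. links maps node -> list of tuples
   # (neighbor, metric, reservable_bw, resource_class).

   import heapq

   def cspf(links, src, dst, required_bw, excluded_classes=frozenset()):
       """Shortest path by metric over links satisfying the constraints."""
       dist, prev = {src: 0}, {}
       heap = [(0, src)]
       while heap:
           d, node = heapq.heappop(heap)
           if node == dst:
               break
           if d > dist.get(node, float("inf")):
               continue
           for nbr, metric, bw, rclass in links.get(node, []):
               if bw < required_bw or rclass in excluded_classes:
                   continue                    # prune infeasible links
               nd = d + metric
               if nd < dist.get(nbr, float("inf")):
                   dist[nbr], prev[nbr] = nd, node
                   heapq.heappush(heap, (nd, nbr))
       if dst not in dist:
           return None                         # no feasible path exists
       path, node = [dst], dst
       while node != src:
           node = prev[node]
           path.append(node)
       return list(reversed(path))

Note that the returned path is not necessarily the shortest path in the unconstrained topology, which is precisely the point made above.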
Constraint-based routing can also be used as a way to redistribute traffic onto the infrastructure (even for best effort traffic). For example, if the bandwidth requirements for path selection and reservable bandwidth attributes of network links are appropriately defined and configured, then congestion problems caused by uneven traffic distribution may be avoided or reduced. In this way, the performance and efficiency of the network can be improved.

A number of enhancements are needed to conventional link state IGPs, such as OSPF and IS-IS, to allow them to distribute additional state information required for constraint-based routing. The basic extensions required are outlined in [Li-IGP]. Specializations of these requirements to OSPF were described in [KATZ] and to IS-IS in [SMIT]. Essentially, these enhancements require the propagation of additional information in link state advertisements. Specifically, in addition to normal link-state information, an enhanced IGP is required to propagate topology state information needed for constraint-based routing. Some of the additional topology state information includes link attributes such as reservable bandwidth and the link resource class attribute (an administratively specified property of the link). The resource class attribute concept was defined in [AWD1]. The additional topology state information is carried in new TLVs and sub-TLVs in IS-IS, or in the Opaque LSA in OSPF [SMIT, KATZ].

An enhanced link-state IGP may flood information more frequently than a normal IGP. This is because even without changes in topology, changes in reservable bandwidth or link affinity can trigger the enhanced IGP to initiate flooding. A tradeoff is typically required between the timeliness of the information flooded and the flooding frequency to avoid consuming excessive link bandwidth and computational resources, and more importantly to avoid instability.

In a TE system, it is also desirable for the routing subsystem to make the load splitting ratio among multiple paths (with equal cost or different cost) configurable. This capability gives network administrators more flexibility in the control of traffic distribution across the network. It can be very useful for avoiding/relieving congestion in certain situations. Examples can be found in [XIAO].

The routing system should also have the capability to control the routes of subsets of traffic without affecting the routes of other traffic if sufficient resources exist for this purpose. This capability allows a more refined control over the distribution of traffic across the network. For example, the ability to move traffic from a source to a destination away from its original path to another path (without affecting other traffic paths) allows traffic to be moved from resource-poor network segments to resource-rich segments. Path oriented technologies such as MPLS inherently support this capability as discussed in [AWD2].

Additionally, the routing subsystem should be able to select different paths for different classes of traffic (or for different traffic behavior aggregates) if the network supports multiple classes of service (different behavior aggregates).

6.3 Traffic Mapping Requirements

Traffic mapping pertains to the assignment of traffic workload onto pre-established paths to meet certain requirements. Thus, while constraint-based routing deals with path selection, traffic mapping deals with the assignment of traffic to established paths which may have been selected by constraint-based routing or by some other means. Traffic mapping can be performed by time-dependent or state-dependent mechanisms, as described in Section 5.1.

An important aspect of the traffic mapping function is the ability to establish multiple paths between an originating node and a destination node, and the capability to distribute the traffic between the two nodes across the paths according to some policies.
A pre-condition for this scheme is the existence of flexible mechanisms to partition traffic and then assign the traffic partitions onto the parallel paths. This requirement was noted in [AWD1]. When traffic is assigned to multiple parallel paths, it is recommended that special care should be taken to ensure proper ordering of packets belonging to the same application (or micro-flow) at the destination node of the parallel paths.

As a general rule, mechanisms that perform the traffic mapping functions should aim to map the traffic onto the network infrastructure to minimize congestion. If the total traffic load cannot be accommodated, or if the routing and mapping functions cannot react fast enough to changing traffic conditions, then a traffic mapping system may rely on short time scale congestion control mechanisms (such as queue management, scheduling, etc.) to mitigate congestion. Thus, mechanisms that perform the traffic mapping functions should complement existing congestion control mechanisms. In an operational network it is generally desirable to map the traffic onto the infrastructure such that intra-class and inter-class resource contention are minimized.

When traffic mapping techniques that depend on dynamic state feedback (e.g., MATE, OMP, and the like) are used, special care must be taken to guarantee network stability.

6.4 Measurement Requirements

The importance of measurement in traffic engineering has been discussed throughout this document. Mechanisms should be provided to measure and collect statistics from the network to support the traffic engineering function. Additional capabilities may be needed to help in the analysis of the statistics. The actions of these mechanisms should not adversely affect the accuracy and integrity of the statistics collected. The mechanisms for statistical data acquisition should also be able to scale as the network evolves.

Traffic statistics may be classified according to long-term or short-term time scales. Long-term time scale traffic statistics are very useful for traffic engineering. Long-term time scale traffic statistics may capture or reflect seasonality in network workload (hourly, daily, and weekly variations in traffic profiles) as well as traffic trends. Aspects of the monitored traffic statistics may also depict class of service characteristics for a network supporting multiple classes of service. Analysis of the long-term traffic statistics may yield secondary statistics such as busy hour characteristics, traffic growth patterns, persistent congestion problems, hot-spots, and imbalances in link utilization caused by routing anomalies.

A mechanism for constructing traffic matrices for both long-term and short-term traffic statistics should be in place. In multiservice IP networks, the traffic matrices may be constructed for different service classes. Each element of a traffic matrix represents a statistic of traffic flow between a pair of abstract nodes. An abstract node may represent a router, a collection of routers, or a site in a VPN.
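The following Python sketch shows one simple way such a traffic matrix might be accumulated from flow records. The record fields (ingress, egress, class, bytes) are illustrative assumptions; an operational system would derive them from whatever measurement infrastructure is actually deployed.

   # Illustrative accumulation of per-class traffic matrices from flow
   # records; the field names are assumptions, not defined by this document.

   from collections import defaultdict

   def build_traffic_matrices(flow_records):
       """Return {service_class: {(ingress, egress): total_bytes}}."""
       matrices = defaultdict(lambda: defaultdict(int))
       for rec in flow_records:
           pair = (rec["ingress"], rec["egress"])   # pair of abstract nodes
           matrices[rec["class"]][pair] += rec["bytes"]
       return {cls: dict(m) for cls, m in matrices.items()}

   # Example: two records between abstract nodes A and B aggregate to 2000.
   records = [
       {"ingress": "A", "egress": "B", "class": "best-effort", "bytes": 1200},
       {"ingress": "A", "egress": "B", "class": "best-effort", "bytes": 800},
   ]
   assert build_traffic_matrices(records)["best-effort"][("A", "B")] == 2000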
Measured traffic statistics should provide reasonable and reliable indicators of the current state of the network on the short-term scale. Some short term traffic statistics may reflect link utilization and link congestion status. Examples of congestion indicators include excessive packet delay, packet loss, and high resource utilization. Examples of mechanisms for distributing this kind of information include SNMP, probing techniques, FTP, IGP link state advertisements, etc.

6.5 Network Survivability

Network survivability refers to the capability of a network to maintain service continuity in the presence of faults. This can be accomplished by promptly recovering from network impairments and maintaining the required QoS for existing services after recovery.

Survivability has become an issue of great concern within the Internet community due to the increasing demands to carry mission critical traffic, real-time traffic, and other high priority traffic over the Internet. Failure protection and restoration capabilities have become available from multiple layers as network technologies have continued to improve. At the bottom of the layered stack, optical networks are now capable of providing dynamic ring and mesh restoration functionality at the wavelength level as well as traditional protection functionality. At the SONET/SDH layer, survivability capability is provided with Automatic Protection Switching (APS) as well as self-healing ring and mesh architectures. Similar functionality is provided by layer 2 technologies such as ATM (generally with slower mean restoration times). Rerouting is traditionally used at the IP layer to restore service following link and node outages. Rerouting at the IP layer occurs after a period of routing convergence which may require seconds to minutes to complete. Some new developments in the MPLS context make it possible to achieve recovery at the IP layer prior to convergence.

To support advanced survivability requirements, path-oriented technologies such as MPLS can be used to enhance the survivability of IP networks in a potentially cost effective manner. The advantages of path oriented technologies such as MPLS for IP restoration become even more evident when class based protection and restoration capabilities are required.

Recently, a common suite of control plane protocols has been proposed for both MPLS and optical transport networks under the term Multiprotocol Lambda Switching [AWD6]. This new paradigm of Multiprotocol Lambda Switching will support even more sophisticated mesh restoration capabilities at the optical layer for the emerging IP over WDM network architectures.

Another important aspect regarding multi-layer survivability is that technologies at different layers provide protection and restoration capabilities at different temporal granularities (in terms of time scales) and at different bandwidth granularities (from packet level to wavelength level). Protection and restoration capabilities can also be sensitive to different service classes and different network utility models.

The impact of service outages varies significantly for different service classes depending upon the effective duration of the outage. The duration of an outage can vary from milliseconds (with minor service impact) to seconds (with possible call drops for IP telephony and session time-outs for connection oriented transactions) to minutes and hours (with potentially considerable social and business impact).
2429 Coordinating different protection and restoration capabilities across 2430 multiple layers in a cohesive manner to ensure network survivability 2431 is maintained at reasonable cost is a challenging task. Protection 2432 and restoration coordination across layers may not always be 2433 feasible, because networks at different layers may belong to 2434 different administrative domains. 2436 The following paragraphs present some of the general requirements for 2437 protection and restoration coordination. 2439 - Protection and restoration capabilities from different layers 2440 should be coordinated whenever feasible and appropriate to 2441 provide network survivability in a flexible and cost effective 2442 manner. Minimization of function duplication across layers is 2443 one way to achieve the coordination. Escalation of alarms and 2444 other fault indicators from lower to higher layers may also 2445 be performed in a coordinated manner. A temporal order of 2446 restoration trigger timing at different layers is another way 2447 to coordinate multi-layer protection/restoration. 2449 - Spare capacity at higher layers is often regarded as working 2450 traffic at lower layers. Placing protection/restoration 2451 functions in many layers may increase redundancy and robustness, 2452 but it should not result in significant and avoidable 2453 inefficiencies in network resource utilization. 2455 - It is generally desirable to have protection and restoration 2456 schemes that are bandwidth efficient. 2458 - Failure notification throughout the network should be timely 2459 and reliable. 2461 - Alarms and other fault monitoring and reporting capabilities 2462 should be provided at appropriate layers. 2464 6.5.1 Survivability in MPLS Based Networks 2466 MPLS is an important emerging technology that enhances IP networks in 2467 terms of features, capabilities, and services. Because MPLS is path- 2468 oriented it can potentially provide faster and more predictable 2469 protection and restoration capabilities than conventional hop by hop 2470 routed IP systems. This subsection describes of some of the basic 2471 aspects and requirements for MPLS networks regarding protection and 2472 restoration. See [MAK] for a more comprehensive discussion on MPLS 2473 based recovery. 2475 Protection types for MPLS networks can be categorized as link 2476 protection, node protection, path protection, and segment protection. 2478 - Link Protection: The objective for link protection is to protect 2479 an LSP from a given link failure. Under link protection, the 2480 path of the protect or backup LSP (the secondary LSP) is disjoint 2481 from the path of the working or operational LSP at the particular 2482 link over which protection is required. When the protected link 2483 fails, traffic on the working LSP is switched over to the protect 2484 LSP at the head-end of the failed link. This is a local repair 2485 method which can be fast. It might be more appropriate in 2486 situations where some network elements along a given path are 2487 less reliable than others. 2489 - Node Protection: The objective of LSP node protection is to protect 2490 an LSP from a given node failure. Under node protection, the path 2491 of the protect LSP is disjoint from the path of the working LSP 2492 at the particular node to be protected. The secondary path is 2493 also disjoint from the primary path at all links associated with 2494 the node to be protected. 
When the node fails, traffic on the working LSP is switched over to the protect LSP at the upstream LSR directly connected to the failed node.

- Path Protection: The goal of LSP path protection is to protect an LSP from failure at any point along its routed path. Under path protection, the path of the protect LSP is completely disjoint from the path of the working LSP. The advantage of path protection is that the backup LSP protects the working LSP from all possible link and node failures along the path, except for failures that might occur at the ingress and egress LSRs, or for correlated failures that might impact both working and backup paths simultaneously. Additionally, since the path selection is end-to-end, path protection might be more efficient in terms of resource usage than link or node protection. However, path protection may be slower than link and node protection in general.

- Segment Protection: An MPLS domain may be partitioned into multiple protection domains whereby a failure in a protection domain is rectified within that domain. In cases where an LSP traverses multiple protection domains, a protection mechanism within a domain only needs to protect the segment of the LSP that lies within the domain. Segment protection will generally be faster than path protection because recovery generally occurs closer to the fault.

6.5.2 Protection Option

Another issue to consider is the concept of protection options. The protection option uses the notation m:n protection, where m is the number of protect LSPs used to protect n working LSPs. Feasible protection options follow; an illustrative sketch appears after this list.

- 1:1: one working LSP is protected/restored by one protect LSP.

- n:1: one working LSP is protected/restored by n protect LSPs, possibly with a configurable load splitting ratio. When more than one protect LSP is used, it may be desirable to share the traffic across the protect LSPs when the working LSP fails to satisfy the bandwidth requirement of the traffic trunk associated with the working LSP. This may be especially useful when it is not feasible to find one path that can satisfy the bandwidth requirement of the primary LSP.

- 1:n: one protect LSP is used to protect/restore n working LSPs.

- 1+1: traffic is sent concurrently on both the working LSP and the protect LSP. In this case, the egress LSR selects one of the two LSPs based on a local traffic integrity decision process, which compares the traffic received from both the working and the protect LSP and identifies discrepancies. It is unlikely that this option would be used extensively in IP networks due to its resource utilization inefficiency. However, if bandwidth becomes plentiful and cheap, then this option might become quite viable and attractive in IP networks.
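The following Python fragment shows one illustrative way to express these protection options as configuration data for a traffic trunk. The field names and the notion of a load-splitting weight are assumptions made for the sketch, not attributes standardized by this framework.

   # Hypothetical representation of m:n protection configuration.

   from dataclasses import dataclass, field
   from typing import List

   @dataclass
   class ProtectionConfig:
       scheme: str                    # "1:1", "n:1", "1:n", or "1+1"
       working_lsps: List[str]
       protect_lsps: List[str]
       load_split: List[float] = field(default_factory=list)  # n:1 only

   # Example: one working LSP protected by two protect LSPs (n:1, n = 2),
   # with a configurable 70/30 load split applied if the working LSP fails.
   cfg = ProtectionConfig(
       scheme="n:1",
       working_lsps=["lsp-working-1"],
       protect_lsps=["lsp-protect-1", "lsp-protect-2"],
       load_split=[0.7, 0.3],
   )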
6.5.3 Resilience Attributes

Resilience attributes can be associated with explicit label switched paths (LSPs) in MPLS domains to indicate the manner in which traffic flowing through the LSP is restored when the LSP fails. These attributes can be categorized into basic attributes and extended attributes. The concept of resilience attributes within the MPLS context was first described in [AWD1].

Basic resilience attributes can indicate whether the traffic through an LSP can be rerouted using the IGP or mapped onto protect LSP(s) when a segment of the working path fails. A basic resilience attribute may also indicate that no rerouting is to occur at all.

Extended resilience attributes can be used to specify more sophisticated recovery options. Some feasible options are described below:

1. Protection LSP establishment attribute: Indicates whether the protect LSP is pre-established or established-on-demand after receiving a failure notification. A pre-established protect LSP can restore service faster, while an established-on-demand LSP is more likely to find a more efficient path with respect to resource usage. In the case of pre-established LSPs, if a fault impacts the working and protect LSPs simultaneously, it might not be feasible to restore the affected traffic if an alternative mechanism does not exist.

2. Constraint attribute under failure conditions: Indicates whether the protect LSP requires certain constraint(s) to be satisfied in order for it to be established. These constraints can be the same as or less stringent than the ones used to establish the primary LSP under normal conditions; e.g., a bandwidth requirement, or no bandwidth requirement, may be indicated under failure conditions.

3. Protection LSP resource reservation attribute: Indicates whether resource allocation for a pre-established protection LSP is reserved a priori or reserved-on-demand after failure notification is received.

We now discuss the relative merits of the resilience attributes. A pre-established protection LSP with pre-reserved resources can guarantee that the QoS of existing services is maintained upon failure of the primary LSP, while a pre-established and reserved-on-demand or an established-on-demand LSP may not be able to guarantee the QoS. The pre-established and pre-reserved approach is also the fastest among the three. It can switch packets onto the protection LSP once the ingress LSR receives the failure notification message without experiencing any delay for routing, resource allocation, and LSP establishment. However, a pre-established protection LSP may not be able to adapt to changes in the network since it cannot be re-established if a better path becomes available due to changes in the network. Additionally, the bandwidth reserved on the protection LSP is subtracted from the available bandwidth pool on all associated links, so it is not available for instantiating new LSPs in the future. On the other hand, it differs from SONET protection in that the reserved bandwidth does not remain under-utilized. Instead, when deployed in an IP context, it can be used by any traffic present on those links. When a pre-established protection LSP and an established-on-demand LSP are compared, it can be seen that the former will tend to restore traffic faster because there is no need to wait for the path to be set up prior to switching over traffic. However, if the requested bandwidth is not available on the pre-established path, it may be possible to use an established-on-demand LSP as a secondary option.

Failure Notification:

Failure notification should be reliable and fast, i.e., at least as fast as IGP notification, but preferably faster.
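As a purely illustrative example, the resilience attributes discussed above could be captured as configuration data along the following lines. The enumerated values, field names, and the crude ranking function are assumptions made for the sketch and reflect only the qualitative comparison given in the text.

   # Hypothetical encoding of the extended resilience attributes.

   from dataclasses import dataclass

   @dataclass
   class ResilienceAttributes:
       establishment: str        # "pre-established" or "on-demand"
       reservation: str          # "pre-reserved" or "reserve-on-demand"
       relax_constraints: bool   # e.g., drop the bandwidth constraint on failure

   def expected_recovery_ranking(attrs):
       """Lower is faster, mirroring the qualitative discussion above."""
       if attrs.establishment == "pre-established":
           return 0 if attrs.reservation == "pre-reserved" else 1
       return 2    # established-on-demand is generally the slowest

   fastest = ResilienceAttributes("pre-established", "pre-reserved", False)
   assert expected_recovery_ranking(fastest) == 0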
2621 6.6 Content Distribution (Webserver) Requirements 2623 The Internet is dominated by client-server interactions, especially 2624 Web traffic (in the future, more sophisticated media servers may 2625 become dominant). The location of major information servers has a 2626 significant impact on the traffic patterns within the Internet as 2627 well as on the perception of service quality by end users. 2629 A number of dynamic load balancing techniques have been devised to 2630 improve the performance of replicated information servers. These 2631 techniques can cause spatial traffic characteristics to become more 2632 dynamic in the Internet because information servers can be 2633 dynamically picked based upon the location of the clients, the 2634 location of the servers, the relative utilization of the servers, the 2635 relative performance of different networks, and the relative 2636 performance of different parts of a network. This process of 2637 assignment of distributed servers to clients is called Traffic 2638 Directing (TD). It is similar to traffic engineering but operates at 2639 the application layer. 2641 TD scheduling schemes that allocate servers to clients in replicated, 2642 geographically dispersed information distribution systems may require 2643 empirical network performance statistics to make more effective 2644 decisions. In the future, network measurement systems may be 2645 required to provide this type of information. The exact parameters 2646 needed are not yet defined. When congestion exists in the network, 2647 the TD and TE systems should act in a coordinated manner. This topic 2648 is for further study. 2650 Network planning should take into consideration the fact that TD can 2651 introduce more traffic dynamics into a network. It can be helpful 2652 for a certain amount of additional link capacity to be reserved so 2653 that the links can accommodate this additional traffic fluctuation. 2655 6.7 Traffic Engineering in Diffserv Environments 2657 This section provides an overview of the traffic engineering features 2658 and requirements that are specifically pertinent to Differentiated 2659 Services (Diffserv) capable IP networks. 2661 Increasing requirements to support multiple classes of traffic, such 2662 as best effort and mission critical data, in the Internet calls for 2663 IP networks to differentiate traffic according to some criteria, and 2664 to accord preferential treatment to certain types of traffic. Large 2665 numbers of flows can be aggregated into a few behavior aggregates 2666 based on some criteria in terms of common performance requirements in 2667 terms of packet loss ratio, delay, and jitter; or in terms of common 2668 fields within the IP packet headers. 2670 As Diffserv evolves and becomes deployed in operational networks, 2671 traffic engineering will be critical to ensuring that SLAs defined 2672 within a given Diffserv service model are met. Classes of service 2673 (CoS) can be supported in a Diffserv environment by concatenating 2674 per-hop behaviors (PHBs) along the routing path, using service 2675 provisioning mechanisms, and by appropriately configuring edge 2676 functionality such as traffic classification, marking, policing, and 2677 shaping. PHB is the forwarding behavior that a packet receives at a 2678 DS node (a Diffserv-compliant node). This is accomplished by means of 2679 buffer management and packet scheduling mechanisms. 
In this context, packets belonging to a class are those that are members of a corresponding ordering aggregate.

In order to provide enhanced quality of service in a Diffserv domain, it is simply not enough to implement proper buffer management and scheduling mechanisms. Instead, in addition to buffer management and scheduling mechanisms, it may be desirable to control the performance of some service classes by enforcing certain relationships between the traffic workload contributed by each service class and the amount of network resources allocated or provisioned for that service class. Such relationships between demand and resource allocation can be enforced using a combination of, for example: (1) traffic engineering mechanisms that enforce the desired relationship between the amount of traffic contributed by a given service class and the resources allocated to that class, and (2) mechanisms that dynamically adjust the resources allocated to a given service class to relate to the amount of traffic contributed by that service class.

It may also be desirable to limit the performance impact of high priority traffic on relatively low priority traffic. This can be achieved by, for example, controlling the percentage of high priority traffic that is routed through a given link. Another way to accomplish this is to increase link capacities appropriately so that lower priority traffic can still enjoy adequate service quality. When the ratios of traffic workload contributed by different service classes vary significantly from router to router, it may not suffice to rely exclusively on conventional IGP routing protocols or on traffic engineering mechanisms that are insensitive to different service classes. Instead, it may be desirable to perform traffic engineering, especially routing control and mapping functions, on a per service class basis. One way to accomplish this in a domain that supports both MPLS and Diffserv is to define class specific LSPs and to map traffic from each class onto one or more LSPs that correspond to that service class. An LSP corresponding to a given service class can then be routed and protected/restored in a class dependent manner, according to specific policies.
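The following Python sketch illustrates such a per-class mapping in a combined MPLS/Diffserv setting. The DSCP-to-class and class-to-LSP tables, and the use of a flow hash to keep a micro-flow on a single LSP, are purely hypothetical examples rather than recommended configuration.

   # Hypothetical mapping of Diffserv classes onto class-specific LSPs.

   DSCP_TO_CLASS = {46: "EF", 26: "AF3", 10: "AF1", 0: "BE"}   # example only

   CLASS_TO_LSPS = {                  # one or more LSPs per service class
       "EF":  ["lsp-ef-1"],
       "AF3": ["lsp-af3-1", "lsp-af3-2"],
       "AF1": ["lsp-af1-1"],
       "BE":  ["lsp-be-1", "lsp-be-2"],
   }

   def select_lsp(dscp, flow_hash):
       """Pick an LSP by class, hashing across that class's LSPs so that
       packets of the same micro-flow stay on a single path."""
       service_class = DSCP_TO_CLASS.get(dscp, "BE")
       lsps = CLASS_TO_LSPS[service_class]
       return lsps[flow_hash % len(lsps)]

   # e.g., select_lsp(46, hash(("10.0.0.1", "10.0.0.2", 5060))) yields "lsp-ef-1"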
Performing traffic engineering on a per class basis in multi-class IP networks might be beneficial in terms of both performance and scalability. It allows traffic trunks in a given class to utilize available resources on both shortest path(s) and non-shortest paths that meet constraints and requirements that are specific to the given class. MPLS is capable of providing different levels of protection/restoration mechanisms, from the fastest link/node protection to path protection, which can be pre-established with or without pre-reserved resources or established-on-demand. The faster the mechanism, the more it costs in terms of network resources. By performing per-class protection/restoration, each class can select protection/restoration mechanisms that satisfy its survivability requirements in a cost effective manner.

The following paragraphs describe very high level requirements that are specific to the control of traffic trunks in Diffserv/MPLS environments. These are additional to the general requirements for traffic engineering over MPLS described in [AWD1].

- An LSR should provide configurable maximum reservable bandwidth and/or buffer for each supported service class (Ordering Aggregate).

- An LSR should provide configurable minimum available bandwidth and/or buffer for each class on each of its links.

- In order to perform constraint-based routing on a per-class basis for LSPs, the conventional IGPs (e.g., IS-IS and OSPF) should provide extensions to propagate per-class resource information.

- In contexts where delay bounds are a factor, path selection algorithms for traffic trunks with bounded delay requirements should take the delay constraint into account. Delay consists mainly of serialization delay, propagation delay (which is fixed for a given path), and queuing delay (which varies). In practice, it is quite difficult to estimate delays analytically, and delay models are a contemporary research topic. In practice, the queuing delay can be approximated using an estimate of a fixed per-hop queuing delay bound for each PHB at each hop.

- When an LSR dynamically adjusts resource allocation based on per-class LSP resource requests, adjustment of the weights used by scheduling algorithms should not adversely impact the delay and jitter characteristics of certain service classes.

- An LSR should provide a configurable maximum allocation multiplier on a per-class basis.

- Measurement-based admission control may be used to improve resource usage, especially for those classes without stringent loss or delay and jitter requirements. For example, an LSR may dynamically adjust the maximum allocation multiplier (i.e., over-subscribing and under-subscribing ratios) for certain classes based on their measured resource utilization.

Instead of per-class parameters being configured and propagated on each LSR interface, per-class parameters can be aggregated into per-class-type parameters. The main motivation for grouping a set of classes into a class-type is to improve the scalability of IGP link state advertisements by propagating information on a per-class-type basis instead of on a per-class basis, and also to allow better bandwidth sharing between classes in the same class-type. A class-type is a set of classes that satisfy the following two conditions:

1) Classes in the same class-type have common aggregate maximum or minimum bandwidth requirements to satisfy required performance levels.

2) There is no maximum or minimum bandwidth requirement to be enforced at the level of an individual class in the class-type. It is still possible, nevertheless, to implement some "priority" policies for classes in the same class-type to permit preferential access to the class-type bandwidth.

An example of a class-type is a low-loss class-type that includes both AF1-based and AF2-based Ordering Aggregates. With such a class-type, one may implement some priority policy which assigns higher preemption priority to AF1-based traffic trunks over AF2-based ones, vice versa, or the same priority.
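The following Python sketch illustrates the aggregation idea: per-class reservations are summed per class-type before being advertised, so the IGP carries one value per class-type rather than one per class. The grouping (a "low-loss" class-type containing AF1 and AF2) mirrors the example above; the assumption that each class-type may draw on the full link capacity is a simplification made only for the sketch.

   # Illustrative per-class-type aggregation of unreserved bandwidth.

   CLASS_TYPES = {"low-loss": ["AF1", "AF2"], "best-effort": ["BE"]}

   def per_class_type_unreserved(link_capacity, reserved_by_class):
       """Aggregate per-class reservations into the per-class-type
       unreserved bandwidth that the IGP would advertise for a link."""
       unreserved = {}
       for ctype, classes in CLASS_TYPES.items():
           reserved = sum(reserved_by_class.get(c, 0) for c in classes)
           unreserved[ctype] = max(link_capacity - reserved, 0)
       return unreserved

   # Example: with 100 units of capacity, AF1 reserving 30 and AF2
   # reserving 20, the "low-loss" class-type advertises 50 unreserved.
   assert per_class_type_unreserved(100, {"AF1": 30, "AF2": 20})["low-loss"] == 50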
6.8 Network Controllability

Off-line (and on-line) traffic engineering considerations would be of limited utility if the network could not be controlled effectively to implement the results of TE decisions and to achieve desired network performance objectives. Capacity augmentation is a coarse grained solution to traffic engineering issues. It is simple and may be advantageous if bandwidth is abundant and cheap or if the current or expected network workload demands it. However, bandwidth is not always abundant and cheap, and the workload may not always demand additional capacity. Adjustments of administrative weights and other parameters associated with routing protocols provide finer grained control, but are difficult to use and imprecise because of the routing interactions that occur across the network. In certain network contexts, more flexible, finer grained approaches which provide more precise control over the mapping of traffic to routes and over the selection and placement of routes may be appropriate and useful.

Control mechanisms can be manual (e.g., administrative configuration), partially-automated (e.g., scripts), or fully-automated (e.g., policy based management systems). Automated mechanisms are particularly required in large scale networks. Multi-vendor interoperability can be facilitated by developing and deploying standardized management systems (e.g., standard MIBs) and policies (PIBs) to support the control functions required to address traffic engineering objectives such as load distribution and protection/restoration.

Network control functions should be secure, reliable, and stable as these are often needed to operate correctly in times of network impairments (e.g., during network congestion or security attacks).

7.0 Inter-Domain Considerations

Inter-domain traffic engineering is concerned with the performance optimization of traffic that originates in one administrative domain and terminates in a different one.

Traffic exchange between autonomous systems in the Internet occurs through exterior gateway protocols. Currently, BGP-4 [RFC-1771] is the standard exterior gateway protocol for the Internet. BGP-4 provides a number of capabilities that can be used to define import and export policies for network reachability information. BGP attributes are used by the BGP decision process to select exit points for traffic to other peer networks.

Inter-domain traffic engineering is inherently more difficult than intra-domain TE under the current Internet architecture. The reasons for this are both technical and administrative. Technically, the current version of BGP does not propagate topology and link state information across domain boundaries. There are stability and scalability issues involved in propagating such details, which require careful consideration. Administratively, there are differences in operating costs and network capacities between domains. Generally, what may be considered a good solution in one domain may not necessarily be a good solution in another domain. Moreover, it would generally be considered inadvisable for one domain to permit another domain to influence the routing and management of traffic in its network.

If Diffserv becomes widely deployed, inter-domain TE will become more important, and more challenging to address.

MPLS TE-tunnels (explicit LSPs) can potentially add a degree of flexibility in the selection of exit points for inter-domain routing. The concept of relative and absolute metrics defined in [SHEN] can be applied to this purpose.
The idea is that if BGP attributes are defined such that the BGP decision process depends on IGP metrics to select exit points for inter-domain traffic, then some inter-domain traffic destined to a given peer network can be made to prefer a specific exit point by establishing a TE-tunnel from the router making the selection to the peering point and assigning the TE-tunnel a metric which is smaller than the IGP cost to all other peering points. If a peer accepts and processes MEDs, then a similar MPLS TE-tunnel based scheme can be applied to cause certain entrance points to be preferred, by setting the MED to an IGP cost which has been modified by the tunnel metric.
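A minimal Python sketch of this exit-point selection idea follows. The router names and cost values are hypothetical, and the many tie-breaking steps of the real BGP decision process are deliberately omitted.

   # Hypothetical illustration: a TE-tunnel metric lower than the IGP cost
   # to every other peering point steers the "lowest IGP cost to the BGP
   # next hop" step of the decision process toward the desired exit.

   def preferred_exit(igp_cost_to_exit, tunnel_metrics=None):
       """Return the exit point with the lowest effective cost."""
       tunnel_metrics = tunnel_metrics or {}
       effective = {exit_point: tunnel_metrics.get(exit_point, cost)
                    for exit_point, cost in igp_cost_to_exit.items()}
       return min(effective, key=effective.get)

   igp_costs = {"peer-exit-A": 30, "peer-exit-B": 20}
   assert preferred_exit(igp_costs) == "peer-exit-B"
   assert preferred_exit(igp_costs, {"peer-exit-A": 10}) == "peer-exit-A"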
Similar to intra-domain TE, inter-domain TE is best accomplished when a traffic matrix can be derived to depict the volume of traffic from one autonomous system to another.

Generally, redistribution of inter-domain traffic requires coordination between peering partners. An export policy in one domain that results in load redistribution across peer points with another domain can significantly affect the local traffic matrix inside the domain of the peering partner. This, in turn, will affect intra-domain TE due to changes in the spatial distribution of traffic. Therefore, it is critical for peering partners to coordinate with each other before attempting any policy changes that may result in significant shifts in inter-domain traffic. In certain contexts, this coordination can be quite challenging due to technical and non-technical reasons.

It is a matter of speculation as to whether MPLS, or similar technologies, can be extended to allow selection of constrained paths across domain boundaries.

8.0 Overview of Contemporary TE Practices in Operational IP Networks

This section provides an overview of some contemporary traffic engineering practices in IP networks. The focus is primarily on the aspects that pertain to the control of the routing function in operational contexts. The intent here is to provide an overview of the commonly used practices. The discussion is not intended to be exhaustive.

Currently, service providers apply many of the traffic engineering mechanisms discussed in this document to optimize the performance of their IP networks. These techniques include capacity planning for long time scales, routing control using IGP metrics and MPLS for medium time scales, the overlay model also for medium time scales, and traffic management mechanisms for short time scales.

When a service provider plans to build an IP network, or expand the capacity of an existing network, effective capacity planning should be an important component of the process. Such plans may take the following aspects into account: location of new nodes if any, existing and predicted traffic patterns, costs, link capacity, topology, routing design, and survivability.

Performance optimization of operational networks is usually an ongoing process in which traffic statistics, performance parameters, and fault indicators are continually collected from the network. These empirical data are then analyzed and used to trigger various traffic engineering mechanisms. For example, IGP parameters, e.g., OSPF or IS-IS metrics, can be adjusted based on manual computations or based on the output of some traffic engineering support tools. Such tools may use the following as input: the traffic matrix, network topology, and network performance objective(s). Tools that perform what-if analysis can also be used to assist the TE process by allowing various scenarios to be reviewed before a new set of configurations is implemented in the operational network.

The overlay model (IP over ATM or IP over Frame Relay) is another approach which is commonly used in practice [AWD2]. The IP over ATM technique is no longer viewed favorably due to recent advances in MPLS and router hardware technology.

Deployment of MPLS for traffic engineering applications has commenced in some service provider networks. One operational scenario is to deploy MPLS in conjunction with an IGP (IS-IS-TE or OSPF-TE) that supports the traffic engineering extensions, in conjunction with constraint-based routing for explicit route computations, and a signaling protocol (e.g., RSVP-TE or CR-LDP) for LSP instantiation.

In contemporary MPLS traffic engineering contexts, network administrators specify and configure link attributes and resource constraints such as maximum reservable bandwidth and resource class attributes for links (interfaces) within the MPLS domain. A link state protocol that supports TE extensions (IS-IS-TE or OSPF-TE) is used to propagate information about network topology and link attributes to all routers in the routing area. Network administrators also specify all the LSPs that are to originate from each router. For each LSP, the network administrator specifies the destination node and the attributes of the LSP which indicate the requirements that are to be satisfied during the path selection process. Each router then uses a local constraint-based routing process to compute explicit paths for all LSPs originating from it. Subsequently, a signaling protocol is used to instantiate the LSPs. By assigning proper bandwidth values to links and LSPs, congestion caused by uneven traffic distribution can generally be avoided or mitigated.

The bandwidth attributes of LSPs used for traffic engineering can be updated periodically. The basic concept is that the bandwidth assigned to an LSP should relate in some manner to the bandwidth requirements of traffic that actually flows through the LSP. The traffic attribute of an LSP can be modified to accommodate traffic growth and persistent traffic shifts. If network congestion occurs due to some unexpected events, existing LSPs can be rerouted to alleviate the situation, or the network administrator can configure new LSPs to divert some traffic to alternative paths. The reservable bandwidth of the congested links can also be reduced to force some LSPs to be rerouted to other paths.
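The following Python fragment sketches such a periodic adjustment. The 25% headroom, the 10% resignaling threshold, and the measurement interface are assumptions of the sketch, not recommended values or APIs.

   # Illustrative periodic resizing of LSP bandwidth based on measured load.

   HEADROOM = 1.25       # assumed 25% headroom over measured demand
   MIN_CHANGE = 0.1      # only resignal when the change exceeds 10%

   def adjust_lsp_bandwidth(lsp_reservations, measured_rates):
       """Return the LSPs whose reservations should be resignaled."""
       updates = {}
       for lsp, reserved in lsp_reservations.items():
           target = measured_rates.get(lsp, 0.0) * HEADROOM
           if reserved == 0 or abs(target - reserved) / reserved > MIN_CHANGE:
               updates[lsp] = target
       return updates

   # Example: an LSP reserved at 100 Mb/s but carrying a measured 90 Mb/s
   # would be resized to 112.5 Mb/s at the next adjustment interval.
   assert adjust_lsp_bandwidth({"lsp-1": 100.0}, {"lsp-1": 90.0}) == {"lsp-1": 112.5}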
In an MPLS domain, a traffic matrix can also be estimated by monitoring the traffic on LSPs. Such traffic statistics can be used for a variety of purposes including network planning and network optimization. Current practice suggests that deploying an MPLS network consisting of hundreds of routers and thousands of LSPs is feasible. In summary, recent deployment experience suggests that the MPLS approach is very effective for traffic engineering in IP networks [XIAO].

9.0 Conclusion

This document described a framework for traffic engineering in the Internet. It presented an overview of some of the basic issues surrounding traffic engineering in IP networks. The context of TE was described, and a TE process model and a taxonomy of TE styles were presented. A brief historical review of pertinent developments related to traffic engineering was provided. A survey of contemporary TE techniques in operational networks was presented. Additionally, the document specified a set of generic requirements, recommendations, and options for Internet traffic engineering.

10.0 Security Considerations

This document does not introduce new security issues.

11.0 Acknowledgments

The authors would like to thank Jim Boyle for inputs on the requirements section, Francois Le Faucheur for inputs on Diffserv aspects, Blaine Christian for inputs on measurement, Gerald Ash for inputs on routing in telephone networks and for text on event-dependent TE methods, and Steven Wright for inputs on network controllability. Special thanks to Randy Bush for proposing the TE taxonomy based on "tactical vs strategic" methods. The subsection describing an "Overview of ITU Activities Related to Traffic Engineering" was adapted from a contribution by Waisum Lai. Useful feedback and pointers to relevant materials were provided by J. Noel Chiappa. Finally, the authors would like to thank Ed Kern, the TEWG co-chair, for his comments and support.

12.0 References

[ASH1] J. Ash, M. Girish, E. Gray, B. Jamoussi, G. Wright, "Applicability Statement for CR-LDP," Work in Progress, 1999.

[ASH2] J. Ash, Dynamic Routing in Telecommunications Networks, McGraw Hill, 1998.

[ASH3] "TE & QoS Methods for IP-, ATM-, & TDM-Based Networks," Work in Progress, 2000.

[AWD1] D. Awduche, J. Malcolm, J. Agogbua, M. O'Dell, J. McManus, "Requirements for Traffic Engineering over MPLS," RFC 2702, September 1999.

[AWD2] D. Awduche, "MPLS and Traffic Engineering in IP Networks," IEEE Communications Magazine, December 1999.

[AWD3] D. Awduche, L. Berger, D. Gan, T. Li, G. Swallow, and V. Srinivasan, "Extensions to RSVP for LSP Tunnels," Work in Progress, 1999.

[AWD4] D. Awduche, A. Hannan, X. Xiao, "Applicability Statement for Extensions to RSVP for LSP-Tunnels," Work in Progress, 1999.

[AWD5] D. Awduche et al., "An Approach to Optimal Peering Between Autonomous Systems in the Internet," International Conference on Computer Communications and Networks (ICCCN'98), October 1998.

[AWD6] D. Awduche, Y. Rekhter, J. Drake, R. Coltun, "Multiprotocol Lambda Switching: Combining MPLS Traffic Engineering Control with Optical Crossconnects," Work in Progress, 1999.

[CAL] R. Callon, P. Doolan, N. Feldman, A. Fredette, G. Swallow, A. Viswanathan, "A Framework for Multiprotocol Label Switching," Work in Progress, 1999.

[FGLR] A. Feldmann, A. Greenberg, C. Lund, N. Reingold, and J. Rexford, "NetScope: Traffic Engineering for IP Networks," to appear in IEEE Network Magazine, 2000.

[FlJa93] S. Floyd and V. Jacobson, "Random Early Detection Gateways for Congestion Avoidance," IEEE/ACM Transactions on Networking, Vol. 1, No. 4, August 1993, p. 387-413.
[FLoyd2000] S. Floyd, "Congestion Control Principles," Work in
Progress, 2000.

[Floy94] S. Floyd, "TCP and Explicit Congestion Notification," ACM
Computer Communication Review, Vol. 24, No. 5, pp. 10-23, October
1994.

[HuSS87] B.R. Hurley, C.J.R. Seidl and W.F. Sewel, "A Survey of
Dynamic Routing Methods for Circuit-Switched Traffic," IEEE
Communications Magazine, September 1987.

[itu-e600] ITU-T Recommendation E.600, "Terms and Definitions of
Traffic Engineering," March 1993.

[itu-e701] ITU-T Recommendation E.701, "Reference Connections for
Traffic Engineering," October 1993.

[JAM] B. Jamoussi, "Constraint-Based LSP Setup using LDP," Work in
Progress, 1999.

[Li-IGP] T. Li, G. Swallow, and D. Awduche, "IGP Requirements for
Traffic Engineering with MPLS," Work in Progress, 1999.

[Berger] L. Berger, D. Gan, G. Swallow, P. Pan, F. Tommasi, and S.
Molendini, "RSVP Refresh Overhead Reduction Extensions," Work in
Progress, 2000.

[LNO96] T. Lakshman, A. Neidhardt, and T. Ott, "The Drop from Front
Strategy in TCP over ATM and its Interworking with other Control
Features," Proc. INFOCOM'96, pp. 1242-1250.

[MATE] I. Widjaja and A. Elwalid, "MATE: MPLS Adaptive Traffic
Engineering," Work in Progress, 1999.

[ELW95] A. Elwalid, D. Mitra and R.H. Wentworth, "A New Approach for
Allocating Buffers and Bandwidth to Heterogeneous, Regulated Traffic
in an ATM Node," IEEE Journal on Selected Areas in Communications,
13:6, pp. 1115-1127, August 1995.

[Cruz] R. L. Cruz, "A Calculus for Network Delay, Part II: Network
Analysis," IEEE Transactions on Information Theory, Vol. 37, pp.
132-141, 1991.

[McQ80] J.M. McQuillan, I. Richer, and E.C. Rosen, "The New Routing
Algorithm for the ARPANET," IEEE Trans. on Communications, Vol. 28,
No. 5, pp. 711-719, May 1980.

[RFC-1992] I. Castineyra, N. Chiappa, and M. Steenstrup, "The Nimrod
Routing Architecture," RFC 1992, August 1996.

[MR99] D. Mitra and K.G. Ramakrishnan, "A Case Study of
Multiservice, Multipriority Traffic Engineering Design for Data
Networks," Proc. Globecom'99, December 1999.

[OMP] C. Villamizar, "MPLS Optimized OMP," Work in Progress, 1999.

[RFC-1349] P. Almquist, "Type of Service in the Internet Protocol
Suite," RFC 1349, July 1992.

[RFC-1458] R. Braudes, S. Zabele, "Requirements for Multicast
Protocols," RFC 1458, May 1993.

[RFC-1771] Y. Rekhter and T. Li, "A Border Gateway Protocol 4
(BGP-4)," RFC 1771, March 1995.

[RFC-1812] F. Baker (Editor), "Requirements for IP Version 4
Routers," RFC 1812, June 1995.

[RFC-1997] R. Chandra, P. Traina, and T. Li, "BGP Communities
Attribute," RFC 1997, August 1996.

[RFC-1998] E. Chen and T. Bates, "An Application of the BGP
Community Attribute in Multi-home Routing," RFC 1998, August 1996.

[RFC-2178] J. Moy, "OSPF Version 2," RFC 2178, July 1997.

[RFC-2205] R. Braden, et al., "Resource Reservation Protocol (RSVP)
- Version 1 Functional Specification," RFC 2205, September 1997.

[RFC-2211] J. Wroclawski, "Specification of the Controlled-Load
Network Element Service," RFC 2211, September 1997.

[RFC-2212] S. Shenker, C. Partridge, R. Guerin, "Specification of
Guaranteed Quality of Service," RFC 2212, September 1997.

[RFC-2215] S. Shenker and J. Wroclawski, "General Characterization
Parameters for Integrated Service Network Elements," RFC 2215,
September 1997.
[RFC-2216] S. Shenker and J. Wroclawski, "Network Element Service
Specification Template," RFC 2216, September 1997.

[RFC-2330] V. Paxson et al., "Framework for IP Performance Metrics,"
RFC 2330, May 1998.

[RFC-2386] E. Crawley, R. Nair, B. Rajagopalan, and H. Sandick, "A
Framework for QoS-based Routing in the Internet," RFC 2386, August
1998.

[MA] Q. Ma, "Quality of Service Routing in Integrated Services
Networks," PhD Dissertation, CMU-CS-98-138, CMU, 1998.

[RFC-2475] S. Blake et al., "An Architecture for Differentiated
Services," RFC 2475, December 1998.

[RFC-2597] J. Heinanen, F. Baker, W. Weiss, and J. Wroclawski,
"Assured Forwarding PHB Group," RFC 2597, June 1999.

[RFC-2678] J. Mahdavi and V. Paxson, "IPPM Metrics for Measuring
Connectivity," RFC 2678, September 1999.

[RFC-2679] G. Almes, S. Kalidindi, and M. Zekauskas, "A One-way
Delay Metric for IPPM," RFC 2679, September 1999.

[RFC-2680] G. Almes, S. Kalidindi, and M. Zekauskas, "A One-way
Packet Loss Metric for IPPM," RFC 2680, September 1999.

[RFC-2722] N. Brownlee, C. Mills, and G. Ruth, "Traffic Flow
Measurement: Architecture," RFC 2722, October 1999.

[RFC-2753] R. Yavatkar, D. Pendarakis, R. Guerin, "A Framework for
Policy-based Admission Control," RFC 2753, January 2000.

[RoVC] E. Rosen, A. Viswanathan, R. Callon, "Multiprotocol Label
Switching Architecture," Work in Progress, 1999.

[SLDC98] B. Suter, T. Lakshman, D. Stiliadis, and A. Choudhury,
"Design Considerations for Supporting TCP with Per-flow Queueing,"
Proc. INFOCOM'98, pp. 299-306, 1998.

[MAK] S. Makam, et al., "Framework for MPLS Based Recovery," Work in
Progress, 2000.

[XIAO] X. Xiao, A. Hannan, B. Bailey, L. Ni, "Traffic Engineering
with MPLS in the Internet," IEEE Network Magazine, March 2000.

[YaRe95] C. Yang and A. Reddy, "A Taxonomy for Congestion Control
Algorithms in Packet Switching Networks," IEEE Network Magazine, pp.
34-45, 1995.

[SMIT] H. Smit and T. Li, "IS-IS Extensions for Traffic
Engineering," Work in Progress, 1999.

[KATZ] D. Katz, D. Yeung, "Traffic Engineering Extensions to OSPF,"
Work in Progress, 1999.

[SHEN] N. Shen and H. Smit, "Calculating IGP Routes over Traffic
Engineering Tunnels," Work in Progress, 1999.

13.0 Authors' Addresses

Daniel O. Awduche
UUNET (MCI Worldcom)
22001 Loudoun County Parkway
Ashburn, VA 20147
Phone: 703-886-5277
Email: awduche@uu.net

Angela Chiu
AT&T Labs
Rm 4-204
100 Schulz Dr.
Red Bank, NJ 07701
Phone: (732) 345-3441
Email: alchiu@att.com

Anwar Elwalid
Lucent Technologies
Murray Hill, NJ 07974, USA
Phone: 908 582-7589
Email: anwar@lucent.com

Indra Widjaja
Fujitsu Network Communications
Two Blue Hill Plaza
Pearl River, NY 10965, USA
Phone: 914-731-2244
Email: indra.widjaja@fnc.fujitsu.com

Xipeng Xiao
Global Crossing
141 Caspian Court
Sunnyvale, CA 94089
Phone: +1 408-543-4801
Email: xipeng@globalcenter.net