TEAS Working Group                                        A. Farrel, Ed.
Internet-Draft                                        Old Dog Consulting
Obsoletes: 3272 (if approved)                          November 20, 2019
Intended status: Informational
Expires: May 23, 2020

        Overview and Principles of Internet Traffic Engineering
                      draft-dt-teas-rfc3272bis-02

Abstract

This memo describes the principles of Traffic Engineering (TE) in the Internet.  The document is intended to promote better understanding of the issues surrounding traffic engineering in IP networks, and to provide a common basis for the development of traffic engineering capabilities for the Internet.  The principles, architectures, and methodologies for performance evaluation and performance optimization of operational IP networks are discussed throughout this document.

This work was first published as RFC 3272 in May 2002.  This document obsoletes RFC 3272 by making a complete update to bring the text in line with current best practices for Internet traffic engineering and to include references to the latest relevant work in the IETF.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF).  Note that other groups may also distribute working documents as Internet-Drafts.  The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time.  It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on May 23, 2020.

Copyright Notice

Copyright (c) 2019 IETF Trust and the persons identified as the document authors.  All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document.
Please review these documents 52 carefully, as they describe your rights and restrictions with respect 53 to this document. Code Components extracted from this document must 54 include Simplified BSD License text as described in Section 4.e of 55 the Trust Legal Provisions and are provided without warranty as 56 described in the Simplified BSD License. 58 Table of Contents 60 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 61 1.1. What is Internet Traffic Engineering? . . . . . . . . . . 4 62 1.2. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 8 63 1.3. Terminology . . . . . . . . . . . . . . . . . . . . . . . 8 64 2. Background . . . . . . . . . . . . . . . . . . . . . . . . . 11 65 2.1. Context of Internet Traffic Engineering . . . . . . . . . 12 66 2.2. Network Context . . . . . . . . . . . . . . . . . . . . . 12 67 2.3. Problem Context . . . . . . . . . . . . . . . . . . . . . 14 68 2.3.1. Congestion and its Ramifications . . . . . . . . . . 15 69 2.4. Solution Context . . . . . . . . . . . . . . . . . . . . 16 70 2.4.1. Combating the Congestion Problem . . . . . . . . . . 18 71 2.5. Implementation and Operational Context . . . . . . . . . 21 72 2.6. High-Level Objectives . . . . . . . . . . . . . . . . . . 21 73 3. Traffic Engineering Process Models . . . . . . . . . . . . . 23 74 3.1. Components of the Traffic Engineering Process Model . . . 25 75 3.2. Measurement . . . . . . . . . . . . . . . . . . . . . . . 25 76 3.3. Modeling, Analysis, and Simulation . . . . . . . . . . . 26 77 3.4. Optimization . . . . . . . . . . . . . . . . . . . . . . 27 78 4. Review of TE Techniques . . . . . . . . . . . . . . . . . . . 28 79 4.1. Historic Overview . . . . . . . . . . . . . . . . . . . . 28 80 4.1.1. Traffic Engineering in Classical Telephone Networks . 28 81 4.1.2. Evolution of Traffic Engineering in Packet Networks . 30 82 4.2. Development of Internet Traffic Engineering . . . . . . . 33 83 4.2.1. Overlay Model . . . . . . . . . . . . . . . . . . . . 33 84 4.2.2. Constraint-Based Routing . . . . . . . . . . . . . . 33 85 4.3. Overview of IETF Projects Related to Traffic Engineering 34 86 4.3.1. Integrated Services . . . . . . . . . . . . . . . . . 34 87 4.3.2. RSVP . . . . . . . . . . . . . . . . . . . . . . . . 35 88 4.3.3. Differentiated Services . . . . . . . . . . . . . . . 36 89 4.3.4. MPLS . . . . . . . . . . . . . . . . . . . . . . . . 37 90 4.3.5. Generalized MPLS . . . . . . . . . . . . . . . . . . 38 91 4.3.6. IP Performance Metrics . . . . . . . . . . . . . . . 38 92 4.3.7. Flow Measurement . . . . . . . . . . . . . . . . . . 39 93 4.3.8. Endpoint Congestion Management . . . . . . . . . . . 39 94 4.3.9. TE Extensions to the IGPs . . . . . . . . . . . . . . 39 95 4.3.10. Link-State BGP . . . . . . . . . . . . . . . . . . . 40 96 4.3.11. Path Computation Element . . . . . . . . . . . . . . 40 97 4.3.12. Segment Routing . . . . . . . . . . . . . . . . . . . 40 98 4.3.13. Network Virtualization and Abstraction . . . . . . . 40 99 4.3.14. Deterministic Networking . . . . . . . . . . . . . . 40 100 4.3.15. Network TE State Definition and Presentation . . . . 40 101 4.3.16. System Management and Control Interfaces . . . . . . 40 102 4.4. Overview of ITU Activities Related to Traffic Engineering 41 103 4.5. Content Distribution . . . . . . . . . . . . . . . . . . 42 104 5. Taxonomy of Traffic Engineering Systems . . . . . . . . . . . 43 105 5.1. Time-Dependent Versus State-Dependent Versus Event 106 Dependent . . . . . . . . . . . . . . . . . . . . . . . . 
43
  5.2.  Offline Versus Online . . . . . . . . . . . . . . . . . .  44
  5.3.  Centralized Versus Distributed  . . . . . . . . . . . . .  45
    5.3.1.  Hybrid Systems  . . . . . . . . . . . . . . . . . . .  45
    5.3.2.  Considerations for Software Defined Networking . . .  45
  5.4.  Local Versus Global . . . . . . . . . . . . . . . . . . .  45
  5.5.  Prescriptive Versus Descriptive . . . . . . . . . . . . .  46
    5.5.1.  Intent-Based Networking . . . . . . . . . . . . . . .  46
  5.6.  Open-Loop Versus Closed-Loop  . . . . . . . . . . . . . .  46
  5.7.  Tactical vs Strategic . . . . . . . . . . . . . . . . . .  46
6.  Objectives for Internet Traffic Engineering . . . . . . . . .  47
  6.1.  Routing . . . . . . . . . . . . . . . . . . . . . . . . .  47
  6.2.  Traffic Mapping . . . . . . . . . . . . . . . . . . . . .  50
  6.3.  Measurement . . . . . . . . . . . . . . . . . . . . . . .  50
  6.4.  Network Survivability . . . . . . . . . . . . . . . . . .  51
    6.4.1.  Survivability in MPLS Based Networks  . . . . . . . .  53
    6.4.2.  Protection Option . . . . . . . . . . . . . . . . . .  55
  6.5.  Traffic Engineering in Diffserv Environments  . . . . . .  55
  6.6.  Network Controllability . . . . . . . . . . . . . . . . .  57
7.  Inter-Domain Considerations . . . . . . . . . . . . . . . . .  58
8.  Overview of Contemporary TE Practices in Operational IP
    Networks  . . . . . . . . . . . . . . . . . . . . . . . . . .  60
9.  Conclusion  . . . . . . . . . . . . . . . . . . . . . . . . .  64
10. Security Considerations . . . . . . . . . . . . . . . . . . .  65
11. IANA Considerations . . . . . . . . . . . . . . . . . . . . .  65
12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . .  65
13. Contributors  . . . . . . . . . . . . . . . . . . . . . . . .  65
14. Informative References  . . . . . . . . . . . . . . . . . . .  68
Author's Address  . . . . . . . . . . . . . . . . . . . . . . . .  74

1.  Introduction

This memo describes the principles of Internet traffic engineering.  The objective of the document is to articulate the general issues and principles for Internet traffic engineering and, where appropriate, to provide recommendations, guidelines, and options for the development of online and offline Internet traffic engineering capabilities and support systems.

This document can aid service providers in devising and implementing traffic engineering solutions for their networks.  Networking hardware and software vendors will also find this document helpful in the development of mechanisms and support systems for the Internet environment that support the traffic engineering function.

This document provides a terminology for describing and understanding common Internet traffic engineering concepts.  This document also provides a taxonomy of known traffic engineering styles.  In this context, a traffic engineering style abstracts important aspects from a traffic engineering methodology.  Traffic engineering styles can be viewed in different ways depending upon the specific context in which they are used and the specific purpose which they serve.  The combination of styles and views results in a natural taxonomy of traffic engineering systems.

Even though Internet traffic engineering is most effective when applied end-to-end, the initial focus of this document is intra-domain traffic engineering (that is, traffic engineering within a given autonomous system).
However, because a preponderance of Internet traffic tends to be inter-domain (originating in one autonomous system and terminating in another), this document provides an overview of aspects pertaining to inter-domain traffic engineering.

This work was first published as [RFC3272] in May 2002.  This document obsoletes [RFC3272] by making a complete update to bring the text in line with current best practices for Internet traffic engineering and to include references to the latest relevant work in the IETF.

1.1.  What is Internet Traffic Engineering?

Internet traffic engineering is defined as that aspect of Internet network engineering dealing with the issue of performance evaluation and performance optimization of operational IP networks.  Traffic Engineering encompasses the application of technology and scientific principles to the measurement, characterization, modeling, and control of Internet traffic [RFC2702], [AWD2].

Enhancing the performance of an operational network, at both the traffic and resource levels, is a major objective of Internet traffic engineering.  This is accomplished by addressing traffic oriented performance requirements, while utilizing network resources economically and reliably.  Traffic oriented performance measures include delay, delay variation, packet loss, and throughput.

An important objective of Internet traffic engineering is to facilitate reliable network operations [RFC2702].  Reliable network operations can be facilitated by providing mechanisms that enhance network integrity and by embracing policies emphasizing network survivability.  This results in a minimization of the vulnerability of the network to service outages arising from errors, faults, and failures occurring within the infrastructure.

The Internet exists in order to transfer information from source nodes to destination nodes.  Accordingly, one of the most significant functions performed by the Internet is the routing of traffic from ingress nodes to egress nodes.  Therefore, one of the most distinctive functions performed by Internet traffic engineering is the control and optimization of the routing function, to steer traffic through the network in the most effective way.

Ultimately, it is the performance of the network as seen by end users of network services that is truly paramount.  This crucial point should be considered throughout the development of traffic engineering mechanisms and policies.  The characteristics visible to end users are the emergent properties of the network, which are the characteristics of the network when viewed as a whole.  A central goal of the service provider, therefore, is to enhance the emergent properties of the network while taking economic considerations into account.

The importance of the above observation regarding the emergent properties of networks is that special care must be taken when choosing network performance measures to optimize.  Optimizing the wrong measures may achieve certain local objectives, but may have disastrous consequences on the emergent properties of the network and thereby on the quality of service perceived by end-users of network services.
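As an informal illustration of the distinction between a local objective and an emergent, network-wide property, the following sketch (in Python, using a purely hypothetical topology, link capacities, and traffic figures invented for this example) routes a single demand either over the path with the fewest hops or over a longer but lightly loaded path, and reports the resulting maximum link utilization.  It is not a traffic engineering algorithm, only an illustration of the point made above.

   # Hypothetical example only: capacities and background load in Mb/s.
   capacity = {("A", "B"): 100, ("B", "C"): 100,
               ("A", "D"): 1000, ("D", "E"): 1000, ("E", "C"): 1000}
   background = {("A", "B"): 60, ("B", "C"): 60,
                 ("A", "D"): 100, ("D", "E"): 100, ("E", "C"): 100}

   def max_link_utilization(path, demand_mbps):
       """Emergent (network-wide) measure: the highest link utilization
       after the demand has been placed on the given path."""
       return max((background[link] + (demand_mbps if link in path else 0.0))
                  / capacity[link] for link in capacity)

   demand = 80.0
   short_path = [("A", "B"), ("B", "C")]                # 2 hops
   long_path = [("A", "D"), ("D", "E"), ("E", "C")]     # 3 hops

   # The locally "optimal" (fewest hops) choice congests the A-B-C links,
   # while the longer path leaves every link below capacity.
   print("shortest path:", max_link_utilization(short_path, demand))  # 1.40
   print("longer path  :", max_link_utilization(long_path, demand))   # 0.60

In this toy case, minimizing hop count drives two links beyond their capacity (utilization 1.4), while the three-hop path leaves the most heavily loaded link at 60% utilization; the choice of measure to optimize clearly changes the service quality that end users would perceive.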
226 A subtle, but practical advantage of the systematic application of 227 traffic engineering concepts to operational networks is that it helps 228 to identify and structure goals and priorities in terms of enhancing 229 the quality of service delivered to end-users of network services. 230 The application of traffic engineering concepts also aids in the 231 measurement and analysis of the achievement of these goals. 233 The optimization aspects of traffic engineering can be achieved 234 through capacity management and traffic management. As used in this 235 document, capacity management includes capacity planning, routing 236 control, and resource management. Network resources of particular 237 interest include link bandwidth, buffer space, and computational 238 resources. Likewise, as used in this document, traffic management 239 includes (1) nodal traffic control functions such as traffic 240 conditioning, queue management, scheduling, and (2) other functions 241 that regulate traffic flow through the network or that arbitrate 242 access to network resources between different packets or between 243 different traffic streams. 245 The optimization objectives of Internet traffic engineering should be 246 viewed as a continual and iterative process of network performance 247 improvement and not simply as a one time goal. Traffic engineering 248 also demands continual development of new technologies and new 249 methodologies for network performance enhancement. 251 The optimization objectives of Internet traffic engineering may 252 change over time as new requirements are imposed, as new technologies 253 emerge, or as new insights are brought to bear on the underlying 254 problems. Moreover, different networks may have different 255 optimization objectives, depending upon their business models, 256 capabilities, and operating constraints. The optimization aspects of 257 traffic engineering are ultimately concerned with network control 258 regardless of the specific optimization goals in any particular 259 environment. 261 Thus, the optimization aspects of traffic engineering can be viewed 262 from a control perspective. The aspect of control within the 263 Internet traffic engineering arena can be pro-active and/or reactive. 264 In the pro-active case, the traffic engineering control system takes 265 preventive action to obviate predicted unfavorable future network 266 states. It may also take perfective action to induce a more 267 desirable state in the future. In the reactive case, the control 268 system responds correctively and perhaps adaptively to events that 269 have already transpired in the network. 271 The control dimension of Internet traffic engineering responds at 272 multiple levels of temporal resolution to network events. Certain 273 aspects of capacity management, such as capacity planning, respond at 274 very coarse temporal levels, ranging from days to possibly years. 275 The introduction of automatically switched optical transport networks 276 (e.g., based on the Multi-protocol Lambda Switching concepts) could 277 significantly reduce the lifecycle for capacity planning by 278 expediting provisioning of optical bandwidth. Routing control 279 functions operate at intermediate levels of temporal resolution, 280 ranging from milliseconds to days. 
Finally, the packet level 281 processing functions (e.g., rate shaping, queue management, and 282 scheduling) operate at very fine levels of temporal resolution, 283 ranging from picoseconds to milliseconds while responding to the 284 real-time statistical behavior of traffic. The subsystems of 285 Internet traffic engineering control include: capacity augmentation, 286 routing control, traffic control, and resource control (including 287 control of service policies at network elements). When capacity is 288 to be augmented for tactical purposes, it may be desirable to devise 289 a deployment plan that expedites bandwidth provisioning while 290 minimizing installation costs. 292 Inputs into the traffic engineering control system include network 293 state variables, policy variables, and decision variables. 295 One major challenge of Internet traffic engineering is the 296 realization of automated control capabilities that adapt quickly and 297 cost effectively to significant changes in a network's state, while 298 still maintaining stability. 300 Another critical dimension of Internet traffic engineering is network 301 performance evaluation, which is important for assessing the 302 effectiveness of traffic engineering methods, and for monitoring and 303 verifying compliance with network performance goals. Results from 304 performance evaluation can be used to identify existing problems, 305 guide network re-optimization, and aid in the prediction of potential 306 future problems. 308 Performance evaluation can be achieved in many different ways. The 309 most notable techniques include analytical methods, simulation, and 310 empirical methods based on measurements. When analytical methods or 311 simulation are used, network nodes and links can be modeled to 312 capture relevant operational features such as topology, bandwidth, 313 buffer space, and nodal service policies (link scheduling, packet 314 prioritization, buffer management, etc.). Analytical traffic models 315 can be used to depict dynamic and behavioral traffic characteristics, 316 such as burstiness, statistical distributions, and dependence. 318 Performance evaluation can be quite complicated in practical network 319 contexts. A number of techniques can be used to simplify the 320 analysis, such as abstraction, decomposition, and approximation. For 321 example, simplifying concepts such as effective bandwidth and 322 effective buffer [ELW95] may be used to approximate nodal behaviors 323 at the packet level and simplify the analysis at the connection 324 level. Network analysis techniques using, for example, queuing 325 models and approximation schemes based on asymptotic and 326 decomposition techniques can render the analysis even more tractable. 327 In particular, an emerging set of concepts known as network calculus 328 [CRUZ] based on deterministic bounds may simplify network analysis 329 relative to classical stochastic techniques. When using analytical 330 techniques, care should be taken to ensure that the models faithfully 331 reflect the relevant operational characteristics of the modeled 332 network entities. 334 Simulation can be used to evaluate network performance or to verify 335 and validate analytical approximations. Simulation can, however, be 336 computationally costly and may not always provide sufficient 337 insights. An appropriate approach to a given network performance 338 evaluation problem may involve a hybrid combination of analytical 339 techniques, simulation, and empirical methods. 
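As a small example of the analytical methods mentioned above, the following sketch applies the textbook M/M/1 queueing approximation to a single bottleneck link; the arrival and service rates are hypothetical values chosen for illustration.  As cautioned above, a real evaluation would need models that faithfully reflect the operational characteristics of the entities being modeled.

   def mm1_metrics(arrival_rate_pps, service_rate_pps):
       """Textbook M/M/1 approximations for one bottleneck link.

       Returns (utilization, mean number of packets in the system,
       mean time in the system in seconds)."""
       rho = arrival_rate_pps / service_rate_pps
       if rho >= 1.0:
           raise ValueError("offered load exceeds capacity: congestion")
       mean_pkts = rho / (1.0 - rho)                    # L = rho / (1 - rho)
       mean_delay = 1.0 / (service_rate_pps - arrival_rate_pps)   # W = 1 / (mu - lambda)
       return rho, mean_pkts, mean_delay

   # Hypothetical link serving 10,000 packets/s, offered 8,000 packets/s:
   rho, pkts, delay = mm1_metrics(8000.0, 10000.0)
   print(f"utilization={rho:.0%}  mean occupancy={pkts:.1f} packets  "
         f"mean delay={delay * 1000:.2f} ms")   # 80%, 4.0 packets, 0.50 ms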
341 As a general rule, traffic engineering concepts and mechanisms must 342 be sufficiently specific and well defined to address known 343 requirements, but simultaneously flexible and extensible to 344 accommodate unforeseen future demands. 346 1.2. Scope 348 The scope of this document is intra-domain traffic engineering; that 349 is, traffic engineering within a given autonomous system in the 350 Internet. This document will discuss concepts pertaining to intra- 351 domain traffic control, including such issues as routing control, 352 micro and macro resource allocation, and the control coordination 353 problems that arise consequently. 355 This document will describe and characterize techniques already in 356 use or in advanced development for Internet traffic engineering. The 357 way these techniques fit together will be discussed and scenarios in 358 which they are useful will be identified. 360 While this document considers various intra-domain traffic 361 engineering approaches, it focuses more on traffic engineering with 362 MPLS. Traffic engineering based upon manipulation of IGP metrics is 363 not addressed in detail. This topic may be addressed by other 364 working group document(s). 366 Although the emphasis is on intra-domain traffic engineering, in 367 Section 7, an overview of the high level considerations pertaining to 368 inter-domain traffic engineering will be provided. Inter-domain 369 Internet traffic engineering is crucial to the performance 370 enhancement of the global Internet infrastructure. 372 Whenever possible, relevant requirements from existing IETF documents 373 and other sources will be incorporated by reference. 375 1.3. Terminology 377 This subsection provides terminology which is useful for Internet 378 traffic engineering. The definitions presented apply to this 379 document. These terms may have other meanings elsewhere. 381 Baseline analysis A study conducted to serve as a baseline for 382 comparison to the actual behavior of the network. 384 Busy hour A one hour period within a specified interval of time 385 (typically 24 hours) in which the traffic load in a network or 386 sub-network is greatest. 388 Bottleneck A network element whose input traffic rate tends to be 389 greater than its output rate. 391 Congestion A state of a network resource in which the traffic 392 incident on the resource exceeds its output capacity over an 393 interval of time. 395 Congestion avoidance An approach to congestion management that 396 attempts to obviate the occurrence of congestion. 398 Congestion control An approach to congestion management that 399 attempts to remedy congestion problems that have already occurred. 401 Constraint-based routing A class of routing protocols that take 402 specified traffic attributes, network constraints, and policy 403 constraints into account when making routing decisions. 404 Constraint-based routing is applicable to traffic aggregates as 405 well as flows. It is a generalization of QoS routing. 407 Demand side congestion management A congestion management scheme 408 that addresses congestion problems by regulating or conditioning 409 offered load. 411 Effective bandwidth The minimum amount of bandwidth that can be 412 assigned to a flow or traffic aggregate in order to deliver 413 'acceptable service quality' to the flow or traffic aggregate. 415 Egress traffic Traffic exiting a network or network element. 417 Hot-spot A network element or subsystem which is in a state of 418 congestion. 
420 Ingress traffic Traffic entering a network or network element. 422 Inter-domain traffic Traffic that originates in one Autonomous 423 system and terminates in another. 425 Loss network A network that does not provide adequate buffering for 426 traffic, so that traffic entering a busy resource within the 427 network will be dropped rather than queued. 429 Metric A parameter defined in terms of standard units of 430 measurement. 432 Measurement Methodology A repeatable measurement technique used to 433 derive one or more metrics of interest. 435 Network Survivability The capability to provide a prescribed level 436 of QoS for existing services after a given number of failures 437 occur within the network. 439 Offline traffic engineering A traffic engineering system that exists 440 outside of the network. 442 Online traffic engineering A traffic engineering system that exists 443 within the network, typically implemented on or as adjuncts to 444 operational network elements. 446 Performance measures Metrics that provide quantitative or 447 qualitative measures of the performance of systems or subsystems 448 of interest. 450 Performance management A systematic approach to improving 451 effectiveness in the accomplishment of specific networking goals 452 related to performance improvement. 454 Performance Metric A performance parameter defined in terms of 455 standard units of measurement. 457 Provisioning The process of assigning or configuring network 458 resources to meet certain requests. 460 QoS routing Class of routing systems that selects paths to be used 461 by a flow based on the QoS requirements of the flow. 463 Service Level Agreement A contract between a provider and a customer 464 that guarantees specific levels of performance and reliability at 465 a certain cost. 467 Stability An operational state in which a network does not oscillate 468 in a disruptive manner from one mode to another mode. 470 Supply side congestion management A congestion management scheme 471 that provisions additional network resources to address existing 472 and/or anticipated congestion problems. 474 Transit traffic Traffic whose origin and destination are both 475 outside of the network under consideration. 477 Traffic characteristic A description of the temporal behavior or a 478 description of the attributes of a given traffic flow or traffic 479 aggregate. 481 Traffic engineering system A collection of objects, mechanisms, and 482 protocols that are used conjunctively to accomplish traffic 483 engineering objectives. 485 Traffic flow A stream of packets between two end-points that can be 486 characterized in a certain way. A micro-flow has a more specific 487 definition A micro-flow is a stream of packets with the same 488 source and destination addresses, source and destination ports, 489 and protocol ID. 491 Traffic intensity A measure of traffic loading with respect to a 492 resource capacity over a specified period of time. In classical 493 telephony systems, traffic intensity is measured in units of 494 Erlang. 496 Traffic matrix A representation of the traffic demand between a set 497 of origin and destination abstract nodes. An abstract node can 498 consist of one or more network elements. 500 Traffic monitoring The process of observing traffic characteristics 501 at a given point in a network and collecting the traffic 502 information for analysis and further action. 504 Traffic trunk An aggregation of traffic flows belonging to the same 505 class which are forwarded through a common path. 
A traffic trunk 506 may be characterized by an ingress and egress node, and a set of 507 attributes which determine its behavioral characteristics and 508 requirements from the network. 510 2. Background 512 The Internet has quickly evolved into a very critical communications 513 infrastructure, supporting significant economic, educational, and 514 social activities. Simultaneously, the delivery of Internet 515 communications services has become very competitive and end-users are 516 demanding very high quality service from their service providers. 517 Consequently, performance optimization of large scale IP networks, 518 especially public Internet backbones, have become an important 519 problem. Network performance requirements are multi-dimensional, 520 complex, and sometimes contradictory; making the traffic engineering 521 problem very challenging. 523 The network must convey IP packets from ingress nodes to egress nodes 524 efficiently, expeditiously, and economically. Furthermore, in a 525 multiclass service environment (e.g., Diffserv capable networks), the 526 resource sharing parameters of the network must be appropriately 527 determined and configured according to prevailing policies and 528 service models to resolve resource contention issues arising from 529 mutual interference between packets traversing through the network. 530 Thus, consideration must be given to resolving competition for 531 network resources between traffic streams belonging to the same 532 service class (intra-class contention resolution) and traffic streams 533 belonging to different classes (inter-class contention resolution). 535 2.1. Context of Internet Traffic Engineering 537 The context of Internet traffic engineering pertains to the scenarios 538 where traffic engineering is used. A traffic engineering methodology 539 establishes appropriate rules to resolve traffic performance issues 540 occurring in a specific context. The context of Internet traffic 541 engineering includes: 543 1. A network context defining the universe of discourse, and in 544 particular the situations in which the traffic engineering 545 problems occur. The network context includes network structure, 546 network policies, network characteristics, network constraints, 547 network quality attributes, and network optimization criteria. 549 2. A problem context defining the general and concrete issues that 550 traffic engineering addresses. The problem context includes 551 identification, abstraction of relevant features, representation, 552 formulation, specification of the requirements on the solution 553 space, and specification of the desirable features of acceptable 554 solutions. 556 3. A solution context suggesting how to address the issues 557 identified by the problem context. The solution context includes 558 analysis, evaluation of alternatives, prescription, and 559 resolution. 561 4. An implementation and operational context in which the solutions 562 are methodologically instantiated. The implementation and 563 operational context includes planning, organization, and 564 execution. 566 The context of Internet traffic engineering and the different problem 567 scenarios are discussed in the following subsections. 569 2.2. Network Context 571 IP networks range in size from small clusters of routers situated 572 within a given location, to thousands of interconnected routers, 573 switches, and other components distributed all over the world. 
Conceptually, at the most basic level of abstraction, an IP network can be represented as a distributed dynamical system consisting of: (1) a set of interconnected resources which provide transport services for IP traffic subject to certain constraints, (2) a demand system representing the offered load to be transported through the network, and (3) a response system consisting of network processes, protocols, and related mechanisms which facilitate the movement of traffic through the network (see also [AWD2]).

The network elements and resources may have specific characteristics restricting the manner in which the demand is handled.  Additionally, network resources may be equipped with traffic control mechanisms superintending the way in which the demand is serviced.  Traffic control mechanisms may, for example, be used to control various packet processing activities within a given resource, arbitrate contention for access to the resource by different packets, and regulate traffic behavior through the resource.  A configuration management and provisioning system may allow the settings of the traffic control mechanisms to be manipulated by external or internal entities in order to exercise control over the way in which the network elements respond to internal and external stimuli.

The details of how the network provides transport services for packets are specified in the policies of the network administrators and are installed through network configuration management and policy based provisioning systems.  Generally, the types of services provided by the network also depend upon the technology and characteristics of the network elements and protocols, the prevailing service and utility models, and the ability of the network administrators to translate policies into network configurations.

Contemporary Internet networks have three significant characteristics: (1) they provide real-time services, (2) they have become mission critical, and (3) their operating environments are very dynamic.  The dynamic characteristics of IP networks can be attributed in part to fluctuations in demand, to the interaction between various network protocols and processes, to the rapid evolution of the infrastructure which demands the constant inclusion of new technologies and new network elements, and to transient and persistent impairments which occur within the system.

Packets contend for the use of network resources as they are conveyed through the network.  A network resource is considered to be congested if the arrival rate of packets exceeds the output capacity of the resource over an interval of time.  Congestion may result in some of the arriving packets being delayed or even dropped.

Congestion increases transit delays, delay variation, packet loss, and reduces the predictability of network services.  Clearly, congestion is a highly undesirable phenomenon.

Combating congestion at a reasonable cost is a major objective of Internet traffic engineering.

Efficient sharing of network resources by multiple traffic streams is a basic economic premise for packet switched networks in general and for the Internet in particular.
A fundamental challenge in network 633 operation, especially in a large scale public IP network, is to 634 increase the efficiency of resource utilization while minimizing the 635 possibility of congestion. 637 Increasingly, the Internet will have to function in the presence of 638 different classes of traffic with different service requirements. 639 The advent of Differentiated Services [RFC2475] makes this 640 requirement particularly acute. Thus, packets may be grouped into 641 behavior aggregates such that each behavior aggregate may have a 642 common set of behavioral characteristics or a common set of delivery 643 requirements. In practice, the delivery requirements of a specific 644 set of packets may be specified explicitly or implicitly. Two of the 645 most important traffic delivery requirements are capacity constraints 646 and QoS constraints. 648 Capacity constraints can be expressed statistically as peak rates, 649 mean rates, burst sizes, or as some deterministic notion of effective 650 bandwidth. QoS requirements can be expressed in terms of (1) 651 integrity constraints such as packet loss and (2) in terms of 652 temporal constraints such as timing restrictions for the delivery of 653 each packet (delay) and timing restrictions for the delivery of 654 consecutive packets belonging to the same traffic stream (delay 655 variation). 657 2.3. Problem Context 659 Fundamental problems exist in association with the operation of a 660 network described by the simple model of the previous subsection. 661 This subsection reviews the problem context in relation to the 662 traffic engineering function. 664 The identification, abstraction, representation, and measurement of 665 network features relevant to traffic engineering is a significant 666 issue. 668 One particularly important class of problems concerns how to 669 explicitly formulate the problems that traffic engineering attempts 670 to solve, how to identify the requirements on the solution space, how 671 to specify the desirable features of good solutions, how to actually 672 solve the problems, and how to measure and characterize the 673 effectiveness of the solutions. 675 Another class of problems concerns how to measure and estimate 676 relevant network state parameters. Effective traffic engineering 677 relies on a good estimate of the offered traffic load as well as a 678 view of the underlying topology and associated resource constraints. 679 A network-wide view of the topology is also a must for offline 680 planning. 682 Still another class of problems concerns how to characterize the 683 state of the network and how to evaluate its performance under a 684 variety of scenarios. The performance evaluation problem is two- 685 fold. One aspect of this problem relates to the evaluation of the 686 system level performance of the network. The other aspect relates to 687 the evaluation of the resource level performance, which restricts 688 attention to the performance analysis of individual network 689 resources. In this memo, we refer to the system level 690 characteristics of the network as the "macro-states" and the resource 691 level characteristics as the "micro-states." The system level 692 characteristics are also known as the emergent properties of the 693 network as noted earlier. Correspondingly, we shall refer to the 694 traffic engineering schemes dealing with network performance 695 optimization at the systems level as "macro-TE" and the schemes that 696 optimize at the individual resource level as "micro-TE." 
Under certain circumstances, the system level performance can be derived from the resource level performance using appropriate rules of composition, depending upon the particular performance measures of interest.

Another fundamental class of problems concerns how to effectively optimize network performance.  Performance optimization may entail translating solutions to specific traffic engineering problems into network configurations.  Optimization may also entail some degree of resource management control, routing control, and/or capacity augmentation.

As noted previously, congestion is an undesirable phenomenon in operational networks.  Therefore, the next subsection addresses the issue of congestion and its ramifications within the problem context of Internet traffic engineering.

2.3.1.  Congestion and its Ramifications

Congestion is one of the most significant problems in an operational IP context.  A network element is said to be congested if it experiences sustained overload over an interval of time.  Congestion almost always results in degradation of service quality to end users.  Congestion control schemes can include demand side policies and supply side policies.  Demand side policies may restrict access to congested resources and/or dynamically regulate the demand to alleviate the overload situation.  Supply side policies may expand or augment network capacity to better accommodate offered traffic.  Supply side policies may also re-allocate network resources by redistributing traffic over the infrastructure.  Traffic redistribution and resource re-allocation serve to increase the 'effective capacity' seen by the demand.

The emphasis of this memo is primarily on congestion management schemes falling within the scope of the network, rather than on congestion management systems dependent upon sensitivity and adaptivity from end-systems.  That is, the aspects that are considered in this memo with respect to congestion management are those solutions that can be provided by control entities operating on the network and by the actions of network administrators and network operations systems.

2.4.  Solution Context

The solution context for Internet traffic engineering involves analysis, evaluation of alternatives, and choice between alternative courses of action.  Generally, the solution context is predicated on making reasonable inferences about the current or future state of the network, and subsequently making appropriate decisions that may involve a preference between alternative sets of action.  More specifically, the solution context demands reasonable estimates of traffic workload, characterization of network state, deriving solutions to traffic engineering problems which may be implicitly or explicitly formulated, and possibly instantiating a set of control actions.  Control actions may involve the manipulation of parameters associated with routing, control over tactical capacity acquisition, and control over the traffic management functions.

The following list of instruments may be applicable to the solution context of Internet traffic engineering.

1. A set of policies, objectives, and requirements (which may be context dependent) for network performance evaluation and performance optimization.

2.
A collection of online and possibly offline tools and mechanisms 763 for measurement, characterization, modeling, and control of 764 Internet traffic and control over the placement and allocation of 765 network resources, as well as control over the mapping or 766 distribution of traffic onto the infrastructure. 768 3. A set of constraints on the operating environment, the network 769 protocols, and the traffic engineering system itself. 771 4. A set of quantitative and qualitative techniques and 772 methodologies for abstracting, formulating, and solving traffic 773 engineering problems. 775 5. A set of administrative control parameters which may be 776 manipulated through a Configuration Management (CM) system. The 777 CM system itself may include a configuration control subsystem, a 778 configuration repository, a configuration accounting subsystem, 779 and a configuration auditing subsystem. 781 6. A set of guidelines for network performance evaluation, 782 performance optimization, and performance improvement. 784 Derivation of traffic characteristics through measurement and/or 785 estimation is very useful within the realm of the solution space for 786 traffic engineering. Traffic estimates can be derived from customer 787 subscription information, traffic projections, traffic models, and 788 from actual empirical measurements. The empirical measurements may 789 be performed at the traffic aggregate level or at the flow level in 790 order to derive traffic statistics at various levels of detail. 791 Measurements at the flow level or on small traffic aggregates may be 792 performed at edge nodes, where traffic enters and leaves the network. 793 Measurements at large traffic aggregate levels may be performed 794 within the core of the network where potentially numerous traffic 795 flows may be in transit concurrently. 797 To conduct performance studies and to support planning of existing 798 and future networks, a routing analysis may be performed to determine 799 the path(s) the routing protocols will choose for various traffic 800 demands, and to ascertain the utilization of network resources as 801 traffic is routed through the network. The routing analysis should 802 capture the selection of paths through the network, the assignment of 803 traffic across multiple feasible routes, and the multiplexing of IP 804 traffic over traffic trunks (if such constructs exists) and over the 805 underlying network infrastructure. A network topology model is a 806 necessity for routing analysis. A network topology model may be 807 extracted from network architecture documents, from network designs, 808 from information contained in router configuration files, from 809 routing databases, from routing tables, or from automated tools that 810 discover and depict network topology information. Topology 811 information may also be derived from servers that monitor network 812 state, and from servers that perform provisioning functions. 814 Routing in operational IP networks can be administratively controlled 815 at various levels of abstraction including the manipulation of BGP 816 attributes and manipulation of IGP metrics. For path oriented 817 technologies such as MPLS, routing can be further controlled by the 818 manipulation of relevant traffic engineering parameters, resource 819 parameters, and administrative policy constraints. 
Within the 820 context of MPLS, the path of an explicit label switched path (LSP) 821 can be computed and established in various ways including: (1) 822 manually, (2) automatically online using constraint-based routing 823 processes implemented on label switching routers, and (3) 824 automatically offline using constraint-based routing entities 825 implemented on external traffic engineering support systems. 827 2.4.1. Combating the Congestion Problem 829 Minimizing congestion is a significant aspect of Internet traffic 830 engineering. This subsection gives an overview of the general 831 approaches that have been used or proposed to combat congestion 832 problems. 834 Congestion management policies can be categorized based upon the 835 following criteria (see e.g., [YARE95] for a more detailed taxonomy 836 of congestion control schemes): (1) Response time scale which can be 837 characterized as long, medium, or short; (2) reactive versus 838 preventive which relates to congestion control and congestion 839 avoidance; and (3) supply side versus demand side congestion 840 management schemes. These aspects are discussed in the following 841 paragraphs. 843 1. Congestion Management based on Response Time Scales 845 * Long (weeks to months): Capacity planning works over a 846 relatively long time scale to expand network capacity based on 847 estimates or forecasts of future traffic demand and traffic 848 distribution. Since router and link provisioning take time 849 and are generally expensive, these upgrades are typically 850 carried out in the weeks-to-months or even years time scale. 852 * Medium (minutes to days): Several control policies fall within 853 the medium time scale category. Examples include: (1) 854 Adjusting IGP and/or BGP parameters to route traffic away or 855 towards certain segments of the network; (2) Setting up and/or 856 adjusting some explicitly routed label switched paths (ER- 857 LSPs) in MPLS networks to route some traffic trunks away from 858 possibly congested resources or towards possibly more 859 favorable routes; (3) re-configuring the logical topology of 860 the network to make it correlate more closely with the spatial 861 traffic distribution using for example some underlying path- 862 oriented technology such as MPLS LSPs, ATM PVCs, or optical 863 channel trails. Many of these adaptive medium time scale 864 response schemes rely on a measurement system that monitors 865 changes in traffic distribution, traffic shifts, and network 866 resource utilization and subsequently provides feedback to the 867 online and/or offline traffic engineering mechanisms and tools 868 which employ this feedback information to trigger certain 869 control actions to occur within the network. The traffic 870 engineering mechanisms and tools can be implemented in a 871 distributed fashion or in a centralized fashion, and may have 872 a hierarchical structure or a flat structure. The comparative 873 merits of distributed and centralized control structures for 874 networks are well known. A centralized scheme may have global 875 visibility into the network state and may produce potentially 876 more optimal solutions. However, centralized schemes are 877 prone to single points of failure and may not scale as well as 878 distributed schemes. Moreover, the information utilized by a 879 centralized scheme may be stale and may not reflect the actual 880 state of the network. It is not an objective of this memo to 881 make a recommendation between distributed and centralized 882 schemes. 
This is a choice that network administrators must make based on their specific needs.

* Short (picoseconds to minutes): This category includes packet level processing functions and events on the order of several round trip times.  It includes router mechanisms such as passive and active buffer management.  These mechanisms are used to control congestion and/or signal congestion to end systems so that they can adaptively regulate the rate at which traffic is injected into the network.  One of the most popular active queue management schemes, especially for TCP traffic, is Random Early Detection (RED) [FLJA93], which supports congestion avoidance by controlling the average queue size.  During congestion (but before the queue is filled), the RED scheme chooses arriving packets to "mark" according to a probabilistic algorithm which takes into account the average queue size.  For a router that does not utilize explicit congestion notification (ECN) (see, e.g., [FLOY94]), the marked packets can simply be dropped to signal the inception of congestion to end systems.  On the other hand, if the router supports ECN, then it can set the ECN field in the packet header.  Several variations of RED have been proposed to support different drop precedence levels in multi-class environments [RFC2597], e.g., RED with In and Out (RIO) and Weighted RED.  There is general consensus that RED provides congestion avoidance performance which is not worse than traditional Tail-Drop (TD) queue management (drop arriving packets only when the queue is full).  Importantly, however, RED reduces the possibility of global synchronization and improves fairness among different TCP sessions.  However, RED by itself cannot prevent congestion and unfairness caused by sources unresponsive to RED, e.g., UDP traffic and some misbehaving greedy connections.  Other schemes have been proposed to improve the performance and fairness in the presence of unresponsive traffic.  Some of these schemes were proposed as theoretical frameworks and are typically not available in existing commercial products.  Two such schemes are Longest Queue Drop (LQD) and Dynamic Soft Partitioning with Random Drop (RND) [SLDC98].

2. Congestion Management: Reactive versus Preventive Schemes

* Reactive: reactive (recovery) congestion management policies react to existing congestion problems in order to alleviate them.  All the policies described in the long and medium time scales above can be categorized as being reactive, especially if the policies are based on monitoring and identifying existing congestion problems, and on the initiation of relevant actions to ease the situation.

* Preventive: preventive (predictive/avoidance) policies take proactive action to prevent congestion based on estimates and predictions of future potential congestion problems.  Some of the policies described in the long and medium time scales fall into this category.  They do not necessarily respond immediately to existing congestion problems.  Instead, forecasts of traffic demand and workload distribution are considered and action may be taken to prevent potential congestion problems in the future.
The schemes described in the short time scale (e.g., RED and its variations, ECN, LQD, and RND) are also used for congestion avoidance since dropping or marking packets before queues actually overflow would trigger corresponding TCP sources to slow down.

3. Congestion Management: Supply Side versus Demand Side Schemes

* Supply side: supply side congestion management policies increase the effective capacity available to traffic in order to control or obviate congestion.  This can be accomplished by augmenting capacity.  Another way to accomplish this is to minimize congestion by having a relatively balanced distribution of traffic over the network.  For example, capacity planning should aim to provide a physical topology and associated link bandwidths that match estimated traffic workload and traffic distribution based on forecasting (subject to budgetary and other constraints).  However, if actual traffic distribution does not match the topology derived from capacity planning (due to forecasting errors or facility constraints, for example), then the traffic can be mapped onto the existing topology using routing control mechanisms, using path oriented technologies (e.g., MPLS LSPs and optical channel trails) to modify the logical topology, or by using some other load redistribution mechanisms.

* Demand side: demand side congestion management policies control or regulate the offered traffic to alleviate congestion problems.  For example, some of the short time scale mechanisms described earlier (such as RED and its variations, ECN, LQD, and RND) as well as policing and rate shaping mechanisms attempt to regulate the offered load in various ways.  Tariffs may also be applied as a demand side instrument.  To date, however, tariffs have not been used as a means of demand side congestion management within the Internet.

In summary, a variety of mechanisms can be used to address congestion problems in IP networks.  These mechanisms may operate at multiple time-scales.

2.5.  Implementation and Operational Context

The operational context of Internet traffic engineering is characterized by constant changes which occur at multiple levels of abstraction.  The implementation context demands effective planning, organization, and execution.  The planning aspects may involve determining prior sets of actions to achieve desired objectives.  Organizing involves arranging and assigning responsibility to the various components of the traffic engineering system and coordinating the activities to accomplish the desired TE objectives.  Execution involves measuring and applying corrective or perfective actions to attain and maintain desired TE goals.

2.6.  High-Level Objectives

The high-level objectives for Internet traffic engineering include: usability, automation, scalability, stability, visibility, simplicity, efficiency, reliability, correctness, maintainability, extensibility, interoperability, and security.  In a given context, some of these objectives may be critical while others may be optional.  Therefore, prioritization may be required during the development phase of a traffic engineering system (or components thereof) to tailor it to a specific operational context.
1005 In the following paragraphs, some of the aspects of the high-level 1006 objectives for Internet traffic engineering are summarized. 1008 Usability: Usability is a human factor aspect of traffic engineering 1009 systems. Usability refers to the ease with which a traffic 1010 engineering system can be deployed and operated. In general, it is 1011 desirable to have a TE system that can be readily deployed in an 1012 existing network. It is also desirable to have a TE system that is 1013 easy to operate and maintain. 1015 Automation: Whenever feasible, a traffic engineering system should 1016 automate as many traffic engineering functions as possible to 1017 minimize the amount of human effort needed to control and analyze 1018 operational networks. Automation is particularly imperative in large 1019 scale public networks because of the high cost of the human aspects 1020 of network operations and the high risk of network problems caused by 1021 human errors. Automation may entail the incorporation of automatic 1022 feedback and intelligence into some components of the traffic 1023 engineering system. 1025 Scalability: Contemporary public networks are growing very fast with 1026 respect to network size and traffic volume. Therefore, a TE system 1027 should be scalable to remain applicable as the network evolves. In 1028 particular, a TE system should remain functional as the network 1029 expands with regard to the number of routers and links, and with 1030 respect to the traffic volume. A TE system should have a scalable 1031 architecture, should not adversely impair other functions and 1032 processes in a network element, and should not consume too much 1033 network resources when collecting and distributing state information 1034 or when exerting control. 1036 Stability: Stability is a very important consideration in traffic 1037 engineering systems that respond to changes in the state of the 1038 network. State-dependent traffic engineering methodologies typically 1039 mandate a tradeoff between responsiveness and stability. It is 1040 strongly recommended that when tradeoffs are warranted between 1041 responsiveness and stability, that the tradeoff should be made in 1042 favor of stability (especially in public IP backbone networks). 1044 Flexibility: A TE system should be flexible to allow for changes in 1045 optimization policy. In particular, a TE system should provide 1046 sufficient configuration options so that a network administrator can 1047 tailor the TE system to a particular environment. It may also be 1048 desirable to have both online and offline TE subsystems which can be 1049 independently enabled and disabled. TE systems that are used in 1050 multi-class networks should also have options to support class based 1051 performance evaluation and optimization. 1053 Visibility: As part of the TE system, mechanisms should exist to 1054 collect statistics from the network and to analyze these statistics 1055 to determine how well the network is functioning. Derived statistics 1056 such as traffic matrices, link utilization, latency, packet loss, and 1057 other performance measures of interest which are determined from 1058 network measurements can be used as indicators of prevailing network 1059 conditions. Other examples of status information which should be 1060 observed include existing functional routing information 1061 (additionally, in the context of MPLS existing LSP routes), etc. 1063 Simplicity: Generally, a TE system should be as simple as possible. 
1064 More importantly, the TE system should be relatively easy to use 1065 (i.e., clean, convenient, and intuitive user interfaces). Simplicity 1066 in user interface does not necessarily imply that the TE system will 1067 use naive algorithms. When complex algorithms and internal 1068 structures are used, such complexities should be hidden as much as 1069 possible from the network administrator through the user interface. 1071 Interoperability: Whenever feasible, traffic engineering systems and 1072 their components should be developed with open standards based 1073 interfaces to allow interoperation with other systems and components. 1075 Security: Security is a critical consideration in traffic engineering 1076 systems. Such traffic engineering systems typically exert control 1077 over certain functional aspects of the network to achieve the desired 1078 performance objectives. Therefore, adequate measures must be taken 1079 to safeguard the integrity of the traffic engineering system. 1080 Adequate measures must also be taken to protect the network from 1081 vulnerabilities that originate from security breaches and other 1082 impairments within the traffic engineering system. 1084 The remainder of this section will focus on some of the high level 1085 functional recommendations for traffic engineering. 1087 3. Traffic Engineering Process Models 1089 This section describes a generic process model that captures the high 1090 level practical aspects of Internet traffic engineering in an 1091 operational context. The process model is described as a sequence of 1092 actions that a traffic engineer, or more generally a traffic 1093 engineering system, must perform to optimize the performance of an 1094 operational network (see also [RFC2702], [AWD2]). The process model 1095 described here represents the broad activities common to most traffic 1096 engineering methodologies although the details regarding how traffic 1097 engineering is executed may differ from network to network. This 1098 process model may be enacted explicitly or implicitly, by an 1099 automaton and/or by a human. 1101 The traffic engineering process model is iterative [AWD2]. The four 1102 phases of the process model described below are repeated continually. 1104 The first phase of the TE process model is to define the relevant 1105 control policies that govern the operation of the network. These 1106 policies may depend upon many factors including the prevailing 1107 business model, the network cost structure, the operating 1108 constraints, the utility model, and optimization criteria. 1110 The second phase of the process model is a feedback mechanism 1111 involving the acquisition of measurement data from the operational 1112 network. If empirical data is not readily available from the 1113 network, then synthetic workloads may be used instead which reflect 1114 either the prevailing or the expected workload of the network. 1115 Synthetic workloads may be derived by estimation or extrapolation 1116 using prior empirical data. Their derivation may also be obtained 1117 using mathematical models of traffic characteristics or other means. 1119 The third phase of the process model is to analyze the network state 1120 and to characterize traffic workload. Performance analysis may be 1121 proactive and/or reactive. Proactive performance analysis identifies 1122 potential problems that do not exist, but could manifest in the 1123 future. 
Reactive performance analysis identifies existing problems, 1124 determines their cause through diagnosis, and evaluates alternative 1125 approaches to remedy the problem, if necessary. A number of 1126 quantitative and qualitative techniques may be used in the analysis 1127 process, including modeling based analysis and simulation. The 1128 analysis phase of the process model may involve investigating the 1129 concentration and distribution of traffic across the network or 1130 relevant subsets of the network, identifying the characteristics of 1131 the offered traffic workload, identifying existing or potential 1132 bottlenecks, and identifying network pathologies such as ineffective 1133 link placement, single points of failures, etc. Network pathologies 1134 may result from many factors including inferior network architecture, 1135 inferior network design, and configuration problems. A traffic 1136 matrix may be constructed as part of the analysis process. Network 1137 analysis may also be descriptive or prescriptive. 1139 The fourth phase of the TE process model is the performance 1140 optimization of the network. The performance optimization phase 1141 involves a decision process which selects and implements a set of 1142 actions from a set of alternatives. Optimization actions may include 1143 the use of appropriate techniques to either control the offered 1144 traffic or to control the distribution of traffic across the network. 1145 Optimization actions may also involve adding additional links or 1146 increasing link capacity, deploying additional hardware such as 1147 routers and switches, systematically adjusting parameters associated 1148 with routing such as IGP metrics and BGP attributes, and adjusting 1149 traffic management parameters. Network performance optimization may 1150 also involve starting a network planning process to improve the 1151 network architecture, network design, network capacity, network 1152 technology, and the configuration of network elements to accommodate 1153 current and future growth. 1155 3.1. Components of the Traffic Engineering Process Model 1157 The key components of the traffic engineering process model include a 1158 measurement subsystem, a modeling and analysis subsystem, and an 1159 optimization subsystem. The following subsections examine these 1160 components as they apply to the traffic engineering process model. 1162 3.2. Measurement 1164 Measurement is crucial to the traffic engineering function. The 1165 operational state of a network can be conclusively determined only 1166 through measurement. Measurement is also critical to the 1167 optimization function because it provides feedback data which is used 1168 by traffic engineering control subsystems. This data is used to 1169 adaptively optimize network performance in response to events and 1170 stimuli originating within and outside the network. Measurement is 1171 also needed to determine the quality of network services and to 1172 evaluate the effectiveness of traffic engineering policies. 1173 Experience suggests that measurement is most effective when acquired 1174 and applied systematically. 1176 When developing a measurement system to support the traffic 1177 engineering function in IP networks, the following questions should 1178 be carefully considered: Why is measurement needed in this particular 1179 context? What parameters are to be measured? How should the 1180 measurement be accomplished? Where should the measurement be 1181 performed? 
When should the measurement be performed? How frequently 1182 should the monitored variables be measured? What level of 1183 measurement accuracy and reliability is desirable? What level of 1184 measurement accuracy and reliability is realistically attainable? To 1185 what extent can the measurement system permissibly interfere with the 1186 monitored network components and variables? What is the acceptable 1187 cost of measurement? The answers to these questions will determine 1188 the measurement tools and methodologies appropriate in any given 1189 traffic engineering context. 1191 It should also be noted that there is a distinction between 1192 measurement and evaluation. Measurement provides raw data concerning 1193 state parameters and variables of monitored network elements. 1194 Evaluation utilizes the raw data to make inferences regarding the 1195 monitored system. 1197 Measurement in support of the TE function can occur at different 1198 levels of abstraction. For example, measurement can be used to 1199 derive packet level characteristics, flow level characteristics, user 1200 or customer level characteristics, traffic aggregate characteristics, 1201 component level characteristics, and network wide characteristics. 1203 3.3. Modeling, Analysis, and Simulation 1205 Modeling and analysis are important aspects of Internet traffic 1206 engineering. Modeling involves constructing an abstract or physical 1207 representation which depicts relevant traffic characteristics and 1208 network attributes. 1210 A network model is an abstract representation of the network which 1211 captures relevant network features, attributes, and characteristics, 1212 such as link and nodal attributes and constraints. A network model 1213 may facilitate analysis and/or simulation which can be used to 1214 predict network performance under various conditions as well as to 1215 guide network expansion plans. 1217 In general, Internet traffic engineering models can be classified as 1218 either structural or behavioral. Structural models focus on the 1219 organization of the network and its components. Behavioral models 1220 focus on the dynamics of the network and the traffic workload. 1221 Modeling for Internet traffic engineering may also be formal or 1222 informal. 1224 Accurate behavioral models for traffic sources are particularly 1225 useful for analysis. Development of behavioral traffic source models 1226 that are consistent with empirical data obtained from operational 1227 networks is a major research topic in Internet traffic engineering. 1228 These source models should also be tractable and amenable to 1229 analysis. The topic of source models for IP traffic is a research 1230 topic and is therefore outside the scope of this document. Its 1231 importance, however, must be emphasized. 1233 Network simulation tools are extremely useful for traffic 1234 engineering. Because of the complexity of realistic quantitative 1235 analysis of network behavior, certain aspects of network performance 1236 studies can only be conducted effectively using simulation. A good 1237 network simulator can be used to mimic and visualize network 1238 characteristics under various conditions in a safe and non-disruptive 1239 manner. For example, a network simulator may be used to depict 1240 congested resources and hot spots, and to provide hints regarding 1241 possible solutions to network performance problems. 
A good simulator 1242 may also be used to validate the effectiveness of planned solutions 1243 to network issues without the need to tamper with the operational 1244 network, or to commence an expensive network upgrade which may not 1245 achieve the desired objectives. Furthermore, during the process of 1246 network planning, a network simulator may reveal pathologies such as 1247 single points of failure which may require additional redundancy, and 1248 potential bottlenecks and hot spots which may require additional 1249 capacity. 1251 Routing simulators are especially useful in large networks. A 1252 routing simulator may identify planned links which may not actually 1253 be used to route traffic by the existing routing protocols. 1254 Simulators can also be used to conduct scenario based and 1255 perturbation based analysis, as well as sensitivity studies. 1256 Simulation results can be used to initiate appropriate actions in 1257 various ways. For example, an important application of network 1258 simulation tools is to investigate and identify how best to make the 1259 network evolve and grow, in order to accommodate projected future 1260 demands. 1262 3.4. Optimization 1264 Network performance optimization involves resolving network issues by 1265 transforming such issues into concepts that enable a solution, 1266 identification of a solution, and implementation of the solution. 1267 Network performance optimization can be corrective or perfective. In 1268 corrective optimization, the goal is to remedy a problem that has 1269 occurred or that is incipient. In perfective optimization, the goal 1270 is to improve network performance even when explicit problems do not 1271 exist and are not anticipated. 1273 Network performance optimization is a continual process, as noted 1274 previously. Performance optimization iterations may consist of real- 1275 time optimization sub-processes and non-real-time network planning 1276 sub-processes. The difference between real-time optimization and 1277 network planning is primarily in the relative time- scale in which 1278 they operate and in the granularity of actions. One of the 1279 objectives of a real-time optimization sub-process is to control the 1280 mapping and distribution of traffic over the existing network 1281 infrastructure to avoid and/or relieve congestion, to assure 1282 satisfactory service delivery, and to optimize resource utilization. 1283 Real-time optimization is needed because random incidents such as 1284 fiber cuts or shifts in traffic demand will occur irrespective of how 1285 well a network is designed. These incidents can cause congestion and 1286 other problems to manifest in an operational network. Real-time 1287 optimization must solve such problems in small to medium time-scales 1288 ranging from micro-seconds to minutes or hours. Examples of real- 1289 time optimization include queue management, IGP/BGP metric tuning, 1290 and using technologies such as MPLS explicit LSPs to change the paths 1291 of some traffic trunks [XIAO]. 1293 One of the functions of the network planning sub-process is to 1294 initiate actions to systematically evolve the architecture, 1295 technology, topology, and capacity of a network. When a problem 1296 exists in the network, real-time optimization should provide an 1297 immediate remedy. Because a prompt response is necessary, the real- 1298 time solution may not be the best possible solution. 
Network planning may subsequently be needed to refine the solution and improve the situation. Network planning is also required to expand the network to support traffic growth and changes in traffic distribution over time. As previously noted, a change in the topology and/or capacity of the network may be the outcome of network planning.

Clearly, network planning and real-time performance optimization are mutually complementary activities. A well-planned and designed network makes real-time optimization easier, while a systematic approach to real-time network performance optimization allows network planning to focus on long-term issues rather than tactical considerations. Systematic real-time network performance optimization also provides valuable inputs and insights toward network planning.

Stability is an important consideration in real-time network performance optimization. This aspect is addressed repeatedly throughout this memo.

4. Review of TE Techniques

This section briefly reviews different traffic engineering approaches proposed and implemented in telecommunications and computer networks. The discussion is not intended to be comprehensive. It is primarily intended to illuminate pre-existing perspectives and prior art concerning traffic engineering in the Internet and in legacy telecommunications networks.

4.1. Historic Overview

4.1.1. Traffic Engineering in Classical Telephone Networks

This subsection presents a brief overview of traffic engineering in telephone networks, which chiefly concerns the way user traffic is steered from an originating node to the terminating node. A detailed description of the various routing strategies applied in telephone networks is included in the book by G. Ash [ASH2].

The early telephone network relied on static hierarchical routing, whereby routing patterns remained fixed independent of the state of the network or time of day. The hierarchy was intended to accommodate overflow traffic, improve network reliability via alternate routes, and prevent call looping by employing strict hierarchical rules. The network was typically over-provisioned since a given fixed route had to be dimensioned so that it could carry user traffic during a busy hour of any busy day. Hierarchical routing in the telephony network was found to be too rigid upon the advent of digital switches and stored program control, which were able to manage more complicated traffic engineering rules.

Dynamic routing was introduced to alleviate the routing inflexibility of static hierarchical routing so that the network would operate more efficiently. This resulted in significant economic gains [HUSS87]. Dynamic routing typically reduces the overall loss probability by 10 to 20 percent (compared to static hierarchical routing). Dynamic routing can also improve network resilience by recalculating routes on a per-call basis and periodically updating routes.

There are three main types of dynamic routing in the telephone network: time-dependent routing, state-dependent routing (SDR), and event-dependent routing (EDR).
1364 In time-dependent routing, regular variations in traffic loads (such 1365 as time of day or day of week) are exploited in pre-planned routing 1366 tables. In state-dependent routing, routing tables are updated 1367 online according to the current state of the network (e.g., traffic 1368 demand, utilization, etc.). In event dependent routing, routing 1369 changes are incepted by events (such as call setups encountering 1370 congested or blocked links) whereupon new paths are searched out 1371 using learning models. EDR methods are real-time adaptive, but they 1372 do not require global state information as does SDR. Examples of EDR 1373 schemes include the dynamic alternate routing (DAR) from BT, the 1374 state-and-time dependent routing (STR) from NTT, and the success-to- 1375 the-top (STT) routing from AT&T. 1377 Dynamic non-hierarchical routing (DNHR) is an example of dynamic 1378 routing that was introduced in the AT&T toll network in the 1980's to 1379 respond to time-dependent information such as regular load variations 1380 as a function of time. Time-dependent information in terms of load 1381 may be divided into three time scales: hourly, weekly, and yearly. 1382 Correspondingly, three algorithms are defined to pre-plan the routing 1383 tables. The network design algorithm operates over a year-long 1384 interval while the demand servicing algorithm operates on a weekly 1385 basis to fine tune link sizes and routing tables to correct forecast 1386 errors on the yearly basis. At the smallest time scale, the routing 1387 algorithm is used to make limited adjustments based on daily traffic 1388 variations. Network design and demand servicing are computed using 1389 offline calculations. Typically, the calculations require extensive 1390 searches on possible routes. On the other hand, routing may need 1391 online calculations to handle crankback. DNHR adopts a "two-link" 1392 approach whereby a path can consist of two links at most. The 1393 routing algorithm presents an ordered list of route choices between 1394 an originating switch and a terminating switch. If a call overflows, 1395 a via switch (a tandem exchange between the originating switch and 1396 the terminating switch) would send a crankback signal to the 1397 originating switch. This switch would then select the next route, 1398 and so on, until there are no alternative routes available in which 1399 the call is blocked. 1401 4.1.2. Evolution of Traffic Engineering in Packet Networks 1403 This subsection reviews related prior work that was intended to 1404 improve the performance of data networks. Indeed, optimization of 1405 the performance of data networks started in the early days of the 1406 ARPANET. Other early commercial networks such as SNA also recognized 1407 the importance of performance optimization and service 1408 differentiation. 1410 In terms of traffic management, the Internet has been a best effort 1411 service environment until recently. In particular, very limited 1412 traffic management capabilities existed in IP networks to provide 1413 differentiated queue management and scheduling services to packets 1414 belonging to different classes. 1416 In terms of routing control, the Internet has employed distributed 1417 protocols for intra-domain routing. These protocols are highly 1418 scalable and resilient. However, they are based on simple algorithms 1419 for path selection which have very limited functionality to allow 1420 flexible control of the path selection process. 
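To make this limitation concrete, the following sketch (a minimal illustration in Python, with an invented three-node topology; it is not part of the original text) computes routes the way a plain SPF IGP does: only the additive, administratively assigned metric is considered, so all traffic between two nodes follows the single selected path regardless of link capacity or load.

   import heapq

   def spf(graph, source):
       # Dijkstra's algorithm over static link metrics.
       # graph[u] is a dict {neighbor: metric}.
       dist, prev = {source: 0}, {}
       heap = [(0, source)]
       while heap:
           d, u = heapq.heappop(heap)
           if d > dist.get(u, float("inf")):
               continue              # stale heap entry
           for v, metric in graph.get(u, {}).items():
               nd = d + metric
               if nd < dist.get(v, float("inf")):
                   dist[v], prev[v] = nd, u
                   heapq.heappush(heap, (nd, v))
       return dist, prev

   # Metrics are purely administrative; capacity and load play no part.
   graph = {
       "A": {"B": 1, "C": 5},
       "B": {"A": 1, "C": 1},
       "C": {"A": 5, "B": 1},
   }
   dist, prev = spf(graph, "A")
   # All A->C traffic follows A-B-C (total metric 2); the direct A-C
   # link is never used, however idle it may be.
   print(dist, prev)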
In the following subsections, the evolution of practical traffic engineering mechanisms in IP networks and their predecessors is reviewed.

4.1.2.1. Adaptive Routing in the ARPANET

The early ARPANET recognized the importance of adaptive routing, where routing decisions were based on the current state of the network [MCQ80]. Early minimum delay routing approaches forwarded each packet to its destination along a path for which the total estimated transit time was the smallest. Each node maintained a table of network delays, representing the estimated delay that a packet would experience along a given path toward its destination. The minimum delay table was periodically transmitted by a node to its neighbors. The shortest path, in terms of hop count, was also propagated to give the connectivity information.

One drawback to this approach is that dynamic link metrics tend to create "traffic magnets" causing congestion to be shifted from one location of a network to another location, resulting in oscillation and network instability.

4.1.2.2. Dynamic Routing in the Internet

The Internet evolved from the ARPANET and adopted dynamic routing algorithms with distributed control to determine the paths that packets should take en-route to their destinations. The routing algorithms are adaptations of shortest path algorithms where costs are based on link metrics. The link metric can be based on static or dynamic quantities. The link metric based on static quantities may be assigned administratively according to local criteria. The link metric based on dynamic quantities may be a function of a network congestion measure such as delay or packet loss.

It was apparent early that static link metric assignment was inadequate because it can easily lead to unfavorable scenarios in which some links become congested while others remain lightly loaded. One of the many reasons for the inadequacy of static link metrics is that link metric assignment was often done without considering the traffic matrix in the network. Also, the routing protocols did not take traffic attributes and capacity constraints into account when making routing decisions. The result is that traffic concentration can become localized in subsets of the network infrastructure, potentially causing congestion. Even if link metrics are assigned in accordance with the traffic matrix, unbalanced loads in the network can still occur due to a number of factors, including:

o Resources not being deployed in the most optimal locations from a routing perspective.

o Forecasting errors in traffic volume and/or traffic distribution.

o Dynamics in the traffic matrix due to the temporal nature of traffic patterns, BGP policy changes from peers, etc.

The inadequacy of the legacy Internet interior gateway routing system is one of the factors motivating the interest in path oriented technology with explicit routing and constraint-based routing capability such as MPLS.

4.1.2.3. ToS Routing

Type-of-Service (ToS) routing involves different routes going to the same destination with selection dependent upon the ToS field of an IP packet [RFC2474]. The ToS classes may be classified as low delay and high throughput.
Each link is associated with multiple link costs and each link cost is used to compute routes for a particular ToS. A separate shortest path tree is computed for each ToS. The shortest path algorithm must be run for each ToS, resulting in very expensive computation. Classical ToS-based routing is now outdated as the IP header field has been replaced by a Diffserv field. Effective traffic engineering is difficult to perform in classical ToS-based routing because each class still relies exclusively on shortest path routing, which results in localization of traffic concentration within the network.

4.1.2.4. Equal Cost Multi-Path

Equal Cost Multi-Path (ECMP) is another technique that attempts to address the deficiency in the Shortest Path First (SPF) interior gateway routing systems [RFC2328]. In the classical SPF algorithm, if two or more shortest paths exist to a given destination, the algorithm will choose one of them. The algorithm is modified slightly in ECMP so that if two or more equal cost shortest paths exist between two nodes, the traffic between the nodes is distributed among the multiple equal-cost paths. Traffic distribution across the equal-cost paths is usually performed in one of two ways: (1) packet-based in a round-robin fashion, or (2) flow-based using hashing on source and destination IP addresses and possibly other fields of the IP header. The first approach can easily cause out-of-order packets, while the second approach is dependent upon the number and distribution of flows. Flow-based load sharing may be unpredictable in an enterprise network where the number of flows is relatively small and less heterogeneous (for example, hashing may not be uniform), but it is generally effective in core public networks where the number of flows is large and heterogeneous.

In ECMP, link costs are static and bandwidth constraints are not considered, so ECMP attempts to distribute the traffic as equally as possible among the equal-cost paths independent of the congestion status of each path. As a result, given two equal-cost paths, it is possible that one of the paths will be more congested than the other. Another drawback of ECMP is that load sharing cannot be achieved on multiple paths which have non-identical costs.
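As a simple illustration of the flow-based distribution described above, the sketch below (illustrative Python with example addresses; it is not part of the original text and does not reflect any particular implementation) hashes selected header fields to pick one of the equal-cost next hops, so that all packets of a given flow follow the same path and remain in order.

   import hashlib

   def select_next_hop(src_ip, dst_ip, sport, dport, next_hops):
       # Hash the flow identifiers; the same flow always maps to the
       # same index, so packet order within a flow is preserved.
       key = f"{src_ip}|{dst_ip}|{sport}|{dport}".encode()
       index = int.from_bytes(hashlib.sha256(key).digest()[:4], "big")
       return next_hops[index % len(next_hops)]

   equal_cost_next_hops = ["192.0.2.1", "192.0.2.2", "192.0.2.3"]
   print(select_next_hop("198.51.100.7", "203.0.113.9", 51234, 443,
                         equal_cost_next_hops))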
4.1.2.5. Nimrod

Nimrod was a routing system developed to provide heterogeneous service specific routing in the Internet, while taking multiple constraints into account [RFC1992]. Essentially, Nimrod was a link state routing protocol to support path oriented packet forwarding. It used the concept of maps to represent network connectivity and services at multiple levels of abstraction. Mechanisms allowed restriction of the distribution of routing information.

Even though Nimrod did not enjoy deployment in the public Internet, a number of key concepts incorporated into the Nimrod architecture, such as explicit routing which allows selection of paths at originating nodes, are beginning to find applications in some recent constraint-based routing initiatives.

4.2. Development of Internet Traffic Engineering

4.2.1. Overlay Model

In the overlay model, a virtual-circuit network, such as ATM, frame relay, or WDM, provides virtual-circuit connectivity between routers that are located at the edges of a virtual-circuit cloud. In this model, two routers that are connected through a virtual circuit see a direct adjacency between themselves independent of the physical route taken by the virtual circuit through the ATM, frame relay, or WDM network. Thus, the overlay model essentially decouples the logical topology that routers see from the physical topology that the ATM, frame relay, or WDM network manages. The overlay model based on ATM or frame relay enables a network administrator or an automaton to employ traffic engineering concepts to perform path optimization by re-configuring or rearranging the virtual circuits so that a virtual circuit on a congested or sub-optimal physical link can be re-routed to a less congested or more optimal one. In the overlay model, traffic engineering is also employed to establish relationships between the traffic management parameters (e.g., PCR, SCR, and MBS for ATM) of the virtual-circuit technology and the actual traffic that traverses each circuit. These relationships can be established based upon known or projected traffic profiles, and some other factors.

The overlay model using IP over ATM requires the management of two separate networks with different technologies (IP and ATM), resulting in increased operational complexity and cost. In the fully-meshed overlay model, each router would peer with every other router in the network, so that the total number of adjacencies is a quadratic function of the number of routers. Some of the issues with the overlay model are discussed in [AWD2].

4.2.2. Constraint-Based Routing

Constraint-based routing refers to a class of routing systems that compute routes through a network subject to the satisfaction of a set of constraints and requirements. In the most general setting, constraint-based routing may also seek to optimize overall network performance while minimizing costs.

The constraints and requirements may be imposed by the network itself or by administrative policies. Constraints may include bandwidth, hop count, delay, and policy instruments such as resource class attributes. Constraints may also include domain specific attributes of certain network technologies and contexts which impose restrictions on the solution space of the routing function. Path oriented technologies such as MPLS have made constraint-based routing feasible and attractive in public IP networks.

The concept of constraint-based routing within the context of MPLS traffic engineering requirements in IP networks was first described in [RFC2702] and led to developments such as MPLS-TE [RFC3209] as described in Section 4.3.4.

Unlike QoS routing (for example, see [RFC2386] and [MA]), which generally addresses the issue of routing individual traffic flows to satisfy prescribed flow based QoS requirements subject to network resource availability, constraint-based routing is applicable to traffic aggregates as well as flows and may be subject to a wide variety of constraints which may include policy restrictions.
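The following minimal sketch (illustrative Python with an invented topology; it is not part of the original text and is not a definitive implementation) shows the basic idea behind constraint-based path computation: links that cannot satisfy the requested constraint, here unreserved bandwidth, are pruned, and a conventional shortest-path computation is then run over the remaining topology.

   import heapq

   # links[(u, v)] = (te_metric, unreserved_bandwidth_in_mbps);
   # topology and numbers are invented for the example.
   links = {
       ("A", "B"): (10, 400), ("B", "A"): (10, 400),
       ("B", "C"): (10, 100), ("C", "B"): (10, 100),
       ("A", "D"): (20, 900), ("D", "A"): (20, 900),
       ("D", "C"): (20, 900), ("C", "D"): (20, 900),
   }

   def cspf(links, src, dst, required_bw):
       # Prune links that cannot satisfy the bandwidth constraint.
       adj = {}
       for (u, v), (metric, bw) in links.items():
           if bw >= required_bw:
               adj.setdefault(u, {})[v] = metric
       # Conventional shortest-path computation on what remains.
       dist, prev = {src: 0}, {}
       heap = [(0, src)]
       while heap:
           d, u = heapq.heappop(heap)
           if d > dist.get(u, float("inf")):
               continue
           if u == dst:
               break
           for v, metric in adj.get(u, {}).items():
               nd = d + metric
               if nd < dist.get(v, float("inf")):
                   dist[v], prev[v] = nd, u
                   heapq.heappush(heap, (nd, v))
       if dst not in dist:
           return None               # no path satisfies the constraint
       path, node = [dst], dst
       while node != src:
           node = prev[node]
           path.append(node)
       return list(reversed(path))

   print(cspf(links, "A", "C", 50))    # ['A', 'B', 'C'] (metric 20)
   print(cspf(links, "A", "C", 300))   # ['A', 'D', 'C'] (B-C pruned)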
4.3. Overview of IETF Projects Related to Traffic Engineering

This subsection reviews a number of IETF activities pertinent to Internet traffic engineering. These activities are primarily intended to evolve the IP architecture to support new service definitions which allow preferential or differentiated treatment to be accorded to certain types of traffic.

4.3.1. Integrated Services

The IETF Integrated Services working group developed the integrated services (Intserv) model. This model requires resources, such as bandwidth and buffers, to be reserved a priori for a given traffic flow to ensure that the quality of service requested by the traffic flow is satisfied. The integrated services model includes additional components beyond those used in the best-effort model such as packet classifiers, packet schedulers, and admission control. A packet classifier is used to identify flows that are to receive a certain level of service. A packet scheduler handles the scheduling of service to different packet flows to ensure that QoS commitments are met. Admission control is used to determine whether a router has the necessary resources to accept a new flow.

Two services have been defined under the Integrated Services model: guaranteed service [RFC2212] and controlled-load service [RFC2211].

The guaranteed service can be used for applications requiring bounded packet delivery time. For this type of application, data that is delivered to the application after a pre-defined amount of time has elapsed is usually considered worthless. Therefore, guaranteed service was intended to provide a firm quantitative bound on the end-to-end packet delay for a flow. This is accomplished by controlling the queuing delay on network elements along the data flow path. The guaranteed service model does not, however, provide bounds on jitter (inter-arrival times between consecutive packets).

The controlled-load service can be used for adaptive applications that can tolerate some delay but are sensitive to traffic overload conditions. This type of application typically functions satisfactorily when the network is lightly loaded but its performance degrades significantly when the network is heavily loaded. Controlled-load service, therefore, has been designed to provide approximately the same service as best-effort service in a lightly loaded network regardless of actual network conditions. Controlled-load service is described qualitatively in that no target values of delay or loss are specified.

The main issue with the Integrated Services model has been scalability [RFC2998], especially in large public IP networks which may potentially have millions of active micro-flows in transit concurrently.

A notable feature of the Integrated Services model is that it requires explicit signaling of QoS requirements from end systems to routers [RFC2753]. The Resource Reservation Protocol (RSVP) performs this signaling function and is a critical component of the Integrated Services model. The RSVP protocol is described next.

4.3.2. RSVP

RSVP is a soft state signaling protocol [RFC2205]. It supports receiver initiated establishment of resource reservations for both multicast and unicast flows.
RSVP was originally developed as a 1668 signaling protocol within the integrated services framework for 1669 applications to communicate QoS requirements to the network and for 1670 the network to reserve relevant resources to satisfy the QoS 1671 requirements [RFC2205]. 1673 Under RSVP, the sender or source node sends a PATH message to the 1674 receiver with the same source and destination addresses as the 1675 traffic which the sender will generate. The PATH message contains: 1676 (1) a sender Tspec specifying the characteristics of the traffic, (2) 1677 a sender Template specifying the format of the traffic, and (3) an 1678 optional Adspec which is used to support the concept of one pass with 1679 advertising (OPWA) [RFC2205]. Every intermediate router along the 1680 path forwards the PATH Message to the next hop determined by the 1681 routing protocol. Upon receiving a PATH Message, the receiver 1682 responds with a RESV message which includes a flow descriptor used to 1683 request resource reservations. The RESV message travels to the 1684 sender or source node in the opposite direction along the path that 1685 the PATH message traversed. Every intermediate router along the path 1686 can reject or accept the reservation request of the RESV message. If 1687 the request is rejected, the rejecting router will send an error 1688 message to the receiver and the signaling process will terminate. If 1689 the request is accepted, link bandwidth and buffer space are 1690 allocated for the flow and the related flow state information is 1691 installed in the router. 1693 One of the issues with the original RSVP specification was 1694 Scalability. This is because reservations were required for micro- 1695 flows, so that the amount of state maintained by network elements 1696 tends to increase linearly with the number of micro-flows. These 1697 issues are described in [RFC2961]. 1699 Recently, RSVP has been modified and extended in several ways to 1700 mitigate the scaling problems. As a result, it is becoming a 1701 versatile signaling protocol for the Internet. For example, RSVP has 1702 been extended to reserve resources for aggregation of flows, to set 1703 up MPLS explicit label switched paths, and to perform other signaling 1704 functions within the Internet. There are also a number of proposals 1705 to reduce the amount of refresh messages required to maintain 1706 established RSVP sessions [RFC2961]. 1708 A number of IETF working groups have been engaged in activities 1709 related to the RSVP protocol. These include the original RSVP 1710 working group, the MPLS working group, the Resource Allocation 1711 Protocol working group, and the Policy Framework working group. 1713 4.3.3. Differentiated Services 1715 The goal of the Differentiated Services (Diffserv) effort within the 1716 IETF is to devise scalable mechanisms for categorization of traffic 1717 into behavior aggregates, which ultimately allows each behavior 1718 aggregate to be treated differently, especially when there is a 1719 shortage of resources such as link bandwidth and buffer space 1720 [RFC2475]. One of the primary motivations for the Diffserv effort 1721 was to devise alternative mechanisms for service differentiation in 1722 the Internet that mitigate the scalability issues encountered with 1723 the Intserv model. 1725 The IETF Diffserv working group has defined a Differentiated Services 1726 field in the IP header (DS field). 
The DS field consists of six bits 1727 of the part of the IP header formerly known as TOS octet. The DS 1728 field is used to indicate the forwarding treatment that a packet 1729 should receive at a node [RFC2474]. The Diffserv working group has 1730 also standardized a number of Per-Hop Behavior (PHB) groups. Using 1731 the PHBs, several classes of services can be defined using different 1732 classification, policing, shaping, and scheduling rules. 1734 For an end-user of network services to receive Differentiated 1735 Services from its Internet Service Provider (ISP), it may be 1736 necessary for the user to have a Service Level Agreement (SLA) with 1737 the ISP. An SLA may explicitly or implicitly specify a Traffic 1738 Conditioning Agreement (TCA) which defines classifier rules as well 1739 as metering, marking, discarding, and shaping rules. 1741 Packets are classified, and possibly policed and shaped at the 1742 ingress to a Diffserv network. When a packet traverses the boundary 1743 between different Diffserv domains, the DS field of the packet may be 1744 re-marked according to existing agreements between the domains. 1746 Differentiated Services allows only a finite number of service 1747 classes to be indicated by the DS field. The main advantage of the 1748 Diffserv approach relative to the Intserv model is scalability. 1749 Resources are allocated on a per-class basis and the amount of state 1750 information is proportional to the number of classes rather than to 1751 the number of application flows. 1753 It should be obvious from the previous discussion that the Diffserv 1754 model essentially deals with traffic management issues on a per hop 1755 basis. The Diffserv control model consists of a collection of micro- 1756 TE control mechanisms. Other traffic engineering capabilities, such 1757 as capacity management (including routing control), are also required 1758 in order to deliver acceptable service quality in Diffserv networks. 1759 The concept of Per Domain Behaviors has been introduced to better 1760 capture the notion of differentiated services across a complete 1761 domain [RFC3086]. 1763 4.3.4. MPLS 1765 MPLS is an advanced forwarding scheme which also includes extensions 1766 to conventional IP control plane protocols. MPLS extends the 1767 Internet routing model and enhances packet forwarding and path 1768 control [RFC3031]. 1770 At the ingress to an MPLS domain, label switching routers (LSRs) 1771 classify IP packets into forwarding equivalence classes (FECs) based 1772 on a variety of factors, including, e.g., a combination of the 1773 information carried in the IP header of the packets and the local 1774 routing information maintained by the LSRs. An MPLS label is then 1775 prepended to each packet according to their forwarding equivalence 1776 classes. In a non-ATM/FR environment, the label is 32 bits long and 1777 contains a 20-bit label field, a 3-bit experimental field (formerly 1778 known as Class-of-Service or CoS field), a 1-bit label stack 1779 indicator and an 8-bit TTL field. In an ATM (FR) environment, the 1780 label consists of information encoded in the VCI/VPI (DLCI) field. 1781 An MPLS capable router (an LSR) examines the label and possibly the 1782 experimental field and uses this information to make packet 1783 forwarding decisions. 1785 An LSR makes forwarding decisions by using the label prepended to 1786 packets as the index into a local next hop label forwarding entry 1787 (NHLFE). The packet is then processed as specified in the NHLFE. 
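The sketch below (illustrative Python with invented table contents; it is not part of the original text and does not reflect any particular implementation) shows the essence of this label-indexed forwarding step: the incoming label selects an entry that specifies the label operation, the outgoing label, and the next hop.

   # Incoming Label Map: label -> (operation, outgoing label, next hop).
   # Table contents are invented for the example.
   ILM = {
       16: ("swap", 17, "next-hop-lsr-b"),
       18: ("pop", None, "egress-interface"),
   }

   def forward(packet):
       op, out_label, next_hop = ILM[packet["label"]]
       if op == "swap":
           packet["label"] = out_label   # replace the incoming label
       elif op == "pop":
           del packet["label"]           # remove the label on exit
       packet["ttl"] -= 1
       return next_hop, packet

   print(forward({"label": 16, "ttl": 64, "payload": "ip packet"}))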
The incoming label may be replaced by an outgoing label, and the packet may be switched to the next LSR. This label-switching process is very similar to the label (VCI/VPI) swapping process in ATM networks. Before a packet leaves an MPLS domain, its MPLS label may be removed. A Label Switched Path (LSP) is the path between an ingress LSR and an egress LSR that a labeled packet traverses. The path of an explicit LSP is defined at the originating (ingress) node of the LSP. MPLS can use a signaling protocol such as RSVP or LDP to set up LSPs.

MPLS is a very powerful technology for Internet traffic engineering because it supports explicit LSPs which allow constraint-based routing to be implemented efficiently in IP networks [AWD2]. The requirements for traffic engineering over MPLS are described in [RFC2702]. Extensions to RSVP to support instantiation of explicit LSPs are discussed in [RFC3209].

4.3.5. Generalized MPLS

TBD

4.3.6. IP Performance Metrics

The IETF IP Performance Metrics (IPPM) working group has been developing a set of standard metrics that can be used to monitor the quality, performance, and reliability of Internet services. These metrics can be applied by network operators, end-users, and independent testing groups to provide users and service providers with a common understanding of the performance and reliability of the Internet component 'clouds' they use/provide [RFC2330]. The criteria for performance metrics developed by the IPPM WG are described in [RFC2330]. Examples of performance metrics include one-way packet loss [RFC7680], one-way delay [RFC7679], and connectivity measures between two nodes [RFC2678]. Other metrics include second-order measures of packet loss and delay.

Some of the performance metrics specified by the IPPM WG are useful for specifying Service Level Agreements (SLAs). SLAs are sets of service level objectives negotiated between users and service providers, wherein each objective is a combination of one or more performance metrics, possibly subject to certain constraints.

4.3.7. Flow Measurement

The IETF Real Time Flow Measurement (RTFM) working group has produced an architecture document defining a method to specify traffic flows as well as a number of components for flow measurement (meters, meter readers, managers) [RFC2722]. A flow measurement system enables network traffic flows to be measured and analyzed at the flow level for a variety of purposes. As noted in RFC 2722, a flow measurement system can be very useful in the following contexts: (1) understanding the behavior of existing networks, (2) planning for network development and expansion, (3) quantification of network performance, (4) verifying the quality of network service, and (5) attribution of network usage to users.

A flow measurement system consists of meters, meter readers, and managers. A meter observes packets passing through a measurement point, classifies them into certain groups, accumulates certain usage data (such as the number of packets and bytes for each group), and stores the usage data in a flow table. A group may represent a user application, a host, a network, a group of networks, etc. A meter reader gathers usage data from various meters so it can be made available for analysis.
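The sketch below (illustrative Python; the grouping rule and field names are assumptions, and it is not part of the original text) captures the core behavior of such a meter: each observed packet is classified into a group, and per-group packet and byte counts are accumulated in a flow table.

   from collections import defaultdict

   # Flow table: per-group packet and byte counters.  The grouping
   # rule (source/destination address pair) is just one example of a
   # flow specification.
   flow_table = defaultdict(lambda: {"packets": 0, "bytes": 0})

   def meter(packet):
       group = (packet["src"], packet["dst"])
       flow_table[group]["packets"] += 1
       flow_table[group]["bytes"] += packet["length"]

   for pkt in [{"src": "10.0.0.1", "dst": "10.0.0.2", "length": 1500},
               {"src": "10.0.0.1", "dst": "10.0.0.2", "length": 40},
               {"src": "10.0.0.3", "dst": "10.0.0.2", "length": 576}]:
       meter(pkt)

   # A meter reader would periodically collect (and possibly reset)
   # these counters and make them available for analysis.
   print(dict(flow_table))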
The third component, the manager, is responsible for configuring and controlling meters and meter readers. The instructions received by a meter from a manager include the flow specification, meter control parameters, and sampling techniques. The instructions received by a meter reader from a manager include the address of the meter whose data is to be collected, the frequency of data collection, and the types of flows to be collected.

4.3.8. Endpoint Congestion Management

[RFC3124] is intended to provide a set of congestion control mechanisms that transport protocols can use. It is also intended to develop mechanisms for unifying congestion control across a subset of an endpoint's active unicast connections (called a congestion group). A congestion manager continuously monitors the state of the path for each congestion group under its control. The manager uses that information to instruct a scheduler on how to partition bandwidth among the connections of that congestion group.

4.3.9. TE Extensions to the IGPs

TBD

4.3.10. Link-State BGP

TBD

4.3.11. Path Computation Element

TBD

4.3.12. Segment Routing

TBD

4.3.13. Network Virtualization and Abstraction

ACTN goes here: TBD

4.3.14. Deterministic Networking

TBD

4.3.15. Network TE State Definition and Presentation

The network state that is relevant to traffic engineering needs to be stored in the system and presented to the user. The Traffic Engineering Database (TED) is a collection of all TE information about all TE nodes and TE links in the network, and it is an essential component of a TE system such as MPLS-TE [RFC2702] or GMPLS [RFC3945]. In order to formally define the data in the TED and to present the data to the user with high usability, the data modeling language YANG [RFC7950] can be used as described in [I-D.ietf-teas-yang-te-topo].

4.3.16. System Management and Control Interfaces

The traffic engineering control system needs to have a management interface that is human-friendly and control interfaces that are programmable for automation. The Network Configuration Protocol (NETCONF) [RFC6241] and the RESTCONF Protocol [RFC8040] provide programmable interfaces that are also human-friendly. These protocols use XML- or JSON-encoded messages. When message compactness or protocol bandwidth consumption needs to be optimized for the control interface, other protocols, such as Group Communication for the Constrained Application Protocol (CoAP) [RFC7390] or gRPC, are available, especially when the protocol messages are encoded in a binary format. Along with any of these protocols, the data modeling language YANG [RFC7950] can be used to formally and precisely define the interface data.

The Path Computation Element (PCE) Communication Protocol (PCEP) [RFC5440] is another protocol that has evolved to be an option for the TE system control interface. The messages of PCEP are TLV-based and are not defined using a data modeling language such as YANG.

4.4. Overview of ITU Activities Related to Traffic Engineering

This section provides an overview of prior work within the ITU-T pertaining to traffic engineering in traditional telecommunications networks.
1933 ITU-T Recommendations E.600 [ITU-E600], E.701 [ITU-E701], and E.801 1934 [ITU-E801] address traffic engineering issues in traditional 1935 telecommunications networks. Recommendation E.600 provides a 1936 vocabulary for describing traffic engineering concepts, while E.701 1937 defines reference connections, Grade of Service (GOS), and traffic 1938 parameters for ISDN. Recommendation E.701 uses the concept of a 1939 reference connection to identify representative cases of different 1940 types of connections without describing the specifics of their actual 1941 realizations by different physical means. As defined in 1942 Recommendation E.600, "a connection is an association of resources 1943 providing means for communication between two or more devices in, or 1944 attached to, a telecommunication network." Also, E.600 defines "a 1945 resource as any set of physically or conceptually identifiable 1946 entities within a telecommunication network, the use of which can be 1947 unambiguously determined" [ITU-E600]. There can be different types 1948 of connections as the number and types of resources in a connection 1949 may vary. 1951 Typically, different network segments are involved in the path of a 1952 connection. For example, a connection may be local, national, or 1953 international. The purposes of reference connections are to clarify 1954 and specify traffic performance issues at various interfaces between 1955 different network domains. Each domain may consist of one or more 1956 service provider networks. 1958 Reference connections provide a basis to define grade of service 1959 (GoS) parameters related to traffic engineering within the ITU-T 1960 framework. As defined in E.600, "GoS refers to a number of traffic 1961 engineering variables which are used to provide a measure of the 1962 adequacy of a group of resources under specified conditions." These 1963 GoS variables may be probability of loss, dial tone, delay, etc. 1964 They are essential for network internal design and operation as well 1965 as for component performance specification. 1967 GoS is different from quality of service (QoS) in the ITU framework. 1968 QoS is the performance perceivable by a telecommunication service 1969 user and expresses the user's degree of satisfaction of the service. 1971 QoS parameters focus on performance aspects observable at the service 1972 access points and network interfaces, rather than their causes within 1973 the network. GoS, on the other hand, is a set of network oriented 1974 measures which characterize the adequacy of a group of resources 1975 under specified conditions. For a network to be effective in serving 1976 its users, the values of both GoS and QoS parameters must be related, 1977 with GoS parameters typically making a major contribution to the QoS. 1979 Recommendation E.600 stipulates that a set of GoS parameters must be 1980 selected and defined on an end-to-end basis for each major service 1981 category provided by a network to assist the network provider with 1982 improving efficiency and effectiveness of the network. Based on a 1983 selected set of reference connections, suitable target values are 1984 assigned to the selected GoS parameters under normal and high load 1985 conditions. These end-to-end GoS target values are then apportioned 1986 to individual resource components of the reference connections for 1987 dimensioning purposes. 1989 4.5. 
Content Distribution 1991 The Internet is dominated by client-server interactions, especially 1992 Web traffic (in the future, more sophisticated media servers may 1993 become dominant). The location and performance of major information 1994 servers has a significant impact on the traffic patterns within the 1995 Internet as well as on the perception of service quality by end 1996 users. 1998 A number of dynamic load balancing techniques have been devised to 1999 improve the performance of replicated information servers. These 2000 techniques can cause spatial traffic characteristics to become more 2001 dynamic in the Internet because information servers can be 2002 dynamically picked based upon the location of the clients, the 2003 location of the servers, the relative utilization of the servers, the 2004 relative performance of different networks, and the relative 2005 performance of different parts of a network. This process of 2006 assignment of distributed servers to clients is called Traffic 2007 Directing. It functions at the application layer. 2009 Traffic Directing schemes that allocate servers in multiple 2010 geographically dispersed locations to clients may require empirical 2011 network performance statistics to make more effective decisions. In 2012 the future, network measurement systems may need to provide this type 2013 of information. The exact parameters needed are not yet defined. 2015 When congestion exists in the network, Traffic Directing and Traffic 2016 Engineering systems should act in a coordinated manner. This topic 2017 is for further study. 2019 The issues related to location and replication of information 2020 servers, particularly web servers, are important for Internet traffic 2021 engineering because these servers contribute a substantial proportion 2022 of Internet traffic. 2024 5. Taxonomy of Traffic Engineering Systems 2026 This section presents a short taxonomy of traffic engineering 2027 systems. A taxonomy of traffic engineering systems can be 2028 constructed based on traffic engineering styles and views as listed 2029 below: 2031 o Time-dependent vs State-dependent vs Event-dependent 2033 o Offline vs Online 2035 o Centralized vs Distributed 2037 o Local vs Global Information 2039 o Prescriptive vs Descriptive 2041 o Open Loop vs Closed Loop 2043 o Tactical vs Strategic 2045 These classification systems are described in greater detail in the 2046 following subsections of this document. 2048 5.1. Time-Dependent Versus State-Dependent Versus Event Dependent 2050 Traffic engineering methodologies can be classified as time- 2051 dependent, or state-dependent, or event-dependent. All TE schemes 2052 are considered to be dynamic in this document. Static TE implies 2053 that no traffic engineering methodology or algorithm is being 2054 applied. 2056 In the time-dependent TE, historical information based on periodic 2057 variations in traffic, (such as time of day), is used to pre-program 2058 routing plans and other TE control mechanisms. Additionally, 2059 customer subscription or traffic projection may be used. Pre- 2060 programmed routing plans typically change on a relatively long time 2061 scale (e.g., diurnal). Time-dependent algorithms do not attempt to 2062 adapt to random variations in traffic or changing network conditions. 2063 An example of a time-dependent algorithm is a global centralized 2064 optimizer where the input to the system is a traffic matrix and 2065 multi-class QoS requirements as described [MR99]. 
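As a minimal illustration (Python, with invented plan names and hours; it is not part of the original text), the sketch below shows the essential shape of time-dependent TE: a routing plan computed offline from forecast traffic is selected purely according to the time of day, with no reaction to the actual network state.

   from datetime import datetime, timezone

   # Pre-computed routing plans, keyed by time-of-day intervals.  The
   # plan identifiers and hours are invented; in practice each plan
   # would be produced offline from a forecast traffic matrix.
   ROUTING_PLANS = {
       (0, 6): "plan-night",
       (6, 18): "plan-business-hours",
       (18, 24): "plan-evening-peak",
   }

   def current_plan(now=None):
       hour = (now or datetime.now(timezone.utc)).hour
       for (start, end), plan in ROUTING_PLANS.items():
           if start <= hour < end:
               return plan
       return "plan-default"

   print(current_plan())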
2067 State-dependent TE adapts the routing plans for packets based on the 2068 current state of the network. The current state of the network 2069 provides additional information on variations in actual traffic 2070 (i.e., perturbations from regular variations) that could not be 2071 predicted using historical information. Constraint-based routing is 2072 an example of state-dependent TE operating in a relatively long time 2073 scale. An example operating in a relatively short time scale is a 2074 load-balancing algorithm described in [MATE]. 2076 The state of the network can be based on parameters such as 2077 utilization, packet delay, packet loss, etc. These parameters can be 2078 obtained in several ways. For example, each router may flood these 2079 parameters periodically or by means of some kind of trigger to other 2080 routers. Another approach is for a particular router performing 2081 adaptive TE to send probe packets along a path to gather the state of 2082 that path. Still another approach is for a management system to 2083 gather relevant information from network elements. 2085 Expeditious and accurate gathering and distribution of state 2086 information is critical for adaptive TE due to the dynamic nature of 2087 network conditions. State-dependent algorithms may be applied to 2088 increase network efficiency and resilience. Time-dependent 2089 algorithms are more suitable for predictable traffic variations. On 2090 the other hand, state-dependent algorithms are more suitable for 2091 adapting to the prevailing network state. 2093 Event-dependent TE methods can also be used for TE path selection. 2094 Event-dependent TE methods are distinct from time-dependent and 2095 state-dependent TE methods in the manner in which paths are selected. 2096 These algorithms are adaptive and distributed in nature and typically 2097 use learning models to find good paths for TE in a network. While 2098 state-dependent TE models typically use available-link-bandwidth 2099 (ALB) flooding for TE path selection, event-dependent TE methods do 2100 not require ALB flooding. Rather, event-dependent TE methods 2101 typically search out capacity by learning models, as in the success- 2102 to-the-top (STT) method. ALB flooding can be resource intensive, 2103 since it requires link bandwidth to carry LSAs, processor capacity to 2104 process LSAs, and the overhead can limit area/autonomous system (AS) 2105 size. Modeling results suggest that event-dependent TE methods could 2106 lead to a reduction in ALB flooding overhead without loss of network 2107 throughput performance [I-D.ietf-tewg-qos-routing]. 2109 5.2. Offline Versus Online 2111 Traffic engineering requires the computation of routing plans. The 2112 computation may be performed offline or online. The computation can 2113 be done offline for scenarios where routing plans need not be 2114 executed in real-time. For example, routing plans computed from 2115 forecast information may be computed offline. Typically, offline 2116 computation is also used to perform extensive searches on multi- 2117 dimensional solution spaces. 2119 Online computation is required when the routing plans must adapt to 2120 changing network conditions as in state-dependent algorithms. Unlike 2121 offline computation (which can be computationally demanding), online 2122 computation is geared toward relative simple and fast calculations to 2123 select routes, fine-tune the allocations of resources, and perform 2124 load balancing. 2126 5.3. 
Centralized Versus Distributed 2128 Centralized control has a central authority which determines routing 2129 plans and perhaps other TE control parameters on behalf of each 2130 router. The central authority collects the network-state information 2131 from all routers periodically and returns the routing information to 2132 the routers. The routing update cycle is a critical parameter 2133 directly impacting the performance of the network being controlled. 2134 Centralized control may need high processing power and high bandwidth 2135 control channels. 2137 Distributed control determines route selection by each router 2138 autonomously based on the routers view of the state of the network. 2139 The network state information may be obtained by the router using a 2140 probing method or distributed by other routers on a periodic basis 2141 using link state advertisements. Network state information may also 2142 be disseminated under exceptional conditions. 2144 5.3.1. Hybrid Systems 2146 TBD 2148 5.3.2. Considerations for Software Defined Networking 2150 TBD 2152 5.4. Local Versus Global 2154 Traffic engineering algorithms may require local or global network- 2155 state information. 2157 Local information pertains to the state of a portion of the domain. 2158 Examples include the bandwidth and packet loss rate of a particular 2159 path. Local state information may be sufficient for certain 2160 instances of distributed-controlled TEs. 2162 Global information pertains to the state of the entire domain 2163 undergoing traffic engineering. Examples include a global traffic 2164 matrix and loading information on each link throughout the domain of 2165 interest. Global state information is typically required with 2166 centralized control. Distributed TE systems may also need global 2167 information in some cases. 2169 5.5. Prescriptive Versus Descriptive 2171 TE systems may also be classified as prescriptive or descriptive. 2173 Prescriptive traffic engineering evaluates alternatives and 2174 recommends a course of action. Prescriptive traffic engineering can 2175 be further categorized as either corrective or perfective. 2176 Corrective TE prescribes a course of action to address an existing or 2177 predicted anomaly. Perfective TE prescribes a course of action to 2178 evolve and improve network performance even when no anomalies are 2179 evident. 2181 Descriptive traffic engineering, on the other hand, characterizes the 2182 state of the network and assesses the impact of various policies 2183 without recommending any particular course of action. 2185 5.5.1. Intent-Based Networking 2187 TBD 2189 5.6. Open-Loop Versus Closed-Loop 2191 Open-loop traffic engineering control is where control action does 2192 not use feedback information from the current network state. The 2193 control action may use its own local information for accounting 2194 purposes, however. 2196 Closed-loop traffic engineering control is where control action 2197 utilizes feedback information from the network state. The feedback 2198 information may be in the form of historical information or current 2199 measurement. 2201 5.7. Tactical vs Strategic 2203 Tactical traffic engineering aims to address specific performance 2204 problems (such as hot-spots) that occur in the network from a 2205 tactical perspective, without consideration of overall strategic 2206 imperatives. Without proper planning and insights, tactical TE tends 2207 to be ad hoc in nature. 
2209 Strategic traffic engineering approaches the TE problem from a more 2210 organized and systematic perspective, taking into consideration the 2211 immediate and longer term consequences of specific policies and 2212 actions. 2214 6. Objectives for Internet Traffic Engineering 2216 This section describes high-level objectives for traffic engineering 2217 in the Internet. These objectives are presented in general terms and 2218 some advice is given as to how to meet the objectives. 2220 Broadly speaking, these objectives can be categorized as either 2221 functional or non-functional. 2223 Functional objectives for Internet traffic engineering describe the 2224 functions that a traffic engineering system should perform. These 2225 functions are needed to realize traffic engineering objectives by 2226 addressing traffic engineering problems. 2228 Non-functional objectives for Internet traffic engineering relate to 2229 the quality attributes or state characteristics of a traffic 2230 engineering system. These objectives may contain conflicting 2231 assertions and may sometimes be difficult to quantify precisely. 2233 6.1. Routing 2235 Routing control is a significant aspect of Internet traffic 2236 engineering. Routing impacts many of the key performance measures 2237 associated with networks, such as throughput, delay, and utilization. 2238 Generally, it is very difficult to provide good service quality in a 2239 wide area network without effective routing control. A desirable 2240 routing system is one that takes traffic characteristics and network 2241 constraints into account during route selection while maintaining 2242 stability. 2244 Traditional shortest path first (SPF) interior gateway protocols are 2245 based on shortest path algorithms and have limited control 2246 capabilities for traffic engineering [RFC2702], [AWD2]. These 2247 limitations include : 2249 1. The well known issues with pure SPF protocols, which do not take 2250 network constraints and traffic characteristics into account 2251 during route selection. For example, since IGPs always use the 2252 shortest paths (based on administratively assigned link metrics) 2253 to forward traffic, load sharing cannot be accomplished among 2254 paths of different costs. Using shortest paths to forward 2255 traffic conserves network resources, but may cause the following 2256 problems: 1) If traffic from a source to a destination exceeds 2257 the capacity of a link along the shortest path, the link (hence 2258 the shortest path) becomes congested while a longer path between 2259 these two nodes may be under-utilized; 2) the shortest paths from 2260 different sources can overlap at some links. If the total 2261 traffic from the sources exceeds the capacity of any of these 2262 links, congestion will occur. Problems can also occur because 2263 traffic demand changes over time but network topology and routing 2264 configuration cannot be changed as rapidly. This causes the 2265 network topology and routing configuration to become sub-optimal 2266 over time, which may result in persistent congestion problems. 2268 2. The Equal-Cost Multi-Path (ECMP) capability of SPF IGPs supports 2269 sharing of traffic among equal cost paths between two nodes. 2270 However, ECMP attempts to divide the traffic as equally as 2271 possible among the equal cost shortest paths. Generally, ECMP 2272 does not support configurable load sharing ratios among equal 2273 cost paths. 
The result is that one of the paths may carry significantly more traffic than other paths because it may also carry traffic from other sources. This situation can result in congestion along the path that carries more traffic.

3. Modifying IGP metrics to control traffic routing tends to have network-wide effects. Consequently, undesirable and unanticipated traffic shifts can be triggered as a result. Recent work described in Section 8 may offer better control [FT00], [FT01].

Because of these limitations, new capabilities are needed to enhance the routing function in IP networks. Some of these capabilities have been described elsewhere and are summarized below.

Constraint-based routing is desirable to evolve the routing architecture of IP networks, especially public IP backbones with complex topologies [RFC2702]. Constraint-based routing computes routes to fulfill requirements subject to constraints. Constraints may include bandwidth, hop count, delay, and administrative policy instruments such as resource class attributes [RFC2702], [RFC2386]. This makes it possible to select routes that satisfy a given set of requirements subject to network and administrative policy constraints. Routes computed through constraint-based routing are not necessarily the shortest paths. Constraint-based routing works best with path-oriented technologies that support explicit routing, such as MPLS.

Constraint-based routing can also be used as a way to redistribute traffic onto the infrastructure (even for best-effort traffic). For example, if the bandwidth requirements for path selection and the reservable bandwidth attributes of network links are appropriately defined and configured, then congestion problems caused by uneven traffic distribution may be avoided or reduced. In this way, the performance and efficiency of the network can be improved.

A number of enhancements are needed to conventional link-state IGPs, such as OSPF and IS-IS, to allow them to distribute the additional state information required for constraint-based routing. The extensions to OSPF are described in [RFC3630] and those to IS-IS in [RFC5305]. Essentially, these enhancements require the propagation of additional information in link-state advertisements. Specifically, in addition to normal link-state information, an enhanced IGP is required to propagate topology state information needed for constraint-based routing. The additional topology state information includes link attributes such as reservable bandwidth and the link resource class attribute (an administratively specified property of the link). The resource class attribute concept was defined in [RFC2702]. The additional topology state information is carried in new TLVs and sub-TLVs in IS-IS, or in the Opaque LSA in OSPF [RFC5305], [RFC3630].

An enhanced link-state IGP may flood information more frequently than a normal IGP. This is because, even without changes in topology, changes in reservable bandwidth or link affinity can trigger the enhanced IGP to initiate flooding. A tradeoff is typically required between the timeliness of the information flooded and the flooding frequency to avoid excessive consumption of link bandwidth and computational resources, and, more importantly, to avoid instability.
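The link attributes distributed by these IGP extensions are the inputs to a constraint-based path computation. The following Python fragment is a minimal sketch of one common form of such a computation, in which links that cannot satisfy the bandwidth constraint (or that belong to an excluded resource class) are pruned before a conventional shortest-path calculation is run over the remaining topology. The topology, attribute values, and function names are illustrative assumptions only; a deployed implementation would take its link attributes from the OSPF-TE or IS-IS-TE database and would typically support many more constraints.

   import heapq

   def cspf(links, src, dst, bw_needed, excluded_classes=frozenset()):
       """links: {(u, v): (metric, reservable_bw, resource_class)} for
          directed links.  Prune links that fail the constraints, then
          run Dijkstra on what remains.  Returns a node list or None."""
       adj = {}
       for (u, v), (metric, bw, rclass) in links.items():
           if bw >= bw_needed and rclass not in excluded_classes:
               adj.setdefault(u, []).append((v, metric))
       dist, prev, queue = {src: 0}, {}, [(0, src)]
       while queue:
           d, u = heapq.heappop(queue)
           if u == dst:
               break
           if d > dist.get(u, float("inf")):
               continue
           for v, metric in adj.get(u, []):
               nd = d + metric
               if nd < dist.get(v, float("inf")):
                   dist[v], prev[v] = nd, u
                   heapq.heappush(queue, (nd, v))
       if dst not in dist:
           return None                      # constraints cannot be met
       path, node = [dst], dst
       while node != src:
           node = prev[node]
           path.append(node)
       return list(reversed(path))

   # Example: a 40-unit request from "A" to "C" that must avoid
   # "gold"-class links.  The direct link is pruned (only 30 units
   # reservable), so the longer path through "B" is returned.
   links = {("A", "B"): (10, 100.0, "silver"),
            ("B", "C"): (10, 50.0, "silver"),
            ("A", "C"): (15, 30.0, "gold")}
   print(cspf(links, "A", "C", 40.0, excluded_classes=frozenset({"gold"})))
   # -> ['A', 'B', 'C']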
2332 In a TE system, it is also desirable for the routing subsystem to 2333 make the load splitting ratio among multiple paths (with equal cost 2334 or different cost) configurable. This capability gives network 2335 administrators more flexibility in the control of traffic 2336 distribution across the network. It can be very useful for avoiding/ 2337 relieving congestion in certain situations. Examples can be found in 2338 [XIAO]. 2340 The routing system should also have the capability to control the 2341 routes of subsets of traffic without affecting the routes of other 2342 traffic if sufficient resources exist for this purpose. This 2343 capability allows a more refined control over the distribution of 2344 traffic across the network. For example, the ability to move traffic 2345 from a source to a destination away from its original path to another 2346 path (without affecting other traffic paths) allows traffic to be 2347 moved from resource-poor network segments to resource-rich segments. 2348 Path oriented technologies such as MPLS inherently support this 2349 capability as discussed in [AWD2]. 2351 Additionally, the routing subsystem should be able to select 2352 different paths for different classes of traffic (or for different 2353 traffic behavior aggregates) if the network supports multiple classes 2354 of service (different behavior aggregates). 2356 6.2. Traffic Mapping 2358 Traffic mapping pertains to the assignment of traffic workload onto 2359 pre-established paths to meet certain requirements. Thus, while 2360 constraint-based routing deals with path selection, traffic mapping 2361 deals with the assignment of traffic to established paths which may 2362 have been selected by constraint-based routing or by some other 2363 means. Traffic mapping can be performed by time-dependent or state- 2364 dependent mechanisms, as described in Section 5.1. 2366 An important aspect of the traffic mapping function is the ability to 2367 establish multiple paths between an originating node and a 2368 destination node, and the capability to distribute the traffic 2369 between the two nodes across the paths according to some policies. A 2370 pre-condition for this scheme is the existence of flexible mechanisms 2371 to partition traffic and then assign the traffic partitions onto the 2372 parallel paths. This requirement was noted in [RFC2702]. When 2373 traffic is assigned to multiple parallel paths, it is recommended 2374 that special care should be taken to ensure proper ordering of 2375 packets belonging to the same application (or micro-flow) at the 2376 destination node of the parallel paths. 2378 As a general rule, mechanisms that perform the traffic mapping 2379 functions should aim to map the traffic onto the network 2380 infrastructure to minimize congestion. If the total traffic load 2381 cannot be accommodated, or if the routing and mapping functions 2382 cannot react fast enough to changing traffic conditions, then a 2383 traffic mapping system may rely on short time scale congestion 2384 control mechanisms (such as queue management, scheduling, etc.) to 2385 mitigate congestion. Thus, mechanisms that perform the traffic 2386 mapping functions should complement existing congestion control 2387 mechanisms. In an operational network, it is generally desirable to 2388 map the traffic onto the infrastructure such that intra-class and 2389 inter-class resource contention are minimized. 
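A minimal sketch of such a traffic partitioning mechanism is shown below, assuming a hash-based splitter: every packet of a given micro-flow (identified, for example, by its 5-tuple) hashes to the same path, so packet ordering within the flow is preserved at the destination, while the aggregate of many flows is divided across the parallel paths in approximately the configured ratio. The path names and weights are illustrative assumptions.

   import hashlib

   def pick_path(flow_id, paths):
       """paths: list of (path_name, integer_weight).  All packets that
          carry the same flow identifier map to the same path; across
          many flows the split approximates the configured weights."""
       total = sum(weight for _, weight in paths)
       digest = hashlib.sha256(flow_id.encode()).digest()
       point = int.from_bytes(digest[:8], "big") % total
       for name, weight in paths:
           if point < weight:
               return name
           point -= weight
       return paths[-1][0]

   # Example: a 3:1 split between two parallel paths (names are
   # hypothetical).  The same 5-tuple always returns the same path.
   paths = [("lsp-primary", 75), ("lsp-secondary", 25)]
   print(pick_path("10.0.0.1,10.0.1.9,6,443,51034", paths))

Note that this style of mapping reassigns flows (and therefore changes the split) whenever the set of paths or the weights change; mechanisms with better stability properties exist but are beyond the scope of this sketch.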
When traffic mapping techniques that depend on dynamic state feedback (e.g., MATE and similar approaches) are used, special care must be taken to guarantee network stability.

6.3. Measurement

The importance of measurement in traffic engineering has been discussed throughout this document. Mechanisms should be provided to measure and collect statistics from the network to support the traffic engineering function. Additional capabilities may be needed to help in the analysis of the statistics. The actions of these mechanisms should not adversely affect the accuracy and integrity of the statistics collected. The mechanisms for statistical data acquisition should also be able to scale as the network evolves.

Traffic statistics may be classified according to long-term or short-term time scales. Long-term time scale traffic statistics are very useful for traffic engineering. They may capture or reflect periodicity in network workload (such as hourly, daily, and weekly variations in traffic profiles) as well as traffic trends. Aspects of the monitored traffic statistics may also depict class-of-service characteristics for a network supporting multiple classes of service. Analysis of the long-term traffic statistics may yield secondary statistics such as busy hour characteristics, traffic growth patterns, persistent congestion problems, hot-spots, and imbalances in link utilization caused by routing anomalies.

A mechanism for constructing traffic matrices for both long-term and short-term traffic statistics should be in place. In multi-service IP networks, the traffic matrices may be constructed for different service classes. Each element of a traffic matrix represents a statistic of traffic flow between a pair of abstract nodes. An abstract node may represent a router, a collection of routers, or a site in a VPN.

Measured traffic statistics should provide reasonable and reliable indicators of the current state of the network on the short-term scale. Some short-term traffic statistics may reflect link utilization and link congestion status. Examples of congestion indicators include excessive packet delay, packet loss, and high resource utilization. Examples of mechanisms for distributing this kind of information include SNMP, probing techniques, FTP, IGP link-state advertisements, etc.

6.4. Network Survivability

Network survivability refers to the capability of a network to maintain service continuity in the presence of faults. This can be accomplished by promptly recovering from network impairments and maintaining the required QoS for existing services after recovery. Survivability has become an issue of great concern within the Internet community due to the increasing demands to carry mission-critical traffic, real-time traffic, and other high-priority traffic over the Internet. Survivability can be addressed at the device level by developing network elements that are more reliable, and at the network level by incorporating redundancy into the architecture, design, and operation of networks. It is recommended that a philosophy of robustness and survivability be adopted in the architecture, design, and operation of the traffic engineering systems that control IP networks (especially public IP networks).
Because different contexts may demand different levels of survivability, the mechanisms developed to support network survivability should be flexible so that they can be tailored to different needs.

Failure protection and restoration capabilities have become available from multiple layers as network technologies have continued to improve. At the bottom of the layered stack, optical networks are now capable of providing dynamic ring and mesh restoration functionality at the wavelength level as well as traditional protection functionality. At the SONET/SDH layer, survivability capability is provided with Automatic Protection Switching (APS) as well as self-healing ring and mesh architectures. Similar functionality is provided by layer 2 technologies such as ATM (generally with slower mean restoration times). Rerouting is traditionally used at the IP layer to restore service following link and node outages. Rerouting at the IP layer occurs after a period of routing convergence, which may require seconds to minutes to complete. Some newer developments in the MPLS context make it possible to achieve recovery at the IP layer prior to convergence [RFC3469].

To support advanced survivability requirements, path-oriented technologies such as MPLS can be used to enhance the survivability of IP networks in a potentially cost-effective manner. The advantages of path-oriented technologies such as MPLS for IP restoration become even more evident when class-based protection and restoration capabilities are required.

Recently, a common suite of control plane protocols has been proposed for both MPLS and optical transport networks under the name Multiprotocol Lambda Switching [AWD1]. This paradigm will support even more sophisticated mesh restoration capabilities at the optical layer for the emerging IP over WDM network architectures.

Another important aspect of multi-layer survivability is that technologies at different layers provide protection and restoration capabilities at different temporal granularities (in terms of time scales) and at different bandwidth granularities (from the packet level to the wavelength level). Protection and restoration capabilities can also be sensitive to different service classes and different network utility models.

The impact of service outages varies significantly for different service classes depending upon the effective duration of the outage. The duration of an outage can vary from milliseconds (with minor service impact), to seconds (with possible call drops for IP telephony and session time-outs for connection-oriented transactions), to minutes and hours (with potentially considerable social and business impact).

Coordinating different protection and restoration capabilities across multiple layers in a cohesive manner, so that network survivability is maintained at reasonable cost, is a challenging task. Protection and restoration coordination across layers may not always be feasible, because networks at different layers may belong to different administrative domains.

The following paragraphs present some general recommendations for protection and restoration coordination.
2512 o Protection and restoration capabilities from different layers 2513 should be coordinated whenever feasible and appropriate to provide 2514 network survivability in a flexible and cost effective manner. 2515 Minimization of function duplication across layers is one way to 2516 achieve the coordination. Escalation of alarms and other fault 2517 indicators from lower to higher layers may also be performed in a 2518 coordinated manner. A temporal order of restoration trigger 2519 timing at different layers is another way to coordinate multi- 2520 layer protection/restoration. 2522 o Spare capacity at higher layers is often regarded as working 2523 traffic at lower layers. Placing protection/restoration functions 2524 in many layers may increase redundancy and robustness, but it 2525 should not result in significant and avoidable inefficiencies in 2526 network resource utilization. 2528 o It is generally desirable to have protection and restoration 2529 schemes that are bandwidth efficient. 2531 o Failure notification throughout the network should be timely and 2532 reliable. 2534 o Alarms and other fault monitoring and reporting capabilities 2535 should be provided at appropriate layers. 2537 6.4.1. Survivability in MPLS Based Networks 2539 MPLS is an important emerging technology that enhances IP networks in 2540 terms of features, capabilities, and services. Because MPLS is path- 2541 oriented, it can potentially provide faster and more predictable 2542 protection and restoration capabilities than conventional hop by hop 2543 routed IP systems. This subsection describes some of the basic 2544 aspects and recommendations for MPLS networks regarding protection 2545 and restoration. See [RFC3469] for a more comprehensive discussion 2546 on MPLS based recovery. 2548 Protection types for MPLS networks can be categorized as link 2549 protection, node protection, path protection, and segment protection. 2551 o Link Protection: The objective for link protection is to protect 2552 an LSP from a given link failure. Under link protection, the path 2553 of the protection or backup LSP (the secondary LSP) is disjoint 2554 from the path of the working or operational LSP at the particular 2555 link over which protection is required. When the protected link 2556 fails, traffic on the working LSP is switched over to the 2557 protection LSP at the head-end of the failed link. This is a 2558 local repair method which can be fast. It might be more 2559 appropriate in situations where some network elements along a 2560 given path are less reliable than others. 2562 o Node Protection: The objective of LSP node protection is to 2563 protect an LSP from a given node failure. Under node protection, 2564 the path of the protection LSP is disjoint from the path of the 2565 working LSP at the particular node to be protected. The secondary 2566 path is also disjoint from the primary path at all links 2567 associated with the node to be protected. When the node fails, 2568 traffic on the working LSP is switched over to the protection LSP 2569 at the upstream LSR directly connected to the failed node. 2571 o Path Protection: The goal of LSP path protection is to protect an 2572 LSP from failure at any point along its routed path. Under path 2573 protection, the path of the protection LSP is completely disjoint 2574 from the path of the working LSP. 
The advantage of path protection is that the backup LSP protects the working LSP from all possible link and node failures along the path, except for failures that might occur at the ingress and egress LSRs, or for correlated failures that might impact both working and backup paths simultaneously. Additionally, since the path selection is end-to-end, path protection might be more efficient in terms of resource usage than link or node protection. However, path protection may be slower than link and node protection in general.

o Segment Protection: An MPLS domain may be partitioned into multiple protection domains whereby a failure in a protection domain is rectified within that domain. In cases where an LSP traverses multiple protection domains, a protection mechanism within a domain only needs to protect the segment of the LSP that lies within the domain. Segment protection will generally be faster than path protection because recovery generally occurs closer to the fault.

6.4.2. Protection Option

Another issue to consider is the concept of protection options. The protection option uses the notation m:n protection, where m is the number of protection LSPs used to protect n working LSPs. Feasible protection options follow.

o 1:1: one working LSP is protected/restored by one protection LSP.

o 1:n: one protection LSP is used to protect/restore n working LSPs.

o n:1: one working LSP is protected/restored by n protection LSPs, possibly with a configurable load splitting ratio. When more than one protection LSP is used, it may be desirable to share the traffic across the protection LSPs when the working LSP fails, so as to satisfy the bandwidth requirement of the traffic trunk associated with the working LSP. This may be especially useful when it is not feasible to find one path that can satisfy the bandwidth requirement of the primary LSP.

o 1+1: traffic is sent concurrently on both the working LSP and the protection LSP. In this case, the egress LSR selects one of the two LSPs based on a local traffic integrity decision process, which compares the traffic received from both the working and the protection LSP and identifies discrepancies. It is unlikely that this option would be used extensively in IP networks due to its inefficient use of resources. However, if bandwidth becomes plentiful and cheap, then this option might become quite viable and attractive in IP networks.

6.5. Traffic Engineering in Diffserv Environments

This section provides an overview of the traffic engineering features and recommendations that are specifically pertinent to Differentiated Services (Diffserv) [RFC2475] capable IP networks.

Increasing requirements to support multiple classes of traffic in the Internet, such as best-effort and mission-critical data, call for IP networks to differentiate traffic according to some criteria and to accord preferential treatment to certain types of traffic. Large numbers of flows can be aggregated into a few behavior aggregates based on criteria such as common performance requirements (for packet loss ratio, delay, and jitter) or common fields within the IP packet headers.
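As a simple illustration of this kind of aggregation, the Python sketch below classifies packets into a small number of behavior aggregates by matching IP header fields and marks each aggregate with a DSCP. The match rules and class names are illustrative policy assumptions; the codepoint values shown (EF = 46, AF11 = 10) are the standard values for those PHBs.

   MARKING_RULES = [
       # (match function over header fields, aggregate name, DSCP)
       (lambda p: p["proto"] == "udp" and 16384 <= p["dst_port"] < 32768,
        "voice", 46),                                   # EF PHB
       (lambda p: p["proto"] == "tcp" and p["dst_port"] in (80, 443),
        "business-data", 10),                           # AF11
   ]

   def classify_and_mark(pkt):
       """Return (behavior aggregate, DSCP) for a packet's header fields."""
       for match, aggregate, dscp in MARKING_RULES:
           if match(pkt):
               return aggregate, dscp
       return "best-effort", 0                          # default PHB

   print(classify_and_mark({"proto": "udp", "dst_port": 20000}))  # ('voice', 46)
   print(classify_and_mark({"proto": "tcp", "dst_port": 25}))     # ('best-effort', 0)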
As Diffserv evolves and becomes deployed in operational networks, traffic engineering will be critical to ensuring that the SLAs defined within a given Diffserv service model are met. Classes of service (CoS) can be supported in a Diffserv environment by concatenating per-hop behaviors (PHBs) along the routing path, using service provisioning mechanisms, and by appropriately configuring edge functionality such as traffic classification, marking, policing, and shaping. A PHB is the forwarding behavior that a packet receives at a DS node (a Diffserv-compliant node); it is implemented by means of buffer management and packet scheduling mechanisms. In this context, packets belonging to a class are those that are members of a corresponding ordering aggregate.

Traffic engineering can be used as a complement to Diffserv mechanisms to improve the utilization of network resources, although it is not a necessary element in general. When traffic engineering is used, it can be operated on an aggregated basis across all service classes [RFC3270] or on a per-service-class basis. The former is used to provide better distribution of the aggregate traffic load over the network resources. (See [RFC3270] for detailed mechanisms to support aggregate traffic engineering.) The latter case is discussed below since it is specific to the Diffserv environment; it is known as Diffserv-aware traffic engineering [RFC4124].

For some Diffserv networks, it may be desirable to control the performance of some service classes by enforcing certain relationships between the traffic workload contributed by each service class and the amount of network resources allocated or provisioned for that service class. Such relationships between demand and resource allocation can be enforced using a combination of, for example: (1) traffic engineering mechanisms operating on a per-service-class basis that enforce the desired relationship between the amount of traffic contributed by a given service class and the resources allocated to that class, and (2) mechanisms that dynamically adjust the resources allocated to a given service class to match the amount of traffic contributed by that service class.

It may also be desirable to limit the performance impact of high-priority traffic on relatively low-priority traffic. This can be achieved by, for example, controlling the percentage of high-priority traffic that is routed through a given link. Another way to accomplish this is to increase link capacities appropriately so that lower-priority traffic can still enjoy adequate service quality. When the ratios of traffic workload contributed by different service classes vary significantly from router to router, it may not suffice to rely exclusively on conventional IGP routing protocols or on traffic engineering mechanisms that are insensitive to different service classes. Instead, it may be desirable to perform traffic engineering, especially routing control and mapping functions, on a per-service-class basis. One way to accomplish this in a domain that supports both MPLS and Diffserv is to define class-specific LSPs and to map traffic from each class onto one or more LSPs that correspond to that service class.
An LSP corresponding to a given service class can then be routed and protected/restored in a class-dependent manner, according to specific policies.

Performing traffic engineering on a per-class basis may require certain per-class parameters to be distributed. Note that it is common for some classes to share an aggregate constraint (e.g., a maximum bandwidth requirement) without the constraint being enforced on each individual class. Such classes can then be grouped into a class-type, and per-class-type parameters can be distributed instead, to improve scalability. This also allows better bandwidth sharing between classes in the same class-type. A class-type is a set of classes that satisfy the following two conditions:

1) Classes in the same class-type have common aggregate requirements to satisfy required performance levels.

2) There is no requirement to be enforced at the level of an individual class in the class-type. Note that it is nevertheless possible to implement priority policies for classes in the same class-type to permit preferential access to the class-type bandwidth through the use of preemption priorities.

An example of a class-type is a low-loss class-type that includes both AF1-based and AF2-based ordering aggregates. With such a class-type, one may implement a priority policy that gives AF1-based traffic trunks a higher preemption priority than AF2-based ones, a lower priority, or the same priority.

See [RFC4124] for detailed requirements on Diffserv-aware traffic engineering.

6.6. Network Controllability

Off-line (and on-line) traffic engineering considerations would be of limited utility if the network could not be controlled effectively to implement the results of TE decisions and to achieve desired network performance objectives. Capacity augmentation is a coarse-grained solution to traffic engineering issues. It is simple, and it may be advantageous if bandwidth is abundant and cheap or if the current or expected network workload demands it. However, bandwidth is not always abundant and cheap, and the workload may not always demand additional capacity. Adjustments of administrative weights and other parameters associated with routing protocols provide finer-grained control, but they are difficult to use and imprecise because of the routing interactions that occur across the network. In certain network contexts, more flexible, finer-grained approaches which provide more precise control over the mapping of traffic to routes and over the selection and placement of routes may be appropriate and useful.

Control mechanisms can be manual (e.g., administrative configuration), partially automated (e.g., scripts), or fully automated (e.g., policy-based management systems). Automated mechanisms are particularly required in large-scale networks. Multi-vendor interoperability can be facilitated by developing and deploying standardized management systems (e.g., standard MIBs) and policies (PIBs) to support the control functions required to address traffic engineering objectives such as load distribution and protection/restoration.
Network control functions should be secure, reliable, and stable, because they are often needed to operate correctly in times of network impairments (e.g., during network congestion or security attacks).

7. Inter-Domain Considerations

Inter-domain traffic engineering is concerned with performance optimization for traffic that originates in one administrative domain and terminates in a different one.

Traffic exchange between autonomous systems in the Internet occurs through exterior gateway protocols. Currently, BGP [RFC4271] is the standard exterior gateway protocol for the Internet. BGP provides a number of attributes and capabilities (e.g., route filtering) that can be used for inter-domain traffic engineering. More specifically, BGP permits the control of routing information and traffic exchange between Autonomous Systems (ASes) in the Internet. BGP incorporates a sequential decision process which calculates the degree of preference for various routes to a given destination network. There are two fundamental aspects to inter-domain traffic engineering using BGP:

o Route Redistribution: controlling the import and export of routes between ASes, and controlling the redistribution of routes between BGP and other protocols within an AS.

o Best path selection: selecting the best path when there are multiple candidate paths to a given destination network. Best path selection is performed by the BGP decision process based on a sequential procedure, taking a number of different considerations into account. Ultimately, best path selection under BGP boils down to selecting preferred exit points out of an AS towards specific destination networks. The BGP path selection process can be influenced by manipulating the attributes associated with the BGP decision process. These attributes include: NEXT-HOP, WEIGHT (a Cisco proprietary attribute that is also implemented by some other vendors), LOCAL-PREFERENCE, AS-PATH, ROUTE-ORIGIN, MULTI-EXIT-DISCRIMINATOR (MED), IGP METRIC, etc.

Route-maps provide the flexibility to implement complex BGP policies based on pre-configured logical conditions. In particular, Route-maps can be used to control import and export policies for incoming and outgoing routes, to control the redistribution of routes between BGP and other protocols, and to influence the selection of best paths by manipulating the attributes associated with the BGP decision process. Very complex logical expressions implementing various types of policies can be constructed using a combination of Route-maps, BGP attributes, Access-lists, and Community attributes.

When looking at possible strategies for inter-domain TE with BGP, it must be noted that the outbound traffic exit point is controllable, whereas the interconnection point where inbound traffic is received from an EBGP peer typically is not, unless a special arrangement is made with the peer sending the traffic. Therefore, it is up to each individual network to implement sound TE strategies that deal with the efficient delivery of outbound traffic from one's customers to one's peering points. The vast majority of TE policy is based upon a "closest exit" strategy, which offloads inter-domain traffic at the nearest outbound peer point towards the destination autonomous system.
Most methods of manipulating the point at which inbound traffic enters a network from an EBGP peer (inconsistent route announcements between peering points, AS-path prepending, and sending MEDs) are either ineffective or not accepted in the peering community.

Inter-domain TE with BGP is generally effective, but it is usually applied in a trial-and-error fashion. A systematic approach for inter-domain traffic engineering is yet to be devised.

Inter-domain TE is inherently more difficult than intra-domain TE under the current Internet architecture. The reasons for this are both technical and administrative. Technically, while topology and link state information are helpful for mapping traffic more effectively, BGP does not propagate such information across domain boundaries for stability and scalability reasons. Administratively, there are differences in operating costs and network capacities between domains. Generally, what may be considered a good solution in one domain may not necessarily be a good solution in another domain. Moreover, it would generally be considered inadvisable for one domain to permit another domain to influence the routing and management of traffic in its network.

MPLS TE-tunnels (explicit LSPs) can potentially add a degree of flexibility in the selection of exit points for inter-domain routing. The concept of relative and absolute metrics can be applied to this purpose. The idea is that if BGP attributes are defined such that the BGP decision process depends on IGP metrics to select exit points for inter-domain traffic, then some inter-domain traffic destined to a given peer network can be made to prefer a specific exit point by establishing a TE-tunnel from the router making the selection to the preferred peering point and assigning the TE-tunnel a metric which is smaller than the IGP cost to all other peering points. If a peer accepts and processes MEDs, then a similar MPLS TE-tunnel based scheme can be applied to cause certain entrance points to be preferred by setting the MED to an IGP cost that has been modified by the tunnel metric.

Similar to intra-domain TE, inter-domain TE is best accomplished when a traffic matrix can be derived to depict the volume of traffic from one autonomous system to another.

Generally, redistribution of inter-domain traffic requires coordination between peering partners. An export policy in one domain that results in load redistribution across peer points with another domain can significantly affect the local traffic matrix inside the domain of the peering partner. This, in turn, will affect the intra-domain TE due to changes in the spatial distribution of traffic. Therefore, it is mutually beneficial for peering partners to coordinate with each other before attempting any policy changes that may result in significant shifts in inter-domain traffic. In certain contexts, this coordination can be quite challenging for technical and non-technical reasons.

It is a matter of speculation as to whether MPLS, or similar technologies, can be extended to allow selection of constrained paths across domain boundaries.
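The exit-point selection described above can be pictured with a small sketch. The Python fragment below models only the tail end of the BGP decision process (LOCAL-PREFERENCE, AS-path length, MED, then IGP cost to the next hop); real implementations apply more steps, and MED is normally only compared between routes learned from the same neighboring AS. The peer names, metric values, and the shape of the route records are illustrative assumptions. The point of the example is that a TE-tunnel whose administratively assigned metric is lower than the native IGP cost can steer the selection toward a particular exit.

   def best_exit(candidates):
       """candidates: list of dicts with keys 'exit', 'local_pref',
          'as_path_len', 'med', and 'igp_cost' (cost to the NEXT-HOP)."""
       return min(
           candidates,
           key=lambda r: (-r["local_pref"], r["as_path_len"],
                          r["med"], r["igp_cost"]),
       )["exit"]

   # Two peering points towards the same destination.  The native IGP
   # cost to peering point B is higher than the cost to A, but a
   # TE-tunnel towards B with a metric of 5 makes B the preferred exit.
   candidates = [
       {"exit": "peer-A", "local_pref": 200, "as_path_len": 3,
        "med": 100, "igp_cost": 10},
       {"exit": "peer-B", "local_pref": 200, "as_path_len": 3,
        "med": 100, "igp_cost": 5},
   ]
   print(best_exit(candidates))   # -> "peer-B"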
8. Overview of Contemporary TE Practices in Operational IP Networks

This section provides an overview of some contemporary traffic engineering practices in IP networks. The focus is primarily on the aspects that pertain to the control of the routing function in operational contexts. The intent here is to provide an overview of the commonly used practices; the discussion is not intended to be exhaustive.

Currently, service providers apply many of the traffic engineering mechanisms discussed in this document to optimize the performance of their IP networks. These techniques include capacity planning for long time scales, routing control using IGP metrics and MPLS for medium time scales, the overlay model (also for medium time scales), and traffic management mechanisms for short time scales.

When a service provider plans to build an IP network, or to expand the capacity of an existing network, effective capacity planning should be an important component of the process. Such plans may take the following aspects into account: the location of new nodes, if any, existing and predicted traffic patterns, costs, link capacity, topology, routing design, and survivability.

Performance optimization of operational networks is usually an ongoing process in which traffic statistics, performance parameters, and fault indicators are continually collected from the network. This empirical data is then analyzed and used to trigger various traffic engineering mechanisms. Tools that perform what-if analysis can also be used to assist the TE process by allowing various scenarios to be reviewed before a new set of configurations is implemented in the operational network.

Traditionally, intra-domain real-time TE with an IGP is done by increasing the OSPF or IS-IS metric of a congested link until enough traffic has been diverted from that link. This approach has some limitations, as discussed in Section 6.1. Recently, some new intra-domain TE approaches and tools have been proposed [RR94] [FT00] [FT01] [WANG]. Such approaches and tools take a traffic matrix, the network topology, and network performance objective(s) as input, and produce link metrics, and possibly unequal load-sharing ratios to be set at the head-end routers of some ECMPs, as output. These developments open the possibility of performing intra-domain TE with an IGP in a more systematic way.
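To make the preceding paragraph concrete, the following Python sketch shows a much-simplified weight-setting loop in the spirit of, but far simpler than, the techniques in [FT00] and [FT01]: a hill-climb perturbs one link weight at a time and keeps the change whenever the worst-case link utilization improves. Each demand is placed on a single shortest path, so ECMP splitting and the published heuristics' more sophisticated search moves are not modelled; the topology, capacities, and demands are illustrative assumptions.

   import heapq

   def shortest_path_links(adj, weights, src, dst):
       """Return the directed links on one shortest path from src to
          dst (graph assumed connected; ties broken arbitrarily)."""
       dist, prev, pq = {src: 0}, {}, [(0, src)]
       while pq:
           d, u = heapq.heappop(pq)
           if u == dst:
               break
           if d > dist.get(u, float("inf")):
               continue
           for v in adj.get(u, ()):
               nd = d + weights[(u, v)]
               if nd < dist.get(v, float("inf")):
                   dist[v], prev[v] = nd, u
                   heapq.heappush(pq, (nd, v))
       links, node = [], dst
       while node != src:
           links.append((prev[node], node))
           node = prev[node]
       return links

   def max_utilization(adj, weights, capacity, demands):
       load = {link: 0.0 for link in capacity}
       for (s, t), demand in demands.items():
           for link in shortest_path_links(adj, weights, s, t):
               load[link] += demand
       return max(load[link] / capacity[link] for link in capacity)

   def hill_climb(adj, weights, capacity, demands, rounds=50, step=1):
       best = max_utilization(adj, weights, capacity, demands)
       for _ in range(rounds):
           improved = False
           for link in list(weights):
               trial = dict(weights)
               trial[link] += step
               util = max_utilization(adj, trial, capacity, demands)
               if util < best:
                   weights, best, improved = trial, util, True
           if not improved:
               break
       return weights, best

   # Toy triangle: raising the weight of the small direct A->C link
   # moves the 8-unit A->C demand onto A->B->C (utilization 1.6 -> 0.8).
   adj = {"A": ["B", "C"], "B": ["C"], "C": []}
   capacity = {("A", "B"): 10.0, ("A", "C"): 5.0, ("B", "C"): 10.0}
   weights = {("A", "B"): 1, ("A", "C"): 2, ("B", "C"): 1}
   demands = {("A", "C"): 8.0}
   print(hill_climb(adj, weights, capacity, demands))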
The overlay model (IP over ATM or IP over Frame Relay) is another approach that has commonly been used in practice [AWD2]. The IP over ATM technique is no longer viewed favorably due to recent advances in MPLS and router hardware technology.

Deployment of MPLS for traffic engineering applications has commenced in some service provider networks. One operational scenario is to deploy MPLS in conjunction with an IGP (IS-IS-TE or OSPF-TE) that supports the traffic engineering extensions, with constraint-based routing for explicit route computations, and with a signaling protocol (e.g., RSVP-TE) for LSP instantiation.

In contemporary MPLS traffic engineering contexts, network administrators specify and configure link attributes and resource constraints, such as the maximum reservable bandwidth and resource class attributes, for the links (interfaces) within the MPLS domain. A link-state protocol that supports TE extensions (IS-IS-TE or OSPF-TE) is used to propagate information about the network topology and link attributes to all routers in the routing area. Network administrators also specify all the LSPs that are to originate at each router. For each LSP, the network administrator specifies the destination node and the attributes of the LSP, which indicate the requirements that are to be satisfied during the path selection process. Each router then uses a local constraint-based routing process to compute explicit paths for all LSPs originating from it. Subsequently, a signaling protocol is used to instantiate the LSPs. By assigning proper bandwidth values to links and LSPs, congestion caused by uneven traffic distribution can generally be avoided or mitigated.

The bandwidth attributes of LSPs used for traffic engineering can be updated periodically. The basic concept is that the bandwidth assigned to an LSP should relate in some manner to the bandwidth requirements of the traffic that actually flows through the LSP. The traffic attribute of an LSP can be modified to accommodate traffic growth and persistent traffic shifts. If network congestion occurs due to some unexpected events, existing LSPs can be rerouted to alleviate the situation, or the network administrator can configure new LSPs to divert some traffic to alternative paths. The reservable bandwidth of the congested links can also be reduced to force some LSPs to be rerouted to other paths.

In an MPLS domain, a traffic matrix can also be estimated by monitoring the traffic on LSPs. Such traffic statistics can be used for a variety of purposes, including network planning and network optimization. Current practice suggests that deploying an MPLS network consisting of hundreds of routers and thousands of LSPs is feasible. In summary, recent deployment experience suggests that the MPLS approach is very effective for traffic engineering in IP networks [XIAO].

As mentioned previously in Section 7, one usually has no direct control over the distribution of inbound traffic. Therefore, the main goal of contemporary inter-domain TE is to optimize the distribution of outbound traffic between multiple inter-domain links. When operating a global network, maintaining the ability to operate the network in a regional fashion where desired, while continuing to take advantage of the benefits of a global network, also becomes an important objective.

Inter-domain TE with BGP usually begins with the placement of multiple peering interconnection points in locations that have high peer density, are in close proximity to originating/terminating traffic locations on one's own network, and are lowest in cost. There are generally several locations in each region of the world where the vast majority of major networks congregate and interconnect. Some location-decision problems that arise in association with inter-domain routing are discussed in [AWD5].

Once the locations of the interconnects are determined and the circuits are implemented, one decides how best to handle the routes heard from the peer, as well as how to propagate the peers' routes within one's own network. One way to engineer outbound traffic flows on a network with many EBGP peers is to create a hierarchy of peers.
Generally, the Local Preferences of all peers are set to the same value so that the shortest AS paths will be chosen to forward traffic. Then, by over-writing the inbound MED metric (the Multi-Exit-Discriminator metric, also referred to as the "BGP metric"; both terms are used interchangeably in this document) on routes received from different peers, the hierarchy can be formed. For example, all Local Preferences can be set to 200, preferred private peers can be assigned a BGP metric of 50, the rest of the private peers can be assigned a BGP metric of 100, and public peers can be assigned a BGP metric of 600. "Preferred" peers might be defined as those peers with whom the most available capacity exists, whose customer base is larger in comparison to other peers, whose interconnection costs are the lowest, and with whom upgrading existing capacity is the easiest. In a network with low utilization at the edge, this works well. The same concept could be applied to a network with higher edge utilization by creating more levels of BGP metrics between peers, allowing for more granularity in selecting the exit points for traffic bound for a dual-homed customer on a peer's network.

When only the inbound MED metrics are replaced with BGP metrics, only the exit points of routes with equal AS-path lengths are changed. (The BGP decision process considers Local Preference first, then AS-path length, and then the BGP metric.) For example, assume a network has two possible egress points, peer A and peer B. Each peer has 40% of the Internet's routes exclusively on its network, while the remaining 20% of the Internet's routes are from customers who dual-home between A and B. Assume that both peers have a Local Preference of 200 and a BGP metric of 100. If the link to peer A is congested, increasing its BGP metric while leaving the Local Preference at 200 will ensure that the 20% of total routes belonging to dual-homed customers will prefer peer B as the exit point. This approach would be used in a situation where all exit points to a given peer were close to congestion levels, and traffic needed to be shifted away from that peer entirely.

When there are multiple exit points to a given peer, and only one of them is congested, it is not necessary to shift traffic away from the peer entirely, but only from the one congested circuit. This can be achieved by using passive IGP metrics, AS-path filtering, or prefix filtering.

Occasionally, more drastic changes are needed, for example, in dealing with a "problem peer" who is difficult to work with on upgrades or who charges high prices for connectivity to their network. In that case, the Local Preference for that peer can be reduced below the level of other peers. This effectively reduces the amount of traffic sent to that peer to only originating traffic (assuming no transit providers are involved). This type of change can affect a large amount of traffic, and is only used after other methods have failed to provide the desired results.

Although it is not much of an issue in regional networks, the propagation of a peer's routes back through the network must be considered when a network is peering on a global scale. Sometimes, business considerations can influence the choice of BGP policies in a given context.
For example, it may be imprudent, from a business perspective, to operate a global network and provide full access to the global customer base to a small network in a particular country. However, for the purpose of providing one's own customers with quality service in a particular region, good connectivity to that in-country network may still be necessary. This can be achieved by assigning a set of communities at the edge of the network, which have a known behavior when routes tagged with those communities propagate back through the core. Routes heard from local peers will be prevented from propagating back to the global network, whereas routes learned from larger peers may be allowed to propagate freely throughout the entire global network. By implementing a flexible community strategy, the benefits of using a single global AS Number (ASN) can be realized, while the benefits of operating regional networks can also be obtained. An alternative is to use different ASNs in different regions, with the consequence that the AS path length for routes announced by that service provider will increase.

9. Conclusion

This document described principles for traffic engineering in the Internet. It presented an overview of some of the basic issues surrounding traffic engineering in IP networks. The context of TE was described, and a TE process model and a taxonomy of TE styles were presented. A brief historical review of pertinent developments related to traffic engineering was provided. A survey of contemporary TE techniques in operational networks was presented. Additionally, the document specified a set of generic requirements, recommendations, and options for Internet traffic engineering.

10. Security Considerations

This document does not introduce new security issues.

11. IANA Considerations

This draft makes no requests for IANA action.

12. Acknowledgments

The acknowledgements in RFC 3272 were as below. All of the people who helped in the production of that document are also thanked for their contributions carried over into this new document.

The authors would like to thank Jim Boyle for inputs on the recommendations section, Francois Le Faucheur for inputs on Diffserv aspects, Blaine Christian for inputs on measurement, Gerald Ash for inputs on routing in telephone networks and for text on event-dependent TE methods, Steven Wright for inputs on network controllability, and Jonathan Aufderheide for inputs on inter-domain TE with BGP. Special thanks to Randy Bush for proposing the TE taxonomy based on "tactical vs strategic" methods. The subsection describing an "Overview of ITU Activities Related to Traffic Engineering" was adapted from a contribution by Waisum Lai. Useful feedback and pointers to relevant materials were provided by J. Noel Chiappa. Additional comments were provided by Glenn Grotefeld during the working group last call process. Finally, the authors would like to thank Ed Kern, the TEWG co-chair, for his comments and support.

The production of this document includes a fix to the original text resulting from an Errata Report by Jean-Michel Grimaldi.

The authors of this document would also like to thank TBD.

13. Contributors

Much of the text in this document is derived from RFC 3272.
The authors of this document would like to express their gratitude to all those involved in that work. Although the source text has been edited in the production of this document, the original authors should be considered as Contributors to this work. They were:

Daniel O. Awduche
Movaz Networks
7926 Jones Branch Drive, Suite 615
McLean, VA 22102

Phone: 703-298-5291
EMail: awduche@movaz.com

Angela Chiu
Celion Networks
1 Sheila Dr., Suite 2
Tinton Falls, NJ 07724

Phone: 732-747-9987
EMail: angela.chiu@celion.com

Anwar Elwalid
Lucent Technologies
Murray Hill, NJ 07974

Phone: 908 582-7589
EMail: anwar@lucent.com

Indra Widjaja
Bell Labs, Lucent Technologies
600 Mountain Avenue
Murray Hill, NJ 07974

Phone: 908 582-0435
EMail: iwidjaja@research.bell-labs.com

XiPeng Xiao
Redback Networks
300 Holger Way
San Jose, CA 95134

Phone: 408-750-5217
EMail: xipeng@redback.com

The first version of this document was produced by the TEAS Working Group's RFC3272bis Design Team. The team members are all Contributors to this document. They were:

Acee Lindem
EMail: acee@cisco.com

Adrian Farrel
EMail: adrian@olddog.co.uk

Aijun Wang
EMail: wangaijun@tsinghua.org.cn

Daniele Ceccarelli
EMail: daniele.ceccarelli@ericsson.com

Dieter Beller
EMail: dieter.beller@nokia.com

Jeff Tantsura
EMail: jefftant.ietf@gmail.com

Julien Meuric
EMail: julien.meuric@orange.com

Liu Hua
EMail: hliu@ciena.com

Loa Andersson
EMail: loa@pi.nu

Luis Miguel Contreras
EMail: luismiguel.contrerasmurillo@telefonica.com

Martin Horneffer
EMail: Martin.Horneffer@telekom.de

Tarek Saad
EMail: tsaad@cisco.com

Xufeng Liu
EMail: xufeng.liu.ietf@gmail.com

14. Informative References

[ASH2] Ash, J., "Dynamic Routing in Telecommunications Networks", Book McGraw Hill, 1998.

[AWD1] Awduche, D. and Y. Rekhter, "Multiprotocol Lambda Switching - Combining MPLS Traffic Engineering Control with Optical Crossconnects", Article IEEE Communications Magazine, March 2001.

[AWD2] Awduche, D., "MPLS and Traffic Engineering in IP Networks", Article IEEE Communications Magazine, December 1999.

[AWD5] Awduche, D., "An Approach to Optimal Peering Between Autonomous Systems in the Internet", Paper International Conference on Computer Communications and Networks (ICCCN'98), October 1998.

[CRUZ] Cruz, R., "A Calculus for Network Delay, Part II, Network Analysis", Transaction IEEE Transactions on Information Theory, vol. 37, pp. 132-141, 1991.

[ELW95] Elwalid, A., Mitra, D., and R. Wentworth, "A New Approach for Allocating Buffers and Bandwidth to Heterogeneous, Regulated Traffic in an ATM Node", Article IEEE Journal on Selected Areas in Communications, 13.6, pp. 1115-1127, August 1995.

[FLJA93] Floyd, S. and V. Jacobson, "Random Early Detection Gateways for Congestion Avoidance", Article IEEE/ACM Transactions on Networking, Vol. 1, p. 387-413, November 1993.

[FLOY94] Floyd, S., "TCP and Explicit Congestion Notification", Article ACM Computer Communication Review, V. 24, No. 5, p. 10-23, October 1994.

[FT00] Fortz, B. and M. Thorup, "Internet Traffic Engineering by Optimizing OSPF Weights", Article IEEE INFOCOM 2000, March 2000.
[FT01] Fortz, B. and M. Thorup, "Optimizing OSPF/IS-IS Weights in a Changing World", IEEE Journal on Selected Areas in Communications, Vol. 20, No. 4, May 2002.

[HUSS87] Hurley, B., Seidl, C., and W. Sewel, "A Survey of Dynamic Routing Methods for Circuit-Switched Traffic", IEEE Communications Magazine, September 1987.

[I-D.ietf-teas-yang-te-topo] Liu, X., Bryskin, I., Beeram, V., Saad, T., Shah, H., and O. Dios, "YANG Data Model for Traffic Engineering (TE) Topologies", draft-ietf-teas-yang-te-topo-22 (work in progress), June 2019.

[I-D.ietf-tewg-qos-routing] Ash, G., "Traffic Engineering & QoS Methods for IP-, ATM-, & TDM-Based Multiservice Networks", draft-ietf-tewg-qos-routing-04 (work in progress), October 2001.

[ITU-E600] "Terms and Definitions of Traffic Engineering", ITU-T Recommendation E.600, March 1993.

[ITU-E701] "Reference Connections for Traffic Engineering", ITU-T Recommendation E.701, October 1993.

[ITU-E801] "Framework for Service Quality Agreement", ITU-T Recommendation E.801, October 1996.

[MA] Ma, Q., "Quality of Service Routing in Integrated Services Networks", PhD Dissertation, CMU-CS-98-138, CMU, 1998.

[MATE] Elwalid, A., Jin, C., Low, S., and I. Widjaja, "MATE - MPLS Adaptive Traffic Engineering", Proceedings of INFOCOM'01, April 2001.

[MCQ80] McQuillan, J., Richer, I., and E. Rosen, "The New Routing Algorithm for the ARPANET", IEEE Transactions on Communications, Vol. 28, No. 5, pp. 711-719, May 1980.

[MR99] Mitra, D. and K. Ramakrishnan, "A Case Study of Multiservice, Multipriority Traffic Engineering Design for Data Networks", Proceedings of Globecom'99, December 1999.

[RFC1992] Castineyra, I., Chiappa, N., and M. Steenstrup, "The Nimrod Routing Architecture", RFC 1992, DOI 10.17487/RFC1992, August 1996, <https://www.rfc-editor.org/info/rfc1992>.

[RFC2205] Braden, R., Ed., Zhang, L., Berson, S., Herzog, S., and S. Jamin, "Resource ReSerVation Protocol (RSVP) -- Version 1 Functional Specification", RFC 2205, DOI 10.17487/RFC2205, September 1997, <https://www.rfc-editor.org/info/rfc2205>.

[RFC2211] Wroclawski, J., "Specification of the Controlled-Load Network Element Service", RFC 2211, DOI 10.17487/RFC2211, September 1997, <https://www.rfc-editor.org/info/rfc2211>.

[RFC2212] Shenker, S., Partridge, C., and R. Guerin, "Specification of Guaranteed Quality of Service", RFC 2212, DOI 10.17487/RFC2212, September 1997, <https://www.rfc-editor.org/info/rfc2212>.

[RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, DOI 10.17487/RFC2328, April 1998, <https://www.rfc-editor.org/info/rfc2328>.

[RFC2330] Paxson, V., Almes, G., Mahdavi, J., and M. Mathis, "Framework for IP Performance Metrics", RFC 2330, DOI 10.17487/RFC2330, May 1998, <https://www.rfc-editor.org/info/rfc2330>.

[RFC2386] Crawley, E., Nair, R., Rajagopalan, B., and H. Sandick, "A Framework for QoS-based Routing in the Internet", RFC 2386, DOI 10.17487/RFC2386, August 1998, <https://www.rfc-editor.org/info/rfc2386>.

[RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, "Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers", RFC 2474, DOI 10.17487/RFC2474, December 1998, <https://www.rfc-editor.org/info/rfc2474>.

[RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., and W. Weiss, "An Architecture for Differentiated Services", RFC 2475, DOI 10.17487/RFC2475, December 1998, <https://www.rfc-editor.org/info/rfc2475>.

[RFC2597] Heinanen, J., Baker, F., Weiss, W., and J. Wroclawski, "Assured Forwarding PHB Group", RFC 2597, DOI 10.17487/RFC2597, June 1999, <https://www.rfc-editor.org/info/rfc2597>.
[RFC2678] Mahdavi, J. and V. Paxson, "IPPM Metrics for Measuring Connectivity", RFC 2678, DOI 10.17487/RFC2678, September 1999, <https://www.rfc-editor.org/info/rfc2678>.

[RFC2702] Awduche, D., Malcolm, J., Agogbua, J., O'Dell, M., and J. McManus, "Requirements for Traffic Engineering Over MPLS", RFC 2702, DOI 10.17487/RFC2702, September 1999, <https://www.rfc-editor.org/info/rfc2702>.

[RFC2722] Brownlee, N., Mills, C., and G. Ruth, "Traffic Flow Measurement: Architecture", RFC 2722, DOI 10.17487/RFC2722, October 1999, <https://www.rfc-editor.org/info/rfc2722>.

[RFC2753] Yavatkar, R., Pendarakis, D., and R. Guerin, "A Framework for Policy-based Admission Control", RFC 2753, DOI 10.17487/RFC2753, January 2000, <https://www.rfc-editor.org/info/rfc2753>.

[RFC2961] Berger, L., Gan, D., Swallow, G., Pan, P., Tommasi, F., and S. Molendini, "RSVP Refresh Overhead Reduction Extensions", RFC 2961, DOI 10.17487/RFC2961, April 2001, <https://www.rfc-editor.org/info/rfc2961>.

[RFC2998] Bernet, Y., Ford, P., Yavatkar, R., Baker, F., Zhang, L., Speer, M., Braden, R., Davie, B., Wroclawski, J., and E. Felstaine, "A Framework for Integrated Services Operation over Diffserv Networks", RFC 2998, DOI 10.17487/RFC2998, November 2000, <https://www.rfc-editor.org/info/rfc2998>.

[RFC3031] Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol Label Switching Architecture", RFC 3031, DOI 10.17487/RFC3031, January 2001, <https://www.rfc-editor.org/info/rfc3031>.

[RFC3086] Nichols, K. and B. Carpenter, "Definition of Differentiated Services Per Domain Behaviors and Rules for their Specification", RFC 3086, DOI 10.17487/RFC3086, April 2001, <https://www.rfc-editor.org/info/rfc3086>.

[RFC3124] Balakrishnan, H. and S. Seshan, "The Congestion Manager", RFC 3124, DOI 10.17487/RFC3124, June 2001, <https://www.rfc-editor.org/info/rfc3124>.

[RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001, <https://www.rfc-editor.org/info/rfc3209>.

[RFC3270] Le Faucheur, F., Wu, L., Davie, B., Davari, S., Vaananen, P., Krishnan, R., Cheval, P., and J. Heinanen, "Multi-Protocol Label Switching (MPLS) Support of Differentiated Services", RFC 3270, DOI 10.17487/RFC3270, May 2002, <https://www.rfc-editor.org/info/rfc3270>.

[RFC3272] Awduche, D., Chiu, A., Elwalid, A., Widjaja, I., and X. Xiao, "Overview and Principles of Internet Traffic Engineering", RFC 3272, DOI 10.17487/RFC3272, May 2002, <https://www.rfc-editor.org/info/rfc3272>.

[RFC3469] Sharma, V., Ed. and F. Hellstrand, Ed., "Framework for Multi-Protocol Label Switching (MPLS)-based Recovery", RFC 3469, DOI 10.17487/RFC3469, February 2003, <https://www.rfc-editor.org/info/rfc3469>.

[RFC3630] Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering (TE) Extensions to OSPF Version 2", RFC 3630, DOI 10.17487/RFC3630, September 2003, <https://www.rfc-editor.org/info/rfc3630>.

[RFC3945] Mannie, E., Ed., "Generalized Multi-Protocol Label Switching (GMPLS) Architecture", RFC 3945, DOI 10.17487/RFC3945, October 2004, <https://www.rfc-editor.org/info/rfc3945>.

[RFC4124] Le Faucheur, F., Ed., "Protocol Extensions for Support of Diffserv-aware MPLS Traffic Engineering", RFC 4124, DOI 10.17487/RFC4124, June 2005, <https://www.rfc-editor.org/info/rfc4124>.

[RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A Border Gateway Protocol 4 (BGP-4)", RFC 4271, DOI 10.17487/RFC4271, January 2006, <https://www.rfc-editor.org/info/rfc4271>.

[RFC5305] Li, T. and H. Smit, "IS-IS Extensions for Traffic Engineering", RFC 5305, DOI 10.17487/RFC5305, October 2008, <https://www.rfc-editor.org/info/rfc5305>.

[RFC5440] Vasseur, JP., Ed. and JL. Le Roux, Ed., "Path Computation Element (PCE) Communication Protocol (PCEP)", RFC 5440, DOI 10.17487/RFC5440, March 2009, <https://www.rfc-editor.org/info/rfc5440>.
[RFC6241] Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed., and A. Bierman, Ed., "Network Configuration Protocol (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011, <https://www.rfc-editor.org/info/rfc6241>.

[RFC7390] Rahman, A., Ed. and E. Dijk, Ed., "Group Communication for the Constrained Application Protocol (CoAP)", RFC 7390, DOI 10.17487/RFC7390, October 2014, <https://www.rfc-editor.org/info/rfc7390>.

[RFC7679] Almes, G., Kalidindi, S., Zekauskas, M., and A. Morton, Ed., "A One-Way Delay Metric for IP Performance Metrics (IPPM)", STD 81, RFC 7679, DOI 10.17487/RFC7679, January 2016, <https://www.rfc-editor.org/info/rfc7679>.

[RFC7680] Almes, G., Kalidindi, S., Zekauskas, M., and A. Morton, Ed., "A One-Way Loss Metric for IP Performance Metrics (IPPM)", STD 82, RFC 7680, DOI 10.17487/RFC7680, January 2016, <https://www.rfc-editor.org/info/rfc7680>.

[RFC7950] Bjorklund, M., Ed., "The YANG 1.1 Data Modeling Language", RFC 7950, DOI 10.17487/RFC7950, August 2016, <https://www.rfc-editor.org/info/rfc7950>.

[RFC8040] Bierman, A., Bjorklund, M., and K. Watsen, "RESTCONF Protocol", RFC 8040, DOI 10.17487/RFC8040, January 2017, <https://www.rfc-editor.org/info/rfc8040>.

[RR94] Rodrigues, M. and K. Ramakrishnan, "Optimal Routing in Shortest Path Networks", Proceedings of ITS'94, Rio de Janeiro, Brazil, 1994.

[SLDC98] Suter, B., Lakshman, T., Stiliadis, D., and A. Choudhury, "Design Considerations for Supporting TCP with Per-flow Queueing", Proceedings of INFOCOM'98, pp. 299-306, 1998.

[WANG] Wang, Y., Wang, Z., and L. Zhang, "Internet traffic engineering without full mesh overlaying", Proceedings of INFOCOM 2001, April 2001.

[XIAO] Xiao, X., Hannan, A., Bailey, B., and L. Ni, "Traffic Engineering with MPLS in the Internet", IEEE Network Magazine, March 2000.

[YARE95] Yang, C. and A. Reddy, "A Taxonomy for Congestion Control Algorithms in Packet Switching Networks", IEEE Network Magazine, pp. 34-45, 1995.

Author's Address

Adrian Farrel (editor)
Old Dog Consulting

Email: adrian@olddog.co.uk