2 TEAS Working Group A. Farrel, Ed. 3 Internet-Draft Old Dog Consulting 4 Obsoletes: 3272 (if approved) February 5, 2020 5 Intended status: Informational 6 Expires: August 8, 2020 8 Overview and Principles of Internet Traffic Engineering 9 draft-dt-teas-rfc3272bis-07 11 Abstract 13 This memo describes the principles of Traffic Engineering (TE) in the 14 Internet. The document is intended to promote better understanding 15 of the issues surrounding traffic engineering in IP networks, and to 16 provide a common basis for the development of traffic engineering 17 capabilities for the Internet. The principles, architectures, and 18 methodologies for performance evaluation and performance optimization 19 of operational IP networks are discussed throughout this document. 21 This work was first published as RFC 3272 in May 2002. This document 22 obsoletes RFC 3272 by making a complete update to bring the text in 23 line with current best practices for Internet traffic engineering and 24 to include references to the latest relevant work in the IETF. 26 Status of This Memo 28 This Internet-Draft is submitted in full conformance with the 29 provisions of BCP 78 and BCP 79. 31 Internet-Drafts are working documents of the Internet Engineering 32 Task Force (IETF). Note that other groups may also distribute 33 working documents as Internet-Drafts. The list of current Internet- 34 Drafts is at https://datatracker.ietf.org/drafts/current/. 36 Internet-Drafts are draft documents valid for a maximum of six months 37 and may be updated, replaced, or obsoleted by other documents at any 38 time. It is inappropriate to use Internet-Drafts as reference 39 material or to cite them other than as "work in progress." 41 This Internet-Draft will expire on August 8, 2020. 43 Copyright Notice 45 Copyright (c) 2020 IETF Trust and the persons identified as the 46 document authors. All rights reserved. 48 This document is subject to BCP 78 and the IETF Trust's Legal 49 Provisions Relating to IETF Documents 50 (https://trustee.ietf.org/license-info) in effect on the date of 51 publication of this document.
Please review these documents 52 carefully, as they describe your rights and restrictions with respect 53 to this document. Code Components extracted from this document must 54 include Simplified BSD License text as described in Section 4.e of 55 the Trust Legal Provisions and are provided without warranty as 56 described in the Simplified BSD License. 58 Table of Contents 60 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 61 1.1. What is Internet Traffic Engineering? . . . . . . . . . . 4 62 1.2. Scope . . . . . . . . . . . . . . . . . . . . . . . . . . 7 63 1.3. Terminology . . . . . . . . . . . . . . . . . . . . . . . 7 64 2. Background . . . . . . . . . . . . . . . . . . . . . . . . . 10 65 2.1. Context of Internet Traffic Engineering . . . . . . . . . 11 66 2.2. Network Context . . . . . . . . . . . . . . . . . . . . . 12 67 2.3. Problem Context . . . . . . . . . . . . . . . . . . . . . 13 68 2.3.1. Congestion and its Ramifications . . . . . . . . . . 15 69 2.4. Solution Context . . . . . . . . . . . . . . . . . . . . 15 70 2.4.1. Combating the Congestion Problem . . . . . . . . . . 17 71 2.5. Implementation and Operational Context . . . . . . . . . 20 72 2.6. High-Level Objectives . . . . . . . . . . . . . . . . . . 21 73 3. Traffic Engineering Process Models . . . . . . . . . . . . . 23 74 3.1. Components of the Traffic Engineering Process Model . . . 24 75 3.2. Measurement . . . . . . . . . . . . . . . . . . . . . . . 24 76 3.3. Modeling, Analysis, and Simulation . . . . . . . . . . . 25 77 3.4. Optimization . . . . . . . . . . . . . . . . . . . . . . 26 78 4. Review of TE Techniques . . . . . . . . . . . . . . . . . . . 27 79 4.1. Historic Overview . . . . . . . . . . . . . . . . . . . . 28 80 4.1.1. Traffic Engineering in Classical Telephone Networks . 28 81 4.1.2. Evolution of Traffic Engineering in Packet Networks . 29 82 4.2. Development of Internet Traffic Engineering . . . . . . . 32 83 4.2.1. Overlay Model . . . . . . . . . . . . . . . . . . . . 32 84 4.2.2. Constraint-Based Routing . . . . . . . . . . . . . . 33 85 4.3. Overview of IETF Projects Related to Traffic Engineering 33 86 4.3.1. Integrated Services . . . . . . . . . . . . . . . . . 33 87 4.3.2. RSVP . . . . . . . . . . . . . . . . . . . . . . . . 34 88 4.3.3. Differentiated Services . . . . . . . . . . . . . . . 35 89 4.3.4. MPLS . . . . . . . . . . . . . . . . . . . . . . . . 36 90 4.3.5. Generalized MPLS . . . . . . . . . . . . . . . . . . 37 91 4.3.6. IP Performance Metrics . . . . . . . . . . . . . . . 37 92 4.3.7. Flow Measurement . . . . . . . . . . . . . . . . . . 38 93 4.3.8. Endpoint Congestion Management . . . . . . . . . . . 38 94 4.3.9. TE Extensions to the IGPs . . . . . . . . . . . . . . 38 95 4.3.10. Link-State BGP . . . . . . . . . . . . . . . . . . . 39 96 4.3.11. Path Computation Element . . . . . . . . . . . . . . 39 97 4.3.12. Application-Layer Traffic Optimization . . . . . . . 40 98 4.3.13. Segment Routing with MPLS encapsulation (SR-MPLS) . . 40 99 4.3.14. Network Virtualization and Abstraction . . . . . . . 41 100 4.3.15. Deterministic Networking . . . . . . . . . . . . . . 41 101 4.3.16. Network TE State Definition and Presentation . . . . 41 102 4.3.17. System Management and Control Interfaces . . . . . . 41 103 4.4. Overview of ITU Activities Related to Traffic Engineering 42 104 4.5. Content Distribution . . . . . . . . . . . . . . . . . . 43 105 5. Taxonomy of Traffic Engineering Systems . . . . . . . . . . . 44 106 5.1.
Time-Dependent Versus State-Dependent Versus Event 107 Dependent . . . . . . . . . . . . . . . . . . . . . . . . 44 108 5.2. Offline Versus Online . . . . . . . . . . . . . . . . . . 45 109 5.3. Centralized Versus Distributed . . . . . . . . . . . . . 46 110 5.3.1. Hybrid Systems . . . . . . . . . . . . . . . . . . . 46 111 5.3.2. Considerations for Software Defined Networking . . . 46 112 5.4. Local Versus Global . . . . . . . . . . . . . . . . . . . 46 113 5.5. Prescriptive Versus Descriptive . . . . . . . . . . . . . 47 114 5.5.1. Intent-Based Networking . . . . . . . . . . . . . . . 47 115 5.6. Open-Loop Versus Closed-Loop . . . . . . . . . . . . . . 47 116 5.7. Tactical vs Strategic . . . . . . . . . . . . . . . . . . 47 117 6. Objectives for Internet Traffic Engineering . . . . . . . . . 48 118 6.1. Routing . . . . . . . . . . . . . . . . . . . . . . . . . 48 119 6.2. Traffic Mapping . . . . . . . . . . . . . . . . . . . . . 51 120 6.3. Measurement . . . . . . . . . . . . . . . . . . . . . . . 51 121 6.4. Network Survivability . . . . . . . . . . . . . . . . . . 52 122 6.4.1. Survivability in MPLS Based Networks . . . . . . . . 55 123 6.4.2. Protection Option . . . . . . . . . . . . . . . . . . 56 124 6.5. Traffic Engineering in Diffserv Environments . . . . . . 56 125 6.6. Network Controllability . . . . . . . . . . . . . . . . . 58 126 7. Inter-Domain Considerations . . . . . . . . . . . . . . . . . 59 127 8. Overview of Contemporary TE Practices in Operational IP 128 Networks . . . . . . . . . . . . . . . . . . . . . . . . . . 61 129 9. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . 65 130 10. Security Considerations . . . . . . . . . . . . . . . . . . . 66 131 11. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 66 132 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 66 133 13. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 66 134 14. Informative References . . . . . . . . . . . . . . . . . . . 69 135 Author's Address . . . . . . . . . . . . . . . . . . . . . . . . 76 137 1. Introduction 139 This memo describes the principles of Internet traffic engineering. 140 The objective of the document is to articulate the general issues and 141 principles for Internet traffic engineering; and where appropriate to 142 provide recommendations, guidelines, and options for the development 143 of online and offline Internet traffic engineering capabilities and 144 support systems. 146 This document can aid service providers in devising and implementing 147 traffic engineering solutions for their networks. Networking 148 hardware and software vendors will also find this document helpful in 149 the development of mechanisms and support systems for the Internet 150 environment that support the traffic engineering function. 152 This document provides a terminology for describing and understanding 153 common Internet traffic engineering concepts. This document also 154 provides a taxonomy of known traffic engineering styles. In this 155 context, a traffic engineering style abstracts important aspects from 156 a traffic engineering methodology. Traffic engineering styles can be 157 viewed in different ways depending upon the specific context in which 158 they are used and the specific purpose which they serve. The 159 combination of styles and views results in a natural taxonomy of 160 traffic engineering systems. 
162 Even though Internet traffic engineering is most effective when 163 applied end-to-end, the focus of this document is traffic 164 engineering within a given autonomous system. However, because a 165 preponderance of Internet traffic tends to originate in one 166 autonomous system and terminate in another, this document provides 167 an overview of aspects pertaining to inter-domain traffic 168 engineering. 170 This work was first published as [RFC3272] in May 2002. This 171 document obsoletes [RFC3272] by making a complete update to bring the 172 text in line with current best practices for Internet traffic 173 engineering and to include references to the latest relevant work in 174 the IETF. 176 1.1. What is Internet Traffic Engineering? 178 The Internet exists in order to transfer information from source 179 nodes to destination nodes. Accordingly, one of the most significant 180 functions performed by the Internet is the routing of traffic from 181 ingress nodes to egress nodes. Therefore, one of the most 182 distinctive functions performed by Internet traffic engineering is 183 the control and optimization of the routing function, to steer 184 traffic through the network. 186 Internet traffic engineering is defined as that aspect of Internet 187 network engineering dealing with the issue of performance evaluation 188 and performance optimization of operational IP networks. Traffic 189 Engineering encompasses the application of technology and scientific 190 principles to the measurement, characterization, modeling, and 191 control of Internet traffic [RFC2702], [AWD2]. 193 Ultimately, it is the performance of the network as seen by end users 194 of network services that is truly paramount. The characteristics 195 visible to end users are the emergent properties of the network, 196 which are the characteristics of the network when viewed as a whole. 197 A central goal of the service provider, therefore, is to enhance the 198 emergent properties of the network while taking economic 199 considerations into account. This is accomplished by addressing 200 traffic oriented performance requirements, while utilizing network 201 resources economically and reliably. Traffic oriented performance 202 measures include delay, delay variation, packet loss, and throughput. 204 Internet traffic engineering responds to network events at different 205 temporal resolutions. Certain aspects of capacity 206 management, such as capacity planning, respond at very coarse 207 temporal levels, ranging from days to possibly years. The 208 introduction of automatically switched optical transport networks 209 (e.g., based on GMPLS concepts, see Section 4.3.5) could 210 significantly reduce the lifecycle for capacity planning by 211 expediting provisioning of optical bandwidth. Routing control 212 functions operate at intermediate levels of temporal resolution, 213 ranging from milliseconds to days. Finally, the packet level 214 processing functions (e.g., rate shaping, queue management, and 215 scheduling) operate at very fine levels of temporal resolution, 216 ranging from picoseconds to milliseconds while responding to the 217 real-time statistical behavior of traffic. 219 Thus, the optimization aspects of traffic engineering can be viewed 220 from a control perspective and can be pro-active and/or reactive. In 221 the pro-active case, the traffic engineering control system takes 222 preventive action to obviate predicted unfavorable future network 223 states, for example by
engineering a backup path. It may also take 224 perfective action to induce a more desirable state in the future. In 225 the reactive case, the control system responds correctively and 226 perhaps adaptively to events that have already transpired in the 227 network, such as re-routing traffic after a failure. 229 Another important objective of Internet traffic engineering is to 230 facilitate reliable network operations [RFC2702]. Reliable network 231 operations can be facilitated by providing mechanisms that enhance 232 network integrity and by embracing policies emphasizing network 233 survivability. This results in a minimization of the vulnerability 234 of the network to service outages arising from errors, faults, and 235 failures occurring within the infrastructure. 237 The optimization aspects of traffic engineering can be achieved 238 through capacity management and traffic management. As used in this 239 document, capacity management includes capacity planning, routing 240 control, and resource management. Network resources of particular 241 interest include link bandwidth, buffer space, and computational 242 resources. Likewise, as used in this document, traffic management 243 includes (1) nodal traffic control functions such as traffic 244 conditioning, queue management, and scheduling, and (2) other functions 245 that regulate traffic flow through the network or that arbitrate 246 access to network resources between different packets or between 247 different traffic streams. 249 One major challenge of Internet traffic engineering is the 250 realization of automated control capabilities that adapt quickly and 251 cost effectively to significant changes in a network's state, while 252 still maintaining stability of the network. Results from performance 253 evaluation assessing the effectiveness of traffic engineering methods 254 can be used to identify existing problems, guide network re- 255 optimization, and aid in the prediction of potential future problems. 256 However, this process can also be time consuming and may not be suitable 257 for acting on sudden, ephemeral changes in the network. 259 Performance evaluation can be achieved in many different ways. The 260 most notable techniques include analytical methods, simulation, and 261 empirical methods based on measurements. The process can be quite 262 complicated in practical network contexts. For example, simplifying 263 concepts such as effective bandwidth and effective buffer [ELW95] may 264 be used to approximate nodal behaviors at the packet level and 265 simplify the analysis at the connection level. A set of concepts 266 known as network calculus [CRUZ] based on deterministic bounds may 267 simplify network analysis relative to classical stochastic 268 techniques. 270 In many cases, Internet traffic engineering is about finding the 271 least bad action to take to enhance the performance of the network. 272 For this, it is necessary to reliably predict the impact of potential 273 corrective actions in a networking context. Such a prediction often 274 relies on the accuracy of a simulation model identifying the root cause 275 and duration of a bottleneck, as well as on the effectiveness of 276 corrective actions and their side-effects over time. 278 In general, traffic engineering comes in two flavors: either as a 279 background process that constantly monitors traffic and optimizes the 280 usage of resources to improve performance, or in the form of a pre- 281 planned traffic distribution that is considered optimal. 282 In the latter case, any deviation from the optimum distribution (e.g., 283 caused by a fiber cut) is reverted upon repair without further 284 optimization. However, this form of traffic engineering relies heavily 285 upon the notion that the planned state of the network is 286 indeed optimal. Hence, in such a mode there are two levels of 287 traffic engineering: the TE-planning task to enable an optimum 288 traffic distribution, and the routing task keeping traffic flows 289 attached to the pre-planned distribution.
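The two modes can be pictured with a small, purely illustrative sketch. The "network" object and its methods are hypothetical placeholders introduced only for this example; they are not an API from any real system.

   # Purely illustrative sketch of the two traffic engineering modes
   # described above.  The "network" object and its methods
   # (measure_demands, compute_paths, install_paths, await_repair) are
   # hypothetical placeholders, not an API from any real system.

   import time

   def background_reoptimization(network, interval_s=300):
       # Mode 1: constantly monitor traffic and re-optimize resources.
       while True:
           demands = network.measure_demands()     # current offered load
           network.install_paths(network.compute_paths(demands))
           time.sleep(interval_s)

   def preplanned_distribution(network, planned_paths):
       # Mode 2: hold the network to a pre-planned distribution that is
       # considered optimal.  A deviation (e.g., caused by a fiber cut)
       # is simply reverted upon repair, without further optimization.
       network.install_paths(planned_paths)
       while True:
           network.await_repair()                  # block until repair
           network.install_paths(planned_paths)    # revert to planned state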
291 As a general rule, traffic engineering concepts and mechanisms must 292 be sufficiently specific and well defined to address known 293 requirements, but simultaneously flexible and extensible to 294 accommodate unforeseen future demands. Optimizing the wrong measures 295 may achieve certain local objectives, but may have disastrous 296 consequences on the emergent properties of the network. 298 1.2. Scope 300 The scope of this document is intra-domain traffic engineering; that 301 is, traffic engineering within a given autonomous system in the 302 Internet. This document will discuss concepts pertaining to intra- 303 domain traffic control, including such issues as routing control, 304 micro and macro resource allocation, and the control coordination 305 problems that arise consequently. 307 This document will describe and characterize techniques already in 308 use or in advanced development for Internet traffic engineering. The 309 way these techniques fit together will be discussed and scenarios in 310 which they are useful will be identified. 312 While this document considers various intra-domain traffic 313 engineering approaches, it focuses more on traffic engineering with 314 MPLS and GMPLS. Traffic engineering based upon manipulation of IGP 315 metrics is not addressed in detail. This topic may be addressed by 316 other working group document(s). 318 Although the emphasis is on intra-domain traffic engineering, in 319 Section 7, an overview of the high level considerations pertaining to 320 inter-domain traffic engineering will be provided. Inter-domain 321 Internet traffic engineering is crucial to the performance 322 enhancement of the global Internet infrastructure. 324 Whenever possible, relevant requirements from existing IETF documents 325 and other sources will be incorporated by reference. 327 1.3. Terminology 329 This subsection provides terminology which is useful for Internet 330 traffic engineering. The definitions presented apply to this 331 document. These terms may have other meanings elsewhere. 333 Baseline analysis A study conducted to serve as a baseline for 334 comparison to the actual behavior of the network. 336 Busy hour A one hour period within a specified interval of time 337 (typically 24 hours) in which the traffic load in a network or 338 sub-network is greatest. 340 Bottleneck A network element whose input traffic rate tends to be 341 greater than its output rate. 343 Congestion A state of a network resource in which the traffic 344 incident on the resource exceeds its output capacity over an 345 interval of time. 347 Congestion avoidance An approach to congestion management that 348 attempts to obviate the occurrence of congestion. 350 Congestion control An approach to congestion management that 351 attempts to remedy congestion problems that have already occurred.
353 Constraint-based routing A class of routing protocols that take 354 specified traffic attributes, network constraints, and policy 355 constraints into account when making routing decisions. 356 Constraint-based routing is applicable to traffic aggregates as 357 well as flows. It is a generalization of QoS routing. 359 Demand side congestion management A congestion management scheme 360 that addresses congestion problems by regulating or conditioning 361 offered load. 363 Effective bandwidth The minimum amount of bandwidth that can be 364 assigned to a flow or traffic aggregate in order to deliver 365 'acceptable service quality' to the flow or traffic aggregate. 367 Egress traffic Traffic exiting a network or network element. 369 Hot-spot A network element or subsystem which is in a state of 370 congestion. 372 Ingress traffic Traffic entering a network or network element. 374 Inter-domain traffic Traffic that originates in one Autonomous 375 system and terminates in another. 377 Loss network A network that does not provide adequate buffering for 378 traffic, so that traffic entering a busy resource within the 379 network will be dropped rather than queued. 381 Metric A parameter defined in terms of standard units of 382 measurement. 384 Measurement Methodology A repeatable measurement technique used to 385 derive one or more metrics of interest. 387 Network Survivability The capability to provide a prescribed level 388 of QoS for existing services after a given number of failures 389 occur within the network. 391 Offline traffic engineering A traffic engineering system that exists 392 outside of the network. 394 Online traffic engineering A traffic engineering system that exists 395 within the network, typically implemented on or as adjuncts to 396 operational network elements. 398 Performance measures Metrics that provide quantitative or 399 qualitative measures of the performance of systems or subsystems 400 of interest. 402 Performance management A systematic approach to improving 403 effectiveness in the accomplishment of specific networking goals 404 related to performance improvement. 406 Performance Metric A performance parameter defined in terms of 407 standard units of measurement. 409 Provisioning The process of assigning or configuring network 410 resources to meet certain requests. 412 QoS routing Class of routing systems that selects paths to be used 413 by a flow based on the QoS requirements of the flow. 415 Service Level Agreement A contract between a provider and a customer 416 that guarantees specific levels of performance and reliability at 417 a certain cost. 419 Stability An operational state in which a network does not oscillate 420 in a disruptive manner from one mode to another mode. 422 Supply side congestion management A congestion management scheme 423 that provisions additional network resources to address existing 424 and/or anticipated congestion problems. 426 Transit traffic Traffic whose origin and destination are both 427 outside of the network under consideration. 429 Traffic characteristic A description of the temporal behavior or a 430 description of the attributes of a given traffic flow or traffic 431 aggregate. 433 Traffic engineering system A collection of objects, mechanisms, and 434 protocols that are used conjunctively to accomplish traffic 435 engineering objectives. 437 Traffic flow A stream of packets between two end-points that can be 438 characterized in a certain way. 
A micro-flow has a more specific 439 definition: a micro-flow is a stream of packets with the same 440 source and destination addresses, source and destination ports, 441 and protocol ID. 443 Traffic intensity A measure of traffic loading with respect to a 444 resource capacity over a specified period of time. In classical 445 telephony systems, traffic intensity is measured in units of 446 Erlangs. 448 Traffic matrix A representation of the traffic demand between a set 449 of origin and destination abstract nodes. An abstract node can 450 consist of one or more network elements. 452 Traffic monitoring The process of observing traffic characteristics 453 at a given point in a network and collecting the traffic 454 information for analysis and further action. 456 Traffic trunk An aggregation of traffic flows belonging to the same 457 class which are forwarded through a common path. A traffic trunk 458 may be characterized by an ingress and egress node, and a set of 459 attributes which determine its behavioral characteristics and 460 requirements from the network. 462 2. Background 464 The Internet has quickly evolved into a very critical communications 465 infrastructure, supporting significant economic, educational, and 466 social activities. Simultaneously, the delivery of Internet 467 communications services has become very competitive and end-users are 468 demanding very high quality service from their service providers. 469 Consequently, performance optimization of large scale IP networks, 470 especially public Internet backbones, has become an important 471 problem. Network performance requirements are multi-dimensional, 472 complex, and sometimes contradictory, making the traffic engineering 473 problem very challenging. 475 The network must convey IP packets from ingress nodes to egress nodes 476 efficiently, expeditiously, and economically. Furthermore, in a 477 multiclass service environment (e.g., Diffserv capable networks), the 478 resource sharing parameters of the network must be appropriately 479 determined and configured according to prevailing policies and 480 service models to resolve resource contention issues arising from 481 mutual interference between packets traversing the network. 482 Thus, consideration must be given to resolving competition for 483 network resources between traffic streams belonging to the same 484 service class (intra-class contention resolution) and traffic streams 485 belonging to different classes (inter-class contention resolution). 487 2.1. Context of Internet Traffic Engineering 489 The context of Internet traffic engineering pertains to the scenarios 490 where traffic engineering is used. A traffic engineering methodology 491 establishes appropriate rules to resolve traffic performance issues 492 occurring in a specific context. The context of Internet traffic 493 engineering includes: 495 1. A network context defining the universe of discourse, and in 496 particular the situations in which the traffic engineering 497 problems occur. The network context includes network structure, 498 network policies, network characteristics, network constraints, 499 network quality attributes, and network optimization criteria. 501 2. A problem context defining the general and concrete issues that 502 traffic engineering addresses.
The problem context includes 503 identification, abstraction of relevant features, representation, 504 formulation, specification of the requirements on the solution 505 space, and specification of the desirable features of acceptable 506 solutions. 508 3. A solution context suggesting how to address the issues 509 identified by the problem context. The solution context includes 510 analysis, evaluation of alternatives, prescription, and 511 resolution. 513 4. An implementation and operational context in which the solutions 514 are methodologically instantiated. The implementation and 515 operational context includes planning, organization, and 516 execution. 518 The context of Internet traffic engineering and the different problem 519 scenarios are discussed in the following subsections. 521 2.2. Network Context 523 IP networks range in size from small clusters of routers situated 524 within a given location, to thousands of interconnected routers, 525 switches, and other components distributed all over the world. 527 Conceptually, at the most basic level of abstraction, an IP network 528 can be represented as a distributed dynamical system consisting of: 529 (1) a set of interconnected resources which provide transport 530 services for IP traffic subject to certain constraints, (2) a demand 531 system representing the offered load to be transported through the 532 network, and (3) a response system consisting of network processes, 533 protocols, and related mechanisms which facilitate the movement of 534 traffic through the network (see also [AWD2]). 536 The network elements and resources may have specific characteristics 537 restricting the manner in which the demand is handled. Additionally, 538 network resources may be equipped with traffic control mechanisms 539 superintending the way in which the demand is serviced. Traffic 540 control mechanisms may, for example, be used to control various 541 packet processing activities within a given resource, arbitrate 542 contention for access to the resource by different packets, and 543 regulate traffic behavior through the resource. A configuration 544 management and provisioning system may allow the settings of the 545 traffic control mechanisms to be manipulated by external or internal 546 entities in order to exercise control over the way in which the 547 network elements respond to internal and external stimuli. 549 The details of how the network provides transport services for 550 packets are specified in the policies of the network administrators 551 and are installed through network configuration management and policy- 552 based provisioning systems. Generally, the types of services 553 provided by the network also depend upon the technology and 554 characteristics of the network elements and protocols, the prevailing 555 service and utility models, and the ability of the network 556 administrators to translate policies into network configurations. 558 Contemporary Internet networks have three significant 559 characteristics: (1) they provide real-time services, (2) they have 560 become mission critical, and (3) their operating environments are 561 very dynamic.
The dynamic characteristics of IP networks can be 562 attributed in part to fluctuations in demand, to the interaction 563 between various network protocols and processes, to the rapid 564 evolution of the infrastructure which demands the constant inclusion 565 of new technologies and new network elements, and to transient and 566 persistent impairments which occur within the system. 568 Packets contend for the use of network resources as they are conveyed 569 through the network. A network resource is considered to be 570 congested if the arrival rate of packets exceeds the output capacity 571 of the resource over an interval of time. Congestion may result in 572 some of the arriving packets being delayed or even dropped. 574 Congestion increases transit delays, delay variation, and packet loss, 575 and reduces the predictability of network services. Clearly, 576 congestion is a highly undesirable phenomenon. 578 Combating congestion at a reasonable cost is a major objective of 579 Internet traffic engineering. 581 Efficient sharing of network resources by multiple traffic streams is 582 a basic economic premise for packet switched networks in general and 583 for the Internet in particular. A fundamental challenge in network 584 operation, especially in a large scale public IP network, is to 585 increase the efficiency of resource utilization while minimizing the 586 possibility of congestion. 588 Increasingly, the Internet will have to function in the presence of 589 different classes of traffic with different service requirements. 590 The advent of Differentiated Services [RFC2475] makes this 591 requirement particularly acute. Thus, packets may be grouped into 592 behavior aggregates such that each behavior aggregate may have a 593 common set of behavioral characteristics or a common set of delivery 594 requirements. In practice, the delivery requirements of a specific 595 set of packets may be specified explicitly or implicitly. Two of the 596 most important traffic delivery requirements are capacity constraints 597 and QoS constraints. 599 Capacity constraints can be expressed statistically as peak rates, 600 mean rates, burst sizes, or as some deterministic notion of effective 601 bandwidth. QoS requirements can be expressed in terms of (1) 602 integrity constraints such as packet loss and (2) 603 temporal constraints such as timing restrictions for the delivery of 604 each packet (delay) and timing restrictions for the delivery of 605 consecutive packets belonging to the same traffic stream (delay 606 variation). 608 2.3. Problem Context 610 Fundamental problems exist in association with the operation of a 611 network described by the simple model of the previous subsection. 612 This subsection reviews the problem context in relation to the 613 traffic engineering function. 615 The identification, abstraction, representation, and measurement of 616 network features relevant to traffic engineering are significant 617 issues. 619 One particularly important class of problems concerns how to 620 explicitly formulate the problems that traffic engineering attempts 621 to solve, how to identify the requirements on the solution space, how 622 to specify the desirable features of good solutions, how to actually 623 solve the problems, and how to measure and characterize the 624 effectiveness of the solutions. 626 Another class of problems concerns how to measure and estimate 627 relevant network state parameters. Effective traffic engineering 628 relies on a good estimate of the offered traffic load as well as a 629 view of the underlying topology and associated resource constraints. 630 A network-wide view of the topology is also necessary for offline 631 planning.
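As a deliberately simplified illustration of how an estimate of the offered load, combined with a view of the topology and its resource constraints, can be used, the following sketch routes an example traffic matrix over fixed paths and flags links whose offered load exceeds capacity, i.e., congestion in the sense defined above. The topology, capacities, demands, and routes are invented for illustration only.

   # Simplified illustration: offered load + topology view -> per-link
   # utilization.  The topology, capacities, demands, and routes below
   # are invented examples.

   link_capacity_mbps = {
       ("A", "B"): 1000, ("B", "C"): 1000, ("A", "C"): 400,
   }

   # Estimated traffic matrix: (ingress, egress) -> offered load (Mbps).
   traffic_matrix_mbps = {
       ("A", "C"): 600,
       ("A", "B"): 300,
       ("B", "C"): 500,
   }

   # Assume routing has already selected a path (list of links) for
   # each demand.
   routes = {
       ("A", "C"): [("A", "B"), ("B", "C")],
       ("A", "B"): [("A", "B")],
       ("B", "C"): [("B", "C")],
   }

   def link_utilization(matrix, routes):
       # Accumulate offered load per link along each demand's route.
       load = {link: 0.0 for link in link_capacity_mbps}
       for demand, mbps in matrix.items():
           for link in routes[demand]:
               load[link] += mbps
       return {link: load[link] / link_capacity_mbps[link]
               for link in load}

   for link, util in link_utilization(traffic_matrix_mbps, routes).items():
       status = "congested" if util > 1.0 else "ok"
       print(link, f"{util:.0%}", status)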
633 Still another class of problems concerns how to characterize the 634 state of the network and how to evaluate its performance under a 635 variety of scenarios. The performance evaluation problem is two- 636 fold. One aspect of this problem relates to the evaluation of the 637 system level performance of the network. The other aspect relates to 638 the evaluation of the resource level performance, which restricts 639 attention to the performance analysis of individual network 640 resources. In this memo, we refer to the system level 641 characteristics of the network as the "macro-states" and the resource 642 level characteristics as the "micro-states." The system level 643 characteristics are also known as the emergent properties of the 644 network as noted earlier. Correspondingly, we shall refer to the 645 traffic engineering schemes dealing with network performance 646 optimization at the system level as "macro-TE" and the schemes that 647 optimize at the individual resource level as "micro-TE." Under 648 certain circumstances, the system level performance can be derived 649 from the resource level performance using appropriate rules of 650 composition, depending upon the particular performance measures of 651 interest. 653 Another fundamental class of problems concerns how to effectively 654 optimize network performance. Performance optimization may entail 655 translating solutions to specific traffic engineering problems into 656 network configurations. Optimization may also entail some degree of 657 resource management control, routing control, and/or capacity 658 augmentation. 660 As noted previously, congestion is an undesirable phenomenon in 661 operational networks. Therefore, the next subsection addresses the 662 issue of congestion and its ramifications within the problem context 663 of Internet traffic engineering. 665 2.3.1. Congestion and its Ramifications 667 Congestion is one of the most significant problems in an operational 668 IP context. A network element is said to be congested if it 669 experiences sustained overload over an interval of time. Congestion 670 almost always results in degradation of service quality to end users. 671 Congestion control schemes can include demand side policies and 672 supply side policies. Demand side policies may restrict access to 673 congested resources and/or dynamically regulate the demand to 674 alleviate the overload situation. Supply side policies may expand or 675 augment network capacity to better accommodate offered traffic. 676 Supply side policies may also re-allocate network resources by 677 redistributing traffic over the infrastructure. Traffic 678 redistribution and resource re-allocation serve to increase the 679 'effective capacity' seen by the demand. 681 The emphasis of this memo is primarily on congestion management 682 schemes falling within the scope of the network, rather than on 683 congestion management systems dependent upon sensitivity and 684 adaptivity from end-systems. That is, the aspects that are 685 considered in this memo with respect to congestion management are 686 those solutions that can be provided by control entities operating on 687 the network and by the actions of network administrators and network 688 operations systems. 690 2.4.
Solution Context 692 The solution context for Internet traffic engineering involves 693 analysis, evaluation of alternatives, and choice between alternative 694 courses of action. Generally the solution context is predicated on 695 making reasonable inferences about the current or future state of the 696 network, and subsequently making appropriate decisions that may 697 involve a preference between alternative sets of action. More 698 specifically, the solution context demands reasonable estimates of 699 traffic workload, characterization of network state, deriving 700 solutions to traffic engineering problems which may be implicitly or 701 explicitly formulated, and possibly instantiating a set of control 702 actions. Control actions may involve the manipulation of parameters 703 associated with routing, control over tactical capacity acquisition, 704 and control over the traffic management functions. 706 The following list of instruments may be applicable to the solution 707 context of Internet traffic engineering. 709 1. A set of policies, objectives, and requirements (which may be 710 context dependent) for network performance evaluation and 711 performance optimization. 713 2. A collection of online and possibly offline tools and mechanisms 714 for measurement, characterization, modeling, and control of 715 Internet traffic and control over the placement and allocation of 716 network resources, as well as control over the mapping or 717 distribution of traffic onto the infrastructure. 719 3. A set of constraints on the operating environment, the network 720 protocols, and the traffic engineering system itself. 722 4. A set of quantitative and qualitative techniques and 723 methodologies for abstracting, formulating, and solving traffic 724 engineering problems. 726 5. A set of administrative control parameters which may be 727 manipulated through a Configuration Management (CM) system. The 728 CM system itself may include a configuration control subsystem, a 729 configuration repository, a configuration accounting subsystem, 730 and a configuration auditing subsystem. 732 6. A set of guidelines for network performance evaluation, 733 performance optimization, and performance improvement. 735 Derivation of traffic characteristics through measurement and/or 736 estimation is very useful within the realm of the solution space for 737 traffic engineering. Traffic estimates can be derived from customer 738 subscription information, traffic projections, traffic models, and 739 from actual empirical measurements. The empirical measurements may 740 be performed at the traffic aggregate level or at the flow level in 741 order to derive traffic statistics at various levels of detail. 742 Measurements at the flow level or on small traffic aggregates may be 743 performed at edge nodes, where traffic enters and leaves the network. 744 Measurements at large traffic aggregate levels may be performed 745 within the core of the network where potentially numerous traffic 746 flows may be in transit concurrently. 748 To conduct performance studies and to support planning of existing 749 and future networks, a routing analysis may be performed to determine 750 the path(s) the routing protocols will choose for various traffic 751 demands, and to ascertain the utilization of network resources as 752 traffic is routed through the network. 
The routing analysis should 753 capture the selection of paths through the network, the assignment of 754 traffic across multiple feasible routes, and the multiplexing of IP 755 traffic over traffic trunks (if such constructs exist) and over the 756 underlying network infrastructure. A network topology model is a 757 necessity for routing analysis. A network topology model may be 758 extracted from network architecture documents, from network designs, 759 from information contained in router configuration files, from 760 routing databases, from routing tables, or from automated tools that 761 discover and depict network topology information. Topology 762 information may also be derived from servers that monitor network 763 state, and from servers that perform provisioning functions. 765 Routing in operational IP networks can be administratively controlled 766 at various levels of abstraction including the manipulation of BGP 767 attributes and manipulation of IGP metrics. For path-oriented 768 technologies such as MPLS, routing can be further controlled by the 769 manipulation of relevant traffic engineering parameters, resource 770 parameters, and administrative policy constraints. Within the 771 context of MPLS, the path of an explicit label switched path (LSP) 772 can be computed and established in various ways including: (1) 773 manually, (2) automatically online using constraint-based routing 774 processes implemented on label switching routers, and (3) 775 automatically offline using constraint-based routing entities 776 implemented on external traffic engineering support systems.
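The core of such a constraint-based path computation can be outlined with a minimal sketch: prune links that cannot satisfy the request's constraints (here, only a bandwidth constraint is considered), then run a shortest-path computation over the remaining topology. This is only an outline of the general approach; real implementations consider many more constraints, attributes, and tie-breaking rules, and the topology and numbers below are invented.

   # Minimal sketch of constraint-based path selection for an explicit
   # LSP: prune links lacking the requested bandwidth, then take the
   # shortest path by TE metric.  Topology and numbers are illustrative.

   import heapq

   # link: (available_bandwidth_mbps, te_metric)
   links = {
       ("R1", "R2"): (500, 10), ("R2", "R4"): (500, 10),
       ("R1", "R3"): (200, 5),  ("R3", "R4"): (200, 5),
   }

   def constrained_path(links, src, dst, bandwidth_mbps):
       # Keep only links that can carry the requested bandwidth.
       graph = {}
       for (a, b), (avail, metric) in links.items():
           if avail >= bandwidth_mbps:
               graph.setdefault(a, []).append((b, metric))

       # Dijkstra shortest path over the pruned graph.
       queue = [(0, src, [src])]
       visited = set()
       while queue:
           cost, node, path = heapq.heappop(queue)
           if node == dst:
               return cost, path
           if node in visited:
               continue
           visited.add(node)
           for nxt, metric in graph.get(node, []):
               if nxt not in visited:
                   heapq.heappush(queue,
                                  (cost + metric, nxt, path + [nxt]))
       return None  # no feasible path satisfies the constraint

   print(constrained_path(links, "R1", "R4", 300))  # forced via R1-R2-R4
   print(constrained_path(links, "R1", "R4", 100))  # cheaper R1-R3-R4 fits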
778 2.4.1. Combating the Congestion Problem 780 Minimizing congestion is a significant aspect of Internet traffic 781 engineering. This subsection gives an overview of the general 782 approaches that have been used or proposed to combat congestion 783 problems. 785 Congestion management policies can be categorized based upon the 786 following criteria (see e.g., [YARE95] for a more detailed taxonomy 787 of congestion control schemes): (1) Response time scale which can be 788 characterized as long, medium, or short; (2) reactive versus 789 preventive which relates to congestion control and congestion 790 avoidance; and (3) supply side versus demand side congestion 791 management schemes. These aspects are discussed in the following 792 paragraphs. 794 1. Congestion Management based on Response Time Scales 796 * Long (weeks to months): Capacity planning works over a 797 relatively long time scale to expand network capacity based on 798 estimates or forecasts of future traffic demand and traffic 799 distribution. Since router and link provisioning take time 800 and are generally expensive, these upgrades are typically 801 carried out in the weeks-to-months or even years time scale. 803 * Medium (minutes to days): Several control policies fall within 804 the medium time scale category. Examples include: (1) 805 Adjusting IGP and/or BGP parameters to route traffic away from or 806 towards certain segments of the network; (2) Setting up and/or 807 adjusting some explicitly routed label switched paths (ER- 808 LSPs) in MPLS networks to route some traffic trunks away from 809 possibly congested resources or towards possibly more 810 favorable routes; (3) re-configuring the logical topology of 811 the network to make it correlate more closely with the spatial 812 traffic distribution using for example some underlying path- 813 oriented technology such as MPLS LSPs, ATM PVCs, or optical 814 channel trails. Many of these adaptive medium time scale 815 response schemes rely on a measurement system that monitors 816 changes in traffic distribution, traffic shifts, and network 817 resource utilization and subsequently provides feedback to the 818 online and/or offline traffic engineering mechanisms and tools 819 which employ this feedback information to trigger certain 820 control actions to occur within the network. The traffic 821 engineering mechanisms and tools can be implemented in a 822 distributed fashion or in a centralized fashion, and may have 823 a hierarchical structure or a flat structure. The comparative 824 merits of distributed and centralized control structures for 825 networks are well known. A centralized scheme may have global 826 visibility into the network state and may produce potentially 827 better solutions. However, centralized schemes are 828 prone to single points of failure and may not scale as well as 829 distributed schemes. Moreover, the information utilized by a 830 centralized scheme may be stale and may not reflect the actual 831 state of the network. It is not an objective of this memo to 832 make a recommendation between distributed and centralized 833 schemes. This is a choice that network administrators must 834 make based on their specific needs. 836 * Short (picoseconds to minutes): This category includes packet 837 level processing functions and events on the order of several 838 round trip times. It includes router mechanisms such as 839 passive and active buffer management. These mechanisms are 840 used to control congestion and/or signal congestion to end 841 systems so that they can adaptively regulate the rate at which 842 traffic is injected into the network. One of the most popular 843 active queue management schemes, especially for TCP traffic, 844 is Random Early Detection (RED) [FLJA93], which supports 845 congestion avoidance by controlling the average queue size. 846 During congestion (but before the queue is filled), the RED 847 scheme chooses arriving packets to "mark" according to a 848 probabilistic algorithm which takes into account the average 849 queue size (a simplified sketch of this marking logic is given 850 at the end of this section). For a router that does not utilize explicit congestion notification (ECN) (see, e.g., [FLOY94]), the marked 851 packets can simply be dropped to signal the inception of 852 congestion to end systems. On the other hand, if the router 853 supports ECN, then it can set the ECN field in the packet 854 header. Several variations of RED have been proposed to 855 support different drop precedence levels in multi-class 856 environments [RFC2597], e.g., RED with In and Out (RIO) and 857 Weighted RED. There is general consensus that RED provides 858 congestion avoidance performance which is not worse than 859 traditional Tail-Drop (TD) queue management (drop arriving 860 packets only when the queue is full). Importantly, 861 RED reduces the possibility of global synchronization and 862 improves fairness among different TCP sessions. However, RED 863 by itself cannot prevent congestion and unfairness caused by 864 sources unresponsive to RED, e.g., UDP traffic and some 865 misbehaved greedy connections. Other schemes have been 866 proposed to improve the performance and fairness in the 867 presence of unresponsive traffic. Some of these schemes were 868 proposed as theoretical frameworks and are typically not 869 available in existing commercial products. Two such schemes 870 are Longest Queue Drop (LQD) and Dynamic Soft Partitioning 871 with Random Drop (RND) [SLDC98].
873 2. Congestion Management: Reactive versus Preventive Schemes 875 * Reactive: reactive (recovery) congestion management policies 876 react to existing congestion problems in order to alleviate them. All the 877 policies described in the long and medium time scales above 878 can be categorized as being reactive especially if the 879 policies are based on monitoring and identifying existing 880 congestion problems, and on the initiation of relevant actions 881 to ease the situation. 883 * Preventive: preventive (predictive/avoidance) policies take 884 proactive action to prevent congestion based on estimates and 885 predictions of future potential congestion problems. Some of 886 the policies described in the long and medium time scales fall 887 into this category. They do not necessarily respond 888 immediately to existing congestion problems. Instead, 889 forecasts of traffic demand and workload distribution are 890 considered and action may be taken to prevent potential 891 congestion problems in the future. The schemes described in 892 the short time scale (e.g., RED and its variations, ECN, LQD, 893 and RND) are also used for congestion avoidance since dropping 894 or marking packets before queues actually overflow would 895 trigger corresponding TCP sources to slow down. 897 3. Congestion Management: Supply Side versus Demand Side Schemes 899 * Supply side: supply side congestion management policies 900 increase the effective capacity available to traffic in order 901 to control or obviate congestion. This can be accomplished by 902 augmenting capacity. Another way to accomplish this is to 903 minimize congestion by having a relatively balanced 904 distribution of traffic over the network. For example, 905 capacity planning should aim to provide a physical topology 906 and associated link bandwidths that match estimated traffic 907 workload and traffic distribution based on forecasting 908 (subject to budgetary and other constraints). However, if 909 actual traffic distribution does not match the topology 910 derived from capacity planning (due to forecasting errors or 911 facility constraints for example), then the traffic can be 912 mapped onto the existing topology using routing control 913 mechanisms, using path-oriented technologies (e.g., MPLS LSPs 914 and optical channel trails) to modify the logical topology, or 915 by using some other load redistribution mechanisms. 917 * Demand side: demand side congestion management policies 918 control or regulate the offered traffic to alleviate 919 congestion problems. For example, some of the short time 920 scale mechanisms described earlier (such as RED and its 921 variations, ECN, LQD, and RND) as well as policing and rate 922 shaping mechanisms attempt to regulate the offered load in 923 various ways. Tariffs may also be applied as a demand side 924 instrument. To date, however, tariffs have not been used as a 925 means of demand side congestion management within the 926 Internet. 928 In summary, a variety of mechanisms can be used to address congestion 929 problems in IP networks. These mechanisms may operate at multiple 930 time-scales.
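The RED marking behavior referenced under the short time scale above can be outlined in a brief sketch: an exponentially weighted moving average of the queue size is maintained, and between a minimum and a maximum threshold arriving packets are marked (or dropped) with a probability that grows with that average. The parameter values are illustrative only, and refinements described in [FLJA93], such as scaling the probability by the count of packets since the last mark, are omitted.

   # Simplified sketch of RED marking (after [FLJA93]); parameters are
   # examples only, and several refinements of the full algorithm are
   # omitted.

   import random

   class RedQueue:
       def __init__(self, min_th=5, max_th=15, max_p=0.1, weight=0.002):
           self.min_th, self.max_th = min_th, max_th  # thresholds (pkts)
           self.max_p = max_p                         # max marking prob.
           self.weight = weight                       # EWMA weight
           self.avg = 0.0                             # average queue size

       def on_arrival(self, current_queue_len):
           # Return True if the arriving packet should be marked/dropped.
           # Exponentially weighted moving average of the queue size.
           self.avg = ((1 - self.weight) * self.avg
                       + self.weight * current_queue_len)
           if self.avg < self.min_th:
               return False       # no congestion indication
           if self.avg >= self.max_th:
               return True        # mark/drop every arrival
           # Probability rises linearly between the two thresholds.
           p = (self.max_p * (self.avg - self.min_th)
                / (self.max_th - self.min_th))
           return random.random() < p

   # Example: feed the marker a queue that ramps up and then stays full.
   red = RedQueue()
   marked = sum(red.on_arrival(min(t // 50, 25)) for t in range(5000))
   print("marked", marked, "of 5000 arrivals")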
932 2.5. Implementation and Operational Context 934 The operational context of Internet traffic engineering is 935 characterized by constant change which occurs at multiple levels of 936 abstraction. The implementation context demands effective planning, 937 organization, and execution. The planning aspects may involve 938 determining prior sets of actions to achieve desired objectives. 939 Organizing involves arranging and assigning responsibility to the 940 various components of the traffic engineering system and coordinating 941 the activities to accomplish the desired TE objectives. Execution 942 involves measuring and applying corrective or perfective actions to 943 attain and maintain desired TE goals. 945 2.6. High-Level Objectives 947 The high-level objectives for Internet traffic engineering include: 948 usability, automation, scalability, stability, visibility, 949 simplicity, efficiency, reliability, correctness, maintainability, 950 extensibility, flexibility, interoperability, and security. In a given context, 951 some of these objectives may be critical while others may be 952 optional. Therefore, prioritization may be required during the 953 development phase of a traffic engineering system (or components 954 thereof) to tailor it to a specific operational context. 956 In the following paragraphs, some of the aspects of the high-level 957 objectives for Internet traffic engineering are summarized. 959 Usability: Usability is a human factor aspect of traffic engineering 960 systems. Usability refers to the ease with which a traffic 961 engineering system can be deployed and operated. In general, it is 962 desirable to have a TE system that can be readily deployed in an 963 existing network. It is also desirable to have a TE system that is 964 easy to operate and maintain. 966 Automation: Whenever feasible, a traffic engineering system should 967 automate as many traffic engineering functions as possible to 968 minimize the amount of human effort needed to control and analyze 969 operational networks. Automation is particularly imperative in large 970 scale public networks because of the high cost of the human aspects 971 of network operations and the high risk of network problems caused by 972 human errors. Automation may entail the incorporation of automatic 973 feedback and intelligence into some components of the traffic 974 engineering system. 976 Scalability: Contemporary public networks are growing very fast with 977 respect to network size and traffic volume. Therefore, a TE system 978 should be scalable to remain applicable as the network evolves. In 979 particular, a TE system should remain functional as the network 980 expands with regard to the number of routers and links, and with 981 respect to the traffic volume. A TE system should have a scalable 982 architecture, should not adversely impair other functions and 983 processes in a network element, and should not consume excessive 984 network resources when collecting and distributing state information 985 or when exerting control. 987 Stability: Stability is a very important consideration in traffic 988 engineering systems that respond to changes in the state of the 989 network. State-dependent traffic engineering methodologies typically 990 mandate a tradeoff between responsiveness and stability. It is 991 strongly recommended that when tradeoffs are warranted between 992 responsiveness and stability, the tradeoff should be made in 993 favor of stability (especially in public IP backbone networks). 995 Flexibility: A TE system should be flexible to allow for changes in 996 optimization policy. In particular, a TE system should provide 997 sufficient configuration options so that a network administrator can 998 tailor the TE system to a particular environment. It may also be 999 desirable to have both online and offline TE subsystems which can be 1000 independently enabled and disabled.
TE systems that are used in 1001 multi-class networks should also have options to support class-based 1002 performance evaluation and optimization. 1004 Visibility: As part of the TE system, mechanisms should exist to 1005 collect statistics from the network and to analyze these statistics 1006 to determine how well the network is functioning. Derived statistics 1007 such as traffic matrices, link utilization, latency, packet loss, and 1008 other performance measures of interest which are determined from 1009 network measurements can be used as indicators of prevailing network 1010 conditions. Other examples of status information which should be 1011 observed include existing functional routing information 1012 (additionally, in the context of MPLS, existing LSP routes), etc. 1014 Simplicity: Generally, a TE system should be as simple as possible. 1015 More importantly, the TE system should be relatively easy to use 1016 (i.e., clean, convenient, and intuitive user interfaces). Simplicity 1017 in user interface does not necessarily imply that the TE system will 1018 use naive algorithms. When complex algorithms and internal 1019 structures are used, such complexities should be hidden as much as 1020 possible from the network administrator through the user interface. 1022 Interoperability: Whenever feasible, traffic engineering systems and 1023 their components should be developed with open, standards-based 1024 interfaces to allow interoperation with other systems and components. 1026 Security: Security is a critical consideration in traffic engineering 1027 systems. Traffic engineering systems typically exert control 1028 over certain functional aspects of the network to achieve the desired 1029 performance objectives. Therefore, adequate measures must be taken 1030 to safeguard the integrity of the traffic engineering system. 1031 Adequate measures must also be taken to protect the network from 1032 vulnerabilities that originate from security breaches and other 1033 impairments within the traffic engineering system. 1035 The remainder of this document expands on some of these high-level 1036 functional recommendations for traffic engineering. 1038 3. Traffic Engineering Process Models 1040 This section describes a generic process model that captures the high 1041 level practical aspects of Internet traffic engineering in an 1042 operational context. The process model is described as a sequence of 1043 actions that a traffic engineer, or more generally a traffic 1044 engineering system, must perform to optimize the performance of an 1045 operational network (see also [RFC2702], [AWD2]). The process model 1046 described here represents the broad activities common to most traffic 1047 engineering methodologies although the details regarding how traffic 1048 engineering is executed may differ from network to network. This 1049 process model may be enacted explicitly or implicitly, by an 1050 automaton and/or by a human. 1052 The traffic engineering process model is iterative [AWD2]. The four 1053 phases of the process model described below are repeated continually. 1055 The first phase of the TE process model is to define the relevant 1056 control policies that govern the operation of the network. These 1057 policies may depend upon many factors including the prevailing 1058 business model, the network cost structure, the operating 1059 constraints, the utility model, and optimization criteria.
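Purely as an illustration of how such control policies might be made explicit for an automated TE system, the following sketch captures an optimization criterion, operating constraints, and cost inputs in a simple structure. The field names and values are hypothetical and are not drawn from any standard data model.

   # Hypothetical, illustrative representation of TE control policy
   # inputs.  The field names and values are invented for this example
   # and do not come from any standard data model.

   te_policy = {
       "optimization_criteria": {
           "primary": "minimize-maximum-link-utilization",
           "secondary": "minimize-total-igp-cost",
       },
       "operating_constraints": {
           "max_link_utilization": 0.80,    # headroom kept for failures
           "protected_classes": ["voice"],  # classes needing protection
       },
       "cost_model": {
           "transit_cost_per_mbps": 0.40,
           "capacity_upgrade_lead_time_days": 60,
       },
   }

   def violates_policy(link_utilization, policy=te_policy):
       # Example check: does a measured utilization breach the policy?
       limit = policy["operating_constraints"]["max_link_utilization"]
       return link_utilization > limit

   print(violates_policy(0.85))  # True: corrective action is indicated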
The second phase of the process model is a feedback mechanism involving the acquisition of measurement data from the operational network. If empirical data is not readily available from the network, then synthetic workloads that reflect either the prevailing or the expected workload of the network may be used instead. Synthetic workloads may be derived by estimation or extrapolation using prior empirical data, or by using mathematical models of traffic characteristics or other means.

The third phase of the process model is to analyze the network state and to characterize the traffic workload. Performance analysis may be proactive and/or reactive. Proactive performance analysis identifies potential problems that do not yet exist but could manifest in the future. Reactive performance analysis identifies existing problems, determines their cause through diagnosis, and evaluates alternative approaches to remedy the problem, if necessary. A number of quantitative and qualitative techniques may be used in the analysis process, including modeling-based analysis and simulation. The analysis phase of the process model may involve investigating the concentration and distribution of traffic across the network or relevant subsets of the network, identifying the characteristics of the offered traffic workload, identifying existing or potential bottlenecks, and identifying network pathologies such as ineffective link placement, single points of failure, etc. Network pathologies may result from many factors, including inferior network architecture, inferior network design, and configuration problems. A traffic matrix may be constructed as part of the analysis process. Network analysis may also be descriptive or prescriptive.

The fourth phase of the TE process model is the performance optimization of the network. The performance optimization phase involves a decision process which selects and implements a set of actions from a set of alternatives. Optimization actions may include the use of appropriate techniques to either control the offered traffic or to control the distribution of traffic across the network. Optimization actions may also involve adding links or increasing link capacity, deploying additional hardware such as routers and switches, systematically adjusting parameters associated with routing such as IGP metrics and BGP attributes, and adjusting traffic management parameters. Network performance optimization may also involve starting a network planning process to improve the network architecture, network design, network capacity, network technology, and the configuration of network elements to accommodate current and future growth.

3.1. Components of the Traffic Engineering Process Model

The key components of the traffic engineering process model include a measurement subsystem, a modeling and analysis subsystem, and an optimization subsystem. The following subsections examine these components as they apply to the traffic engineering process model.

3.2. Measurement

Measurement is crucial to the traffic engineering function. The operational state of a network can be conclusively determined only through measurement.
Measurement is also critical to the 1118 optimization function because it provides feedback data which is used 1119 by traffic engineering control subsystems. This data is used to 1120 adaptively optimize network performance in response to events and 1121 stimuli originating within and outside the network. Measurement is 1122 also needed to determine the quality of network services and to 1123 evaluate the effectiveness of traffic engineering policies. 1124 Experience suggests that measurement is most effective when acquired 1125 and applied systematically. 1127 When developing a measurement system to support the traffic 1128 engineering function in IP networks, the following questions should 1129 be carefully considered: Why is measurement needed in this particular 1130 context? What parameters are to be measured? How should the 1131 measurement be accomplished? Where should the measurement be 1132 performed? When should the measurement be performed? How frequently 1133 should the monitored variables be measured? What level of 1134 measurement accuracy and reliability is desirable? What level of 1135 measurement accuracy and reliability is realistically attainable? To 1136 what extent can the measurement system permissibly interfere with the 1137 monitored network components and variables? What is the acceptable 1138 cost of measurement? The answers to these questions will determine 1139 the measurement tools and methodologies appropriate in any given 1140 traffic engineering context. 1142 It should also be noted that there is a distinction between 1143 measurement and evaluation. Measurement provides raw data concerning 1144 state parameters and variables of monitored network elements. 1145 Evaluation utilizes the raw data to make inferences regarding the 1146 monitored system. 1148 Measurement in support of the TE function can occur at different 1149 levels of abstraction. For example, measurement can be used to 1150 derive packet level characteristics, flow level characteristics, user 1151 or customer level characteristics, traffic aggregate characteristics, 1152 component level characteristics, and network wide characteristics. 1154 3.3. Modeling, Analysis, and Simulation 1156 Modeling and analysis are important aspects of Internet traffic 1157 engineering. Modeling involves constructing an abstract or physical 1158 representation which depicts relevant traffic characteristics and 1159 network attributes. 1161 A network model is an abstract representation of the network which 1162 captures relevant network features, attributes, and characteristics, 1163 such as link and nodal attributes and constraints. A network model 1164 may facilitate analysis and/or simulation which can be used to 1165 predict network performance under various conditions as well as to 1166 guide network expansion plans. 1168 In general, Internet traffic engineering models can be classified as 1169 either structural or behavioral. Structural models focus on the 1170 organization of the network and its components. Behavioral models 1171 focus on the dynamics of the network and the traffic workload. 1172 Modeling for Internet traffic engineering may also be formal or 1173 informal. 1175 Accurate behavioral models for traffic sources are particularly 1176 useful for analysis. Development of behavioral traffic source models 1177 that are consistent with empirical data obtained from operational 1178 networks is a major research topic in Internet traffic engineering. 
1179 These source models should also be tractable and amenable to 1180 analysis. The topic of source models for IP traffic is a research 1181 topic and is therefore outside the scope of this document. Its 1182 importance, however, must be emphasized. 1184 Network simulation tools are extremely useful for traffic 1185 engineering. Because of the complexity of realistic quantitative 1186 analysis of network behavior, certain aspects of network performance 1187 studies can only be conducted effectively using simulation. A good 1188 network simulator can be used to mimic and visualize network 1189 characteristics under various conditions in a safe and non-disruptive 1190 manner. For example, a network simulator may be used to depict 1191 congested resources and hot spots, and to provide hints regarding 1192 possible solutions to network performance problems. A good simulator 1193 may also be used to validate the effectiveness of planned solutions 1194 to network issues without the need to tamper with the operational 1195 network, or to commence an expensive network upgrade which may not 1196 achieve the desired objectives. Furthermore, during the process of 1197 network planning, a network simulator may reveal pathologies such as 1198 single points of failure which may require additional redundancy, and 1199 potential bottlenecks and hot spots which may require additional 1200 capacity. 1202 Routing simulators are especially useful in large networks. A 1203 routing simulator may identify planned links which may not actually 1204 be used to route traffic by the existing routing protocols. 1205 Simulators can also be used to conduct scenario based and 1206 perturbation based analysis, as well as sensitivity studies. 1207 Simulation results can be used to initiate appropriate actions in 1208 various ways. For example, an important application of network 1209 simulation tools is to investigate and identify how best to make the 1210 network evolve and grow, in order to accommodate projected future 1211 demands. 1213 3.4. Optimization 1215 Network performance optimization involves resolving network issues by 1216 transforming such issues into concepts that enable a solution, 1217 identification of a solution, and implementation of the solution. 1218 Network performance optimization can be corrective or perfective. In 1219 corrective optimization, the goal is to remedy a problem that has 1220 occurred or that is incipient. In perfective optimization, the goal 1221 is to improve network performance even when explicit problems do not 1222 exist and are not anticipated. 1224 Network performance optimization is a continual process, as noted 1225 previously. Performance optimization iterations may consist of real- 1226 time optimization sub-processes and non-real-time network planning 1227 sub-processes. The difference between real-time optimization and 1228 network planning is primarily in the relative time- scale in which 1229 they operate and in the granularity of actions. One of the 1230 objectives of a real-time optimization sub-process is to control the 1231 mapping and distribution of traffic over the existing network 1232 infrastructure to avoid and/or relieve congestion, to assure 1233 satisfactory service delivery, and to optimize resource utilization. 1234 Real-time optimization is needed because random incidents such as 1235 fiber cuts or shifts in traffic demand will occur irrespective of how 1236 well a network is designed. 
These incidents can cause congestion and other problems to manifest in an operational network. Real-time optimization must solve such problems on small to medium time scales ranging from microseconds to minutes or hours. Examples of real-time optimization include queue management, IGP/BGP metric tuning, and using technologies such as MPLS explicit LSPs to change the paths of some traffic trunks [XIAO].

One of the functions of the network planning sub-process is to initiate actions to systematically evolve the architecture, technology, topology, and capacity of a network. When a problem exists in the network, real-time optimization should provide an immediate remedy. Because a prompt response is necessary, the real-time solution may not be the best possible solution. Network planning may subsequently be needed to refine the solution and improve the situation. Network planning is also required to expand the network to support traffic growth and changes in traffic distribution over time. As previously noted, a change in the topology and/or capacity of the network may be the outcome of network planning.

Clearly, network planning and real-time performance optimization are mutually complementary activities. A well-planned and designed network makes real-time optimization easier, while a systematic approach to real-time network performance optimization allows network planning to focus on long-term issues rather than tactical considerations. Systematic real-time network performance optimization also provides valuable inputs and insights toward network planning.

Stability is an important consideration in real-time network performance optimization. This aspect is addressed repeatedly throughout this memo.

4. Review of TE Techniques

This section briefly reviews different traffic engineering approaches proposed and implemented in telecommunications and computer networks. The discussion is not intended to be comprehensive. It is primarily intended to illuminate pre-existing perspectives and prior art concerning traffic engineering in the Internet and in legacy telecommunications networks.

4.1. Historic Overview

4.1.1. Traffic Engineering in Classical Telephone Networks

This subsection presents a brief overview of traffic engineering in telephone networks, which often relates to the way user traffic is steered from an originating node to the terminating node. A detailed description of the various routing strategies applied in telephone networks can be found in the book by G. Ash [ASH2].

The early telephone network relied on static hierarchical routing, whereby routing patterns remained fixed independent of the state of the network or time of day. The hierarchy was intended to accommodate overflow traffic, improve network reliability via alternate routes, and prevent call looping by employing strict hierarchical rules. The network was typically over-provisioned since a given fixed route had to be dimensioned so that it could carry user traffic during the busy hour of any busy day. Hierarchical routing in the telephone network was found to be too rigid with the advent of digital switches and stored program control, which were able to manage more complicated traffic engineering rules.
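To give a concrete flavor of the busy-hour dimensioning referred to above, the short Python sketch below uses the classical Erlang B formula to find how many circuits a fixed route would need so that the busy-hour blocking probability stays below a target. The offered load (100 Erlangs) and the 1% blocking target are hypothetical values chosen purely for illustration; this is an editorial sketch, not part of any routing specification.

   def erlang_b(offered_load, circuits):
       # Erlang B blocking probability, using the standard recurrence
       # B(0) = 1, B(k) = A*B(k-1) / (k + A*B(k-1)) for offered load A.
       b = 1.0
       for k in range(1, circuits + 1):
           b = (offered_load * b) / (k + offered_load * b)
       return b

   # Hypothetical busy-hour load of 100 Erlangs and a 1% blocking target.
   load, target = 100.0, 0.01
   circuits = 1
   while erlang_b(load, circuits) > target:
       circuits += 1
   print(circuits, erlang_b(load, circuits))   # around 117 circuits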
Dynamic routing was introduced to alleviate the routing inflexibility of static hierarchical routing so that the network would operate more efficiently. This resulted in significant economic gains [HUSS87]. Dynamic routing typically reduces the overall loss probability by 10 to 20 percent compared to static hierarchical routing. Dynamic routing can also improve network resilience by recalculating routes on a per-call basis and by periodically updating routes.

There are three main types of dynamic routing in the telephone network: time-dependent routing, state-dependent routing (SDR), and event-dependent routing (EDR).

In time-dependent routing, regular variations in traffic loads (such as time of day or day of week) are exploited in pre-planned routing tables. In state-dependent routing, routing tables are updated online according to the current state of the network (e.g., traffic demand, utilization, etc.). In event-dependent routing, routing changes are triggered by events (such as call setups encountering congested or blocked links), whereupon new paths are searched out using learning models. EDR methods are real-time adaptive, but they do not require global state information as SDR does. Examples of EDR schemes include dynamic alternate routing (DAR) from BT, state-and-time dependent routing (STR) from NTT, and success-to-the-top (STT) routing from AT&T.

Dynamic non-hierarchical routing (DNHR) is an example of dynamic routing that was introduced in the AT&T toll network in the 1980s to respond to time-dependent information such as regular load variations as a function of time. Time-dependent information in terms of load may be divided into three time scales: hourly, weekly, and yearly. Correspondingly, three algorithms are defined to pre-plan the routing tables. The network design algorithm operates over a year-long interval, while the demand servicing algorithm operates on a weekly basis to fine-tune link sizes and routing tables to correct errors in the yearly forecast. At the smallest time scale, the routing algorithm is used to make limited adjustments based on daily traffic variations. Network design and demand servicing are computed using offline calculations. Typically, the calculations require extensive searches over possible routes. On the other hand, routing may need online calculations to handle crankback. DNHR adopts a "two-link" approach whereby a path can consist of at most two links. The routing algorithm presents an ordered list of route choices between an originating switch and a terminating switch. If a call overflows, a via switch (a tandem exchange between the originating switch and the terminating switch) sends a crankback signal to the originating switch. This switch then selects the next route, and so on, until there are no alternative routes available, in which case the call is blocked.

4.1.2. Evolution of Traffic Engineering in Packet Networks

This subsection reviews related prior work that was intended to improve the performance of data networks.
Indeed, optimization of the performance of data networks started in the early days of the ARPANET. Other early commercial networks, such as SNA, also recognized the importance of performance optimization and service differentiation.

In terms of traffic management, the Internet has been a best-effort service environment until recently. In particular, very limited traffic management capabilities existed in IP networks to provide differentiated queue management and scheduling services to packets belonging to different classes.

In terms of routing control, the Internet has employed distributed protocols for intra-domain routing. These protocols are highly scalable and resilient. However, they are based on simple algorithms for path selection which offer very limited functionality for flexible control of the path selection process.

In the following subsections, the evolution of practical traffic engineering mechanisms in IP networks and their predecessors is reviewed.

4.1.2.1. Adaptive Routing in the ARPANET

The early ARPANET recognized the importance of adaptive routing, where routing decisions were based on the current state of the network [MCQ80]. Early minimum-delay routing approaches forwarded each packet to its destination along a path for which the total estimated transit time was the smallest. Each node maintained a table of network delays, representing the estimated delay that a packet would experience along a given path toward its destination. The minimum delay table was periodically transmitted by a node to its neighbors. The shortest path, in terms of hop count, was also propagated to give the connectivity information.

One drawback to this approach is that dynamic link metrics tend to create "traffic magnets", causing congestion to be shifted from one location of a network to another, resulting in oscillation and network instability.

4.1.2.2. Dynamic Routing in the Internet

The Internet evolved from the ARPANET and adopted dynamic routing algorithms with distributed control to determine the paths that packets should take en route to their destinations. The routing algorithms are adaptations of shortest path algorithms where costs are based on link metrics. A link metric can be based on static or dynamic quantities. A link metric based on static quantities may be assigned administratively according to local criteria. A link metric based on dynamic quantities may be a function of a network congestion measure such as delay or packet loss.

It was apparent early on that static link metric assignment was inadequate because it can easily lead to unfavorable scenarios in which some links become congested while others remain lightly loaded. One of the many reasons for the inadequacy of static link metrics is that link metric assignment was often done without considering the traffic matrix of the network. Also, the routing protocols did not take traffic attributes and capacity constraints into account when making routing decisions. This resulted in traffic concentration being localized in subsets of the network infrastructure, potentially causing congestion.
Even if link metrics are assigned in accordance with the traffic matrix, unbalanced loads in the network can still occur due to a number of factors, including:

o  Resources not being deployed in the most optimal locations from a routing perspective.

o  Forecasting errors in traffic volume and/or traffic distribution.

o  Dynamics in the traffic matrix due to the temporal nature of traffic patterns, BGP policy changes from peers, etc.

The inadequacy of the legacy Internet interior gateway routing system is one of the factors motivating the interest in path-oriented technology with explicit routing and constraint-based routing capability, such as MPLS.

4.1.2.3. ToS Routing

Type-of-Service (ToS) routing involves different routes going to the same destination with selection dependent upon the ToS field of an IP packet [RFC2474]. The ToS classes may include, for example, low delay and high throughput. Each link is associated with multiple link costs, and each link cost is used to compute routes for a particular ToS. A separate shortest path tree is computed for each ToS, so the shortest path algorithm must be run once per ToS, resulting in very expensive computation. Classical ToS-based routing is now outdated as the ToS field of the IP header has been replaced by the Diffserv field. Effective traffic engineering is difficult to perform in classical ToS-based routing because each class still relies exclusively on shortest path routing, which results in localization of traffic concentration within the network.

4.1.2.4. Equal Cost Multi-Path

Equal Cost Multi-Path (ECMP) is another technique that attempts to address a deficiency in Shortest Path First (SPF) interior gateway routing systems [RFC2328]. In the classical SPF algorithm, if two or more shortest paths exist to a given destination, the algorithm will choose one of them. The algorithm is modified slightly in ECMP so that, if two or more equal-cost shortest paths exist between two nodes, the traffic between the nodes is distributed among the multiple equal-cost paths. Traffic distribution across the equal-cost paths is usually performed in one of two ways: (1) packet-based in a round-robin fashion, or (2) flow-based using hashing on source and destination IP addresses and possibly other fields of the IP header. The first approach can easily cause out-of-order packets, while the second approach is dependent upon the number and distribution of flows. Flow-based load sharing may be unpredictable in an enterprise network where the number of flows is relatively small and less heterogeneous (for example, hashing may not be uniform), but it is generally effective in core public networks where the number of flows is large and heterogeneous.

In ECMP, link costs are static and bandwidth constraints are not considered, so ECMP attempts to distribute the traffic as equally as possible among the equal-cost paths independent of the congestion status of each path. As a result, given two equal-cost paths, it is possible that one of the paths will be more congested than the other. Another drawback of ECMP is that load sharing cannot be achieved on multiple paths which have non-identical costs.
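The flow-based load-sharing approach described above can be illustrated with the following Python sketch, which hashes the invariant fields of a flow to select one of the equal-cost next hops so that all packets of the same flow follow the same path. The choice of hash function and header fields is implementation-specific; this fragment is a simplified illustration and does not describe any particular router implementation.

   import hashlib

   def ecmp_next_hop(src_ip, dst_ip, src_port, dst_port, protocol, next_hops):
       # Hash the fields that are constant for a flow so that every packet
       # of the flow maps to the same equal-cost next hop.
       flow_key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{protocol}".encode()
       digest = hashlib.sha256(flow_key).digest()
       return next_hops[int.from_bytes(digest[:4], "big") % len(next_hops)]

   equal_cost_next_hops = ["next-hop-1", "next-hop-2", "next-hop-3"]
   # All packets of this flow are forwarded via the same next hop, while a
   # different flow may hash to a different equal-cost next hop.
   print(ecmp_next_hop("192.0.2.1", "198.51.100.7", 49152, 80, 6,
                       equal_cost_next_hops))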
4.1.2.5. Nimrod

Nimrod was a routing system developed to provide heterogeneous, service-specific routing in the Internet, while taking multiple constraints into account [RFC1992]. Essentially, Nimrod was a link-state routing protocol to support path-oriented packet forwarding. It used the concept of maps to represent network connectivity and services at multiple levels of abstraction. Mechanisms allowed restriction of the distribution of routing information.

Even though Nimrod did not enjoy deployment in the public Internet, a number of key concepts incorporated into the Nimrod architecture, such as explicit routing which allows selection of paths at originating nodes, are beginning to find applications in some recent constraint-based routing initiatives.

4.2. Development of Internet Traffic Engineering

4.2.1. Overlay Model

In the overlay model, a virtual-circuit network, such as ATM, frame relay, SONET/SDH, OTN, or WDM, provides virtual-circuit connectivity between routers that are located at the edges of the virtual-circuit cloud. In this model, two routers that are connected through a virtual circuit see a direct adjacency between themselves, independent of the physical route taken by the virtual circuit through the ATM, frame relay, or WDM network. Thus, the overlay model essentially decouples the logical topology that routers see from the physical topology that the ATM, frame relay, or WDM network manages. The overlay model enables a network administrator or an automaton to employ traffic engineering concepts to perform path optimization by re-configuring or rearranging the virtual circuits so that a virtual circuit on a congested or sub-optimal physical link can be re-routed to a less congested or more optimal one. In the overlay model, traffic engineering is also employed to establish relationships between the traffic management parameters of the virtual-circuit technology (e.g., PCR, SCR, and MBS for ATM) and the actual traffic that traverses each circuit. These relationships can be established based upon known or projected traffic profiles and other factors.

4.2.2. Constraint-Based Routing

Constraint-based routing refers to a class of routing systems that compute routes through a network subject to the satisfaction of a set of constraints and requirements. In the most general setting, constraint-based routing may also seek to optimize overall network performance while minimizing costs.

The constraints and requirements may be imposed by the network itself or by administrative policies. Constraints may include bandwidth, hop count, delay, and policy instruments such as resource class attributes. Constraints may also include domain-specific attributes of certain network technologies and contexts which impose restrictions on the solution space of the routing function. Path-oriented technologies such as MPLS have made constraint-based routing feasible and attractive in public IP networks.

The concept of constraint-based routing within the context of MPLS traffic engineering requirements in IP networks was first described in [RFC2702] and led to developments such as MPLS-TE [RFC3209] as described in Section 4.3.4.
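A minimal sketch of constraint-based path computation is shown below: links that cannot satisfy a bandwidth constraint are pruned, and a standard shortest-path computation is run over what remains. The topology, metrics, and bandwidth figures are hypothetical, and real constraint-based routing implementations (for example, CSPF as used with MPLS-TE) support many more constraints and tie-breaking rules than this fragment suggests.

   import heapq

   def constrained_shortest_path(topology, src, dst, required_bw):
       # Dijkstra's algorithm over the topology, ignoring (pruning) any link
       # whose available bandwidth is below the requested constraint.
       queue = [(0, src, [src])]
       visited = set()
       while queue:
           cost, node, path = heapq.heappop(queue)
           if node == dst:
               return cost, path
           if node in visited:
               continue
           visited.add(node)
           for neighbor, link_cost, avail_bw in topology.get(node, []):
               if avail_bw >= required_bw and neighbor not in visited:
                   heapq.heappush(queue,
                                  (cost + link_cost, neighbor, path + [neighbor]))
       return None   # no path satisfies the constraint

   # Hypothetical topology: node -> [(neighbor, IGP metric, available bandwidth)].
   topology = {
       "A": [("B", 10, 100), ("C", 5, 40)],
       "B": [("D", 10, 100)],
       "C": [("D", 5, 40)],
       "D": [],
   }
   print(constrained_shortest_path(topology, "A", "D", 80))   # (20, ['A', 'B', 'D'])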
1541 Unlike QoS routing (for example, see [RFC2386] and [MA]) which 1542 generally addresses the issue of routing individual traffic flows to 1543 satisfy prescribed flow based QoS requirements subject to network 1544 resource availability, constraint-based routing is applicable to 1545 traffic aggregates as well as flows and may be subject to a wide 1546 variety of constraints which may include policy restrictions. 1548 4.3. Overview of IETF Projects Related to Traffic Engineering 1550 This subsection reviews a number of IETF activities pertinent to 1551 Internet traffic engineering. These activities are primarily 1552 intended to evolve the IP architecture to support new service 1553 definitions which allow preferential or differentiated treatment to 1554 be accorded to certain types of traffic. 1556 4.3.1. Integrated Services 1558 The IETF Integrated Services working group developed the integrated 1559 services (Intserv) model. This model requires resources, such as 1560 bandwidth and buffers, to be reserved a priori for a given traffic 1561 flow to ensure that the quality of service requested by the traffic 1562 flow is satisfied. The integrated services model includes additional 1563 components beyond those used in the best-effort model such as packet 1564 classifiers, packet schedulers, and admission control. A packet 1565 classifier is used to identify flows that are to receive a certain 1566 level of service. A packet scheduler handles the scheduling of 1567 service to different packet flows to ensure that QoS commitments are 1568 met. Admission control is used to determine whether a router has the 1569 necessary resources to accept a new flow. 1571 The main issue with the Integrated Services model has been 1572 scalability [RFC2998], especially in large public IP networks which 1573 may potentially have millions of active micro-flows in transit 1574 concurrently. 1576 A notable feature of the Integrated Services model is that it 1577 requires explicit signaling of QoS requirements from end systems to 1578 routers [RFC2753]. The Resource Reservation Protocol (RSVP) performs 1579 this signaling function and is a critical component of the Integrated 1580 Services model. RSVP is described next. 1582 4.3.2. RSVP 1584 RSVP is a soft state signaling protocol [RFC2205]. It supports 1585 receiver initiated establishment of resource reservations for both 1586 multicast and unicast flows. RSVP was originally developed as a 1587 signaling protocol within the integrated services framework for 1588 applications to communicate QoS requirements to the network and for 1589 the network to reserve relevant resources to satisfy the QoS 1590 requirements [RFC2205]. 1592 Under RSVP, the sender or source node sends a PATH message to the 1593 receiver with the same source and destination addresses as the 1594 traffic which the sender will generate. The PATH message contains: 1595 (1) a sender Tspec specifying the characteristics of the traffic, (2) 1596 a sender Template specifying the format of the traffic, and (3) an 1597 optional Adspec which is used to support the concept of one pass with 1598 advertising (OPWA) [RFC2205]. Every intermediate router along the 1599 path forwards the PATH Message to the next hop determined by the 1600 routing protocol. Upon receiving a PATH Message, the receiver 1601 responds with a RESV message which includes a flow descriptor used to 1602 request resource reservations. 
The RESV message travels to the sender or source node in the opposite direction along the path that the PATH message traversed. Every intermediate router along the path can reject or accept the reservation request of the RESV message. If the request is rejected, the rejecting router will send an error message to the receiver and the signaling process will terminate. If the request is accepted, link bandwidth and buffer space are allocated for the flow, and the related flow state information is installed in the router.

One of the issues with the original RSVP specification was scalability. This is because reservations were required for micro-flows, so that the amount of state maintained by network elements tends to increase linearly with the number of micro-flows. These issues are described in [RFC2961].

RSVP has since been modified and extended in several ways to mitigate the scaling problems. As a result, it is becoming a versatile signaling protocol for the Internet. For example, RSVP has been extended to reserve resources for aggregations of flows, to set up MPLS explicit label switched paths, and to perform other signaling functions within the Internet. There are also a number of proposals to reduce the number of refresh messages required to maintain established RSVP sessions [RFC2961].

A number of IETF working groups have been engaged in activities related to the RSVP protocol. These include the original RSVP working group, the MPLS working group, the Resource Allocation Protocol working group, and the Policy Framework working group.

4.3.3. Differentiated Services

The goal of the Differentiated Services (Diffserv) effort within the IETF is to devise scalable mechanisms for categorization of traffic into behavior aggregates, which ultimately allows each behavior aggregate to be treated differently, especially when there is a shortage of resources such as link bandwidth and buffer space [RFC2475]. One of the primary motivations for the Diffserv effort was to devise alternative mechanisms for service differentiation in the Internet that mitigate the scalability issues encountered with the Intserv model.

The IETF Diffserv working group has defined a Differentiated Services field in the IP header (the DS field). The DS field consists of six bits of the part of the IP header formerly known as the TOS octet, and it is used to indicate the forwarding treatment that a packet should receive at a node [RFC2474]. The Diffserv working group has also standardized a number of Per-Hop Behavior (PHB) groups. Using the PHBs, several classes of service can be defined using different classification, policing, shaping, and scheduling rules.

For an end-user of network services to receive Differentiated Services from its Internet Service Provider (ISP), it may be necessary for the user to have a Service Level Agreement (SLA) with the ISP. An SLA may explicitly or implicitly specify a Traffic Conditioning Agreement (TCA) which defines classifier rules as well as metering, marking, discarding, and shaping rules.

Packets are classified, and possibly policed and shaped, at the ingress to a Diffserv network. When a packet traverses the boundary between different Diffserv domains, the DS field of the packet may be re-marked according to existing agreements between the domains.
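To make the DS field handling above concrete, the Python sketch below places a 6-bit Diffserv codepoint in the most significant bits of the former TOS octet and applies an illustrative re-marking decision at a domain boundary. The EF codepoint value (46) is the standard assignment; the re-marking policy itself, and the function names, are invented purely for illustration.

   EF = 46            # Expedited Forwarding PHB codepoint
   BEST_EFFORT = 0    # Default PHB codepoint

   def set_dscp(tos_octet, dscp):
       # The 6-bit DSCP occupies the upper bits of the former TOS octet;
       # the remaining two bits are left untouched.
       return ((dscp & 0x3F) << 2) | (tos_octet & 0x03)

   def get_dscp(tos_octet):
       return tos_octet >> 2

   def remark_at_boundary(tos_octet, ef_admitted):
       # Illustrative policy only: EF traffic not covered by the
       # inter-domain agreement is re-marked to the default PHB.
       if get_dscp(tos_octet) == EF and not ef_admitted:
           return set_dscp(tos_octet, BEST_EFFORT)
       return tos_octet

   marked = set_dscp(0, EF)
   print(get_dscp(marked))                                         # 46
   print(get_dscp(remark_at_boundary(marked, ef_admitted=False)))  # 0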
Differentiated Services allows only a finite number of service classes to be indicated by the DS field. The main advantage of the Diffserv approach relative to the Intserv model is scalability: resources are allocated on a per-class basis, and the amount of state information is proportional to the number of classes rather than to the number of application flows.

As the preceding discussion shows, the Diffserv model essentially deals with traffic management issues on a per-hop basis. The Diffserv control model consists of a collection of micro-TE control mechanisms. Other traffic engineering capabilities, such as capacity management (including routing control), are also required in order to deliver acceptable service quality in Diffserv networks. The concept of Per-Domain Behaviors has been introduced to better capture the notion of differentiated services across a complete domain [RFC3086].

4.3.4. MPLS

MPLS is an advanced forwarding scheme which also includes extensions to conventional IP control plane protocols. MPLS extends the Internet routing model and enhances packet forwarding and path control [RFC3031].

At the ingress to an MPLS domain, Label Switching Routers (LSRs) classify IP packets into Forwarding Equivalence Classes (FECs) based on a variety of factors, including, for example, a combination of the information carried in the IP header of the packets and the local routing information maintained by the LSRs. An MPLS label stack entry is then prepended to each packet according to its FEC. The MPLS label stack entry is 32 bits long and contains a 20-bit label field.

An LSR makes forwarding decisions by using the label prepended to a packet as the index into a local Next Hop Label Forwarding Entry (NHLFE). The packet is then processed as specified in the NHLFE. The incoming label may be replaced by an outgoing label (label swap), and the packet may be forwarded to the next LSR. Before a packet leaves an MPLS domain, its MPLS label may be removed (label pop). A Label Switched Path (LSP) is the path between an ingress LSR and an egress LSR that a labeled packet traverses. The path of an explicit LSP is defined at the originating (ingress) node of the LSP. MPLS can use a signaling protocol such as RSVP or LDP to set up LSPs.

MPLS is a very powerful technology for Internet traffic engineering because it supports explicit LSPs which allow constraint-based routing to be implemented efficiently in IP networks [AWD2]. The requirements for traffic engineering over MPLS are described in [RFC2702]. Extensions to RSVP to support instantiation of explicit LSPs are discussed in [RFC3209].

4.3.5. Generalized MPLS

GMPLS extends MPLS control protocols to encompass time-division (e.g., SONET/SDH, PDH, G.709), wavelength (lambdas), and spatial switching (e.g., incoming port or fiber to outgoing port or fiber) as well as continuing to support packet switching.
GMPLS provides a 1722 common set of control protocols for all of these layers (including 1723 some technology-specific extensions) each of which has a diverse data 1724 or forwarding plane. GMPLS covers both the signaling and the routing 1725 part of that control plane and is based on the Traffic Engineering 1726 extensions to MPLS (see Section 4.3.4). 1728 In GMPLS, the original MPLS architecture is extended to include LSRs 1729 whose forwarding planes rely on circuit switching, and therefore 1730 cannot forward data based on the information carried in either packet 1731 or cell headers. Specifically, such LSRs include devices where the 1732 switching is based on time slots, wavelengths, or physical ports. 1733 These additions impact basic LSP properties: how labels are requested 1734 and communicated, the unidirectional nature of MPLS LSPs, how errors 1735 are propagated, and information provided for synchronizing the 1736 ingress and egress LSRs. 1738 4.3.6. IP Performance Metrics 1740 The IETF IP Performance Metrics (IPPM) working group has been 1741 developing a set of standard metrics that can be used to monitor the 1742 quality, performance, and reliability of Internet services. These 1743 metrics can be applied by network operators, end-users, and 1744 independent testing groups to provide users and service providers 1745 with a common understanding of the performance and reliability of the 1746 Internet component 'clouds' they use/provide [RFC2330]. The criteria 1747 for performance metrics developed by the IPPM WG are described in 1748 [RFC2330]. Examples of performance metrics include one-way packet 1749 loss [RFC7680], one-way delay [RFC7679], and connectivity measures 1750 between two nodes [RFC2678]. Other metrics include second-order 1751 measures of packet loss and delay. 1753 Some of the performance metrics specified by the IPPM WG are useful 1754 for specifying Service Level Agreements (SLAs). SLAs are sets of 1755 service level objectives negotiated between users and service 1756 providers, wherein each objective is a combination of one or more 1757 performance metrics, possibly subject to certain constraints. 1759 4.3.7. Flow Measurement 1761 The IETF Real Time Flow Measurement (RTFM) working group has produced 1762 an architecture document defining a method to specify traffic flows 1763 as well as a number of components for flow measurement (meters, meter 1764 readers, manager) [RFC2722]. A flow measurement system enables 1765 network traffic flows to be measured and analyzed at the flow level 1766 for a variety of purposes. As noted in RFC 2722, a flow measurement 1767 system can be very useful in the following contexts: (1) 1768 understanding the behavior of existing networks, (2) planning for 1769 network development and expansion, (3) quantification of network 1770 performance, (4) verifying the quality of network service, and (5) 1771 attribution of network usage to users. 1773 A flow measurement system consists of meters, meter readers, and 1774 managers. A meter observes packets passing through a measurement 1775 point, classifies them into certain groups, accumulates certain usage 1776 data (such as the number of packets and bytes for each group), and 1777 stores the usage data in a flow table. A group may represent a user 1778 application, a host, a network, a group of networks, etc. A meter 1779 reader gathers usage data from various meters so it can be made 1780 available for analysis. 
A manager is responsible for configuring and controlling meters and meter readers. The instructions received by a meter from a manager include the flow specification, meter control parameters, and sampling techniques. The instructions received by a meter reader from a manager include the address of the meter whose data is to be collected, the frequency of data collection, and the types of flows to be collected.

4.3.8. Endpoint Congestion Management

[RFC3124] is intended to provide a set of congestion control mechanisms that transport protocols can use. It is also intended to develop mechanisms for unifying congestion control across a subset of an endpoint's active unicast connections (called a congestion group). A congestion manager continuously monitors the state of the path for each congestion group under its control. The manager uses that information to instruct a scheduler on how to partition bandwidth among the connections of that congestion group.

4.3.9. TE Extensions to the IGPs

TBD

4.3.10. Link-State BGP

In a number of environments, a component external to a network is called upon to perform computations based on the network topology and the current state of the connections within the network, including traffic engineering information. This information is typically distributed by IGP routing protocols within the network (see Section 4.3.9).

The Border Gateway Protocol (BGP) (see Section 7) is one of the essential routing protocols that glue the Internet together. BGP Link State (BGP-LS) [RFC7752] is a mechanism by which link-state and traffic engineering information can be collected from networks and shared with external components using the BGP routing protocol. The mechanism is applicable to physical and virtual IGP links, and is subject to policy control.

Information collected by BGP-LS can be used to construct the Traffic Engineering Database (TED, see Section 4.3.16) for use by the Path Computation Element (PCE, see Section 4.3.11), or may be used by Application-Layer Traffic Optimization (ALTO) servers (see Section 4.3.12).

4.3.11. Path Computation Element

Constraint-based path computation is a fundamental building block for traffic engineering in MPLS and GMPLS networks. Path computation in large, multi-domain networks is complex and may require special computational components and cooperation between the elements in different domains. The Path Computation Element (PCE) [RFC4655] is an entity (component, application, or network node) that is capable of computing a network path or route based on a network graph and applying computational constraints.

Thus, a PCE can provide a central component in a traffic engineering system operating on the Traffic Engineering Database (TED, see Section 4.3.16) with delegated responsibility for determining paths in MPLS, GMPLS, or Segment Routing networks.
The PCE uses the Path Computation Element Communication Protocol (PCEP) [RFC5440] to communicate with Path Computation Clients (PCCs), such as MPLS LSRs, to answer their requests for computed paths, to instruct them to initiate new paths [RFC8281], and to maintain state about paths already installed in the network [RFC8231].

PCEs form key components of a number of traffic engineering systems, such as the Application of the Path Computation Element Architecture [RFC6805], the Applicability of a Stateful Path Computation Element [RFC8051], Abstraction and Control of TE Networks (ACTN, see Section 4.3.14), Centralized Network Control [RFC8283], and Software Defined Networking (SDN, see Section 5.3.2).

4.3.12. Application-Layer Traffic Optimization

TBD

4.3.13. Segment Routing with MPLS Encapsulation (SR-MPLS)

Segment Routing (SR) leverages the source routing and tunneling paradigms: the path a packet takes is defined at the ingress and tunneled to the egress.

A node steers a packet through a controlled set of instructions, called segments, by prepending the packet with an SR header (a label stack in the MPLS case).

A segment can represent any instruction, topological or service-based, thanks to the MPLS architecture [RFC3031]. Labels can be looked up in a global context (platform-wide) as well as in some other context (see "context labels" in Section 3 of [RFC5331]).

4.3.13.1. Base Segment Routing Identifier Types

Segments are identified by Segment Identifiers (SIDs). There are four types of SID that are relevant for traffic engineering.

Prefix SID: A Prefix SID uses the SR Global Block (SRGB), must be unique within the SRGB of the routing domain, and is advertised by an IGP. The Prefix SID can be configured as an absolute value or an index.

Node SID: A Node SID is a Prefix SID with the 'N' (node) bit set; it is associated with a host prefix (/32 or /128) that identifies the node. More than one Node SID can be configured per node.

Adjacency SID: An Adjacency SID is locally significant (by default). It can be made globally significant through use of the 'L' flag. It identifies a unidirectional adjacency. In most implementations, Adjacency SIDs are automatically allocated for each adjacency. They are always encoded as an absolute (not indexed) value.

Binding SID: A Binding SID has two purposes:

1. Mapping server in IS-IS

   In IS-IS, the SID/Label Binding TLV is used to advertise prefix to SID/Label mappings. This functionality is called the Segment Routing Mapping Server (SRMS). The behavior of the SRMS is defined in [RFC8661].

2. Cross-connect (label to FEC mapping)

   This is fundamental for multi-domain/multi-layer operation. The Binding SID identifies a new path (which could be an SR path or a hierarchical path at another layer) available at the anchor point. It is always local to the originator (it must not be at the top of the stack) and must be looked up in the context of the nodal SID. It could be provisioned through NETCONF/RESTCONF, PCEP, BGP, or the CLI.

4.3.14. Network Virtualization and Abstraction

ACTN goes here: TBD

4.3.15. Deterministic Networking

TBD

4.3.16. Network TE State Definition and Presentation

The network states that are relevant to traffic engineering need to be stored in the system and presented to the user.
The Traffic Engineering Database (TED) is a collection of all TE information about all TE nodes and TE links in the network. It is an essential component of TE systems such as MPLS-TE [RFC2702] and GMPLS [RFC3945]. In order to formally define the data in the TED and to present that data to the user with high usability, the data modeling language YANG [RFC7950] can be used as described in [I-D.ietf-teas-yang-te-topo].

4.3.17. System Management and Control Interfaces

The traffic engineering control system needs to have a management interface that is human-friendly and control interfaces that are programmable for automation. The Network Configuration Protocol (NETCONF) [RFC6241] and the RESTCONF protocol [RFC8040] provide programmable interfaces that are also human-friendly. These protocols use XML- or JSON-encoded messages. When message compactness or protocol bandwidth consumption needs to be optimized for the control interface, other protocols, such as Group Communication for the Constrained Application Protocol (CoAP) [RFC7390] or gRPC, are available, especially when the protocol messages are encoded in a binary format. Along with any of these protocols, the data modeling language YANG [RFC7950] can be used to formally and precisely define the interface data.

The Path Computation Element Communication Protocol (PCEP) [RFC5440] is another protocol that has evolved to be an option for the TE system control interface. The messages of PCEP are TLV-based and are not defined by a data modeling language such as YANG.

4.4. Overview of ITU Activities Related to Traffic Engineering

This section provides an overview of prior work within the ITU-T pertaining to traffic engineering in traditional telecommunications networks.

ITU-T Recommendations E.600 [ITU-E600], E.701 [ITU-E701], and E.801 [ITU-E801] address traffic engineering issues in traditional telecommunications networks. Recommendation E.600 provides a vocabulary for describing traffic engineering concepts, while E.701 defines reference connections, Grade of Service (GoS), and traffic parameters for ISDN. Recommendation E.701 uses the concept of a reference connection to identify representative cases of different types of connections without describing the specifics of their actual realizations by different physical means. As defined in Recommendation E.600, "a connection is an association of resources providing means for communication between two or more devices in, or attached to, a telecommunication network." Also, E.600 defines "a resource as any set of physically or conceptually identifiable entities within a telecommunication network, the use of which can be unambiguously determined" [ITU-E600]. There can be different types of connections as the number and types of resources in a connection may vary.

Typically, different network segments are involved in the path of a connection. For example, a connection may be local, national, or international. The purposes of reference connections are to clarify and specify traffic performance issues at various interfaces between different network domains. Each domain may consist of one or more service provider networks.
1983 Reference connections provide a basis to define grade of service 1984 (GoS) parameters related to traffic engineering within the ITU-T 1985 framework. As defined in E.600, "GoS refers to a number of traffic 1986 engineering variables which are used to provide a measure of the 1987 adequacy of a group of resources under specified conditions." These 1988 GoS variables may be probability of loss, dial tone, delay, etc. 1989 They are essential for network internal design and operation as well 1990 as for component performance specification. 1992 GoS is different from quality of service (QoS) in the ITU framework. 1993 QoS is the performance perceivable by a telecommunication service 1994 user and expresses the user's degree of satisfaction of the service. 1996 QoS parameters focus on performance aspects observable at the service 1997 access points and network interfaces, rather than their causes within 1998 the network. GoS, on the other hand, is a set of network oriented 1999 measures which characterize the adequacy of a group of resources 2000 under specified conditions. For a network to be effective in serving 2001 its users, the values of both GoS and QoS parameters must be related, 2002 with GoS parameters typically making a major contribution to the QoS. 2004 Recommendation E.600 stipulates that a set of GoS parameters must be 2005 selected and defined on an end-to-end basis for each major service 2006 category provided by a network to assist the network provider with 2007 improving efficiency and effectiveness of the network. Based on a 2008 selected set of reference connections, suitable target values are 2009 assigned to the selected GoS parameters under normal and high load 2010 conditions. These end-to-end GoS target values are then apportioned 2011 to individual resource components of the reference connections for 2012 dimensioning purposes. 2014 4.5. Content Distribution 2016 The Internet is dominated by client-server interactions, especially 2017 Web traffic (in the future, more sophisticated media servers may 2018 become dominant). The location and performance of major information 2019 servers has a significant impact on the traffic patterns within the 2020 Internet as well as on the perception of service quality by end 2021 users. 2023 A number of dynamic load balancing techniques have been devised to 2024 improve the performance of replicated information servers. These 2025 techniques can cause spatial traffic characteristics to become more 2026 dynamic in the Internet because information servers can be 2027 dynamically picked based upon the location of the clients, the 2028 location of the servers, the relative utilization of the servers, the 2029 relative performance of different networks, and the relative 2030 performance of different parts of a network. This process of 2031 assignment of distributed servers to clients is called Traffic 2032 Directing. It functions at the application layer. 2034 Traffic Directing schemes that allocate servers in multiple 2035 geographically dispersed locations to clients may require empirical 2036 network performance statistics to make more effective decisions. In 2037 the future, network measurement systems may need to provide this type 2038 of information. The exact parameters needed are not yet defined. 2040 When congestion exists in the network, Traffic Directing and Traffic 2041 Engineering systems should act in a coordinated manner. This topic 2042 is for further study. 
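As a simple illustration of the Traffic Directing function discussed above, the Python sketch below chooses among replicated servers using measured round-trip times and server utilizations. The measurements, the scoring function, and the replica names are all hypothetical; the sketch only shows how empirical network performance statistics could feed an application-layer server-selection decision, and does not describe any standardized mechanism.

   def pick_replica(replicas):
       # Lower score is better: weight the measured RTT by how busy the
       # replica is. The weighting is arbitrary and purely illustrative.
       return min(replicas, key=lambda r: r["rtt_ms"] * (1.0 + r["utilization"]))

   # Hypothetical measurements for three replicas of the same content.
   replicas = [
       {"name": "replica-1", "rtt_ms": 40.0, "utilization": 0.80},
       {"name": "replica-2", "rtt_ms": 90.0, "utilization": 0.20},
       {"name": "replica-3", "rtt_ms": 55.0, "utilization": 0.35},
   ]
   print(pick_replica(replicas)["name"])   # replica-1 for these numbers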
2044 The issues related to location and replication of information 2045 servers, particularly web servers, are important for Internet traffic 2046 engineering because these servers contribute a substantial proportion 2047 of Internet traffic. 2049 5. Taxonomy of Traffic Engineering Systems 2051 This section presents a short taxonomy of traffic engineering 2052 systems. A taxonomy of traffic engineering systems can be 2053 constructed based on traffic engineering styles and views as listed 2054 below: 2056 o Time-dependent vs State-dependent vs Event-dependent 2058 o Offline vs Online 2060 o Centralized vs Distributed 2062 o Local vs Global Information 2064 o Prescriptive vs Descriptive 2066 o Open Loop vs Closed Loop 2068 o Tactical vs Strategic 2070 These classification systems are described in greater detail in the 2071 following subsections of this document. 2073 5.1. Time-Dependent Versus State-Dependent Versus Event Dependent 2075 Traffic engineering methodologies can be classified as time- 2076 dependent, or state-dependent, or event-dependent. All TE schemes 2077 are considered to be dynamic in this document. Static TE implies 2078 that no traffic engineering methodology or algorithm is being 2079 applied. 2081 In the time-dependent TE, historical information based on periodic 2082 variations in traffic, (such as time of day), is used to pre-program 2083 routing plans and other TE control mechanisms. Additionally, 2084 customer subscription or traffic projection may be used. Pre- 2085 programmed routing plans typically change on a relatively long time 2086 scale (e.g., diurnal). Time-dependent algorithms do not attempt to 2087 adapt to random variations in traffic or changing network conditions. 2088 An example of a time-dependent algorithm is a global centralized 2089 optimizer where the input to the system is a traffic matrix and 2090 multi-class QoS requirements as described [MR99]. 2092 State-dependent TE adapts the routing plans for packets based on the 2093 current state of the network. The current state of the network 2094 provides additional information on variations in actual traffic 2095 (i.e., perturbations from regular variations) that could not be 2096 predicted using historical information. Constraint-based routing is 2097 an example of state-dependent TE operating in a relatively long time 2098 scale. An example operating in a relatively short time scale is a 2099 load-balancing algorithm described in [MATE]. 2101 The state of the network can be based on parameters such as 2102 utilization, packet delay, packet loss, etc. These parameters can be 2103 obtained in several ways. For example, each router may flood these 2104 parameters periodically or by means of some kind of trigger to other 2105 routers. Another approach is for a particular router performing 2106 adaptive TE to send probe packets along a path to gather the state of 2107 that path. Still another approach is for a management system to 2108 gather relevant information from network elements. 2110 Expeditious and accurate gathering and distribution of state 2111 information is critical for adaptive TE due to the dynamic nature of 2112 network conditions. State-dependent algorithms may be applied to 2113 increase network efficiency and resilience. Time-dependent 2114 algorithms are more suitable for predictable traffic variations. On 2115 the other hand, state-dependent algorithms are more suitable for 2116 adapting to the prevailing network state. 
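The contrast between time-dependent and state-dependent selection described above can be sketched as follows. The plan names, switch-over hours, and utilization figures in this Python fragment are hypothetical; the fragment simply contrasts a pre-programmed, time-of-day routing plan with a choice driven by currently measured network state.

   from datetime import datetime, timezone

   # Time-dependent: the routing plan is selected purely from the time of
   # day using a pre-programmed table (plan names and hours are hypothetical).
   def time_dependent_plan(now=None):
       now = now or datetime.now(timezone.utc)
       return "busy-hour-plan" if 8 <= now.hour < 18 else "off-peak-plan"

   # State-dependent: the path is selected from currently measured state,
   # here simply the least-utilized of the candidate paths.
   def state_dependent_path(candidates):
       return min(candidates, key=lambda p: p["utilization"])

   candidate_paths = [
       {"name": "path-1", "utilization": 0.72},
       {"name": "path-2", "utilization": 0.35},
   ]
   print(time_dependent_plan())
   print(state_dependent_path(candidate_paths)["name"])   # path-2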
2118 Event-dependent TE methods can also be used for TE path selection. 2119 Event-dependent TE methods are distinct from time-dependent and 2120 state-dependent TE methods in the manner in which paths are selected. 2121 These algorithms are adaptive and distributed in nature and typically 2122 use learning models to find good paths for TE in a network. While 2123 state-dependent TE models typically use available-link-bandwidth 2124 (ALB) flooding for TE path selection, event-dependent TE methods do 2125 not require ALB flooding. Rather, event-dependent TE methods 2126 typically search out capacity by learning models, as in the success- 2127 to-the-top (STT) method. ALB flooding can be resource intensive, 2128 since it consumes link bandwidth to carry LSAs and processor capacity 2129 to process them, and the associated overhead can limit area/autonomous 2130 system (AS) size. Modeling results suggest that event-dependent TE methods could 2131 lead to a reduction in ALB flooding overhead without loss of network 2132 throughput performance [I-D.ietf-tewg-qos-routing]. 2134 5.2. Offline Versus Online 2136 Traffic engineering requires the computation of routing plans. The 2137 computation may be performed offline or online. The computation can 2138 be done offline for scenarios where routing plans need not be 2139 executed in real-time. For example, routing plans computed from 2140 forecast information may be computed offline. Typically, offline 2141 computation is also used to perform extensive searches on multi- 2142 dimensional solution spaces. 2144 Online computation is required when the routing plans must adapt to 2145 changing network conditions as in state-dependent algorithms. Unlike 2146 offline computation (which can be computationally demanding), online 2147 computation is geared toward relatively simple and fast calculations to 2148 select routes, fine-tune the allocations of resources, and perform 2149 load balancing. 2151 5.3. Centralized Versus Distributed 2153 Under centralized control, a central authority determines routing 2154 plans and perhaps other TE control parameters on behalf of each 2155 router. The central authority collects the network-state information 2156 from all routers periodically and returns the routing information to 2157 the routers. The routing update cycle is a critical parameter 2158 directly impacting the performance of the network being controlled. 2159 Centralized control may need high processing power and high-bandwidth 2160 control channels. 2162 Under distributed control, each router determines route selection 2163 autonomously based on the router's view of the state of the network. 2164 The network state information may be obtained by the router using a 2165 probing method or distributed by other routers on a periodic basis 2166 using link state advertisements. Network state information may also 2167 be disseminated under exceptional conditions. 2169 5.3.1. Hybrid Systems 2171 TBD 2173 5.3.2. Considerations for Software Defined Networking 2175 TBD 2177 5.4. Local Versus Global 2179 Traffic engineering algorithms may require local or global network- 2180 state information. 2182 Local information pertains to the state of a portion of the domain. 2183 Examples include the bandwidth and packet loss rate of a particular 2184 path. Local state information may be sufficient for certain 2185 instances of distributed TE control. 2187 Global information pertains to the state of the entire domain 2188 undergoing traffic engineering.
Examples include a global traffic 2189 matrix and loading information on each link throughout the domain of 2190 interest. Global state information is typically required with 2191 centralized control. Distributed TE systems may also need global 2192 information in some cases. 2194 5.5. Prescriptive Versus Descriptive 2196 TE systems may also be classified as prescriptive or descriptive. 2198 Prescriptive traffic engineering evaluates alternatives and 2199 recommends a course of action. Prescriptive traffic engineering can 2200 be further categorized as either corrective or perfective. 2201 Corrective TE prescribes a course of action to address an existing or 2202 predicted anomaly. Perfective TE prescribes a course of action to 2203 evolve and improve network performance even when no anomalies are 2204 evident. 2206 Descriptive traffic engineering, on the other hand, characterizes the 2207 state of the network and assesses the impact of various policies 2208 without recommending any particular course of action. 2210 5.5.1. Intent-Based Networking 2212 TBD 2214 5.6. Open-Loop Versus Closed-Loop 2216 Open-loop traffic engineering control is where control action does 2217 not use feedback information from the current network state. The 2218 control action may use its own local information for accounting 2219 purposes, however. 2221 Closed-loop traffic engineering control is where control action 2222 utilizes feedback information from the network state. The feedback 2223 information may be in the form of historical information or current 2224 measurement. 2226 5.7. Tactical vs Strategic 2228 Tactical traffic engineering aims to address specific performance 2229 problems (such as hot-spots) that occur in the network from a 2230 tactical perspective, without consideration of overall strategic 2231 imperatives. Without proper planning and insights, tactical TE tends 2232 to be ad hoc in nature. 2234 Strategic traffic engineering approaches the TE problem from a more 2235 organized and systematic perspective, taking into consideration the 2236 immediate and longer term consequences of specific policies and 2237 actions. 2239 6. Objectives for Internet Traffic Engineering 2241 This section describes high-level objectives for traffic engineering 2242 in the Internet. These objectives are presented in general terms and 2243 some advice is given as to how to meet the objectives. 2245 Broadly speaking, these objectives can be categorized as either 2246 functional or non-functional. 2248 Functional objectives for Internet traffic engineering describe the 2249 functions that a traffic engineering system should perform. These 2250 functions are needed to realize traffic engineering objectives by 2251 addressing traffic engineering problems. 2253 Non-functional objectives for Internet traffic engineering relate to 2254 the quality attributes or state characteristics of a traffic 2255 engineering system. These objectives may contain conflicting 2256 assertions and may sometimes be difficult to quantify precisely. 2258 6.1. Routing 2260 Routing control is a significant aspect of Internet traffic 2261 engineering. Routing impacts many of the key performance measures 2262 associated with networks, such as throughput, delay, and utilization. 2263 Generally, it is very difficult to provide good service quality in a 2264 wide area network without effective routing control. 
A desirable 2265 routing system is one that takes traffic characteristics and network 2266 constraints into account during route selection while maintaining 2267 stability. 2269 Traditional shortest path first (SPF) interior gateway protocols are 2270 based on shortest path algorithms and have limited control 2271 capabilities for traffic engineering [RFC2702], [AWD2]. These 2272 limitations include: 2274 1. The well-known issues with pure SPF protocols, which do not take 2275 network constraints and traffic characteristics into account 2276 during route selection. For example, since IGPs always use the 2277 shortest paths (based on administratively assigned link metrics) 2278 to forward traffic, load sharing cannot be accomplished among 2279 paths of different costs. Using shortest paths to forward 2280 traffic conserves network resources, but may cause the following 2281 problems: 1) If traffic from a source to a destination exceeds 2282 the capacity of a link along the shortest path, the link (hence 2283 the shortest path) becomes congested while a longer path between 2284 these two nodes may be under-utilized; 2) the shortest paths from 2285 different sources can overlap at some links. If the total 2286 traffic from the sources exceeds the capacity of any of these 2287 links, congestion will occur. Problems can also occur because 2288 traffic demand changes over time but network topology and routing 2289 configuration cannot be changed as rapidly. This causes the 2290 network topology and routing configuration to become sub-optimal 2291 over time, which may result in persistent congestion problems. 2293 2. The Equal-Cost Multi-Path (ECMP) capability of SPF IGPs supports 2294 sharing of traffic among equal cost paths between two nodes. 2295 However, ECMP attempts to divide the traffic as equally as 2296 possible among the equal cost shortest paths. Generally, ECMP 2297 does not support configurable load sharing ratios among equal 2298 cost paths. The result is that one of the paths may carry 2299 significantly more traffic than other paths because it may also 2300 carry traffic from other sources. This situation can result in 2301 congestion along the path that carries more traffic. 2303 3. Modifying IGP metrics to control traffic routing tends to have a 2304 network-wide effect. Consequently, undesirable and unanticipated 2305 traffic shifts can be triggered. Recent work 2306 described in Section 8 may be capable of better control [FT00], 2307 [FT01]. 2309 Because of these limitations, new capabilities are needed to enhance 2310 the routing function in IP networks. Some of these capabilities have 2311 been described elsewhere and are summarized below. 2313 Constraint-based routing is desirable to evolve the routing 2314 architecture of IP networks, especially public IP backbones with 2315 complex topologies [RFC2702]. Constraint-based routing computes 2316 routes to fulfill requirements subject to constraints. Constraints 2317 may include bandwidth, hop count, delay, and administrative policy 2318 instruments such as resource class attributes [RFC2702], [RFC2386]. 2319 This makes it possible to select routes that satisfy a given set of 2320 requirements subject to network and administrative policy 2321 constraints. Routes computed through constraint-based routing are 2322 not necessarily the shortest paths. Constraint-based routing works 2323 best with path-oriented technologies that support explicit routing, 2324 such as MPLS.
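A minimal, non-normative sketch of such a constraint-based path computation is given below in Python. It assumes a simple representation of the TE database as an adjacency map; the data structure, attribute names, and example topology are illustrative only. Links that cannot satisfy the bandwidth constraint, or that belong to an excluded resource class, are pruned, and a shortest path is then computed over the remaining topology.

   import heapq

   # Hypothetical TE database: topology[node] is a list of
   # (neighbor, igp_cost, reservable_bw, resource_classes) tuples.

   def cspf(topology, src, dst, bw_needed, excluded_classes=frozenset()):
       # Prune infeasible links, then run Dijkstra on what remains.
       dist, prev = {src: 0}, {}
       heap = [(0, src)]
       while heap:
           d, node = heapq.heappop(heap)
           if node == dst:
               break
           if d > dist.get(node, float("inf")):
               continue
           for nbr, cost, bw, classes in topology.get(node, []):
               if bw < bw_needed or (classes & excluded_classes):
                   continue                      # constraint pruning
               nd = d + cost
               if nd < dist.get(nbr, float("inf")):
                   dist[nbr], prev[nbr] = nd, node
                   heapq.heappush(heap, (nd, nbr))
       if dst not in dist:
           return None                           # no feasible path
       path, node = [dst], dst
       while node != src:
           node = prev[node]
           path.append(node)
       return list(reversed(path))

   topo = {"A": [("B", 1, 10, set()), ("C", 5, 100, set())],
           "B": [("D", 1, 2, set())],
           "C": [("D", 1, 100, set())]}
   print(cspf(topo, "A", "D", bw_needed=50))     # ['A', 'C', 'D']

Real implementations operate on the TE database built from the IGP advertisements described below, and typically also handle equal-cost ties, administrative policy, and re-optimization.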
2326 Constraint-based routing can also be used as a way to redistribute 2327 traffic onto the infrastructure (even for best effort traffic). For 2328 example, if the bandwidth requirements for path selection and 2329 reservable bandwidth attributes of network links are appropriately 2330 defined and configured, then congestion problems caused by uneven 2331 traffic distribution may be avoided or reduced. In this way, the 2332 performance and efficiency of the network can be improved. 2334 A number of enhancements are needed to conventional link state IGPs, 2335 such as OSPF and IS-IS, to allow them to distribute additional state 2336 information required for constraint-based routing. These extensions 2337 to OSPF were described in [RFC3630] and to IS-IS in [RFC5305]. 2338 Essentially, these enhancements require the propagation of additional 2339 information in link state advertisements. Specifically, in addition 2340 to normal link-state information, an enhanced IGP is required to 2341 propagate topology state information needed for constraint-based 2342 routing. Some of the additional topology state information include 2343 link attributes such as reservable bandwidth and link resource class 2344 attribute (an administratively specified property of the link). The 2345 resource class attribute concept was defined in [RFC2702]. The 2346 additional topology state information is carried in new TLVs and sub- 2347 TLVs in IS-IS, or in the Opaque LSA in OSPF [RFC5305], [RFC3630]. 2349 An enhanced link-state IGP may flood information more frequently than 2350 a normal IGP. This is because even without changes in topology, 2351 changes in reservable bandwidth or link affinity can trigger the 2352 enhanced IGP to initiate flooding. A tradeoff is typically required 2353 between the timeliness of the information flooded and the flooding 2354 frequency to avoid excessive consumption of link bandwidth and 2355 computational resources, and more importantly, to avoid instability. 2357 In a TE system, it is also desirable for the routing subsystem to 2358 make the load splitting ratio among multiple paths (with equal cost 2359 or different cost) configurable. This capability gives network 2360 administrators more flexibility in the control of traffic 2361 distribution across the network. It can be very useful for avoiding/ 2362 relieving congestion in certain situations. Examples can be found in 2363 [XIAO]. 2365 The routing system should also have the capability to control the 2366 routes of subsets of traffic without affecting the routes of other 2367 traffic if sufficient resources exist for this purpose. This 2368 capability allows a more refined control over the distribution of 2369 traffic across the network. For example, the ability to move traffic 2370 from a source to a destination away from its original path to another 2371 path (without affecting other traffic paths) allows traffic to be 2372 moved from resource-poor network segments to resource-rich segments. 2373 Path oriented technologies such as MPLS inherently support this 2374 capability as discussed in [AWD2]. 2376 Additionally, the routing subsystem should be able to select 2377 different paths for different classes of traffic (or for different 2378 traffic behavior aggregates) if the network supports multiple classes 2379 of service (different behavior aggregates). 2381 6.2. Traffic Mapping 2383 Traffic mapping pertains to the assignment of traffic workload onto 2384 pre-established paths to meet certain requirements. 
Thus, while 2385 constraint-based routing deals with path selection, traffic mapping 2386 deals with the assignment of traffic to established paths which may 2387 have been selected by constraint-based routing or by some other 2388 means. Traffic mapping can be performed by time-dependent or state- 2389 dependent mechanisms, as described in Section 5.1. 2391 An important aspect of the traffic mapping function is the ability to 2392 establish multiple paths between an originating node and a 2393 destination node, and the capability to distribute the traffic 2394 between the two nodes across the paths according to some policies. A 2395 pre-condition for this scheme is the existence of flexible mechanisms 2396 to partition traffic and then assign the traffic partitions onto the 2397 parallel paths. This requirement was noted in [RFC2702]. When 2398 traffic is assigned to multiple parallel paths, it is recommended 2399 that special care be taken to ensure proper ordering of 2400 packets belonging to the same application (or micro-flow) at the 2401 destination node of the parallel paths. 2403 As a general rule, mechanisms that perform the traffic mapping 2404 functions should aim to map the traffic onto the network 2405 infrastructure to minimize congestion. If the total traffic load 2406 cannot be accommodated, or if the routing and mapping functions 2407 cannot react fast enough to changing traffic conditions, then a 2408 traffic mapping system may rely on short time scale congestion 2409 control mechanisms (such as queue management, scheduling, etc.) to 2410 mitigate congestion. Thus, mechanisms that perform the traffic 2411 mapping functions should complement existing congestion control 2412 mechanisms. In an operational network, it is generally desirable to 2413 map the traffic onto the infrastructure such that intra-class and 2414 inter-class resource contention are minimized. 2416 When traffic mapping techniques that depend on dynamic state feedback 2417 (e.g., [MATE] and similar approaches) are used, special care must be taken to 2418 guarantee network stability. 2420 6.3. Measurement 2422 The importance of measurement in traffic engineering has been 2423 discussed throughout this document. Mechanisms should be provided to 2424 measure and collect statistics from the network to support the 2425 traffic engineering function. Additional capabilities may be needed 2426 to help in the analysis of the statistics. The actions of these 2427 mechanisms should not adversely affect the accuracy and integrity of 2428 the statistics collected. The mechanisms for statistical data 2429 acquisition should also be able to scale as the network evolves. 2431 Traffic statistics may be classified according to long-term or short- 2432 term time scales. Long-term time scale traffic statistics are very 2433 useful for traffic engineering. Long-term time scale traffic 2434 statistics may capture or reflect periodicity in network workload 2435 (such as hourly, daily, and weekly variations in traffic profiles) as 2436 well as traffic trends. Aspects of the monitored traffic statistics 2437 may also depict class of service characteristics for a network 2438 supporting multiple classes of service. Analysis of the long-term 2439 traffic statistics may yield secondary statistics such as busy hour 2440 characteristics, traffic growth patterns, persistent congestion 2441 problems, hot-spots, and imbalances in link utilization caused by 2442 routing anomalies.
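As a simple, non-normative illustration of deriving such secondary statistics, the Python sketch below computes a busy-hour profile for one link from hourly utilization samples. The sample data and the averaging method are hypothetical stand-ins for whatever collection system is actually deployed.

   from collections import defaultdict

   # samples: (hour_of_day, utilization) tuples for one link,
   # accumulated over many days by a hypothetical collector.

   def busy_hour(samples):
       # Average utilization per hour of day; return the busiest hour.
       totals, counts = defaultdict(float), defaultdict(int)
       for hour, util in samples:
           totals[hour] += util
           counts[hour] += 1
       averages = {h: totals[h] / counts[h] for h in totals}
       peak = max(averages, key=averages.get)
       return peak, averages[peak]

   samples = [(9, 0.55), (9, 0.65), (14, 0.80), (14, 0.90), (3, 0.10)]
   print(busy_hour(samples))      # approximately (14, 0.85)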
2444 A mechanism for constructing traffic matrices for both long-term and 2445 short-term traffic statistics should be in place. In multi-service 2446 IP networks, the traffic matrices may be constructed for different 2447 service classes. Each element of a traffic matrix represents a 2448 statistic of traffic flow between a pair of abstract nodes. An 2449 abstract node may represent a router, a collection of routers, or a 2450 site in a VPN. 2452 Measured traffic statistics should provide reasonable and reliable 2453 indicators of the current state of the network on the short-term 2454 scale. Some short-term traffic statistics may reflect link 2455 utilization and link congestion status. Examples of congestion 2456 indicators include excessive packet delay, packet loss, and high 2457 resource utilization. Examples of mechanisms for distributing this 2458 kind of information include SNMP, probing techniques, FTP, IGP link 2459 state advertisements, etc. 2461 6.4. Network Survivability 2463 Network survivability refers to the capability of a network to 2464 maintain service continuity in the presence of faults. This can be 2465 accomplished by promptly recovering from network impairments and 2466 maintaining the required QoS for existing services after recovery. 2467 Survivability has become an issue of great concern within the 2468 Internet community due to the increasing demands to carry mission- 2469 critical traffic, real-time traffic, and other high-priority traffic 2470 over the Internet. Survivability can be addressed at the device 2471 level by developing network elements that are more reliable, and at 2472 the network level by incorporating redundancy into the architecture, 2473 design, and operation of networks. It is recommended that a 2474 philosophy of robustness and survivability be adopted in the 2475 architecture, design, and operation of traffic engineering systems 2476 that control IP networks (especially public IP networks). Because 2477 different contexts may demand different levels of survivability, the 2478 mechanisms developed to support network survivability should be 2479 flexible so that they can be tailored to different needs. A number 2480 of tools and techniques have been developed to enable network 2481 survivability, including MPLS Fast Reroute [RFC4090], RSVP-TE 2482 Extensions in Support of End-to-End Generalized Multi-Protocol Label 2483 Switching (GMPLS) Recovery [RFC4872], and GMPLS Segment Recovery 2484 [RFC4873]. 2486 Failure protection and restoration capabilities have become available 2487 from multiple layers as network technologies have continued to 2488 improve. At the bottom of the layered stack, optical networks are 2489 now capable of providing dynamic ring and mesh restoration 2490 functionality at the wavelength level as well as traditional 2491 protection functionality. At the SONET/SDH layer, survivability 2492 capability is provided with Automatic Protection Switching (APS) as 2493 well as self-healing ring and mesh architectures. Similar 2494 functionality is provided by layer 2 technologies such as ATM 2495 (generally with slower mean restoration times). Rerouting is 2496 traditionally used at the IP layer to restore service following link 2497 and node outages. Rerouting at the IP layer occurs after a period of 2498 routing convergence, which may require seconds to minutes to complete. 2499 Some new developments in the MPLS context make it possible to achieve 2500 recovery at the IP layer prior to convergence [RFC3469].
2502 To support advanced survivability requirements, path-oriented 2503 technologies such as MPLS can be used to enhance the survivability of 2504 IP networks in a potentially cost-effective manner. The advantages 2505 of path-oriented technologies such as MPLS for IP restoration become 2506 even more evident when class-based protection and restoration 2507 capabilities are required. 2509 Recently, a common suite of control plane protocols has been proposed 2510 for both MPLS and optical transport networks under the acronym Multi- 2511 protocol Lambda Switching [AWD1]. This new paradigm of Multi- 2512 protocol Lambda Switching will support even more sophisticated mesh 2513 restoration capabilities at the optical layer for the emerging IP 2514 over WDM network architectures. 2516 Another important aspect regarding multi-layer survivability is that 2517 technologies at different layers provide protection and restoration 2518 capabilities at different temporal granularities (in terms of time 2519 scales) and at different bandwidth granularities (from packet level to 2520 wavelength level). Protection and restoration capabilities can also 2521 be sensitive to different service classes and different network 2522 utility models. 2524 The impact of service outages varies significantly for different 2525 service classes depending upon the effective duration of the outage. 2526 The duration of an outage can vary from milliseconds (with minor 2527 service impact) to seconds (with possible call drops for IP telephony 2528 and session time-outs for connection-oriented transactions) to 2529 minutes and hours (with potentially considerable social and business 2530 impact). 2532 Coordinating different protection and restoration capabilities across 2533 multiple layers in a cohesive manner to ensure that network survivability 2534 is maintained at reasonable cost is a challenging task. Protection 2535 and restoration coordination across layers may not always be 2536 feasible, because networks at different layers may belong to 2537 different administrative domains. 2539 The following are some general recommendations 2540 for protection and restoration coordination. 2542 o Protection and restoration capabilities from different layers 2543 should be coordinated whenever feasible and appropriate to provide 2544 network survivability in a flexible and cost-effective manner. 2545 Minimization of function duplication across layers is one way to 2546 achieve the coordination. Escalation of alarms and other fault 2547 indicators from lower to higher layers may also be performed in a 2548 coordinated manner. A temporal order of restoration trigger 2549 timing at different layers is another way to coordinate multi- 2550 layer protection/restoration. 2552 o Spare capacity at higher layers is often regarded as working 2553 traffic at lower layers. Placing protection/restoration functions 2554 in many layers may increase redundancy and robustness, but it 2555 should not result in significant and avoidable inefficiencies in 2556 network resource utilization. 2558 o It is generally desirable to have protection and restoration 2559 schemes that are bandwidth efficient. 2561 o Failure notification throughout the network should be timely and 2562 reliable. 2564 o Alarms and other fault monitoring and reporting capabilities 2565 should be provided at appropriate layers. 2567 6.4.1.
Survivability in MPLS Based Networks 2569 MPLS is an important emerging technology that enhances IP networks in 2570 terms of features, capabilities, and services. Because MPLS is path- 2571 oriented, it can potentially provide faster and more predictable 2572 protection and restoration capabilities than conventional hop by hop 2573 routed IP systems. This subsection describes some of the basic 2574 aspects and recommendations for MPLS networks regarding protection 2575 and restoration. See [RFC3469] for a more comprehensive discussion 2576 on MPLS based recovery. 2578 Protection types for MPLS networks can be categorized as link 2579 protection, node protection, path protection, and segment protection. 2581 o Link Protection: The objective for link protection is to protect 2582 an LSP from a given link failure. Under link protection, the path 2583 of the protection or backup LSP (the secondary LSP) is disjoint 2584 from the path of the working or operational LSP at the particular 2585 link over which protection is required. When the protected link 2586 fails, traffic on the working LSP is switched over to the 2587 protection LSP at the head-end of the failed link. This is a 2588 local repair method which can be fast. It might be more 2589 appropriate in situations where some network elements along a 2590 given path are less reliable than others. 2592 o Node Protection: The objective of LSP node protection is to 2593 protect an LSP from a given node failure. Under node protection, 2594 the path of the protection LSP is disjoint from the path of the 2595 working LSP at the particular node to be protected. The secondary 2596 path is also disjoint from the primary path at all links 2597 associated with the node to be protected. When the node fails, 2598 traffic on the working LSP is switched over to the protection LSP 2599 at the upstream LSR directly connected to the failed node. 2601 o Path Protection: The goal of LSP path protection is to protect an 2602 LSP from failure at any point along its routed path. Under path 2603 protection, the path of the protection LSP is completely disjoint 2604 from the path of the working LSP. The advantage of path 2605 protection is that the backup LSP protects the working LSP from 2606 all possible link and node failures along the path, except for 2607 failures that might occur at the ingress and egress LSRs, or for 2608 correlated failures that might impact both working and backup 2609 paths simultaneously. Additionally, since the path selection is 2610 end-to-end, path protection might be more efficient in terms of 2611 resource usage than link or node protection. However, path 2612 protection may be slower than link and node protection in general. 2614 o Segment Protection: An MPLS domain may be partitioned into 2615 multiple protection domains whereby a failure in a protection 2616 domain is rectified within that domain. In cases where an LSP 2617 traverses multiple protection domains, a protection mechanism 2618 within a domain only needs to protect the segment of the LSP that 2619 lies within the domain. Segment protection will generally be 2620 faster than path protection because recovery generally occurs 2621 closer to the fault. 2623 6.4.2. Protection Option 2625 Another issue to consider is the concept of protection options. The 2626 protection option uses the notation m:n protection, where m is the 2627 number of protection LSPs used to protect n working LSPs. Feasible 2628 protection options follow. 
2630 o 1:1: one working LSP is protected/restored by one protection LSP. 2632 o 1:n: one protection LSP is used to protect/restore n working LSPs. 2634 o n:1: one working LSP is protected/restored by n protection LSPs, 2635 possibly with a configurable load-splitting ratio. When more than 2636 one protection LSP is used, it may be desirable to share the 2637 traffic across the protection LSPs when the working LSP fails to 2638 satisfy the bandwidth requirement of the traffic trunk associated 2639 with the working LSP. This may be especially useful when it is 2640 not feasible to find one path that can satisfy the bandwidth 2641 requirement of the primary LSP. 2643 o 1+1: traffic is sent concurrently on both the working LSP and the 2644 protection LSP. In this case, the egress LSR selects one of the 2645 two LSPs based on a local traffic integrity decision process, 2646 which compares the traffic received from both the working and the 2647 protection LSP and identifies discrepancies. It is unlikely that 2648 this option would be used extensively in IP networks due to its 2649 resource utilization inefficiency. However, if bandwidth becomes 2650 plentiful and cheap, then this option might become quite viable 2651 and attractive in IP networks. 2653 6.5. Traffic Engineering in Diffserv Environments 2655 This section provides an overview of the traffic engineering features 2656 and recommendations that are specifically pertinent to Differentiated 2657 Services (Diffserv) [RFC2475] capable IP networks. 2659 Increasing requirements to support multiple classes of traffic, such 2660 as best-effort and mission-critical data, in the Internet call for 2661 IP networks to differentiate traffic according to some criteria, and 2662 to accord preferential treatment to certain types of traffic. Large 2663 numbers of flows can be aggregated into a few behavior aggregates 2664 based on common performance requirements (in 2665 terms of packet loss ratio, delay, and jitter) or on common 2666 fields within the IP packet headers. 2668 As Diffserv evolves and becomes deployed in operational networks, 2669 traffic engineering will be critical to ensuring that SLAs defined 2670 within a given Diffserv service model are met. Classes of service 2671 (CoS) can be supported in a Diffserv environment by concatenating 2672 per-hop behaviors (PHBs) along the routing path, using service 2673 provisioning mechanisms, and by appropriately configuring edge 2674 functionality such as traffic classification, marking, policing, and 2675 shaping. PHB is the forwarding behavior that a packet receives at a 2676 DS node (a Diffserv-compliant node). This is accomplished by means 2677 of buffer management and packet scheduling mechanisms. In this 2678 context, packets belonging to a class are those that are members of a 2679 corresponding ordering aggregate. 2681 Traffic engineering can be used as a complement to Diffserv 2682 mechanisms to improve utilization of network resources, but it is not, 2683 in general, a necessary element. When traffic engineering is used, it 2684 can be operated on an aggregated basis across all service classes 2685 [RFC3270] or on a per service class basis. The former is used to 2686 provide better distribution of the aggregate traffic load over the 2687 network resources. (See [RFC3270] for detailed mechanisms to support 2688 aggregate traffic engineering.)
The latter case is discussed below 2689 since it is specific to the Diffserv environment, with so called 2690 Diffserv-aware traffic engineering [RFC4124]. 2692 For some Diffserv networks, it may be desirable to control the 2693 performance of some service classes by enforcing certain 2694 relationships between the traffic workload contributed by each 2695 service class and the amount of network resources allocated or 2696 provisioned for that service class. Such relationships between 2697 demand and resource allocation can be enforced using a combination 2698 of, for example: (1) traffic engineering mechanisms on a per service 2699 class basis that enforce the desired relationship between the amount 2700 of traffic contributed by a given service class and the resources 2701 allocated to that class, and (2) mechanisms that dynamically adjust 2702 the resources allocated to a given service class to relate to the 2703 amount of traffic contributed by that service class. 2705 It may also be desirable to limit the performance impact of high 2706 priority traffic on relatively low priority traffic. This can be 2707 achieved by, for example, controlling the percentage of high priority 2708 traffic that is routed through a given link. Another way to 2709 accomplish this is to increase link capacities appropriately so that 2710 lower priority traffic can still enjoy adequate service quality. 2711 When the ratio of traffic workload contributed by different service 2712 classes vary significantly from router to router, it may not suffice 2713 to rely exclusively on conventional IGP routing protocols or on 2714 traffic engineering mechanisms that are insensitive to different 2715 service classes. Instead, it may be desirable to perform traffic 2716 engineering, especially routing control and mapping functions, on a 2717 per service class basis. One way to accomplish this in a domain that 2718 supports both MPLS and Diffserv is to define class specific LSPs and 2719 to map traffic from each class onto one or more LSPs that correspond 2720 to that service class. An LSP corresponding to a given service class 2721 can then be routed and protected/restored in a class dependent 2722 manner, according to specific policies. 2724 Performing traffic engineering on a per class basis may require 2725 certain per-class parameters to be distributed. Note that it is 2726 common to have some classes share some aggregate constraint (e.g., 2727 maximum bandwidth requirement) without enforcing the constraint on 2728 each individual class. These classes then can be grouped into a 2729 class-type and per-class-type parameters can be distributed instead 2730 to improve scalability. It also allows better bandwidth sharing 2731 between classes in the same class-type. A class-type is a set of 2732 classes that satisfy the following two conditions: 2734 1) Classes in the same class-type have common aggregate requirements 2735 to satisfy required performance levels. 2737 2) There is no requirement to be enforced at the level of individual 2738 class in the class-type. Note that it is still possible, 2739 nevertheless, to implement some priority policies for classes in the 2740 same class-type to permit preferential access to the class-type 2741 bandwidth through the use of preemption priorities. 2743 An example of the class-type can be a low-loss class-type that 2744 includes both AF1-based and AF2-based Ordering Aggregates. 
With such 2745 a class-type, one may implement a priority policy which assigns 2746 a higher preemption priority to AF1-based traffic trunks than to AF2-based 2747 ones, the reverse, or the same priority to both. 2749 See [RFC4124] for detailed requirements on Diffserv-aware traffic 2750 engineering. 2752 6.6. Network Controllability 2754 Off-line (and on-line) traffic engineering considerations would be of 2755 limited utility if the network could not be controlled effectively to 2756 implement the results of TE decisions and to achieve desired network 2757 performance objectives. Capacity augmentation is a coarse-grained 2758 solution to traffic engineering issues. However, it is simple and 2759 may be advantageous if bandwidth is abundant and cheap or if the 2760 current or expected network workload demands it. On the other hand, bandwidth 2761 is not always abundant and cheap, and the workload may not always 2762 demand additional capacity. Adjustments of administrative weights 2763 and other parameters associated with routing protocols provide finer- 2764 grained control, but are difficult to use and imprecise because of the 2765 routing interactions that occur across the network. In certain 2766 network contexts, more flexible, finer-grained approaches which 2767 provide more precise control over the mapping of traffic to routes 2768 and over the selection and placement of routes may be appropriate and 2769 useful. 2771 Control mechanisms can be manual (e.g., administrative 2772 configuration), partially-automated (e.g., scripts) or fully- 2773 automated (e.g., policy-based management systems). Automated 2774 mechanisms are particularly necessary in large-scale networks. Multi- 2775 vendor interoperability can be facilitated by developing and 2776 deploying standardized management systems (e.g., standard MIBs) and 2777 policies (PIBs) to support the control functions required to address 2778 traffic engineering objectives such as load distribution and 2779 protection/restoration. 2781 Network control functions should be secure, reliable, and stable as 2782 these are often needed to operate correctly in times of network 2783 impairments (e.g., during network congestion or security attacks). 2785 7. Inter-Domain Considerations 2787 Inter-domain traffic engineering is concerned with performance 2788 optimization of traffic that originates in one administrative domain 2789 and terminates in a different one. 2791 Traffic exchange between autonomous systems in the Internet occurs 2792 through exterior gateway protocols. Currently, BGP [RFC4271] is the 2793 standard exterior gateway protocol for the Internet. BGP provides a 2794 number of attributes and capabilities (e.g., route filtering) that 2795 can be used for inter-domain traffic engineering. More specifically, 2796 BGP permits the control of routing information and traffic exchange 2797 between Autonomous Systems (AS's) in the Internet. BGP incorporates 2798 a sequential decision process which calculates the degree of 2799 preference for various routes to a given destination network. There 2800 are two fundamental aspects to inter-domain traffic engineering using 2801 BGP: 2803 o Route Redistribution: controlling the import and export of routes 2804 between AS's, and controlling the redistribution of routes between 2805 BGP and other protocols within an AS. 2807 o Best path selection: selecting the best path when there are 2808 multiple candidate paths to a given destination network.
Best 2809 path selection is performed by the BGP decision process based on a 2810 sequential procedure, taking a number of different considerations 2811 into account. Ultimately, best path selection under BGP boils 2812 down to selecting preferred exit points out of an AS towards 2813 specific destination networks. The BGP path selection process can 2814 be influenced by manipulating the attributes associated with the 2815 BGP decision process. These attributes include: NEXT-HOP, WEIGHT 2816 (a Cisco proprietary attribute that is also implemented by some other 2817 vendors), LOCAL-PREFERENCE, AS-PATH, ROUTE-ORIGIN, MULTI-EXIT- 2818 DISCRIMINATOR (MED), IGP METRIC, etc. 2820 Route-maps provide the flexibility to implement complex BGP policies 2821 based on pre-configured logical conditions. In particular, Route- 2822 maps can be used to control import and export policies for incoming 2823 and outgoing routes, control the redistribution of routes between BGP 2824 and other protocols, and influence the selection of best paths by 2825 manipulating the attributes associated with the BGP decision process. 2826 Very complex logical expressions implementing various types of 2827 policies can be constructed using a combination of Route-maps, BGP 2828 attributes, Access-lists, and Community attributes. 2830 When looking at possible strategies for inter-domain TE with BGP, it 2831 must be noted that the outbound traffic exit point is controllable, 2832 whereas the interconnection point where inbound traffic is received 2833 from an EBGP peer typically is not, unless a special arrangement is 2834 made with the peer sending the traffic. Therefore, it is up to each 2835 individual network to implement sound TE strategies that deal with 2836 the efficient delivery of outbound traffic from one's customers to 2837 one's peering points. The vast majority of TE policy is based upon a 2838 "closest exit" strategy, which offloads inter-domain traffic at the 2839 nearest outbound peer point towards the destination autonomous 2840 system. Most methods of manipulating the point at which inbound 2841 traffic enters a network from an EBGP peer (inconsistent route 2842 announcements between peering points, AS-path prepending, and sending 2843 MEDs) are either ineffective or not accepted in the peering 2844 community. 2846 Inter-domain TE with BGP is generally effective, but it is usually 2847 applied in a trial-and-error fashion. A systematic approach for 2848 inter-domain traffic engineering is yet to be devised. 2850 Inter-domain TE is inherently more difficult than intra-domain TE 2851 under the current Internet architecture. The reasons for this are 2852 both technical and administrative. Technically, while topology and 2853 link state information are helpful for mapping traffic more 2854 effectively, BGP does not propagate such information across domain 2855 boundaries for stability and scalability reasons. Administratively, 2856 there are differences in operating costs and network capacities 2857 between domains. Generally, what may be considered a good solution 2858 in one domain may not necessarily be a good solution in another 2859 domain. Moreover, it would generally be considered inadvisable for 2860 one domain to permit another domain to influence the routing and 2861 management of traffic in its network. 2863 MPLS TE-tunnels (explicit LSPs) can potentially add a degree of 2864 flexibility in the selection of exit points for inter-domain routing.
2865 The concept of relative and absolute metrics can be applied to this 2866 purpose. The idea is that if BGP attributes are defined such that 2867 the BGP decision process depends on IGP metrics to select exit points 2868 for inter-domain traffic, then some inter-domain traffic destined to 2869 a given peer network can be made to prefer a specific exit point by 2870 establishing a TE-tunnel from the router making the selection to 2871 the preferred peering point and assigning the TE-tunnel a 2872 metric which is smaller than the IGP cost to all other peering 2873 points. If a peer accepts and processes MEDs, then a similar MPLS 2874 TE-tunnel based scheme can be applied to cause certain entrance 2875 points to be preferred by setting MED to be an IGP cost, which has 2876 been modified by the tunnel metric. 2878 Similar to intra-domain TE, inter-domain TE is best accomplished when 2879 a traffic matrix can be derived to depict the volume of traffic from 2880 one autonomous system to another. 2882 Generally, redistribution of inter-domain traffic requires 2883 coordination between peering partners. An export policy in one 2884 domain that results in load redistribution across peer points with 2885 another domain can significantly affect the local traffic matrix 2886 inside the domain of the peering partner. This, in turn, will affect 2887 intra-domain TE due to changes in the spatial distribution of 2888 traffic. Therefore, it is mutually beneficial for peering partners 2889 to coordinate with each other before attempting any policy changes 2890 that may result in significant shifts in inter-domain traffic. In 2891 certain contexts, this coordination can be quite challenging due to 2892 technical and non-technical reasons. 2894 It is a matter of speculation as to whether MPLS, or similar 2895 technologies, can be extended to allow selection of constrained paths 2896 across domain boundaries. 2898 8. Overview of Contemporary TE Practices in Operational IP Networks 2900 This section provides an overview of some contemporary traffic 2901 engineering practices in IP networks. The focus is primarily on the 2902 aspects that pertain to the control of the routing function in 2903 operational contexts. The intent here is to provide an overview of 2904 the commonly used practices. The discussion is not intended to be 2905 exhaustive. 2907 Currently, service providers apply many of the traffic engineering 2908 mechanisms discussed in this document to optimize the performance of 2909 their IP networks. These techniques include capacity planning for 2910 long time scales, routing control using IGP metrics and MPLS for 2911 medium time scales, the overlay model also for medium time scales, 2912 and traffic management mechanisms for short time scales. 2914 When a service provider plans to build an IP network, or expand the 2915 capacity of an existing network, effective capacity planning should 2916 be an important component of the process. Such plans may take the 2917 following aspects into account: location of new nodes, if any, 2918 existing and predicted traffic patterns, costs, link capacity, 2919 topology, routing design, and survivability. 2921 Performance optimization of operational networks is usually an 2922 ongoing process in which traffic statistics, performance parameters, 2923 and fault indicators are continually collected from the network. 2924 This empirical data is then analyzed and used to trigger various 2925 traffic engineering mechanisms.
Tools that perform what-if analysis 2926 can also be used to assist the TE process by allowing various 2927 scenarios to be reviewed before a new set of configurations is 2928 implemented in the operational network. 2930 Traditionally, intra-domain real-time TE with IGP is done by 2931 increasing the OSPF or IS-IS metric of a congested link until enough 2932 traffic has been diverted from that link. This approach has some 2933 limitations as discussed in Section 6.1. Recently, some new intra- 2934 domain TE approaches/tools have been proposed [RR94] [FT00] [FT01] 2935 [WANG]. Such approaches/tools take a traffic matrix, the network topology, 2936 and network performance objective(s) as input, and produce some link 2937 metrics and possibly some unequal load-sharing ratios to be set at 2938 the head-end routers of some ECMPs as output. These developments 2939 open the possibility for intra-domain TE with IGPs to be performed in a 2940 more systematic way. 2942 The overlay model (IP over ATM, or IP over Frame Relay) is another 2943 approach which was commonly used [AWD2], but has been replaced by 2944 MPLS and router hardware technology. 2946 Deployment of MPLS for traffic engineering applications has commenced 2947 in some service provider networks. One operational scenario is to 2948 deploy MPLS in conjunction with an IGP (IS-IS-TE or OSPF-TE) that 2949 supports the traffic engineering extensions, together with 2950 constraint-based routing for explicit route computations, and a 2951 signaling protocol (e.g., RSVP-TE) for LSP instantiation. 2953 In contemporary MPLS traffic engineering contexts, network 2954 administrators specify and configure link attributes and resource 2955 constraints such as maximum reservable bandwidth and resource class 2956 attributes for links (interfaces) within the MPLS domain. A link 2957 state protocol that supports TE extensions (IS-IS-TE or OSPF-TE) is 2958 used to propagate information about network topology and link 2959 attributes to all routers in the routing area. Network administrators 2960 also specify all the LSPs that are to originate at each router. For 2961 each LSP, the network administrator specifies the destination node 2962 and the attributes of the LSP which indicate the requirements that are to 2963 be satisfied during the path selection process. Each router then 2964 uses a local constraint-based routing process to compute explicit 2965 paths for all LSPs originating from it. Subsequently, a signaling 2966 protocol is used to instantiate the LSPs. By assigning proper 2967 bandwidth values to links and LSPs, congestion caused by uneven 2968 traffic distribution can generally be avoided or mitigated. 2970 The bandwidth attributes of LSPs used for traffic engineering can be 2971 updated periodically. The basic concept is that the bandwidth 2972 assigned to an LSP should relate in some manner to the bandwidth 2973 requirements of traffic that actually flows through the LSP. The 2974 traffic attribute of an LSP can be modified to accommodate traffic 2975 growth and persistent traffic shifts. If network congestion occurs 2976 due to some unexpected events, existing LSPs can be rerouted to 2977 alleviate the situation, or the network administrator can configure new 2978 LSPs to divert some traffic to alternative paths. The reservable 2979 bandwidth of the congested links can also be reduced to force some 2980 LSPs to be rerouted to other paths. 2982 In an MPLS domain, a traffic matrix can also be estimated by 2983 monitoring the traffic on LSPs.
Such traffic statistics can be used 2984 for a variety of purposes including network planning and network 2985 optimization. Current practice suggests that deploying an MPLS 2986 network consisting of hundreds of routers and thousands of LSPs is 2987 feasible. In summary, recent deployment experience suggests that 2988 the MPLS approach is very effective for traffic engineering in IP 2989 networks [XIAO]. 2991 As mentioned previously in Section 7, one usually has no direct 2992 control over the distribution of inbound traffic. Therefore, the 2993 main goal of contemporary inter-domain TE is to optimize the 2994 distribution of outbound traffic between multiple inter-domain links. 2995 When operating a global network, maintaining the ability to operate 2996 the network in a regional fashion where desired, while continuing to 2997 take advantage of the benefits of a global network, also becomes an 2998 important objective. 3000 Inter-domain TE with BGP usually begins with the placement of 3001 multiple peering interconnection points in locations that have high 3002 peer density, are in close proximity to originating/terminating 3003 traffic locations on one's own network, and are lowest in cost. 3004 There are generally several locations in each region of the world 3005 where the vast majority of major networks congregate and 3006 interconnect. Some location-decision problems that arise in 3007 association with inter-domain routing are discussed in [AWD5]. 3009 Once the locations of the interconnects are determined, and circuits 3010 are implemented, one decides how best to handle the routes heard from 3011 the peer, as well as how to propagate the peers' routes within one's 3012 own network. One way to engineer outbound traffic flows on a network 3013 with many EBGP peers is to create a hierarchy of peers. Generally, 3014 the Local Preferences of all peers are set to the same value so that 3015 the shortest AS paths will be chosen to forward traffic. Then, by 3016 overwriting the inbound MED metric (Multi-Exit Discriminator metric, 3017 also referred to as the "BGP metric"; both terms are used 3018 interchangeably in this document) on routes received 3019 from different peers, the hierarchy can be formed. For example, all 3020 Local Preferences can be set to 200, preferred private peers can be 3021 assigned a BGP metric of 50, the rest of the private peers can be 3022 assigned a BGP metric of 100, and public peers can be assigned a BGP 3023 metric of 600. "Preferred" peers might be defined as those peers 3024 with whom the most available capacity exists, whose customer base is 3025 larger in comparison to other peers, whose interconnection costs are 3026 the lowest, and with whom upgrading existing capacity is the easiest. 3027 In a network with low utilization at the edge, this works well. The 3028 same concept could be applied to a network with higher edge 3029 utilization by creating more levels of BGP metrics between peers, 3030 allowing for more granularity in selecting the exit points for 3031 traffic bound for a dual-homed customer on a peer's network. 3033 When only the inbound MED metrics are replaced in this way, only the 3034 exit points for routes with equal AS-Path lengths are changed. (The BGP 3035 decision process considers Local Preference first, then AS-Path length, and 3036 then the BGP metric.) For example, assume a network has two possible 3037 egress points, peer A and peer B.
Each peer has 40% of the 3038 Internet's routes exclusively on its network, while the remaining 20% 3039 of the Internet's routes are from customers who dual-home between A 3040 and B. Assume that both peers have a Local Preference of 200 and a 3041 BGP metric of 100. If the link to peer A is congested, increasing 3042 its BGP metric while leaving the Local Preference at 200 will ensure 3043 that the 20% of total routes belonging to dual-homed customers will 3044 prefer peer B as the exit point. The previous example would be used 3045 in a situation where all exit points to a given peer were close to 3046 congestion levels, and traffic needed to be shifted away from that 3047 peer entirely. 3049 When there are multiple exit points to a given peer, and only one of 3050 them is congested, it is not necessary to shift traffic away from the 3051 peer entirely, but only from the one congested circuit. This can be 3052 achieved by using passive IGP-metrics, AS-path filtering, or prefix 3053 filtering. 3055 Occasionally, more drastic changes are needed, for example, in 3056 dealing with a "problem peer" that is difficult to work with on 3057 upgrades or that charges high prices for connectivity to its 3058 network. In that case, the Local Preference to that peer can be 3059 reduced below the level of other peers. This effectively reduces the 3060 amount of traffic sent to that peer to only originating traffic 3061 (assuming no transit providers are involved). This type of change 3062 can affect a large amount of traffic, and is only used after other 3063 methods have failed to provide the desired results. 3065 Although it is not much of an issue in regional networks, the 3066 propagation of a peer's routes back through the network must be 3067 considered when a network is peering on a global scale. Sometimes, 3068 business considerations can influence the choice of BGP policies in a 3069 given context. For example, it may be imprudent, from a business 3070 perspective, to operate a global network and provide full access to 3071 the global customer base to a small network in a particular country. 3072 However, for the purpose of providing one's own customers with 3073 quality service in a particular region, good connectivity to that in- 3074 country network may still be necessary. This can be achieved by 3075 assigning a set of communities at the edge of the network, which have 3076 a known behavior when routes tagged with those communities are 3077 propagated back through the core. Routes heard from local peers 3078 will be prevented from propagating back to the global network, 3079 whereas routes learned from larger peers may be allowed to propagate 3080 freely throughout the entire global network. By implementing a 3081 flexible community strategy, the benefits of using a single global AS 3082 Number (ASN) can be realized, while the benefits of operating 3083 regional networks can also be retained. An alternative to 3084 doing this is to use different ASNs in different regions, with the 3085 consequence that the AS path length for routes announced by that 3086 service provider will increase. 3088 9. Conclusion 3090 This document described principles for traffic engineering in the 3091 Internet. It presented an overview of some of the basic issues 3092 surrounding traffic engineering in IP networks. The context of TE 3093 was described, and a TE process model and a taxonomy of TE styles were 3094 presented.
A brief historical review of pertinent developments 3095 related to traffic engineering was provided. A survey of 3096 contemporary TE techniques in operational networks was presented. 3097 Additionally, the document specified a set of generic requirements, 3098 recommendations, and options for Internet traffic engineering. 3100 10. Security Considerations 3102 This document does not introduce new security issues. 3104 11. IANA Considerations 3106 This draft makes no requests for IANA action. 3108 12. Acknowledgments 3110 The acknowledgements in RFC 3272 were as follows. All of the people who helped 3111 in the production of that document are also thanked for the 3112 carry-over of their work into this new document. 3114 The authors would like to thank Jim Boyle for inputs on the 3115 recommendations section, Francois Le Faucheur for inputs on Diffserv 3116 aspects, Blaine Christian for inputs on measurement, Gerald Ash for 3117 inputs on routing in telephone networks and for text on event- 3118 dependent TE methods, Steven Wright for inputs on network 3119 controllability, and Jonathan Aufderheide for inputs on inter-domain 3120 TE with BGP. Special thanks to Randy Bush for proposing the TE 3121 taxonomy based on "tactical vs strategic" methods. The subsection 3122 describing an "Overview of ITU Activities Related to Traffic 3123 Engineering" was adapted from a contribution by Waisum Lai. Useful 3124 feedback and pointers to relevant materials were provided by J. Noel 3125 Chiappa. Additional comments were provided by Glenn Grotefeld during 3126 the working group last call process. Finally, the authors would like to 3127 thank Ed Kern, the TEWG co-chair, for his comments and support. 3129 The production of this document includes a fix to the original text 3130 resulting from an Errata Report by Jean-Michel Grimaldi. 3132 The authors of this document would also like to thank TBD. 3134 13. Contributors 3136 Much of the text in this document is derived from RFC 3272. The 3137 authors of this document would like to express their gratitude to all 3138 involved in that work. Although the source text has been edited in 3139 the production of this document, the original authors should be 3140 considered as Contributors to this work. They were: 3142 Daniel O. Awduche 3143 Movaz Networks 3144 7926 Jones Branch Drive, Suite 615 3145 McLean, VA 22102 3147 Phone: 703-298-5291 3148 EMail: awduche@movaz.com 3150 Angela Chiu 3151 Celion Networks 3152 1 Sheila Dr., Suite 2 3153 Tinton Falls, NJ 07724 3155 Phone: 732-747-9987 3156 EMail: angela.chiu@celion.com 3158 Anwar Elwalid 3159 Lucent Technologies 3160 Murray Hill, NJ 07974 3162 Phone: 908 582-7589 3163 EMail: anwar@lucent.com 3165 Indra Widjaja 3166 Bell Labs, Lucent Technologies 3167 600 Mountain Avenue 3168 Murray Hill, NJ 07974 3170 Phone: 908 582-0435 3171 EMail: iwidjaja@research.bell-labs.com 3173 XiPeng Xiao 3174 Redback Networks 3175 300 Holger Way 3176 San Jose, CA 95134 3178 Phone: 408-750-5217 3179 EMail: xipeng@redback.com 3181 The first version of this document was produced by the TEAS Working 3182 Group's RFC3272bis Design Team. The team members are all 3183 Contributors to this document.
The full list of contributors to this 3184 document is: 3186 Acee Lindem 3187 EMail: acee@cisco.com 3188 Adrian Farrel 3189 EMail: adrian@olddog.co.uk 3191 Aijun Wang 3192 EMail: wangaijun@tsinghua.org.cn 3194 Daniele Ceccarelli 3195 EMail: daniele.ceccarelli@ericsson.com 3197 Dieter Beller 3198 EMail: dieter.beller@nokia.com 3200 Jeff Tantsura 3201 EMail: jefftant.ietf@gmail.com 3203 Julien Meuric 3204 EMail: julien.meuric@orange.com 3206 Liu Hua 3207 EMail: hliu@ciena.com 3209 Loa Andersson 3210 EMail: loa@pi.nu 3212 Luis Miguel Contreras 3213 EMail: luismiguel.contrerasmurillo@telefonica.com 3215 Martin Horneffer 3216 EMail: Martin.Horneffer@telekom.de 3218 Tarek Saad 3219 EMail: tsaad@cisco.com 3221 Xufeng Liu 3222 EMail: xufeng.liu.ietf@gmail.com 3224 Gert Grammel 3225 EMail: ggrammel@juniper.net 3227 14. Informative References 3229 [ASH2] Ash, J., "Dynamic Routing in Telecommunications Networks", 3230 Book McGraw Hill, 1998. 3232 [AWD1] Awduche, D. and Y. Rekhter, "Multiprotocol Lambda 3233 Switching - Combining MPLS Traffic Engineering Control 3234 with Optical Crossconnects", Article IEEE Communications 3235 Magazine, March 2001. 3237 [AWD2] Awduche, D., "MPLS and Traffic Engineering in IP 3238 Networks", Article IEEE Communications Magazine, December 3239 1999. 3241 [AWD5] Awduche, D., "An Approach to Optimal Peering Between 3242 Autonomous Systems in the Internet", Paper International 3243 Conference on Computer Communications and Networks 3244 (ICCCN'98), October 1998. 3246 [CRUZ] Cruz, R., "A Calculus for Network Delay, Part II: Network 3247 Analysis", Transaction IEEE Transactions on Information Theory, vol. 3248 37, pp. 132-141, 1991. 3250 [ELW95] Elwalid, A., Mitra, D., and R. Wentworth, "A New Approach 3251 for Allocating Buffers and Bandwidth to Heterogeneous, 3252 Regulated Traffic in an ATM Node", Article IEEE Journal on 3253 Selected Areas in Communications, 13.6, pp. 1115-1127, 3254 August 1995. 3256 [FLJA93] Floyd, S. and V. Jacobson, "Random Early Detection 3257 Gateways for Congestion Avoidance", Article IEEE/ACM 3258 Transactions on Networking, Vol. 1, p. 387-413, November 3259 1993. 3261 [FLOY94] Floyd, S., "TCP and Explicit Congestion Notification", 3262 Article ACM Computer Communication Review, V. 24, No. 5, 3263 p. 10-23, October 1994. 3265 [FT00] Fortz, B. and M. Thorup, "Internet Traffic Engineering by 3266 Optimizing OSPF Weights", Article IEEE INFOCOM 2000, March 3267 2000. 3269 [FT01] Fortz, B. and M. Thorup, "Optimizing OSPF/IS-IS Weights in 3270 a Changing World", n.d. 3273 [HUSS87] Hurley, B., Seidl, C., and W. Sewel, "A Survey of Dynamic 3274 Routing Methods for Circuit-Switched Traffic", 3275 Article IEEE Communication Magazine, September 1987. 3277 [I-D.ietf-teas-yang-te-topo] 3278 Liu, X., Bryskin, I., Beeram, V., Saad, T., Shah, H., and 3279 O. Dios, "YANG Data Model for Traffic Engineering (TE) 3280 Topologies", draft-ietf-teas-yang-te-topo-22 (work in 3281 progress), June 2019. 3283 [I-D.ietf-tewg-qos-routing] 3284 Ash, G., "Traffic Engineering & QoS Methods for IP-, ATM-, 3285 & TDM-Based Multiservice Networks", draft-ietf-tewg-qos- 3286 routing-04 (work in progress), October 2001. 3288 [ITU-E600] 3289 "Terms and Definitions of Traffic Engineering", 3290 Recommendation ITU-T Recommendation E.600, March 1993. 3292 [ITU-E701] 3293 "Reference Connections for Traffic Engineering", 3294 Recommendation ITU-T Recommendation E.701, October 1993.
3296 [ITU-E801] 3297 "Framework for Service Quality Agreement", 3298 Recommendation ITU-T Recommendation E.801, October 1996. 3300 [MA] Ma, Q., "Quality of Service Routing in Integrated Services 3301 Networks", Ph.D. PhD Dissertation, CMU-CS-98-138, CMU, 3302 1998. 3304 [MATE] Elwalid, A., Jin, C., Low, S., and I. Widjaja, "MATE - 3305 MPLS Adaptive Traffic Engineering", 3306 Proceedings INFOCOM'01, April 2001. 3308 [MCQ80] McQuillan, J., Richer, I., and E. Rosen, "The New Routing 3309 Algorithm for the ARPANET", Transaction IEEE Transactions 3310 on Communications, vol. 28, no. 5, p. 711-719, May 1980. 3312 [MR99] Mitra, D. and K. Ramakrishnan, "A Case Study of 3313 Multiservice, Multipriority Traffic Engineering Design for 3314 Data Networks", Proceedings Globecom'99, December 1999. 3316 [RFC1992] Castineyra, I., Chiappa, N., and M. Steenstrup, "The 3317 Nimrod Routing Architecture", RFC 1992, 3318 DOI 10.17487/RFC1992, August 1996, 3319 . 3321 [RFC2205] Braden, R., Ed., Zhang, L., Berson, S., Herzog, S., and S. 3322 Jamin, "Resource ReSerVation Protocol (RSVP) -- Version 1 3323 Functional Specification", RFC 2205, DOI 10.17487/RFC2205, 3324 September 1997, . 3326 [RFC2328] Moy, J., "OSPF Version 2", STD 54, RFC 2328, 3327 DOI 10.17487/RFC2328, April 1998, 3328 . 3330 [RFC2330] Paxson, V., Almes, G., Mahdavi, J., and M. Mathis, 3331 "Framework for IP Performance Metrics", RFC 2330, 3332 DOI 10.17487/RFC2330, May 1998, 3333 . 3335 [RFC2386] Crawley, E., Nair, R., Rajagopalan, B., and H. Sandick, "A 3336 Framework for QoS-based Routing in the Internet", 3337 RFC 2386, DOI 10.17487/RFC2386, August 1998, 3338 . 3340 [RFC2474] Nichols, K., Blake, S., Baker, F., and D. Black, 3341 "Definition of the Differentiated Services Field (DS 3342 Field) in the IPv4 and IPv6 Headers", RFC 2474, 3343 DOI 10.17487/RFC2474, December 1998, 3344 . 3346 [RFC2475] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., 3347 and W. Weiss, "An Architecture for Differentiated 3348 Services", RFC 2475, DOI 10.17487/RFC2475, December 1998, 3349 . 3351 [RFC2597] Heinanen, J., Baker, F., Weiss, W., and J. Wroclawski, 3352 "Assured Forwarding PHB Group", RFC 2597, 3353 DOI 10.17487/RFC2597, June 1999, 3354 . 3356 [RFC2678] Mahdavi, J. and V. Paxson, "IPPM Metrics for Measuring 3357 Connectivity", RFC 2678, DOI 10.17487/RFC2678, September 3358 1999, . 3360 [RFC2702] Awduche, D., Malcolm, J., Agogbua, J., O'Dell, M., and J. 3361 McManus, "Requirements for Traffic Engineering Over MPLS", 3362 RFC 2702, DOI 10.17487/RFC2702, September 1999, 3363 . 3365 [RFC2722] Brownlee, N., Mills, C., and G. Ruth, "Traffic Flow 3366 Measurement: Architecture", RFC 2722, 3367 DOI 10.17487/RFC2722, October 1999, 3368 . 3370 [RFC2753] Yavatkar, R., Pendarakis, D., and R. Guerin, "A Framework 3371 for Policy-based Admission Control", RFC 2753, 3372 DOI 10.17487/RFC2753, January 2000, 3373 . 3375 [RFC2961] Berger, L., Gan, D., Swallow, G., Pan, P., Tommasi, F., 3376 and S. Molendini, "RSVP Refresh Overhead Reduction 3377 Extensions", RFC 2961, DOI 10.17487/RFC2961, April 2001, 3378 . 3380 [RFC2998] Bernet, Y., Ford, P., Yavatkar, R., Baker, F., Zhang, L., 3381 Speer, M., Braden, R., Davie, B., Wroclawski, J., and E. 3382 Felstaine, "A Framework for Integrated Services Operation 3383 over Diffserv Networks", RFC 2998, DOI 10.17487/RFC2998, 3384 November 2000, . 3386 [RFC3031] Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol 3387 Label Switching Architecture", RFC 3031, 3388 DOI 10.17487/RFC3031, January 2001, 3389 . 
3391 [RFC3086] Nichols, K. and B. Carpenter, "Definition of 3392 Differentiated Services Per Domain Behaviors and Rules for 3393 their Specification", RFC 3086, DOI 10.17487/RFC3086, 3394 April 2001, . 3396 [RFC3124] Balakrishnan, H. and S. Seshan, "The Congestion Manager", 3397 RFC 3124, DOI 10.17487/RFC3124, June 2001, 3398 . 3400 [RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., 3401 and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP 3402 Tunnels", RFC 3209, DOI 10.17487/RFC3209, December 2001, 3403 . 3405 [RFC3270] Le Faucheur, F., Wu, L., Davie, B., Davari, S., Vaananen, 3406 P., Krishnan, R., Cheval, P., and J. Heinanen, "Multi- 3407 Protocol Label Switching (MPLS) Support of Differentiated 3408 Services", RFC 3270, DOI 10.17487/RFC3270, May 2002, 3409 . 3411 [RFC3272] Awduche, D., Chiu, A., Elwalid, A., Widjaja, I., and X. 3412 Xiao, "Overview and Principles of Internet Traffic 3413 Engineering", RFC 3272, DOI 10.17487/RFC3272, May 2002, 3414 . 3416 [RFC3469] Sharma, V., Ed. and F. Hellstrand, Ed., "Framework for 3417 Multi-Protocol Label Switching (MPLS)-based Recovery", 3418 RFC 3469, DOI 10.17487/RFC3469, February 2003, 3419 . 3421 [RFC3630] Katz, D., Kompella, K., and D. Yeung, "Traffic Engineering 3422 (TE) Extensions to OSPF Version 2", RFC 3630, 3423 DOI 10.17487/RFC3630, September 2003, 3424 . 3426 [RFC3945] Mannie, E., Ed., "Generalized Multi-Protocol Label 3427 Switching (GMPLS) Architecture", RFC 3945, 3428 DOI 10.17487/RFC3945, October 2004, 3429 . 3431 [RFC4090] Pan, P., Ed., Swallow, G., Ed., and A. Atlas, Ed., "Fast 3432 Reroute Extensions to RSVP-TE for LSP Tunnels", RFC 4090, 3433 DOI 10.17487/RFC4090, May 2005, 3434 . 3436 [RFC4124] Le Faucheur, F., Ed., "Protocol Extensions for Support of 3437 Diffserv-aware MPLS Traffic Engineering", RFC 4124, 3438 DOI 10.17487/RFC4124, June 2005, 3439 . 3441 [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A 3442 Border Gateway Protocol 4 (BGP-4)", RFC 4271, 3443 DOI 10.17487/RFC4271, January 2006, 3444 . 3446 [RFC4655] Farrel, A., Vasseur, J., and J. Ash, "A Path Computation 3447 Element (PCE)-Based Architecture", RFC 4655, 3448 DOI 10.17487/RFC4655, August 2006, 3449 . 3451 [RFC4872] Lang, J., Ed., Rekhter, Y., Ed., and D. Papadimitriou, 3452 Ed., "RSVP-TE Extensions in Support of End-to-End 3453 Generalized Multi-Protocol Label Switching (GMPLS) 3454 Recovery", RFC 4872, DOI 10.17487/RFC4872, May 2007, 3455 . 3457 [RFC4873] Berger, L., Bryskin, I., Papadimitriou, D., and A. Farrel, 3458 "GMPLS Segment Recovery", RFC 4873, DOI 10.17487/RFC4873, 3459 May 2007, . 3461 [RFC5305] Li, T. and H. Smit, "IS-IS Extensions for Traffic 3462 Engineering", RFC 5305, DOI 10.17487/RFC5305, October 3463 2008, . 3465 [RFC5331] Aggarwal, R., Rekhter, Y., and E. Rosen, "MPLS Upstream 3466 Label Assignment and Context-Specific Label Space", 3467 RFC 5331, DOI 10.17487/RFC5331, August 2008, 3468 . 3470 [RFC5440] Vasseur, JP., Ed. and JL. Le Roux, Ed., "Path Computation 3471 Element (PCE) Communication Protocol (PCEP)", RFC 5440, 3472 DOI 10.17487/RFC5440, March 2009, 3473 . 3475 [RFC6241] Enns, R., Ed., Bjorklund, M., Ed., Schoenwaelder, J., Ed., 3476 and A. Bierman, Ed., "Network Configuration Protocol 3477 (NETCONF)", RFC 6241, DOI 10.17487/RFC6241, June 2011, 3478 . 3480 [RFC6805] King, D., Ed. and A. Farrel, Ed., "The Application of the 3481 Path Computation Element Architecture to the Determination 3482 of a Sequence of Domains in MPLS and GMPLS", RFC 6805, 3483 DOI 10.17487/RFC6805, November 2012, 3484 . 
3486 [RFC7390] Rahman, A., Ed. and E. Dijk, Ed., "Group Communication for 3487 the Constrained Application Protocol (CoAP)", RFC 7390, 3488 DOI 10.17487/RFC7390, October 2014, 3489 . 3491 [RFC7679] Almes, G., Kalidindi, S., Zekauskas, M., and A. Morton, 3492 Ed., "A One-Way Delay Metric for IP Performance Metrics 3493 (IPPM)", STD 81, RFC 7679, DOI 10.17487/RFC7679, January 3494 2016, . 3496 [RFC7680] Almes, G., Kalidindi, S., Zekauskas, M., and A. Morton, 3497 Ed., "A One-Way Loss Metric for IP Performance Metrics 3498 (IPPM)", STD 82, RFC 7680, DOI 10.17487/RFC7680, January 3499 2016, . 3501 [RFC7752] Gredler, H., Ed., Medved, J., Previdi, S., Farrel, A., and 3502 S. Ray, "North-Bound Distribution of Link-State and 3503 Traffic Engineering (TE) Information Using BGP", RFC 7752, 3504 DOI 10.17487/RFC7752, March 2016, 3505 . 3507 [RFC7950] Bjorklund, M., Ed., "The YANG 1.1 Data Modeling Language", 3508 RFC 7950, DOI 10.17487/RFC7950, August 2016, 3509 . 3511 [RFC8040] Bierman, A., Bjorklund, M., and K. Watsen, "RESTCONF 3512 Protocol", RFC 8040, DOI 10.17487/RFC8040, January 2017, 3513 . 3515 [RFC8051] Zhang, X., Ed. and I. Minei, Ed., "Applicability of a 3516 Stateful Path Computation Element (PCE)", RFC 8051, 3517 DOI 10.17487/RFC8051, January 2017, 3518 . 3520 [RFC8231] Crabbe, E., Minei, I., Medved, J., and R. Varga, "Path 3521 Computation Element Communication Protocol (PCEP) 3522 Extensions for Stateful PCE", RFC 8231, 3523 DOI 10.17487/RFC8231, September 2017, 3524 . 3526 [RFC8281] Crabbe, E., Minei, I., Sivabalan, S., and R. Varga, "Path 3527 Computation Element Communication Protocol (PCEP) 3528 Extensions for PCE-Initiated LSP Setup in a Stateful PCE 3529 Model", RFC 8281, DOI 10.17487/RFC8281, December 2017, 3530 . 3532 [RFC8283] Farrel, A., Ed., Zhao, Q., Ed., Li, Z., and C. Zhou, "An 3533 Architecture for Use of PCE and the PCE Communication 3534 Protocol (PCEP) in a Network with Central Control", 3535 RFC 8283, DOI 10.17487/RFC8283, December 2017, 3536 . 3538 [RFC8661] Bashandy, A., Ed., Filsfils, C., Ed., Previdi, S., 3539 Decraene, B., and S. Litkowski, "Segment Routing MPLS 3540 Interworking with LDP", RFC 8661, DOI 10.17487/RFC8661, 3541 December 2019, . 3543 [RR94] Rodrigues, M. and K. Ramakrishnan, "Optimal Routing in 3544 Shortest Path Networks", Proceedings ITS'94, Rio de 3545 Janeiro, Brazil, 1994. 3547 [SLDC98] Suter, B., Lakshman, T., Stiliadis, D., and A. Choudhury, 3548 "Design Considerations for Supporting TCP with Per-flow 3549 Queueing", Proceedings INFOCOM'98, p. 299-306, 1998. 3551 [WANG] Wang, Y., Wang, Z., and L. Zhang, "Internet traffic 3552 engineering without full mesh overlaying", 3553 Proceedings INFOCOM'2001, April 2001. 3555 [XIAO] Xiao, X., Hannan, A., Bailey, B., and L. Ni, "Traffic 3556 Engineering with MPLS in the Internet", Article IEEE 3557 Network Magazine, March 2000. 3559 [YARE95] Yang, C. and A. Reddy, "A Taxonomy for Congestion Control 3560 Algorithms in Packet Switching Networks", Article IEEE 3561 Network Magazine, p. 34-45, 1995. 3563 Author's Address 3565 Adrian Farrel (editor) 3566 Old Dog Consulting 3568 Email: adrian@olddog.co.uk